Research Project: Integrated Research Approaches for Improving Production Efficiency in Rainbow Trout

Location: Cool and Cold Water Aquaculture Research

Title: Annotation of four rainbow trout genomes in support of a pan-genome reference [abstract]

Author
Konvalina, John - University of Missouri
Gao, Guangtu
Tripathi, Vibha
Wiens, Gregory
Palti, Yniv
Elsik, Christine - University of Missouri

Submitted to: Aquaculture America Conference
Publication Type: Abstract Only
Publication Acceptance Date: 1/17/2025
Publication Date: 3/9/2025
Citation: Konvalina, J.D., Gao, G., Tripathi, V., Wiens, G.D., Palti, Y., Elsik, C.G. 2025. Annotation of four rainbow trout genomes in support of a pan-genome reference [abstract]. Aquaculture America Conference. 03092025.

Interpretive Summary:

Technical Abstract: Rainbow trout (Oncorhynchus mykiss) are a widespread aquaculture species and a model organism for fish research. Four chromosome-level rainbow trout genome assemblies are currently available. The first genome (Omyk_2.0) was assembled from the Swanson line, a line of rainbow trout from Alaska that has been in hatcheries for two generations and is thus classified as “semi-domesticated”. The second assembly, USDA_OmykA_1.1, came from the Arlee clonal line, which is fully domesticated and originated in northern California. The other two assemblies are from wild fish captured in the Whale Rock Reservoir in southern California (USDA_OmykWR_1.0) and in Keithly Creek in Idaho (USDA_OmykKC_1.0). Annotation of protein-coding genes in all four genomes is needed to enrich the reference transcriptome and to enable pan-gene comparative analyses. However, a high-quality RefSeq annotation from NCBI is currently available only for the Arlee reference genome assembly. Here, we developed a bioinformatic annotation pipeline to generate a reference transcriptome for each of the four genome assemblies using the Comparative Annotation Toolkit, with the Arlee RefSeq gene set as the reference, along with the BRAKER3 pipeline to incorporate novel gene predictions. Input for gene models came from public rainbow trout RNA-seq data and the OrthoDB database. New long-read transcriptome (Iso-Seq) data that we generated from a disease resistance study were used to discover novel genes and transcript isoforms in all four genomes. For functional annotation of the predicted gene models, we leveraged the Arlee RefSeq gene set, as well as the InterPro database, to predict protein domains, gene ontology and pathways.
Additionally, to better understand the impact of rainbow trout genome structural variation on gene structure and content, we used the program MCScanX to identify syntenic blocks based on gene order collinearity among the four genomes. The synteny information will enable us to identify gene differences that may be associated with differences in the life history and evolution of the four genetic lines. Overall, the annotated rainbow trout genomes and synteny dataset provide vital resources for the aquaculture research community and for basic research on the physiology, genetics and evolution of rainbow trout.
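
As an illustration of what "syntenic blocks based on gene order collinearity" means, the following is a minimal Python sketch. It is not MCScanX's algorithm and not the authors' pipeline; it is a simplified greedy pass over hypothetical anchor pairs. The function name `collinear_blocks`, the parameters `max_gap` and `min_size`, and the example positions are all assumptions made for illustration. Each anchor is a pair of gene-order indices for a homologous gene pair on one chromosome from each of two genomes, and only same-orientation blocks are considered.

```python
def collinear_blocks(anchors, max_gap=2, min_size=3):
    """Group homologous gene pairs into collinear runs (toy illustration).

    anchors  : list of (pos_a, pos_b) gene-order indices of homologous
               genes on one chromosome of genome A and one of genome B
    max_gap  : maximum number of intervening genes allowed between
               consecutive anchors in either genome (hypothetical default)
    min_size : minimum number of anchors for a run to count as a block
    """
    blocks = []
    current = []
    for pos_a, pos_b in sorted(anchors):
        if current:
            prev_a, prev_b = current[-1]
            step_a = pos_a - prev_a
            step_b = pos_b - prev_b
            # Extend the run only if both positions move forward
            # and neither genome skips more than max_gap genes.
            if 0 < step_a <= max_gap + 1 and 0 < step_b <= max_gap + 1:
                current.append((pos_a, pos_b))
                continue
            # Run is broken: keep it if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
        current = [(pos_a, pos_b)]
    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Hypothetical anchors: two collinear runs and one isolated pair.
example_anchors = [
    (0, 10), (1, 11), (2, 12), (3, 14),   # run 1
    (8, 3), (9, 4), (10, 5),              # run 2, elsewhere in genome B
    (20, 40),                             # isolated pair, not a block
]
example_blocks = collinear_blocks(example_anchors)
```

In this sketch a gene present in one run but absent from the corresponding region of the other genome would appear as a skipped position within a block, which is the kind of gene content difference the synteny information is meant to expose.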