Location: Cool and Cold Water Aquaculture Research
Title: A long reads-based De novo assembly of the genome of the Arlee homozygous line reveals structural genome variance in rainbow trout
|MAGADAN, SUSANA - Universidade De Vigo|
|Waldbieser, Geoffrey - Geoff|
|YOUNGBLOOD, RAMEY - Mississippi State University|
|WHEELER, PAUL - Washington State University|
|THORGAARD, GARY - Washington State University|
Submitted to: Genes, Genomes, and Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/18/2021
Publication Date: 2/22/2021
Citation: Gao, G., Magadan, S., Waldbieser, G.C., Youngblood, R., Wheeler, P., Scheffler, B.E., Thorgaard, G., Palti, Y. 2021. A long reads-based De novo assembly of the genome of the Arlee homozygous line reveals structural genome variance in rainbow trout. Genes, Genomes, and Genomics. https://doi.org/10.1093/g3journal/jkab052.
Interpretive Summary: A high-quality reference physical genome map is important for facilitating meaningful genetic analyses and enhancing research on the physiology of an organism or species. To improve the rainbow trout reference genome, we used recent advances in DNA sequencing technology and bioinformatics to generate a new reference genome assembly. The new assembly and chromosome sequences provide major improvements for rainbow trout aquaculture genetics research and for all aspects of research aimed at better understanding the biology of this economically and scientifically important fish. In this report we highlight the improved contiguity of the new assembly. The number of gaps in the chromosome sequences was reduced from over 427,000 in the previous version of the genome assembly to only 486 in the current assembly. We demonstrate how this improvement has a meaningful impact on our ability to annotate and discover genes in the complex genome regions that harbor the IGH genes, which are very important for the immune response of the fish.
Technical Abstract: There is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the de novo genome assembly of the Arlee line from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N=64). The assembly is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ~95% is in the 32 chromosome sequences, with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, Omy14 and Omy25 are divided into six acrocentric chromosomes (two each). Additional structural variations identified in the Arlee genome include the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
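The contiguity figures quoted above (scaffold N50 and the share of the assembly placed in the 32 chromosome sequences) are standard summary statistics. As a minimal sketch of how they are commonly computed, assuming only a list of scaffold lengths is available: the function names `n50` and `fraction_in_largest` and the toy lengths are illustrative, not taken from the study or its pipeline.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def fraction_in_largest(lengths, k):
    """Return the fraction of total length held by the k longest sequences."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Toy scaffold lengths (arbitrary units), NOT the Arlee assembly data.
toy_scaffolds = [40, 30, 20, 5, 3, 2]
print(n50(toy_scaffolds))                     # 30
print(fraction_in_largest(toy_scaffolds, 2))  # 0.7
```

For a real assembly, the lengths would be read from the scaffold FASTA or its index, and `k` would be set to the number of chromosome-level scaffolds.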