|SANCHEZ CASTANO, CECILIA - West Virginia University|
|Smith, Timothy - Tim|
|SALEM, MOHAMED - West Virginia University|
|YAO, JIANBO - West Virginia University|
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/9/2009
Publication Date: 11/25/2009
Citation: Sanchez Castano, C., Smith, T.P., Wiedmann, R.T., Vallejo, R.L., Salem, M., Yao, J., Rexroad III, C.E. 2009. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library. Biomed Central (BMC) Genomics. 10:559.
Interpretive Summary: The National Center for Cool and Cold Water Aquaculture (NCCCWA) is selectively breeding rainbow trout broodstock to improve aquaculture production traits. Molecular genetic technologies and marker-assisted selection (MAS) strategies have the potential to increase the rate of genetic gain achieved by traditional selective breeding schemes. To enhance capabilities for genomic analyses in rainbow trout, including genomic selection, a large suite of polymorphic markers must be identified. Single Nucleotide Polymorphisms (SNPs) are highly abundant markers that are evenly distributed throughout the genome and can be functionally relevant. They are suitable markers for fine mapping of genes and for candidate gene association studies to identify alleles potentially affecting important traits. We employed a high-throughput strategy to discover SNPs in rainbow trout. Over twenty thousand putative SNPs were identified; 384 were tested, resulting in a 48% validation rate, and 167 (43.5%) of the tested markers were placed on the rainbow trout linkage map. Based on the validation results, we anticipate that at least 10,000 putative SNPs from the original data set will be useful in future genomic studies. However, to increase the efficiency of SNP discovery we must dramatically decrease the false discovery rate that results from an evolutionarily recent whole genome duplication event.
Technical Abstract: BACKGROUND: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming method of SNP identification in rainbow trout, and is therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. RESULTS: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads, providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, was genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map.
In addition, 36% of the sequences from the validated markers were associated with rainbow trout transcripts. CONCLUSIONS: The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.
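The figures quoted in the summary and abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the published analysis: the function and variable names are hypothetical, and the coverage figure is a naive ratio that assumes every read maps to a unique fragment end, so it is an upper bound on the reported average of about 6-fold.

```python
# Figures quoted in the abstract; all names here are hypothetical.
READS = 2_000_000        # pyrosequencing reads
FRAGMENT_ENDS = 300_000  # 150,000 unique restriction fragments x 2 ends
TESTED = 384             # putative SNPs genotyped on individual fish
VALIDATED = 183          # putative SNPs confirmed
MAPPED = 167             # markers placed on the linkage map
PUTATIVE_MIN = 22_022    # smallest putative SNP set among the three analyses


def summarize():
    """Recompute the headline ratios from the quoted counts."""
    validation_rate = VALIDATED / TESTED
    return {
        # Naive upper bound: assumes every read lands on a unique fragment end.
        "naive_coverage": READS / FRAGMENT_ENDS,
        "validation_rate": validation_rate,
        "mapped_rate": MAPPED / TESTED,
        # Simple extrapolation of the validation rate to the smallest SNP set.
        "expected_useful": PUTATIVE_MIN * validation_rate,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, the validated and mapped fractions round to the 48% and 43.5% stated above, and applying the validation rate to the smallest putative set gives a value above 10,000, in line with the "at least 10,000" estimate in the interpretive summary.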