SALEM, MOHAMED - Middle Tennessee State University
PANERU, BAM - Middle Tennessee State University
AL-TOBASEI, RAFET - Middle Tennessee State University
ABDOUNI, FATIMA - Middle Tennessee State University
THORGAARD, GARY - Washington State University
YAO, JIANBO - West Virginia University
Submitted to: PLOS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/4/2015
Publication Date: 3/20/2015
Citation: Salem, M., Paneru, B., Al-Tobasei, R., Abdouni, F., Thorgaard, G., Rexroad III, C.E., Yao, J. 2015. Transcriptome assembly, gene annotation and tissue gene expression atlas of the rainbow trout. PLoS One. 10(3):1-27. DOI:10.1371/journal.pone.0121778.
Interpretive Summary: Biological research in rainbow trout, including experiments aiming to improve aquaculture production efficiency, is enhanced by the public availability of gene sequence information (RNA). We sequenced RNA from 13 highly studied tissues to obtain 1.167 billion sequences that provide tissue-specific gene information. The work includes comparisons with sequence information from better-studied species, which in many cases indicate potential gene functions. This information has been made available in public archives.
Technical Abstract: Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, we reported a transcriptome reference sequence built from 19X coverage of Sanger and 454-pyrosequencing data. Although that work added a great wealth of annotated sequences, the transcriptome is still incomplete. In addition, gene expression in different tissues was not included in the previous study. In this study, 13 non-normalized cDNA libraries were sequenced from different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler, yielding 474,524 contigs > 500 base pairs. Of these, 287,593 had homology to sequences in the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 46,457 representative contigs. Analysis of the selected contigs identified 45,291 transcripts with protein-coding ORFs, of which 15,802 were full-length sequences. A total of 5,529 contigs (11.9%), including 754 full-length sequences, did not match mRNA sequences in the rainbow trout genome reference. Mapping reads to the reference genome identified a large number of new genes. A digital gene expression atlas revealed 7,614 housekeeping and 4,830 tissue-specific genes. Expression of about 17,000-33,000 genes (37%-71% of the identified genes) accounted for the basic and specialized functions of each tissue. Spleen, muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes contributing to their expression patterns.
This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome.
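The abstract states that the longest contig of each cluster was selected as the reference. A minimal Python sketch of that selection step is shown below. It is illustrative only and is not the authors' pipeline: the function name, the `(cluster_id, contig_id, length)` record layout, and the example values are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (cluster_id, contig_id, contig_length_in_bp).
Record = Tuple[str, str, int]


def longest_contig_per_cluster(records: Iterable[Record]) -> Dict[str, Tuple[str, int]]:
    """Return, for each cluster, the (contig_id, length) of its longest contig.

    Ties are resolved in favor of the contig seen first.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for cluster_id, contig_id, length in records:
        current = best.get(cluster_id)
        # Keep the first contig seen, then replace it only with a strictly longer one.
        if current is None or length > current[1]:
            best[cluster_id] = (contig_id, length)
    return best


# Made-up example values, not data from the study.
example = [
    ("cluster_1", "contig_a", 812),
    ("cluster_1", "contig_b", 1540),
    ("cluster_2", "contig_c", 603),
]
representatives = longest_contig_per_cluster(example)
```

With the made-up records above, `representatives` maps `cluster_1` to `contig_b` and `cluster_2` to `contig_c`, so the number of representatives equals the number of clusters.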