Location: Endemic Poultry Viral Diseases Research
Title: Prediction of full-length transcripts in 19 chicken tissues by Oxford Nanopore long-read sequencing
|GUAN, DAILU - University Of California, Davis|
|HALSTED, MICHELLE - University Of California, Davis|
|ISLAS-TREJO, ALMA - University Of California, Davis|
|GOSZCZYNSKI, DANIEL - University Of California, Davis|
|ROSS, PABLO - University Of California, Davis|
|ZHOU, HUAIJUN - University Of California, Davis|
Submitted to: Frontiers in Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/30/2022
Publication Date: 9/30/2022
Citation: Guan, D., Halsted, M.M., Islas-Trejo, A.D., Goszczynski, D.E., Cheng, H.H., Ross, P., Zhou, H. 2022. Prediction of full-length transcripts in 19 chicken tissues by Oxford Nanopore long-read sequencing. Frontiers in Genetics. https://doi.org/10.3389/fgene.2022.997460.
Interpretive Summary: A major biological goal is to associate variation in the genome of an organism with phenotypic (trait) variation. This is no different for chicken, for which the first genome assembly was released in 2004. A key effort is to identify all the transcripts (RNAs) that are generated with respect to developmental age and tissue. In this study, 19 different tissues from ARS experimental birds were sequenced using a new technology that reads more bases from each RNA molecule. By mapping these reads to the chicken genome followed by computational analyses, over 74,000 transcripts were identified, approximately 40% of which were not previously known. This effort greatly increases the utility of the chicken genome, which will ultimately result in more accurate methods to breed and rear poultry.
Technical Abstract: Comprehensive annotation of transcript isoforms across tissues is critical for understanding the phenotypic variability of farm animals. Herein, we generated Oxford Nanopore long-read sequencing data from a diverse panel of 19 chicken tissues comprising 68 samples collected from adult line 6 × line 7 F1 males and females in order to identify and annotate full-length transcripts. More than 23.8 million reads with a mean read length of 790 bases and an average quality of 18.2 were generated. Annotation using the StringTie pipeline identified 74,665 transcripts with a mean length of 1,670 bases at 50,569 loci, representing ~1.5 transcripts per locus. The reference annotations (GRCg6a, Ensembl v102 and NCBI v105) reported ~24K genes with ~39K transcripts. Our findings revealed that 50% of them were partially or fully supported by our predicted transcripts (i.e., known transcripts), and 40% did not match any reference transcript (i.e., novel transcripts), indicating a high diversity of transcript isoforms in the chicken genome. Functional enrichment analysis of known transcripts with tissue specificity reflected the biological function of the tissue, while novel transcripts with coding potential and matches in the SwissProt database were often enriched in functions such as synaptic transmission, olfactory transduction, and ABC transporters. In summary, our study generated long-read Nanopore transcriptomes derived from a broad set of tissues, focusing on the identification and annotation of full-length transcripts and isoforms in the chicken. The results have substantially improved the annotation of the chicken genome and provided important knowledge for connecting genotype to phenotype in livestock species.
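As a quick arithmetic check on the figures above, the reported ~1.5 transcripts per locus can be reproduced from the two counts given in the abstract. This is only an illustrative sketch; the variable names are chosen here and do not come from the study's own analysis code.

```python
# Counts quoted in the technical abstract.
predicted_transcripts = 74_665
predicted_loci = 50_569

# Mean number of predicted transcripts per locus.
transcripts_per_locus = predicted_transcripts / predicted_loci

# Rounded to one decimal place, for comparison with the "~1.5" in the abstract.
print(round(transcripts_per_locus, 1))
```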