Location: Genetics and Animal Breeding
Title: Evaluation of transcript assembly in multiple porcine tissues suggests optimal sequencing depth for RNA-Seq using total RNA library
Submitted to: Animal Gene
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/4/2020
Publication Date: 9/1/2020
Citation: Keel, B.N., Oliver, W.T., Keele, J.W., Lindholm-Perry, A.K. 2020. Evaluation of transcript assembly in multiple porcine tissues suggests optimal sequencing depth for RNA-Seq using total RNA library. Animal Gene. 17-18. Article 200105. https://doi.org/10.1016/j.angen.2020.200105.
Interpretive Summary: High-throughput sequencing of RNA (RNA-Seq) has become the standard approach for measuring and comparing levels of gene expression, and it has been used to identify novel transcripts and alternative splicing events. RNA-Seq technology can be categorized into three subclasses according to the types of RNA sequenced: messenger RNA (mRNA), microRNA (miRNA), and total RNA. To date, the most popular type of RNA-Seq technology has been mRNA sequencing, which focuses on the expression of protein-coding genes. However, in recent years, the mRNA-centric paradigm of the transcriptome landscape has shifted to include noncoding regions of the genome, which requires the use of total RNA-Seq technologies. The selection of an appropriate sequencing depth is a critical step in RNA-Seq analysis. Too little depth hinders the ability to identify and quantify lowly expressed transcripts, while too much depth can significantly increase the cost of the experiment while providing little to no gain in information. ARS scientists used a random down-sampling approach to generate total RNA-Seq libraries with different sequencing depths from three porcine tissues to evaluate the optimal depth of sequencing needed for transcriptome profiling using total RNA-Seq. As expected, the depth of sequencing had the greatest effect on lowly expressed transcripts. The results indicate that a depth of 80 million reads per library is desirable to identify and quantify expression of transcripts across the genome. This work provides ARS scientists and other researchers with guidelines for generating total RNA-Seq data that will boost the cost-effectiveness of experiments.
Technical Abstract: RNA sequencing (RNA-Seq) libraries are prepared by either selecting poly(A) messenger RNAs (mRNA-Seq) or by depleting total RNA of highly abundant ribosomal RNAs (total RNA-Seq). The ribosomal RNA (rRNA) depletion protocols offer an attractive option for novel transcript discovery, as they facilitate the simultaneous characterization of polyadenylated and non-polyadenylated RNAs, including non-coding RNAs. However, the cost associated with total RNA-Seq is much greater than that of mRNA-Seq. Hence, the determination of an optimal target sequencing depth for total RNA-Seq would assist researchers in optimizing the cost-effectiveness of their experiments. In this study, we evaluate the appropriate depth of sequencing needed for transcriptome profiling in total RNA-Seq using a random sampling method to generate varying levels of sequencing depth in three different porcine tissues. As expected, our results indicated that the depth of sequencing has the greatest effect on the identification and quantification of lowly expressed transcripts. We propose that a depth of 80 M reads per library is desirable to identify and quantify expression of transcripts across the genome.
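The random sampling idea described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `downsample_reads` and `detected_transcripts`, the toy library, and the detection threshold are all assumptions made for illustration. Each read is represented only by the label of the transcript it maps to, and reads are drawn without replacement to simulate shallower sequencing.

```python
import random


def downsample_reads(reads, target_depth, seed=0):
    """Return a random subset of `reads` of size `target_depth` (no replacement)."""
    if target_depth > len(reads):
        raise ValueError("target depth exceeds library size")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(reads, target_depth)


def detected_transcripts(reads, min_reads=1):
    """Count transcripts supported by at least `min_reads` reads (hypothetical threshold)."""
    counts = {}
    for transcript_id in reads:
        counts[transcript_id] = counts.get(transcript_id, 0) + 1
    return sum(1 for n in counts.values() if n >= min_reads)


# Toy library (made-up proportions): one highly, one moderately,
# and one lowly expressed transcript.
library = ["high"] * 900 + ["medium"] * 90 + ["low"] * 10

for depth in (10, 100, 1000):
    subset = downsample_reads(library, depth, seed=42)
    print(depth, detected_transcripts(subset))
```

In a toy setup like this, the lowly expressed transcript is the one most likely to be missed at shallow depths, which mirrors the qualitative finding reported in the abstract. Real analyses would down-sample aligned or raw read files rather than in-memory labels.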