Location: Genetics and Animal Breeding
Title: Evaluation of transcript assembly in multiple porcine tissues suggests optimal sequencing depth for RNA-Seq using total RNA library
Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: 10/29/2019
Publication Date: 1/15/2020
Citation: Keel, B.N., Oliver, W.T., Keele, J.W., Lindholm-Perry, A.K. 2020. Evaluation of transcript assembly in multiple porcine tissues suggests optimal sequencing depth for RNA-Seq using total RNA library. In: Proceedings of Plant and Animal Genome XXVIII Conference, January 11-15, 2020, San Diego, California. Poster No. PE0128. Available: https://plan.core-apps.com/pag_2020/abstract
Technical Abstract: RNA sequencing (RNA-Seq) libraries are prepared either by selecting polyadenylated (poly(A)) messenger RNAs or by depleting total RNA of highly abundant ribosomal RNAs. The latter facilitates novel transcript discovery by simultaneously characterizing both poly(A) and non-poly(A) RNA. Because total RNA libraries cost more to prepare, determining an optimal target sequencing depth would help researchers optimize the cost-effectiveness of their experiments. The depth of sequencing needed for transcriptome profiling in total RNA-Seq was evaluated using a random sampling method. RNA-Seq was performed on 4 longissimus dorsi libraries, 4 liver libraries, and 8 hypothalamus libraries to produce base libraries of ~130 million (M) reads. Libraries were down-sampled to the appropriate percentage of total reads to obtain approximately 5, 10, 20, 40, 60, 80, 100, and 120 M reads. Aligning reads to the genome and assembling transcripts resulted in 16,647 unique transcripts being identified in muscle, while 19,851 and 26,664 were identified in liver and brain, respectively. Multiple analyses highlighted the following: 1) a sequencing depth below 10 M reads is insufficient for detecting transcript expression, especially for transcripts with medium and low expression; 2) when sequencing depth is above 40 M reads, relatively reliable measurement of expression is expected; and 3) sequencing deeper than 80 M reads does not have a significant effect on the number of transcripts identified. Thus, we propose that a depth of 80 M reads per library is sufficient to identify and quantify expression of transcripts across the genome.
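The down-sampling step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the software used, so this is not the authors' pipeline: the function names `sampling_fractions` and `downsample`, the fixed seed, and the use of integer read identifiers in place of real reads are all assumptions made for illustration. The base depth (~130 M) and the target depths are taken from the abstract.

```python
import random

# Values from the abstract: approximate base library size and target depths,
# both in millions of reads.
BASE_READS_M = 130
TARGET_DEPTHS_M = [5, 10, 20, 40, 60, 80, 100, 120]


def sampling_fractions(base_reads_m, targets_m):
    """Return the fraction of the base library to keep for each target depth.

    Hypothetical helper: the abstract only says libraries were down-sampled
    to the appropriate percentage of total reads.
    """
    return {target: target / base_reads_m for target in targets_m}


def downsample(read_ids, fraction, seed=0):
    """Randomly keep `fraction` of the reads, sampling without replacement.

    `read_ids` stands in for real reads (assumption); a seeded generator
    makes the subsample reproducible.
    """
    rng = random.Random(seed)
    n_keep = round(len(read_ids) * fraction)
    return rng.sample(read_ids, n_keep)


if __name__ == "__main__":
    fractions = sampling_fractions(BASE_READS_M, TARGET_DEPTHS_M)
    for target, fraction in fractions.items():
        print(f"{target:>3} M reads: keep {fraction:.1%} of the base library")
```

For example, reaching about 80 M reads from a ~130 M base library means keeping roughly 61.5% of the reads. In practice the same fraction would typically be passed to a read-level subsampling tool rather than applied to an in-memory list.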