ARS Home » Plains Area » Clay Center, Nebraska » U.S. Meat Animal Research Center » Genetics and Animal Breeding » Research » Publications at this Location » Publication #326689

Title: A comprehensive porcine blood transcriptome

Author
LIU, HAIBO - Iowa State University
Smith, Timothy - Tim
Nonneman, Danny - Dan
DEKKERS, JACK - Iowa State University
TUGGLE, CHRISTOPHER - Iowa State University

Submitted to: International Society for Animal Genetics (ISAG)
Publication Type: Abstract Only
Publication Acceptance Date: 3/15/2016
Publication Date: 7/23/2016
Citation: Liu, H., Smith, T.P.L., Nonneman, D.J., Dekkers, J.C.M., Tuggle, C.K. 2016. A comprehensive porcine blood transcriptome [abstract]. In: Proceedings of the 35th International Society for Animal Genetics (ISAG) Conference, 23-27 July 2016, Salt Lake City, UT. p. 69. P3033. Available: https://www.isag.us/docs/Proceedings/ISAG_Proceedings_2016.pdf

Interpretive Summary:

Technical Abstract: Blood sample analyses are used extensively in high-throughput assays in biomedicine, as well as in animal genetics and physiology research. However, the draft quality of the current pig genome assembly (Sscrofa 10.2) is insufficient for accurate interpretation of many of these assays because its gene and transcript isoform annotations are incomplete. In this study, we assembled a comprehensive de novo blood transcriptome by running Trinity on 162,285,683 paired-end read pairs and 183,116,578 single-end reads, all of them clean, normalized Illumina reads (25 to 100 bases in length) from 5 independent RNA-seq studies of pig blood. The raw assembly consisted of 490,209 “putative transcripts” (PTs) from 397,560 genomic loci and accounted for more than 97% of the normalized reads. To verify the porcine origin of the assembled PTs, we mapped them to a PacBio long-read-based USMARC pig genome assembly (T. Smith et al., unpubl.) and to the Sscrofa 10.2 assembly. Overall, 99.4% and 94.2% of PTs could be mapped to the USMARC and Sscrofa 10.2 assemblies, respectively, with more than 97% coverage. Notably, the majority of the 3,089 PTs that could not be mapped to the USMARC assembly were of bacterial, viral, or plant origin. We removed unmapped PTs that did not align to mammalian sequences in the NCBI nt database, and filtered the remaining PTs using relaxed criteria based on minimum expression level, length, and splicing potential, producing a set of 159,146 unique PTs, 121,057 of which were putative spliced transcripts. We aligned this reduced set of PTs, along with a newly available dataset of IsoSeq transcripts from pig liver, spleen, and thymus (H. Liu et al., unpubl.), to the current Sscrofa 10.2 genome assembly. Visual inspection of these alignments in IGV showed that a large number of the assembled blood transcripts were structurally more complete and accurate than their counterparts in the Sscrofa 10.2 annotation.
We will report further filtering, validation, and annotation of these PTs, including alignment to IsoSeq-derived transcripts and comparisons to the Ensembl pig transcriptome, the NCBI nucleotide database, and the SwissProt UniProt/UniRef90 and EMBL-EBI Xfam databases, to support robust prediction of the coding or non-coding potential of novel PTs. This assembled and validated transcriptome can be used to improve pig genome annotation and enhance future high-throughput studies of blood samples. Acknowledgement: USDA-NIFA-AFRI #2011-68004-30336.
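
The filtering step described in the abstract (dropping unmapped PTs without a mammalian hit, then applying minimum expression and length criteria and noting splicing potential) might be sketched roughly as below. This is an illustration only, not the authors' pipeline: the abstract does not state the thresholds, the expression unit, or how splicing potential entered the filter, so `MIN_LENGTH`, `MIN_EXPRESSION`, the `PutativeTranscript` fields, and the exon-count test for "spliced" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds; the abstract only says the criteria were "relaxed".
MIN_LENGTH = 200        # bases (assumed)
MIN_EXPRESSION = 1.0    # arbitrary expression units (assumed)


@dataclass
class PutativeTranscript:
    """One assembled putative transcript (PT); all field names are illustrative."""
    name: str
    length: int           # transcript length in bases
    expression: float     # expression estimate (unit is an assumption)
    n_exons: int          # exon count inferred from the genome alignment
    mapped: bool          # True if the PT aligned to the pig genome assembly
    mammalian_hit: bool   # True if the PT aligned to a mammalian sequence in NCBI nt


def passes_filter(pt: PutativeTranscript) -> bool:
    """Keep a PT unless it is unmapped with no mammalian hit, too short, or too lowly expressed."""
    if not pt.mapped and not pt.mammalian_hit:
        return False  # likely contaminant (e.g. bacterial, viral, or plant origin)
    return pt.length >= MIN_LENGTH and pt.expression >= MIN_EXPRESSION


def is_putatively_spliced(pt: PutativeTranscript) -> bool:
    """Treat a PT as putatively spliced if its alignment implies more than one exon (assumed rule)."""
    return pt.n_exons > 1


def filter_transcripts(pts: List[PutativeTranscript]) -> List[PutativeTranscript]:
    """Return the PTs that pass the sketch filter, preserving input order."""
    return [pt for pt in pts if passes_filter(pt)]


def count_spliced(pts: List[PutativeTranscript]) -> int:
    """Count the putatively spliced PTs in a list."""
    return sum(1 for pt in pts if is_putatively_spliced(pt))
```

With real data, `filter_transcripts` would be applied to the full raw assembly and `count_spliced` to its output; the counts reported in the abstract (159,146 retained PTs, 121,057 putatively spliced) come from the authors' own criteria and would not be reproduced by these placeholder thresholds.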