United States Department of Agriculture

Agricultural Research Service

Research Project: SoyBase and the Legume Clade Database

Location: Corn Insects and Crop Genetics Research

Title: De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity

Haas, Brian
Papanicolaou, Alexie
Yassour, Moran
Grabherr, Manfred
Blood, Philip
Bowden, Joshua
Couger, Matthew
Eccles, David
Li, Bo
Lieber, Matthias
Weeks, Nathan

Submitted to: Nature Protocols
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/21/2013
Publication Date: 7/11/2013
Citation: Haas, B.J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P.D., Bowden, J., Couger, M.B., Eccles, D., Li, B., Lieber, M., Weeks, N.T. 2013. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity. Nature Protocols. 8(8):1494-1512.

Interpretive Summary: While the cost of whole-genome sequencing has fallen in recent years, whole-genome assembly is still expensive and time-consuming, and it requires a high level of expertise. Transcriptome sequencing with RNA-Seq costs much less than whole-genome sequencing, allowing researchers on a limited budget to identify and characterize genes for organisms that lack assembled genome sequences. De novo transcriptome assembly software is required to identify such genic sequences from RNA-Seq data. Trinity is one of the most popular de novo transcriptome assembly pipelines. Trinity also facilitates downstream analysis; for example, researchers can provide RNA-Seq data for multiple samples (e.g., different tissues or environmental conditions) and use components to analyze differentially expressed genes.

Technical Abstract: De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than five hours.

Last Modified: 10/18/2017