PEARCE, S - University Of California
VAZQUEZ-CROSS, H - University Of California
HERIN, S - University Of California
HANE, D - David L Hane
WANG, Y - University Of California
DUBCOVSKY, J - University Of California
Submitted to: BMC Plant Biology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/17/2015
Publication Date: 12/24/2015
Citation: Pearce, S., Vazquez-Cross, H., Herin, S.Y., Hane, D., Wang, Y., Gu, Y.Q., Dubcovsky, J. 2015. WheatExp: an RNA-seq expression database for polyploid wheat. Biomed Central (BMC) Plant Biology. 15:299.
Interpretive Summary: Wheat is a globally important crop, accounting for 20 percent of the calories consumed by humanity. Major research efforts to increase wheat production have focused on expanding genetic diversity, analyzing key traits, and developing genomic resources. However, the large size and complexity of the wheat genome have been substantial barriers to analyzing its structure and the expression of individual genes. Bread wheat is a hexaploid with three related genomes within a single nucleus, each with its own version of any given gene. The quantitative and qualitative contributions of those versions to the overall expression of a gene, and to the traits it influences, can differ. The recent accumulation of large volumes of gene transcript data generated by next-generation sequencing technologies provides a valuable resource for analyzing the expression patterns of specific genes in the hexaploid genome background. However, processing and analyzing such large volumes of sequence data is technically challenging and time-consuming, which limits their use in gene function studies, particularly for smaller laboratories that lack access to high-performance computing infrastructure. In this article, we used publicly available wheat transcript data to develop a bioinformatics resource that distinguishes among transcripts from each version of each annotated gene in the hexaploid bread wheat genome. Data from multiple studies were processed and compiled into a database, WheatExp, which can be queried either by DNA sequence matching or by searching for a known gene of interest by its name or functional domains. WheatExp is hosted on the GrainGenes website (http://wheat.pw.usda.gov/WheatExp/) for public access and will serve as a critical resource for the research community working on wheat improvement.
Technical Abstract: For functional genomics studies, it is important to understand the dynamic expression profiles of transcribed genes in different tissues, at different stages of development, and in response to environmental stimuli. The proliferation in the use of next-generation sequencing technologies by the plant research community has led to the accumulation of large volumes of expression data. However, analysis of these datasets is complicated by the frequent occurrence of polyploidy among economically important crop species. In addition, processing and analyzing such large volumes of sequence data is a technically demanding and time-consuming task, limiting their application in functional genomics studies, particularly for smaller laboratories that lack access to high-powered computing infrastructure. Wheat is a good example of a young polyploid species with three similar genomes (97% identical among homoeologous genes), rapidly accumulating RNA-seq datasets, and a large research community. We present WheatExp, an expression database and visualization tool to analyze and compare homoeologue-specific transcript profiles across a broad range of tissues from different developmental stages in polyploid wheat. Beginning with publicly available RNA-seq datasets, we developed a pipeline to distinguish between transcripts from each annotated gene in the hexaploid bread wheat genome, including homoeologue-level resolution. Data from multiple studies are processed and compiled into a database which can be queried either by BLAST or by searching for a known gene of interest by name or functional domain. Expression data for multiple genes can be displayed side by side across all expression datasets, providing immediate access to a comprehensive panel of expression data for specific subsets of wheat genes.
The development of a publicly accessible expression database hosted on the GrainGenes website (http://wheat.pw.usda.gov/WheatExp/), coupled with a simple and readily comparable visualization tool, will empower the wheat research community to use RNA-seq data and to perform functional analyses of target genes. The presented expression data are homoeologue-specific, allowing for the analysis of relative contributions from each genome to the overall expression of a gene, a critical consideration for breeding applications. Our approach can be expanded to other polyploid species by adjusting sequence mapping parameters according to the specific divergence of their genomes.
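As an illustration of what "relative contributions from each genome" can mean in practice, the following minimal Python sketch computes each homoeologue's share of a gene's summed expression. It is not part of WheatExp or its pipeline: the function name `homoeologue_shares` and the expression values are hypothetical, and the input is assumed to be already-normalized expression values for the A, B and D genome copies of a single gene.

```python
from typing import Dict


def homoeologue_shares(expression: Dict[str, float]) -> Dict[str, float]:
    """Return each homoeologue's fraction of the summed expression of one gene.

    `expression` maps a genome label (e.g. "A", "B", "D") to a normalized
    expression value. This helper is illustrative only.
    """
    total = sum(expression.values())
    if total == 0:
        # Gene not expressed in this sample: report zero share for every copy.
        return {genome: 0.0 for genome in expression}
    return {genome: value / total for genome, value in expression.items()}


# Hypothetical normalized expression values for the three copies of one gene.
example = {"A": 12.0, "B": 6.0, "D": 6.0}
shares = homoeologue_shares(example)
print(shares)  # {'A': 0.5, 'B': 0.25, 'D': 0.25}
```

In this made-up example the A-genome copy accounts for half of the gene's total expression, the kind of imbalance that homoeologue-specific data makes visible.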