Publication #382136

Research Project: Biotechnology Strategies for Understanding and Improving Disease Resistance and Nutritional Traits in Soybeans and Beans

Location: Soybean Genomics & Improvement Laboratory

Title: Proteomic identification and meta-analysis in Salvia hispanica RNA-Seq de novo assemblies

KLEIN, ASHWIL - University Of The Western Cape
HUSSELMANN, LIZEX - University Of The Western Cape
WILLIAMS, ACHMAT - University Of The Western Cape
BELL, LIAM - University Of The Western Cape
Cooper, Bret
RAGAR, BRENT - Harvard Medical School
TABB, DAVID - Stellenbosch University

Submitted to: Plants
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/28/2021
Publication Date: 4/14/2021
Citation: Klein, A., Husselmann, L.H., Williams, A., Bell, L., Cooper, B., Ragar, B., Tabb, D.L. 2021. Proteomic identification and meta-analysis in Salvia hispanica RNA-Seq de novo assemblies. Plants. 10(4):765.

Interpretive Summary: Chia is a plant native to regions of Mexico and Guatemala and is valued for its nutritional properties, such as high amounts of fiber, antioxidants, minerals, vitamins, protein, and unsaturated fatty acids. Compared to plants like soybean and rice, little DNA information exists for chia to help identify the seed proteins that contribute to its nutritional properties. In this collaborative research effort, scientists from South Africa sequenced RNA from different chia tissues like leaves, roots, and seeds, while scientists at ARS and other centers performed mass spectrometry on leaves, roots, and seeds. The RNA information was then used to interpret the mass spectrometry data and identify the proteins. Three to four thousand proteins were found in leaves, roots, and stems, but only two thousand were found in seeds. Nevertheless, comparative analysis identified sets of proteins unique to each tissue, particularly seeds. These findings will be useful to scientists in government, at universities, or in private industry who want to identify the unique protein nutritional components of chia seed.

Technical Abstract: While proteomics has demonstrated its value for model organisms and for those with mature genome annotations, proteomics has been of less value in non-model organisms that are unaccompanied by draft genome annotations. This project sought to determine the value of RNA-Seq experiments as a basis for establishing a set of protein sequences to represent a non-model organism, in this case the pseudocereal chia. Assembling four publicly available chia RNA-Seq datasets produced transcriptomes with high BUSCO completeness, though the number of transcripts and Trinity ‘genes’ varied considerably among them. After six-frame translation, ProteinOrtho detected substantial numbers of orthologs among other species within the taxonomic order Lamiales. These protein sequence databases demonstrated good identification efficiency for three different LC-MS/MS proteomics experiments, though a seed proteome showed considerable variability in identification of peptides based on seed protein sequence inclusion. These results demonstrate that proteomics laboratories will particularly benefit from generating sequence databases from RNA-Seq studies that incorporate many tissues, particularly if the LC-MS/MS experiments emphasize one of the tissues included in the RNA-Seq experiment.
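
As an illustration of the six-frame translation step mentioned in the abstract, the following is a minimal Python sketch. It is not the pipeline used in the study: the function names (`reverse_complement`, `translate`, `six_frame_translate`) and the frame labels (`+1` to `-3`) are assumptions made for this example, and it uses only the standard genetic code, marks stop codons as `*`, and marks codons containing ambiguous bases as `X`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq):
    """Translate a transcript in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical toy transcript, not taken from the chia assemblies.
    for frame, protein in six_frame_translate("ATGGCCTAA").items():
        print(frame, protein)
```

In a protein sequence database built this way, each frame would typically be split at stop codons and short fragments discarded before the database search; that filtering is omitted here.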