Location: Soybean Genomics & Improvement Laboratory
Title: Proteomic identification and meta-analysis in Salvia hispanica RNA-Seq de novo assemblies
Author
KLEIN, ASHWIL - University Of The Western Cape
HUSSELMANN, LIZEX - University Of The Western Cape
WILLIAMS, ACHMAT - University Of The Western Cape
BELL, LIAM - University Of The Western Cape
Cooper, Bret
RAGAR, BRENT - Harvard Medical School
TABB, DAVID - Stellenbosch University
Submitted to: Plants
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/28/2021
Publication Date: 4/14/2021
Citation: Klein, A., Husselmann, L.H., Williams, A., Bell, L., Cooper, B., Ragar, B., Tabb, D.L. 2021. Proteomic identification and meta-analysis in Salvia hispanica RNA-Seq de novo assemblies. Plants. 10(4):765. https://doi.org/10.3390/plants10040765.
DOI: https://doi.org/10.3390/Plants10040765
Interpretive Summary: Chia is a plant native to regions of Mexico and Guatemala, valued for its nutritional properties such as high amounts of fiber, antioxidants, minerals, vitamins, protein, and unsaturated fatty acids. Compared to plants like soybean and rice, little DNA information exists for chia to help identify the seed proteins that contribute to its nutritional properties. In this collaborative research effort, scientists from South Africa sequenced RNA from different chia tissues like leaves, roots, and seeds while scientists at ARS and other centers performed mass spectrometry on leaves, roots and seeds. The RNA information was then used to interpret the mass spectrometry data to identify the proteins. Three to four thousand proteins were found in leaves, roots, and stems, but only two thousand were found in seeds. Nevertheless, comparative analysis identified sets of proteins unique to each tissue, particularly seeds. These findings will be useful to scientists in government, at universities, or in private industry who want to identify the unique protein nutritional components of chia seed.
Technical Abstract: While proteomics has demonstrated its value for model organisms and for those with mature genome annotations, proteomics has been of less value in non-model organisms that are unaccompanied by draft genome annotations. This project sought to determine the value of RNA-Seq experiments as a basis for establishing a set of protein sequences to represent a non-model organism, in this case the pseudocereal chia.
Assembling four publicly available chia RNA-Seq datasets produced transcriptomes with high BUSCO completeness, though the number of transcripts and Trinity ‘genes’ varied considerably among them. After six-frame translation, ProteinOrtho detected substantial numbers of orthologs among other species within the taxonomic order Lamiales. These protein sequence databases demonstrated good identification efficiency for three different LC-MS/MS proteomics experiments, though a seed proteome showed considerable variability in identification of peptides based on seed protein sequence inclusion. These results demonstrate that proteomics laboratories will particularly benefit from generating sequence databases from RNA-Seq studies that incorporate many tissues, particularly if the LC-MS/MS experiments emphasize one of the tissues included in the RNA-Seq experiment.
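The six-frame translation step mentioned in the abstract can be illustrated with a minimal Python sketch. The record does not say which software the authors used for this step, so this is only an illustration of the general technique. The function names (reverse_complement, translate, six_frame_translation) and the frame labels (+1 to +3, -1 to -3) are hypothetical, and the sketch assumes the standard genetic code and an unambiguous nucleotide sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate a transcript in three forward and three reverse frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons appear as `*` in the output. A real database-building workflow would typically split each frame at stop codons and keep only open reading frames above a length threshold, which this sketch leaves out.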