Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 10/25/2004
Publication Date: 2/1/2005
Citation: Bridges, S.M., Hodges, J.E., Wang, Y., Xian, H., Luthe, D., Williams, W.P. 2005. Computational support for research in maize proteomics and marker assisted selection [abstract]. Proceedings 2004 4th Annual Fungal Genomics, 5th Annual Multi-Crop Fumonisin Elimination and 17th Annual Multi-crop Aflatoxin Elimination Workshop. p. 54.
Technical Abstract: The Corn Host Plant Resistance Research Unit, USDA-ARS, Mississippi State University is conducting several types of studies to determine the effects of biotic and abiotic factors on Aspergillus flavus infection and aflatoxin accumulation in maize, with the goal of developing resistant maize cell lines. Computational support for this research has provided scientists with improved methods for data access and retrieval, new methods to analyze the effects of environmental factors on aflatoxin accumulation in different cell lines, and tools to improve protein identification rates in proteomics studies and to support efficient and informative annotation of the proteome. An integrated database system with data management, data mining, and data modeling capabilities has been developed that provides a comprehensive view of the maize genetics research at MSU. The database archives raw data, derived data, and metadata collected or generated by the biologists. The database system has five partitions: germplasm data, field data, quantitative trait loci (QTL) analysis data, proteomics data, and weather data. A web-based interface gives investigators fast, flexible access to the database system. The effects of environmental variables on aflatoxin levels are complex and difficult to predict. A genetic algorithm approach has been developed to extract features describing environmental factors that are correlated with aflatoxin levels. Data available for this study include aflatoxin levels at maturity, middle silk (flowering) date, and environmental data from 1998 to 2003. Functions for computing values of environmental variables are represented as a population of randomly initialized artificial chromosomes. Each chromosome represents the environmental variable to be measured, the period of time over which it should be measured, and how the values should be combined over the interval.
The correlation between the index specified by each chromosome and the measured aflatoxin levels is used to evaluate the "fitness" of each chromosome. The highest scoring environmental variables derived using this method were found to have R² values greater than 0.90. Characterization of the maize proteome of the developing ear under different conditions has the potential to reveal the fundamental processes that confer resistance in some cell lines. Advances in proteomics have been made possible by high-throughput methods for gel electrophoresis and new technologies for mass spectrometry such as LC/MS/MS. However, these technologies can only be used to their full potential for protein identification if they are supported by high-quality, annotated protein data sets. A method based on homology search was developed for generating such a data set from EST clusters. The effectiveness of this method for generating a high-quality annotated set of translated ESTs for protein identification was tested on proteins from cob by comparing the identification rate obtained with a data set (called the PIE Maize set) generated from assembled ESTs available from TIGR against the rate obtained with the NCBI protein database for corn, rice, and Arabidopsis. The PIE data set resulted in identification of 87.5% of the spots, while only 56% of the spots were identified with the NCBI database. Additional tools have been developed to streamline the protein identification process and to provide Gene Ontology annotation of the identified proteins.
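The genetic algorithm described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that each chromosome encodes a variable, a time period, and a combining rule, and that fitness is based on correlation with measured aflatoxin levels. Everything else here is an assumption made for illustration: the variable names (`temp_max`, `rainfall`), the window bounds, the aggregators, the selection, crossover, and mutation operators, and the helper `make_synthetic_seasons`, which produces made-up data rather than any measurements from the study.

```python
import random
import statistics

# All names and bounds below are hypothetical, chosen only for illustration.
VARIABLES = ("temp_max", "rainfall")
AGGREGATORS = {"mean": statistics.fmean, "sum": sum, "max": max}
MAX_OFFSET = 30   # latest window start, in days after the mid-silk date
MAX_LENGTH = 20   # longest window, in days


def random_chromosome(rng):
    """Chromosome = (variable, window start, window length, aggregator)."""
    return (rng.choice(VARIABLES), rng.randrange(MAX_OFFSET),
            rng.randint(1, MAX_LENGTH), rng.choice(sorted(AGGREGATORS)))


def index_value(chrom, season):
    """Compute the environmental index a chromosome specifies for one season."""
    var, start, length, agg = chrom
    return AGGREGATORS[agg](season[var][start:start + length])


def r_squared(xs, ys):
    """Squared Pearson correlation; 0.0 when either series is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)


def fitness(chrom, seasons, toxin):
    """Fitness = R^2 between the chromosome's index and aflatoxin levels."""
    return r_squared([index_value(chrom, s) for s in seasons], toxin)


def crossover(a, b, rng):
    """Single-point crossover of two 4-gene chromosomes (assumed operator)."""
    cut = rng.randint(1, 3)
    return a[:cut] + b[cut:]


def mutate(chrom, rng, rate=0.2):
    """With probability `rate`, redraw one randomly chosen gene."""
    if rng.random() >= rate:
        return chrom
    i = rng.randrange(4)
    return chrom[:i] + (random_chromosome(rng)[i],) + chrom[i + 1:]


def evolve(seasons, toxin, pop_size=30, generations=40, seed=0):
    """Keep the fitter half each generation and refill with offspring."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]

    def score(chrom):
        return fitness(chrom, seasons, toxin)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[:pop_size // 2]          # survivors are kept unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    best = max(pop, key=score)
    return best, score(best)


def make_synthetic_seasons(n, seed=1):
    """Made-up data: toxin tracks mean temp_max over days 10-19, plus noise."""
    rng = random.Random(seed)
    days = MAX_OFFSET + MAX_LENGTH
    seasons, toxin = [], []
    for _ in range(n):
        season = {v: [rng.uniform(0.0, 40.0) for _ in range(days)]
                  for v in VARIABLES}
        seasons.append(season)
        toxin.append(statistics.fmean(season["temp_max"][10:20])
                     + rng.gauss(0.0, 0.5))
    return seasons, toxin


if __name__ == "__main__":
    seasons, toxin = make_synthetic_seasons(12)
    best, score = evolve(seasons, toxin)
    print("best chromosome:", best, "R^2 = %.3f" % score)
```

Because the fitter half of the population survives each generation unchanged, the best fitness in this sketch never decreases. With only a few seasons of data, as in a six-year record, a search over many candidate indices can reach a high R² by chance, so any index found this way would need to be checked against data held out from the search.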