Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 10/16/2006
Publication Date: 2/1/2007
Citation: Bridges, S.M., Kelley, R., Magee, G.B., Wang, N., Burgess, S., Luthe, D., Williams, W.P. 2007. Annotation and computational analysis of maize proteome and gene expression datasets [abstract]. Proceedings 2006 Annual Multi-Crop Aflatoxin/Fumonisin Elimination and Fungal Genomics Workshop. p. 56.
Technical Abstract: Characterization of the transcriptome and proteome of the developing maize ear under different conditions has the potential to reveal the fundamental processes that confer resistance in some cell lines. We have previously reported the development of the PepIdent database of translated ESTs for maize for identification of proteins using mass spectrometry, the PepSort tool for analysis of multi-dimensional protein identification technology (MudPIT) datasets, and the AgBase suite of tools for functional analysis of protein datasets. During the past year, the AgBase suite of tools has been extended for annotation of gene expression data, a new method has been developed for validation of peptide identifications, a semi-quantitation method has been developed for MudPIT proteomics data, and methods for characterizing changes in expression of both RNA and protein have been investigated. Protein identification from Sequest analysis of MudPIT data depends on accurate peptide identifications from trypsin-digested mixtures. Sequest provides several scoring outputs for each peptide that allow a scientist to assess the quality of the peptide identifications, but the standard practice of applying fixed thresholds to these scores can either reduce the number of peptide identifications or admit an unacceptable number of false positive identifications. We have developed a new method that compares identifications from a random database and the real database at different scoring values and uses an outlier detection approach to determine which identifications should be accepted. Relative quantification of proteins in MudPIT datasets has typically been done using differential labeling with agents such as ICAT. We have implemented tools for relative quantification of proteins that do not require labeling. MaizeGO is part of AgBase (www.agbase.msstate.edu), a curated, open-source, Web-accessible resource for functional analysis of maize gene products.
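The idea of accepting only those real-database identifications that stand out from random-database identifications can be illustrated with a minimal sketch. The abstract does not specify the outlier detection method, so the rule used here (median plus k scaled median absolute deviations of the random-database scores) is an assumption chosen for simplicity, not the authors' procedure. The function names, the single score per peptide, and the toy values are likewise hypothetical.

```python
from statistics import median

def mad_outlier_cutoff(decoy_scores, k=3.0):
    """Score cutoff from random-database hits: median + k * scaled MAD.

    Assumption: a single higher-is-better score per peptide match.
    The factor 1.4826 scales the MAD to approximate a standard deviation
    for normally distributed data.
    """
    med = median(decoy_scores)
    mad = median(abs(s - med) for s in decoy_scores)
    return med + k * 1.4826 * mad

def accept_identifications(real_hits, decoy_scores, k=3.0):
    """Keep real-database hits whose score is an outlier relative to the
    random-database score distribution."""
    cutoff = mad_outlier_cutoff(decoy_scores, k)
    return [(peptide, score) for peptide, score in real_hits if score > cutoff]

# Toy data (hypothetical): scores from a search against a random database,
# and (peptide, score) pairs from the search against the real database.
decoy_scores = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8]
real_hits = [("PEPA", 3.2), ("PEPB", 1.1), ("PEPC", 1.5)]

cutoff = mad_outlier_cutoff(decoy_scores)
accepted = accept_identifications(real_hits, decoy_scores)
```

Because the cutoff is derived from the random-database scores of each search rather than fixed in advance, it adapts to the dataset, which is the motivation the abstract gives for moving away from fixed thresholds.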
The AgBase Goanna tool has been extended to provide functional annotations for gene expression data. The parent sequences of the probes of interest from the array (those that are differentially regulated) are provided as input in FASTA format. A blastn search is performed against the AgBase database, and a user-specified number of top hits (those passing a user-specified E-value cutoff) are returned along with their GO annotations. For each hit, a link is provided to the alignment that produced it so that the scientist can evaluate the quality of the alignment. After review, the GO categories from the accepted alignments can be analyzed. The GO Slim Viewer tool provides an overview of the membership in GO categories of a gene expression data set using categories defined in a GO Slim. Output is in a form that can be easily imported into Excel for formatting as a pie chart. When this tool was used with a gene expression study of the differences between Aspergillus-inoculated and uninoculated plants of two different varieties, a few of the GO terms came from annotated maize proteins, but most came from homology with rice or Arabidopsis proteins. Efforts are underway to integrate these results with systems biology tools such as Cytoscape for modeling biological networks.
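The hit selection and GO Slim summary steps described above can be sketched as follows. This is not the Goanna or GO Slim Viewer implementation: the record layout, function names, cutoff values, and the small GO-to-slim mapping are all assumptions made for illustration, and the hits are toy data rather than real search results.

```python
from collections import Counter, defaultdict

def top_hits(hits, max_evalue=1e-10, top_n=3):
    """Keep up to top_n hits per query with E-value at or below max_evalue,
    ordered from best (smallest E-value) to worst."""
    by_query = defaultdict(list)
    for hit in hits:
        if hit["evalue"] <= max_evalue:
            by_query[hit["query"]].append(hit)
    return {q: sorted(hs, key=lambda h: h["evalue"])[:top_n]
            for q, hs in by_query.items()}

def slim_counts(selected, slim_map):
    """Count query sequences per GO Slim category.
    Each query is counted at most once per category."""
    counts = Counter()
    for hs in selected.values():
        slims = {slim_map[go] for h in hs for go in h["go"] if go in slim_map}
        counts.update(slims)
    return counts

def to_tsv(counts):
    """Tab-delimited text, which spreadsheet programs can import directly."""
    lines = ["slim_term\tsequences"]
    lines += [f"{term}\t{n}" for term, n in sorted(counts.items())]
    return "\n".join(lines)

# Toy hits (hypothetical subjects and E-values).
hits = [
    {"query": "probe1", "subject": "riceA", "evalue": 1e-50, "go": ["GO:0006952"]},
    {"query": "probe1", "subject": "athB", "evalue": 1e-5, "go": ["GO:0006096"]},
    {"query": "probe2", "subject": "maizeC", "evalue": 1e-30, "go": ["GO:0006096"]},
    {"query": "probe2", "subject": "riceD", "evalue": 1e-20,
     "go": ["GO:0006952", "GO:0006096"]},
]
# Illustrative mapping from specific terms to broader slim terms:
# defense response -> response to stress; glycolytic process -> metabolic process.
slim_map = {"GO:0006952": "GO:0006950", "GO:0006096": "GO:0008152"}

selected = top_hits(hits, max_evalue=1e-10, top_n=2)
table = to_tsv(slim_counts(selected, slim_map))
```

In this sketch the review step is reduced to an E-value filter; in the workflow described above, the scientist inspects each alignment before its GO categories are accepted.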