Kayser, Jean Patrick
Vallet, Jeffrey - Jeff
Submitted to: Journal of Biomolecular Techniques
Publication Type: Abstract Only
Publication Acceptance Date: 12/20/2003
Publication Date: 3/20/2004
Citation: Kayser, J.R., Vallet, J.L., Cerny, R.L. 2004. Defining parameters for homology tolerant database searching [abstract]. Journal of Biomolecular Techniques. 15(1):15.
Technical Abstract: Comparison of observed masses obtained by MS/MS to predicted masses from sequence databases does not work well for species with limited sequence information, because an exact sequence match is required. Homology searching (e.g., BLAST, MS-homology from ProteinProspector) is relatively uncharacterized. Our objective was to define a strategy for this analysis. MS/MS data from 9 proteins were generated during our ongoing examination of the swine intrauterine proteome using 2-D PAGE, trypsin digestion, and a QTOF Ultima API interfaced with an LC Packings nano-HPLC system. Peak lists were generated using MassLynx NT software (Version 3.5, Micromass UK Ltd). The 20 most intense peptides, ranked either by precursor trigger intensity or by total ion current, were de novo sequenced using PEAKS (Bioinformatics Solutions, Inc.). For each ranking method, sequences from the most intense 5, 10, or 20 peptides were searched against the NCBInr mammals database using MS-homology, allowing for 10, 30, or 50% mismatch (2 x 3 x 3 factorial design). Protein scores were similar between ranking methods and were greatest when 20 peptides were submitted and at least 30% mismatch was allowed (p < 0.01). However, sets of random peptide sequences generated similar patterns in protein scores. Thus, a specific protein score was calculated for each of the 9 proteins by subtracting the mean protein score of the random peptide sets plus 2 standard deviations from the observed protein score. The greatest average specific protein score was obtained using 30% mismatch and 20 peptides (p < 0.01). These data indicate that for species where sequence information is limited, MS-homology using the 20 most intense peptides based on trigger intensity, allowing for 30% mismatch, and subtracting random peptide protein scores provides a reliable method for protein identification.
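The factorial design and the random-peptide correction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the label strings for the ranking methods, and the use of the sample (n - 1) standard deviation are all assumptions, since the abstract does not say how the standard deviation was computed. The numbers in the usage note are invented for illustration and are not results from the study.

```python
from itertools import product
from statistics import mean, stdev

# Factor levels as stated in the abstract; the label strings are illustrative.
RANKING_METHODS = ("trigger_intensity", "total_ion_current")
PEPTIDE_COUNTS = (5, 10, 20)
MISMATCH_PERCENTS = (10, 30, 50)


def design_cells():
    """Return every (ranking method, peptide count, % mismatch) combination
    of the 2 x 3 x 3 factorial design."""
    return list(product(RANKING_METHODS, PEPTIDE_COUNTS, MISMATCH_PERCENTS))


def specific_score(protein_score, random_scores, n_sd=2.0):
    """Subtract (mean + n_sd * SD) of the random-peptide protein scores
    from an observed protein score.

    Assumption: the sample standard deviation is used; the abstract does
    not specify sample versus population SD.
    """
    threshold = mean(random_scores) + n_sd * stdev(random_scores)
    return protein_score - threshold
```

For example, with a hypothetical observed protein score of 100 and hypothetical random-peptide scores of 10, 20, and 30 in the same design cell, `specific_score(100, [10, 20, 30])` subtracts a threshold of 20 + 2 x 10 = 40. A positive specific score means the observed score exceeds what random peptide sequences would typically produce in that cell.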