Research Project: GENOMIC AND PROTEOMIC ANALYSIS OF FOODBORNE PATHOGENS

Location: Molecular Characterization of Foodborne Pathogens

Title: BS-KNN: an effective algorithm for predicting protein subchloroplast localization

Authors
Hu, Jing
Yan, Xianghe

Submitted to: Evolutionary Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: December 20, 2011
Publication Date: January 5, 2012
Citation: Hu, J., Yan, X. 2012. BS-KNN: an effective algorithm for predicting protein subchloroplast localization. Evolutionary Bioinformatics. 8:79-87.

Interpretive Summary: Although the role of chloroplasts as the photosynthetic apparatus in cells of green plants and eukaryotic algae is well defined, little is known about the location of the proteins residing in the intricate network of membranes within the chloroplast. Chloroplasts are enveloped by four membrane layers, which makes them an ideal system, from computational, experimental, and theoretical points of view, for studying protein sub-cellular localization. Identifying the sub-cellular location of these proteins could deepen understanding of protein-protein interactions and support protein function prediction. In this study, we present a computer-based method to predict the location of proteins and to assign them to defined regions within the chloroplast. This case study could also serve as a model for developing and applying new algorithms and software for functional prediction of bacterial proteins.

Technical Abstract: Chloroplasts are organelles found in cells of green plants and eukaryotic algae that conduct photosynthesis. Knowing a protein’s subchloroplast location provides in-depth insight into the protein’s function and the microenvironment where it interacts with other molecules. Despite the chloroplast proteome projects and several computational methods for identifying chloroplast proteins, only a very limited number of methods exist for predicting proteins’ subchloroplast locations. In this paper, we present a bit-score weighted K-nearest neighbor method for predicting protein subchloroplast locations. The method makes predictions based on the bit-score weighted Euclidean distance calculated from the composition of selected pseudo-amino acids. Our method achieves 76.4% overall accuracy in assigning proteins to 4 subchloroplast locations in cross-validation. When tested on an independent set that was not seen by the method during training and feature selection, the method achieves a consistent overall accuracy of 76.0%. Comparisons showed that it outperformed previously published methods. The method was also applied to predict subchloroplast locations of proteins in the chloroplast proteome. The software and datasets of the proposed method are available at https://edisk.fandm.edu/jing.hu/bsknn/bsknn.html.
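
The general idea of a bit-score weighted K-nearest neighbor classifier can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weighting formula, so the rule used here (Euclidean distance divided by one plus the alignment bit score) is an assumption chosen only for illustration. The function names, the two-element feature vectors, the bit scores, and the location labels are likewise hypothetical.

```python
import math
from collections import Counter


def weighted_distance(x, y, bit_score):
    """Euclidean distance between two feature vectors, shrunk by bit score.

    ASSUMPTION: dividing by (1 + bit_score) is an illustrative weighting,
    not the formula from the paper. A higher alignment bit score makes a
    training protein count as "closer" to the query.
    """
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return euclid / (1.0 + bit_score)


def predict(query, training, bit_scores, k=3):
    """Return the majority label among the k nearest training proteins.

    training   : list of (feature_vector, location_label) pairs
    bit_scores : alignment bit score of the query against each training
                 protein, in the same order (0 if there is no hit)
    """
    ranked = sorted(
        (weighted_distance(query, feats, score), label)
        for (feats, label), score in zip(training, bit_scores)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: hypothetical 2-feature composition vectors and labels.
training = [
    ([0.10, 0.20], "stroma"),
    ([0.15, 0.25], "stroma"),
    ([0.80, 0.90], "thylakoid membrane"),
    ([0.85, 0.80], "thylakoid membrane"),
    ([0.50, 0.50], "envelope"),
]
query = [0.12, 0.22]
bit_scores = [50.0, 40.0, 0.0, 0.0, 5.0]  # hypothetical alignment scores

predicted = predict(query, training, bit_scores, k=3)
print(predicted)
```

In this toy case the two "stroma" proteins are both close in feature space and have high bit scores, so they dominate the three nearest neighbors. In practice the feature vectors would come from the selected pseudo-amino acid composition and the bit scores from a sequence alignment tool; both are stubbed out here.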

   

 
Project Team
Fratamico, Pina
Yan, Xianghe
Gunther, Nereus - Jack
Liu, Yanhong
 
Related National Programs
  Food Safety, (animal and plant products) (108)
 
Related Projects
   MOLECULAR SEROTYPING AND CHARACTERIZATION OF SHIGA TOXIN-PRODUCING E. COLI (STEC)
   ENHANCEMENT OF MINORITY STUDENT PARTICIPATION IN FOOD SAFETY
   GENOMICS AND PROTEOMIC TECHNOLOGY FOR MICROBIAL PATHOGEN CHARACTERIZATION AND IDENTIFICATION
 
 
Last Modified: 05/19/2013