Akavaram, Surya Tej
SCHAUT, ROBERT - Oak Ridge Institute For Science And Education (ORISE)
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/25/2019
Publication Date: 3/8/2019
Citation: Sharma, V.K., Akavaram, S.N., Schaut, R.G., Bayles, D.O. 2019. Comparative genomics reveals structural and functional features specific to the genome of a foodborne Escherichia coli O157:H7. BMC Genomics. 20(1). Article 196. https://doi.org/10.1186/s12864-019-5568-6.
Interpretive Summary: Escherichia coli O157:H7 (O157) causes sporadic outbreaks of diarrheal illness in humans, which in children and the elderly can lead to kidney failure and even death. Cattle are considered the major reservoir for O157 bacteria, and shedding of O157 bacteria by cattle in their feces is the major risk factor for contamination of meat at slaughter plants. Run-off water from cattle farms could also serve as a source of contamination for vegetables and other produce. O157 bacteria are considered emerging pathogens, and novel types of O157 bacteria capable of causing severe disease symptoms and greater morbidity and mortality have been increasingly linked to human disease outbreaks. Thus, identifying the underlying genetic factors that make some O157 bacteria more potent in causing human disease would support the development of methods for rapid detection of virulent O157 bacteria. The availability of technologies for cheaper and faster decoding and analysis of bacterial genetic information is helping to identify genetic fingerprints unique to different O157 bacterial populations. We tested one such technology by comparing the whole genome of an O157 strain isolated in 1986 from a human disease outbreak with those of other O157 strains linked to disease outbreaks reported at different times and in different geographic regions. This comparison identified a genetic fingerprint, based on single nucleotide polymorphisms (SNPs), specific to the 1986 O157 strain. In practice, this SNP fingerprint could serve as a reference for determining how O157 genetic fingerprints change in cattle or in the environment, and what effect these changes have on the disease-causing potential of O157 bacteria with newer genetic fingerprints.
Technical Abstract: Escherichia coli O157:H7 (O157) strains have caused human disease outbreaks since 1982. Genome comparisons of outbreak strains are crucial for understanding their epidemiology, evolution, metabolism, and virulence potential. In the current study, we identified single nucleotide polymorphisms (SNPs) in the foodborne O157 strain NADC 6564 by using a SNP-finding software pipeline and two reference O157 strains. A SNP-based phylogenetic tree grouped NADC 6564 with the lineage I O157 strains and showed its divergence from this group due to the presence of additional SNPs. The KEGG annotation of proteins encoded by the genes containing missense (non-synonymous) SNPs revealed that the affected proteins were distributed among various categories (genetic information processing, metabolism, signaling and cellular processes, and unknown or uncharacterized). Since the first SNP-finding pipeline predicted a “moderate” impact for all missense SNPs, we used a second pipeline to validate two of these initial predictions by analyzing the structural and functional effects of the missense SNPs on tryptophanase (TnaA) and arabinose import ATP-binding protein (AraG). The second pipeline identified the protein domains containing the missense amino acids, estimated the stability of the mutant proteins, predicted ‘neutral’ effects of the missense SNPs on mutant TnaA and AraG functions, and generated numerical and visual (heat map) outputs for the predicted effects. The predicted ‘neutral’ effect of the missense amino acid on TnaA function was validated by demonstrating that the mutant and the wild-type TnaA produced similar amounts of indole from tryptophan. Similarly, the predicted ‘neutral’ effect of the missense amino acid on AraG function of NADC 6564 was confirmed by the observation that, in the presence of arabinose, NADC 6564 grew at a rate similar to EDL933 (wild-type AraG), but slightly slower than the Sakai strain (mutant AraG as in NADC 6564).
In summary, the results of this study indicate that the first pipeline can be used initially to identify SNP types (synonymous or non-synonymous), and that the identified SNPs can then be used to construct phylogenetic trees. However, a second software pipeline and functional assays are necessary to confirm the actual effects of missense SNPs on the structure and function of the affected proteins.
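The distinction between synonymous and missense (non-synonymous) SNPs that the abstract relies on can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which is not named here. The function name `classify_snp`, the toy coding sequence, and the use of the standard codon assignments (which bacterial translation table 11 shares) are assumptions made for illustration only.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (assumption: these also
# apply to the bacterial genes of interest, as in NCBI translation table 11).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds: str, pos: int, alt: str) -> str:
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: hypothetical coding sequence, reading frame starting at index 0
    pos: 0-based position of the substituted base
    alt: the alternate base
    """
    cds, alt = cds.upper(), alt.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 <= pos < len(cds):
        raise ValueError("position lies outside the coding sequence")
    if alt not in BASES or cds[pos] == alt:
        raise ValueError("alt must be a base that differs from the reference")

    # Locate the codon containing the SNP and build its mutated version.
    offset = pos % 3
    start = pos - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # encoded amino acid is unchanged
    if alt_aa == "*":
        return "nonsense"     # substitution introduces a stop codon
    if ref_aa == "*":
        return "stop_lost"    # substitution removes a stop codon
    return "missense"         # encoded amino acid changes


# Toy example: ATG GCT AAA encodes Met-Ala-Lys.
toy_cds = "ATGGCTAAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC, Ala -> Ala
print(classify_snp(toy_cds, 4, "T"))  # GCT -> GTT, Ala -> Val
```

A classification of this kind only labels the SNP type. As the summary above notes, a "missense" label says nothing about whether the protein's function actually changes, which is why structural prediction and functional assays are still needed.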