|LOUHA, SWARNALI - University Of Georgia|
|Meinersmann, Richard - Rick|
|ABDO, ZAID - Colorado State University|
|GLENN, TRAVIS - University Of Georgia|
Submitted to: Applied and Environmental Microbiology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/11/2020
Publication Date: 12/17/2020
Citation: Louha, S., Meinersmann, R.J., Abdo, Z., Berrang, M.E., Glenn, T. 2020. An open-source program (Haplo-ST) for whole-genome sequence typing shows extensive diversity among Listeria monocytogenes isolates in outdoor environments and poultry processing plants. Applied and Environmental Microbiology. 87:e02248. https://doi.org/10.1128/AEM.02248-20.
Interpretive Summary: Listeria monocytogenes (Lm) is an important foodborne pathogen frequently found in the environment. Whole-genome sequencing (WGS) is used to characterize strains, which can help in tracking the organism. WGS produces large amounts of data, and several algorithms have been created to interpret them. One useful algorithm is whole-genome multi-locus sequence typing (wgMLST), a gene-by-gene analysis that determines which genes are present in a given strain and which version (allele) of each gene it carries. wgMLST yields a highly discriminating typing of the organism and provides information for forming hypotheses about the relative importance of each gene to the makeup of a population. The software tools for these analyses need improvement. The first goal of this project was therefore to create software that takes raw sequence data as input, assembles it into sequences for all the genes (loci), and identifies the version of each against a standard database. The results are then incorporated into a database that formats them for comparing strains. The software was applied to two data sets of Lm sequences. The first consisted of strains from a river in northeastern Georgia, categorized by the riparian characteristics of the sites where they were isolated. The second consisted of strains isolated from two further-processing plants, categorized by whether the strain was persistent in the plant or found only transiently. A total of 2554 genes were analyzed. Of these, 111 genes were identified as possibly linked to whether a strain was persistent or transient in processing plants. These include genes related to metabolism, membrane transport, oxidative stress, and chemotactic functions. Chemotaxis is important for niche localization, which may help the bacteria establish persistent colonization.
Thus, we succeeded in rapidly analyzing Lm genomic sequences, yielding information that will direct future hypotheses.
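The gene-by-gene idea behind wgMLST can be illustrated with a minimal Python sketch. This is not the code of the software described here; the locus names, sequences, and allele numbers are invented for illustration, and a real analysis would look up alleles in a curated standard database rather than in a small in-memory dictionary.

```python
# Hypothetical allele database: locus -> {allele sequence: allele number}.
# All names and sequences below are made up for illustration.
ALLELE_DB = {
    "locusA": {"ATGGCTAAA": 1, "ATGGCTAAG": 2},
    "locusB": {"ATGCCCTGA": 1},
}


def assign_profile(assembled_genes):
    """Map each locus to an allele number.

    None  -> the locus was not assembled for this isolate.
    "new" -> a sequence was assembled but matches no known allele.
    """
    profile = {}
    for locus, known_alleles in ALLELE_DB.items():
        seq = assembled_genes.get(locus)
        if seq is None:
            profile[locus] = None
        else:
            # Exact-match lookup; case is normalized first.
            profile[locus] = known_alleles.get(seq.upper(), "new")
    return profile


# Assembled gene sequences for one hypothetical isolate.
isolate = {"locusA": "atggctaag", "locusB": "ATGCCCTGG"}
profile = assign_profile(isolate)
print(profile)
```

The resulting dictionary is the isolate's allelic profile: one allele call per locus, which is the form in which strains are compared.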
Technical Abstract: A reliable and standardized classification of Listeria monocytogenes (Lm) is of pivotal importance for accurate strain identification during outbreak investigations. Current whole-genome sequencing (WGS) based approaches for strain characterization either lack standardization, rendering them less suitable for data exchange, or are not freely available. Thus, we developed a portable, open-source tool, Haplo-ST, to improve standardization and provide maximum discriminatory potential to WGS data tied to an MLST (multilocus sequence typing) framework. Haplo-ST performs whole-genome MLST (wgMLST) for Lm while allowing for data exchangeability worldwide. This tool (i) takes raw WGS reads as input, (ii) cleans the raw data according to user-specified parameters, (iii) assembles genes across loci by mapping to genes from reference strains, and (iv) assigns allelic profiles to assembled genes and provides a wgMLST subtyping for each isolate. Data exchangeability relies on the tool assigning allelic profiles based on a centralized nomenclature defined by the widely used BIGSdb-Lm database. Tests of Haplo-ST’s performance with simulated reads from Lm reference strains yielded a high sensitivity of 97.5%, and coverage depths of ≥20× were found to be sufficient for wgMLST profiling. We used Haplo-ST to characterize and differentiate between two groups of Lm isolates, derived from the natural environment and poultry processing plants. Phylogenetic reconstruction showed sharp delineation of lineages within each group, and no lineage specificity was observed with isolate phenotypes (transient/persistent) or origins. Genetic differentiation analyses between isolate groups identified 21 significantly differentiated loci, potentially enriched for adaptation and persistence of Lm within poultry processing plants. Importance: We have developed an open-source tool that provides allele-based subtyping of Lm isolates at the whole-genome level.
Along with allelic profiles, this tool also generates allele sequences and identifies paralogs, which is useful for phylogenetic tree reconstruction and for deciphering relationships between closely related isolates. More broadly, Haplo-ST is flexible and can be adapted to characterize the genome of any haploid organism simply by installing an organism-specific gene database. Haplo-ST also allows for scalable subtyping of isolates; fewer genes can be used for low-resolution typing, whereas higher resolution can be achieved by increasing the number of genes used in the analysis. Our tool enabled clustering of Lm isolates into lineages and detection of potential loci for adaptation and persistence in food processing environments. Findings from these analyses highlight the effectiveness of Haplo-ST in subtyping and evaluating relationships between isolates for routine surveillance, outbreak investigations, and source tracking.
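The scalable subtyping described above can be sketched as a comparison of allelic profiles over a chosen set of loci. The following Python example is an illustrative assumption, not Haplo-ST's implementation: the function name `allelic_distance`, the gene names, and the allele numbers are all hypothetical, and the distance used is a simple count of differing allele calls.

```python
def allelic_distance(profile_a, profile_b, loci=None):
    """Count loci where both isolates have an allele call and the calls differ.

    loci: optional subset of loci to compare; defaults to all loci present
    in both profiles. Loci with a missing call (None) are skipped.
    """
    if loci is None:
        loci = profile_a.keys() & profile_b.keys()
    differences = 0
    for locus in loci:
        a = profile_a.get(locus)
        b = profile_b.get(locus)
        if a is not None and b is not None and a != b:
            differences += 1
    return differences


# Hypothetical allelic profiles: locus -> allele number (None = no call).
iso1 = {"g1": 1, "g2": 4, "g3": 2, "g4": None}
iso2 = {"g1": 1, "g2": 5, "g3": 3, "g4": 7}

full = allelic_distance(iso1, iso2)                   # all shared loci
reduced = allelic_distance(iso1, iso2, ["g1", "g2"])  # low-resolution subset
print(full, reduced)
```

With more loci included, more differences between closely related isolates become visible, which is the sense in which resolution scales with the number of genes analyzed.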