Publication #410996

Research Project: Elucidating the Factors that Determine the Ecology of Human Pathogens in Foods

Location: Produce Safety and Microbiology Research

Title: DeepPL: a deep-learning-based tool for the prediction of bacteriophage lifecycle

Zhang, Yujie
Mao, Mark - Kansas State University
Zhang, Robert - Kansas State University
Liao, Yen-Te
Wu, Vivian

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 2/28/2024
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: Bacteriophages are viruses that infect bacteria and can be classified into two groups by lifecycle. Virulent (lytic) phages follow a lytic cycle and lyse the bacterial host immediately after infection. Temperate (lysogenic) phages can integrate their genomes into the bacterial chromosome and replicate along with the host via the lysogenic cycle. Identifying a phage's lifecycle is a crucial step in developing suitable applications for it. As an alternative to complicated traditional biological experiments, several tools have been designed to predict phage lifecycle using different algorithms, such as random forest (RF), linear support-vector classifier (SVC), and convolutional neural network (CNN). In this study, we developed a natural language processing (NLP)-based tool, DeepPL, for predicting phage lifecycles from nucleotide sequences. In testing, DeepPL achieved an accuracy of 94.65%, with a sensitivity of 92.24% and a specificity of 95.91%. This result indicates that DeepPL captured fundamental genomic differences between virulent and temperate phages at the nucleotide level and accurately predicted phage lifecycles. Moreover, DeepPL had 100% accuracy in lifecycle prediction for phages that we had previously isolated and biologically verified in the lab. Additionally, a mock phage community metagenomic dataset was used to test the potential use of DeepPL in viral metagenomic research. DeepPL displayed 100% accuracy on individual complete phage genomes and accuracies ranging from 71.14% to 100% on phage contigs produced by various next-generation sequencing technologies. Overall, our study indicates that DeepPL performs reliably in phage lifecycle prediction from nucleotide sequences and can benefit phage and viral metagenomic research and applications.
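For readers less familiar with the three metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity can be computed for a two-class lifecycle predictor. It is not DeepPL's evaluation code. The function name `lifecycle_metrics`, the label strings, and the choice of "temperate" as the positive class are all assumptions made for illustration (the abstract does not state which class was treated as positive), and the labels in the usage example are invented, not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float  # true positive rate
    specificity: float  # true negative rate


def lifecycle_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "temperate",  # assumption: positive class is not stated in the abstract
) -> Metrics:
    """Compute accuracy, sensitivity, and specificity for two-class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # a positive-class phage predicted as the other class
        elif pred == positive:
            fp += 1  # a negative-class phage predicted as the positive class
        else:
            tn += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return Metrics(
        accuracy=(tp + tn) / (tp + tn + fp + fn),
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
    )


if __name__ == "__main__":
    # Invented labels for illustration only.
    truth = ["temperate"] * 4 + ["virulent"] * 6
    preds = ["temperate"] * 3 + ["virulent"] + ["virulent"] * 5 + ["temperate"]
    m = lifecycle_metrics(truth, preds)
    print(f"accuracy={m.accuracy:.2%} "
          f"sensitivity={m.sensitivity:.2%} "
          f"specificity={m.specificity:.2%}")
```

Swapping the `positive` argument to "virulent" exchanges the roles of sensitivity and specificity, which is why the positive class should be stated when such figures are reported.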