Research Project: Precision Solutions for Pathogen Carriage on Dairy Farms Using Machine Learning

Location: Environmental Microbial & Food Safety Laboratory

Project Number: 8042-32420-008-005-S
Project Type: Non-Assistance Cooperative Agreement

Start Date: Sep 1, 2025
End Date: Aug 31, 2029

Objective:
The objective of this research is to develop actionable machine learning models that predict the presence on dairy farms of Salmonella enterica, a dual-threat pathogen causing significant animal health issues and human foodborne illnesses, and Shiga toxin-producing Escherichia coli (STEC), a major human pathogen. Along with its significant human and animal health implications, Salmonella enterica causes decreased milk production, spontaneous abortions in pregnant cattle, treatment costs, reduced feed efficiency, and losses from antibiotic residue-contaminated milk, while STEC poses direct risks to human health via contaminated products. This study aims to reduce the risk of human infections caused by cull dairy animals harboring pathogens and of animal illness caused by Salmonella enterica. The primary outcome will be predictive models that offer practical strategies for improving animal health and enhancing food safety within the dairy industry.

Approach:
This study will be conducted on commercial dairy farms in Tulare County, California. Farms will be selected to ensure variation in size, management practices, and environmental exposure, allowing for robust modeling across a diverse operational landscape. Sampling will occur monthly on each farm, and this longitudinal design will capture seasonal and annual variability in both pathogen prevalence and the linked risk factors.

Fecal samples will be collected directly from the rectum of culled dairy cows on each study farm. Samples will be tested for the presence and levels of Salmonella enterica and Shiga toxin-producing E. coli (STEC) using both culture-based and molecular detection methods such as PCR and qPCR. Management histories of the sampled cows and herd demographics will also be collected. Animal-level metadata will be collected from farm records for each sampled animal and will include parity, lactation stage, days in milk, milk production history, recent disease events, antimicrobial administration, and reasons for culling. Herd-level management data will be collected, including culling protocols, vaccination strategies, feed regimens, manure management, bedding types, dust mitigation strategies, and pen clearing methods.

Microbiological, environmental, management, and animal-level data will be compiled into a centralized database. Machine learning models such as random forests, gradient boosting machines, and neural networks will be trained to predict the presence/absence and abundance of Salmonella enterica and STEC in cull animals. Feature importance analyses will be used to identify key predictive variables and to uncover relationships between culling decisions, farm practices, environmental conditions, and pathogen carriage. Models will be validated using cross-validation and external validation on data held out from specific farms or time periods.
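
As an illustration of validating on data held out from specific farms, the following minimal sketch builds leave-one-farm-out folds so that no sample from the test farm is ever used for training. The farm labels and the function name `leave_one_farm_out` are hypothetical and are not taken from the project; the project's actual validation scheme is not specified beyond the description above.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_farm_out(
    farm_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_farm, train_indices, test_indices), one fold per farm."""
    # Group sample indices by the farm they came from.
    by_farm: Dict[str, List[int]] = defaultdict(list)
    for idx, farm in enumerate(farm_ids):
        by_farm[farm].append(idx)

    # Hold out each farm in turn; train on all remaining farms.
    for farm in sorted(by_farm):
        test_idx = by_farm[farm]
        train_idx = sorted(
            i for other, idxs in by_farm.items() if other != farm for i in idxs
        )
        yield farm, train_idx, test_idx


# Hypothetical data: one farm label per fecal sample (assumption for illustration).
sample_farms = ["farm_A", "farm_A", "farm_B", "farm_C", "farm_B", "farm_C"]

folds = list(leave_one_farm_out(sample_farms))
for farm, train_idx, test_idx in folds:
    # Sanity check: the held-out farm never leaks into the training set.
    assert all(sample_farms[i] != farm for i in train_idx)
    print(farm, "train:", train_idx, "test:", test_idx)
```

Splitting by farm rather than by individual sample gives a more realistic estimate of how a model might perform on a farm it has never seen. The same grouping idea applies to time periods by substituting, for example, a sampling-month label for the farm label. Libraries such as scikit-learn provide an equivalent splitter (`LeaveOneGroupOut`), which would likely be preferable in practice.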
The ultimate goal is to build a predictive decision-support tool that enables producers to assess the likelihood of pathogen carriage in cull dairy cows under varying conditions and implement targeted preharvest interventions to mitigate risk of animal disease and food product contamination.