
Title: Using routinely recorded herd data to predict and benchmark herd and cow health status

Author
Parker Gaddis, Kristen - University of Florida
Cole, John
Clay, John - Dairy Records Management Systems (DRMS)
Maltecca, Christian - North Carolina State University

Submitted to: Journal of Dairy Science
Publication Type: Abstract Only
Publication Acceptance Date: 3/17/2015
Publication Date: 7/12/2015
Citation: Parker Gaddis, K.L., Cole, J.B., Clay, J.S., Maltecca, C. 2015. Using routinely recorded herd data to predict and benchmark herd and cow health status. Journal of Dairy Science. 98(Suppl. 2)/Journal of Animal Science 93(Suppl. 3):815(abstr. 687).


Technical Abstract: Genetic improvement of dairy cattle health using producer-recorded data is feasible, but heritability estimates are low, indicating that genetic progress will be slow. Health traits may also be improved by incorporating environmental and managerial factors into herd health programs. The objective of this study was to use the more than 1,100 herd characteristics that are regularly recorded on farm test days to benchmark herd and cow health status. Herd characteristics were combined with producer-recorded health event data. Parametric and nonparametric models were used to predict and benchmark health status: stepwise logistic regression, support vector machines, and random forests. At both the herd and individual cow levels, random forest models attained the highest accuracy for predicting health status in all health event categories when evaluated by 10-fold cross-validation. At the herd level, accuracy of prediction (SD) ranged from 0.59 (0.04) to 0.61 (0.04) for logistic regression models, 0.55 (0.02) to 0.61 (0.04) for support vector machine models, and 0.61 (0.04) to 0.63 (0.04) for random forest models. At the cow level, accuracy of prediction (SD) ranged from 0.69 (0.002) to 0.77 (0.01) for support vector machine models and 0.87 (0.06) to 0.93 (0.001) for random forest models. These results indicate that machine learning algorithms, specifically random forests, can be used to accurately identify herds and cows likely to experience a health event of interest. It was concluded that accurate prediction and benchmarking of health status using routinely collected herd data are feasible. Nonparametric models handled the large, complex data set better than traditional models. Further development and incorporation of predictive models into herd management programs will support continued improvement of dairy herd health.
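
The abstract reports each accuracy as a mean with its SD across cross-validation folds. As a rough illustration of how such a figure can be produced, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the data are synthetic, the nearest-centroid classifier is a simple stand-in for the logistic regression, support vector machine, and random forest models named above, and all function names (k_fold_indices, cross_validate, fit_centroids, predict_centroid) are hypothetical.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return (mean, SD) of per-fold accuracy for a fit/predict pair."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Fit on the other k-1 folds, score on the held-out fold only.
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Stand-in classifier (nearest centroid); NOT one of the models in the abstract.
def fit_centroids(X, y):
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Synthetic two-feature data standing in for herd characteristics (assumption).
rng = random.Random(42)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)  # 1 = health event recorded, 0 = none
    X.append([rng.gauss(label * 1.0, 1.0), rng.gauss(label * 0.5, 1.0)])
    y.append(label)

mean_acc, sd_acc = cross_validate(X, y, fit_centroids, predict_centroid, k=10)
print(f"accuracy (SD): {mean_acc:.2f} ({sd_acc:.2f})")
```

Comparing several model types on the same folds would amount to calling cross_validate once per fit/predict pair with the same seed, so that differences in the reported means reflect the models rather than the partitioning.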