Research Project: Bringing Historic Seed Test Data Alive Through Machine Learning Algorithms: Proof of Concept

Location: Agricultural Genetic Resources Preservation Research

Project Number: 3012-21000-015-32-S
Project Type: Non-Assistance Cooperative Agreement

Start Date: Sep 1, 2020
End Date: Aug 31, 2021

Objective:
The objective of this project is to conduct a proof of concept showing that optical character recognition (OCR) technology can be used to migrate seed germination data from scanned images of historic germination data collection forms to an electronic format that enables data exploration and analysis.

Approach:
The National Laboratory for Genetic Resource Preservation has been conducting seed viability tests since its inception in the 1950s. The laboratory has 800,000 germination cards that were used to record the results of seed tests over the last 50 years. These cards were scanned in 2017; however, the handwritten data captured on the scanned images is not amenable to analysis. The cooperator will use optical character recognition technology and machine learning to develop a program that can identify patterns, effectively using artificial intelligence to read the handwritten data on the cards and transfer it to a format that not only securely warehouses the data but also makes it amenable to further analytics, allowing us to sift through a large quantity of data and extract valuable information.
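
As a rough illustration of the final step in this approach (moving recognized text into an analyzable format), the sketch below assumes that some OCR engine has already returned plain text for one card. The card layout, the field labels ("Accession", "Test date", "Seeds tested", "Germinated"), and the names `GerminationRecord`, `parse_card`, and `records_to_csv` are all hypothetical and chosen only for this example; the actual germination cards and the cooperator's program may differ substantially.

```python
import csv
import io
import re
from dataclasses import asdict, dataclass
from typing import Iterable, Optional

# Hypothetical "Label: value" layout. The real cards may not look like this.
FIELD_PATTERN = re.compile(r"^\s*(?P<key>[A-Za-z ]+?)\s*:\s*(?P<value>.+?)\s*$")

# Column order for the CSV output (matches the dataclass fields below).
COLUMNS = ["accession", "test_date", "seeds_tested", "seeds_germinated"]


@dataclass
class GerminationRecord:
    accession: str
    test_date: str
    seeds_tested: Optional[int]
    seeds_germinated: Optional[int]


def to_int(text: Optional[str]) -> Optional[int]:
    """Return an integer, or None when the OCR output is missing or not purely digits."""
    if text is None:
        return None
    text = text.strip()
    return int(text) if text.isdigit() else None


def parse_card(ocr_text: str) -> GerminationRecord:
    """Turn the OCR text of one card into a structured record."""
    found = {}
    for line in ocr_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            found[match.group("key").strip().lower()] = match.group("value")
    return GerminationRecord(
        accession=found.get("accession", ""),
        test_date=found.get("test date", ""),
        seeds_tested=to_int(found.get("seeds tested")),
        seeds_germinated=to_int(found.get("germinated")),
    )


def records_to_csv(records: Iterable[GerminationRecord]) -> str:
    """Serialize records to CSV text; unreadable values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up sample text standing in for OCR output; "9B" imitates a misread number.
sample = "Accession: PI 12345\nTest date: 1975-03-02\nSeeds tested: 100\nGerminated: 9B\n"
record = parse_card(sample)
csv_text = records_to_csv([record])
```

In this sketch, values that cannot be read as numbers are left empty rather than guessed, so that uncertain readings can be flagged for manual review instead of silently entering the data set.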