
Research Project: Championing Improvement of Sorghum and Other Agriculturally Important Species through Data Stewardship and Functional Dissection of Complex Traits

Location: Plant, Soil and Nutrition Research

Title: Accelerating Agricultural Research Through Interoperable Genetic and Phenotypic Variation Data

Author
TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
WEI, SHARON - Cold Spring Harbor Laboratory
KUMAR, VIVEK - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
Harrison, Melanie
CEZARD, TIMOTHEE - EMBL-EBI
Gladman, Nicholas
Ware, Doreen

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 4/5/2025
Publication Date: 4/5/2025
Citation: Tello-Ruiz, M.K., Wei, S., Kumar, V., Olson, A., Harrison, M.L., Cezard, T., Gladman, N.P., Ware, D. 2025. Accelerating Agricultural Research Through Interoperable Genetic and Phenotypic Variation Data. Meeting Abstract. 18th Annual International Biocuration Conference.

Interpretive Summary:

Technical Abstract: Data standards play a crucial role in enabling the integration of datasets from different studies, thus facilitating comparative analysis and yielding broader insights. The biomedical field has long embraced standards for genetic and phenotypic variation data. In contrast, agricultural research has only recently begun to adopt these practices. Historically, data sharing in agriculture has relied on personal networks, trust, and supplementary materials in publications, with significant effort required for data cleaning and harmonization. This has hindered research efficiency and often prevented cross-study collaboration. To address this gap and promote the widespread adoption of standardized identifiers and formats for genetic markers, germplasm samples, and phenotypic traits in agricultural research, we have employed a multi-faceted approach. First, in partnership with the AgBioData Standards for Genetic Variation Working Group, which includes agricultural researchers, bioinformaticians, and biocurators, we conducted surveys of current practices and reviewed existing data and metadata formats. In addition, we have begun evaluating relevant ontologies (e.g., CO_324) and controlled vocabularies with sorghum breeders. These efforts allowed us to identify challenges and propose actionable solutions for the community. Second, we adopted rsIDs (Reference SNP cluster IDs) for single nucleotide polymorphisms (SNPs) and insertions and deletions (indels) in the comparative SorghumBase and Gramene pan-genome databases. This initiative has enabled consistent identification of genetic markers across genome assemblies and closely related breeding lines. Moreover, working closely with the European Variation Archive, we have begun to actively promote this practice among other agricultural resources and genotyping service providers. Finally, we adopted standard biosample identifiers by working with major germplasm repositories, such as GRIN-Global and ICRISAT, to streamline germplasm data integration. Our work highlights that data standards enhance data interoperability, support re-analysis and meta-analysis, and increase the visibility and citability of research. Furthermore, they foster greater collaboration and ensure proper credit attribution, ultimately advancing the efficiency and impact of agricultural research.