Location: Plant, Soil and Nutrition Research
Title: Propagating rsIDs Across Crop Pan-Genomes in Gramene Platform Using the Ensembl Variant Remapping Pipeline
Author
CHOUGULE, KAPEEL - Cold Spring Harbor Laboratory
KIM, SUYAN - Cold Spring Harbor Laboratory
WEI, SHARON - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
LU, ZHENYUAN - Cold Spring Harbor Laboratory
TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
Ware, Doreen
Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 11/5/2025
Publication Date: 11/5/2025
Citation: Chougule, K., Kim, S., Wei, S., Olson, A., Lu, Z., Tello-Ruiz, M., Ware, D. 2025. Propagating rsIDs Across Crop Pan-Genomes in Gramene Platform Using the Ensembl Variant Remapping Pipeline. Meeting Abstract. Genome Informatics Conference.
Interpretive Summary:
Technical Abstract: The Reference SNP cluster ID (rsID) has long been the standard identifier for genetic variation in human genomics, enabling stable cross-referencing across databases, assemblies, and studies. Its persistence across reference versions has transformed population genetics, medical research, and clinical applications. Following this success, rsIDs are increasingly being adopted in plant genomics through the European Variation Archive (EVA), which has assigned hundreds of millions or even billions of identifiers to crop genomes. This adoption ensures that variants are referenced independently of a single genome build, simplifying integration, promoting FAIR data stewardship, and enabling reproducible, trait-driven analyses. Gramene has adopted rsIDs as a unifying framework to consolidate genetic variation knowledge across species, improve phenotype prediction, and enhance trait-based marker discovery. By integrating EVA-assigned rsIDs into its variation module, Gramene provides consistent identifiers for millions of variants, decoupling genetic variation data from specific assemblies and supporting pan-genome-scale interoperability. Currently, rsIDs have been integrated into four major crop genomes: sorghum (41M), rice (67M), maize (78M), and grape (0.3M). Together, this represents more than 193 million rsIDs standardized across crop species in Gramene and its pan-sites. As additional pan-genomes and breeding lines are sequenced, propagating rsIDs from reference genomes to new assemblies provides an efficient alternative to re-calling variants for each accession.
Using EVA’s Ensembl Variant Remapping pipeline, rsIDs are remapped with high success rates: ~98% between reference assembly versions and ~87% across pan-genomes. These stable identifiers are directly accessible through Gramene’s genome browser, where remapped rsIDs are available as searchable variant tracks and gene-level annotations. This integration allows researchers to link variants with gene function, trait associations, and orthologous loci across different accessions and assemblies within a species, while maintaining consistency with EVA’s species-specific rsID assignments. The adoption and propagation of rsIDs provide a stable framework for managing plant genetic variation across assemblies and accessions, ensuring that resources remain interoperable, FAIR, and directly useful for breeding and translational research. Support for this work is provided by USDA-ARS grant 8062-21000-051-000D.
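As a rough illustration of what propagating an rsID to a new assembly involves, the Python sketch below places a variant by locating its flanking sequence in the target assembly and reports the fraction of identifiers that could be placed. It is a toy under stated assumptions and not the EVA or Ensembl implementation: exact string matching stands in for the sequence aligner a real pipeline would use, coordinates are 0-based, and the function names (`remap_variant`, `remap_all`) and the rsIDs in the usage note are hypothetical.

```python
from typing import Dict, Optional, Tuple


def remap_variant(old_seq: str, new_seq: str, pos: int, flank: int = 5) -> Optional[int]:
    """Return the 0-based position of a variant in new_seq, or None.

    The variant at old_seq[pos] is placed by searching new_seq for the
    reference base together with up to `flank` bases on each side.
    Exact matching is an assumption of this sketch; a production
    pipeline would align the flanks and tolerate mismatches.
    """
    left = old_seq[max(0, pos - flank):pos]
    right = old_seq[pos + 1:pos + 1 + flank]
    pattern = left + old_seq[pos] + right

    first = new_seq.find(pattern)
    if first == -1:
        return None  # flanking context not found in the new assembly
    if new_seq.find(pattern, first + 1) != -1:
        return None  # context occurs more than once: ambiguous placement
    return first + len(left)


def remap_all(
    variants: Dict[str, int], old_seq: str, new_seq: str, flank: int = 5
) -> Tuple[Dict[str, int], float]:
    """Remap every rsID and return (remapped positions, success rate)."""
    remapped: Dict[str, int] = {}
    for rsid, pos in variants.items():
        new_pos = remap_variant(old_seq, new_seq, pos, flank)
        if new_pos is not None:
            remapped[rsid] = new_pos
    rate = len(remapped) / len(variants) if variants else 0.0
    return remapped, rate
```

Calling `remap_all({"rs1": 10, "rs2": 20}, old_seq, new_seq)` on two short sequences returns the rsIDs that were placed uniquely together with the success rate. Variants whose flanking context is missing or duplicated in the target are dropped rather than guessed.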
