Location: Crop Bioprotection Research
Project Number: 5010-22410-023-008-S
Project Type: Non-Assistance Cooperative Agreement
Start Date: Sep 22, 2025
End Date: Sep 21, 2026
Objective:
Beenome100 is one of a set of projects building a comprehensive catalogue of genomes across the tree of life. The data generated from these reference genomes will enable a better understanding of the biology, ecology, and evolution of known organisms, and will support the conservation, protection, and regeneration of biodiversity. These efforts have been made possible only by the recent and ongoing expansion in our capacity to handle large data sets, together with advances in DNA sequencing methods and sequencers. Because of this increase in capability, pangenomic analyses have emerged as a powerful approach for conceptualizing genetic information at a genome-wide level. Pangenome models have been shown to capture a wider range of genetic diversity, including core genes (shared by all individuals), accessory genes (present in subsets), and structural variants such as deletions or inversions, which are critical for adaptation but often overlooked in phylogenetics. Because the pangenome approach directly links genetic variation to phenotypic outcomes, providing insights into adaptation strategies and evolutionary biology (such as detoxification mechanisms) that phylogenetics alone cannot reveal, this research aims to construct a comprehensive super-pangenome of the bee genus Osmia. This super-pangenome will capture the full spectrum of genetic variation across the analyzed bee species, including gene family expansions and contractions, with a focus on genes involved in detoxification and cellular stress.
The objective of this study is to construct and characterize a super-pangenome for the bee genus Osmia, providing a comprehensive view of genetic diversity across multiple species. This includes identifying core, accessory, and unique genes; analyzing patterns of structural variants; and identifying gene family expansions and contractions. Special emphasis will be placed on genes related to detoxification and cellular stress responses, offering insights into the adaptive mechanisms within the genus.
Approach:
Chromosome-level assemblies for the bee genus Osmia will be obtained from the Beenome100 project. The pseudomolecule scaffolds will be evaluated to validate chromosome collinearity. Homologous pseudomolecules will then be aligned chromosome by chromosome using standard methods. False gaps produced by the alignment will be removed, and the chromosome alignments will be converted into a graph format for downstream read mapping and genotyping. This will allow the identification of core, accessory, and unique genes across the analyzed genomes. The output will then be analyzed to identify large-scale structural variants (SVs) between the genomes, annotate these SVs, and assess their functional impact. To further characterize genomic diversity, the presence/absence patterns of genes across genomes will be determined, and species-specific genes and gene families will be identified. The identified gene sets (core, accessory, and unique) will be functionally annotated to provide insights into their potential roles and metabolic pathways.
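As an illustration only, the partition of genes into core, accessory, and unique sets from presence/absence patterns might be sketched as follows. This is a minimal sketch, not the project's pipeline: the function name `classify_genes`, the gene family identifiers, and the genome labels are all hypothetical, and it assumes that gene families have already been assigned to genomes by an upstream orthology step.

```python
from typing import Dict, Set


def classify_genes(presence: Dict[str, Set[str]], genomes: Set[str]) -> Dict[str, str]:
    """Label each gene family by how many of the analyzed genomes carry it.

    presence: hypothetical mapping of gene family ID -> genomes containing it.
    genomes:  the full set of genomes in the analysis.
    """
    labels = {}
    for family, carriers in presence.items():
        # Ignore any genome label that is not part of the analysis.
        n = len(carriers & genomes)
        if n == 0:
            labels[family] = "absent"
        elif n == len(genomes):
            labels[family] = "core"        # shared by all genomes
        elif n == 1:
            labels[family] = "unique"      # found in exactly one genome
        else:
            labels[family] = "accessory"   # found in a subset of genomes
    return labels


# Placeholder genome labels and gene family IDs (not real data).
genomes = {"genome_A", "genome_B", "genome_C"}
presence = {
    "family_1": {"genome_A", "genome_B", "genome_C"},
    "family_2": {"genome_A", "genome_C"},
    "family_3": {"genome_B"},
}
labels = classify_genes(presence, genomes)
```

With these placeholder inputs, `family_1` is labeled core, `family_2` accessory, and `family_3` unique. Real analyses typically also need to handle fragmented or misannotated genes, which this sketch does not attempt.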
For evolutionary analysis, a maximum likelihood phylogenetic tree will be constructed based on the core genes. This tree will represent the evolutionary relationships among the analyzed genomes and will serve as the backbone for other analyses, such as mapping gene family changes. To gain insight into the selective pressures acting on different genes, selection analyses will be performed on coding sequences to identify genes under positive, negative, or relaxed selection.
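Selection on coding sequences is commonly summarized by the ratio of nonsynonymous to synonymous substitution rates (dN/dS, often written ω): values above 1 suggest positive selection, and values below 1 suggest negative (purifying) selection. The sketch below shows only this interpretive step and is an assumption about how results might be binned, not the project's stated method. The function name `classify_selection` and the `tolerance` threshold are hypothetical. In practice, such calls rest on likelihood ratio tests rather than a fixed cutoff, and relaxed selection cannot be inferred from a single ω value; it requires a dedicated test comparing sets of branches.

```python
def classify_selection(omega: float, tolerance: float = 0.1) -> str:
    """Bin a dN/dS estimate into a coarse selection category.

    omega:     estimated dN/dS ratio for a gene (must be non-negative).
    tolerance: hypothetical half-width of the band around 1 treated as
               near-neutral; chosen here for illustration only.
    """
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega > 1 + tolerance:
        return "positive"       # excess of nonsynonymous change
    if omega < 1 - tolerance:
        return "negative"       # purifying selection
    return "near-neutral"       # not distinguishable from omega = 1 here


# Placeholder estimates for three hypothetical genes (not real data).
estimates = {"gene_1": 1.8, "gene_2": 0.15, "gene_3": 1.0}
calls = {gene: classify_selection(w) for gene, w in estimates.items()}
```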