Location: Genetics and Animal Breeding
Title: Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies
|LEONARD, ALEXANDER - ETH Zurich|
|CRYSNANTO, DANANG - ETH Zurich|
|FANG, ZIH-HUA - ETH Zurich|
|Heaton, Michael - Mike|
|VANDER LEY, BRIAN - University of Nebraska|
|HERRERA, CAROLINA - University of Zurich|
|BOLLWEIN, HEINRICH - University of Zurich|
|Smith, Timothy - Tim|
|Rosen, Benjamin - Ben|
|PAUSCH, HUBERT - ETH Zurich|
Submitted to: Nature Communications
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/10/2022
Publication Date: 5/31/2022
Citation: Leonard, A.S., Crysnanto, D., Fang, Z., Heaton, M.P., Vander Ley, B.L., Herrera, C., Bollwein, H., Bickhart, D.M., Kuhn, K.L., Smith, T.P.L., Rosen, B.D., Pausch, H. 2022. Structural variant-based pangenome construction has low sensitivity to variability of haplotype-resolved bovine assemblies. Nature Communications. 13. Article 3012. https://doi.org/10.1038/s41467-022-30680-2.
Interpretive Summary: A species' reference genome is a guide for fast and efficient analysis of the complete genomes of individual members. However, the genetic diversity within a species is not well represented by any one member of that species. A proposed solution is to construct a combined reference genome (a pangenome) from multiple individuals that together represent the diversity within the species. Here we show, for the first time, that a pangenome can be built from the combined genomes of diverse individuals, derived from multiple technologies and approaches, without loss of assembly quality. Genomes from domestic and wild bovines were used to demonstrate the method. We show that the resulting pangenome was useful for identifying missing gene sequence variation that was otherwise hidden from traditional discovery approaches. These newly discovered variant sequences were associated with traits and have the potential to impact well-known phenotypes such as coat color. The results and the approach have important implications for global efforts to construct pangenomes for many species from diverse data produced in cooperating laboratories. The field of pangenome construction and application is presently in its formative stages. The results presented here will facilitate pangenome assembly by the ever-growing number of international consortia working to advance genome research in their species of interest.
Technical Abstract: Advantages of pangenomes over linear reference assemblies for genome research have recently been established. However, potential effects of sequence platform and assembly approach, or of combining assemblies created by different approaches, on pangenome construction have not been investigated. Here we generate haplotype-resolved assemblies from the offspring of three bovine trios representing increasing levels of heterozygosity that each demonstrate a substantial improvement in contiguity, completeness, and accuracy over the current Bos taurus reference genome. Diploid coverage as low as 20x for HiFi or 60x for ONT is sufficient to produce two haplotype-resolved assemblies meeting standards set by the Vertebrate Genomes Project. Structural variant-based pangenomes created from the haplotype-resolved assemblies demonstrate significant consensus regardless of sequence platform, assembler algorithm, or coverage. Inspecting pangenome topologies identifies 90 thousand structural variants including 931 overlapping with coding sequences; this approach reveals variants affecting QRICH2, PRDM9, HSPA1A, TAS2R46, and GC that have potential to affect phenotype.
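As a rough illustration of what "structural variants overlapping with coding sequences" means operationally, the following minimal sketch counts variants that share at least one base with a coding interval. It is not the authors' pipeline: the function name, the tuple layout, the half-open coordinate convention, and the toy coordinates are all assumptions made for this example, and real analyses would typically read variants and annotations from files and use an indexed interval structure.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (chromosome, start, end), half-open [start, end).
Interval = Tuple[str, int, int]


def count_svs_overlapping_cds(svs: List[Interval], cds: List[Interval]) -> int:
    """Count structural variants that overlap at least one coding interval."""
    # Group coding intervals by chromosome so each variant is only
    # compared against intervals on its own chromosome.
    cds_by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in cds:
        cds_by_chrom[chrom].append((start, end))

    count = 0
    for chrom, sv_start, sv_end in svs:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(c_start < sv_end and sv_start < c_end
               for c_start, c_end in cds_by_chrom.get(chrom, [])):
            count += 1
    return count


# Toy data (invented coordinates, not taken from the study).
toy_cds = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 80)]
toy_svs = [
    ("chr1", 150, 160),  # inside the first coding interval
    ("chr1", 200, 300),  # starts exactly where the first interval ends
    ("chr1", 600, 700),  # partially overlaps the second coding interval
    ("chr2", 10, 40),    # upstream of the chr2 coding interval
    ("chr3", 1, 1000),   # chromosome with no coding annotation
]

overlapping = count_svs_overlapping_cds(toy_svs, toy_cds)
print(overlapping)
```

The linear scan per variant keeps the sketch short; for tens of thousands of variants and a full annotation, sorting the intervals or using an interval tree would be the more practical choice.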