
Research Project: Accelerating Genetic Improvement of Ruminants Through Enhanced Genome Assembly, Annotation, and Selection

Location: Animal Genomics and Improvement Laboratory

Title: Long read and preliminary pangenome analyses reveal breed-specific structural variations and novel sequences in Holstein and Jersey cattle

Author
GAO, YAHUI - University Of Maryland
YANG, LIU - University Of Maryland
Kuhn, Kristen
Li, Wenli
Zanton, Geoffrey
Bowman, Mary
ZHAO, PENGJU - Zhejiang University
ZHOU, YANG - Huazhong Agricultural University
FANG, LINGZHAO - Aarhus University
COLE, JOHN - Council On Dairy Cattle Breeding
Rosen, Benjamin
MA, LI - University Of Maryland
Li, Congjun
Baldwin, Ransom
Van Tassell, Curtis
ZHANG, ZHE - South China Agricultural University
Smith, Timothy
Liu, Ge

Submitted to: Journal of Advanced Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/10/2025
Publication Date: 4/19/2025
Citation: Gao, Y., Yang, L., Kuhn, K.L., Li, W., Zanton, G.I., Bowman, M.E., Zhao, P., Zhou, Y., Fang, L., Cole, J.B., Rosen, B.D., Ma, L., Li, C., Baldwin, R.L., Van Tassell, C.P., Zhang, Z., Smith, T.P., Liu, G. 2025. Long read and preliminary pangenome analyses reveal breed-specific structural variations and novel sequences in Holstein and Jersey cattle. Journal of Advanced Research. https://doi.org/10.1016/j.jare.2025.04.014.
DOI: https://doi.org/10.1016/j.jare.2025.04.014

Interpretive Summary: Structural variation (SV) is an important type of genetic variation. We used long-read sequencing and pangenome tools to detect SV and novel sequences in Holstein and Jersey cattle. These findings fill knowledge gaps and provide a foundation for incorporating SV into future breeding programs. Farmers, scientists, and policy planners looking to improve animal health and production through genome-enabled selection will benefit from this study.

Technical Abstract: We sequenced 20 Holstein and 8 Jersey cattle using PacBio HiFi to 20× coverage, assembling 28 genomes with an average size of 3.25 Gb and a contig N50 of 69.36 Mb. Using the cattle ARS-UCD2.0 reference assembly, we integrated five read-based SV callers and one assembly-based SV caller, resulting in Holstein/Jersey SV catalogs with 74,068/54,689 events spanning 202/135 Mb (7.43%/4.97% of the genome). Our analysis showed that SVs are enriched in less conserved, non-coding, and non-regulatory regions. Comparing Holsteins with high and low feed efficiency (FE), we found that high FE-specific SVs were linked to energy metabolism and olfactory receptors, while low FE-specific SVs were associated with material transport. We constructed Holstein/Jersey pangenome graphs with 148,598/105,875 nodes and 208,891/147,990 edges, representing 47,028/37,137 deletions, insertions, and complex biallelic and multi-allelic events, along with 63.75/42.34 Mb of novel sequence. Notably, the SV count reached saturation with 20 Holsteins, while adding Jersey samples significantly increased the SV count, highlighting breed-specific SV events. Our long-read data and SV catalogs are valuable resources, revealing that the cattle genome is more complex than previously thought.
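The abstract reports a contig N50 as its assembly contiguity metric. As a minimal sketch of how that statistic is conventionally defined (the length of the contig at which the cumulative length of contigs, sorted longest first, reaches half the total assembly size), the following may help. The function name `contig_n50` and the toy lengths are hypothetical illustrations, not the authors' pipeline or data.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues at the midpoint.
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (not from the study).
toy_lengths = [2, 3, 4, 5, 6]
print(contig_n50(toy_lengths))
```

In practice the lengths would be read from an assembly FASTA index or similar; that step is omitted here because the source does not describe the tooling used.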