|ZHOU, YANG - Huazhong Agricultural University|
|YANG, LV - Huazhong Agricultural University|
|HAN, XIAOTAO - Huazhong Agricultural University|
|HU, YAN - Huazhong Agricultural University|
|LI, FAN - Huazhong Agricultural University|
|XIA, HAN - Huazhong Agricultural University|
|HAN, JIAZHENG - Huazhong Agricultural University|
|PENG, LINGWEI - Huazhong Agricultural University|
|Rosen, Benjamin - Ben|
|ZHANG, SHUJUN - Huazhong Agricultural University|
|GUO, AIZHEN - Huazhong Agricultural University|
|Van Tassell, Curtis - Curt|
|Smith, Timothy - Tim|
|YANG, LIGUO - Huazhong Agricultural University|
|Liu, Ge - George|
Submitted to: Genome Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/21/2022
Publication Date: 8/26/2022
Citation: Zhou, Y., Yang, L., Han, X., Hu, Y., Li, F., Xia, H., Han, J., Peng, L., Boschiero, C., Rosen, B.D., Bickhart, D.M., Zhang, S., Guo, A., Van Tassell, C.P., Smith, T.P., Yang, L., Liu, G. 2022. Assembly of a pan-genome for global cattle reveals missing sequence and novel structural variation, providing new insights into their diversity and evolution history. Genome Research. 32(8):1585-1601. https://doi.org/10.1101/gr.276550.122.
Interpretive Summary: Pan-genomes and structural variation are foundational to livestock genomics. We report sequence missing from the current reference genome, together with novel structural variation, and provide new knowledge of the diversity and evolution of cattle. These results fill gaps in our knowledge and provide a foundation for incorporating new findings into future animal breeding programs. Farmers, scientists, and policy planners who need to improve animal health and production through genome-enabled animal selection will benefit from this study.
Technical Abstract: Using an integrated bioinformatics pipeline, we generated an enhanced structural variation (SV) catalog from the genome sequences of 898 cattle covering 60 breeds worldwide, resulting in ~3.3 million deletions, ~0.13 million duplications and ~0.15 million inversions. In addition, we built a cattle pan-genome, revealing ~74 Mb or ~2.3% novel sequences beyond the current cattle reference genome assembly, ARS-UCD1.2. After examining the sequence features of deletions near their breakpoints, we performed deletion-based population genetic analyses, producing breed ancestry and hybridization results similar to those derived from single nucleotide polymorphisms (SNPs). We discovered hundreds of deletions with frequency differentiation across subspecies and breeds, including dozens previously reported as the lead variants at their corresponding loci. A Bov-tA1 insertion/deletion event in the first intron of APPL2, potentially affecting immune response and olfactory functions and mediating growth factor–induced cell proliferation and glucose metabolism in muscle, corresponds to the geographic distribution of cattle breeds. Therefore, we conclude that domestication, breeding, and adaptive introgression have remodeled domestic cattle genomes, and that the pan-genome is a valuable resource for studying their diversity and evolutionary history.
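As an illustration of what "frequency differentiation" of a deletion between two groups can mean, the following is a minimal Python sketch. It is not the pipeline used in the study: the abstract does not name the statistic applied, so the simple two-group Wright-style Fst shown here, the function names, and the toy genotypes are all assumptions made for illustration.

```python
from typing import Sequence


def deletion_allele_frequency(genotypes: Sequence[int]) -> float:
    """Frequency of the deletion allele in one group.

    Each genotype is the number of deletion alleles carried by a
    diploid animal: 0, 1, or 2. (Hypothetical encoding.)
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be 0, 1, or 2")
    return sum(genotypes) / (2 * len(genotypes))


def fst_two_groups(p1: float, p2: float) -> float:
    """Simple Wright-style Fst for one biallelic locus, two groups.

    Fst = (Ht - Hs) / Ht, with both groups weighted equally.
    Returns 0.0 when the locus is monomorphic across both groups.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-group
    return (h_total - h_within) / h_total


# Toy genotypes for one deletion in two hypothetical groups of four animals.
group_a = [0, 0, 0, 1]
group_b = [2, 2, 1, 2]

p_a = deletion_allele_frequency(group_a)
p_b = deletion_allele_frequency(group_b)
fst = fst_two_groups(p_a, p_b)
print(f"p_a={p_a:.3f} p_b={p_b:.3f} Fst={fst:.4f}")
```

Under this sketch, a deletion that is rare in one group and common in the other yields an Fst near 1, while similar frequencies yield a value near 0. Real analyses would typically use sample-size-corrected estimators and genome-wide scans rather than a single-locus calculation.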