
Research Project: Accelerating Genetic Improvement of Ruminants Through Enhanced Genome Assembly, Annotation, and Selection

Location: Animal Genomics and Improvement Laboratory

2024 Annual Report


Objectives
Objective 1: Develop biological resources and computational tools to enhance the representation and annotation characterization of dairy breed-specific bovine and other genomes.
Sub-objective 1.A: Improve dairy breed-specific bovine and other genome assemblies for pangenome representation.
Sub-objective 1.B: SNP and CNV mapping in cattle and other ruminants.
Sub-objective 1.C: Evaluate digestive tract function and identify gastrointestinal microorganism effects on nutrient digestibility, milk production capacity, nutrient use efficiency, and health in dairy cattle.
Objective 2: Apply novel tools to utilize genotypic and phenotypic data to enhance genetic improvement in ruminant production systems.
Sub-objective 2.A: Continue work on community-based breeding programs and develop imputation pipeline and data in goats.
Sub-objective 2.B: Characterize and localize within and across breed measures of dominance as observed in inbreeding depression and heterosis.
Objective 3: Characterize functional genetic and epigenetic variations for improved fertility, growth, health, production, reproduction, and environmental sustainability of ruminants.
Sub-objective 3.A: Epigenome-wide Association Study based on DNA methylation.
Sub-objective 3.B: FarmGTEx for goats.


Approach
Completion of our objectives is expected, in the short term, to improve genome-wide selection in the U.S. dairy industry as well as facilitate new genome-enhanced breeding strategies to bring economic and genetic stability to various ruminant value chains. Ultimately, longer-term objectives to identify and understand how causative genetic variation affects livestock biology will require a combination of genome sequencing and comparative genomics, quantitative genetics, epigenomics, and metagenomics, all of which are components of this project plan and areas of expertise in our group. Efforts to characterize genome activity and structural conservation/variation are an extension of our current research program in applied genomics. This project plan fully leverages the resources derived from the Bovine Genomes, HapMap, 1000 Bull Genomes, FAANG, Bovine Pangenome, and FarmGTEx projects, and genotypic data derived from the Council on Dairy Cattle Breeding (CDCB) genome-enhanced genetic evaluations for North American dairy cattle.


Progress Report
This is the second report for the new NP101 Project 8042-31000-112-000D, which started July 24, 2022, entitled “Accelerating Genetic Improvement of Ruminants Through Enhanced Genome Assembly, Annotation, and Selection”.
For Objective 1.A, the Bovine Pangenome Consortium was launched to describe the full extent of genetic variation in cattle through the creation of genome assemblies for bovine species of economic and biodiversity importance. Using the trio-binning method, the Consortium has released 15 breed-specific reference assemblies. The genome assemblies were generated from trios of Angus and Brahman, Nelore and Brown Swiss, and Original Braunvieh breeds. Additionally, assemblies were generated for Highland cattle and yak, Piedmontese cattle and gaur, and Simmental cattle and bison, as well as for Holsteins and Jerseys to improve on earlier efforts, and for tropically adapted indicine breeds (Sahiwal and Tharparkar). AGIL has generated assemblies for an additional 23 breeds for the Consortium: 1960’s Holstein, Contemporary Holstein, Ayrshire, Betizu, Brown Swiss, Charolais, Chianina, Gelbvieh, Guzerat, Gyr, Maine-Anjou, Nordic Red, Pirenaica, Red Wagyu, Retinta, Rubia Gallega, Sarda, Terrena, Tuli, Wagyu, Welsh Black, White Fulani, and Whitebred Shorthorn. Additionally, Consortium members have generated assemblies for 40 breeds from Africa (8), Asia (12), South America (4), and Europe (16). Pangenome efforts are also underway in sheep and goats. The Developing the Ovine Pangenome project, funded by the National Institute of Food and Agriculture (NIFA), has assembled 14 breeds: Awassi, Damara, Dorper, Friesian, Katahdin, Merino, Native Churro, Polypay, Romanov, Romney, Shire, St. Croix, Suffolk, and Wiltshire. The goat pangenome is in the earlier stages of development but has 7 breeds assembled thus far: Appenzeller, Boer, Kiko, Saanen, Spanish, Toggenburg, and Valais Blackneck.
The T2T Ruminant project is developing complete, gapless assemblies across the ruminant clade. Although the only whole genome completed at this time is bighorn, the project has publicly released the first complete assemblies of the cattle and sheep Y chromosomes. The project has nearly complete assemblies for cattle (1960’s Holstein, Ayrshire, Charolais, Gyr, Piedmontese, Simmental, Tuli, and Wagyu), close relatives of cattle (bison, gaur, and river buffalo), sheep (Friesian, Native Churro, and Polypay), close relatives of sheep (bighorn and muskox), goat (Boer, Kiko, Saanen, and Spanish), and close relatives of goat (chamois and ibex).
For Objective 1.B, we used computational tools to detect presence-absence variations (PAVs) and study their impacts on Holstein cattle traits. PAV means that, within a species, some individuals carry certain genes while others lack them. PAV-based genome-wide association studies (GWAS) identified associations between gene PAVs and 15 traits, including milk, fat, and protein yields, and traits related to health (metritis) and reproduction. Associations were found on multiple chromosomes, most notably on BTA15 and BTA7, involving olfactory receptor and immune-related genes, respectively. By examining the PAVs at the population level, this research provided crucial insights into the genetic structures underlying the complex traits of Holstein cattle.
For Objective 1.C, samples have been collected from all cattle, subjected to maximum high-molecular-weight (HMW) DNA extraction, and sequenced using Nanopore PromethION flow cells. Following initial quality control steps, this yielded an average N50 of 13,306 for reads from Holstein cows and an average N50 of 2,377 for Jersey reads. These higher N50 values facilitate easier assembly of the genomes. Preliminary microbial classification results using the Kaiju program indicate significant differences in microbial populations between Jersey and Holstein cattle.
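For readers unfamiliar with the read N50 statistic reported for Objective 1.C: N50 is the length L such that reads of length L or longer together contain at least half of all sequenced bases. The following is a minimal Python sketch of that calculation, for illustration only. The function name and the toy read lengths are hypothetical and are not taken from the project's pipeline.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the largest length L such that reads of length >= L
    account for at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest read downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical toy data: nine reads totalling 54 bases.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = read_n50(toy_lengths)
```

In this toy case the three longest reads (10, 9, and 8) already hold 27 of the 54 bases, so the N50 is 8. Because the statistic is weighted by bases rather than by read count, a small number of long reads can raise it substantially.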
To further explore these differences, we plan to collect additional sequence data using Hi-C technology to complement our current dataset and identify mobile elements within the microbial community. In cases where initial metagenome-assembled genomes (MAGs) lack certain low-abundance microbes of interest, we will sequence more samples and leverage Nanopore's MinKNOW software to enrich for targeted bacteria.
For Objective 2.A, the use of genome-wide SNP information will increase the accuracy of predictions of genetic merit. The Space Chip software was modified to enable the additional weighting of SNPs by values derived for each breed and SNP locus. For example, the resulting values from breed-specific genome-wide association tests could be used to increase marker density in precise regions with larger-magnitude genomic effects. The genetic marker scoring adds a weighting factor that is proportional to the scaling values provided by the user. These weighting factors are in addition to the existing score values, which include minor allele frequency (MAF), the size of the gap being split, the nearness of the SNP location to the midpoint of that gap, and the overall breed weight. Further transformations are provided to make the scoring values more flexible.
For Objective 2.B, the global application of genomic selection in dairy cattle has raised interest in characterizing dominance effects to better understand inbreeding depression. We believe that a richer understanding of additive (ADD), dominance (DOM), and runs of homozygosity (ROH) effects in both purebred and crossbred dairy cattle will aid in comprehending the impact of these factors on inbreeding depression and heterosis. To identify and localize genomic regions associated with ADD, DOM, and ROH effects, we performed a large-scale single-SNP GWAS analysis, fitting SNPs as fixed effects for ADD, DOM, and ROH, one locus at a time.
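As an illustration of fitting ADD, DOM, and ROH as fixed effects at a single locus, the sketch below uses ordinary least squares on a toy data set. It is not the project's software. The covariate coding (allele count for ADD, a heterozygote indicator for DOM, and an in-ROH indicator for ROH), the function name, and the data are all assumptions made for this example, and a production analysis would normally also account for polygenic background and other fixed effects.

```python
def fit_single_snp(genotypes, in_roh, phenotypes):
    """Least-squares fit of y = mu + a*ADD + d*DOM + r*ROH for one SNP.

    genotypes:  allele counts per animal (0, 1, or 2)
    in_roh:     1 if the SNP lies in a run of homozygosity for that animal
    phenotypes: trait values
    Assumed coding: ADD = allele count, DOM = 1 for heterozygotes.
    """
    rows = [[1.0, float(g), 1.0 if g == 1 else 0.0, 1.0 if flag else 0.0]
            for g, flag in zip(genotypes, in_roh)]
    k = 4
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * y for r, y in zip(rows, phenotypes))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient for this SNP")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(k):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [v - factor * p for v, p in zip(aug[i], aug[col])]
    mu, add, dom, roh = (aug[i][k] / aug[i][i] for i in range(k))
    return {"mu": mu, "add": add, "dom": dom, "roh": roh}


# Hypothetical noiseless data simulated with mu=10, add=2, dom=0.5, roh=-1.
toy_genotypes = [0, 0, 1, 2, 2, 1]
toy_in_roh = [0, 1, 0, 0, 1, 0]
toy_phenotypes = [10.0, 9.0, 12.5, 14.0, 13.0, 12.5]
toy_fit = fit_single_snp(toy_genotypes, toy_in_roh, toy_phenotypes)
```

Because the toy phenotypes contain no noise, the fit recovers the simulated effects up to floating-point error. A genome-wide scan would repeat this fit for every SNP and test each estimated effect for significance.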
To carry out these analyses, we developed quantitative methods, programs, and scripts, and made the necessary software packages available. These were initially tested on small datasets. We then applied our methods, programs, and scripts to more than 1 million U.S. Holstein and nearly 250,000 Jersey cows genotyped on 79,294 SNP markers. To date, we have analyzed three yield traits, daughter pregnancy rate (DPR), and somatic cell score (SCS). For milk yield, Chr5 and Chr14 had the largest SNP effects in Holsteins, while Chr14 showed the largest effects in Jerseys. For fat yield, results were similar to milk yield, with a notable additional telomeric region on Chr2 in Jerseys. For protein yield, Chr6 exhibited significant effects in addition to Chr5 and Chr14 in Holsteins. A relatively small number of dominance effects were detected for the production traits, with lower statistical significance than the additive effects. Chr5 had the most significant and the largest number of dominance SNP effects in Holsteins. In addition, significant dominance effects were observed on Chr14 (milk and fat yields) and Chr23 (milk yield) in Holsteins, with fewer significant effects in Jerseys. For SCS and DPR, relatively few additive or dominance effects were detected. Chr6 had the most significant additive SNP effects in Holsteins for SCS, while no significant dominance effects were observed for Holsteins in SCS. Both additive and dominance effects were identified for DPR in both Holsteins (Chr6 and Chr18) and Jerseys (Chr11). Furthermore, a unique additive effect on Chr1 for DPR in Holsteins was noted.
For Objective 3.A, DNA methylation (DNAm) microarray data were collected to enhance our understanding of functional genetic variations. Using over 200 Holstein cattle and a newly designed DNAm microarray, 432 data sets were generated, covering various production and reproduction traits, such as feed efficiency and parasite resistance.
Data analysis has been completed, resulting in one published paper on DNAm effects on feed efficiency. Additionally, two more manuscripts are ready for submission.
For Objective 3.B, the FarmGTEx (genotype-tissue expression) Consortium Sheep/Goat project was launched one year ago to create a comprehensive atlas of tissue-specific gene expression and genetic regulation in sheep and goats. Thousands of whole-genome and transcript sequencing data sets from over 100 tissues and cell types across dozens of breeds were processed for both species. The reference imputation panels are now ready for distribution. This project will provide a detailed transcriptome landscape across tissues and identify thousands of variants associated with gene expression and alternative splicing in numerous major tissues.


Accomplishments
1. The Farm Genotype-Tissue Expression (FarmGTEx) Consortium. Understanding the regulation of livestock gene expression is crucial for studying the biological mechanisms underlying economic traits and for improving animal selection. The FarmGTEx Consortium is an international collaboration aimed at providing a comprehensive atlas of tissue-specific gene expression and genetic regulation in farm animals. Led by ARS scientists in Beltsville, Maryland, and researchers at Aarhus University, Denmark, the Consortium includes over 100 global universities and institutes. FarmGTEx has successfully developed comprehensive GTEx atlases for cattle, pigs, and chickens, detailing tissue-specific gene expression and genetic regulation. These atlases, which are cornerstones for researchers, describe the transcriptome landscape across diverse tissues and catalogue thousands of variants influencing gene expression and alternative splicing. They enable integrated genomics analyses, aiding the interpretation of significant GWAS loci associated with economically important traits. In addition to the papers on Cattle and Pig GTEx published in Nature Genetics, FarmGTEx research outcomes have led to the development of a dedicated public web portal. This portal enables researchers to query gene expression, alternative splicing patterns, and trait-associated DNA regions across tissues in a user-friendly format, standing as a pivotal resource for farm animal genomics and breeding advancements.


Review Publications
Tuggle, C.K., Clarke, J.L., Murdoch, B.M., Lyons, E., Scott, N.M., Mckay, S., Lipka, A., Fulton, J., Hess, A., Lubberstedt, T., Fragomeni, B., Rowan, T., Mccarthy, F., Guadagno, C., Goddard, E., Das Choudhury, S., Sheehan, M., Kramer, L., Feldman, M.J., Daigle, C., Steibel, J.P., Benes, B., Murray, S., Riggs, P., Thompson, A., Hagen, D., Thornton-Kurth, K., Van Tassell, C.P., Campbell, J.D., Dorea, J., Chung, H., Dekkers, J.C., Ertl, D., Lawrence-Dill, C.A., Schnable, P.S. 2024. Current challenges and future of agricultural genomes to phenomes in the USA. Genome Biology. 25:8. https://doi.org/10.1186/s13059-023-03155-w.
Gao, Y., Marceau, A., Iqbal, V., Torres-Vazquez, J.A., Neupane, M., Jiang, J., Liu, G., Ma, L. 2023. Genome-wide association analysis of heifer livability and early first calving in Holstein cattle. BMC Genomics. 24:628. https://doi.org/10.1186/s12864-023-09736-0.
Sun, X., Guo, J., Li, R., Zhang, H., Zhang, Y., Liu, G., Emu, Q., Zhang, H. 2024. Whole-genome resequencing reveals genetic diversity and wool trait-related genes in Liangshan Semi-fine Wool Sheep. Animals. 14(3):444. https://doi.org/10.3390/ani14030444.
Badjibassa, A., Ouedraogo, D., Burger, P.A., Rosen, B.D., Van Tassell, C.P., Solkner, J., Soudre, A. 2024. Participatory investigation of goat farmers’ breeding practices, trait preference, and selection criteria in Burkina Faso. Tropical Animal Health and Production. 56:35. https://doi.org/10.1007/s11250-023-03869-w.
Hu, Z., Boschiero, C., Li, C., Connor, E.E., Baldwin, R.L., Liu, G. 2023. Unraveling the genetic basis of feed efficiency in cattle through integrated DNA methylation and CattleGTEx analysis. Genes. 14(12):2121. https://doi.org/10.3390/genes14122121.
Sun, J., Xie, F., Wang, J., Luo, J., Chen, T., Jiang, Q., Xi, Q., Liu, G., Zhang, Y. 2024. Integrated meta-omics reveals the regulatory landscape involved in lipid metabolism between pig breeds. Microbiome. 12. Article e33. https://doi.org/10.1186/s40168-023-01743-3.
Teng, J., Gao, Y., Yin, H., Bai, Z., Liu, S., Zeng, H., The PigGTEx Consortium, Bai, L., Cai, Z., Zhao, B., Li, X., Xu, Z., Lin, Q., Pan, Z., Yang, W., Yu, X., Guan, D., Hou, Y., Keel-Mercer, B.N., Rohrer, G.A., Lindholm-Perry, A.K., Oliver, W.T., Ballester, M., Crespo, D., Quintanilla, R., Canela-Xandri, O., Rawlik, K., Xia, C., Yao, Y., Zhao, Q., Yao, W., Yang, L., Li, H., Zhang, H., Liao, W., Chen, T., Karlskov-Mortensen, P., Fredholm, M., Amills Eras, M., Clop, A., Giuffra, E., Wu, J., Cai, X., Diao, S., Pan, X., Wei, C., Li, J., Cheng, H., Wang, S., Su, G., Sahana, G., Lund, M., Dekkers, J., Kramer, L., Tuggle, C.K., Corbett, R., Groenen, M.A., Madsen, O., Godia, M., Rocha, D., Li, C., Pausch, H., Hu, X., Frantz, L., Luo, Y., Lin, L., Zhou, Z., Zhang, Z., Chen, Z., Cui, L., Xiang, R., Shen, X., Li, P., Huang, R., Tang, G., Li, M., Zhao, Y., Yi, G., Tang, Z., Jiang, J., Zhao, F., Yuan, X., Liu, X., Chen, Y., Xu, X., Zhao, S., Zhao, P., Haley, C., Zhou, H., Wang, Q., Pan, Y., Ding, X., Ma, L., Li, J., Navarro, P., Zhang, Q., Li, B., Tenesa, A., Liu, G. 2024. A compendium of genetic regulatory effects across pig tissues. Nature Genetics. 56:112-123. https://doi.org/10.1038/s41588-023-01585-7.
Yang, L., Yin, H., Bai, L., Yao, W., Tao, T., Zhao, Q., Gao, Y., Teng, J., Xu, Z., Lin, Q., Diao, S., Pan, Z., Guan, D., Li, B., Zhou, H., Zhou, Z., Zhao, F., Wang, Q., Pan, Y., Zhang, Z., Li, K., Fang, L., Liu, G. 2024. Mapping and functional characterization of structural variation in 1060 pig genomes. Genome Biology. 25. Article e116. https://doi.org/10.1186/s13059-024-03253-3.
Xiang, R., Fang, L., Liu, S., Macleod, I.M., Liu, Z., Breen, E.J., Gao, Y., Liu, G., Tenesa, A., CattleGTEx Consortium, Mason, B., Chamberlain, A.J., Wray, N.R., Goddard, M.E. 2023. Gene expression and RNA splicing explain large proportions of the heritability for complex traits in cattle. Cell Genomics. 100385. https://doi.org/10.1016/j.xgen.2023.100385.
Van Tassell, C.P., Rosen, B.D., Woodward Greene, M.J., Silverstein, J., Huson, H.J., Solkner, J., Boettcher, P., Rothschild, M.F., Meszaros, G., Nakimbugwe, H., Gondwe, T., Muchadeyi, F.C., Nandolo, W., Mulindwa, H.A., Banda, L.J., Kaumbata, W., Getachew, T., Haile, A., Soudre, A., Ouedraogo, D., Rischkowsky, B.A., Mwai, A.O., Dzomba, E.F., Nash, O., Abegaz, S., Masiga, C.W., Wurzinger, M., Sayre, B.L., Stella, A., Tosser-Klopp, G., Sonstegard, T.S. 2023. The African Goat Improvement Network: A scientific group empowering smallholder farmers. Frontiers in Genetics. 14:1183240. https://doi.org/10.3389/fgene.2023.1183240.
Boyd, A., Luo, Y., Lunney, J.K., Kustas, B., Fukagawa, N.K., Mattoo, A.K., Crow, W.T., Pachepsky, Y.A., Kim, M.S., Lillehoj, H.S., Van Tassell, C.P., Zhang, H.Q., Blomberg, L., Dubey, J.P. 2023. Cross-cutting concepts to transform agricultural research. Frontiers in Sustainable Food Systems. 7. Article e1242665. https://doi.org/10.3389/fsufs.2023.1242665.
Dai, X., Bian, P., Hu, D., Luo, F., Huang, Y., Jiao, S., Wang, X., Gong, M., Li, R., Cai, Y., Wen, J., Yang, Q., Deng, W., Nanaei, H.A., Wang, Y., Wang, F., Zhang, Z., Rosen, B.D., Heller, R., Jiang, Y. 2023. A Chinese indicine pan-genome reveals a wealth of novel structural variants introgressed from other Bos species. Genome Research. 33(8):1284–1298. https://doi.org/10.1101/gr.277481.122.