
Research Project: Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection

Location: Animal Genomics and Improvement Laboratory

2021 Annual Report

Objective 1: Develop biological resources and computational tools to enhance characterization of breed-specific bovine and other genomes. De novo reference genome assemblies will be developed for dairy cattle breeds (Holstein and Jersey). In addition, improvements will be made to the existing, but suboptimal, reference assemblies for Bos taurus cattle and Zebu cattle (Bos indicus). These reference genome resources are essential for discovery of single nucleotide polymorphisms (SNP) and copy number variation (CNV) polymorphisms segregating in target populations. Genome characterization will be done on state-of-the-art platforms using short- and long-read sequencing of selected animals. Candidate animals will be drawn from populations targeted for genome-based genetic improvement to enable development of novel tools for accurate identification of parentage and breed composition. To complement these studies, epigenomic and metagenomic surveys will be explored to better define DNA methylation and the ruminant microbiome, which in turn will improve overall annotation of genes, genetic variation, epigenetic variation, and other sequence motifs affecting phenotype expression.
Objective 2: Utilize genotypic data to enhance genetic improvement in ruminant production systems. This objective has two components. The first component identifies signatures of selection and evaluates the potential to develop community-based breeding programs based on population structure and management system limitations in goats. The second component requires the optimization and application of statistical methodologies to develop inexpensive low-density SNP panels that can be used to guide genetic improvement of production traits while maintaining variants enriched by natural selection during adaptation of local breeds to marginal production environments.
Objective 3: Characterize functional genetic variation for improved fertility, growth, and environmental sustainability of ruminants. This objective uses whole-genome or exome resequencing to detect genetic variation that affects fertility, growth, and environmental sustainability during early embryonic development or during adaptation to climate or disease. The resulting sequence information will be integrated with other database resources that provide basic information about gene expression activity and motif patterns to guide selection of positional candidate genes for further study and validation of functional annotation in ruminants.
Sub-objectives for Objectives 1, 2, and 3 are listed in the post plan under related documents.

Completion of our objectives is expected, in the short term, to improve genome-wide selection in the U.S. dairy industry as well as facilitate new genome-enhanced breeding strategies to bring economic and genetic stability to various ruminant value chains in developing nations. Ultimately, longer term objectives to identify and understand how causative genetic variation affects livestock biology will require a combination of genome sequencing and comparative genomics, quantitative genetics, epigenomics and metagenomics, all of which are components of this project plan and areas of expertise in our group. Efforts to characterize genome activity and structural conservation/variation are an extension of our current research program in applied genomics. This project plan completely leverages the resources derived from the Bovine Genomes, HapMap, 1000 Bull Genomes and FAANG projects, and genotypic data derived from the Council on Dairy Cattle Breeding (CDCB) genome-enhanced genetic evaluations for North American dairy cattle.

Progress Report
Progress was made on all three objectives of project 8042-31000-001-00D (Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection). For Objective 1 (develop biological resources and computational tools to enhance characterization of breed-specific bovine and other genomes), ARS scientists in Beltsville, Maryland, continued as global leaders for the improvement of cattle genome assemblies using sequence data from third-generation sequencing and mapping platforms (PacBio, Oxford Nanopore, and Hi-C) to assemble breed-specific genomes. Assemblies have been released for the Hereford, Holstein, and Jersey breeds. After pioneering the trio-binning method for the generation of fully haplotype-resolved genomes with Angus and Brahman breeds, scientists have generated genome assemblies from trios of Highland cattle and yak, Piedmontese cattle and gaur, and Simmental cattle and bison. Trio-binned haplotype-resolved cattle assemblies are in progress for Holsteins and Jerseys to improve earlier efforts as well as for tropically adapted indicine breeds (Gir, Sahiwal, and Tharparkar). Additionally, the genome of the Rambouillet ewe used for the ovine Functional Annotation of Animal Genomes (FAANG) project was improved by long-read sequencing, and the de novo assembly was released along with assemblies of the White Dorper and Romanov sheep breeds. Transcript sequencing analyses were completed in over 100 tissues/cell types among over 40 breeds. A high-quality cattle genotype-tissue expression (GTEx) atlas was built, and tissue-specific gene contribution to complex traits was studied. Copy-number variation (CNV) discovery was performed using both long-read and short-read sequencing data; in addition, short-read CNV discovery and association studies were conducted with collaborators for cattle and goats.
For Objective 2 (utilize genotypic data to enhance genetic improvement in ruminant production systems), development of genomic tools for selection continued. Space-Chip, a DNA-chip design software, continues to be enhanced and used to design new DNA chips. These improvements enable continued development of specialized single-nucleotide polymorphism (SNP) assays for genomic prediction in a broad range of species. Genome assemblies and genotyping chips from extensive genome sequence data were developed with Indian collaborators for use in water buffalo and indicine cattle. The IndiGau 788K SNP chip was designed using the Space-Chip software, and preliminary results indicate this array will perform well, especially for Indian-derived cattle. Additional SNPs were selected in collaboration with the International Goat Genome Consortium, AdaptMap, and VarGoats to augment the Illumina GoatSNP50 BeadChip for enhanced utility in more diverse goat breeds.
For Objective 3 (characterize functional genetic variation for improved fertility, growth, and environmental sustainability of ruminants), sequencing data were analyzed for a better understanding of functional genetic variations. Using 172 sequenced Holstein bulls and newly assembled immune gene haplotypes, 155 candidate SNPs were discovered that distinguish between alleles of cattle immune genes that provide innate resistance to diseases. Of those candidate markers, 124 have been used in custom genotype panels to determine their frequency in a cohort of 1,800 cows. Genome-wide association studies involving the larger cohort found that two of the markers predicted increased susceptibility to bovine tuberculosis and will be useful in future genetic evaluations for tuberculosis resistance. A concurrent study at USDA’s Meat Animal Research Center in Clay Center, Nebraska, identified two custom markers that had strong and significant protective effects against persistent infection of bovine viral diarrhea. Combining parental genome and epigenome information, 16 and 25 genes were detected as potential candidate markers for male fertility and gestation length, respectively. Association studies were also performed to investigate the genetic basis of seven health traits in dairy cattle. Six significant associations and 20 candidate genes were identified for cattle health. All markers have been made public to assist in future cattle health and production genomic selection efforts.

1. Bovine Pangenome Consortium and generation of better genome assemblies. A reference genome assembly provides the foundation for genomic analysis, but the current cattle reference is derived from a single Hereford cow and is insufficient to describe the full extent of genetic variation in cattle. Led by ARS scientists in Beltsville, Maryland; Madison, Wisconsin; and Clay Center, Nebraska, the Bovine Pangenome Consortium was launched to create and improve genome assemblies for bovine species of economic and biodiversity importance and now includes over 60 members that represent 40 institutions in 20 countries. Using the trio-binning method, which is based on genomes from a family trio of parents and an offspring, the Consortium has released six breed-specific reference assemblies and is developing methods to incorporate those assemblies into a single reference. Genome assemblies were generated from trios from Brahman and Angus cattle, Highland cattle and yak, and bison and Simmental cattle; those assemblies were published in Nature Communications, GigaScience, and the Journal of Heredity, respectively. These six genome assemblies are among the most complete and accurate vertebrate genomes ever produced and have the potential to increase accurate identification and selection for production traits in target populations.
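
The idea behind trio binning can be illustrated with a minimal Python sketch. It is a toy illustration only, not the Consortium's pipeline: the function names (`kmers`, `parent_specific_kmers`, `bin_read`), the tiny k-mer size, and the example sequences are all hypothetical. It assumes that short subsequences (k-mers) found in only one parent can be used to assign each offspring read to the sire or dam haplotype before the two haplotypes are assembled separately. Real tools also handle reverse complements, sequencing errors, and normalization by k-mer set size, which this sketch omits.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(sire_reads, dam_reads, k):
    """Return (sire_only, dam_only): k-mers seen in exactly one parent."""
    sire = set().union(*(kmers(r, k) for r in sire_reads))
    dam = set().union(*(kmers(r, k) for r in dam_reads))
    return sire - dam, dam - sire


def bin_read(read, sire_only, dam_only, k):
    """Assign an offspring read to the parent whose private k-mers it shares most."""
    read_kmers = kmers(read, k)
    sire_hits = len(read_kmers & sire_only)
    dam_hits = len(read_kmers & dam_only)
    if sire_hits > dam_hits:
        return "sire"
    if dam_hits > sire_hits:
        return "dam"
    return "unassigned"  # no informative k-mers, or a tie


# Hypothetical toy data: the parents differ at a single base.
K = 4
sire_only, dam_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], K)

for offspring_read in ["GTACGT", "GTTCGT", "ACGT"]:
    print(offspring_read, bin_read(offspring_read, sire_only, dam_only, K))
```

In this toy case the first read carries only sire-private k-mers, the second only dam-private k-mers, and the third only k-mers shared by both parents, so it stays unassigned.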

2. Farm Genotype-Tissue Expression (FarmGTEx) Consortium and the Cattle Gene Atlas. Understanding the regulation of livestock gene expression is important for studying the biological mechanisms that underlie economic traits and for improving animal selection. FarmGTEx is an international collaborative to provide a comprehensive atlas of tissue-specific gene expression and genetic regulation in farm animals. Led by ARS scientists in Beltsville, Maryland, and researchers at the University of Edinburgh in Edinburgh, Scotland, the FarmGTEx Consortium includes over 20 universities and institutes around the world. The pilot phase of FarmGTEx built a Cattle Gene Atlas for the research community based on almost 12,000 publicly available RNA-sequence datasets that represent over 100 tissues and cell types among over 40 breeds, and the atlas describes the landscape of transcriptome (the RNA expressed by an organism’s genome) across tissues and reports thousands of variants associated with gene expression and alternative splicing (a process that enables RNA to direct synthesis of different protein variants with different cellular functions or properties) for 24 major tissues in cattle. Additionally, this work detected 496 gene-tissue pairs significantly associated with 43 economically important traits in cattle via a large transcriptome-wide association study. A portal was developed to allow researchers to query gene expression, alternative splicing, and DNA regions associated with particular traits in an easy and uniform way across tissues and to serve as a primary reference source for cattle genomics, cattle breeding, adaptive evolution, comparative genomics, and veterinary medicine.

3. IndiGau, a high-density Zebu cattle genotyping platform. Zebu (or indicine) cattle, which originated on the Indian subcontinent, tend to be much more heat tolerant and resilient to parasites and disease, and their genetics are of increasing interest in combating the challenges of climate change. ARS scientists in Beltsville, Maryland, worked with scientists at the National Institute of Animal Biotechnology (NIAB) in India to develop a high-density genotyping array (the IndiGau chip) for Zebu cattle using sequence data from 20 animals for each of five breeds (Sahiwal, Gir, Tharparkar, Red Sindhi, and Kankrej) as well as from 2 animals for each of 38 remaining Indian breeds. Over 6 million high-quality genetic markers were identified, and an additional 1 million markers were available from several commercial genotyping arrays. The Space-Chip software developed by ARS researchers in Beltsville, Maryland, was used to select the best markers for performance of Zebu cattle, and the final IndiGau chip includes almost 800,000 markers. The IndiGau chip will be optimal for marker-enhanced genetic improvement and for understanding the purity and diversity among Zebu breeds.

4. New methods for microbiome screening. The microbiome is the combined genetic material of all microorganisms (bacteria, fungi, protozoa, and viruses) that live in a particular environment. Because those microorganisms exist in large communities that are difficult to assess using older DNA sequencing technologies, ARS scientists in Madison, Wisconsin, led a research project conducted by an international and interdisciplinary team of researchers from four countries (Russia, the Netherlands, Israel, and the United States) and two private U.S. companies (Phase Genomics and Pacific Biosciences) to develop new methods for microbiome screening. Using the latest high-accuracy, long-read DNA sequencing technologies, microbial strains could be resolved down to single nucleotide variants in the population. Over 44 bacterial genomes were assembled into single, continuous chromosome genomes, the highest number ever achieved from a single sequenced sample, and using additional DNA sequencing methods, over 400 viral-host and 250 plasmid-host associations were identified in this one sample. A manuscript describing the study is in review but also was released as a preprint so that results could be shared with the research community. These discoveries represent the highest resolution image of genomic DNA prevalence and transfer within a single community and will impact the interpretation of future microbiome sequencing results.

5. High-throughput method to assess microbial virus-host associations. Traditionally, researchers had to rely on direct observation via sequencing or through classical microbial isolation to determine virus-host associations. However, a better metagenomics tool could provide therapeutics and prophylactics in agriculture and human medicine. A collaboration between ARS scientists in Madison, Wisconsin, and Clay Center, Nebraska, and industry partner Phase Genomics in Seattle, Washington, has resulted in the development of a new sequence-based method that does not require manual labor and has high throughput. This method, which has been called “proxiPhage” by Phase Genomics, identifies host-virus associations using intracellular DNA-protein interactions and can detect a viral genome within a specific bacterial cell, which is direct evidence of infection. A preprint that details the method is being made available in advance of a more detailed survey on other metagenomics datasets. The beta version of proxiPhage was previously highlighted for a Federal Laboratory Consortium for Technology Transfer award and has already been used in clinical settings to identify gene alleles in the environment that are considered to provide antimicrobial resistance.

Review Publications
Hu, Y., Xia, H., Li, M., Xu, C., Ye, X., Su, R., Zhang, M., Guo, A., Nash, O., Sonstegard, T.S., Yang, L., Liu, G., Zhou, Y. 2020. Comparative analyses of copy number variations between Bos taurus and Bos indicus. BMC Genomics. 21(1):682.
Guo, J., Zhong, J., Liu, G., Yang, L., Li, L., Chen, G., Song, T., Zheng, H. 2020. Identification and population genetic analyses of copy number variations in six domestic goat breeds and Bezoar ibexes using next-generation sequencing. BMC Genomics. 21(1):840.
Zhang, W., Qu, Y., Lin, M., Datta, A., Liu, G., Li, B. 2020. Immune cells signaling-pathway and genomic profiles for personalized immunotherapy (Chapter 2). In: Li, B., Larson, A., Li, S. editors. Personalized Immunotherapy for Tumor Diseases and Beyond. Singapore, Singapore: Bentham Science Publishers Pte. Ltd. p. 20-42.
Zhang, W., Liu, G., Devemy, E., Li, B. 2020. Molecular screening and neoantigen cloning - Fundamental of adoptive T-cell immunotherapy (Chapter 6). In: Li, B., Larson, A., Li, S. editors. Personalized Immunotherapy for Tumor Diseases and Beyond. Singapore, Singapore: Bentham Science Publishers Pte. Ltd. p. 80-96.
Liu, G., Zheng, J., Li, B. 2020. Bioinformatics of T-cell and primary tumor cells - Fundamental of adoptive T-cell immunotherapy (Chapter 8). In: Li, B., Larson, A., Li, S. editors. Personalized Immunotherapy for Tumor Diseases and Beyond. Singapore, Singapore: Bentham Science Publishers Pte. Ltd. p. 118-136.
Li, B., Liu, G., Zheng, J. 2020. System modeling of T-cell function - Development of adoptive T-cell immunotherapy (Chapter 12). In: Li, B., Larson, A., Li, S. editors. Personalized Immunotherapy for Tumor Diseases and Beyond. Singapore, Singapore: Bentham Science Publishers Pte. Ltd. p. 197-223.
Yan, Z., Huang, H., Freebern, E., Santos, D.J., Dai, D.D., Si, J., Ma, C., Cao, J., Guo, G., Liu, G., Ma, L., Fang, L., Zhang, Y. 2020. Integrating RNA-Seq with GWAS reveals novel insights into the molecular mechanism underpinning ketosis in cattle. BMC Genomics. 21(1):489.
Liang, D., Zhao, P., Si, J., Fang, L., Xu, Q., Hou, Y., Hu, X., Gong, Y., Liang, Z., Tian, B., Mao, H., Yindee, M., Faruque, M.O., Liu, G., Wu, D., Barker, J.S., Han, J., Zhang, Y. 2020. A LINE-1 derived promoter driving over-expression of the ASIP gene is responsible for white color in swamp buffalo. Molecular Biology and Evolution. 38(3):1122-1136.
Zhao, G., Zhang, T., Liu, Y., Wang, Z., Xu, L., Zhu, B., Gao, X., Zhang, L., Gao, H., Liu, G., Li, J., Xu, L. 2020. Genome-wide assessment of runs of homozygosity in Chinese Wagyu beef cattle. Animals. 10(8):E1425.
Kaumbata, W., Banda, L.J., Meszaros, G., Gondwe, T.N., Woodward Greene, M.J., Rosen, B.D., Van Tassell, C.P., Solkner, J., Wurzinger, M. 2020. Tangible and intangible benefits of local goats rearing in smallholder farms in Malawi. Small Ruminant Research. 87:106095.
Gao, Y., Fang, L., Baldwin, R.L., Connor, E.E., Cole, J.B., Van Tassell, C.P., Ma, L., Li, C., Liu, G. 2021. Single-cell transcriptomic analyses of cattle ruminal epithelial cells before and after weaning. Genomics. 113(4):2045-2055.
Nandolo, W., Meszaros, G., Banda, L.J., Gondwe, T.N., Mulindwa, H.A., Nakimbugwe, H.N., Wurzinger, M., Clark, E., Woodward Greene, M.J., Liu, M., Liu, G., Van Tassell, C.P., Rosen, B.D., Solkner, J. 2021. Detection of copy number variants in African goats using whole genome sequence data. BMC Genomics. 22(1):398.
Naji, M.M., Utsunomiya, Y.T., Solkner, J., Rosen, B.D., Meszaros, G. 2021. Investigation of ancestral alleles in the Bovinae subfamily. BMC Genomics. 22(1):108.
Heaton, M.P., Smith, T.P.L., Bickhart, D.M., Vander Ley, B.L., Kuehn, L.A., Oppenheimer, J., Shafer, W.R., Schuetze, F.T., Stroud, B., McClure, J.C., Barfield, J.P., Blackburn, H.D., Kalbfleisch, T.S., Davenport, K.M., Kuhn, K.L., Green, R.E., Shapiro, B., Rosen, B.D. 2021. A reference genome assembly of Simmental cattle, Bos taurus taurus. Journal of Heredity. 112(2):184-191.
Oppenheimer, J., Rosen, B.D., Heaton, M.P., Vander Ley, B.L., Shafer, W.R., Schuetze, F.T., Stroud, B., Kuehn, L.A., McClure, J.C., Barfield, J.P., Blackburn, H.D., Kalbfleisch, T.S., Bickhart, D.M., Davenport, K.M., Kuhn, K.L., Green, R.E., Shapiro, B., Smith, T.P.L. 2021. A reference genome assembly of American bison, Bison bison bison. Journal of Heredity. 112(2):174-183.
Edwards, R.J., Field, M.A., Ferguson, J.M., Dudchenko, O., Keilwagen, J., Rosen, B.D., Johnson, G.S., Rice, E., Hillier, L., Hammond, J.M., Towarnicki, S.G., Omer, A., Skvortsova, K., Bogdanovic, O., Zammit, R.A., Aiden, E.L., Warren, W.C., Ballard, J.W. 2021. Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome. BMC Genomics. 22(1):188.
Kaumbata, W., Nakimbugwe, H.N., Nandolo, W., Banda, L.J., Meszaros, G., Gondwe, T.N., Woodward Greene, M.J., Rosen, B.D., Van Tassell, C.P., Solkner, J., Wurzinger, M. 2021. Experiences from the implementation of community-based goat breeding programs in Malawi and Uganda: a potential approach for conservation and improvement of indigenous small ruminants in smallholder farms. Sustainability. 13(3):1494.