Location: Children's Nutrition Research Center
2020 Annual Report
Objective 1: Select inbred mouse strains with phenotypic extremes in milk production will be used to: a) identify genomic variants along with intestinal and mammary-expressed genes that differentiate low and high milk production, and b) determine the extent to which genome-driven differences in milk production and mammary gene expression are directly mediated through host-dependent differences in the intestinal and/or mammary tissue microbiome.
Subobjective 1A: Sequence the genomes of additional unsequenced strains from our original milk yield cohort and then use this completed lactation phenome genotype data to identify strain-specific private alleles and predict the functional consequences of these variants on genes with the potential to regulate traits defined in the lactation phenome dataset.
Subobjective 1B: Combine the lactation phenome dataset with the expanded common variant data from Subobjective 1A to conduct an enhanced joint-GWAS of SNP, INDEL, and SV, and subsequently predict the functional consequences of the newly identified variants for lactation.
Subobjective 1C: Using a complete 3x3 diallel cross of QSi3, QSi5, and PL/J, determine the contribution of strain-dosage, heterosis, parent-of-origin, and epistasis to milk production and composition and to mammary gland development during early lactation, and identify mammary epithelial cell and intestinal eGenes on the basis of allelic imbalance.
Subobjective 1D: Integrate the set of eGenes discovered in 1C with the set of private and common variants discovered in 1A and 1B, and employ network modeling to predict and test those variant-eGene pairs that are most likely to cause the variation in the lactation phenome traits.
Subobjective 1E: Analyze the fecal microbiota along with prolactin and oxytocin in samples obtained from the diallel conducted under Subobjective 1C to determine the contribution of strain-dosage, heterosis, parent-of-origin, and epistasis to the diversity and richness of the intestinal microbiota, to the abundance of specific taxa, and to neuroendocrine function in mouse strains with a genetic propensity for high or low milk yield.
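As a rough illustration of the design named in Subobjective 1C, a complete 3x3 diallel pairs each of the three strains as dam with each strain as sire, which gives nine crosses: three inbred parental crosses and six reciprocal F1 hybrids. Comparing a hybrid with its reciprocal is what allows parent-of-origin effects to be separated from strain dosage. The sketch below only enumerates the crosses; the dam-by-sire ordering and the function names are illustrative assumptions, not the project's breeding records.

```python
from itertools import product

# Strain names taken from Subobjective 1C.
STRAINS = ["QSi3", "QSi5", "PL/J"]

def diallel_crosses(strains):
    """Return every (dam, sire) pairing, including reciprocals and inbred parentals."""
    return list(product(strains, repeat=2))

def cross_type(dam, sire):
    """Label a cross as an inbred parental or an F1 hybrid."""
    return "inbred" if dam == sire else "F1 hybrid"

crosses = diallel_crosses(STRAINS)
for dam, sire in crosses:
    print(f"{dam} dam x {sire} sire: {cross_type(dam, sire)}")
```

Each hybrid appears twice with the parental roles swapped (for example, QSi3 dam x PL/J sire and PL/J dam x QSi3 sire), which is the contrast a parent-of-origin analysis relies on.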
Genetic background is known to influence variation in milk production; however, environmental factors also play a role. Advances in high-throughput DNA sequencing technologies have revolutionized the way in which the microbial world is viewed and have led to the concept that the microbiome is a major regulator of normal development and health. The microbiome is regulated by diet, but is also under the control of the host genome. In this regard, the full number of host genetic variants associated with lactation-related traits remains to be determined. Differences in milk production are driven by changes in gene expression within organs important to milk synthesis. Additionally, the intestinal microbiome is controlled by the host genome, but can itself directly influence gene expression within the host. We aim to understand how variations in the maternal genome interact with the microbiome to determine lactation success. Whole genome sequence data from select mouse strains will be used to identify genetic variants that are unique to high or low milk production. These newly identified variants will be functionally linked to milk production and composition, and to lactation-induced intestinal and mammary gene expression, through an RNA sequencing-based test of allelic imbalance. Strain- and allele-dependent differences in fecal 16S ribosomal RNA gene sequencing reads will be used to associate the variants with the intestinal microbiome. Lastly, maternal microbiome seeding through neonatal cross-fostering will establish whether the intestinal microbiome can override the effects of genetic background on lactation-dependent gene expression and milk production.
Our prior work measured traits connected to milk production and milk composition, along with a panel of other maternal traits, in a cohort of 31 inbred mouse strains that we now refer to as The Lactation Phenome Cohort. We have used these data in conjunction with genetic variant data to identify regions in the genome that are associated with these maternal traits and thus act as potential regulators of lactation. During the course of this work we realized that the genetic variant data for 9 of these strains were very sparse. We viewed this lack of data as a serious limitation to progress toward the objective of this project. This year we completed the whole genome sequencing of all 9 missing strains. We then used a number of software packages to process and analyze the resulting data. This work included quality assessment of the DNA sequencing reads, mapping of the reads to a reference genome, and finally the identification and calling of variants. As a result, we now have a completed catalog of 92,510,146 variants for all 31 strains in the cohort. This represents a 300-fold increase in the amount of genome data that we can now use to identify variants important to lactation. We have subsequently used these newly acquired data to identify what are known as private variants. Private variants are defined by their presence in a single individual (in this cohort, a single strain). In contrast, common variants, which make up the rest, are typically found in 5% or more of the individuals in a population. Private variants have often been found to have much larger biological effects than those that are common in a population. Our work with both classes of variants now uses the wealth of publicly available annotation data in various ontology databases to assign functions to the variants and identify those that could be important to lactation outcomes.
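The distinction between private and common variants described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the project: the input layout (one record per variant and carrier strain), the function name, and the toy variant IDs are all assumptions made for the example. With 31 strains, any variant carried by two or more strains has a frequency of at least 2/31 (about 6.5%), so it clears the 5% cutoff, which is consistent with common variants making up the rest of the catalog.

```python
from collections import defaultdict

N_STRAINS = 31            # cohort size stated in the report
COMMON_THRESHOLD = 0.05   # "5% or more" cutoff for common variants

def classify_variants(calls, n_strains=N_STRAINS, threshold=COMMON_THRESHOLD):
    """Label each variant by how many strains carry its non-reference allele.

    calls: iterable of (variant_id, strain) pairs, one per carrier strain.
    Returns a dict mapping variant_id to "private", "common", or "rare".
    """
    carriers = defaultdict(set)
    for variant_id, strain in calls:
        carriers[variant_id].add(strain)

    labels = {}
    for variant_id, strains in carriers.items():
        if len(strains) == 1:
            labels[variant_id] = "private"
        elif len(strains) / n_strains >= threshold:
            labels[variant_id] = "common"
        else:
            # Cannot occur with 31 strains; kept so the sketch works for larger cohorts.
            labels[variant_id] = "rare"
    return labels

# Hypothetical toy calls; the variant IDs are invented for illustration.
toy_calls = [
    ("var_A", "QSi5"),
    ("var_B", "QSi3"), ("var_B", "PL/J"),
    ("var_C", "QSi3"), ("var_C", "QSi5"), ("var_C", "PL/J"),
]
labels = classify_variants(toy_calls)
print(labels)
```

In practice the carrier information would come from the genotype columns of the variant call files rather than a hand-written list, and the private set would then be passed to functional annotation.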