Research Project: The Microbiome as a Mediator of Host-Genome-Determined Lactation Outcomes and the Liver-Gut Axis in Lactation

Location: Children's Nutrition Research Center

2022 Annual Report

Objective 1: Select inbred mouse strains with phenotypic extremes in milk production will be used to: a) identify genomic variants, along with intestinal and mammary-expressed genes, that differentiate low and high milk production, and b) determine the extent to which genome-driven differences in milk production and mammary gene expression are directly mediated through host-dependent differences in the intestinal and/or mammary tissue microbiome.
Subobjective 1A: Sequence the genomes of additional unsequenced strains from our original milk yield cohort, and then use the completed lactation phenome genotype data to identify strain-specific private alleles and predict the functional consequences of these variants on genes with the potential to regulate traits defined in the lactation phenome dataset.
Subobjective 1B: Combine the lactation phenome dataset with the expanded common variant data from Subobjective 1A to conduct an enhanced joint-GWAS of SNP, INDEL, and SV, and subsequently predict the functional consequences of the newly identified variants for lactation.
Subobjective 1C: Using a complete 3x3 diallel cross of QSi3, QSi5, and PL/J, determine the contribution of strain-dosage, heterosis, parent-of-origin, and epistasis to milk production and composition and to mammary gland development during early lactation, and identify mammary epithelial cell and intestinal eGenes on the basis of allelic imbalance.
Subobjective 1D: Integrate the set of eGenes discovered in 1C with the set of private and common variants discovered in 1A and 1B, and employ network modeling to predict and test those variant-eGene pairs that are most likely to cause the variation in the lactation phenome traits.
Subobjective 1E: Analyze the fecal microbiota along with prolactin and oxytocin in samples obtained from the diallel conducted under Subobjective 1C to determine the contribution of strain-dosage, heterosis, parent-of-origin, and epistasis to the diversity and richness of the intestinal microbiota, to the abundance of specific taxa, and to neuroendocrine function in mouse strains with a genetic propensity for high or low milk yield.
Objective 2: Determine the short and long-term impact of lactation on the maternal hepatic metabolome composition and hepatic signaling pathways in mice.
Objective 3: Determine the impact of maternal nuclear receptor signaling on the maternal hepatic metabolome and pup viability.

Genetic background is known to influence variation in milk production; however, environmental factors also play a role. Advances in high-throughput DNA sequencing technologies have revolutionized the way in which the microbial world is viewed and have led to the concept that the microbiome is a major regulator of normal development and health. The microbiome is regulated by diet, but is also under the control of the host genome. In this regard, the full number of host genetic variants associated with lactation-related traits remains to be determined. Differences in milk production are driven by changes in gene expression within organs important to milk synthesis. Additionally, the intestinal microbiome is controlled by the host genome, but can directly influence gene expression within the host. We aim to understand how variations in the maternal genome interact with the microbiome to determine lactation success. Whole genome sequence data from select mouse strains will be used to identify genetic variants that are unique to high or low milk production. These newly identified variants will be functionally linked to milk production and composition, and to lactation-induced intestinal and mammary gene expression, through a specific RNA sequencing test known as allelic imbalance. Strain- and allele-dependent differences in fecal 16S ribosomal RNA gene sequencing reads will associate the variants with the intestinal microbiome. Lastly, maternal microbiome seeding through neonatal cross-fostering will establish the ability of the intestinal microbiome to override the effects of genetic background on lactation-dependent gene expression and milk production.

Additionally, although rates of breastfeeding (BF) have increased, there is much variability in BF initiation and duration rates. The prevalence of lactation insufficiency, the inability to produce enough breast milk to support offspring development, is estimated to be between 40% and 60%. 
The underlying mechanisms of lactation insufficiency are not well understood and require more study. In lactating rodents, the liver and small intestine undergo metabolic changes that support the production of mature milk in the mammary gland, including significant increases in hepatic and intestinal bile acids. In lactating animal models, key enzymes involved in cholesterol and lipid homeostasis are altered during lactation. Bile acids promote the solubilization of cholesterol and lipid-soluble nutrients, which enhances milk lipid nutrient composition. The genes encoding these enzymes are regulated by a group of transcription factors called nuclear receptors, notably the metabolic nuclear receptors farnesoid X receptor (FXR) and peroxisome proliferator-activated receptor alpha (PPARalpha). These nuclear receptors and their target genes represent novel targets for study to address our central hypothesis that manipulation of hepatic and intestinal nuclear receptors alters lipid composition in breast milk. The overall goal of this project is to determine the role of FXR and PPARalpha in the metabolic adaptations of the maternal liver-gut axis.

Progress Report
We completed work to create a dataset consisting of 25 million genomic variant calls in the Lactation Phenome mouse families and filtered the calls to produce a more focused list of 1,701,355 private variants (Sub-objective 1A). We continued assigning these private variants and their affected genes to biological pathways. This required the creation of subtracted gene lists for all 118 maternal traits that were measured. The lists were created by ranking the performance of the mouse families for each trait and taking the variant and gene lists for the top and bottom 20%. These lists were then subtracted from each other to produce 236 smaller filtered lists that were specific to either "high" or "low" performance in each trait. The filtered lists were uploaded to a pathway-enrichment database, and a statistical test for pathway over-representation detected a total of 63 over-represented pathways. Of these, 59 were accounted for by milk composition traits, of which 55 were fatty acid composition traits. We ranked the pathways by how often they were detected in the analysis. The top 7 most frequently over-represented pathways were "Pentose and Glucuronate Interconversions", "PI3-Kinase Signaling", "Steroid Hormone Biosynthesis", "Drug Metabolism", "Notch Signaling", "Transport of Glucose, Other Sugars, Bile Salts, Organic Acids", and "Diseases of Glycosylation". These pathways allow us to focus on 3,133 private variants and the 229 genes that they impact. We used a database that allowed us to intersect the variants and genes with other known features in the mouse genome that are important for regulating gene expression, including markers of actively transcribed DNA, transcription factor binding sites, and specialized cell-fate determining regions known as "super-enhancers". In the lactating mammary gland, the super-enhancer of greatest importance is known as a "STAT5" super-enhancer. 
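The trait-based list subtraction described above can be illustrated with a minimal Python sketch. It is not the code used in the project: the function name, the toy family and gene names, and the choice to pool genes across the families within each extreme group are all assumptions made for illustration. It ranks families on one trait, takes the top and bottom 20%, and keeps only the genes unique to each extreme, which yields two lists per trait (consistent with 236 lists for 118 traits).

```python
import math


def subtracted_gene_lists(trait_by_family, genes_by_family, fraction=0.2):
    """Return (high_only, low_only) gene sets for a single trait.

    trait_by_family: dict mapping family name -> trait value
    genes_by_family: dict mapping family name -> set of variant-containing genes
    Assumption: genes are pooled (set union) across the families in each group.
    """
    # Rank families from lowest to highest trait value.
    ranked = sorted(trait_by_family, key=trait_by_family.get)
    # Number of families in each extreme group (at least one).
    k = max(1, math.floor(len(ranked) * fraction))
    low_genes = set().union(*(genes_by_family[f] for f in ranked[:k]))
    high_genes = set().union(*(genes_by_family[f] for f in ranked[-k:]))
    # Subtract the lists from each other to keep group-specific genes only.
    return high_genes - low_genes, low_genes - high_genes


# Hypothetical toy data: five families, one trait.
trait = {"fam1": 1.0, "fam2": 2.0, "fam3": 3.0, "fam4": 4.0, "fam5": 5.0}
genes = {
    "fam1": {"GeneA", "GeneB"},
    "fam2": {"GeneD"},
    "fam3": {"GeneE"},
    "fam4": {"GeneF"},
    "fam5": {"GeneB", "GeneC"},
}
high_only, low_only = subtracted_gene_lists(trait, genes)
print(sorted(high_only), sorted(low_only))
```

In this toy example, "GeneB" appears in both extreme groups and is therefore removed from both lists, leaving one gene specific to "high" and one specific to "low" performance.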
We identified 14 variants that intersected both the super-enhancer and one or more transcription factor binding sites and that have the potential to regulate 5 genes. We will confirm the importance of these genes by comparing their levels of expression in mammary and other tissues from the Lactation Phenome mouse families. To identify additional genes of importance, we used an RNA-sequencing dataset that our lab generated from a subset of 6 Lactation Phenome mouse families in which the females displayed extremely high or low milk yield. The dataset was derived from mammary tissue RNA and allows for the analysis of 53,715 messenger RNAs. By comparing each family to those in the opposing milk-yield group, we identified 679 variants within 59 genes that were both differentially expressed and contained one or more private variants. Intersecting these genes and variants with other genomic features identified a single gene that contained a potentially disrupted transcription factor binding site within the STAT5 super-enhancer. This gene will be further investigated. We also analyzed the RNA-sequencing data in a slightly different way, testing for sets of differentially expressed genes that distinguish each of the 6 families. We discovered an additional 54 private variants within 12 additional differentially expressed genes. Of these, 4 genes contained private variants overlapping transcription factor binding sites. These will also be the subject of further study. Our work on Sub-objective 1A this year allowed us to focus on 10 private variant-containing genes that could be considered biologically meaningful regulators of lactation outcomes.

For Sub-objective 1B, we had produced and were ready to analyze a genome data set containing 25 million high-quality DNA sequence variant calls for all 31 families in our Lactation Phenome mouse cohort. 
The amount of data that we intended to work with was 100-fold more than we had ever worked with, and we were able to access the Baylor College of Medicine computing cluster to process it. Unlike the dataset analyzed in Sub-objective 1A, the DNA sequence data for 1B contained private and rare variants as well as variants that are classified as "common" (sequence variants that are present in at least 5%, and up to 50%, of the population). We had to reformat the dataset into a structure that could be processed by "plink", a software package that is commonly used in human genetic studies. Even with the use of the computing cluster, we had to subdivide the data into smaller sets to facilitate this processing. For the data reformatting step, we divided the data set into subsets containing the data from each of the 20 mouse chromosomes. We used plink to define important regions of each chromosome that are known as "haplotype blocks" and to format the data sets into a structure that could be used by another software package known as "gemma". We used gemma to conduct statistical tests to identify DNA sequence variants that were associated with each of our maternal traits. We have conducted statistical testing on the 51 milk fatty acid composition traits that are in the dataset and are summarizing the results.

For Sub-objective 1C, we are half-way through the animal breeding and sample collection phase. To do this, we set up 9 sets of breeding pairs to cross three different mouse families. The pairs were set to produce all possible mother/father combinations in the female progeny. As of November, we had collected about 85% of the samples required for the study. The work was paused from December through March due to the departure of support personnel and the time needed to refill the position. We have worked to generate the samples from the remaining mating combinations to complete this phase. 
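The 9 breeding-pair sets follow directly from crossing three families in every mother/father combination. As a minimal sketch, the design can be enumerated in Python; the strain names come from Subobjective 1C, while the function name and the dictionary layout are hypothetical and chosen only for illustration.

```python
from itertools import product

# Strain names are taken from Subobjective 1C; the record layout is hypothetical.
STRAINS = ("QSi3", "QSi5", "PL/J")


def diallel_crosses(strains=STRAINS):
    """List every dam x sire combination, including reciprocal and within-strain crosses."""
    return [
        {"dam": dam, "sire": sire, "within_strain": dam == sire}
        for dam, sire in product(strains, repeat=2)
    ]


crosses = diallel_crosses()
for cross in crosses:
    print(f"{cross['dam']} (dam) x {cross['sire']} (sire)")
```

A complete 3x3 diallel therefore contains 3 within-strain crosses and 6 between-strain crosses arranged as 3 reciprocal pairs. Each reciprocal pair swaps which strain serves as dam and which as sire, which is what makes it possible to examine parent-of-origin effects.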
Work continues with new staff, and once these last samples are collected there will be a cluster of results available. Work under Sub-objective 1C also required the isolation and sequencing of mammary and intestinal RNA from the females in this study. We have been able to complete the RNA-sequencing on 18 of 36 mammary samples and were poised to obtain the first results comparing gene expression between the maternal and paternal copies of each gene. Genes for which expression differs between the mother's and father's copies are said to exhibit "allelic imbalance" (indicating that a biologically significant DNA sequence variant differs between the maternal and paternal copies). We used two specialized open-source software packages to conduct this analysis. The first, "G2gtools", is designed to use genome reference files to construct the customized, family-specific references that are necessary to accurately map the RNA sequencing reads that each parent contributes to the F1 progeny. The second, "Emase", is used to map the reads, classify them, and count them. While running our reference genome and variant data through g2gtools, we encountered an unexpected error with the script, which put a temporary pause on this work. We have been in contact with the authors of the script and will rely on their help to troubleshoot the issue.

Also related to Sub-Objective 1B, we analyzed 1,500 digital images of mammary tissue sections collected from 31 different mouse families. To measure the extent of mammary gland development in the lactating females from these families, we used image analysis software to measure and count the mammary alveoli, which are the secretory structures of the gland, and the fat cells (adipocytes). With these data, we have conducted a preliminary genome-wide association analysis using our original DNA sequence variant data set of 315 thousand variants. 
We detected 1,243 variants associated with size and number of mammary adipocytes and 86 variants associated with size and number of mammary alveoli. These new maternal traits now also contribute to our Lactation Phenome dataset and will be included in the repeated analysis that is based on the large 25 million variant dataset.

Additionally, a new project was established due to the hiring of a new research scientist. Lactation is a physiological state that exerts profound increases in energy demand and requires metabolic adaptation to meet the nutritional requirements of the mother and offspring. Although there is a positive association of breastfeeding with wellness and health for both mother and offspring, there are very limited data regarding the maternal metabolic adaptations during lactation. Essential metabolic processes in the liver are regulated by the expression of genes controlled by a group of transcription factors known as nuclear receptors, in particular farnesoid X receptor (FXR). We have initiated studies for Objective 1 to determine whether metabolic gene expression in the liver is changed in lactating mice and whether different mouse models have impaired metabolic adaptations during lactation as associated with Objective 2. Previously, we established the following mouse model breeding colonies to be used in our studies: wild-type (control), metabolic nuclear receptor knockout (FXR knockout), and mice genetically modified to develop extreme liver accumulation of copper (Atp7b knockout). At day one after birth, the litters were culled to 5-6 pups to provide similar suckling stimulus and growth curves for the litters obtained. At day 14 postpartum (the phase of lactation with maximal milk production), the livers were collected from lactating dams, as well as from age-matched virgin female wild-type controls. We collected preliminary data for the average pup weight per litter and the relative expression of genes required for cholesterol, bile acid, and glucose homeostasis. 
Our preliminary data indicate that pup weight is lower in FXR and Atp7b knockout mice relative to the wild-type controls. Lactation in all genotypes was associated with altered levels of metabolic gene expression relative to virgin female wild-type controls. These preliminary findings support the hypothesis that maternal liver metabolic adaptations occur during lactation.
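
The allelic imbalance comparison described under Sub-objective 1C can be illustrated with a simple per-gene check. The Python sketch below is a generic exact binomial test on maternal versus paternal read counts; it is an assumption-laden illustration and does not reproduce the read mapping, classification, or counting performed by g2gtools and Emase. The function name and the example read counts are hypothetical.

```python
from math import exp, lgamma, log


def allelic_imbalance_p(maternal_reads, paternal_reads, expected_maternal=0.5):
    """Two-sided exact binomial p-value for one gene.

    Null hypothesis: each allele-specific read comes from the maternal copy
    with probability `expected_maternal` (0.5 means balanced expression).
    """
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0  # no allele-specific reads, nothing to test

    def pmf(k):
        # Binomial probability computed in log space to stay stable for large counts.
        log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(expected_maternal)
                 + (n - k) * log(1.0 - expected_maternal))
        return exp(log_p)

    observed = pmf(maternal_reads)
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    total = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts for two genes in one F1 female.
balanced = allelic_imbalance_p(5, 5)
skewed = allelic_imbalance_p(10, 0)
print(balanced, skewed)
```

A small p-value suggests that the two parental copies of a gene are not expressed equally. In a genome-wide analysis the p-values would also need correction for multiple testing, which is not shown here.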