1a. Objectives (from AD-416):
1. Epigenetic data, in the form of methylation data and histone modification data, will be collected on one subset of a NAM population. This will be the same population from which RNA-Seq gene expression data will be collected; 2. Loci at which epigenetic changes appear to be correlated with agronomic traits will be verified in other NAM populations.
1b. Approach (from AD-416):
Seventy lines will be sequenced to 20x coverage for the determination of methyl cytosines. DNA from the 70 lines will be treated using sodium bisulfite conversion kits and then sequenced on an Illumina HiSeq, 1 lane per sample (70 lanes of sequencing). This will be done in two replicates to reduce sample variation and provide more confidence in ‘epialleles’ (the epigenetic variants of the same gene).
Seventy lines will be sequenced to 5x using ChIP pulldowns for H3K9 methylation. Commercial antibodies for H3K9 methylation will be used to pull down DNA associated with this epigenetic mark. DNA will be sequenced to 5x per accession, four accessions per lane.
Seventy lines will be sequenced to 5x using ChIP pulldowns for H3K27 methylation. Commercial antibodies for H3K27 methylation will be used to pull down DNA associated with this epigenetic mark. DNA will be sequenced to 5x per accession, four accessions per lane.
Informatic analysis of the data will identify the epialleles correlated with agronomic traits. Collaboration with the University of Delaware will allow public access to all the data generated. In addition, the informaticists and postdoc on this project will analyze the data to find epigenetic marks (epialleles) that appear to be correlated with specific agronomic traits as identified by ARS.
We will confirm the association of putative epialleles with agronomic traits. The putative epialleles are of limited use until they are validated, and the NAM populations provide a unique opportunity for validation. Validation will be done using PCR to amplify the putative alleles. We will then sequence those amplicons in an indexed format. We will first focus on replicates of the initial population we targeted, then expand to other NAM subpopulations to determine how robust these epialleles are.
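The sequencing plan above implies a simple lane budget, which the following minimal Python sketch works through. It rests on two assumptions that the approach does not state outright: that each bisulfite sample occupies its own lane, and that each of the two replicates is sequenced separately. The helper name `lanes_needed` is hypothetical and used only for illustration.

```python
import math

# Figures taken from the approach text.
LINES = 70                      # NAM lines to be sequenced
CHIP_ACCESSIONS_PER_LANE = 4    # ChIP libraries multiplexed per lane
CHIP_MARKS = ("H3K9", "H3K27")  # histone marks targeted by ChIP

# Assumptions (not stated explicitly in the approach):
BISULFITE_SAMPLES_PER_LANE = 1  # one bisulfite sample per lane
BISULFITE_REPLICATES = 2        # each replicate sequenced separately


def lanes_needed(samples: int, samples_per_lane: int) -> int:
    """Hypothetical helper: whole lanes required, rounding a partial lane up."""
    return math.ceil(samples / samples_per_lane)


bisulfite_lanes = lanes_needed(LINES * BISULFITE_REPLICATES,
                               BISULFITE_SAMPLES_PER_LANE)
chip_lanes = {mark: lanes_needed(LINES, CHIP_ACCESSIONS_PER_LANE)
              for mark in CHIP_MARKS}
total_lanes = bisulfite_lanes + sum(chip_lanes.values())

print("Bisulfite lanes:", bisulfite_lanes)
print("ChIP lanes per mark:", chip_lanes)
print("Total lanes:", total_lanes)
```

Because 70 accessions do not divide evenly by four, the last ChIP lane for each mark would carry only two accessions under these assumptions.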
To better understand the soybean methylome, a total of six soybean methylome libraries (three different tissues with two biological replicates each) were sequenced. For each soybean tissue, ~98 percent of total cytosines in the genome were covered by at least one sequence read. Methylation levels were compared between libraries in each cytosine context. Average correlation coefficients between biological replicates were 0.963, 0.87 and 0.495 in the CG, CHG and CHH contexts, respectively. CG and CHG methylation were largely similar between replicates, but a clear difference was found in the CHH context, indicating that biological replicates are strongly recommended for additional methylome sequencing. Average correlation coefficients between different soybean tissues showed trends similar to those between replicates, suggesting that CHH methylation changes more dynamically during plant growth and development than CG and CHG methylation. By comparing different methylomes, we were able to construct a pipeline for determining Single Methylation Polymorphisms (SMPs) and Differentially Methylated Regions (DMRs), and we plan to use this pipeline for future analysis of the methylomes of the Nested Association Mapping (NAM) population.
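The replicate comparison described above can be illustrated with a minimal Python sketch that computes a Pearson correlation of per-site methylation levels separately for each cytosine context. This is not the project's pipeline: the input layout (a dictionary keyed by chromosome and position), the function names, and the toy values are all assumptions chosen for illustration, and the report does not say which correlation measure was used.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def correlation_by_context(rep1, rep2):
    """Correlate methylation levels at sites covered in both replicates.

    Each replicate maps (chromosome, position) -> (context, methylation level),
    where context is "CG", "CHG" or "CHH" and the level lies in [0, 1].
    """
    paired = defaultdict(lambda: ([], []))
    for site, (context, level1) in rep1.items():
        if site in rep2 and rep2[site][0] == context:
            xs, ys = paired[context]
            xs.append(level1)
            ys.append(rep2[site][1])
    # At least two shared sites are needed for a correlation to be defined.
    return {ctx: pearson(xs, ys) for ctx, (xs, ys) in paired.items()
            if len(xs) >= 2}


# Toy, made-up values: CG sites agree between replicates, CHH sites less so.
replicate_1 = {
    ("Chr01", 101): ("CG", 0.9), ("Chr01", 150): ("CG", 0.1),
    ("Chr01", 200): ("CG", 0.8), ("Chr01", 310): ("CHH", 0.1),
    ("Chr01", 420): ("CHH", 0.3), ("Chr01", 530): ("CHH", 0.2),
}
replicate_2 = {
    ("Chr01", 101): ("CG", 0.9), ("Chr01", 150): ("CG", 0.1),
    ("Chr01", 200): ("CG", 0.8), ("Chr01", 310): ("CHH", 0.2),
    ("Chr01", 420): ("CHH", 0.3), ("Chr01", 530): ("CHH", 0.1),
}

result = correlation_by_context(replicate_1, replicate_2)
for context, r in sorted(result.items()):
    print(context, round(r, 3))
```

Restricting the comparison to sites covered in both replicates keeps uncovered cytosines from being treated as unmethylated; a real analysis would likely also apply a minimum read-depth filter per site.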