2011 Annual Report
1a. Objectives (from AD-416)
The objective of this SCA is to increase our understanding of plant genomes through sequence generation and characterization.
1b. Approach (from AD-416)
Plant genomes are large and complex, making them a challenge to sequence and characterize. With recent changes in sequencing technology, it is now possible to generate sequence at a fraction of the cost. However, the sequence generated has different qualities that will require changes in the way we process and interpret it. Experimental and computational approaches will be reviewed and developed to make use of the new short-read technologies. This will include library development, integration of different sequencing methodologies, and development of computational pipelines to process, store, and interpret the data sets.
In FY 2011, the collaboration focused on developing resources to support genome annotation and analysis. Although one of the collaborating groups has an interest in plant-related research, its major focus is using next-generation sequencing technology to identify sequence-associated variation and its relationship to cognitive disorders. For this work, the group uses the technology to produce genome sequence for re-sequencing and de novo assembly, to assess expression, sequence, and methylation-state variation in complete genomes, and to optimize methods for targeted regions. As part of this collaboration, members have attended weekly group meetings that discuss the production and analysis of sequence, allowing technology transfer between the groups for library production as well as downstream sequence analysis. Preliminary successes and failures associated with multiplexing samples to reduce cost and increase scale have been discussed. Collaborators have worked with personnel in each other’s groups to optimize the preparation of genomic DNA libraries. This collaboration has supported the sequencing and analysis of small RNA libraries, cDNA-based libraries (RNA-seq), and genomic libraries from Arabidopsis, maize, wheat, and sorghum, in support of the baseline annotation and regulatory sequence objectives of the in-house project. Experimental approaches for sequence-based transcript profiling, as well as computational tools for their analysis, have been evaluated and implemented. In the last year, we have shifted from a 3’-tag Digital Gene Expression (DGE) strategy to a whole-transcript RNA-seq approach, which provides improved sensitivity and specificity as well as information on alternative transcript variants. We continue to improve our experimental methods to include multiplexing of biological replicates and strand-specific information.
A robust computational pipeline for mapping and analysis of RNA-seq data has been developed, and metadata, including differentially expressed genes and their profiles during development, are being analyzed and integrated with other types of genomics data sets. These include genome-wide DNA binding profiles for key transcription factors (TFs) and epigenetic maps. The latter are a primary focus: we are developing and evaluating a computational pipeline for analysis of shotgun bisulfite sequencing data to determine context-specific shifts in methylation on a genome-wide scale across different tissues and developmental stages in maize. In addition to the sequencing resources, the collaboration has provided support for development of the yeast one-hybrid library. To date, 650 TFs have been cloned into the new approved AD-DEST-2u vectors, which have high copy numbers. This collection now covers 93% of the TFs expressed in the stele (708). Using this system, we have screened 34 miRNA promoters and the promoters of four Arabidopsis transcription factors. These promoters interacted with 135 TFs from the stele. To functionally assess and model the network, 400 mutant lines for the 135 TFs have been obtained from existing resources; 90 of these have now been selected as homozygous lines.
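As an illustration of what "context-specific" methylation means in the bisulfite analysis described above, the following is a minimal sketch and not the project's pipeline. It assumes forward-strand cytosines only, the standard plant sequence contexts (CG, CHG, CHH, where H is A, C, or T), and a hypothetical input format in which each cytosine position maps to a pair of counts (methylated reads, total reads). The function names and the example sequence are invented for illustration.

```python
from collections import defaultdict


def cytosine_context(seq, i):
    """Classify the forward-strand cytosine at index i as 'CG', 'CHG', or 'CHH'.

    Returns None if the base is not C, if the context runs off the end of
    the sequence, or if it contains an ambiguous base such as N.
    """
    if seq[i] != "C":
        return None
    nxt = seq[i + 1:i + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) < 2 or nxt[0] not in "ACT":
        return None
    if nxt[1] == "G":
        return "CHG"
    if nxt[1] in "ACT":
        return "CHH"
    return None


def context_methylation(seq, calls):
    """Pool read counts by context and return the methylation level per context.

    calls: dict mapping position -> (methylated_reads, total_reads).
    This input format is an assumption made for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])
    for pos, (meth, cov) in calls.items():
        ctx = cytosine_context(seq, pos)
        if ctx is None or cov == 0:
            continue  # skip unclassifiable or uncovered positions
        totals[ctx][0] += meth
        totals[ctx][1] += cov
    return {ctx: m / c for ctx, (m, c) in totals.items()}


# Hypothetical example data, not taken from the report.
example_seq = "ACGTCAGTCTTC"
example_calls = {1: (8, 10), 4: (3, 10), 8: (1, 20), 11: (5, 5)}
levels = context_methylation(example_seq, example_calls)
```

Comparing such per-context levels between tissues or developmental stages is one simple way to express a "shift" in methylation; a real analysis would also handle the reverse strand, bisulfite conversion efficiency, and statistical testing.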
Progress was monitored through weekly meetings.