2013 Annual Report
1a. Objectives (from AD-416):
1. An atlas of small RNAs will be created; 2. A website will be developed and publicly released for the visualization of small RNAs; 3. MicroRNAs will be identified, isolated, and evaluated; 4. Strand-specific, cleaved mRNA libraries will be used to identify targets of soybean microRNAs.
1b. Approach (from AD-416):
Small RNA libraries will be constructed from tissues coordinated with ARS. The libraries will be generated in the University of Delaware lab using the improved Illumina TruSeq small RNA sample preparation kits, which eliminate all column purification and gel selection steps and are less time-intensive and more robust. These kits also allow for decreased inputs of RNA; 500 ng of total RNA is sufficient for library construction. Libraries will be bar-coded, and four samples will be pooled and analyzed in a single lane on an Illumina HiSeq 2000 machine to yield single-end 50-base reads.
We will quantitatively dissect the small RNA profiles (a) across tissues and (b) under stress treatments. Our proposed study will sequence far more deeply, with replicates, in a large array of tissues to give a comprehensive analysis of miRNA expression profiles in developing soybean tissues. By examining family member-specific PARE data, we will distinguish expression patterns between members of highly duplicated, conserved miRNA families. PARE data are semi-quantitative, but any interesting expression differences will be validated by precursor-specific RT-PCR. Between the analysis of these conserved families and the many novel miRNAs, we will generate extensive data on miRNA tissue specificity, and these will be validated as described below. Analysis of differentially expressed siRNA clusters may identify loci subject to epigenetic regulation during development or stress, which is as yet poorly characterized in plants.
The deep sequencing of matched PARE libraries will define small RNA targets and greatly enhance the information gained from other small RNA and RNA-seq projects, because it allows a systematic experimental analysis of those targets. PARE is based on a modified 5'-RACE and generates libraries containing the 3' cleavage products of mRNAs, including those caused by small RNA-mediated cleavage.
Computational tools are then used to compare PARE signatures to small RNA data to identify miRNA-target RNA pairs (see Aim 2). In addition to validating predicted targets, PARE signatures provide an important resource to validate new miRNAs. A database containing PARE signatures from diverse tissue samples will therefore be a tremendously valuable resource for the soybean community that will enhance the information gained from all small RNA sequencing data. PARE libraries will be generated from the same tissue samples indicated above, but because of the difficulty in generating these libraries, we will be able to make only a quarter as many PARE libraries as small RNA libraries.
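The comparison of PARE signatures with small RNA data can be illustrated with a minimal sketch. It is not the project's actual pipeline: the function names, the toy sequences, the read-count threshold, and the zero-mismatch default are all assumptions made for illustration. The sketch relies only on the general observation that miRNA-guided cleavage occurs opposite miRNA positions 10 and 11, so the 5' end of the PARE-captured 3' fragment should fall at a predictable position within the target site.

```python
def reverse_complement(seq):
    """Reverse complement of an RNA sequence (A, U, G, C only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def predicted_cleavage_sites(mirna, mrna, max_mismatches=0):
    """Return 0-based mRNA indices where a 3' cleavage fragment would begin.

    A target site is any window of the mRNA that matches the reverse
    complement of the miRNA with at most `max_mismatches` mismatches
    (a simplified stand-in for real target-scoring rules).
    """
    site = reverse_complement(mirna)
    n = len(mirna)
    hits = []
    for start in range(len(mrna) - n + 1):
        window = mrna[start:start + n]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            # miRNA position i (1-based, from its 5' end) pairs with mRNA
            # index start + n - i. Cleavage falls between the bases paired
            # to miRNA positions 11 and 10, so the downstream (3') fragment
            # begins at the base paired to position 10.
            hits.append(start + n - 10)
    return hits


def pare_supported_sites(mirna, mrna, pare_counts, min_reads=5,
                         max_mismatches=0):
    """Keep predicted sites whose PARE 5'-end read count meets `min_reads`.

    `pare_counts` maps a 0-based mRNA index to the number of PARE reads
    whose 5' end maps there (hypothetical input format).
    """
    supported = []
    for pos in predicted_cleavage_sites(mirna, mrna, max_mismatches):
        count = pare_counts.get(pos, 0)
        if count >= min_reads:
            supported.append((pos, count))
    return supported


# Toy example with made-up sequences and counts.
toy_mirna = "UAGCUUACGGAUCCGAUACGU"
toy_mrna = "AAAAA" + reverse_complement(toy_mirna) + "GGGGG"
toy_pare = {16: 12, 3: 1}
print(pare_supported_sites(toy_mirna, toy_mrna, toy_pare))
```

In this toy case the only predicted site is index 16, and it is retained because 12 PARE reads start there; the single read at index 3 does not coincide with a predicted site and is ignored.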
We now have a substantial amount of soybean data from both sRNA and Parallel Analysis of RNA Ends (PARE) libraries generated from biotic and abiotic stressed samples, reproductive tissues, and tissues from different stages of nodule development.
Reproductive tissue: We analyzed the reproductive tissue-specific small RNA libraries and found one 22-nt miRNA (miR4392), highly expressed in anther tissue, that acts as a trigger of 21-nt phased, secondary siRNAs (“phasiRNA”). To investigate the function of miR4392, we generated Short Tandem Target Mimic (STTM) constructs specific to miR4392 and to two phasiRNAs; these constructs will be used for plant transformation. To gain further insight into the role of miR4392, we recently extended the analysis to another legume species (chickpea) for comparative purposes, so that conservation can inform function in soybean. We generated seven chickpea small RNA libraries from ovary, anther, flower bud, opened flower, leaf, shoot and root tissues. In a preliminary analysis, we identified a phased position specific to the reproductive tissues, consistent with those we have found in soybean and Medicago. Further analysis of this data set is underway.
Biotic stress tissue: We are continuing the analysis of the Pathogen-Associated Molecular Pattern (PAMP)-treated materials (flagellin and chitin) in three different genetic backgrounds (Williams 82, Dassel and Vinton). Twenty-four small RNA libraries were made from two replications and have now been sequenced. The data set was processed, loaded into the database, and subjected to phasing analysis, which identified a number of phased positions. Five genes annotated as "resistance to P. syringae pv maculicola"-like were highly phased in all PAMP treatment samples but not in the control sample of the flagellin treatment in the Dassel background. This is notable because the Dassel background is susceptible to bacterial pathogens. Further analysis of this data set will be carried out.
Abiotic stress tissue: We have received tissue samples for the drought stress experiments, and total RNA was isolated from them. Three replications of small RNA libraries and one replication of a PARE library were made from the parental lines, IA3023 (drought-tolerant line) and LD003309 (drought-intolerant line). These libraries have been sequenced and preprocessed. The data were used to predict miRNAs and their targets, and we have identified many of both. The data were also used for phasing analysis, which identified five phased genes that responded to the water stress treatment: a heat-shock protein 70T-2 gene, an allene oxide synthase gene, an extracellular dermal glycoprotein gene, a low-temperature-induced 65 gene, and a gene of unknown function. An extensive analysis of this data set is underway.
Nodulation-related tissue: Soybean nodule samples have also been received. We constructed three replications of small RNA libraries and one replication of a PARE library from developing nodules at 10, 15, 20, 25 and 30 days. The libraries were sequenced and the data were analyzed. We performed phasing analysis, predicted miRNAs and their targets, and validated the targets with PARE data. We also carried out a static clustering analysis of these data sets; the results showed that miR390 was down-regulated as the nodules aged from 10 to 30 days. In contrast, miR319a,b and miR397a,b were up-regulated at later developmental stages. The phasing analysis also showed that two auxin response factor genes were up-regulated during nodule development from 10 to 30 days. We are continuing the analysis of this data set.
We have performed a global phasing analysis of all the sRNA libraries in our database (149 libraries). The results identified 244 genes from 59 groups and 65 non-coding genes in soybean as phased loci, a much higher number than has been identified in any other eudicot. Almost half of the phased genes (106 genes) encode disease resistance proteins of the NBS-LRR class. We also identified some minor groups of phased genes, such as those encoding Pentatricopeptide Repeat (PPR)-containing proteins and auxin response factors, along with a variety of single or low-copy genes. The high number of low-copy genes is a novel finding. At this point, we have reached saturation in the identification of PHAS loci, and we are now trying to understand the biological functions of the identified loci.
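The idea behind a phasing analysis can be sketched as follows. This is a deliberately simplified illustration, not the scoring method used in the project: it only asks what fraction of the 21-nt read abundance at a locus falls into a single 21-nt register. The function name, the input format, and the toy coordinates are assumptions, and the sketch considers reads from one strand only.

```python
from collections import defaultdict


def best_register(read_starts, phase_len=21):
    """Find the dominant phasing register at a locus.

    `read_starts` maps the 5' start coordinate of each `phase_len`-nt read
    (one strand only) to its abundance. Returns (register, fraction), where
    `register` is the start coordinate modulo `phase_len` that holds the
    most reads and `fraction` is its share of the total abundance.
    """
    per_register = defaultdict(int)
    for pos, abundance in read_starts.items():
        per_register[pos % phase_len] += abundance

    total = sum(per_register.values())
    if total == 0:
        return None, 0.0

    register, in_phase = max(per_register.items(), key=lambda kv: kv[1])
    return register, in_phase / total


# Toy locus: four abundant reads spaced exactly 21 nt apart, plus two
# low-abundance reads that are out of register (made-up numbers).
toy_locus = {100: 50, 121: 40, 142: 30, 163: 20, 105: 5, 130: 5}
register, fraction = best_register(toy_locus)
print(register, round(fraction, 3))
```

A fraction close to 1 suggests that most reads sit in one register, as expected for phased siRNAs; a locus with unphased reads would spread its abundance across many registers. A real analysis would also need to handle both strands and apply a statistical threshold.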
Small RNA data generated from this project, including the small RNA libraries made from reproductive tissues and biotic/abiotic stressed samples, were loaded into the expression database and released on a public website at: http://mpss.udel.edu/soy_sRNA_USB/. This site will also allow us to release the PARE, BS-seq and RNA-seq data using our existing database structures and web interface tools.