Location: Crops Pathology and Genetics Research
Title: Discovery of rare mutations in populations: TILLING by sequencing
Submitted to: Plant Physiology
Publication Type: Peer reviewed journal
Publication Acceptance Date: 4/28/2011
Publication Date: 4/29/2011
Publication URL: www.plantphysiol.org/content/156/3/1257.full?sid=829d496e-6bfe-4294-92dc-93f0098f4742
Citation: Tsai, H., Howell, T., Nitcher, R., Missirian, V., Watson, B., Ngo, K., Lieberman, M., Fass, J., Uauy, C., Tran, R., Ali, K., Filkov, V., Tai, T., Dubcovsky, J., Comai, L. 2011. Discovery of rare mutations in populations: TILLING by sequencing. Plant Physiology. 156:1257-1268.
Interpretive Summary: TILLING (Targeting of Induced Local Lesions in Genomes) is a method of characterizing gene function which combines generation of chemically-induced mutants and high throughput discovery/detection of mutations. Sensitive and efficient identification of rare changes in gene sequences (i.e. mutations) among pools of individuals (e.g., mutant plant lines) is a key component of the TILLING method. In this study, the development and testing of a next-generation DNA sequencing approach to screening DNAs from pools of rice and wheat mutants is described. The novel tools and methods employed here represent a powerful approach to rapid and accurate discovery of mutations in genes which will enable characterization of gene function on a large scale.
Technical Abstract: Discovery of rare mutations in populations requires methods for processing and analyzing in parallel many individuals. Previous TILLING methods employed enzymatic or physical discrimination of heteroduplexed from homoduplexed target DNA. We used mutant populations of rice and wheat to develop a method based on Illumina sequencing of gene fragments amplified from multi-dimensionally pooled templates. We implemented sequence spiking for sample verification and sequence barcoding for multiplex lane loading. Changes collected after alignment of short reads to the reference were evaluated for sequencing quality, for occurrence in multiple pools, and for statistical relevance producing a Bayesian score with an associated confidence threshold. Errors were favored by varying coverage and sequence context. While these would be problematic in a mono-dimensional assay, they could be addressed by the combination of concurrent discovery in two or three pools with the bioinformatic treatment, resulting in accurate discovery.
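Illustrative sketch: The idea of confirming a change by its concurrent discovery in more than one pool can be sketched with a simple two-dimensional layout. The example below is not the authors' pipeline; the 8 x 8 grid, the pool labels, and the function names build_pools and decode are hypothetical, chosen only to show how a call seen in one row pool and one column pool points to a single individual, while a call seen in only one dimension is left unconfirmed.

```python
from itertools import product


def build_pools(n_rows, n_cols):
    """Place each individual (row, col) in one row pool and one column pool.

    Hypothetical layout: individuals sit on an n_rows x n_cols grid, and
    each pool is keyed by (dimension name, index).
    """
    pools = {}
    for r, c in product(range(n_rows), range(n_cols)):
        pools.setdefault(("row", r), set()).add((r, c))
        pools.setdefault(("col", c), set()).add((r, c))
    return pools


def decode(pools, positive_pools):
    """Return the individuals consistent with the pools where a change was seen.

    A change must appear in at least one pool of every dimension; otherwise
    it is treated as unconfirmed and an empty set is returned.
    """
    by_dim = {}
    for dim, idx in positive_pools:
        by_dim.setdefault(dim, set()).update(pools[(dim, idx)])
    all_dims = {dim for dim, _ in pools}
    if set(by_dim) != all_dims:
        return set()
    # Intersect the candidate sets across dimensions.
    return set.intersection(*by_dim.values())


pools = build_pools(8, 8)
confirmed = decode(pools, [("row", 2), ("col", 5)])
unconfirmed = decode(pools, [("row", 2)])
```

In this sketch, a change called in row pool 2 and column pool 5 resolves to the individual at grid position (2, 5), whereas a change called only in row pool 2 returns no candidate. A third dimension would add one more set to the intersection.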