AN INTEGRATED DATABASE AND BIOINFORMATICS RESOURCE FOR SMALL GRAINS
Location: Genomics and Gene Discovery
Title: Rapid genome mapping in nano channel array for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome
Authors: Hastie, Alex; Dong, Lingli; Luo, Mingcheng; Huo, Naxin; Xiao, Ming
Submitted to: PLoS One
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: January 3, 2013
Publication Date: February 6, 2013
Citation: Hastie, A., Dong, L., Luo, M., Huo, N., Gu, Y.Q., Xiao, M. 2013. Rapid genome mapping in nano channel array for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome. PLoS One. 8:e55864.
Interpretive Summary: Because of the extremely large size and polyploid nature of the wheat genome, sequencing and accurately assembling it into a gold-standard reference sequence remains a great challenge. This has hindered the rapid development of genomics research in wheat for crop improvement. In this work, we developed a novel approach to sequencing wheat genomic regions that combines high-throughput Roche 454 sequencing with nanomapping of large-insert bacterial artificial chromosome (BAC) clones. Roche 454 sequencing provides sufficient coverage for sequence assembly, while the nanomapping information guides the assembly by ordering sequence contigs into a linear sequence. We demonstrated that the accuracy of the assembled sequences increased dramatically with this approach. The technology will be very useful to the international wheat genome sequencing community, which aims to generate a completely assembled wheat genome.
Technical Abstract: Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high levels of repetitive elements, which, combined with huge genome sizes, have made accurate assembly of these large and complex genomes intractable thus far. Using two-color genome mapping of tiling bacterial artificial chromosome (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in size, and thus it avoids most of the pitfalls of sequence-based assembly. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, dramatically improving the assembly from 72% to 98% complete.
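The idea of using a genome map as a scaffold can be illustrated with a minimal Python sketch. It is not the method used in the publication. It assumes a single, hypothetical recognition motif (`MOTIF`), searches the forward strand only, and uses a simple relative tolerance (`tol`) for matching. The function names `motif_positions`, `intervals`, and `anchor` are invented for illustration. The sketch finds motif sites in a contig, converts them to inter-site distances, and looks for the place on a map where that distance pattern fits.

```python
from typing import List, Optional

# Hypothetical label motif, chosen only for illustration.
MOTIF = "GCTCTTC"


def motif_positions(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of every occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def intervals(positions: List[int]) -> List[int]:
    """Return the distances between consecutive motif sites."""
    return [b - a for a, b in zip(positions, positions[1:])]


def anchor(contig_gaps: List[int], map_gaps: List[int],
           tol: float = 0.1) -> Optional[int]:
    """Return the first index in map_gaps where contig_gaps fits, else None.

    A fit requires every contig gap to agree with the corresponding map gap
    within a relative tolerance tol (an assumed, simplified error model).
    """
    n = len(contig_gaps)
    if n == 0:
        return None  # a contig with fewer than two sites cannot be placed
    for i in range(len(map_gaps) - n + 1):
        window = map_gaps[i:i + n]
        if all(abs(c - m) <= tol * m for c, m in zip(contig_gaps, window)):
            return i
    return None


# Toy contig with three motif sites, and a toy map given as site-to-site gaps.
contig = "A" * 5 + MOTIF + "A" * 10 + MOTIF + "A" * 20 + MOTIF + "A" * 3
contig_gaps = intervals(motif_positions(contig, MOTIF))
map_gaps = [40, 17, 28, 55]
placement = anchor(contig_gaps, map_gaps)
print(contig_gaps, placement)
```

In this toy case the contig's gap pattern fits the map starting at the second map interval. A real pipeline would also have to handle both strands, two label colors, missing or extra sites, and measurement error that varies with molecule length.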