Title: Evaluation of pyrosequencing for assembly of plasmids carrying IS elements
Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: February 28, 2011
Publication Date: May 21, 2011
Citation: Chen, C., Lindsey, R.L., Strobaugh Jr, T.P., Frye, J.G., Bono, J.L., Smith, T.P., Meinersmann, R.J. 2011. Evaluation of pyrosequencing for assembly of plasmids carrying IS elements [abstract]. 111th ASM General Meeting.
Technical Abstract: Background: Highly parallel pyrosequencing (PS) is an efficient means of generating sequence data, but read lengths of current instruments are limited to an average of 450 bases, and library construction involves random fragmentation of the sample. Thus, assembly of sequences containing long repetitive elements can be problematic. Methods: Three Salmonella ColE1-like plasmids, two lacking IS elements (3.2 and 5.7 kb) and one carrying three copies of the 1057-bp IS903 (8.2 kb), were sequenced using both dye terminator (Sanger) sequencing and PS. Sanger sequencing was performed on library clones, PCR products, and intact plasmid using custom primers. Libraries for PS were prepared by shearing the plasmids into 400-800 bp fragments. Sanger reads were assembled (to completion) with Sequencher, and PS reads with the Newbler assembler. Results: The 3.2- and 5.7-kb plasmids were fully assembled using PS data, and the sequences obtained by the two methods were almost identical, with ≤ 3 bp differences. In contrast, assembly of PS reads for the IS-containing 8.2-kb plasmid resulted in six contigs, three of which possessed partial IS903 sequences. Since the three IS903 copies shared >98% identity (max. 21 mismatches), the assembler was unable to properly place PS reads lying internal to IS903, and the contigs therefore terminated within the IS elements. The coverage of the IS903-containing contigs was ~3X higher than that of other plasmid genes, confirming the copy number determined from the Sanger sequence. Conclusion: PS was effective for assembly of non-repetitive plasmids but not for the plasmid containing multiple IS elements. The IS-containing PS reads could not be properly ordered in a de novo assembly process.
To solve this problem, paired-end reads from fragments longer than the longest repetitive element present could be used, but this requires that two separate libraries be constructed for each plasmid. Other methods, such as long PCR to interrogate the junctions between IS elements and other plasmid genes, may also be used to aid assembly of PS data.
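
Illustrative sketch (not part of the original abstract): the two quantitative points above, whether a read or paired-end fragment is long enough to bridge a repeat and how relative coverage suggests repeat copy number, can be expressed in a few lines of Python. The function names, the 20-base minimum anchor in unique flanking sequence, the 3000-bp paired-end fragment length, and the coverage depths are all hypothetical values chosen for illustration; only the 1057-bp IS903 length, the 450-base average read length, and the 400-800 bp shearing range come from the abstract.

```python
IS903_LEN = 1057      # bp, from the abstract
AVG_READ_LEN = 450    # bases, from the abstract
MAX_SHEARED_FRAGMENT = 800  # bp, upper end of the 400-800 bp range in the abstract


def can_span_repeat(span_len, repeat_len, min_anchor=20):
    """Return True if a read (or paired-end fragment) of span_len bases
    can cover the whole repeat and still have min_anchor bases of unique
    sequence on each side. min_anchor is an assumed, illustrative value."""
    return span_len >= repeat_len + 2 * min_anchor


def estimate_copy_number(repeat_coverage, unique_coverage):
    """Estimate repeat copy number as the ratio of read depth over the
    repeat to read depth over single-copy sequence, rounded to an integer."""
    if unique_coverage <= 0:
        raise ValueError("unique_coverage must be positive")
    return round(repeat_coverage / unique_coverage)


if __name__ == "__main__":
    # An average single read cannot bridge IS903.
    print(can_span_repeat(AVG_READ_LEN, IS903_LEN))          # False
    # Neither can the longest sheared fragment.
    print(can_span_repeat(MAX_SHEARED_FRAGMENT, IS903_LEN))  # False
    # A hypothetical 3000-bp paired-end fragment can.
    print(can_span_repeat(3000, IS903_LEN))                  # True
    # Hypothetical depths: 90x over IS903 contigs, 30x elsewhere.
    print(estimate_copy_number(90.0, 30.0))                  # 3
```

Under these assumptions, neither the average read nor the longest sheared fragment reaches across IS903, which is consistent with contigs terminating inside the IS elements, while a coverage ratio near 3 matches the three copies found by Sanger sequencing.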