
Title: Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

Author
KUNDE-RAMAMOORTHY, GOVINDARAJAN - Children's Nutrition Research Center (CNRC)
COARFA, CRISTIAN - Baylor College of Medicine
LARITSKY, ELEONORA - Children's Nutrition Research Center (CNRC)
KESSLER, NOAH - University of Houston
HARRIS, R - Baylor College of Medicine
XU, MINGCHU - Baylor College of Medicine
CHEN, RUI - Baylor College of Medicine
SHEN, LANLAN - Children's Nutrition Research Center (CNRC)
MILOSAVLJEVIC, ALEKSANDAR - Baylor College of Medicine
WATERLAND, ROBERT - Children's Nutrition Research Center (CNRC)

Submitted to: Nucleic Acids Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/29/2013
Publication Date: 1/3/2014
Citation: Kunde-Ramamoorthy, G., Coarfa, C., Laritsky, E., Kessler, N.J., Harris, R.A., Xu, M., Chen, R., Shen, L., Milosavljevic, A., Waterland, R.A. 2014. Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing. Nucleic Acids Research. 42(6):e43.

Interpretive Summary: The new gold standard for genome-wide analysis of DNA methylation (a biological process in which a methyl group is added to cytosine nucleotides in DNA) combines chemical conversion of genomic DNA with next-generation sequencing, an approach called Bisulfite-seq. Bisulfite treatment converts unmethylated cytosines so that they are read as thymines (i.e., C to T in the DNA sequence), while methylated cytosines remain unchanged. This dramatically reduces the complexity of the sequence, making it very difficult to map sequence reads to the genome. Various programs for mapping these reads have been developed, but they had not previously been objectively evaluated. We used Bisulfite-seq to generate four complete sets of human DNA methylation marks (methylomes) and compared five different mapping algorithms. We used an independent method of measuring DNA methylation (bisulfite pyrosequencing) to validate the quantitative methylation estimates of the different mappers. Our results show important differences in the performance of the mappers, providing guidance to help investigators choose the most suitable mapper for their Bisulfite-seq studies.
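
To illustrate why conversion makes mapping harder, here is a minimal Python sketch of in-silico bisulfite conversion. The function name `bisulfite_convert`, the `methylated_positions` parameter and the example sequences are hypothetical and do not come from the study. The sketch only shows how two different genomic sequences can yield identical reads once unmethylated cytosines are read as thymines.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulfite treatment.

    Unmethylated C is read as T; a C at a 0-based index listed in
    methylated_positions is protected and stays C.
    (Hypothetical helper for illustration only.)
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Two distinct (made-up) genomic sequences...
locus_a = "ACGTCC"
locus_b = "ATGTTC"

# ...become indistinguishable when neither carries methylation.
read_a = bisulfite_convert(locus_a)
read_b = bisulfite_convert(locus_b)
ambiguous = read_a == read_b

# Methylation at index 1 of locus_a preserves that cytosine in the read.
read_a_methylated = bisulfite_convert(locus_a, methylated_positions={1})
```

Because a converted read can match more than one location, a mapper has to account for C-to-T differences when it aligns reads. This sketch does not represent how any of the evaluated mappers do that.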

Technical Abstract: Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. Despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitative accuracy has been reported. We sequenced bisulfite-converted DNA from two tissues from each of two healthy human adults and systematically compared five widely used Bisulfite-seq mapping algorithms: Bismark, BSMAP, Pash, BatMeth and BS Seeker. We evaluated their computational speed and genomic coverage and verified their percentage methylation estimates. With the exception of BatMeth, all mappers covered more than 70% of CpG sites genome-wide and yielded highly concordant estimates of percentage methylation (r² = 0.95). Mapping time varied fourfold between BSMAP (fastest) and Pash (slowest). In each library, 8–12% of genomic regions covered by Bismark and Pash were not covered by BSMAP. An experiment using simulated reads confirmed that Pash has an exceptional ability to uniquely map reads in genomic regions of structural variation. Independent verification by bisulfite pyrosequencing generally confirmed the percentage methylation estimates by the mappers. Of these algorithms, Bismark provides an attractive combination of processing speed, genomic coverage and quantitative accuracy, whereas Pash offers considerably higher genomic coverage.
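
As a rough illustration of the quantities compared above, the following Python sketch computes percentage methylation at a CpG site from read counts and the concordance (r²) between two sets of estimates. It assumes that percentage methylation is the fraction of reads reporting C among reads reporting C or T, and that r² is the squared Pearson correlation; the abstract does not specify either calculation. The function names are hypothetical.

```python
def percent_methylation(c_reads, t_reads):
    """Percentage methylation at one CpG site (assumed definition).

    c_reads: reads reporting C (methylated) at the site
    t_reads: reads reporting T (unmethylated) at the site
    Returns None when the site has no coverage.
    """
    total = c_reads + t_reads
    if total == 0:
        return None
    return 100.0 * c_reads / total


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Made-up (C, T) read counts at the same four CpG sites from two mappers.
mapper_one_counts = [(8, 2), (0, 10), (5, 5), (9, 1)]
mapper_two_counts = [(7, 3), (1, 9), (5, 5), (10, 0)]

mapper_one = [percent_methylation(c, t) for c, t in mapper_one_counts]
mapper_two = [percent_methylation(c, t) for c, t in mapper_two_counts]

concordance = r_squared(mapper_one, mapper_two)
```

In practice, a comparison like this would be restricted to sites covered by both mappers, since a site without coverage has no defined methylation estimate.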