**Submitted to:** JSM Mathematics & Statistics

**Publication Type:** Peer Reviewed Journal

**Publication Acceptance Date:** 8/22/2014

**Publication Date:** 8/22/2014

**Citation:** Irwin, P.L., Nguyen, L.T., Chen, C., He, Y. 2014. Variability in DNA polymerase efficiency: effects of random error, DNA extraction method, and isolate type. JSM Mathematics & Statistics. 1(1):1003.

**Interpretive Summary:** Foodborne bacteria exist in heterogeneous microbial communities. Some microbes in these communities form multicellular structures and/or biofilms. Pathogens that reside within these structures are more difficult to detect and might be more likely to survive. Part of our research project involves characterizing such biological constructions from the standpoint of cell number. The only reasonable way of doing this is to count the occurrence of certain genetic components, which are proportional to the number of cells present in a food extract, using quantitative, or real-time, polymerase chain reaction (qPCR: an analytical method used to determine the number of copies of any particular gene per volume tested). To perform these analyses, DNA standards (various concentrations of known gene copy number) are measured and used to convert raw qPCR data from unknown test samples into concentration terms. However, determining a test sample's DNA concentration assumes that the concentration dependence (related to reaction efficiency) of the standard and unknown reactions is equivalent. Frequently, this criterion is not met. In this work we investigated the influence of random error (computer generated), various extraction technologies, and 18 different Gram-positive and Gram-negative foodborne bacteria on PCR efficiency in determining unknown DNA concentration. We found that even a small amount of random error induces relatively large perturbations in efficiency, and that most of the variation in efficiency is random, although some extraction methods and isolates show significant effects on this important qPCR parameter, which might be related to the presence of polymerase inhibitory compounds.
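For readers who want the link between "concentration dependence" and efficiency spelled out, the relation commonly used for qPCR standard curves can be written as below. This is a sketch of the conventional formulation, not necessarily the exact notation used in the paper; the symbols m (slope), b (intercept) and N (starting copy number) are assumptions introduced here for illustration.

```latex
% Standard curve: cycle number C is linear in the log of starting copy number N
C = m \,\log_{10} N + b
% Efficiency follows from the slope; perfect doubling (eff = 1) gives m = -1/\log_{10} 2 \approx -3.32
\mathrm{eff} = 10^{-1/m} - 1
% An unknown sample is converted to concentration using the standards' m and b
N_{\text{unknown}} = 10^{\,(C_{\text{unknown}} - b)/m}
```

If the unknown reaction amplifies with a different efficiency than the standards, the slope m taken from the standards no longer describes it, and the last expression gives a biased concentration.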

**Technical Abstract:** Using computer-generated data calculated with known amounts of random error (E = 1, 5 & 10%) associated with calculated qPCR cycle number (C) at four 1:10 dilutions (indexed by j), we found that the "efficiency" (eff) associated with each population distribution of n = 10,000 measurements varied from 0.95 to 1.05 for E = 1% (average eff = 1.00 ± 0.0132), 0.85 to 1.2 for E = 5% (average eff = 1.00 ± 0.0665; fraction of observed distribution between eff = 0.9 and 1.1 = 89%), and 0.7 to 1.6 for E = 10% (average eff = 1.02 ± 0.139; fraction of distribution between eff = 0.9 and 1.1 = 54%). The data associated with the highest error rate also displayed a large asymmetry in the eff frequency distribution, with a 3rd central moment about 8-fold greater than that of the distribution linked to the lowest rate of error. Not surprisingly, this skewness was shown to be associated with the eff calculation itself, since the distribution of $\partial C_j / \partial \mathrm{Log}[0.1^{j}]$ (10% error; j = 0, 1, 2, 3), from which eff is calculated, was normally distributed (Gaussian distribution function: μ = -3.32 ± 0.00399 and σ = 0.327 ± 0.00327; ± asymptotic standard error). To better identify potential sources of such variation, we investigated DNA standards of known concentration (n = 8 independent sets of experiments of 6 dilutions each) from 3 isolates to compare typical threshold cycle number qPCR data (Ct) with data from the less commonly used derivative method (C': the interpolated value of C at which the second derivative of normalized fluorescence with respect to C equals 0) and found that the average coefficient of variation (CV) dropped from 4.35% to 2.70% using the latter method.
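The derivative-based cycle value C' defined above can be illustrated with a short sketch. It is not the authors' implementation: the function name `derivative_cycle`, the use of simple second differences, and the assumption of one fluorescence reading per unit-spaced cycle are all choices made here for illustration.

```python
import numpy as np


def derivative_cycle(cycles, fluorescence):
    """Interpolate the cycle at which the second derivative of fluorescence is 0.

    Hypothetical helper: assumes one reading per cycle with unit spacing and a
    single sigmoidal amplification curve.
    """
    c = np.asarray(cycles, dtype=float)
    f = np.asarray(fluorescence, dtype=float)

    # Second difference approximates d2F/dC2 at the interior cycles c[1:-1].
    d2 = f[2:] - 2.0 * f[1:-1] + f[:-2]
    c_mid = c[1:-1]

    # Positions where the second derivative goes from positive to non-positive.
    idx = np.where((d2[:-1] > 0) & (d2[1:] <= 0))[0]
    if idx.size == 0:
        raise ValueError("no zero crossing of the second derivative found")

    # If noise creates several crossings, keep the one with the largest swing,
    # which corresponds to the steepest part of the amplification curve.
    i = idx[np.argmax(d2[idx] - d2[idx + 1])]

    # Linear interpolation between the two bracketing cycles.
    frac = d2[i] / (d2[i] - d2[i + 1])
    return c_mid[i] + frac * (c_mid[i + 1] - c_mid[i])


if __name__ == "__main__":
    # Synthetic logistic curve with its inflection placed at cycle 22.4.
    cycles = np.arange(1, 41)
    fluor = 1.0 / (1.0 + np.exp(-(cycles - 22.4) / 1.5))
    print(derivative_cycle(cycles, fluor))
```

Unlike a threshold-based Ct, this value does not depend on where a fluorescence threshold is drawn, which is one plausible reason it may vary less between runs.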
Using this less error-prone variable (C'), we tested 11 different DNA extraction technologies (applied to 1 Gram-positive and 2 Gram-negative organisms using primers specific for their 16S rRNA “genes”) and found that 3 of these methods showed significant variation in eff due to extraction method alone (averaged across 3 replicates/isolate × 3 isolates). When the 3 most effective DNA extraction methods (i.e., those yielding the greatest number of 16S rDNA copies per colony forming unit) were applied to 5 different Gram-positive and 10 Gram-negative organisms, only two organisms showed significant variation in eff due to organism alone (averaged across 3 replicates/method × 3 methods using a universal primer for the 16S rRNA “gene”). Overall, the implication is that most of the variation in qPCR eff is random (CV approximately 8%). However, certain DNA extraction methods and isolates did induce a greater variation in eff (CV approximately 11%).
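The sensitivity of eff to random error in cycle number can be explored with a small Monte Carlo sketch along the lines of the simulation summarized in the abstract. Several details are assumptions rather than the authors' procedure: the noise is taken to be Gaussian with a standard deviation equal to the stated fraction of C, the undiluted cycle number `c0` is a hypothetical value, and eff is computed from the fitted slope with the conventional formula eff = 10^(-1/slope) - 1. The numbers it produces are therefore not expected to match those reported above.

```python
import numpy as np


def simulate_efficiencies(rel_error, n=10_000, c0=20.0, n_dilutions=4, seed=0):
    """Simulate n noisy four-point dilution series and return slopes and eff.

    Hypothetical helper: rel_error is the fractional error on each cycle
    number (e.g. 0.01 for 1%); c0 is an assumed cycle number for the
    undiluted standard.
    """
    rng = np.random.default_rng(seed)

    # Log10 of the dilution factor 0.1**j for j = 0, 1, 2, 3 -> 0, -1, -2, -3.
    j = np.arange(n_dilutions)
    log_dilution = np.log10(0.1 ** j)

    # Error-free cycle numbers for perfect doubling (slope of about -3.32).
    ideal_slope = -1.0 / np.log10(2.0)
    c_true = c0 + ideal_slope * log_dilution

    # Assumed error model: Gaussian noise proportional to the cycle number.
    noise = rng.normal(0.0, rel_error * c_true, size=(n, n_dilutions))
    c_obs = c_true + noise

    # Least-squares slope of C versus log dilution for every simulated series.
    x = log_dilution - log_dilution.mean()
    slopes = (c_obs @ x) / (x @ x)

    # Conventional conversion from slope to amplification efficiency.
    eff = 10.0 ** (-1.0 / slopes) - 1.0
    return slopes, eff


if __name__ == "__main__":
    for e in (0.01, 0.05):
        slopes, eff = simulate_efficiencies(e)
        print(f"E = {e:.0%}: mean slope = {slopes.mean():.3f}, "
              f"mean eff = {eff.mean():.3f}, sd eff = {eff.std():.3f}")
```

Because eff is a nonlinear function of the slope, a symmetric spread in slopes maps onto an asymmetric spread in eff, which is the kind of skewness the abstract attributes to the eff calculation.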