Kirst, Matias - CORNELL UNIVERSITY
Caldo, Rico - IOWA STATE UNIV.
Casati, Paula - STANFORD UNIV.
Tanimoto, Gene - AFFYMETRIX, INC., CA
Walbot, Virginia - STANFORD UNIV.
Submitted to: Plant Biotechnology Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: April 20, 2006
Publication Date: August 5, 2006
Citation: Kirst, M., Caldo, R., Casati, P., Tanimoto, G., Walbot, V., Wise, R.P., Buckler IV, E.S. 2006. Contribution of genetic diversity to type 1 errors in short-oligonucleotide microarray analysis. Plant Biotechnology Journal. 4:489-498.
Interpretive Summary: Although genome sequencing projects have the potential to rapidly advance our understanding of the molecular mechanisms underlying cellular processes, complete sequence data alone serves only to identify all theoretical genes within a genome. Microarrays, or DNA chips, help determine the functional expression of genes by measuring levels of mRNA and their corresponding proteins. Prior to this study, the wide application of DNA arrays within a species had typically not been questioned. Here, we challenge this notion by demonstrating that short oligonucleotide-based gene expression analysis of maize, and likely of other highly diverse species, can be considerably confounded by DNA sequence diversity. Therefore, genetic polymorphism in highly diverse species should be considered during the initial probe design as well as when evaluating levels of detection.
Technical Abstract: DNA arrays based on short oligonucleotide (≤25-mer) probes are increasingly being developed and applied to quantify transcript abundance variation in species with high genetic diversity. We analyzed gene expression estimates generated for four maize inbred lines using a custom Affymetrix DNA array and identified biases associated with high levels of polymorphism among lines. Statistically significant interactions between probes and maize inbreds were detected, affecting five or more probes in the majority of cases. Single nucleotide polymorphisms (SNPs) and insertions/deletions were identified by re-sequencing as the primary source of probe-by-line interactions, affecting probeset-level estimates and reducing the power to detect transcript-level variation among maize inbreds. This analysis identified 36,196 probes in 5,118 probesets containing markers that may be used for genotyping in natural and segregating populations for association gene analysis and genetic mapping.
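The probe-by-line interaction mentioned in the Technical Abstract can be illustrated with a small sketch. This is not the authors' statistical procedure, which the abstract does not describe. It assumes a simple additive model in which a log2 probe intensity is the sum of a probe effect and a line effect, and it treats whatever the additive model cannot explain as the interaction. The function names, the threshold, and the toy intensity values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def interaction_residuals(table):
    """Interaction residuals for a probes x lines table of log2 intensities.

    residual[i][j] = y[i][j] - probe_mean[i] - line_mean[j] + grand_mean
    A purely additive table (probe effect + line effect) gives all zeros.
    """
    probe_means = [mean(row) for row in table]
    line_means = [mean(col) for col in zip(*table)]
    grand_mean = mean(probe_means)  # valid because every row has the same length
    return [
        [y - probe_means[i] - line_means[j] + grand_mean for j, y in enumerate(row)]
        for i, row in enumerate(table)
    ]


def flag_probes(table, threshold):
    """Indices of probes whose largest absolute residual exceeds the threshold."""
    residuals = interaction_residuals(table)
    return [
        i for i, row in enumerate(residuals)
        if max(abs(r) for r in row) > threshold
    ]


# Synthetic example (made-up values): 4 probes in one probeset, 4 inbred lines.
probe_effects = [8.0, 9.0, 10.0, 11.0]   # hypothetical probe affinities
line_effects = [0.0, 0.5, -0.5, 0.0]     # hypothetical true expression differences
log2_intensity = [[p + l for l in line_effects] for p in probe_effects]

# Simulate a polymorphism under probe 2 in line 1 that weakens hybridization.
log2_intensity[2][1] -= 2.0

residuals = interaction_residuals(log2_intensity)
flagged = flag_probes(log2_intensity, threshold=1.0)
print("flagged probes:", flagged)
```

In this toy table only the probe carrying the simulated polymorphism is flagged. A real analysis would use replicated hybridizations and a formal test for the interaction term rather than a fixed residual cutoff.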