|Freking, Bradley - Brad|
|Nonneman, Danny - Dan|
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/13/2008
Publication Date: 5/29/2008
Citation: Bischoff, S.R., Tsai, S., Hardison, N.E., York, A.M., Freking, B.A., Nonneman, D.J., Rohrer, G.A., Piedrahita, J.A. 2008. Identification of SNPs and INDELS in swine transcribed sequences using short oligonucleotide microarrays. Biomed Central (BMC) Genomics 9:252.
Interpretive Summary: At the molecular level, naturally occurring DNA sequence variation consists of single nucleotide polymorphisms (SNPs) and INDELs (insertions/deletions). Traditionally, experimental approaches for SNP discovery in targeted populations have relied on directed DNA sequencing. This approach suffers from difficult DNA template amplification, limits on PCR multiplexing per reaction, and laborious electrophoretic separation steps. Discovery of expressed SNPs represents the most informative and valuable resource to study disease susceptibilities, to determine structural effects on protein sequence, and to design association studies aimed at clarifying complex, polygenic phenotypes. Gene expression profiling with microarrays is one approach to studying the biological function of expressed genes in complex systems. The Affymetrix Porcine microarray has the capacity to simultaneously evaluate up to 24,123 genes. Each gene is represented on the array as a set of 11 probes, each 25 bases in length. Hybridization of mRNA to the target probes is affected by sequence conservation. Because of the short probe length, a SNP falling in the middle of the probe sequence will result in that probe’s failure to hybridize. This phenomenon was exploited by a statistical modeling procedure that identifies probes that do not fit the expected hybridization pattern set by the other probes within that targeted transcript. Six porcine gene expression chips were hybridized with day 25 placental total RNA of occidental (n=3) or Chinese Meishan (n=3) swine origin. A linear mixed model was fit to the dataset to identify probe-by-breed interaction effects. A total of 789 probe sets passed both the false discovery rate threshold and the two-fold expression change threshold. Twenty-seven targeted genes, drawn from the entire distribution of significance levels, were tested to validate the presence of a SNP under the identified probe. Success rate using these threshold values was 87%.
These results demonstrate that this approach can identify polymorphisms between two breeds and/or lines of any species for which a short oligonucleotide array is available, and can be used to rapidly develop markers for genetic mapping and association analysis in species where high density genotyping platforms are otherwise unavailable.
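The probe-level idea described in the summary can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the linear mixed model used in the study: for one probe set, it compares each probe's between-breed difference in log2 intensity with the median difference across the set, so that a probe disrupted by a SNP in one breed stands out. The function name `probe_breed_deviation` and all intensity values are hypothetical.

```python
from statistics import mean, median

def probe_breed_deviation(breed_a, breed_b):
    """Per-probe deviation of the breed difference from the probe-set median.

    breed_a, breed_b: lists of arrays (one per animal); each array is a list
    of log2 intensities for the probes of a single probe set.
    """
    n_probes = len(breed_a[0])
    diffs = []
    for p in range(n_probes):
        mean_a = mean(arr[p] for arr in breed_a)
        mean_b = mean(arr[p] for arr in breed_b)
        diffs.append(mean_a - mean_b)
    # The median difference approximates the gene-level expression difference
    # shared by all probes; what remains is probe-specific.
    centre = median(diffs)
    return [d - centre for d in diffs]

# Hypothetical data: 11 probes, 3 arrays per breed (invented values).
base = [9.0 + 0.1 * i for i in range(11)]
breed_a = [[x + off for x in base] for off in (0.0, 0.2, -0.2)]
breed_b = [[x + off for x in base] for off in (0.5, 0.3, 0.7)]
for arr in breed_b:
    arr[5] -= 2.0  # simulate poor hybridization of probe 5 in breed B

deviations = probe_breed_deviation(breed_a, breed_b)
outlier = max(range(len(deviations)), key=lambda p: abs(deviations[p]))
print(outlier)
```

In this toy example the whole gene is more highly expressed in breed B, yet only probe 5 departs from the shared pattern, which is the signature the modeling procedure looks for.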
Technical Abstract: We present genome-wide detection of single feature polymorphisms (SFPs) in swine by transcriptome profiling of day 25 placental RNA, contrasting probe intensities from Meishan and an occidental composite breed on Affymetrix porcine microarrays. A linear mixed model analysis was used to identify significant breed-by-probe interactions. Gene-specific linear mixed models were fit to the log2-transformed probe intensities on these arrays, using fixed effects for breed, probe, and breed-by-probe interaction, and a random effect for array. After surveying the day 25 placental transcriptome, 789 probes with a q-value less than 0.05 and |fold change| greater than 2 were identified as candidates containing SFPs. To address the quality of the bioinformatics approach, universal pyrosequencing assays were designed from Affymetrix exemplar sequences to independently assess polymorphisms within a subset of probes. Of those probes sampled from high-, medium-, and low-ranking categories, 20 of 27 were confirmed by pyrosequencing to contain SFPs. In most cases, the 25-mer probe sequence printed on the microarray diverged from the Meishan sequence, not from that of the occidental crosses. This analysis was used to define a set of highly reliable predicted SFPs according to their probability scores. By this method we detected transition and transversion single nucleotide polymorphisms, as well as insertions/deletions. These results demonstrate that this approach can identify polymorphisms between two breeds and/or lines of any species for which a short oligonucleotide array is available, and can be used to rapidly develop markers for genetic mapping and association analysis in species where high density genotyping platforms are otherwise unavailable.
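The candidate-selection step (q-value below 0.05 and |fold change| above 2) can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' pipeline: the abstract does not say how q-values were computed, so Benjamini-Hochberg adjusted p-values stand in for them here, and the function names and input values are hypothetical. On the log2 scale, a fold change above 2 corresponds to an absolute log2 difference above 1.

```python
import math

def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed stand-in for q-values)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

def candidate_sfp(pvalues, log2_fold_changes, q_max=0.05, min_fold=2.0):
    """Indices of probes passing both the q-value and fold-change thresholds."""
    q = bh_qvalues(pvalues)
    threshold = math.log2(min_fold)
    return [
        i for i in range(len(q))
        if q[i] < q_max and abs(log2_fold_changes[i]) > threshold
    ]

# Hypothetical breed-by-probe interaction p-values and log2 fold changes.
pvals = [0.001, 0.04, 0.0005, 0.5]
log2_fc = [1.5, 2.0, 0.4, -3.0]
candidates = candidate_sfp(pvals, log2_fc)
print(candidates)
```

Requiring both thresholds keeps probes whose interaction effect is statistically supported and also large enough to suggest disrupted hybridization, at the cost of missing polymorphisms with weaker effects on probe binding.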