Submitted to: BioMed Central (BMC) Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: August 9, 2013
Publication Date: August 15, 2013
Citation: Pegadaraju, V., Nipper, R., Hulke, B., Qi, L., Schultz, Q. 2013. De novo sequencing of sunflower genome for SNP discovery using RAD (Restriction site Associated DNA) approach. BioMed Central (BMC) Genomics. 14:556.
Interpretive Summary: This paper describes the methods used to produce a large set of single nucleotide polymorphism (SNP) markers for sunflower. These markers are useful for DNA analysis of sunflower plants. On average, 1 SNP was located every 143 bp of the sunflower genome sequence. We report the molecular and computational methodology involved in SNP development for a complex genome such as sunflower, which lacks a reference assembly, offering an attractive tool for molecular breeding in sunflower. The approach can be applied to other crop species with similarly complex genomes.
Technical Abstract: Application of Single Nucleotide Polymorphism (SNP) marker technology as a tool in sunflower breeding programs offers enormous potential to improve sunflower genetics and to facilitate faster release of sunflower hybrids to the marketplace. Through a National Sunflower Association (NSA) funded initiative, we report on the process of SNP discovery through reductive genome sequencing and local assembly of six diverse sunflower inbred lines that represent oil as well as confection types. A combination of Restriction site Associated DNA Sequencing (RAD-Seq) protocols and Illumina paired-end sequencing chemistry generated 89.4 million high-quality paired-end reads from the six lines, representing 5.3 GB of sequencing data. Raw reads from RHA 464 were assembled de novo to serve as a framework reference genome. Assembly of the RHA 464 sequencing data yielded about 15.2 Mb of sunflower genome sequence distributed over 42,267 contigs; contig lengths ranged from 200 to 950 bp, with an N50 length of 393 bp. SNP calling was performed by aligning sequencing data from the six sunflower lines to the assembled RHA 464 reference. On average, 1 SNP was located every 143 bp of the sunflower genome sequence. After several filtering criteria were applied, a final set of 16,467 putative sequence variants with characteristics favorable for Illumina Infinium Genotyping Technology (IGT) was mined from the sequence data generated across the six diverse sunflower lines. Here we report the molecular and computational methodology involved in SNP development for a complex genome such as sunflower, which lacks a reference assembly, offering an attractive tool for molecular breeding in sunflower.
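The N50 statistic quoted above can be illustrated with a short sketch. It assumes the common definition of N50: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the study's data or pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and
    return the length of the contig at which the running total first
    reaches at least half of the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not the RHA 464 contigs).
toy_contigs = [200, 300, 400, 500, 950]
toy_n50 = n50(toy_contigs)
```

For the toy list, the total is 2,350 bp; the two longest contigs (950 bp and 500 bp) together reach 1,450 bp, which passes the halfway mark of 1,175 bp, so the N50 is 500 bp. Applied to the full set of 42,267 RAD contig lengths, the same calculation would be expected to give the reported 393 bp, provided the authors used this definition.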