Submitted to: Acta Horticulturae
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/4/2021
Publication Date: 8/17/2021
Citation: Wang, X., Grauke, L.J. 2021. SNP discovery and SSR mining from Carya illinoinensis - RAD sequences. Acta Horticulturae. 1318:165-176. https://doi.org/10.17660/ActaHortic.2021.1318.25.
Interpretive Summary: Genetic research on pecan (Carya illinoinensis) lags behind that of other crops due to a lack of genomic information. Two pecan genome scaffolds are available in the USDA ARS Pecan Breeding and Genetics Program. We used these two scaffolds as references to map restriction-site associated DNA (RAD) sequences from nine pecan cultivars in order to discover molecular tools known as single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers. These RAD-based markers will be validated using our pecan Repository Collection and breeding and/or mapping populations. This project is the foundation of our current USDA Specialty Crop Research Initiative (SCRI) project to develop new genetic tools for pecan breeding and germplasm enhancement.
Technical Abstract: Pecan (Carya illinoinensis) is recognized as one of the major nut trees worldwide. Long juvenility, large tree size, and heterozygosity due to outbreeding hamper genomic characterization of this species. We sequenced Restriction-site Associated DNA (RAD) from nine pecan cultivar genomes. In total, 87.7M useful short reads with an average length of 45.91 bp, equivalent to ~5x the pecan genome, were produced from four restriction enzyme-digested RAD libraries. Sequences were assembled into a total of 64,391 contigs N50 with an average length of 269 bp. 'Wichita' generated the most contigs (10,456) and 'Sumner' the fewest (5,439). Contigs were mapped to two previously sequenced pecan cultivar scaffolds ('87MX3-2.11' and 'Pawnee'), generating 78,264 and 56,047 SNP markers, respectively. Contigs were also mined to discover 4,698 SSR motifs (di-nucleotides or higher), of which 3,014 (64.2%) allowed design of SSR primers. Of the four restriction enzymes, SbfI generated the most useful reads from the nine cultivars (41.6M), of which approximately 18M (43%) were assembled, followed by FseI and NotI. AscI yielded the fewest reads and the fewest SNPs. The rates of SSR discovery from the four RAD libraries followed the same trend as SNP discovery. Based on these preliminary results, SbfI was the optimal enzyme for RAD-based marker discovery in pecan. In addition, SNP variation did not differ significantly among pecan cultivars, but appeared to depend on their genetic/geographic distance from the reference genome. This study provides not only useful molecular markers for population association mapping and genotyping, but also a strategy for pecan whole-genome sequencing and subsequent gene discovery.
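The SSR mining step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the tool or the thresholds used in the study, so the function name `find_ssrs`, the motif length range (2 to 6 bp, matching "di-nucleotides or higher"), and the minimum of four tandem repeats are all assumptions chosen for illustration, not the authors' method. The sketch only detects perfect tandem repeats in a single contig sequence.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not the criteria used in the study.
    """
    seq = seq.upper()
    # A lazily matched motif of min_motif..max_motif bases, followed by
    # at least (min_repeats - 1) further copies of the same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs such as "AAAA...", which are not
        # di-nucleotide or higher-order repeats.
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

# Toy contig (made up for illustration) with one di-nucleotide repeat.
toy_contig = "GGCT" + "AC" * 5 + "GTT"
ssr_hits = find_ssrs(toy_contig)
```

In a real workflow, each hit would also need enough flanking sequence on both sides for primer design, which is consistent with only part of the discovered SSR motifs in the abstract allowing primers to be designed.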