|DUPUIS, JULIAN - University Of Hawaii|
|BREMER, FOREST - University Of Hawaii|
|SAN JOSE, MICHAEL - University Of Hawaii|
|LEBLANC, LUC - University Of Idaho|
|RUBINOFF, DANIEL - University Of Hawaii|
Submitted to: Molecular Ecology Resources
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/19/2018
Publication Date: 4/6/2018
Citation: Dupuis, J.R., Bremer, F.T., Kauwe, A.N., San Jose, M., Leblanc, L., Rubinoff, D., Geib, S.M. 2018. HiMAP: robust phylogenomics from highly multiplexed amplicon sequencing. Molecular Ecology Resources. 18:1000-1019. https://doi.org/10.1111/1755-0998.12783.
Interpretive Summary: With over 4,900 species, of which ~200 are recognized as serious agricultural pests, fruit flies in the family Tephritidae are a systematically diverse and taxonomically difficult group. Four genera, Anastrepha, Bactrocera, Ceratitis, and Zeugodacus, include some of the most economically important pest species in the tropics and subtropics, and control, eradication, and quarantine efforts are employed across the world to combat these pests. The main taxonomic difficulties across these genera lie in groups of closely related species whose adults, and even more so larvae, are morphologically indistinct. Our goal was to develop a phylogenomic foundation for a rapid, straightforward tool for species identification of invasive tephritid fruit flies that are commonly detected in California, Florida, and South Texas. We developed a bioinformatic locus selection pipeline that takes advantage of a variety of genomic and transcriptomic data sources to identify phylogenetically informative, conserved exons in orthologous genes. Using Paragon Genomics’ CleanPlex technology, we targeted 878 conserved exons in highly multiplexed, single-tube reactions for hundreds of individuals across the four aforementioned genera. This approach yielded a phylogenomic dataset that far exceeded the phylogenetic resolution of existing datasets, containing >40,000 informative characters after reasonable filtering. The wet lab procedure and our analysis pipeline can process hundreds of individuals at a time and return taxonomic, and in some cases population-level, assignments in as few as three days from sample collection. Our approach provides a novel way to combine diverse genomic and transcriptomic data sources, particularly when at least one well-annotated data source is available, and supports rapid development of robust phylogenetic analyses for non-model systems in a scalable, cost-effective way.
Technical Abstract: High-throughput sequencing has fundamentally changed how molecular phylogenetic datasets are assembled, and phylogenomic datasets commonly contain 50-100-fold more loci than those generated using traditional Sanger-based approaches. Here, we demonstrate a new approach for building phylogenomic datasets using single-tube, highly multiplexed amplicon sequencing, which we name HiMAP (Highly Multiplexed Amplicon-based Phylogenomics), and present bioinformatic pipelines for locus selection based on genomic and transcriptomic data resources and for post-sequencing consensus calling and alignment. This method is inexpensive, is amenable to sequencing a large number (hundreds) of taxa simultaneously, requires minimal hands-on time at the bench (less than half a day), and allows data analysis without read mapping or assembly. We demonstrate this approach by sequencing 878 amplicons in single reactions for 82 species of tephritid fruit flies across seven genera (384 individuals), including some of the most economically important agricultural insect pests. The resulting dataset (>150,000 bp concatenated alignment) contained >40,000 phylogenetically informative characters, and although some discordance was observed between analyses, it provided unparalleled resolution of many phylogenetic relationships in this group. Most notably, we found high support for the generic status of Zeugodacus and the sister relationship between Dacus and Zeugodacus. We discuss HiMAP with regard to its molecular and bioinformatic strengths and the insight the resulting dataset provides into relationships of this diverse insect group.
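Illustrative note: the abstract reports the number of phylogenetically informative characters in the concatenated alignment. As a minimal sketch of how such a count can be made, the Python below tallies parsimony-informative sites, meaning alignment columns in which at least two different character states each occur in at least two taxa. This is an assumption about the definition: the abstract does not say which criterion or software was used, and the function name, the toy sequences, and the choice to ignore gaps and ambiguous bases are hypothetical and not taken from the HiMAP pipeline.

```python
from collections import Counter


def count_informative_sites(alignment, ignore=frozenset("-?N")):
    """Count parsimony-informative columns in an aligned set of sequences.

    `alignment` maps a taxon name to its aligned sequence. A column counts as
    informative when at least two distinct states each appear in two or more
    taxa. Characters in `ignore` (gap, missing, ambiguous) are skipped; that
    choice is an assumption made for this sketch.
    """
    seqs = list(alignment.values())
    if not seqs:
        return 0
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")

    informative = 0
    for column in zip(*seqs):
        # Tally the states observed in this column, case-insensitively.
        counts = Counter(
            base.upper() for base in column if base.upper() not in ignore
        )
        # States shared by at least two taxa.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative


# Hypothetical toy alignment, not data from the study.
toy = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTTC",
    "taxon_C": "ATGAAC",
    "taxon_D": "ATGATC",
}
print(count_informative_sites(toy))
```

In the toy alignment, the second, fourth, and fifth columns each split the four taxa two against two, so they are the only columns counted; a column where a variant appears in a single taxon would not count.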