Genome Wide Sequencing and Analysis of Bactrocera Species Complex
Tropical Crop and Commodity Protection Research
2011 Annual Report
1a. Objectives (from AD-416)
The objective of this research is to perform genome-wide analysis of several members of the Oriental fruit fly species complex, with a focus on:
• Identification of gene regions for use in species-level diagnosis based on these genomes
• Identification of gene regions that demonstrate poor performance for diagnostics (and should not be included in screening studies)
• Identification of population-level markers for pathway analysis and population structure studies (these markers should be useful across species boundaries and provide markers for various Bactrocera and other fruit fly species)
• Comparative phylogenomic analysis of the included species to direct future studies and test species boundaries
1b. Approach (from AD-416)
Comparative genomic analysis within the dorsalis complex will be performed by using pyrosequencing technology to generate several “shallow” (low coverage) genomes from Bactrocera species and analyzing them with bioinformatic tools. Using five to ten fruit fly specimens representing Bactrocera species in the dorsalis complex, our team will generate a genomic database for each specimen using a 454 pyrosequencer operated at the University of Hawaii. This genomic work will be designed and supervised by a PBARC research entomologist who is currently in charge of annotating the completed Bactrocera dorsalis genome. The species included for genomic analysis will be selected based on similarity to B. dorsalis, economic significance, and value to SIT programs. The data generated using 454 technology will be edited, annotated, and analyzed by CPHST and ARS staff.
The methodology for preparing high-throughput DNA sequencing libraries for RAD-Tag (restriction site associated DNA tag) sequencing was developed at USDA-ARS-PBARC. This included developing methods for extracting high-quality DNA from preserved specimens, for library shearing and size selection, and for barcoding, which allows multiple samples to be multiplexed on a single sequencing run. Initial sequencing libraries were prepared from 6 IAEA laboratory strains and 2 USDA-ARS laboratory strains as a proof-of-concept demonstration of the methodology, with 4 samples barcoded on each of 2 sequencing runs. These samples are currently being sequenced on an Illumina GAIIx high-throughput sequencer, and results are pending. Additionally, over 30 distinct populations of B. dorsalis from a range of geographic locations in Southeast Asia and the Hawaiian Islands are being made available to this project for RAD-Tag sequencing. Once the proof-of-concept data have been verified, libraries from these populations will be sequenced. To facilitate analysis of the large amount of data that this project will produce, a database has been initiated to house genomic information; it can be accessed in house at USDA-ARS-PBARC and can be made available to collaborators at www.bactrobase.org. Throughout the year, communication among USDA-ARS, USDA-APHIS, and the collaborators providing samples for sequencing has been ongoing through email and teleconferencing. This has included arranging, through USDA-APHIS, the shipment of samples from IAEA to USDA-ARS.
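The barcoding step described above lets reads from several samples share one sequencing run and be separated afterward. The following is a minimal sketch of that separation (demultiplexing), not the project's actual pipeline. The barcode sequences, the barcode length, the sample names, and the function name `demultiplex` are all hypothetical, and the sketch assumes each read begins with an exact, error-free inline barcode.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map: the report does not list the real
# barcodes or strain names, so these values are placeholders.
BARCODES = {
    "ACGT": "strain_A",
    "TGCA": "strain_B",
    "GATC": "strain_C",
    "CTAG": "strain_D",
}
BARCODE_LEN = 4  # assumed fixed barcode length


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by their leading barcode.

    Reads whose prefix matches a known barcode are stored under the
    sample name with the barcode trimmed off. Reads with an unknown
    prefix are kept whole under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_len])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[barcode_len:])
    return dict(bins)


if __name__ == "__main__":
    example_reads = ["ACGTTTTT", "TGCAGGGG", "NNNNAAAA", "ACGTCCCC"]
    for sample, seqs in demultiplex(example_reads).items():
        print(sample, seqs)
```

Exact prefix matching keeps the sketch short. A production workflow would typically also tolerate sequencing errors in the barcode and check for the restriction-site remnant that follows it.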