Location: Arthropod-borne Animal Diseases Research
2012 Annual Report
1a. Objectives (from AD-416):
The goal of this study is to quantify the population structure of two mosquito species of medical and veterinary importance in North America, Culex tarsalis (Coquillett) and Aedes vexans (Meigen). The objective is to associate identified genotypes with geographic locations and disease vector competence.
1b. Approach (from AD-416):
Culex tarsalis and Ae. vexans mosquitoes were selected for several reasons: (1) one or both species are competent vectors of many viruses, including West Nile, Rift Valley Fever, Japanese B and Saint Louis encephalitis, and Eastern, Western, and Venezuelan equine encephalitis viruses; (2) the species feed opportunistically on both mammals and birds and potentially serve as bridge vectors between distinct host groups; and (3) the species have a long dispersal distance and an extensive geographic distribution (coast to coast and from the Southern United States into Canada), which creates a large at-risk area. Throughout their range, one or both species have been characterized by phenotypic variation in virus susceptibility, horizontal and vertical virus transmission, autogeny, and host preference. However, the observed phenotypic variation is currently unlinked to genetic or geographic groupings, despite previous studies examining population structure using allozyme, five microsatellite, and mitochondrial markers. Finding appropriate genetic markers for these non-model organisms has been difficult because of their relatively recent expansion (~22,000 years) throughout the North American continent. Mosquitoes from at least 150 locations throughout their known distributions will be collected for a population genetics analysis. The genomic DNA from at least ten individuals from each collection location will be pooled and used for the identification of single nucleotide polymorphisms (SNPs) using restriction-site-associated DNA tags (RAD-tags) and 454-sequencing. The SNPs will be used with phylogenetics and population genetics techniques to describe the historical range expansion and to quantify the number of existing mosquito populations and the migration rates between them.
3. Progress Report:
Aedes vexans and Culex tarsalis mosquito specimens collected throughout the summer of 2011 during the North American Mosquito Project were used to prepare and sequence RAD-Seq test libraries, as preliminary steps to quantifying variation between mosquito populations. The most significant accomplishments for this project to date were: 1) completion of RAD-Seq test library prep, 2) completion of RAD-Seq test library sequencing, and 3) completion of RAD-Seq test library preliminary bioinformatic analysis.
1) Completion of RAD-Seq test library prep: Test RAD-Seq libraries were prepared from 12 samples (half Culex tarsalis, half Ae. vexans), each with four different restriction enzymes: PstI, EcoRI, SgrAI, and SbfI. This is important because test libraries allow us to quantify how well an enzyme works on the given sample material. For an organism without a reference genome, it is not possible to accurately estimate the number of restriction sites. Normally this estimate would be made in silico and would indicate which enzyme is best suited to a given experiment. Because no reference genome is available for either of these two mosquito species, this information had to be obtained empirically before moving forward with the large-scale effort. Quantification of the test libraries with a Qubit fluorometer and agarose gel analysis provided essential information (RAD-Seq fragment sizes, level of contamination, and confirmation of fluorometric readings) for high-quality RAD-Seq output upon sequencing.
2) Completion of RAD-Seq test library sequencing: The test libraries described above were sequenced. Sequencing of test libraries is essential to determine the quality and quantity of sequence for every sample/enzyme combination. Some sample/enzyme combinations perform better than others, and without a reference genome to estimate these values, test sequencing needs to be performed.
These sequencing metrics, derived from a test period such as this, will provide the much-needed basis for full-scale sequencing of the 96 pools of this project. Further, because this project entails RAD sequencing of pooled samples, different considerations apply. Pooled samples require higher-coverage sequencing to accurately determine all alleles present in the pool and to calculate allele frequencies. Lastly, RAD sequencing provides the raw data from which all bioinformatic analyses begin.
3) Completion of RAD-Seq test library preliminary bioinformatic analysis: Preliminary bioinformatic analysis allows us to assess the following criteria for every sequencing event (results to date in parentheses):
a) The quality of sequence obtained for every sample (high for all samples and enzymes).
b) The quantity of sequence obtained for every sample (variable across samples; gDNA quality influences output).
c) Whether the library prep parameters established in the test phase of the project were accurate and provided the amount of sequence desired (PstI and EcoRI: sufficient parameters established in the test prep stage; SgrAI: library prep parameters not sufficient for full-scale production, and this enzyme cannot be used because it has too few restriction sites).
d) The overall number of RAD tags per pool, which is twice the number of restriction sites (PstI: 799,000; EcoRI: 750,000; SgrAI: 32,000; SbfI: TBD).
e) Whether the estimated amount of sequencing per sample is enough to identify alleles in samples and to calculate allele frequencies (TBD).
f) The overall amount of sequencing required to complete the project, for planning future sequencing events (TBD once the enzyme for full-scale production is chosen).
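For illustration only, the in silico estimate that would normally guide enzyme choice, and the relationship in item d) between restriction sites and RAD tags, can be sketched in Python. This is a minimal sketch and not part of the project's pipeline. The recognition sequences are standard published values rather than figures from this report, the function names are hypothetical, and the toy sequence stands in for a reference genome that does not exist for either species.

```python
import re

# Recognition sequences (IUPAC codes) for the four enzymes named in the report.
# ASSUMPTION: these are standard published values, not taken from the report.
RECOGNITION_SITES = {
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "SgrAI": "CRCCGGYG",  # R = A or G, Y = C or T
    "SbfI": "CCTGCAGG",
}

# Only the IUPAC codes needed for the sites above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def count_sites(sequence, site):
    """Count occurrences of a recognition site, including overlapping ones.

    All four sites are palindromic, so scanning one strand is sufficient.
    """
    pattern = "".join(IUPAC[base] for base in site)
    # A zero-width lookahead lets the regex engine report overlapping matches.
    return sum(1 for _ in re.finditer(f"(?={pattern})", sequence.upper()))


def expected_rad_tags(sequence, site):
    """Each cut site yields two RAD tags, one on each side of the cut."""
    return 2 * count_sites(sequence, site)


# Hypothetical toy sequence used only to exercise the functions.
toy_sequence = "AAGAATTCTTCCTGCAGGACCACCGGCGTTGAATTC"

for enzyme, site in RECOGNITION_SITES.items():
    print(enzyme, count_sites(toy_sequence, site), expected_rad_tags(toy_sequence, site))
```

Applied in reverse, the same factor of two converts the observed tag counts into approximate site counts, for example about 399,500 PstI sites from 799,000 tags.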
The major accomplishments in the first year of the project have been to determine the actual number of sequenced fragments for each restriction enzyme and the number of variants in a pool associated with each enzyme. This provided evidence for the actual amount of sequencing needed to complete the project effectively. Furthermore, the sequencing has provided evidence that the pools of samples contain far more variation per pool than was previously expected. Ultimately this will lead to the completion of full-scale preparation of 96 samples, followed by sequencing and analysis by Floragenex based on the information obtained above.
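As a rough illustration of how the tag counts above could feed into a sequencing budget, the Python sketch below estimates a lower bound on coverage for pooled samples and the resulting read totals. It is a hypothetical planning aid, not the method used in the project. The assumptions are: exactly ten diploid individuals per pool contributing equal amounts of DNA, no sequencing error, and a 95% chance of sampling a single-copy allele at least once. Reliable allele frequency estimates would require more coverage than this bound.

```python
import math

# RAD tag counts per pool from the progress report (SbfI was still TBD).
RAD_TAGS_PER_POOL = {"PstI": 799_000, "EcoRI": 750_000, "SgrAI": 32_000}
N_POOLS = 96  # pools planned for full-scale sequencing, per the report


def sites_from_tags(rad_tags):
    """Restriction sites implied by a tag count (two tags per site)."""
    return rad_tags // 2


def min_coverage(individuals_per_pool, detection_probability):
    """Reads per tag needed to sample a single-copy allele at least once.

    ASSUMPTION: diploid individuals, equal DNA contribution, no sequencing
    error. The rarest allele then has frequency 1 / (2 * n), and the chance
    of missing it in c reads is (1 - frequency) ** c.
    """
    allele_freq = 1 / (2 * individuals_per_pool)
    return math.ceil(
        math.log(1 - detection_probability) / math.log(1 - allele_freq)
    )


def reads_required(rad_tags, coverage, n_pools=N_POOLS):
    """Total reads for all pools at the given per-tag coverage."""
    return rad_tags * coverage * n_pools


# Hypothetical planning parameters: 10 individuals per pool, 95% detection.
coverage = min_coverage(individuals_per_pool=10, detection_probability=0.95)
for enzyme, tags in RAD_TAGS_PER_POOL.items():
    print(enzyme, sites_from_tags(tags), reads_required(tags, coverage))
```

Because the read total scales linearly with the number of tags, an enzyme with fewer cut sites lowers sequencing cost but also samples fewer loci, which is the trade-off behind the enzyme choice still pending in the report.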