2010 Annual Report
1a. Objectives (from AD-416)
Sequencing and profiling of functional transcripts, captured as expressed sequence tags (ESTs), of Coptotermes formosanus to fulfill the following objectives: a) identifying and annotating genes specifically associated with post-embryonic polyphenism (caste differentiation and development); b) discovering genes that respond specifically to environmental cues and internal signals (pesticides, food sources, juvenile hormones, etc.); and c) characterizing genes uniquely involved in critical physiological pathways (digestion, molting, immunity, etc.). Fulfilling these objectives would lead directly to the following goals: a) providing biochemical, physiological, and molecular bases for disrupting colony formation, development, and survival; and b) discovering novel target sites that could be developed into new control strategies and incorporated into effective area-wide integrated management of Formosan subterranean termites.
The overall objectives of this cooperative research are to identify genes that are unique to termites, specifically genes associated with reproduction and development (e.g., the mechanisms of nymph or soldier formation), with particular emphasis on Coptotermes formosanus, while also examining enzymes that hydrolyze cellulosic material for use in agriculture and industry. We will also concentrate on strategies for developing private and public databases and websites.
1b. Approach (from AD-416)
A cDNA library representing the genes expressed in each developmental stage of Coptotermes formosanus has been constructed. To facilitate transcriptome analysis and rare gene discovery, repeated transcripts were proportionally removed from the cDNA library using cDNA normalization. The cDNA library is expected to contain approximately 400,000 independent clones, an estimated coverage of at least 16X of all expressed genes, assuming that the protein-coding capacity of invertebrate genomes is in the range of 16,000 to 25,000 genes. Sequencing of the cDNA library and assembly of the gene data will be conducted cooperatively at the J. Craig Venter Institute (JCVI). We will continue Sanger sequencing of EST clones and perform SOLiD sequencing to determine differential gene expression. EST sequences will be compared against existing databases and annotated using the Basic Local Alignment Search Tool (BLAST). Batches of sequences will be released sequentially to GenBank for public access. Unique genes and singletons will be selected for gene expression analysis. Differentially expressed genes and development-stage-specific genes will be preferentially analyzed quantitatively (for example, by real-time PCR) or qualitatively (for example, by gene silencing through RNA interference).
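The coverage estimate above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming coverage is defined as the number of independent clones divided by the number of expressed genes; the function and variable names are illustrative and do not come from the project.

```python
def library_coverage(independent_clones: int, gene_count: int) -> float:
    """Average number of independent clones per expressed gene.

    Assumption: coverage is the simple ratio clones / genes, with every
    gene equally represented (the intent of cDNA normalization).
    """
    return independent_clones / gene_count


# Figures stated in the report.
clones = 400_000
genes_low, genes_high = 16_000, 25_000

# The larger gene count gives the conservative (minimum) coverage.
coverage_min = library_coverage(clones, genes_high)
coverage_max = library_coverage(clones, genes_low)

print(f"Estimated coverage: {coverage_min:.0f}X to {coverage_max:.0f}X")
```

Under these assumptions, the upper bound of 25,000 genes yields the "at least 16X" figure, and the lower bound of 16,000 genes yields 25X.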
Last year, the JCVI continued analysis of sequences of pooled gene samples from all life stages of the Formosan subterranean termite, Coptotermes formosanus, obtained using the Sanger sequencing platform. The 132,000 sequencing reads were assembled into 25,000 unique sequences corresponding to individual transcripts (unigenes). Almost half of the unigenes showed similarity to known proteins, while the other half did not, suggesting that the latter may represent non-protein-coding ribonucleic acids (RNAs). Among the former, over 50% of unigenes (25% of all unigenes) shared similarity with genes from other insect genomes, such as the fly, bee, or wasp genomes. After insects, the second most represented class was Trichomonada, consistent with the presence of similar microorganisms in the termite gut. Many unigenes in this group (940) showed significant similarity to Trichomonas vaginalis genes.
The search for carbohydrate-active enzymes, which may be involved in the digestion of cellulose, identified five superfamilies: glycosidase (124 unigenes), polysaccharide lyase (2), carbohydrate-binding (32), carbohydrate esterase (11), and glycosyltransferase (80). The most abundant glycosidase families included GH5 (15 unigenes) and GH7 (14 unigenes). One unigene, corresponding to a cellulose-digesting enzyme gene from the termite, was experimentally characterized as a cellobiohydrolase (an enzyme that digests cellulose from the chain end). The presence of GH7 enzymes was unexpected, as these cellulases are usually found only in fungi. This suggests the presence of disease-causing fungi, which have previously been identified as potential biological control agents against termites. Another well-represented glycosidase family was GH22 (13 unigenes), which contains lysozymes involved in defense against pathogens.
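The superfamily counts reported above can be tabulated to show the relative share of each class. This is a minimal sketch that uses only the unigene counts stated in the text; the dictionary layout and variable names are illustrative assumptions and are not part of the JCVI analysis pipeline.

```python
# Unigene counts per carbohydrate-active enzyme superfamily, as reported.
superfamily_counts = {
    "glycosidase": 124,
    "polysaccharide lyase": 2,
    "carbohydrate-binding": 32,
    "carbohydrate esterase": 11,
    "glycosyltransferase": 80,
}

total_unigenes = sum(superfamily_counts.values())

# Share of each superfamily among all carbohydrate-active unigenes.
shares = {
    name: count / total_unigenes
    for name, count in superfamily_counts.items()
}

# Print from the most to the least represented superfamily.
for name, count in sorted(superfamily_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name:<24}{count:>5}{shares[name]:>8.1%}")
print(f"{'total':<24}{total_unigenes:>5}")
```

The tabulation assumes each unigene is counted in exactly one superfamily, which the report does not state explicitly.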
The presence of a unigene classified as a member of a starch-degrading enzyme family (GH14), typically found in plants and only a few bacteria, was unusual. Another unexpected finding was the abundance of the carbohydrate-binding module family CBM13 (16 unigenes), whose members typically bind plant polysaccharides. In addition, unigenes from the chemical binding families were identified. Further computational analysis performed by the JCVI team demonstrated that additional sequencing is required to obtain sufficient genetic coverage of the termite and its symbionts.
We have begun sequencing RNA from each individual caste of the termite using the RNA-Seq approach with Illumina short-read sequencing technology. We have obtained over 40,000,000 reads from two castes (male alates and nymphs) and are continuing to sequence the remaining castes. About 65% of the reads aligned to the previously identified unigenes. We are currently assembling the reads and beginning to identify the genes expressed in the various castes. Progress is monitored through regular e-mail exchanges and teleconference calls.
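As a rough illustration of what the alignment rate implies, the sketch below converts the reported figures into read counts. It assumes exactly 40,000,000 reads and an alignment rate of exactly 65%, whereas the report gives these only as "over" and "about"; the variable names are illustrative.

```python
# Approximate figures from the report (treated as exact for illustration).
total_reads = 40_000_000
aligned_percent = 65

# Integer arithmetic avoids floating-point rounding in the counts.
aligned_reads = total_reads * aligned_percent // 100
unaligned_reads = total_reads - aligned_reads

print(f"Aligned to known unigenes: {aligned_reads:,}")
print(f"Not aligned:               {unaligned_reads:,}")
```

Under these assumptions, roughly 14 million reads do not match the existing unigene set, which is consistent with the stated need for further assembly and additional sequencing.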