Submitted to: International Genome Sequencing and Analysis Conference
Publication Type: Abstract Only
Publication Acceptance Date: July 6, 2004
Publication Date: September 27, 2004
Citation: Harhay, G.P., Sonstegard, T.S., Clawson, M.L., Smith, T.P. 2004. Database directed full-length cDNA generation from normalized bovine cDNA libraries. International Genome Sequencing and Analysis Conference XVI, Washington, DC. September 27-30, 2004. Poster Abstract #49.
Technical Abstract: Full-length cDNA sequences, comprising the 5' untranslated region (UTR), the entire coding sequence, and the 3' UTR, are critically important for producing accurate gene models. Assembly and annotation of the impending bovine genome sequence will benefit greatly from full-length cDNA sequence data. Consequently, we developed a database-directed pipeline for full-length bovine cDNA prediction, sequencing, analysis, and annotation. To facilitate the economical production of bovine full-length cDNA sequences, EST data produced from normalized cDNA libraries were computationally screened for potential full-length clone candidates by similarity to transcripts in the NCBI RefSeq database. The candidate clone inserts were sequenced along their entire length by primer walking, and the reads were assembled. The pipeline tolerates sequencing failures and allows different clones, at different walk steps and in either walk direction, to be sequenced simultaneously on the same 384-well plate. During the initial stages of clone insert sequencing, walking primer selection and sequence assembly are heavily automated. In the later stages of insert sequencing, the pipeline provides for manual selection of finishing primers and manual sequence analysis. Over 1,000 full-length cDNAs are being processed through this pipeline prior to submission to the NCBI FLIC (Full-Length Insert Clone) database.
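The computational screening step can be illustrated with a minimal Python sketch. The abstract does not state the selection criterion, so the rule used here is an assumption: a clone is treated as a full-length candidate when its 5' EST aligns to a reference transcript with high identity and the alignment begins at or upstream of the reference start codon. The `EstHit` record, its field names, the 90% identity threshold, and the clone and reference identifiers are all hypothetical and are not taken from the pipeline described above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EstHit:
    """One 5' EST-to-reference alignment (hypothetical record layout)."""
    clone_id: str
    ref_id: str
    percent_identity: float
    ref_align_start: int  # 1-based reference position where the alignment begins
    ref_cds_start: int    # 1-based reference position of the start codon


def is_full_length_candidate(hit: EstHit, min_identity: float = 90.0) -> bool:
    # Assumed rule: the EST must match well and must reach the start codon
    # (or the 5' UTR upstream of it) on the reference transcript.
    return (hit.percent_identity >= min_identity
            and hit.ref_align_start <= hit.ref_cds_start)


def select_candidates(hits: Iterable[EstHit],
                      min_identity: float = 90.0) -> List[str]:
    # Return each qualifying clone once, in sorted order.
    return sorted({h.clone_id for h in hits
                   if is_full_length_candidate(h, min_identity)})


if __name__ == "__main__":
    example_hits = [
        EstHit("cloneA", "ref1", 95.0, 10, 120),   # reaches the 5' UTR
        EstHit("cloneB", "ref2", 97.0, 300, 150),  # starts inside the CDS
        EstHit("cloneC", "ref3", 80.0, 1, 50),     # identity too low
    ]
    print(select_candidates(example_hits))
```

With the made-up hits above, only `cloneA` passes both tests. A real screen would also need to handle multiple hits per clone, reverse-strand alignments, and reference transcripts that are themselves incomplete.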