Submitted to: National Fusarium Head Blight Forum
Publication Type: Abstract Only
Publication Acceptance Date: 12/11/2004
Publication Date: 12/11/2004
Citation: Gueldener, U., Trail, F., Xu, J.-R., Adam, G., Kistler, H.C. 2004. Development of an Affymetrix GeneChip microarray using the GEN-AU/MIPS Fusarium graminearum genome database [abstract]. 2nd International Symposium on Fusarium Head Blight Proceedings. p. 566.
Technical Abstract: Shortly after public release of the genome sequence of the plant pathogenic fungus Fusarium graminearum by the Broad Institute, automated draft gene calls were processed at MIPS and also at the Broad Institute. For both predicted gene sets, a variety of bioinformatics methods were applied at MIPS using the PEDANT system. Manual inspection of the calls using different gene prediction programs and EST sequences led to the conclusion that ~1/3 of all calls are falsely predicted. These errors lead to a significant amount of under-prediction, based on falsely fused coding regions, or improper assignment of 5' and 3' ends. Thus, manual inspection of genes, at least of the most interesting targets, is desirable before designing an Affymetrix GeneChip microarray, in order to avoid as many mis-designed probe sets as possible. With the help of the Fusarium community, we manually processed ~860 entries; 408 of the calls were altered or added as completely new calls. To integrate all the different calls as well as the results of the applied bioinformatics methods, the F. graminearum Genome Database was created (http://mips.gsf.de/genre/proj/fusarium/). However, only 6.1% of the putative 14,000 Fusarium genes were manually processed. During the manual gene modeling and correction procedure, it appeared that the MIPS draft gene call set performed significantly better than the Broad set. Therefore, we produced a combined gene call set for the Affymetrix GeneChip design with the order of preference “manually processed new calls” > “MIPS draft set” > “Broad set”. To reduce the number of ~26,000 gene calls, the ORF sequences were truncated to 500 bases towards the 3' end and all redundant call names were added to the preferred ones as an alias. This approach resulted in a set of 16,926 calls. The set of full-length ORF sequences and an additional 611 EST- and rRNA-sequences were submitted to Affymetrix for initial computation of probe sets. After three rounds of chip design proposals, the sets were approved for mask design.
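The combination of call sets described in the abstract can be illustrated with a minimal Python sketch. It is not the pipeline used at MIPS. It assumes that two calls count as redundant when their 3'-terminal 500 bases are identical, which is only one possible reading of the truncation step. The source labels (`manual`, `mips`, `broad`), the `GeneCall` record, and the function names are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical source labels; a lower rank means higher preference,
# following the order "manually processed new calls" > "MIPS draft set" > "Broad set".
PREFERENCE = {"manual": 0, "mips": 1, "broad": 2}
TAIL_LENGTH = 500  # bases kept at the 3' end for the redundancy comparison


@dataclass
class GeneCall:
    name: str
    source: str  # one of the keys of PREFERENCE
    orf: str     # full-length ORF sequence, written 5' to 3'
    aliases: list = field(default_factory=list)


def three_prime_tail(orf, length=TAIL_LENGTH):
    """Return the last `length` bases of the ORF (the whole ORF if it is shorter)."""
    return orf[-length:].upper()


def combine_calls(calls):
    """Merge call sets, keeping one preferred call per distinct 3' tail.

    Assumption: calls are redundant when their truncated 3' tails are identical.
    Names of redundant, less preferred calls are recorded as aliases.
    """
    kept = {}
    # sorted() is stable, so calls from the same source keep their input order.
    for call in sorted(calls, key=lambda c: PREFERENCE[c.source]):
        key = three_prime_tail(call.orf)
        if key in kept:
            kept[key].aliases.append(call.name)
        else:
            # Copy the record so the input objects are not modified.
            kept[key] = GeneCall(call.name, call.source, call.orf, list(call.aliases))
    return list(kept.values())
```

Each retained record keeps its full-length ORF, in line with the statement that full-length sequences were submitted for probe set computation; the truncated tail serves only as the comparison key.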