Publication #275322

Title: Uniform standards for genome databases in forest and fruit trees

Author
WEGRZYN, JILL - University of California
MAIN, DORRIE - Washington State University
FIGUEROA, BEN - University of California
CHOI, MINYOUNG - University of California
NEALE, DAVID - University of California
JUNG, SOOK - Washington State University
STANTON, MARGARET - Clemson University
ZHENG, PING - Washington State University
FICKLIN, STEPHEN - Washington State University
CHO, ILHYUONG - Washington State University
PEACE, CAMERON - Washington State University
EVANS, KATE - Washington State University
Volk, Gayle
ORAGUZIE, NNADOZIE - Washington State University
CHEN, CHUNXIAN - University of Florida
GMITTER, FRED - University of Florida
ABBOTT, ALBERT - Clemson University

Submitted to: Tree Genetics and Genomes
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/16/2012
Publication Date: 3/27/2012
Citation: Wegrzyn, J.L., Main, D., Figueroa, B., Choi, M., Neale, D.B., Jung, S., Stanton, M., Zheng, P., Ficklin, S., Cho, I., Peace, C., Evans, K., Volk, G.M., Oraguzie, N., Chen, C., Gmitter, F.G., Abbott, A.G. 2012. Uniform standards for genome databases in forest and fruit trees. Tree Genetics and Genomes. 8:549-557.

Interpretive Summary: Genomic databases for tree fruit and forestry species contain critical data for research and breeding programs. These data are most valuable when descriptive information about physical traits, experimental conditions, and environmental conditions is also available. This manuscript describes the development and integration of standardized vocabularies intended to increase the value of the genomic data within the TreeGenes and tfGDR databases.

Technical Abstract: TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and retrieval of this information in order to enable comparative genomics research. Large-scale genotype and phenotype projects have recently spawned the development of independent tools and interfaces within these repositories to deliver information to both geneticists and breeders. The growth of next-generation sequencing projects has increased both the amount of data and the scale of analysis that can be performed. These two repositories are now working toward the shared goal of archiving the diverse, independent data sets generated from genotype/phenotype experiments. This is achieved through focused development of data input standards (templates), pipelines for storage and automated curation, and consistent annotation using widely accepted ontologies, which improves the extraction and exchange of data for comparative analysis. Efforts toward standardization are not limited to genotype/phenotype experiments; they are also being applied to other data types to improve gene prediction and annotation for de novo sequencing projects. The resources developed toward these goals represent the first large-scale coordinated effort in plant databases to add informatic value to diverse genotype/phenotype experiments.
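
The combination of input templates, automated curation, and ontology-based annotation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the column names, the tiny trait vocabulary, the placeholder "EX:" term identifiers, and the `curate` function are hypothetical and do not reflect the actual TreeGenes or tfGDR templates, pipelines, or ontologies.

```python
import csv
import io

# Hypothetical template columns and controlled vocabulary. Neither comes from
# TreeGenes or tfGDR, and the "EX:" identifiers are placeholders, not real
# ontology term IDs.
REQUIRED_COLUMNS = ("sample_id", "species", "trait", "value", "unit")
TRAIT_VOCABULARY = {
    "tree height": "EX:0001",
    "fruit weight": "EX:0002",
}


def curate(template_text):
    """Split submitted rows into annotated records and rejected rows with reasons."""
    reader = csv.DictReader(io.StringIO(template_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"template is missing columns: {missing}")

    accepted, rejected = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Normalize the free-text trait name before looking it up.
        trait = (row["trait"] or "").strip().lower()
        term_id = TRAIT_VOCABULARY.get(trait)
        if term_id is None:
            rejected.append((line_no, f"unknown trait: {row['trait']!r}"))
            continue
        try:
            value = float(row["value"] or "")
        except ValueError:
            rejected.append((line_no, f"non-numeric value: {row['value']!r}"))
            continue
        # Attach the vocabulary term so records from different sources
        # can be compared on the same trait.
        accepted.append({
            "sample_id": row["sample_id"],
            "species": row["species"],
            "trait_term": term_id,
            "value": value,
            "unit": row["unit"],
        })
    return accepted, rejected
```

Calling `curate` on the text of a submitted template returns the rows that passed, each tagged with a vocabulary term, along with the line numbers and reasons for the rows that were rejected. Rejecting rows with unrecognized trait names, instead of storing them as free text, is the design choice that keeps records comparable across data sets.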