|WEGRZYN, JILL - University of California|
|MAIN, DORRIE - Washington State University|
|FIGUEROA, BEN - University of California|
|CHOI, MINYOUNG - University of California|
|NEALE, DAVID - University of California|
|JUNG, SOOK - Washington State University|
|STANTON, MARGARET - Clemson University|
|ZHENG, PING - Washington State University|
|FICKLIN, STEPHEN - Washington State University|
|CHO, ILHYUONG - Washington State University|
|PEACE, CAMERON - Washington State University|
|EVANS, KATE - Washington State University|
|ORAGUZIE, NNADOZIE - Washington State University|
|CHEN, CHUNXIAN - University of Florida|
|GMITTER, FRED - University of Florida|
|ABBOTT, ALBERT - Clemson University|
Submitted to: Tree Genetics and Genomes
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/16/2012
Publication Date: 3/27/2012
Citation: Wegrzyn, J.L., Main, D., Figueroa, B., Choi, M., Neale, D.B., Jung, S., Stanton, M., Zheng, P., Ficklin, S., Cho, I., Peace, C., Evans, K., Volk, G.M., Oraguzie, N., Chen, C., Gmitter, F.G., Abbott, A.G. 2012. Uniform standards for genome databases in forest and fruit trees. Tree Genetics and Genomes. 8:549-557.
Interpretive Summary: Genomic databases for tree fruit and forestry species contain critical data for research and breeding programs. These data are most valuable when descriptive information about physical traits, experimental conditions, and environmental conditions is also available. This manuscript describes the development and integration of standardized vocabularies intended to increase the value of the genomic data within the TreeGenes and tfGDR databases.
Technical Abstract: TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for submitting and retrieving this information to enable comparative genomics research. Large-scale genotype and phenotype projects have recently spawned the development of independent tools and interfaces within these repositories to deliver information to both geneticists and breeders. The growth in next-generation sequencing projects has increased both the amount of data and the scale of analysis that can be performed. The two repositories are now working toward a shared goal of archiving the diverse, independent data sets generated from genotype/phenotype experiments. This is being achieved through focused development of data input standards (templates), pipelines for storage and automated curation, and consistent annotation using widely accepted ontologies, which improves the extraction and exchange of data for comparative analysis. Efforts toward standardization are not limited to genotype/phenotype experiments; they are also being applied to other data types to improve gene prediction and annotation for de novo sequencing projects. The resources developed toward these goals represent the first large-scale coordinated effort in plant databases to add informatic value to diverse genotype/phenotype experiments.