Author
RUTTER, LINDSAY - Iowa State University
VANDERPLAS, SUSAN - Iowa State University
COOK, DIANNE - Monash University
Graham, Michelle
Submitted to: Journal of Statistical Software
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/16/2017
Publication Date: 5/29/2019
Citation: Rutter, L., Vanderplas, S., Cook, D., Graham, M.A. 2019. ggenealogy: An R package for visualizing genealogical data. Journal of Statistical Software. 89(13):1-31.
DOI: https://doi.org/10.18637/jss.v089.i13

Interpretive Summary: Genealogy is the study of parent-child relationships. Comparative geneticists, computational biologists, and bioinformaticians commonly use genealogy tools to better understand how new traits have arisen over time. Plant breeders use genealogy information to select parents with traits of interest that can be bred to produce crops with better yield or enhanced disease resistance. However, few tools exist for visualizing complex genealogy data. Here we present software for viewing and using complex genealogy data.

Technical Abstract: This paper introduces ggenealogy (Rutter et al. 2015), an R software package under development that provides tools for searching through genealogical data, generating basic statistics on their graphical structures using parent and child connections, and displaying the results. The genealogy can be drawn in relation to variables associated with the nodes, and the shortest path distances between nodes can be determined and displayed. The visualization toolkit can also produce pairwise distance matrices and genealogical diagrams constrained on generation. The tools are being tested on a dataset of milestone soybean cultivars (Hymowitz et al. 1977) as well as on a web-based database of the academic genealogy of mathematicians (Mathematics Genealogy Project). The software package is currently available on the Comprehensive R Archive Network.
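
The shortest path distance mentioned in the Technical Abstract can be illustrated with a minimal sketch. This is not the ggenealogy implementation or its API; it is a generic breadth-first search written in Python, under the assumption that the genealogy is treated as an undirected graph whose edges are parent-child links. The function names (`build_graph`, `shortest_path`) and the node labels (`P1`, `C1`, and so on) are hypothetical and do not come from the soybean or mathematician datasets.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (parent, child) pairs."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes,
    or None if either node is unknown or no path exists."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # Sorting makes the result deterministic when ties exist.
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy genealogy: (parent, child) pairs, not real data.
EDGES = [
    ("P1", "C1"),
    ("P2", "C1"),
    ("C1", "G1"),
    ("P2", "C2"),
    ("C2", "G2"),
]
GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    path = shortest_path(GRAPH, "G1", "G2")
    print(path, "distance:", len(path) - 1)
```

In this sketch the distance between two nodes is the number of parent-child links on the path, so applying `shortest_path` to every pair of nodes would give one way to fill a pairwise distance matrix of the kind the abstract describes.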