Research Project: Applied Agricultural Genomics and Bioinformatics Research

Location: Genomics and Bioinformatics Research

Title: Crossword: A data-driven simulation language for the design of genetic-mapping experiments and breeding strategies

Authors:
Korani, Walid - University of Georgia
Vaughn, Justin

Submitted to: bioRxiv
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/17/2018
Publication Date: 3/13/2019
Citation: Korani, W., Vaughn, J.N. 2019. Crossword: A data-driven simulation language for the design of genetic-mapping experiments and breeding strategies. bioRxiv. 9:4386.

Interpretive Summary: Plant and animal breeding are critical to maintaining food yield per unit of land. Generally, breeders design sexual crosses between appropriate parents to create progeny populations that either yield superior individuals, which can lead to new cultivars, or can be used to identify superior genes. A breeding cycle is expensive and laborious and, therefore, should be performed at the optimum scale to achieve a breeding target. To give breeders tools for evaluating the parameters of their experiments, we have developed a simple language called "crossword". crossword takes advantage of ever-expanding genomics data to make simulations of a targeted breeding scenario as rapid and realistic as possible. We present a set of examples of how crossword can be implemented, but, because crossword is a language, these represent only a sliver of the multitude of breeding schemes that can be specified. crossword is fully supported by a user-friendly graphical interface and extensive visual output. Our hope is that these features will make crossword immediately accessible to the researchers and breeders conducting experiments. crossword will also be a useful tool for training the next generation of plant breeders by building their intuition.

Technical Abstract: Quantitative genetic simulations can save time and resources by optimizing the logistics of an experiment. Current tools are difficult for those unfamiliar with programming to use, and these tools rarely address the actual genetic structure of the population under study. Here, we introduce crossword, which utilizes the widely available results of re-sequencing and genomics data to create more realistic simulations and to simplify user input. The software was written in R, making installation and implementation straightforward. Because crossword is a domain-specific language, it allows complex and unique simulations to be performed, but the language is supported by a graphical interface that guides users through functions and options. We first show crossword's utility in QTL-seq design, where its output accurately reflects empirical data. By introducing the concept of levels to reflect family relatedness, crossword is suitable for a broad range of breeding programs and crops. Using levels, we further illustrate crossword's capabilities by examining the effect of family size and number of selfing generations on phenotyping accuracy and genomic selection. Additionally, we explore the ramifications of effect polarity among parents in a mapping cross, a scenario that is common in crop genetics but often difficult to simulate. Given the ease of use and apparent realism, we anticipate crossword will quickly become a "bicycle for the [geneticist's] mind".
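
To give a rough feel for why family size and the number of selfing generations matter in such simulations, the following is a minimal sketch in Python. It is not crossword code and does not use crossword's language or interface; the function names and the example variances are hypothetical. It relies on two textbook approximations: expected heterozygosity halves with each generation of selfing, and the accuracy of a mean over n phenotyped individuals is sqrt(Vg / (Vg + Ve / n)), assuming all n individuals share the same genetic value (as with a replicated inbred line) and have independent environmental errors.

```python
import math


def heterozygosity_after_selfing(generations, initial=1.0):
    """Expected fraction of heterozygous loci after repeated selfing.

    Each generation of selfing halves the expected heterozygosity.
    """
    return initial * 0.5 ** generations


def family_mean_accuracy(genetic_var, error_var, family_size):
    """Correlation between a family's mean phenotype and its genetic value.

    Assumes every member shares the same genetic value and that
    environmental errors are independent (illustrative assumption).
    """
    return math.sqrt(genetic_var / (genetic_var + error_var / family_size))


if __name__ == "__main__":
    # Hypothetical variances chosen only for illustration.
    for gen in range(6):
        print("selfing generations:", gen,
              "expected heterozygosity:", heterozygosity_after_selfing(gen))
    for n in (1, 5, 20):
        print("family size:", n,
              "accuracy:", round(family_mean_accuracy(1.0, 4.0, n), 3))
```

Under these assumptions, accuracy rises with family size but with diminishing returns, which is the kind of trade-off a full simulation would quantify against the actual genetic structure of the population.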