Location: Genomics and Bioinformatics Research
2020 Annual Report
1. Advance and accelerate translational research for ARS and its collaborators that addresses the agricultural needs of primarily the Southeast region, through data generation, data analysis, and data management, with an emphasis on genomic approaches and on crop, animal, insect, and microbiome analyses; support germplasm analysis for breeding and for trait genetic and molecular analyses; and support gene expression analysis and gene discovery.
1.A. A cross section of GBRU operations in genomics and bioinformatics.
1.B. Specific ongoing collaborative projects.
1.C. Data Management.
2. Accelerate ARS bioinformatics community development and capacity building, primarily for the Southeast region, through training workshops, webinars, and direct project participation; develop and evaluate new tools, workflows, and systems that enable ARS and its collaborators to more efficiently manage, analyze, and share diverse streams of biological data and knowledge, including high throughput genotyping and phenotyping, thereby enhancing crop and animal genetic improvement, health, and nutrition.
2.A. Bioinformatics community development and capacity building.
2.B. Development of new tools and procedures.
The primary function of the Genomics and Bioinformatics Research Unit (GBRU) is conducting research in bioinformatics and genomics on a wide array of species and topics. Genomic technologies are powerful tools for germplasm improvement using marker-assisted selection (MAS), biotechnology, or synthetic biology, and for analyzing associated biological processes (genetics, physiology, cell and molecular biology, biochemistry, and evolutionary biology). Thus, many ARS scientists, e.g., crop and animal breeders, have a direct need for genomic tools in their research. Others, e.g., soil scientists, can enhance their research dramatically by using genomic tools to analyze the microbiome, if the technologies and appropriate expertise are available. However, not all ARS locations have sufficient resources to support core genomic technologies. Thus, the mission of the GBRU is to: (1) coordinate, facilitate, collaborate on, and conduct genomics and bioinformatics research emphasizing the Southeast region; (2) serve as a research and training resource for genomic technologies and bioinformatic analyses in support of ARS scientists and their collaborations; and (3) serve as a technical resource for ARS research programs that have not typically utilized these technologies, and aid in their development of genomic resources. Within the GBRU, this research project will conduct and collaborate on genome sequencing, sequence assembly and analysis, diversity analysis, marker development, haplotyping, physical and genetic map production, and transcription profiling research. Essential products therefore include new and improved reference genomes for plants, animals, insects, fish, and microbes that enable genomics-assisted breeding; new physical and genetic maps; improved cultivars, germplasm, or breeding lines; and new information on key agricultural problems such as disease resistance and drought tolerance.
Service and research efforts continued during the past year in all areas of genomics and bioinformatics, with some impacts from COVID-19 because fewer personnel were available to run the service lab component. The unit has contributed to several genomes, including yellow perch, blueberry, St. Augustinegrass, fescue grass, pecan, several peanut species, numerous insect pests, and fungi. The unit has helped lead multiple aspects of the Ag100Pest project, generating high-quality genomes for important insects, including the murder hornet and other pests of immediate interest. Members of the unit have served on multiple working groups in the Ag100Pest project, including the “Assembly Team,” which has worked with SCINet to advance capabilities and training materials for standard operating procedures for insect genome assembly; these procedures apply largely across species. These tools and the insect genomes will be used by the scientific community to control and monitor pests in the field. A four-year project to characterize 167 lines of U.S. rice germplasm was completed and submitted for publication. In addition to data releases on Ricebase, an open-source tool for exploring this wealth of information, HaploStrata, was released. This project developed a high-quality reference genome sequence for Carolina Gold, a historically important founding U.S. variety. This work illuminated the nature of deleterious alleles in crop breeding, with specific details about the genomic flux of U.S. rice through the century. The unit has developed genomes for most cultivated peanut types and a large number of important peanut species that have been used in historical (and possibly future) breeding contexts. This effort should dramatically enhance the ability to understand and incorporate unexplored germplasm into U.S. breeding materials.
These genome assemblies will be used to build a pangenome that promises to be more stable and, ultimately, more useful in applying genomics to breeding efforts. In addition, a tool has been developed to detect spontaneous gene conversion events, which appear to happen at relatively high rates in peanut. The unit enhanced SCINet resource components. The public-facing website for SCINet/CERES was designed and released in cooperation with the Scientific Advisory Team. User guides and case studies were developed on genetic mapping and data analysis using the Jupyter/RStudio/JupyterLab framework. Additional training materials for assembly were developed with the Ag100Pest group. The unit taught or assisted with multiple Data Carpentry courses focusing on Unix, Git, R, and Python. The Breeding Insight On-Ramp program, led by the unit, was initiated in the Southeast area. This program will focus on preparing commodities to use database applications for field analyses, archive historical data, develop trait ontologies, and initiate data collection suitable for advanced genomic analyses. The commodities currently supported in the program are blueberry and sugarcane, with multiple other commodities in discussions to enter the program. A method was developed to determine the sex of chicken eggs by sensing their odor with rapid mass spectrometry and applying machine learning. A provisional patent application was filed and, with funding from the Foundation for Food and Agriculture Research, work has begun to screen large numbers of eggs and develop increasingly accurate machine learning methods for sex determination. In ovo sex determination would prevent the culling of millions of male chicks and reduce costs in the poultry industry.
The application of machine learning to high-speed volatile compound detection is a novel, platform-enabling computational method with applications in many areas of agriculture. The unit is evaluating it for real-time aflatoxin detection, dairy cow mastitis detection, diagnosis of bovine respiratory disease complex, coffee breeding, and citrus greening.
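As a purely illustrative sketch of the general idea (classifying samples from volatile compound intensity profiles), the following Python example trains a nearest-centroid classifier on synthetic spectra. It is not the unit's method: the data are simulated, the elevated channel for one class is an invented assumption, and the function names (make_spectrum, fit, predict) are hypothetical.

```python
import random


def make_spectrum(label, rng, n_channels=20):
    """Simulate one intensity vector (synthetic data, not real measurements).

    Assumption for illustration only: class 1 has a stronger signal
    in channel 3; every other channel is noise around a shared baseline.
    """
    spectrum = [rng.gauss(1.0, 0.1) for _ in range(n_channels)]
    if label == 1:
        spectrum[3] += 0.6
    return spectrum


def centroid(rows):
    """Channel-wise mean of a list of equal-length spectra."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit(spectra, labels):
    """Nearest-centroid model: one mean spectrum per class label."""
    return {
        c: centroid([s for s, y in zip(spectra, labels) if y == c])
        for c in set(labels)
    }


def predict(model, spectrum):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda c: sqdist(model[c], spectrum))


rng = random.Random(0)  # fixed seed so the run is repeatable
labels = [i % 2 for i in range(200)]
spectra = [make_spectrum(y, rng) for y in labels]

# Train on the first 150 samples, evaluate on the remaining 50.
model = fit(spectra[:150], labels[:150])
predictions = [predict(model, s) for s in spectra[150:]]
accuracy = sum(p == y for p, y in zip(predictions, labels[150:])) / 50
print(f"held-out accuracy: {accuracy:.2f}")
```

Real mass spectrometry data would likely need preprocessing (baseline correction, normalization) and a more capable model; the held-out split here only illustrates why accuracy should be measured on samples not used for training.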
1. Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Cotton research collaborations have led to the availability of high-quality reference genomes from five allotetraploid species, including both polyploid Pima and Upland species. Comparisons of the genomes of these species have validated prior research indicating very minimal diversity in cottons. This information will help ARS researchers in Stoneville, Mississippi, investigate the small number of differences that are likely to be extremely important for the observed phenotypic differences among the species. Continued collaboration will focus on elucidating available diversity in cultivated types and investigating these limited differences between species to understand their functional consequences.
2. Under-the-radar dengue virus infections in natural populations of Aedes aegypti mosquitoes. Metagenomics has helped to identify dengue virus in Florida prior to any human infection. ARS researchers in Stoneville, Mississippi, have demonstrated the ability to monitor vector-borne diseases ahead of outbreaks by metagenomics. To date, the U.S. public health system’s response to outbreaks has been largely reactive, but this research shows that by monitoring mosquito populations it may be possible to identify emerging mosquito-borne diseases in high-risk, high-tourism areas of the United States and so enable proactive, targeted vector control before potential outbreaks.
Ulloa, M., De Santiago, L., Hulse-Kemp, A.M., Stelly, D.M., Burke, J.J. 2019. Enhancing upland cotton for drought resilience, productivity and fiber quality: Comparative evaluations and genetic dissection. Molecular Genetics and Genomics. 295(1):155-176. https://doi.org/10.1007/s00438-019-01611-6.
Kandel, S.L., Hulse-Kemp, A.M., Stoffel, K., Koike, S.T., Shi, A., Mou, B., Van Deynze, A., Klosterman, S.J. 2020. Transcriptional analyses of differential cultivars during resistant and susceptible interactions with Peronospora effusa, the causal agent of spinach downy mildew. Scientific Reports. 10:6719. https://doi.org/10.1038/s41598-020-63668-3.
Chen, Z., Sreedasyam, A., Ando, A., Song, Q., De Santiago, L., Hulse-Kemp, A.M., Ding, M., Ye, W., Kirkbride, R., Jenkins, J., Plott, C., Lovell, J., Yu-Ming, L., Vaughn, R., Liu, B., Simpson, S.A., Scheffler, B.E. 2020. Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nature Genetics. https://doi.org/10.1038/s41588-020-0614-5.
Peters, D.C., Rivers, A.R., Hatfield, J.L., Lemay, D.G., Liu, S.Y., Basso, B. 2020. Harnessing AI to transform agriculture and inform agricultural research. IEEE IT Professional. 22(3):16-21. https://doi.org/10.1109/MITP.2020.2986124.
Sudduth, K.A., Woodward Greene, M.J., Penning, B., Locke, M.A., Rivers, A.R., Veum, K.S. 2020. AI down on the farm. IEEE IT Professional. 22(3):22-26. https://doi.org/10.1109/MITP.2020.2986104.
Boyles, S., Mavian, C., Finol, E., Ukhanova, M., Stephenson, C., Hamerlinck, G., Kang, S., Baumgartner, C., Geesey, M., Rivers, A.R. 2020. Under-the-radar dengue virus infections in natural populations of Aedes aegypti mosquitoes. mSphere. 5(2):e00316-20. https://doi.org/10.1128/mSphere.00316-20.
Molin, W.T., Kronfol, R.R., Ray, J.D., Scheffler, B.E., Bryson, C.T. 2019. Genetic diversity among geographically separated Cyperus rotundus accessions based on RAPD markers and morphological characteristics. American Journal of Plant Sciences. 10:2034-2046.
Kingan, S.B., Urban, J., Lambert, C.C., Baybayan, P., Childers, A.K., Coates, B.S., Scheffler, B.E., Hackett, K.J., Korlach, J., Geib, S.M. 2019. A high-quality genome assembly from a single, field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System. GigaScience. 8(10):1-10. https://doi.org/10.1093/gigascience/giz122.
Grana, E., Diaz-Tielas, C., Sanchez-Moreiras, A.M., Reigosa, M.J., Celeirao, M., Abagyan, R., Teijeira, M., Duke, M.V., Clerk, T., Pan, Z., Duke, S.O. 2019. Transcriptome and binding data indicate that citral inhibits single strand DNA-binding proteins. Physiologia Plantarum. 169(1):99-109. https://doi.org/10.1111/ppl.13055.
Valles, S.M., Rivers, A.R. 2019. Nine new RNA viruses associated with the fire ant Solenopsis invicta from its native range. Virus Genes. https://doi.org/10.1007/s11262-019-01652-4.
Bolyen, E., Rideout, J., Dillon, M., Bokulich, N., Abnet, C., Al-Ghalith, G., Alexander, H., Alm, E., Arumugam, M., Asnicar, F., Rivers, A.R. 2019. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. Nature Biotechnology. https://doi.org/10.1038/s41587-019-0209-9.
Martinez-Castillo, J., Arias De Ares, R.S., Andueza-Noh, R.H., Ortiz-Garcia, M.M., Irish, B.M., Scheffler, B.E. 2019. Microsatellite markers in Spanish lime (Melicoccus bijugatus Jacq., Sapindaceae), a neglected Neotropical fruit crop. Genetic Resources and Crop Evolution. 66(7):1371-1377. https://doi.org/10.1007/s10722-019-00815-4.
Korani, W., Vaughn, J.N. 2019. Crossword: A data-driven simulation language for the design of genetic-mapping experiments and breeding strategies. bioRxiv. 9:4386. http://dx.doi.org/10.1101/330563.
Stewart-Brown, B.B., Song, Q., Vaughn, J., Li, Z. 2019. Genomic selection for yield and seed composition traits within an applied soybean breeding program. G3, Genes/Genomes/Genetics. https://doi.org/10.1534/g3.118.200917.
Diaz-Tielas, C., Grana, E., Sanchez-Moreiras, A.M., Reigosa, M.J., Vaughn, J.N., Pan, Z., Bajsa Hirschel, J.N., Duke, S.O. 2019. Transcriptome responses to the phytotoxin t-Chalcone in Arabidopsis thaliana L. Pest Management Science. https://doi.org/10.1002/ps.5405.
Arias De Ares, R.S., Ballard, L.L., Duke, M.V., Simpson, S.A., Liu, X.F., Orner, V.A., Sobolev, V., Scheffler, B.E., Martinez-Castillo, J. 2020. Development of nuclear microsatellite markers to facilitate germplasm conservation and population genetics studies of five groups of tropical perennial plants with edible fruits and shoots: rambutan (Nephelium lappaceum). Genetic Resources and Crop Evolution. https://doi.org/10.1007/s10722-020-00965-w.