Publication : USDA ARS

Publication #361255

Title: Enabling Open-source Data Networks in Public Agricultural Research

Author

	BROUDER, SYLVIE - Purdue University
	EAGLE, ALISON - Environmental Defense
	Fukagawa, Naomi
	MCNAMARA, JOHN - Washington State University
	MURRAY, SETH - Texas A&M University
	Parr, Cynthia - Cyndy
	TREMBLAY, NICOLAS - Agriculture And Agri-Food Canada

Submitted to: Council for Agricultural Science and Technology Issue Paper
Publication Type: Other
Publication Acceptance Date: 3/11/2019
Publication Date: 3/11/2019
Citation: Brouder, S., Eagle, A., Fukagawa, N.K., McNamara, J., Murray, S., Parr, C., Tremblay, N. 2019. Enabling Open-source Data Networks in Public Agricultural Research. Council for Agricultural Science and Technology Issue Paper. QTA2019-1, 20 pp.

Interpretive Summary: The next generation of agricultural problem solving will require big science and forging linkages across data sets and disciplines. Currently, a lack of data sharing and data accessibility is a major barrier to making better decisions in agriculture. The business case for data-sharing infrastructure is that pooling datasets and computational power efficiently extends sparse data resources, facilitates new discovery, yields better answers and decisions, lowers the barrier to entry, and ensures scientific reproducibility so that U.S. production agriculture can compete sustainably. Immediate imperatives for facilitating data sharing to fully realize open access to public agricultural research are the following: (1) development and implementation of best practices for data (workflows and standards) in all future federally funded projects; (2) incentives and mechanisms for making grey and dark data available; (3) coordination among existing and emerging data initiatives, networks, and repositories; and (4) dedicated and sustainable infrastructure (hardware, software, and human resources) to curate, preserve, and add value to data beyond the primary use for which they were collected. To achieve sustained and equitable data access simultaneously, the authors suggest the most promise lies with an as-yet untested business model in which funding agencies pay directly for stewardship in proportion to grant volume. Further, the authors propose four major institutional strategies for agriculture’s pathway forward into data-driven research: bridging gaps, reorienting institutions, leveraging assets, and connecting feedbacks. Teams must bridge expertise gaps through meaningful collaborations between agricultural researchers and data scientists. Institutions will need to reorient to prioritize team science and data sharing over smaller-scale, individual efforts and to infuse an understanding of data sciences into curricula and learning outcomes. Initiatives to leverage assets should focus on surfacing grey/dark data not represented in peer-reviewed publications, including high-value legacy datasets for which time and cost prohibit replication. Finally, for research data to achieve and maintain public value, feedbacks must be connected so that the data are useful and usable for informing the end-user “apps” designed to enhance and secure our current food supply and address environmental and social challenges.

U.S. DEPARTMENT OF AGRICULTURE

Diet, Genomics and Immunology Laboratory: Beltsville, MD