Location: Cotton Ginning Research
Title: The agricultural contamination elements (ACE) dataset: Multi class annotated images
Authors: Donohoe, Sean; Alege, Femi; Delhom, Christopher; HOUSTON, WILLIAM - Texas A&M University
Submitted to: Data in Brief
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/25/2026
Publication Date: N/A
Citation: N/A
Interpretive Summary: Plastic found in cotton bales has cost United States producers more than $750 million per year. One source of this plastic is trash in cotton fields. To address this, a new agricultural contamination elements (ACE) dataset was developed to train vision systems. The dataset includes four types of elements, or objects: bags, bottles, cans, and general trash. The bag type includes plastic bags and any thin sheets. The bottle and can types include only drink containers. General trash includes any element not covered by the other types. An unmanned aerial system (UAS) captured all the images. The UAS took images of cotton fields in Mississippi at random heights and speeds in 2021, 2022, and 2023. Researchers placed the elements in the fields just before imaging and removed them right after each flight. The items serve as general examples of trash types and are not necessarily the most common examples of any type. The dataset includes over 21,500 images. Each image has an accompanying file that describes the type and location of the objects in it. This dataset provides a basis for detecting contamination in cotton fields by enabling machine vision systems. Detection is the first step toward contamination removal, which will prevent economic harm to U.S. cotton producers.
Technical Abstract: This work introduces the new agricultural contamination elements (ACE) dataset, which comprises annotated images representing four classes of elements: bag, bottle, can, and trash. The bag annotation tag includes plastic bags and thin plastic sheet material. The trash annotation is a general-purpose tag for anything that does not fit into one of the other categories and that is not cotton. All the annotations included in the dataset are of the bounding box type.
An unmanned aerial system (UAS) captured the images used for the annotations. The data capture included random heights and speeds to allow for more variation in the dataset. The current dataset includes images of cotton fields in Mississippi from three growing seasons (2021, 2022, and 2023). Researchers randomly placed the contamination elements in the cotton field before imaging and removed them from the field immediately after. The elements used within each class were random, based on what was available. The items serve as general examples of trash types and are not necessarily the most common examples of any type. In addition to spanning three years, the data also represents different stages of the growing season. The data collected in 2021 contained the most variation in growing stages. The focus of the 2022 and 2023 data is defoliated cotton plants just before harvest. There are over 21,500 box annotations in the dataset, with 2021 accounting for 59%, 2022 for 12%, and 2023 for 29%. The full-size images from the UAS were either still images taken at 16 megapixels (MP) or frames from 4K video. In either case, the full-size images were pre-processed and broken into square tiles of 720 × 720 pixels with no overlap in height but some overlap in width. The extensible markup language (XML) files contain all the annotations, and each shares the same name as its associated image. Folders separate the images and annotations by year to facilitate future studies that incorporate temporal aspects, such as testing performance across years. The dataset can be used to train new vision systems, including object detection models, or to benchmark existing systems using the data as ground truth.
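The tiling described above can be illustrated with a short sketch. It is not the authors' pre-processing code. It assumes a 4K frame of 3840 x 2160 pixels (the abstract says only "4K") and assumes that any overlap is spread evenly between neighboring tiles; the function names `tile_starts` and `tile_boxes` are hypothetical. Under those assumptions, a height of 2160 divides into exactly three 720-pixel rows with no overlap, while a width of 3840 needs six columns that overlap slightly, which is consistent with the behavior the abstract reports.

```python
import math


def tile_starts(length: int, tile: int = 720) -> list[int]:
    """Return start offsets of tiles that cover `length` pixels.

    Assumption (not from the source): the first and last tiles sit flush
    with the image edges and any overlap is spread evenly between tiles.
    """
    if length < tile:
        raise ValueError("image dimension is smaller than the tile size")
    n = math.ceil(length / tile)  # fewest tiles that cover the dimension
    if n == 1:
        return [0]
    # Integer arithmetic keeps offsets exact; the last offset is length - tile.
    return [(i * (length - tile)) // (n - 1) for i in range(n)]


def tile_boxes(width: int, height: int, tile: int = 720) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) crop boxes in row-major order."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_starts(height, tile)
        for x in tile_starts(width, tile)
    ]


if __name__ == "__main__":
    # Hypothetical 4K UHD frame; the abstract does not state the exact size.
    print(tile_starts(2160))  # rows: exact fit, no overlap
    print(tile_starts(3840))  # columns: six tiles with some overlap
    print(len(tile_boxes(3840, 2160)))
```

Each returned box can be passed to an image library's crop routine to produce the 720 x 720 tiles. A bounding box annotated on the full-size image would also need its coordinates shifted by the tile's left and top offsets to be expressed in tile coordinates.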
