Research Project: Dynamic, Data-Driven, Sustainable, and Resilient Crop Production Systems for the U.S.

Location: Genetics and Sustainable Agriculture Research

Title: Impact of sampling techniques on crop type mapping using multi-temporal composites from Harmonized Landsat-Sentinel images

Author
AIRES, UILSON - Mississippi State University
MARTINS, VITOR - Mississippi State University
LUCAS, FERREIRA - Mississippi State University
Huang, Yanbo
Heintzman, Lucas
OUYANG, YING - Forest Service (FS)

Submitted to: Computers and Electronics in Agriculture
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/11/2025
Publication Date: 6/20/2025
Citation: Aires, U.R., Martins, V.S., Lucas, F., Huang, Y., Heintzman, L.J., Ouyang, Y. 2025. Impact of sampling techniques on crop type mapping using multi-temporal composites from Harmonized Landsat-Sentinel images. Computers and Electronics in Agriculture. 237(110676):1-19. https://doi.org/10.1016/j.compag.2025.110676.
DOI: https://doi.org/10.1016/j.compag.2025.110676

Interpretive Summary: Selecting the data samples used for model calibration is an important step toward high-quality crop type mapping from satellite imagery. However, researchers are often unsure which sampling strategy to choose when developing their applications. This study investigated how different sampling techniques and sample sizes influence crop type mapping in the complex agricultural landscape of Mississippi, United States. The impact of four sampling techniques (grid, random, stratified, and cluster) and four sample sizes (1%, 0.5%, 0.1%, and 0.05% of the total pixels of the image tile) was evaluated with machine learning classification of six primary crops in the Mississippi Delta region. The results indicated that the optimal sampling approach is to allocate the samples per stratum so that all classes in the crop area are represented, and that the evaluated satellite image-derived temporal composites are effective for crop type mapping.
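
The four sampling techniques named above can be illustrated with a minimal sketch. It is not the study's implementation: the synthetic label raster, the class proportions, the 8 x 8 block size used for cluster sampling, and all function names are assumptions made for illustration, and the study's exact definition of each technique may differ.

```python
import numpy as np

def sample_size(labels, fraction):
    """Number of pixels to draw for a given fraction of the tile (at least 1)."""
    return max(1, int(round(labels.size * fraction)))

def random_sample(labels, n, rng):
    """Simple random sampling: n flat pixel indices drawn without replacement."""
    return rng.choice(labels.size, size=n, replace=False)

def grid_sample(labels, n):
    """Systematic grid sampling: pixels on a regular lattice, roughly n in total."""
    rows, cols = labels.shape
    step = max(1, int(np.sqrt(labels.size / n)))
    rr, cc = np.meshgrid(np.arange(step // 2, rows, step),
                         np.arange(step // 2, cols, step), indexing="ij")
    return np.ravel_multi_index((rr.ravel(), cc.ravel()), labels.shape)

def stratified_sample(labels, n, rng):
    """Stratified sampling: split n across classes in proportion to class area,
    keeping at least one pixel per class so rare classes are never dropped."""
    flat = labels.ravel()
    classes, counts = np.unique(flat, return_counts=True)
    picks = []
    for cls, count in zip(classes, counts):
        k = min(int(count), max(1, int(round(n * count / flat.size))))
        picks.append(rng.choice(np.flatnonzero(flat == cls), size=k, replace=False))
    return np.concatenate(picks)

def cluster_sample(labels, n, rng, block=8):
    """Cluster sampling: choose distinct block x block windows at random and
    keep every pixel inside them (at least one window is always drawn)."""
    rows, cols = labels.shape
    nbr, nbc = rows // block, cols // block
    n_blocks = min(nbr * nbc, max(1, n // (block * block)))
    ids = rng.choice(nbr * nbc, size=n_blocks, replace=False)
    r0, c0 = (ids // nbc) * block, (ids % nbc) * block
    dr, dc = np.meshgrid(np.arange(block), np.arange(block), indexing="ij")
    rr = (r0[:, None, None] + dr).ravel()
    cc = (c0[:, None, None] + dc).ravel()
    return np.ravel_multi_index((rr, cc), labels.shape)

# Synthetic 200 x 200 "tile" with six classes of very unequal area (assumed values).
rng = np.random.default_rng(0)
labels = rng.choice(6, size=(200, 200), p=[0.40, 0.30, 0.15, 0.10, 0.04, 0.01])

n = sample_size(labels, 0.0005)  # the 0.05% sample size from the study
samples = {
    "random": random_sample(labels, n, rng),
    "grid": grid_sample(labels, n),
    "stratified": stratified_sample(labels, n, rng),
    "cluster": cluster_sample(labels, n, rng),
}
for name, idx in samples.items():
    # Pixels drawn per class (index = class id) for each technique.
    print(name, np.bincount(labels.ravel()[idx], minlength=6))
```

At small sample sizes, the per-class counts show how easily a rare class can go unsampled under random or grid sampling, whereas the stratified draw includes every class by construction.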

Technical Abstract: Crop type mapping provides key spatial information for agricultural monitoring and management, crop production, and food security. Satellite multi-spectral imagery and machine learning algorithms have been used for large-area crop type mapping, and one of the relevant steps toward a high-quality mapping result is the selection of data samples for model training. However, researchers are often unsure which sampling strategy to choose when developing their applications, and neglecting this step can lead to high commission and omission errors depending on the class representation. This study investigated how different sampling techniques and sample sizes influence crop type mapping in the complex agricultural landscape of Mississippi, United States. We derived a set of temporal composites from 2022 Harmonized Landsat Sentinel-2 (HLS) images and used the Cropland Data Layer from the USDA National Agricultural Statistics Service as the ground truth to train and validate the classification model. The impact of four sampling techniques (grid, random, stratified, and cluster) and four sample sizes (1%, 0.5%, 0.1%, and 0.05% of the total pixels of the HLS tile) was evaluated with Artificial Neural Network classification of six primary crops in the Mississippi Delta region: corn, cotton, rice, sorghum, soybeans, and wheat. In general, our results showed that the stratified sampling technique achieved the best results for all crop type classes, with average F1-scores ranging from 0.63 to 0.81. Less frequent classes such as sorghum and wheat were not properly represented by the grid and random sampling techniques. Allocating the samples in proportion to the area represented by each class, and increasing the number of samples, improved the model's performance. The highest accuracy was obtained using 0.05% and 1% of the total number of pixels in the image tile as training samples, with average F1-scores of 0.81 and 0.84, respectively. When an equal number of samples per class was used, the results showed significant mapping confusion among the crop type classes. Our findings suggest that the optimal sampling approach is to allocate the samples per stratum so that all classes in the crop area are represented, and that the HLS-derived temporal composites are effective for crop type mapping.
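
The relation between the F1-score and the commission and omission errors mentioned above can be sketched as follows. This is a generic illustration of the standard per-class F1 definition, not the evaluation code used in the study: the function name and the tiny reference and predicted label arrays are hypothetical, and the abstract does not state how per-class scores were averaged, so the unweighted mean shown here is an assumption.

```python
import numpy as np

def per_class_f1(reference, predicted, classes):
    """Per-class F1 from reference labels and predicted labels.

    For each class:
      commission errors = pixels predicted as the class that belong elsewhere (false positives)
      omission errors   = pixels of the class that were predicted as something else (false negatives)
      F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    """
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    scores = {}
    for cls in classes:
        tp = np.sum((predicted == cls) & (reference == cls))
        fp = np.sum((predicted == cls) & (reference != cls))  # commission
        fn = np.sum((predicted != cls) & (reference == cls))  # omission
        denom = 2 * tp + fp + fn
        scores[cls] = float(2 * tp / denom) if denom else 0.0
    return scores

# Hypothetical six-pixel example, for illustration only.
reference = ["corn", "corn", "corn", "rice", "rice", "wheat"]
predicted = ["corn", "corn", "rice", "rice", "rice", "corn"]

scores = per_class_f1(reference, predicted, ["corn", "rice", "wheat"])
mean_f1 = sum(scores.values()) / len(scores)  # unweighted mean over classes (assumed)
print(scores, round(mean_f1, 3))
```

In this toy case the single wheat pixel is never predicted as wheat, so its F1 is zero and it pulls the unweighted mean down sharply, which is the kind of effect an under-sampled rare class can have on an averaged score.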