Research Project: Sustainable Intensification of Crop and Integrated Crop-Livestock Systems at Multiple Scales

Location: Pasture Systems & Watershed Management Research

Title: Landscape structural consequences of multivariate clustering algorithms

Goslee, Sarah
Pisarello, Kathryn
Coffin, Alisa

Submitted to: US-International Association for Landscape Ecology
Publication Type: Abstract Only
Publication Acceptance Date: 2/13/2023
Publication Date: 3/20/2023
Citation: Goslee, S.C., Pisarello, K., Coffin, A.W. 2023. Landscape structural consequences of multivariate clustering algorithms [abstract]. US-International Association for Landscape Ecology. P. 1.

Interpretive Summary: No Interpretive Summary is required for this Abstract Only. JLB.

Technical Abstract: All clustering algorithms make assumptions about cluster definitions. For instance, k-means clustering produces spherical clusters of similar size in multivariate space. How do those algorithmic assumptions translate into spatial patterns in real-world examples? Four of USDA's LTAR sites with different environments were used as test cases: Gulf Atlantic Coastal Plain; Upper Chesapeake Bay; Northern Plains; and Walnut Gulch Experimental Watershed. A suite of twenty-one climate variables was grouped into 2-30 clusters (k) using ten non-spatially-explicit methods from five families: agglomerative hierarchical clustering (3 methods); divisive hierarchical clustering (1); partitioning (4); model-based clustering (1); and self-organizing trees (SOTA, 1). We focused on three landscape-level metrics to characterize the structural differences in maps produced by the algorithms: patch density (aggregation); coefficient of variation of patch area (area); and mean perimeter-area ratio (shape). Patch density increased with k. For a given k, SOTA clustering produced a much higher patch density than the other methods, followed by the partitioning methods, with the agglomerative hierarchical and model-based algorithms producing the lowest. Variability in patch area largely stabilized at higher k. Agglomerative clusterings had lower variability in patch area, divisive and partitioning methods had moderate variability, and SOTA had the highest. Mean perimeter-area ratio also stabilized at higher k, with partitioning methods, SOTA, and divisive hierarchical clustering producing more complex patch shapes, and partitioning methods simpler shapes. For all metrics, the differences across sites varied by both region and algorithm. Divisive hierarchical clustering behaved more like partitioning algorithms than like agglomerative hierarchical methods.
While partitioning methods such as k-means produce compact spherical clusters in multivariate space, this did not translate into more compact spatial representations. Even algorithms that resulted in structurally similar landscapes produced very different maps; structural attributes are only one aspect of characterizing and choosing an appropriate algorithm for a particular site and objective.
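
The three landscape-level metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' workflow: the abstract does not say which software was used, and everything here is an assumption made for illustration, including the function names `label_patches` and `landscape_metrics`, the 4-neighbor rule for defining a patch, the use of the population standard deviation for the coefficient of variation, and patch density expressed per unit map area. The input is a small grid of cluster IDs, standing in for a map that some clustering algorithm has already produced.

```python
from collections import deque
from statistics import mean, pstdev


def label_patches(grid):
    """Return (area, perimeter) in cell units for every patch in the grid.

    A patch is a 4-connected group of cells sharing one cluster ID
    (assumption: other tools may default to an 8-neighbor rule).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            cluster_id = grid[r0][c0]
            area, perimeter = 0, 0
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first flood fill of one patch
                r, c = queue.popleft()
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and grid[nr][nc] == cluster_id:
                        if not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                    else:
                        # Edge touches another cluster or the map boundary.
                        perimeter += 1
            patches.append((area, perimeter))
    return patches


def landscape_metrics(grid, cell_size=1.0):
    """Compute patch density, CV of patch area, and mean perimeter-area ratio."""
    patches = label_patches(grid)
    areas = [a * cell_size ** 2 for a, _ in patches]
    perimeters = [p * cell_size for _, p in patches]
    total_area = len(grid) * len(grid[0]) * cell_size ** 2
    return {
        "patch_density": len(patches) / total_area,  # patches per unit area
        "area_cv": pstdev(areas) / mean(areas),      # population SD / mean
        "mean_para": mean(p / a for p, a in zip(perimeters, areas)),
    }


# Two hypothetical 4 x 4 cluster maps, both with k = 2.
compact = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

if __name__ == "__main__":
    print("compact:     ", landscape_metrics(compact))
    print("checkerboard:", landscape_metrics(checkerboard))
```

Both example maps use the same number of clusters, yet the checkerboard has eight times the patch density and a higher mean perimeter-area ratio than the compact map. This is the kind of structural difference the metrics are meant to capture when comparing maps from different algorithms at the same k.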