
Unified cross-dataset benchmark across SiBCS / WRB / USDA
Source: R/benchmark-unified.R
benchmark_unified.Rd
Runs a system's soilKey classifier on every dataset that has reference labels for that system, then pools the results into a single nation-/world-wide accuracy estimate.
Arguments
- systems
Character vector. Any subset of c("wrb2022", "sibcs", "usda"). Default "all" runs all three.
- datasets
Character vector. Any subset of c("bdsolos", "febr", "kssl", "lucas_esdb"). Default "all" pools every dataset that has reference labels for the requested systems. Datasets without reference labels for a system are silently excluded from that system's pooled result.
- paths
Named list of dataset paths. Element names should match those in datasets. If NULL (default), soilKey looks for canonical paths under "~/soil_data/".
- max_n_per_dataset
Optional integer to cap per-dataset sample size (useful for development / debugging). NULL (default) classifies every available pedon.
- engine
Currently forwarded to Phase-1 aqp wiring. One of "soilkey" (default), "aqp", "both". When "aqp", sets options(soilKey.diagnostic_engine = "aqp") for the duration of the benchmark, which routes argic() / cambic() through the canonical aqp::getArgillicBounds / getCambicBounds.
- harmonize
If TRUE (default FALSE), applies harmonize_to_gsm to each dataset's pedons before classification, putting all chemistry/texture on the GSM depth grid (0-5 / 5-15 / 15-30 / 30-60 / 60-100 / 100-200 cm). Required for cross-dataset pooling integrity (Phase 2.3) but slow (~1-2 min for 1k pedons) and may degrade per-dataset accuracy slightly because the splined depths are approximations.
- verbose
If TRUE (default), emits cli progress.
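As a rough illustration of how the arguments above fit together, the sketch below builds an argument list for a small development run and only calls benchmark_unified() when soilKey is installed and the data are present. The two paths are hypothetical placeholders (the canonical layout under "~/soil_data/" is not spelled out here), and the cap of 200 pedons is an arbitrary choice for illustration.

```r
# Argument list for a small SiBCS-only development run.
# The paths are ASSUMED placeholders; point them at your own copies.
bench_args <- list(
  systems = "sibcs",
  datasets = c("bdsolos", "febr"),
  paths = list(
    bdsolos = "~/soil_data/bdsolos",  # hypothetical location
    febr    = "~/soil_data/febr"      # hypothetical location
  ),
  max_n_per_dataset = 200L,  # arbitrary cap, for quick debugging runs
  engine = "soilkey",
  harmonize = FALSE,
  verbose = TRUE
)

# Run only when the package is installed and the assumed paths exist.
paths_ok <- all(file.exists(path.expand(unlist(bench_args$paths))))
if (requireNamespace("soilKey", quietly = TRUE) && paths_ok) {
  res <- do.call(soilKey::benchmark_unified, bench_args)
  str(res$per_system, max.level = 2)  # pooled results per system
  print(res$coverage)                 # sample sizes and label coverage
}
```

Keeping the names of paths identical to the entries of datasets follows the requirement stated for the paths argument.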
Value
A list with elements:
- per_system: per-system pooled list(accuracy, n_compared, n_correct, confusion, per_class).
- per_system_per_dataset: per-(system, dataset) results of the same shape, for breakdown.
- coverage: per-(system, dataset) sample sizes and label coverage.
- config: captures systems, datasets, engine, soilKey_version, timestamp.
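To make the shape of one result entry concrete, here is a minimal base-R sketch that derives accuracy, n_compared, n_correct, confusion and per_class from a pair of label vectors. The helper score_labels() and the toy labels are hypothetical, and the columns of per_class are an assumption; the actual soilKey internals may compute or lay these out differently.

```r
# Toy reference and predicted labels (illustrative only, not benchmark data).
ref  <- c("Argissolos", "Latossolos", "Latossolos", "Cambissolos", "Argissolos")
pred <- c("Argissolos", "Latossolos", "Cambissolos", "Cambissolos", "Latossolos")

# Hypothetical helper: score predictions against reference labels.
score_labels <- function(ref, pred) {
  keep <- !is.na(ref) & !is.na(pred)   # compare only fully labelled pairs
  ref <- ref[keep]
  pred <- pred[keep]
  classes <- sort(union(ref, pred))
  confusion <- table(
    reference = factor(ref, levels = classes),
    predicted = factor(pred, levels = classes)
  )
  n_compared <- length(ref)
  n_correct <- sum(ref == pred)
  per_class <- data.frame(
    class = classes,
    n_ref = as.integer(rowSums(confusion)),
    n_correct = as.integer(diag(confusion))
  )
  # Share of each reference class that was predicted correctly.
  per_class$recall <- per_class$n_correct / per_class$n_ref
  list(
    accuracy = n_correct / n_compared,
    n_compared = n_compared,
    n_correct = n_correct,
    confusion = confusion,
    per_class = per_class
  )
}

scored <- score_labels(ref, pred)
scored$accuracy
scored$confusion
```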
Datasets and their reference labels
| Dataset | Systems with reference labels |
| --- | --- |
| BDsolos | SiBCS (dense), WRB (sparse), USDA (sparse) |
| FEBR superconjunto | SiBCS, WRB, USDA (most rows have all 3) |
| KSSL+NASIS | USDA only (samp_taxsubgrp universal) |
| LUCAS + ESDB raster | WRB (via lookup_esdb on coords) |
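The table above can be read as a lookup from dataset to labelled systems, which is what determines which datasets contribute to each system's pooled result. The sketch below encodes it in base R; label_coverage and datasets_for() are hypothetical names, and mapping the table rows onto the identifiers accepted by the datasets argument is an assumption.

```r
# Dataset -> systems with reference labels, transcribed from the table.
# The identifier mapping (e.g. "KSSL+NASIS" -> "kssl") is ASSUMED.
label_coverage <- list(
  bdsolos    = c("sibcs", "wrb2022", "usda"),
  febr       = c("sibcs", "wrb2022", "usda"),
  kssl       = "usda",
  lucas_esdb = "wrb2022"
)

# Hypothetical helper: keep only datasets that carry labels for `system`.
datasets_for <- function(system, datasets = names(label_coverage)) {
  has_labels <- vapply(
    label_coverage[datasets],
    function(systems) system %in% systems,
    logical(1)
  )
  datasets[has_labels]
}

datasets_for("sibcs")
datasets_for("usda", datasets = c("kssl", "lucas_esdb"))
```

In the second call, lucas_esdb is dropped without an error, mirroring the silent exclusion described for the datasets argument.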
For each (system, dataset) pair, this function:
1. Loads pedons via the appropriate load_* helper.
2. Filters to pedons with a populated reference label for the requested system.
3. Normalises both reference and predicted labels via normalise_febr_*() / KSSL canonicalisation helpers.
4. Calls the system's classifier and records pred-vs-ref.
Then pools per-system results across datasets.
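The pooling rule itself is not spelled out here. One plausible reading, sketched below, is that counts are summed across datasets so that every compared pedon carries equal weight; this is an assumption, and pool_counts() and the counts used are made up for illustration.

```r
# Made-up per-dataset counts for a single system (illustrative only).
per_dataset <- data.frame(
  dataset    = c("bdsolos", "febr"),
  n_compared = c(400L, 100L),
  n_correct  = c(300L, 60L)
)

# Hypothetical helper: pool by summing counts (a micro-average).
pool_counts <- function(x) {
  n_compared <- sum(x$n_compared)
  n_correct <- sum(x$n_correct)
  list(
    accuracy = if (n_compared > 0) n_correct / n_compared else NA_real_,
    n_compared = n_compared,
    n_correct = n_correct
  )
}

pooled <- pool_counts(per_dataset)
pooled$accuracy                                       # 360 / 500 = 0.72
mean(per_dataset$n_correct / per_dataset$n_compared)  # unweighted mean = 0.675
```

The two figures differ because the unweighted mean lets a small dataset count as much as a large one, which is why the per-(system, dataset) breakdown is worth reading alongside the pooled value.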
Engine selection (Phase 1 wiring)
For datasets with morphological data (BDsolos / FEBR), the diagnostics that pivot Argissolos / Latossolos / Cambissolos classification can be run with two engines:
- engine = "soilkey" (default): the hand-coded WRB 6/1.4/20 thresholds.
- engine = "aqp": aqp::getArgillicBounds / getCambicBounds (KST 13ed 3/1.2/8 thresholds).
On the v0.9.62 RJ benchmark (722 profiles), aqp was 14.8 percentage
points (pp) stricter on argic and 40.6 pp more permissive on cambic;
the SiBCS Argissolos / Latossolos / Cambissolos boundary is sensitive
to both. The engine argument is currently forwarded in anticipation of
the wired argic() / cambic() planned for v0.9.63; for now,
benchmark_unified() reports results separately per engine when
engine = "both".
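The engine argument is described as setting options(soilKey.diagnostic_engine = ...) only for the duration of the benchmark. A common base-R way to scope an option like that is options() paired with on.exit(), sketched below. with_engine() is a hypothetical helper, not part of soilKey, and the real implementation may differ.

```r
# Hypothetical helper: evaluate `code` with the engine option set,
# restoring the previous value afterwards, even if `code` errors.
with_engine <- function(engine = c("soilkey", "aqp"), code) {
  engine <- match.arg(engine)
  old <- options(soilKey.diagnostic_engine = engine)
  on.exit(options(old), add = TRUE)
  force(code)  # lazy argument: evaluated here, while the option is set
}

before <- getOption("soilKey.diagnostic_engine")
inside <- with_engine("aqp", getOption("soilKey.diagnostic_engine"))
after  <- getOption("soilKey.diagnostic_engine")

# One way engine = "both" could be emulated: one pass per engine.
engines_seen <- vapply(
  c("soilkey", "aqp"),
  function(e) with_engine(e, getOption("soilKey.diagnostic_engine")),
  character(1)
)
```

Because the option is restored on exit, a run with engine = "aqp" should not leak into later classifier calls in the same session.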