
Align Knowledge-Graph node labels to a canonical vocabulary
Source: R/causal_ontology.R
causal_kg_alignment.Rd
Computes the mapping from node labels currently present in an
edaphos_causal_kg onto their canonical counterparts in a target
vocabulary (causal_ontology_cerrado() by default). Three matchers
are tried in order (exact, substring, fuzzy) and the first hit
wins. The mapping is returned as a tidy data frame; apply it with
causal_kg_rename().
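The tiered "first hit wins" behaviour can be illustrated with a minimal base-R sketch. This is not the package's implementation: align_labels() and the output column names (label, canonical, type) are hypothetical, and the choices of case-insensitive comparison and two-way substring matching are assumptions made for the example.

```r
# Hypothetical helper, not part of the package: align labels to a vocabulary
# by trying exact, then substring, then fuzzy matching; the first hit wins.
align_labels <- function(labels, vocab,
                         method = c("exact", "substring", "fuzzy"),
                         max_distance = 4L) {
  method <- match.arg(method, several.ok = TRUE)
  voc <- tolower(vocab)  # assumption: matching is case-insensitive

  match_one <- function(label) {
    key <- tolower(label)
    if ("exact" %in% method) {
      hit <- match(key, voc)
      if (!is.na(hit)) return(c(vocab[hit], "exact"))
    }
    if ("substring" %in% method) {
      # Assumption: a hit is a term contained in the label, or the reverse.
      contains <- vapply(voc, function(v) {
        grepl(v, key, fixed = TRUE) || grepl(key, v, fixed = TRUE)
      }, logical(1))
      hit <- unname(which(contains))
      if (length(hit) > 0) return(c(vocab[hit[1]], "substring"))
    }
    if ("fuzzy" %in% method) {
      # Levenshtein distance from the label to every vocabulary term.
      d <- as.vector(utils::adist(key, voc))
      best <- which.min(d)
      if (d[best] <= max_distance) return(c(vocab[best], "fuzzy"))
    }
    c(NA_character_, NA_character_)  # no tier produced a match
  }

  res <- t(vapply(labels, match_one, character(2)))
  data.frame(label = labels,
             canonical = unname(res[, 1]),
             type = unname(res[, 2]),
             row.names = NULL, stringsAsFactors = FALSE)
}

vocab <- c("soil organic carbon", "clay content", "precipitation")
labels <- c("Clay content", "topsoil clay content", "precipitaton", "land use")
aligned <- align_labels(labels, vocab)
print(aligned)
```

In this sketch a label that no tier can match keeps NA in both the canonical and type columns, so unmatched nodes stay visible in the result rather than being dropped.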
Usage
causal_kg_alignment(
  kg,
  vocab = NULL,
  method = c("exact", "substring", "fuzzy"),
  max_distance = 4L,
  agrovoc_cache = NULL,
  agrovoc_batch = FALSE,
  agrovoc_max_active = 5L
)
Arguments
- kg
An edaphos_causal_kg.
- vocab
Either a character vector / data frame of canonical terms, the string "cerrado" (default; uses causal_ontology_cerrado()), or the string "agrovoc", which triggers a live SPARQL query to the FAO AGROVOC endpoint via causal_ontology_agrovoc_align(). For AGROVOC the alignment type is reported as "agrovoc" instead of exact / substring / fuzzy.
- method
Which matcher tier(s) to enable (ignored when vocab = "agrovoc"). Any combination of "exact", "substring", "fuzzy".
- max_distance
Maximum Levenshtein distance accepted by the fuzzy matcher.
- agrovoc_cache
Optional .rds path used by vocab = "agrovoc" to avoid re-querying the same terms.
- agrovoc_batch
Logical. When TRUE and vocab = "agrovoc", uses the parallel-dispatch variant causal_ontology_agrovoc_align_batch() to resolve all nodes in flight. Recommended for KGs with more than ~20 unique nodes.
- agrovoc_max_active
Integer. Concurrency for the parallel variant. Only consulted when agrovoc_batch = TRUE.
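The idea behind an .rds cache such as agrovoc_cache can be sketched in base R. The package's actual cache layout is not documented here, so everything below is an assumption made for illustration: cached_lookup() and fake_resolver() are hypothetical names, and the cache is taken to be a named character vector keyed by term. The resolver stands in for a remote query and is only called for terms the cache does not already hold.

```r
# Hypothetical helper: resolve terms through `resolver`, reusing answers
# stored in an .rds file so the same term is never resolved twice.
cached_lookup <- function(terms, cache_path, resolver) {
  # Assumed cache layout: a named character vector (term -> resolved value).
  cache <- if (file.exists(cache_path)) readRDS(cache_path) else character(0)
  missing <- setdiff(unique(terms), names(cache))
  if (length(missing) > 0) {
    # vapply() names its result by the input terms, so `fresh` is keyed too.
    fresh <- vapply(missing, resolver, character(1))
    cache <- c(cache, fresh)
    saveRDS(cache, cache_path)
  }
  unname(cache[terms])
}

# Stand-in for a remote query; counts how often it is actually invoked.
calls <- 0L
fake_resolver <- function(term) {
  calls <<- calls + 1L
  toupper(term)
}

path <- tempfile(fileext = ".rds")
first <- cached_lookup(c("soil", "clay", "soil"), path, fake_resolver)
calls_after_first <- calls
second <- cached_lookup(c("clay", "soil"), path, fake_resolver)
calls_after_second <- calls
```

The second call is served entirely from the file, which is the saving a cache path is meant to provide when an alignment is re-run against a remote endpoint.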