
Simulate backend extractions from a gold-standard set
Source: R/llm_benchmark.R
llm_benchmark_simulate.Rd

Produces a realistic, noisy extraction from a gold-standard data frame by sampling recall, false-positive rate, and label-mutation probability for each backend. Useful for offline reproducibility and for CI builds where real LLM APIs are unreachable.
Usage
llm_benchmark_simulate(
  gold,
  recall = 0.82,
  precision_target = 0.86,
  mutate_rate = 0.05,
  seed = NULL
)
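
To make the roles of the arguments concrete, the following is a minimal base-R sketch of how a simulation of this kind could work. It is not the package's implementation. The helper name `simulate_extraction`, the `id` and `label` columns of `gold`, and the `is_false_positive` flag are all assumptions made for illustration, and the sketch uses the supplied rates directly rather than sampling them per backend.

```r
# Hypothetical sketch, NOT the package's code. Assumes `gold` has a
# character `id` column and a character `label` column.
simulate_extraction <- function(gold, recall = 0.82, precision_target = 0.86,
                                mutate_rate = 0.05, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # makes the noise reproducible

  # Recall: keep each gold row independently with probability `recall`.
  keep <- runif(nrow(gold)) < recall
  kept <- gold[keep, c("id", "label"), drop = FALSE]
  kept$is_false_positive <- rep(FALSE, nrow(kept))

  # Label mutation: swap a fraction of the kept labels for a different one.
  labels <- unique(gold$label)
  flip <- runif(nrow(kept)) < mutate_rate
  if (any(flip) && length(labels) > 1) {
    kept$label[flip] <- vapply(kept$label[flip], function(l) {
      alt <- setdiff(labels, l)
      alt[sample.int(length(alt), 1)]
    }, character(1), USE.NAMES = FALSE)
  }

  # False positives: add enough spurious rows that kept / (kept + spurious)
  # is roughly `precision_target`. Mutated rows still count as kept here.
  n_fp <- round(nrow(kept) * (1 - precision_target) / precision_target)
  fp <- data.frame(
    id = sprintf("fp_%d", seq_len(n_fp)),
    label = sample(labels, n_fp, replace = TRUE),
    is_false_positive = rep(TRUE, n_fp)
  )

  out <- rbind(kept, fp)
  rownames(out) <- NULL
  out
}

# Toy gold standard with 200 rows and four labels (made up for the demo).
gold <- data.frame(
  id = sprintf("g_%03d", 1:200),
  label = rep(c("gene", "drug", "disease", "dose"), 50)
)

sim <- simulate_extraction(gold, seed = 42)
sim_again <- simulate_extraction(gold, seed = 42)
```

Under these assumptions, passing the same `seed` twice yields identical output, which is the property that matters for CI runs; leaving `seed = NULL` draws fresh noise on each call.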