
Pairwise Cohen's kappa between backends on edge presence
Source:R/llm_benchmark.R
llm_benchmark_kappa.RdComputes agreement (Cohen's kappa) for every pair of backends on the union of edges seen across them (and optionally gold) per abstract. For each backend pair (A, B), treats every (abstract, edge) as a binary rating: "did backend X extract this edge?".