Shapley and Banzhaf Values

ProvSQL computes Shapley values and Banzhaf values – game-theoretic measures from cooperative game theory that quantify the individual contribution of each input tuple to a query result.

Background

Given a Boolean query result whose truth depends on a set of input tuples, the Shapley value of an input tuple t is the average marginal contribution of t over all orderings of the input tuples.

The Banzhaf value is a simpler variant: the average marginal contribution over all subsets of the other inputs (not weighted by ordering).

Both measures are always computed under the probabilistic model [Karmakar et al., 2024], giving expected Shapley/Banzhaf values: each input token contributes according to its probability (set with set_prob, see Probabilities). When no probabilities are set they default to 1, which recovers the standard deterministic Shapley/Banzhaf values.

Computing Shapley Values

shapley takes the provenance token of the query result and the token of the input tuple whose contribution to measure:

SELECT person,
       shapley(provenance(), m.token) AS sv
FROM suspects, witness_mapping m;

To compute Shapley values for all input variables at once (more efficient than calling shapley once per variable), use shapley_all_vars:

SELECT person, value AS witness, sv
FROM suspects,
     shapley_all_vars(provenance()) AS (variable uuid, sv double precision),
     witness_mapping m
WHERE m.token = variable;

Computing Banzhaf Values

banzhaf and banzhaf_all_vars have the same calling conventions as their Shapley counterparts:

SELECT person,
       banzhaf(provenance(), m.token) AS bv
FROM suspects, witness_mapping m;

Computation Notes

Shapley-value computation is generally #P-hard. ProvSQL compiles the provenance circuit to a d-DNNF and evaluates it efficiently. The optional third argument selects the compilation method ('tree-decomposition', 'd4', 'c2d', etc.), with the same semantics as for probability_evaluate.

Choosing Between Shapley and Banzhaf

  • Shapley values satisfy a set of axioms (efficiency, symmetry, dummy, additivity) that uniquely characterise them as a fair measure of individual contribution.

  • Banzhaf values are faster to compute and satisfy a slightly different set of axioms; they are appropriate when the efficiency guarantee is not required.