![]() |
ProvSQL SQL API
Adding support for provenance and uncertainty management to PostgreSQL databases
|
Functions for computing probabilities, expected values, and game-theoretic contribution measures (Shapley/Banzhaf values) from provenance circuits. More...
Functions | |
| DOUBLE PRECISION | provsql.probability_evaluate (UUID token, TEXT method=NULL, TEXT arguments=NULL) |
| Compute the probability of a provenance token. | |
| DOUBLE PRECISION | provsql.expected (ANYELEMENT input, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL) |
| Compute the expected value of a probabilistic scalar. | |
| DOUBLE PRECISION | provsql.rv_moment (UUID token, INTEGER k, BOOLEAN central, UUID prov=gate_one()) |
| Internal: shared C entry point for variance / moment / central_moment. | |
| DOUBLE PRECISION | provsql.agg_raw_moment (AGG_TOKEN token, INTEGER k, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL) |
| Compute the raw moment E[X^k | prov] of an AGG_TOKEN aggregate. | |
| DOUBLE PRECISION | provsql.variance (ANYELEMENT input, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL) |
| Compute the variance Var[X | prov] of a probabilistic scalar. | |
| DOUBLE PRECISION | provsql.moment (ANYELEMENT input, INTEGER k, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL) |
| Compute the raw moment E[X^k | prov] of a probabilistic scalar. | |
| VOID | provsql.rv_support (UUID token, UUID prov=gate_one(), float8 &lo, float8 &hi) |
| Internal: rv-side support computation. | |
| VOID | provsql.support (ANYELEMENT input, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL, float8 &lo, float8 &hi) |
Compute the support interval [lo, hi] of a probabilistic (or deterministic) scalar. | |
| DOUBLE PRECISION | provsql.central_moment (ANYELEMENT input, INTEGER k, UUID prov=gate_one(), TEXT method=NULL, TEXT arguments=NULL) |
| Compute the central moment E[(X - E[X|prov])^k | prov]. | |
| DOUBLE PRECISION | provsql.shapley (UUID token, UUID variable, TEXT method=NULL, TEXT arguments=NULL, BOOLEAN banzhaf='f') |
| Compute the Shapley value of an input variable. | |
| SETOF RECORD | provsql.shapley_all_vars (UUID token, TEXT method=NULL, TEXT arguments=NULL, BOOLEAN banzhaf='f', OUT UUID variable, OUT DOUBLE PRECISIONvalue) |
| Compute Shapley values for all input variables at once. | |
| DOUBLE PRECISION | provsql.banzhaf (UUID token, UUID variable, TEXT method=NULL, TEXT arguments=NULL) |
| Compute the Banzhaf power index of an input variable. | |
| SETOF RECORD | provsql.banzhaf_all_vars (UUID token, TEXT method=NULL, TEXT arguments=NULL, OUT UUID variable, OUT DOUBLE PRECISIONvalue) |
| Compute Banzhaf power indices for all input variables at once. | |
Functions for computing probabilities, expected values, and game-theoretic contribution measures (Shapley/Banzhaf values) from provenance circuits.
| DOUBLE PRECISION provsql.agg_raw_moment | ( | AGG_TOKEN | token, |
| INTEGER | k, | ||
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the raw moment E[X^k | prov] of an AGG_TOKEN aggregate.
Sister of expected() for the AGG_TOKEN side of the polymorphic moment / variance / central_moment dispatch. Supports the same aggregation functions as expected: SUM (which COUNT normalises to at the gate level via Aggregation.cpp:322), MIN, and MAX.
Strategy:
prov when conditioning), and evaluate the probability via probability_evaluate.v with v^k in the rank-based enumeration that expected already uses; MAX is handled by sign-flipping per the existing trick (negate vs. rerank), with the outer multiplier becoming \((-1)^k\) instead of just \(-1\).Cost: SUM is \(O(n^k)\) probability evaluations – tractable for small k or small n; for larger sizes, prefer reaching for the sampler. MIN / MAX stay linear in n.
| DOUBLE PRECISION provsql.banzhaf | ( | UUID | token, |
| UUID | variable, | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the Banzhaf power index of an input variable.
| SETOF RECORD provsql.banzhaf_all_vars | ( | UUID | token, |
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL, | ||
| OUT UUID | variable, | ||
| OUT DOUBLE PRECISION | value ) |
Compute Banzhaf power indices for all input variables at once.
| DOUBLE PRECISION provsql.central_moment | ( | ANYELEMENT | input, |
| INTEGER | k, | ||
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the central moment E[(X - E[X|prov])^k | prov].
k = 0 returns 1; k = 1 returns 0; k = 2 is equivalent to variance(input, prov, ...). Polymorphic dispatcher: routes random_variable through rv_moment, and AGG_TOKEN through the binomial expansion \(E[(X-\mu)^k|A] = \sum_{i=0}^{k} \binom{k}{i} (-\mu)^{k-i} E[X^i|A]\) with \(\mu = E[X|A]\), where each \(E[X^i|A]\) comes from agg_raw_moment.
| DOUBLE PRECISION provsql.expected | ( | ANYELEMENT | input, |
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the expected value of a probabilistic scalar.
Computes E[input | prov] for either an AGG_TOKEN (discrete SUM/MIN/MAX aggregation over Boolean-input gate_agg circuits, with prov as the Boolean conditioning event) or a random_variable (continuous distribution, traversed by the analytical / MC evaluator from Expectation.cpp).
Implementation: thin wrapper over moment(input, 1, prov, method,
arguments). Both branches converge on the same machinery; the AGG_TOKEN side computes E[X] as the \(k=1\) instance of the \(n^k\)-tuple enumeration in agg_raw_moment, the random_variable side calls compute_expectation through rv_moment.
| input | aggregate expression or random variable to compute E[·] of |
| prov | provenance condition (defaults to gate_one(), i.e., unconditional) |
| method | knowledge compilation method (AGG_TOKEN path only) |
| arguments | additional arguments for the method (AGG_TOKEN path only) |
| DOUBLE PRECISION provsql.moment | ( | ANYELEMENT | input, |
| INTEGER | k, | ||
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the raw moment E[X^k | prov] of a probabilistic scalar.
k must be a non-negative INTEGER. k = 0 returns 1; k = 1 is equivalent to expected(input). Polymorphic dispatcher: routes random_variable through rv_moment (analytical / MC) and AGG_TOKEN through agg_raw_moment (SUM via tuple enumeration, MIN / MAX via rank enumeration).
| DOUBLE PRECISION provsql.probability_evaluate | ( | UUID | token, |
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the probability of a provenance token.
Compiles the provenance circuit to d-DNNF and evaluates the probability. The compilation method can be selected explicitly.
| token | provenance token to evaluate |
| method | knowledge compilation method (NULL for default) |
| arguments | additional arguments for the method |
| DOUBLE PRECISION provsql.rv_moment | ( | UUID | token, |
| INTEGER | k, | ||
| BOOLEAN | central, | ||
| UUID | prov = gate_one() ) |
Internal: shared C entry point for variance / moment / central_moment.
The expected() SQL function reaches the Expectation evaluator through provenance_evaluate_compiled(..., 'expectation', ...). The variance / raw-moment / central-moment SQL functions need an extra k INTEGER argument that does not fit that dispatcher's signature, so they go through this dedicated entry point. Returns E[X^k] when central is FALSE, or E[(X - E[X])^k] when TRUE.
| VOID provsql.rv_support | ( | UUID | token, |
| UUID | prov = gate_one(), | ||
| float8 & | lo, | ||
| float8 & | hi ) |
Internal: rv-side support computation.
Lifts provsql.compute_support out of RangeCheck.cpp – the same interval-arithmetic propagation runRangeCheck uses to decide gate_cmps. Returns [-Infinity, +Infinity] when the tightest bound is the conservative all-real interval (e.g. for a normal RV, or any sub-circuit that mixes a normal in).
| DOUBLE PRECISION provsql.shapley | ( | UUID | token, |
| UUID | variable, | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL, | ||
| BOOLEAN | banzhaf = 'f' ) |
Compute the Shapley value of an input variable.
Measures the contribution of a specific input variable to the truth of a provenance expression, using game-theoretic Shapley values.
| token | provenance token to evaluate |
| variable | UUID of the input variable |
| method | knowledge compilation method |
| arguments | additional arguments for the method |
| banzhaf | if true, compute the Banzhaf value instead |
| SETOF RECORD provsql.shapley_all_vars | ( | UUID | token, |
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL, | ||
| BOOLEAN | banzhaf = 'f', | ||
| OUT UUID | variable, | ||
| OUT DOUBLE PRECISION | value ) |
Compute Shapley values for all input variables at once.
| VOID provsql.support | ( | ANYELEMENT | input, |
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL, | ||
| float8 & | lo, | ||
| float8 & | hi ) |
Compute the support interval [lo, hi] of a probabilistic (or deterministic) scalar.
Polymorphic dispatcher mirroring expected / variance / moment / central_moment, with two extra "free" branches:
smallint / INTEGER / bigint / NUMERIC / real / double precision): degenerate point support \([c, c]\). Lets callers ask for the support of a literal without round-tripping through as_random.random_variable / bare UUID (any provenance gate token, including the implicit random_variable -> UUID cast): routes to rv_support, which propagates distribution supports (uniform exact, exponential [0,+∞), normal (-∞,+∞)) through gate_arith via interval arithmetic. gate_value gives the same \([c, c]\) point support as the NUMERIC branch; any non-scalar gate (Boolean gates, aggregates, ...) safely falls back to the conservative all-real interval without raising. Conditioning on prov is not yet supported.AGG_TOKEN: closed-form per aggregation function:SUM : \([\sum_i \min(0,v_i), \sum_i \max(0,v_i)]\) (every row is independently in or out of the included set; the extreme SUMs are reached by including only positive or only negative-valued rows).MIN : \([\min_i v_i, \max_i v_i]\) in the non-empty subsets, plus +Infinity if the empty subset has positive probability under prov.MAX : symmetric – -Infinity if empty has positive probability under prov, otherwise min_i v_i; hi is always max_i v_i.Other aggregation functions raise.
Returns the composite RECORD (lo, hi) via the function's OUT parameters, with -Infinity / +Infinity marking unbounded ends.
| DOUBLE PRECISION provsql.variance | ( | ANYELEMENT | input, |
| UUID | prov = gate_one(), | ||
| TEXT | method = NULL, | ||
| TEXT | arguments = NULL ) |
Compute the variance Var[X | prov] of a probabilistic scalar.
Polymorphic dispatcher that mirrors expected: random_variable inputs go through the analytical / MC evaluator (rv_moment(UUID, 2, true)); AGG_TOKEN inputs go through the agg_raw_moment helper, computing \(\mathrm{Var}[X|A] = E[X^2|A] - E[X|A]^2\). Conditioning on prov is supported for AGG_TOKEN (matching expected) but not yet for random_variable.