ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
CountCmpEvaluator.h
Go to the documentation of this file.
1
/**
2
* @file CountCmpEvaluator.h
3
* @brief Closed-form Poisson-binomial CDF resolution for HAVING
4
* @c COUNT(*) @c op @c C @c gate_cmps.
5
*
6
* For comparators that reduce to
7
* @c gate_cmp(gate_agg(COUNT, semimod children), gate_value(C)) where
8
* each semimod child's K side is a distinct single @c gate_input
9
* leaf (the first-slice scope: flat-table HAVING COUNT queries), the
10
* comparator's probability is @c Pr(B op C) with @c B Poisson-binomial
11
* over the children's stored marginal probabilities. A partial DP
12
* computed on the smaller side of @c C (either the lower tail
13
* @c [0, C-1] on the original Bernoullis, or the upper tail computed
14
* via the complementary @c [0, N-C] on the inverted Bernoullis
15
* @c Y_i @c = @c 1-X_i) gives the answer in @c O(N x min(C, N-C))
16
* time per cmp, without ever materialising the @c binom(N, C) DNF
17
* that @c provsql_having's @c enumerate_valid_worlds path would
18
* otherwise emit.
19
*
20
* Soundness condition (checked per cmp gate) :
21
* - shape matches `gate_cmp(gate_agg(COUNT, semimod_i(K_i, 1)*), gate_value(C))`
22
* in either order ;
23
* - every K_i is a single @c gate_input leaf (no @c gate_times,
24
* no @c gate_mulinput, no nested arithmetic) ;
25
* - the K_i leaves are pairwise distinct (no shared input across
26
* children) ;
27
* - none of the K_i leaves is reachable from the rest of the
28
* surrounding circuit, i.e. each input appears exactly once and
29
* only inside this cmp's subtree.
30
*
31
* The first two conditions are cheap to check locally. The third
32
* (no outside reachability) is checked once per pass via a
33
* reference-count walk over the whole circuit.
34
*
35
* After replacement the cmp is a Bernoulli @c gate_input carrying
36
* the numeric probability, semantically meaningful only on the
37
* probability path. Like @c runAnalyticEvaluator, the pass runs
38
* inside @c probability_evaluate.cpp, not at load time.
39
*
40
* Empty-group semantics mirror @c count_enum and SQL : worlds with
41
* zero present children are excluded. In particular the
42
* "all-absent" mass @c dp[N][0] is never included regardless of
43
* operator.
44
*/
45
#ifndef PROVSQL_COUNT_CMP_EVALUATOR_H
46
#define PROVSQL_COUNT_CMP_EVALUATOR_H
47
48
#include "
GenericCircuit.h
"
49
50
namespace
provsql
{
51
52
/**
53
* @brief Run the Poisson-binomial pre-pass over @p gc.
54
*
55
* For every @c gate_cmp whose shape matches the first-slice scope
56
* (see file docstring), computes the comparator's probability by
57
* Poisson-binomial CDF and replaces the cmp by a Bernoulli
58
* @c gate_input via @c GenericCircuit::resolveCmpToBernoulli.
59
*
60
* @param gc Circuit to mutate in place.
61
* @return Number of comparators resolved by this pass.
62
*/
63
unsigned
runCountCmpEvaluator
(GenericCircuit &gc);
64
65
}
// namespace provsql
66
67
#endif
// PROVSQL_COUNT_CMP_EVALUATOR_H
GenericCircuit.h
Semiring-agnostic in-memory provenance circuit.
provsql
Definition
AnalyticEvaluator.cpp:19
provsql::runCountCmpEvaluator
unsigned runCountCmpEvaluator(GenericCircuit &gc)
Run the Poisson-binomial pre-pass over gc.
Definition
CountCmpEvaluator.cpp:279
Home
·
Documentation
·
Publications
·
Contributors
·
Source
·
Cite