ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
CountCmpEvaluator.h File Reference

Closed-form Poisson-binomial CDF resolution for HAVING COUNT(*) op C gate_cmps. More...

#include "GenericCircuit.h"
Include dependency graph for CountCmpEvaluator.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Namespaces

namespace  provsql

Functions

unsigned provsql::runCountCmpEvaluator (GenericCircuit &gc)
 Run the Poisson-binomial pre-pass over gc.

Detailed Description

Closed-form Poisson-binomial CDF resolution for HAVING COUNT(*) op C gate_cmps.

For comparators that reduce to gate_cmp(gate_agg(COUNT, semimod children), gate_value(C)) where each semimod child's K side is a distinct single gate_input leaf (the first-slice scope: flat-table HAVING COUNT queries), the comparator's probability is Pr(B op C) with B Poisson-binomial over the children's stored marginal probabilities. A partial DP computed on the smaller side of C (either the lower tail [0, C-1] on the original Bernoullis, or the upper tail computed via the complementary [0, N-C] on the inverted Bernoullis Y_i = 1-X_i) gives the answer in O(N x min(C, N-C)) time per cmp, without ever materialising the binom(N, C) DNF that provsql_having's enumerate_valid_worlds path would otherwise emit.

Soundness condition (checked per cmp gate) :

  • shape matches gate_cmp(gate_agg(COUNT, semimod_i(K_i, 1)*), gate_value(C)) in either order ;
  • every K_i is a single gate_input leaf (no gate_times, no gate_mulinput, no nested arithmetic) ;
  • the K_i leaves are pairwise distinct (no shared input across children) ;
  • none of the K_i leaves is reachable from the rest of the surrounding circuit, i.e. each input appears exactly once and only inside this cmp's subtree.

The first two conditions are cheap to check locally. The third (no outside reachability) is checked once per pass via a reference-count walk over the whole circuit.

After replacement the cmp is a Bernoulli gate_input carrying the numeric probability, semantically meaningful only on the probability path. Like runAnalyticEvaluator, the pass runs inside probability_evaluate.cpp, not at load time.

Empty-group semantics mirror count_enum and SQL : worlds with zero present children are excluded. In particular the "all-absent" mass dp[N][0] is never included regardless of operator.

Definition in file CountCmpEvaluator.h.