![]() |
ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
|
Peephole simplifier for continuous gate_arith sub-circuits.
More...
#include "GenericCircuit.h"

Go to the source code of this file.
Namespaces | |
| namespace | provsql |
Functions | |
| unsigned | provsql::runHybridSimplifier (GenericCircuit &gc) |
Run the peephole simplifier over gc. | |
| unsigned | provsql::runConstantFold (GenericCircuit &gc) |
Constant-fold pass over every gate_arith in gc. | |
| unsigned | provsql::runHybridDecomposer (GenericCircuit &gc, unsigned samples) |
Marginalise unresolved continuous-island gate_cmp gates into Bernoulli gate_input leaves. | |
Peephole simplifier for continuous gate_arith sub-circuits.
The hybrid evaluator owns the simplification + island-decomposition passes invoked between the universal cmp-resolution (RangeCheck) and the probability-specific analytic-CDF resolution (AnalyticEvaluator). This header exposes the simplifier; the island decomposer will be added alongside in the same namespace.
The term peephole comes from compiler engineering (McKeeman, CACM 8(7), 1965): a small sliding window over consecutive instructions / gates, a fixed list of local pattern -> replacement rules, iterated to a fixed point. Each rule below looks at one gate_arith plus its immediate children, never further, which is exactly the peephole scope. Contrast with RangeCheck (global interval- propagation walk) and runHybridDecomposer (union-find over base-RV footprints across the whole circuit).
The simplifier rewrites gate_arith gates in-place using closure rules that preserve the gate's scalar value in every world:
gate_arith subtrees over only gate_value leaves — collapses to a single gate_value.TIMES with any constant-zero wire becomes gate_value:0, regardless of the other (possibly non-constant) wires.gate_value:0 wires from a PLUS gate; remove gate_value:1 wires from a TIMES gate.PLUS / TIMES over a single gate_mixture child pushes the other wires inside the branches. For the classic 3-wire shape mixture(p, X, Y) each branch becomes a fresh arith over the lifted wires; for the categorical N-wire shape mixture(key, mul_1, ..., mul_n) the constant offset / factor is folded directly into each mulinput's value text (sharing the key keeps the new mixture correlated with the original).PLUS whose wires decompose as a_i·Z_i + b_i merges same-Z_i terms by summing coefficients (so X+X folds to 2·X, X-X to 0, etc.) and consolidates the constant offsets.X a gate_rv whose distribution admits a closed-form scale (Normal, Uniform, Exp, Erlang) folds to a single scaled gate_rv.PLUS over linear combinations of independent normal gate_rv leaves folds to a single normal gate_rv whose mean and variance are the closed-form combinations. The independence test mirrors Expectation's footprint check: each contributing normal must be reached through a disjoint base-RV UUID.PLUS over k ≥ 2 i.i.d. exponential gate_rv leaves with the same rate λ and pairwise-distinct UUIDs folds to a single Erlang(k, λ) gate_rv.The rules form a fixed-point loop per gate: after any rewrite succeeds, the gate is re-evaluated under all rules until none fire. Compositions like arith(NEG, arith(PLUS, value, value)) therefore collapse fully in one bottom-up pass.
The simplifier runs in two passes for shared-RV identity preservation: pass 1 applies every rule EXCEPT the scalar-times-RV closure so the aggregator gets to see c·X-shaped wires inside a parent PLUS with their underlying RV identity intact (otherwise a bottom-up fold of the inner TIMES would mint a fresh gate_rv there and decouple it from a sibling reference to the same X). Pass 2 then folds the remaining TIMES wrappers with the scalar-times-RV rule.
Soundness: every rule produces a circuit semantically equivalent to the original in every world (same scalar distribution, same support, same comparator outcomes). The closures merge independent leaves into a new leaf with a fresh distribution; downstream consumers (sampler, RangeCheck, AnalyticEvaluator, Expectation) see the new leaf and use the closed-form distribution machinery as if the user had written it directly.
Operates on the in-memory GenericCircuit only; the persistent mmap store is never mutated. Children that become orphaned by a rewrite are not reaped here.
Gated by the provsql.hybrid_evaluation GUC (default on): probability_evaluate skips the pass entirely when the GUC is off, letting users verify that the analytic path and the whole-circuit MC fallback agree on the same query.
Definition in file HybridEvaluator.h.