![]() |
ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
|
In-memory provenance circuit with semiring-generic evaluation. More...
#include "GenericCircuit.h"


Public Member Functions | |
| virtual std::string | toString (gate_t g) const override |
| Return a placeholder debug string (not intended for display). | |
| void | setInfos (gate_t g, unsigned info1, unsigned info2) |
Set the integer annotation pair for gate g. | |
| std::pair< unsigned, unsigned > | getInfos (gate_t g) const |
Return the integer annotation pair for gate g. | |
| void | setExtra (gate_t g, const std::string &ex) |
Attach a string extra to gate g. | |
| std::string | getExtra (gate_t g) const |
Return the string extra for gate g. | |
| gate_t | addGate () override |
| Allocate a new gate with a default-initialised type. | |
| gate_t | setGate (gate_type type) override |
Allocate a new gate with type type and no UUID. | |
| gate_t | setGate (const uuid &u, gate_type type) override |
Create or update the gate associated with UUID u. | |
| const std::set< gate_t > & | getInputs () const |
| Return the set of input (leaf) gates. | |
| void | setProb (gate_t g, double p) |
Set the probability for gate g. | |
| double | getProb (gate_t g) const |
Return the probability for gate g. | |
| void | resolveCmpToBernoulli (gate_t g, double p) |
Replace a gate_cmp by a constant Boolean leaf (gate_one for p == 1, gate_zero for p == 0) or by a Bernoulli gate_input for any other p. | |
| void | resolveGateToZero (gate_t g) |
Replace an arbitrary gate (typically gate_times) by gate_zero. | |
| void | resolveToValue (gate_t g, const std::string &s) |
Rewrite an arbitrary gate as a gate_value carrying the textual extra s. | |
| void | resolveToRv (gate_t g, const std::string &s) |
Rewrite an arbitrary gate as a gate_rv carrying the distribution-spec extra s. | |
| void | foldSemiringIdentities () |
Drop semiring identity wires and collapse single-wire gate_times / gate_plus to their lone non-identity child; collapse a gate_times containing a gate_zero wire to that absorber. | |
| void | foldBooleanIdentities () |
Apply Boolean-only simplification rules to gate_plus and gate_times. | |
| void | markBooleanAssumed (gate_t g) |
Mark gate g as Boolean-assumed (in-memory side band). | |
| bool | isBooleanAssumed (gate_t g) const |
Report whether g carries the Boolean-assumption flag. | |
| void | setWires (gate_t g, std::vector< gate_t > w) |
Replace the wires of g with w. | |
| void | resolveToPlus (gate_t g, std::vector< gate_t > w) |
Rewrite an arbitrary gate as a gate_plus over w. | |
| gate_t | addAnonymousInputGate (double p) |
Allocate a fresh gate_input gate carrying probability p, with a unique synthetic UUID so subsequent BooleanCircuit conversion does not collide multiple no-UUID inputs onto the same gate. | |
| gate_t | addAnonymousMulinputGate (gate_t key, double p, unsigned value_index) |
Allocate a fresh gate_mulinput gate with key key, probability p, and value index value_index. | |
| gate_t | addAnonymousArithGate (provsql_arith_op op, std::vector< gate_t > wires_) |
Allocate a fresh gate_arith gate with operator tag op and the given wires. | |
| gate_t | addAnonymousValueGate (const std::string &text) |
Allocate a fresh gate_value gate carrying the textual scalar text. | |
| void | resolveToMixture (gate_t g, gate_t p_token, gate_t x_token, gate_t y_token) |
Rewrite g in place as a gate_mixture over the wires [p_token, x_token, y_token]. | |
| gate_t | addAnonymousMulinputGateWithValue (gate_t key, double p, unsigned value_index, const std::string &value_text) |
Allocate a fresh gate_mulinput labelled with a numeric outcome value carried in extra. | |
| void | resolveToCategoricalMixture (gate_t g, std::vector< gate_t > wires_) |
Rewrite g in place as a categorical-form gate_mixture over wires ([key, mul_1, ..., mul_n]). | |
| bool | isCategoricalMixture (gate_t g) const |
Test whether g is a categorical-form gate_mixture (the explicit provsql.categorical output). | |
| template<class Archive> | |
| void | serialize (Archive &ar, const unsigned int version) |
| Boost serialisation support. | |
| template<typename S, std::enable_if_t< std::is_base_of_v< semiring::Semiring< typename S::value_type >, S >, int > = 0> | |
| S::value_type | evaluate (gate_t g, std::unordered_map< gate_t, typename S::value_type > &provenance_mapping, S semiring) const |
Evaluate the sub-circuit rooted at gate g over semiring semiring. | |
| Public Member Functions inherited from Circuit< gate_type > | |
| virtual | ~Circuit () |
| std::vector< gate_t >::size_type | getNbGates () const |
| Return the total number of gates in the circuit. | |
| gate_t | getGate (const uuid &u) |
Return (or create) the gate associated with UUID u. | |
| uuid | getUUID (gate_t g) const |
Return the UUID string associated with gate g. | |
| gate_type | getGateType (gate_t g) const |
Return the type of gate g. | |
| std::vector< gate_t > & | getWires (gate_t g) |
Return a mutable reference to the child-wire list of gate g. | |
| virtual gate_t | setGate (const uuid &u, gate_type type) |
Create or update the gate associated with UUID u. | |
| bool | hasGate (const uuid &u) const |
Test whether a gate with UUID u exists. | |
| void | addWire (gate_t f, gate_t t) |
Add a directed wire from gate f (parent) to gate t (child). | |
Private Attributes | |
| std::map< gate_t, std::pair< unsigned, unsigned > > | infos |
| Per-gate (info1, info2) annotations. | |
| std::map< gate_t, std::string > | extra |
| Per-gate string extras. | |
| std::set< gate_t > | inputs |
| Set of input (leaf) gate IDs. | |
| std::vector< double > | prob |
| Per-gate probability values. | |
| std::set< gate_t > | boolean_assumed_gates |
Side-band Boolean-assumption marker set by foldBooleanIdentities ; an evaluator visiting a gate in this set refuses to proceed under a semiring that does not admit a homomorphism from Boolean functions. In-memory only ; never persisted to mmap. Distinct from the gate_assumed_boolean enum (used by the safe-query rewriter to encode the same restriction at the persistent layer). | |
Friends | |
| class | dDNNFTreeDecompositionBuilder |
| class | boost::serialization::access |
Additional Inherited Members | |
| Public Types inherited from Circuit< gate_type > | |
| using | uuid |
UUID type used in this circuit (always std::string). | |
| Protected Member Functions inherited from Circuit< gate_type > | |
| void | setGateType (gate_t g, gate_type t) |
| Update the type of an existing gate. | |
| Protected Attributes inherited from Circuit< gate_type > | |
| std::unordered_map< uuid, gate_t > | uuid2id |
| UUID string → gate index. | |
| std::unordered_map< gate_t, uuid > | id2uuid |
| Gate index → UUID string. | |
| std::vector< gate_type > | gates |
| Gate type for each gate. | |
| std::vector< std::vector< gate_t > > | wires |
| Child wire lists for each gate. | |
In-memory provenance circuit with semiring-generic evaluation.
Gate types are the PostgreSQL gate_type values. The circuit is constructed from the persistent mmap store and then evaluated in-memory.
Definition at line 48 of file GenericCircuit.h.
|
inline |
Allocate a fresh gate_arith gate with operator tag op and the given wires.
Used by the HybridEvaluator simplifier's mixture-lift rule when pushing a PLUS / TIMES inside a mixture's two scalar branches: each branch needs a fresh gate_arith child carrying the lifted operation, and those children must round-trip through downstream id2uuid / uuid2id lookups (Studio's simplified subgraph, to_provxml). A unique synthetic UUID is minted for the same reason as addAnonymousInputGate.
Definition at line 403 of file GenericCircuit.h.

|
inline |
Allocate a fresh gate_input gate carrying probability p, with a unique synthetic UUID so subsequent BooleanCircuit conversion does not collide multiple no-UUID inputs onto the same gate.
The synthetic UUID is derived from the freshly-assigned gate id; it is not a real v4 UUID (does not match the xxxxxxxx-... format) and exists only for in-memory uniqueness during the probability_evaluate pipeline. The gate is added to inputs so the conversion's first loop maps it into BooleanCircuit's gc_to_bc.
| p | Probability for the new input. |
Definition at line 354 of file GenericCircuit.h.

|
inline |
Allocate a fresh gate_mulinput gate with key key, probability p, and value index value_index.
Used by the joint-table decomposer to materialise one Bernoulli outcome of a 2^k-way categorical distribution over a shared continuous island. All mulinputs in one joint table share the same key (the block anchor returned by addAnonymousInputGate); value_index is stored in info1 so BooleanCircuit::independentEvaluation can group / dedup MULIN references at OR sites. A unique synthetic UUID is minted for the same reason as addAnonymousInputGate.
Definition at line 378 of file GenericCircuit.h.


|
inline |
Allocate a fresh gate_mulinput labelled with a numeric outcome value carried in extra.
Variant of addAnonymousMulinputGate used by the categorical mixture lowering: the mulinput's info1 still holds the ordinal value_index for independentEvaluation's dedup, and the outcome's numeric label is stored in extra so the evaluator-side categorical-mixture handlers can read it as a float8.
Definition at line 475 of file GenericCircuit.h.

|
inline |
Allocate a fresh gate_value gate carrying the textual scalar text.
Used by the HybridEvaluator simplifier's PLUS coefficient aggregation rule: when same-RV terms in a PLUS gate are merged into arith(TIMES, value:a_total, Z) per RV, each coefficient a_total needs a fresh gate_value to feed the synthetic TIMES. text must be a canonical text form that round-trips through parseDoubleStrict (the simplifier already formats with precision 17). A unique synthetic UUID is minted for the same reason as addAnonymousInputGate.
Definition at line 428 of file GenericCircuit.h.

|
overridevirtual |
Allocate a new gate with a default-initialised type.
Derived classes override this to perform additional initialisation (e.g. resizing auxiliary vectors).
gate_t identifier of the newly created gate. Reimplemented from Circuit< gate_type >.
Definition at line 36 of file GenericCircuit.cpp.


| S::value_type GenericCircuit::evaluate | ( | gate_t | g, |
| std::unordered_map< gate_t, typename S::value_type > & | provenance_mapping, | ||
| S | semiring ) const |
Evaluate the sub-circuit rooted at gate g over semiring semiring.
Performs a post-order traversal from g, mapping each input gate to its semiring value via provenance_mapping, and combining the results using the semiring operations.
| S | A concrete semiring::Semiring subclass. |
| g | Root gate of the sub-circuit to evaluate. |
| provenance_mapping | Map from input gate IDs to semiring values. |
| semiring | Semiring instance providing zero(), one(), plus(), times(), etc. |
g. Definition at line 35 of file GenericCircuit.hpp.


| void GenericCircuit::foldBooleanIdentities | ( | ) |
Apply Boolean-only simplification rules to gate_plus and gate_times.
Each rule fires by mutating the gate's wires in place AND adding the gate to boolean_assumed_gates so semiring evaluators that do not admit a homomorphism from Boolean functions refuse to proceed.
Rules :
gate_plus(a, a, b) → gate_plus(a, b), gate_times(a, a, b) → gate_times(a, b). Child dedup by gate id ; preserves first-occurrence order.gate_plus(..., gate_one, ...) → gate_one (rewires to an empty plus then collapses to gate_one).Operates on the in-memory GenericCircuit only ; the persistent mmap store is never mutated, and the gate's UUID-to-gate_t mapping survives so callers indexing by the original UUID still find it. Re-runs foldSemiringIdentities at the end to collapse single-wire gate_plus / gate_times residues the dedup may have left behind.
Definition at line 176 of file GenericCircuit.cpp.


| void GenericCircuit::foldSemiringIdentities | ( | ) |
Drop semiring identity wires and collapse single-wire gate_times / gate_plus to their lone non-identity child; collapse a gate_times containing a gate_zero wire to that absorber.
Universal rewrite: the multiplicative identity (gate_one), the additive identity (gate_zero), and the multiplicative absorber (gate_zero) hold across every provsql semiring, so a single pass after RangeCheck is sound for every downstream consumer (probability_evaluate, to_provxml, view_circuit, Studio's simplified subgraph). Does NOT apply the additive absorber rewrite (plus-with-one): gate_one only absorbs in idempotent semirings (Boolean, MinMax), so applying it unconditionally would silently change the semantics for Counting / Formula / etc.
Operates on the in-memory circuit only; the persistent mmap store is never touched. Gated alongside RangeCheck by provsql.simplify_on_load.
Definition at line 45 of file GenericCircuit.cpp.


|
inline |
Return the string extra for gate g.
| g | Gate identifier. |
Definition at line 106 of file GenericCircuit.h.

|
inline |
Return the integer annotation pair for gate g.
| g | Gate identifier. |
{info1, info2}, or {-1,-1} if not set. Definition at line 83 of file GenericCircuit.h.

|
inline |
Return the set of input (leaf) gates.
Definition at line 126 of file GenericCircuit.h.

|
inline |
Return the probability for gate g.
| g | Gate identifier. |
Definition at line 144 of file GenericCircuit.h.

|
inline |
Report whether g carries the Boolean-assumption flag.
Definition at line 307 of file GenericCircuit.h.

|
inline |
Test whether g is a categorical-form gate_mixture (the explicit provsql.categorical output).
Returns true iff g is a gate_mixture whose wires are [key, mul_1, ..., mul_n] with n ≥ 1: wires[0] a gate_input key anchor, and every subsequent wire a gate_mulinput. The classic mixture shape [p_token, x_token, y_token] returns false (one or both of wires[1..2] are not gate_mulinput).
Definition at line 516 of file GenericCircuit.h.


|
inline |
Mark gate g as Boolean-assumed (in-memory side band).
Visited by every evaluate<S> traversal : if g is in the set, the visit checks S::compatibleWithBooleanRewrite and throws a CircuitException otherwise.
Definition at line 304 of file GenericCircuit.h.

|
inline |
Replace a gate_cmp by a constant Boolean leaf (gate_one for p == 1, gate_zero for p == 0) or by a Bernoulli gate_input for any other p.
Used by peephole pruning passes when a comparator's probability is provably 0, 1, or a closed-form value. Distinguishing the 0 / 1 case from the fractional case matters because non-probabilistic semirings (e.g. sr_formula, sr_counting) have well-defined zero() / one() values but no notion of "Bernoulli with
probability @c p" – an unknown gate_input would default to semiring.one() in every semiring (per GenericCircuit::evaluate), which is wrong for an always-false comparator. Using gate_zero / gate_one directly is universally correct: every semiring knows its identities.
Fractional probabilities are still encoded as gate_input + a probability so probability evaluators (MC, independent, treedec, d-DNNF, d4) can consume them, but only those evaluators handle non-trivial probabilities meaningfully. Such resolutions should therefore be confined to passes invoked from a probability context (not the universal getGenericCircuit-time pass).
Operates on the in-memory circuit only; the persistent mmap store is never mutated. Children that become orphaned are not reaped here.
Definition at line 176 of file GenericCircuit.h.


|
inline |
Replace an arbitrary gate (typically gate_times) by gate_zero.
Used by RangeCheck's joint-conjunction pass when an AND of cmps over a shared RV constrains its support to an empty interval: since gate_zero is the multiplicative absorber in every semiring, replacing a gate_times with it is universally sound, and orphans the conjuncts (their effects are now unreachable from the root, regardless of what each individual cmp would resolve to).
The wires, infos and extra fields are cleared so the gate is a proper leaf. Operates on the in-memory circuit only.
Definition at line 206 of file GenericCircuit.h.


Rewrite g in place as a categorical-form gate_mixture over wires ([key, mul_1, ..., mul_n]).
Used by the explicit provsql.categorical constructor (built at SQL-call time, not by the simplifier): the gate_mixture type is reused with N > 3 wires, where wires[0] is a fresh gate_input "key" anchor (its own probability is irrelevant: the categorical mass is on the mulinputs) and wires[1..n] are gate_mulinput leaves sharing that key. Every gate_mixture handler downstream branches on wires.size() == 3 for the classic [p_token, x_token, y_token] shape vs the categorical shape; the latter is what unlocks closed-form CDF / cmp evaluation via AnalyticEvaluator.
Definition at line 498 of file GenericCircuit.h.

|
inline |
Rewrite g in place as a gate_mixture over the wires [p_token, x_token, y_token].
Used by the HybridEvaluator simplifier's mixture-lift rule when a gate_arith containing a single gate_mixture child is pushed inside the mixture's branches: the outer arith gate is rewritten in place as the lifted mixture, preserving its UUID and the non-mixture-aware code paths that already hold references to it. The p_token is reused verbatim so Bernoulli identity is preserved across the rewrite; the x_token and y_token are the freshly-minted arith children built via addAnonymousArithGate.
Definition at line 452 of file GenericCircuit.h.

Rewrite an arbitrary gate as a gate_plus over w.
Used by the HybridEvaluator multi-cmp island decomposer when a comparator from a shared-island group is rewritten as the OR of the joint-table gate_mulinput leaves where the comparator's bit is set. Clears infos and extra and installs the new wires.
Definition at line 331 of file GenericCircuit.h.

|
inline |
Rewrite an arbitrary gate as a gate_rv carrying the distribution-spec extra s.
Used by the HybridEvaluator simplifier when a linear combination of independent normals (or i.i.d. exponentials with the same rate) collapses to a single closed-form distribution. s must be a textual encoding parseable by parse_distribution_spec. Same wire/info clearing as resolveCmpToBernoulli. Operates on the in-memory circuit only.
Definition at line 243 of file GenericCircuit.h.

|
inline |
Rewrite an arbitrary gate as a gate_value carrying the textual extra s.
Used by the HybridEvaluator simplifier when a gate_arith subtree constant-folds to a scalar. Same wire/info/extra clearing as resolveCmpToBernoulli – the old children become orphans relative to g. s is interpreted by the consumer via parseDoubleStrict (or analogous routines), so it must be a canonical textual representation that round-trips through std::stod. Operates on the in-memory circuit only.
Definition at line 225 of file GenericCircuit.h.

|
inline |
Boost serialisation support.
| ar | Boost archive (input or output). |
| version | Archive version (unused). |
Definition at line 534 of file GenericCircuit.h.
|
inline |
Attach a string extra to gate g.
| g | Gate identifier. |
| ex | String to store. |
Definition at line 96 of file GenericCircuit.h.

Create or update the gate associated with UUID u.
If the UUID is already mapped the existing gate's type is updated. Otherwise a new gate is allocated.
| u | UUID string to associate with the gate. |
| type | Gate type. |
Definition at line 27 of file GenericCircuit.cpp.

Allocate a new gate with type type and no UUID.
| type | Gate type. |
Definition at line 18 of file GenericCircuit.cpp.


|
inline |
Set the integer annotation pair for gate g.
| g | Gate identifier. |
| info1 | First annotation integer. |
| info2 | Second annotation integer. |
Definition at line 73 of file GenericCircuit.h.

|
inline |
Set the probability for gate g.
| g | Gate identifier. |
| p | Probability in [0, 1]. |
Definition at line 135 of file GenericCircuit.h.

Replace the wires of g with w.
Used by the HybridEvaluator simplifier's identity-element drop to remove constant-zero wires from a PLUS gate (or constant-one wires from a TIMES gate) without changing the gate's type. Dropped children become orphans relative to g.
Definition at line 319 of file GenericCircuit.h.


|
inlineoverridevirtual |
Return a placeholder debug string (not intended for display).
| g | Gate identifier (unused). |
"<GenericCircuit>". Implements Circuit< gate_type >.
Definition at line 63 of file GenericCircuit.h.
|
friend |
Definition at line 547 of file GenericCircuit.h.
|
friend |
Definition at line 546 of file GenericCircuit.h.
|
private |
Side-band Boolean-assumption marker set by foldBooleanIdentities ; an evaluator visiting a gate in this set refuses to proceed under a semiring that does not admit a homomorphism from Boolean functions. In-memory only ; never persisted to mmap. Distinct from the gate_assumed_boolean enum (used by the safe-query rewriter to encode the same restriction at the persistent layer).
Definition at line 55 of file GenericCircuit.h.
|
private |
Per-gate string extras.
Definition at line 52 of file GenericCircuit.h.
|
private |
Per-gate (info1, info2) annotations.
Definition at line 51 of file GenericCircuit.h.
|
private |
Set of input (leaf) gate IDs.
Definition at line 53 of file GenericCircuit.h.
|
private |
Per-gate probability values.
Definition at line 54 of file GenericCircuit.h.