ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
provsql Namespace Reference

Classes

struct  ConditionalScalarSamples
 Outcome of a conditional Monte Carlo sampling pass. More...
struct  DistributionSpec
 Parsed distribution spec (kind + up to two parameters). More...
struct  TruncatedSingleRv
 Detection result for a closed-form, optionally-truncated single-RV shape. More...
struct  DiracShape
 Point mass at a finite scalar value (a gate_value root, or an as_random(c) leaf surfaced as a gate_value). More...
struct  CategoricalShape
 Categorical distribution over a finite outcome set. More...
struct  BernoulliMixtureShape
 Bernoulli mixture (gate_mixture with the [p_token, x_token, y_token] shape). More...

Typedefs

using ClosedFormShape
 One of the closed-form shapes the analytical-curves payload can render: bare RV (continuous PDF/CDF), Dirac (point mass), categorical (multiple point masses), or Bernoulli mixture of any two of the above.

Enumerations

enum class  DistKind { Normal , Uniform , Exponential , Erlang }
 Continuous distribution kinds supported by gate_rv. More...

Functions

double pdfAt (const DistributionSpec &d, double c)
 Closed-form probability density \(f(c)\) for a basic distribution.
double cdfAt (const DistributionSpec &d, double c)
 Closed-form CDF \(F_X(c) = P(X \le c)\) for a basic continuous distribution.
unsigned runAnalyticEvaluator (GenericCircuit &gc)
 Run the closed-form CDF resolution pass over gc.
unsigned runCountCmpEvaluator (GenericCircuit &gc)
 Run the Poisson-binomial pre-pass over gc.
double evaluateBooleanProbability (const GenericCircuit &gc, gate_t boolRoot)
 Probability that the Boolean subcircuit rooted at boolRoot evaluates to true under the tuple-independent probabilistic-database model.
double compute_expectation (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root=std::nullopt)
 Compute \(E[X]\) (or \(E[X \mid A]\) if event_root is set) over the scalar sub-circuit rooted at root.
double compute_variance (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root=std::nullopt)
 Compute \(\mathrm{Var}[X]\) (or \(\mathrm{Var}[X \mid A]\) if event_root is set) over the scalar sub-circuit rooted at root.
double compute_raw_moment (const GenericCircuit &gc, gate_t root, unsigned k, std::optional< gate_t > event_root=std::nullopt)
 Compute the raw moment \(E[X^k]\) (or \(E[X^k \mid A]\) if event_root is set) for k >= 0.
double compute_central_moment (const GenericCircuit &gc, gate_t root, unsigned k, std::optional< gate_t > event_root=std::nullopt)
 Compute the central moment \(E[(X - E[X])^k]\) (or \(E[(X - E[X \mid A])^k \mid A]\) if event_root is set).
unsigned runConstantFold (GenericCircuit &gc)
 Constant-fold pass over every gate_arith in gc.
unsigned runHybridSimplifier (GenericCircuit &gc)
 Run the peephole simplifier over gc.
unsigned runHybridDecomposer (GenericCircuit &gc, unsigned samples)
 Marginalise unresolved continuous-island gate_cmp gates into Bernoulli gate_input leaves.
double monteCarloRV (const GenericCircuit &gc, gate_t root, unsigned samples)
 Run Monte Carlo on a circuit that may contain gate_rv leaves.
std::vector< double > monteCarloJointDistribution (const GenericCircuit &gc, const std::vector< gate_t > &cmps, unsigned samples)
 Estimate the joint distribution of cmps via Monte Carlo.
std::vector< double > monteCarloScalarSamples (const GenericCircuit &gc, gate_t root, unsigned samples)
 Sample a scalar sub-circuit samples times and return the draws.
ConditionalScalarSamples monteCarloConditionalScalarSamples (const GenericCircuit &gc, gate_t root, gate_t event_root, unsigned samples)
 Rejection-sample root conditioned on event_root.
std::optional< std::vector< double > > try_truncated_closed_form_sample (const GenericCircuit &gc, gate_t root, gate_t event_root, unsigned n)
 Try to draw n exact samples from the conditional distribution of root given event_root via closed-form truncation, bypassing MC rejection.
bool circuitHasRV (const GenericCircuit &gc, gate_t root)
 Walk the circuit reachable from root looking for any gate_rv.
double parseDoubleStrict (const std::string &s)
 Strictly parse s as a double.
std::optional< DistributionSpecparse_distribution_spec (const std::string &s)
 Parse the on-disk text encoding of a gate_rv distribution.
std::string format_distribution_spec (const DistributionSpec &d)
 Format a spec back into its on-disk text encoding.
double analytical_mean (const DistributionSpec &d)
 Closed-form expectation E[X] for a basic distribution.
double analytical_variance (const DistributionSpec &d)
 Closed-form variance Var(X) for a basic distribution.
double analytical_raw_moment (const DistributionSpec &d, unsigned k)
 Closed-form raw moment \(E[X^k]\) for a basic distribution.
unsigned runRangeCheck (GenericCircuit &gc)
 Run the support-based pruning pass over gc.
std::pair< double, double > compute_support (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root=std::nullopt)
 Compute the [lo, hi] support interval of a scalar sub-circuit rooted at root.
std::optional< std::pair< double, double > > collectRvConstraints (const GenericCircuit &gc, gate_t event_root, gate_t target_rv)
 Walk event_root collecting rv op c constraints on target_rv.
static bool extract_finite_double (const GenericCircuit &gc, gate_t x, double &out)
 Parse a gate_value's extra as a finite float8.
static bool extract_mulinput_value (const GenericCircuit &gc, gate_t mul, double &out)
 Same parsing applied to a mulinput's outcome label (categorical).
std::optional< TruncatedSingleRvmatchTruncatedSingleRv (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root)
 Detect a closed-form, optionally-truncated single-RV shape.
bool eventIsProvablyInfeasible (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root)
 True iff the conditioning event is provably infeasible for a bare gate_rv root.
static std::optional< double > shape_mass (const ClosedFormShape &s, double lo, double hi)
 Unconditional probability mass of a shape over the interval [lo, hi].
static std::optional< ClosedFormShapetruncateShape (const ClosedFormShape &s, double lo, double hi)
 Conditional shape after truncating the underlying variable to [lo, hi].
std::optional< ClosedFormShapematchClosedFormDistribution (const GenericCircuit &gc, gate_t root, std::optional< gate_t > event_root)
 Detect any of the closed-form shapes supported by rv_analytical_curves.

Typedef Documentation

◆ ClosedFormShape

Initial value:
std::variant<TruncatedSingleRv,
Bernoulli mixture (gate_mixture with the [p_token, x_token, y_token] shape).
Definition RangeCheck.h:217
Categorical distribution over a finite outcome set.
Definition RangeCheck.h:188
Point mass at a finite scalar value (a gate_value root, or an as_random(c) leaf surfaced as a gate_va...
Definition RangeCheck.h:172
Detection result for a closed-form, optionally-truncated single-RV shape.
Definition RangeCheck.h:101

One of the closed-form shapes the analytical-curves payload can render: bare RV (continuous PDF/CDF), Dirac (point mass), categorical (multiple point masses), or Bernoulli mixture of any two of the above.

Definition at line 200 of file RangeCheck.h.

Enumeration Type Documentation

◆ DistKind

enum class provsql::DistKind
strong

Continuous distribution kinds supported by gate_rv.

Enumerator
Normal 

Normal (Gaussian): p1=μ, p2=σ

Uniform 

Uniform on [a,b]: p1=a, p2=b.

Exponential 

Exponential: p1=λ, p2 unused.

Erlang 

Erlang: p1=k (positive integer), p2=λ.

Sum of k i.i.d. Exp(λ); support [0, +∞). The parameter k is stored in p1 as a double but must be integer-valued ≥ 1 for any consumer that uses the finite-sum CDF / sampler.

Definition at line 28 of file RandomVariable.h.

Function Documentation

◆ analytical_mean()

double provsql::analytical_mean ( const DistributionSpec & d)

Closed-form expectation E[X] for a basic distribution.

  • Normal(μ, σ): μ
  • Uniform(a, b): (a + b) / 2
  • Exponential(λ): 1 / λ
  • Erlang(k, λ): k / λ

Definition at line 113 of file RandomVariable.cpp.

Here is the caller graph for this function:

◆ analytical_raw_moment()

double provsql::analytical_raw_moment ( const DistributionSpec & d,
unsigned k )

Closed-form raw moment \(E[X^k]\) for a basic distribution.

  • Normal(μ, σ): \(\sum_{j=0,2,\ldots}^{k} \binom{k}{j} \mu^{k-j} \sigma^j (j-1)!!\) (odd- \(j\) terms vanish since central moments of \(N(0, \sigma)\) are zero for odd \(j\)).
  • Uniform(a, b): \((b^{k+1} - a^{k+1}) / ((k+1)(b-a))\).
  • Exponential(λ): \(k! / \lambda^k\).
  • Erlang(s, λ): \(\Gamma(s+k) / (\Gamma(s) \lambda^k) = s(s+1)\cdots(s+k-1) / \lambda^k\) for integer shape \(s\).

Returns 1 for \(k = 0\) and analytical_mean for \(k = 1\).

Definition at line 183 of file RandomVariable.cpp.

Here is the call graph for this function:

◆ analytical_variance()

double provsql::analytical_variance ( const DistributionSpec & d)

Closed-form variance Var(X) for a basic distribution.

  • Normal(μ, σ): σ²
  • Uniform(a, b): (b − a)² / 12
  • Exponential(λ): 1 / λ²
  • Erlang(k, λ): k / λ²

Definition at line 124 of file RandomVariable.cpp.

◆ cdfAt()

double provsql::cdfAt ( const DistributionSpec & d,
double c )

Closed-form CDF \(F_X(c) = P(X \le c)\) for a basic continuous distribution.

Returns the cumulative distribution at c for the distribution d. Used internally by AnalyticEvaluator's gate_cmp resolution and by the HybridEvaluator decomposer's monotone-shared-scalar fast path to compute interval probabilities analytically (no MC noise) when the shared scalar is a bare gate_rv. Returns NaN when d carries a parameter shape the CDF doesn't cover (e.g. non-integer Erlang shape, which would require the regularised lower incomplete gamma function).

  • Normal(μ, σ): \(\Phi((c - \mu) / \sigma)\) via std::erf.
  • Uniform(a, b): piecewise linear; 0 for c<=a, 1 for c>=b, (c - a) / (b - a) otherwise.
  • Exponential(λ): 1 - exp(-λc) for c>0; 0 for c<=0.
  • Erlang(k, λ) (integer k≥1): finite-sum form \(1 - e^{-\lambda c} \sum_{n=0}^{k-1} (\lambda c)^n / n!\) for c>0.

Definition at line 70 of file AnalyticEvaluator.cpp.

Here is the caller graph for this function:

◆ circuitHasRV()

bool provsql::circuitHasRV ( const GenericCircuit & gc,
gate_t root )

Walk the circuit reachable from root looking for any gate_rv.

Used by probability_evaluate to dispatch between the existing BooleanCircuit path and the RV-aware sampler in this file.

Definition at line 614 of file MonteCarloSampler.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ collectRvConstraints()

std::optional< std::pair< double, double > > provsql::collectRvConstraints ( const GenericCircuit & gc,
gate_t event_root,
gate_t target_rv )

Walk event_root collecting rv op c constraints on target_rv.

Descends through AND-conjunct factors (gate_times, gate_one, Boolean leaves whose footprint doesn't include target_rv – these are independent of the RV and contribute no truncation) collecting every gate_cmp interpretable as target_rv op c for a constant c, and intersects them into a running interval seeded with the unconditional support of target_rv.

Returns the resulting interval as (lo, hi), or std::nullopt if the walk found a structure that defeats the recognisers (a gate_plus / gate_monus disjunction over the chain, a cmp shape other than rv op const, ...). Callers treat std::nullopt as "fall back to the unconditional case" – sound for support and MC fallback for moments.

Constraints on RVs other than target_rv are ignored; they affect P(A) but not the truncation of the target's distribution.

Definition at line 1097 of file RangeCheck.cpp.

Here is the caller graph for this function:

◆ compute_central_moment()

double provsql::compute_central_moment ( const GenericCircuit & gc,
gate_t root,
unsigned k,
std::optional< gate_t > event_root = std::nullopt )

Compute the central moment \(E[(X - E[X])^k]\) (or \(E[(X - E[X \mid A])^k \mid A]\) if event_root is set).

k = 0 returns 1; k = 1 returns 0; k = 2 returns compute_variance. Higher orders are obtained by binomial expansion in terms of the raw moments returned by compute_raw_moment, which inherits the analytical / MC dispatch described above.

Definition at line 911 of file Expectation.cpp.

Here is the caller graph for this function:

◆ compute_expectation()

double provsql::compute_expectation ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root = std::nullopt )

Compute \(E[X]\) (or \(E[X \mid A]\) if event_root is set) over the scalar sub-circuit rooted at root.

The conditional path requires that event_root be a gate_t in the same GenericCircuit as root – typically the circuit was loaded via getJointCircuit so a shared gate_rv between root and event has one gate_t, which is what couples the MC sampler. When event_root is std::nullopt the unconditional path (existing analytical decomposition with MC fallback) is taken unchanged.

Exceptions
CircuitExceptionon malformed circuits, unknown distribution kinds, when provsql.rv_mc_samples is 0 and a sub-expression cannot be decomposed analytically, or when the conditional MC pass accepts too few samples (suggesting P(A) is very small or zero).

Definition at line 884 of file Expectation.cpp.

Here is the caller graph for this function:

◆ compute_raw_moment()

double provsql::compute_raw_moment ( const GenericCircuit & gc,
gate_t root,
unsigned k,
std::optional< gate_t > event_root = std::nullopt )

Compute the raw moment \(E[X^k]\) (or \(E[X^k \mid A]\) if event_root is set) for k >= 0.

k = 0 returns 1; k = 1 delegates to compute_expectation.

Definition at line 902 of file Expectation.cpp.

Here is the caller graph for this function:

◆ compute_support()

std::pair< double, double > provsql::compute_support ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root = std::nullopt )

Compute the [lo, hi] support interval of a scalar sub-circuit rooted at root.

Same interval-arithmetic propagation runRangeCheck uses internally, exposed for the SQL support() function:

  • gate_value: point [c, c].
  • gate_rv: distribution support (uniform exact, exponential on [0, +∞), normal on (-∞, +∞)).
  • gate_arith: propagated through +, , ×, /, unary .

Anything else collapses to the conservative all-real interval (-∞, +∞). Never throws on unrecognised gates – callers receive the wide interval instead, which is the right semantic for "we cannot prove a tighter bound".

When event_root is set, the returned interval is the intersection of the unconditional support with the per-RV constraints implied by the AND-conjunct chain rooted at the event (rv op c cmps over root collected via the same walker runRangeCheck uses for joint feasibility). Constraints we cannot interpret are silently skipped: the result is then a conservative superset of the true conditional support, never a subset.

Definition at line 1064 of file RangeCheck.cpp.

Here is the caller graph for this function:

◆ compute_variance()

double provsql::compute_variance ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root = std::nullopt )

Compute \(\mathrm{Var}[X]\) (or \(\mathrm{Var}[X \mid A]\) if event_root is set) over the scalar sub-circuit rooted at root.

Same exception contract as compute_expectation.

Definition at line 893 of file Expectation.cpp.

◆ evaluateBooleanProbability()

double provsql::evaluateBooleanProbability ( const GenericCircuit & gc,
gate_t boolRoot )

Probability that the Boolean subcircuit rooted at boolRoot evaluates to true under the tuple-independent probabilistic-database model.

Tries BooleanCircuit::independentEvaluation first; if that throws (e.g. the subcircuit is not disconnected for that method), falls back to Monte Carlo with provsql.rv_mc_samples samples. Used by the mixture moment evaluators for compound Boolean Bernoulli wires.

Definition at line 36 of file Expectation.cpp.

Here is the call graph for this function:

◆ eventIsProvablyInfeasible()

bool provsql::eventIsProvablyInfeasible ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root )

True iff the conditioning event is provably infeasible for a bare gate_rv root.

Distinguishes "event proved infeasible" (event resolves to gate_zero, or collectRvConstraints intersects to an empty interval) from "shape unsupported by @c matchTruncatedSingleRv" (return std::nullopt that just means "fall back to MC").

Used by the conditional-moment dispatcher to raise an explicit infeasibility error before falling through to MC rejection — MC would still detect the same condition by accepting 0 of N samples, but the closed-form predicate spots it without ten thousand wasted draws and emits a tighter message.

Returns false for non-gate_rv roots and for roots whose event/support pair is not provably infeasible by this cheap pass (the caller can still proceed to MC).

Definition at line 1215 of file RangeCheck.cpp.

Here is the call graph for this function:

◆ extract_finite_double()

bool provsql::extract_finite_double ( const GenericCircuit & gc,
gate_t x,
double & out )
static

Parse a gate_value's extra as a finite float8.

Sibling of extract_constant_double in having_semantics.cpp but with a const GenericCircuit ref (used in the closed-form shape detector path). Bails on NaN / ±Infinity so a downstream stem renderer never sees a non-finite x coordinate.

Definition at line 1136 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ extract_mulinput_value()

bool provsql::extract_mulinput_value ( const GenericCircuit & gc,
gate_t mul,
double & out )
static

Same parsing applied to a mulinput's outcome label (categorical).

Definition at line 1154 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ format_distribution_spec()

std::string provsql::format_distribution_spec ( const DistributionSpec & d)

Format a spec back into its on-disk text encoding.

Inverse of parse_distribution_spec: round-trip safe up to the precision of std::to_string for double.

Definition at line 95 of file RandomVariable.cpp.

◆ matchClosedFormDistribution()

std::optional< ClosedFormShape > provsql::matchClosedFormDistribution ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root )

Detect any of the closed-form shapes supported by rv_analytical_curves.

Generalisation of matchTruncatedSingleRv that adds Bernoulli mixtures, categoricals, and Dirac (scalar gate_value) roots. Conditioning (event_root) is honoured for bare RV roots only; Dirac / categorical / mixture roots bail when the event isn't gate_one (the post-load-simplification "always true" default).

Returns std::nullopt when none of the supported shapes match; callers fall back to histogram-only rendering.

Definition at line 1350 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ matchTruncatedSingleRv()

std::optional< TruncatedSingleRv > provsql::matchTruncatedSingleRv ( const GenericCircuit & gc,
gate_t root,
std::optional< gate_t > event_root )

Detect a closed-form, optionally-truncated single-RV shape.

Common shape-detection helper shared by every closed-form single-RV consumer:

Returns std::nullopt when the shape is not tractable:

  • root is not a bare gate_rv;
  • the gate's extra does not parse as a DistributionSpec;
  • event_root resolves to gate_zero (event already decided infeasible by runRangeCheck);
  • collectRvConstraints fails (incomplete walk);
  • the resulting interval is empty or degenerate (lo >= hi).

When event_root is omitted or resolves to gate_one, the returned TruncatedSingleRv carries the RV's natural support and truncated = false; callers that don't distinguish the conditional and unconditional cases (e.g. the analytical-curves x-range chooser) can read uniformly off the result.

Definition at line 1172 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ monteCarloConditionalScalarSamples()

ConditionalScalarSamples provsql::monteCarloConditionalScalarSamples ( const GenericCircuit & gc,
gate_t root,
gate_t event_root,
unsigned samples )

Rejection-sample root conditioned on event_root.

For each of samples iterations, the shared Sampler resets its per-iteration cache, then:

  1. evaluates event_root as a Boolean (populating bool_cache_ and scalar_cache_ for every gate_rv / gate_input touched);
  2. if the indicator is true, evaluates root as a scalar using the SAME caches, so any shared gate_t leaf produces one draw that the indicator and the value both observe;
  3. otherwise rejects the iteration.

This coupling is the entire point of routing the conditional path through one joint circuit: a gate_rv reachable from both root and event_root has the same gate_t and therefore shares its per-iteration draw between the indicator (which decides acceptance) and the value (which we record). The accepted draws are samples from the conditional distribution \(X \mid A\) where X = root and A = event_root.

Parameters
gcCircuit (typically from getJointCircuit).
rootScalar gate whose value we sample.
event_rootBoolean gate that the iteration must satisfy.
samplesNumber of iterations to attempt.

Definition at line 438 of file MonteCarloSampler.cpp.

Here is the caller graph for this function:

◆ monteCarloJointDistribution()

std::vector< double > provsql::monteCarloJointDistribution ( const GenericCircuit & gc,
const std::vector< gate_t > & cmps,
unsigned samples )

Estimate the joint distribution of cmps via Monte Carlo.

For each of samples worlds, samples the underlying continuous island once (shared gate_rv leaves use the same per-iteration draw, per monteCarloRV's evalScalar) and evaluates each comparator in cmps; the k = cmps.size() resulting bits form a single word w with bit i = result of cmps[i]. The returned vector has size 2^k; entry w is the empirical probability that the joint outcome w occurred.

Used by the multi-cmp half of the hybrid evaluator's island decomposer to inline a categorical distribution over the k cmps that share an island; cmps must all sit over a continuous island whose scalar evaluation reuses common gate_rv leaves so the cmp draws are correctly correlated.

k is capped at 30 (the result vector size is 2^30) to keep memory bounded; the decomposer enforces a much tighter cap (k_max in HybridEvaluator.cpp) so this is purely a safety limit. Throws CircuitException above the cap.

Parameters
gcThe circuit.
cmpsThe comparators jointly evaluated.
samplesNumber of independent worlds.
Returns
Vector of joint probabilities, indexed by the bit word w (bit i = cmps[i] outcome).

Definition at line 381 of file MonteCarloSampler.cpp.

◆ monteCarloRV()

double provsql::monteCarloRV ( const GenericCircuit & gc,
gate_t root,
unsigned samples )

Run Monte Carlo on a circuit that may contain gate_rv leaves.

Parameters
gcThe circuit (loaded from the mmap store via CircuitFromMMap).
rootGate to evaluate as a Boolean expression.
samplesNumber of independent worlds to sample.
Returns
Estimated probability that root is true.
Exceptions
CircuitExceptionon malformed circuits (unknown gate kind in a Boolean position, malformed extra, unknown comparison operator, etc.).

Definition at line 363 of file MonteCarloSampler.cpp.

Here is the caller graph for this function:

◆ monteCarloScalarSamples()

std::vector< double > provsql::monteCarloScalarSamples ( const GenericCircuit & gc,
gate_t root,
unsigned samples )

Sample a scalar sub-circuit samples times and return the draws.

root must yield a scalar (gate_value, gate_rv, or gate_arith over scalar children); otherwise a CircuitException is thrown. Each iteration uses a fresh per-iteration memo cache so that repeated occurrences of the same gate_rv UUID inside an arithmetic expression share their draw within an iteration but not across iterations.

The RNG is seeded from provsql.monte_carlo_seed exactly like monteCarloRV; pinning the GUC makes the returned vector reproducible.

Used as the universal MC fallback by the analytical evaluators (Expectation, HybridEvaluator) when structural shortcuts cannot decide a sub-expression. Returning the raw draws (rather than a single statistic) lets callers compute any combination of moments from a single sampling pass.

Definition at line 419 of file MonteCarloSampler.cpp.

Here is the caller graph for this function:

◆ parse_distribution_spec()

std::optional< DistributionSpec > provsql::parse_distribution_spec ( const std::string & s)

Parse the on-disk text encoding of a gate_rv distribution.

Accepts "normal:μ,σ", "uniform:a,b", "exponential:λ", and "erlang:k,λ", with parameters parseable as double. Whitespace around the kind name and parameters is tolerated.

Parameters
sThe byte string read from MMappedCircuit::getExtra.
Returns
The parsed spec, or std::nullopt on malformed input.

Definition at line 59 of file RandomVariable.cpp.

Here is the caller graph for this function:

◆ parseDoubleStrict()

double provsql::parseDoubleStrict ( const std::string & s)

Strictly parse s as a double.

Used by every consumer that has to interpret the extra byte string of a gate_value: the sampler when sampling a constant leaf, the interval-arith pass when bounding a constant leaf, and any future scalar-evaluation pass. Lives here (rather than next to one specific consumer) so the parsing convention is shared.

Exceptions
CircuitExceptionon empty input, non-numeric input, or trailing characters past the parsed double.

Definition at line 17 of file RandomVariable.cpp.

Here is the caller graph for this function:

◆ pdfAt()

double provsql::pdfAt ( const DistributionSpec & d,
double c )

Closed-form probability density \(f(c)\) for a basic distribution.

Used by rv_analytical_curves to ship a sampled curve to clients (Studio's Distribution profile overlay). Returns 0 outside the natural support and NaN for parameter shapes the analytical form doesn't cover (e.g. non-integer Erlang shape).

  • Normal(μ, σ): \(\frac{1}{\sigma\sqrt{2\pi}} \exp(-(c-\mu)^2 / (2\sigma^2))\).
  • Uniform(a, b): 1/(b-a) for a<=c<=b, 0 otherwise.
  • Exponential(λ): λ·exp(-λc) for c>=0, 0 otherwise.
  • Erlang(k, λ) (integer k>=1): \(\frac{\lambda^k c^{k-1} e^{-\lambda c}}{(k-1)!}\) for c>=0, 0 otherwise.

Definition at line 21 of file AnalyticEvaluator.cpp.

◆ runAnalyticEvaluator()

unsigned provsql::runAnalyticEvaluator ( GenericCircuit & gc)

Run the closed-form CDF resolution pass over gc.

For every gate_cmp in the circuit whose two sides match one of the supported shapes (see the header docstring), computes the comparator's probability analytically and replaces the cmp by a Bernoulli gate_input via GenericCircuit::resolveCmpToBernoulli.

Parameters
gcCircuit to mutate in place.
Returns
Number of comparators resolved by this pass.

Definition at line 300 of file AnalyticEvaluator.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ runConstantFold()

unsigned provsql::runConstantFold ( GenericCircuit & gc)

Constant-fold pass over every gate_arith in gc.

Walks the circuit bottom-up and replaces any gate_arith whose children all evaluate to scalar constants with the equivalent gate_value (e.g. arith(NEG, value:2) becomes value:-2, arith(PLUS, value:1, value:2) becomes value:3).

Strictly a subset of runHybridSimplifier (only try_eval_constant fires; no family closures, identity drops, or mixture lifts), and therefore safe to run at load time alongside runRangeCheck and foldSemiringIdentities: the resulting gate_value gates carry no random identity, so no consumer's shared-RV coupling is broken by the rewrite. The family closures stay behind the separate hybrid_evaluation GUC because they replace a multi-leaf subtree with a fresh gate_rv UUID and would decouple shared base RVs that other parts of the circuit reference.

Lifts the -c::random_variable parser quirk (which builds an arith(NEG, value:c) gate rather than value:-c) into a clean gate_value before downstream consumers like collectRvConstraints / asRvVsConstCmp inspect the circuit.

Returns
Number of gates rewritten.

Definition at line 1255 of file HybridEvaluator.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ runCountCmpEvaluator()

unsigned provsql::runCountCmpEvaluator ( GenericCircuit & gc)

Run the Poisson-binomial pre-pass over gc.

For every gate_cmp whose shape matches the first-slice scope (see file docstring), computes the comparator's probability by Poisson-binomial CDF and replaces the cmp by a Bernoulli gate_input via GenericCircuit::resolveCmpToBernoulli.

Parameters
gcCircuit to mutate in place.
Returns
Number of comparators resolved by this pass.

Definition at line 279 of file CountCmpEvaluator.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ runHybridDecomposer()

unsigned provsql::runHybridDecomposer ( GenericCircuit & gc,
unsigned samples )

Marginalise unresolved continuous-island gate_cmp gates into Bernoulli gate_input leaves.

Runs after RangeCheck, the simplifier (runHybridSimplifier), and AnalyticEvaluator have done what they can. Picks up the residual comparators whose two sides are an entirely continuous island (subtree of gate_value, gate_rv, gate_arith with no Boolean structure underneath) but whose specific shape is not one the analytic CDF resolver handles — e.g. Normal + Uniform > 0, heterogeneous-rate sums of exponentials, or other compositions the simplifier could not fold to a bare distribution leaf.

Each qualifying comparator is marginalised by drawing samples worlds and applying the comparator scalar-by-scalar; the empirical probability replaces the gate_cmp via resolveCmpToBernoulli. The circuit downstream becomes purely Boolean, so the existing independent / tree-decomposition / compilation methods become available on circuits that would otherwise have to fall through to whole-circuit MC.

Singleton groups are marginalised into a single gate_input via GenericCircuit::resolveCmpToBernoulli.

Multi-cmp shared-island groups (k comparators sharing one or more base gate_rv leaves, detected via pairwise footprint overlap with union-find) are resolved by inlining a 2^k joint distribution table:

  • One anonymous gate_input acts as the block key.
  • One gate_mulinput per joint outcome with positive probability, all sharing the key, carries the joint mass (mutually-exclusive block).
  • Each comparator is rewritten as gate_plus over the mulinputs whose joint outcome word has the comparator's bit set. The downstream OR over the rewritten comparators thereby observes the dependent joint distribution: mulinputs across comparators dedup at OR sites in BooleanCircuit::independentEvaluationInternal (or are Bayesian-tree-rewritten by rewriteMultivaluedGates before tree-decomposition / monte-carlo / external compilers). Groups with k > JOINT_TABLE_K_MAX (currently 8, i.e. 256 outcomes) fall through to whole-circuit MC to keep the materialisation bounded.
Parameters
gcCircuit to mutate in place.
samplesNumber of MC iterations used per marginalisation. Callers typically pass provsql_rv_mc_samples.
Returns
Number of comparators resolved by this pass.

Definition at line 1777 of file HybridEvaluator.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ runHybridSimplifier()

unsigned provsql::runHybridSimplifier ( GenericCircuit & gc)

Run the peephole simplifier over gc.

Visits every gate in post-order and applies the closure rules described in the header comment until a fixed point is reached.

Parameters
gcCircuit to mutate in place.
Returns
Number of gate rewrites performed by this pass.

Definition at line 1277 of file HybridEvaluator.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ runRangeCheck()

unsigned provsql::runRangeCheck ( GenericCircuit & gc)

Run the support-based pruning pass over gc.

For every gate_cmp in the circuit, computes the interval of (lhs - rhs) via interval arithmetic over gate_value, gate_rv, and gate_arith leaves; when the interval is provably above, below, or disjoint from zero, replaces the gate_cmp by a Bernoulli gate_input carrying the decided probability (0 or 1).

Comparators whose interval is inconclusive (overlaps zero) are left intact for downstream passes.

Iterates every gate (rather than walking from a specific root) so that a single sweep at getGenericCircuit time benefits every downstream consumer regardless of which sub-circuit they later traverse.

Parameters
gcCircuit to mutate in place.
Returns
Number of comparators resolved by this pass.

Definition at line 854 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ shape_mass()

std::optional< double > provsql::shape_mass ( const ClosedFormShape & s,
double lo,
double hi )
static

Unconditional probability mass of a shape over the interval [lo, hi].

TruncatedSingleRv arms supplied here must carry truncated == false (the unconditional shape); the helper uses the natural support to compute the CDF endpoints, so calling with an already-truncated input would double-truncate.

Recursive: a Bernoulli mixture's mass is the Bernoulli-weighted combination of its arms' masses. Categorical mass is the sum of outcome masses falling in the interval. Dirac mass is 1 iff the Dirac value sits in the interval, else 0. Returns std::nullopt when a leaf's spec defeats the closed-form CDF (e.g. non-integer Erlang shape — cdfAt returns NaN there).

Definition at line 1253 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ truncateShape()

std::optional< ClosedFormShape > provsql::truncateShape ( const ClosedFormShape & s,
double lo,
double hi )
static

Conditional shape after truncating the underlying variable to [lo, hi].

Bare-RV arm: intersects its natural / current truncation with [lo, hi] and marks the result truncated so downstream shape_pdf renormalises by the truncated CDF. Dirac: keep iff value ∈ interval, otherwise nullopt (infeasible). Categorical: keep outcomes in interval, renormalise masses. Bernoulli mixture: recursively truncate each arm and reweight the Bernoulli by the ratio of arm masses (the standard \( \pi' = \pi Z_L / (\pi Z_L + (1-\pi) Z_R) \) update); a fully-eliminated arm degenerates to the surviving one. Returns nullopt when the truncated shape has zero mass (caller can raise infeasibility).

Definition at line 1299 of file RangeCheck.cpp.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ try_truncated_closed_form_sample()

std::optional< std::vector< double > > provsql::try_truncated_closed_form_sample ( const GenericCircuit & gc,
gate_t root,
gate_t event_root,
unsigned n )

Try to draw n exact samples from the conditional distribution of root given event_root via closed-form truncation, bypassing MC rejection.

Fires only when root is a bare gate_rv whose family admits a closed-form truncation (Uniform / Exponential / Normal) and collectRvConstraints can extract a sound interval from event_root. Other shapes (arith composites, mixtures, Erlang, un-extractable events) return std::nullopt so the caller can fall back to monteCarloConditionalScalarSamples.

Sampling kernels:

  • Uniform(a, b): collectRvConstraints already intersects with [a, b], so the draw is a plain U(lo, hi) on the intersected interval. 100% acceptance.
  • Exponential(λ), one-sided X > c: memorylessness yields c + Exp(λ). Two-sided lo < X < hi: inverse-CDF via std::log1p / std::expm1 for numerical accuracy near the support boundary.
  • Normal(μ, σ): inverse-CDF transform. Forward CDF uses std::erf (matching AnalyticEvaluator::cdfAt); inverse uses the Beasley-Springer-Moro rational approximation (~1e-7 accuracy, ample for sampling).

Empty / degenerate truncations (lo >= hi after intersection) also return std::nullopt so the caller's MC fallback can emit its usual "accepted 0" diagnostic.

The RNG is seeded from provsql.monte_carlo_seed identically to monteCarloScalarSamples, so a pinned seed gives reproducible output on either path.

Definition at line 532 of file MonteCarloSampler.cpp.

Here is the call graph for this function:
Here is the caller graph for this function: