ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
RandomVariable.h
/**
 * @file RandomVariable.h
 * @brief Continuous random-variable helpers (distribution parsing, moments).
 *
 * Helpers for the @c gate_rv leaf introduced for continuous probabilistic
 * c-tables. A @c gate_rv stores its distribution name and parameters in
 * the gate's @c extra byte string using a small text encoding (e.g.
 * <tt>"normal:2.5,0.5"</tt>); these helpers parse and format that encoding,
 * and provide closed-form moments where they exist. Arithmetic over RV
 * expressions is built on the generic @c gate_arith gate (see
 * @c provsql_utils.h), which is shared with non-RV scalar arithmetic.
 *
 * Sampling itself lives in @c BooleanCircuit::monteCarlo and
 * @c MonteCarloSampler; this header only exposes what is needed for
 * parsing and analytical moment computation.
 */
#ifndef PROVSQL_RANDOM_VARIABLE_H
#define PROVSQL_RANDOM_VARIABLE_H

#include <optional>
#include <string>

namespace provsql {

/**
 * @brief Continuous distribution kinds supported by @c gate_rv.
 */
enum class DistKind {
  Normal,      ///< Normal (Gaussian): p1=μ, p2=σ
  Uniform,     ///< Uniform on [a,b]: p1=a, p2=b
  Exponential, ///< Exponential: p1=λ, p2 unused
  Erlang       ///< Erlang: p1=k (positive integer), p2=λ. Sum of k
               ///< i.i.d. @c Exp(λ); support @c [0, +∞). The
               ///< parameter @c k is stored in @c p1 as a double but
               ///< must be integer-valued ≥ 1 for any consumer that
               ///< uses the finite-sum CDF / sampler.
};

/**
 * @brief Parsed distribution spec (kind + up to two parameters).
 *
 * Stored in the @c extra byte string of a @c gate_rv as
 * <tt>"normal:μ,σ"</tt>, <tt>"uniform:a,b"</tt>, <tt>"exponential:λ"</tt>,
 * or <tt>"erlang:k,λ"</tt>.
 */
struct DistributionSpec {
  DistKind kind;
  double p1; ///< First parameter (μ, a, λ, or Erlang's k)
  double p2; ///< Second parameter (σ, b, or Erlang's λ; unused for Exponential)
};

/**
 * @brief Parse the on-disk text encoding of a @c gate_rv distribution.
 *
 * Accepts <tt>"normal:μ,σ"</tt>, <tt>"uniform:a,b"</tt>,
 * <tt>"exponential:λ"</tt>, and <tt>"erlang:k,λ"</tt>, with parameters
 * parseable as @c double. Whitespace around the kind name and
 * parameters is tolerated.
 *
 * @param s The byte string read from @c MMappedCircuit::getExtra.
 * @return The parsed spec, or @c std::nullopt on malformed input.
 */
std::optional<DistributionSpec> parse_distribution_spec(const std::string &s);

/**
 * @brief Format a spec back into its on-disk text encoding.
 *
 * Inverse of @c parse_distribution_spec: round-trip safe up to the
 * precision of @c std::to_string for @c double.
 */
std::string format_distribution_spec(const DistributionSpec &d);

/**
 * @brief Strictly parse @p s as a @c double.
 *
 * Used by every consumer that has to interpret the @c extra byte
 * string of a @c gate_value: the sampler when sampling a constant
 * leaf, the interval-arith pass when bounding a constant leaf, and
 * any future scalar-evaluation pass. Lives here (rather than next
 * to one specific consumer) so the parsing convention is shared.
 *
 * @throws CircuitException on empty input, non-numeric input, or
 *         trailing characters past the parsed double.
 */
double parseDoubleStrict(const std::string &s);

/**
 * @brief Closed-form expectation E[X] for a basic distribution.
 *
 * - Normal(μ, σ): μ
 * - Uniform(a, b): (a + b) / 2
 * - Exponential(λ): 1 / λ
 * - Erlang(k, λ): k / λ
 */
double analytical_mean(const DistributionSpec &d);

/**
 * @brief Closed-form variance Var(X) for a basic distribution.
 *
 * - Normal(μ, σ): σ²
 * - Uniform(a, b): (b − a)² / 12
 * - Exponential(λ): 1 / λ²
 * - Erlang(k, λ): k / λ²
 */
double analytical_variance(const DistributionSpec &d);

/**
 * @brief Closed-form raw moment @f$E[X^k]@f$ for a basic distribution.
 *
 * - Normal(μ, σ):
 *   @f$\sum_{j=0,2,\ldots}^{k} \binom{k}{j} \mu^{k-j} \sigma^j (j-1)!!@f$
 *   (odd-@f$j@f$ terms vanish since central moments of @f$N(0, \sigma)@f$
 *   are zero for odd @f$j@f$).
 * - Uniform(a, b): @f$(b^{k+1} - a^{k+1}) / ((k+1)(b-a))@f$.
 * - Exponential(λ): @f$k! / \lambda^k@f$.
 * - Erlang(s, λ):
 *   @f$\Gamma(s+k) / (\Gamma(s) \lambda^k) = s(s+1)\cdots(s+k-1) / \lambda^k@f$
 *   for integer shape @f$s@f$.
 *
 * Returns 1 for @f$k = 0@f$ and @c analytical_mean for @f$k = 1@f$.
 */
double analytical_raw_moment(const DistributionSpec &d, unsigned k);

} // namespace provsql

#endif // PROVSQL_RANDOM_VARIABLE_H
double analytical_variance(const DistributionSpec &d)
Closed-form variance Var(X) for a basic distribution.
double parseDoubleStrict(const std::string &s)
Strictly parse s as a double.
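
The strict-parsing contract documented for parseDoubleStrict (reject empty input, non-numeric input, and trailing characters) can be illustrated with a minimal sketch built on `std::stod`. This is not the ProvSQL implementation: the name `parse_double_strict_sketch` is hypothetical, and `std::invalid_argument` stands in for `CircuitException`, whose definition is not shown in this header.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for parseDoubleStrict. Throws std::invalid_argument
// where the real function is documented to throw CircuitException.
double parse_double_strict_sketch(const std::string &s) {
  if (s.empty())
    throw std::invalid_argument("empty input");
  std::size_t pos = 0;
  double v = 0.0;
  try {
    // pos receives the number of characters std::stod consumed.
    v = std::stod(s, &pos);
  } catch (const std::exception &) {
    // std::stod reports non-numeric or out-of-range input by throwing.
    throw std::invalid_argument("not a double: " + s);
  }
  if (pos != s.size())
    throw std::invalid_argument("trailing characters in: " + s);
  return v;
}

// Helper for the usage check: true when the sketch rejects s.
bool rejects(const std::string &s) {
  try {
    parse_double_strict_sketch(s);
  } catch (const std::invalid_argument &) {
    return true;
  }
  return false;
}

// Usage: accepted and rejected inputs under this sketch.
void check_strict_parse_sketch() {
  assert(parse_double_strict_sketch("2.5") == 2.5);
  assert(rejects(""));
  assert(rejects("abc"));
  assert(rejects("1.5x"));
}
```

Note that `std::stod` skips leading whitespace and also accepts spellings such as `inf` and hexadecimal floats, so the sketch does too; whether the real function accepts them is not stated in the header.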
std::optional<DistributionSpec> parse_distribution_spec(const std::string &s)
Parse the on-disk text encoding of a gate_rv distribution.
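
To make the `kind:p1,p2` text encoding concrete, here is a minimal parsing sketch. It assumes only what the header documents (four kind names, one parameter for `exponential`, two for the others, whitespace tolerated, `std::nullopt` on malformed input). The names `Kind`, `Spec`, and `parse_spec_sketch` are hypothetical mirrors of `DistKind`, `DistributionSpec`, and `parse_distribution_spec`, not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

enum class Kind { Normal, Uniform, Exponential, Erlang };  // mirrors DistKind
struct Spec { Kind kind; double p1; double p2; };          // mirrors DistributionSpec

// Trim ASCII spaces and tabs from both ends.
static std::string trim(const std::string &s) {
  const auto b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  const auto e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Parse a whole (trimmed) string as a double; nullopt on any failure.
static std::optional<double> to_double(const std::string &raw) {
  const std::string t = trim(raw);
  if (t.empty()) return std::nullopt;
  try {
    std::size_t pos = 0;
    const double v = std::stod(t, &pos);
    if (pos != t.size()) return std::nullopt;  // trailing characters
    return v;
  } catch (const std::exception &) {
    return std::nullopt;
  }
}

std::optional<Spec> parse_spec_sketch(const std::string &s) {
  const auto colon = s.find(':');
  if (colon == std::string::npos) return std::nullopt;
  const std::string name = trim(s.substr(0, colon));
  const std::string params = s.substr(colon + 1);

  if (name == "exponential") {  // single parameter: the rate
    const auto p1 = to_double(params);
    if (!p1) return std::nullopt;
    return Spec{Kind::Exponential, *p1, 0.0};  // p2 unused
  }

  Kind kind;
  if (name == "normal") kind = Kind::Normal;
  else if (name == "uniform") kind = Kind::Uniform;
  else if (name == "erlang") kind = Kind::Erlang;
  else return std::nullopt;  // unknown kind name

  const auto comma = params.find(',');
  if (comma == std::string::npos) return std::nullopt;
  const auto p1 = to_double(params.substr(0, comma));
  const auto p2 = to_double(params.substr(comma + 1));
  if (!p1 || !p2) return std::nullopt;
  return Spec{kind, *p1, *p2};
}

// Usage: the example encoding from the header, plus a malformed input.
void check_parse_sketch() {
  const auto r = parse_spec_sketch("normal:2.5,0.5");
  assert(r && r->kind == Kind::Normal && r->p1 == 2.5 && r->p2 == 0.5);
  assert(!parse_spec_sketch("normal:2.5"));
}
```

The sketch checks syntax only. It does not validate parameter ranges (for example a positive standard deviation or an integer-valued Erlang shape); the header does not say whether the real parser does.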
double analytical_mean(const DistributionSpec &d)
Closed-form expectation E[X] for a basic distribution.
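
The closed-form mean and variance listed in the header's documentation translate directly into a switch over the distribution kind. The following is a sketch under that assumption; `Kind`, `Spec`, `mean_sketch`, and `variance_sketch` are hypothetical names that mirror `DistKind`, `DistributionSpec`, `analytical_mean`, and `analytical_variance`.

```cpp
#include <cassert>
#include <cmath>

enum class Kind { Normal, Uniform, Exponential, Erlang };  // mirrors DistKind
struct Spec { Kind kind; double p1; double p2; };          // mirrors DistributionSpec

// E[X], following the formulas documented for analytical_mean.
double mean_sketch(const Spec &d) {
  switch (d.kind) {
    case Kind::Normal:      return d.p1;                  // mu
    case Kind::Uniform:     return (d.p1 + d.p2) / 2.0;   // (a + b) / 2
    case Kind::Exponential: return 1.0 / d.p1;            // 1 / lambda
    case Kind::Erlang:      return d.p1 / d.p2;           // k / lambda
  }
  return std::nan("");  // unreachable for valid enum values
}

// Var(X), following the formulas documented for analytical_variance.
double variance_sketch(const Spec &d) {
  switch (d.kind) {
    case Kind::Normal:      return d.p2 * d.p2;                            // sigma^2
    case Kind::Uniform:     return (d.p2 - d.p1) * (d.p2 - d.p1) / 12.0;   // (b - a)^2 / 12
    case Kind::Exponential: return 1.0 / (d.p1 * d.p1);                    // 1 / lambda^2
    case Kind::Erlang:      return d.p1 / (d.p2 * d.p2);                   // k / lambda^2
  }
  return std::nan("");  // unreachable for valid enum values
}

// Usage: Uniform on [0, 6] has mean 3 and variance 36 / 12 = 3.
void check_moments_sketch() {
  const Spec u{Kind::Uniform, 0.0, 6.0};
  assert(mean_sketch(u) == 3.0);
  assert(variance_sketch(u) == 3.0);
}
```

For the Normal case the second parameter is the standard deviation, so the variance is its square rather than the parameter itself.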
std::string format_distribution_spec(const DistributionSpec &d)
Format a spec back into its on-disk text encoding.
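
The round-trip caveat ("up to the precision of `std::to_string`") is easier to see with a small sketch. It assumes the formatter simply concatenates the kind name with `std::to_string` of each parameter, which the header's wording suggests but does not confirm; `Kind`, `Spec`, and `format_spec_sketch` are hypothetical names.

```cpp
#include <cassert>
#include <string>

enum class Kind { Normal, Uniform, Exponential, Erlang };  // mirrors DistKind
struct Spec { Kind kind; double p1; double p2; };          // mirrors DistributionSpec

// Build "kind:p1,p2" (or "exponential:p1") using std::to_string.
std::string format_spec_sketch(const Spec &d) {
  const std::string a = std::to_string(d.p1);
  const std::string b = std::to_string(d.p2);
  switch (d.kind) {
    case Kind::Normal:      return "normal:" + a + "," + b;
    case Kind::Uniform:     return "uniform:" + a + "," + b;
    case Kind::Exponential: return "exponential:" + a;  // p2 is unused
    case Kind::Erlang:      return "erlang:" + a + "," + b;
  }
  return "";  // unreachable for valid enum values
}

// Usage: the parameter text after "exponential:" parses back to the rate.
void check_format_sketch() {
  const std::string s = format_spec_sketch(Spec{Kind::Exponential, 2.5, 0.0});
  const std::string prefix = "exponential:";
  assert(s.compare(0, prefix.size(), prefix) == 0);
  assert(std::stod(s.substr(prefix.size())) == 2.5);
}
```

With the fixed six-decimal formatting that `std::to_string` used for `double` before C++26, a value such as 2.5 survives the round trip, but digits beyond the sixth decimal place are dropped, so very small parameters can come back changed. That is the sense in which the round trip is only safe "up to the precision" of the formatter.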
double analytical_raw_moment(const DistributionSpec &d, unsigned k)
Closed-form raw moment E[X^k] for a basic distribution.
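
The raw-moment formulas documented in the header can be checked numerically with a short sketch. It implements the four closed forms as written there (even-index binomial sum with double factorials for Normal, the power difference for Uniform, `k!/λ^k` for Exponential, and the rising product for Erlang). `Kind`, `Spec`, `binom`, and `raw_moment_sketch` are hypothetical names, not the library's code.

```cpp
#include <cassert>
#include <cmath>

enum class Kind { Normal, Uniform, Exponential, Erlang };  // mirrors DistKind
struct Spec { Kind kind; double p1; double p2; };          // mirrors DistributionSpec

// Binomial coefficient C(n, r) computed in floating point (requires r <= n).
static double binom(unsigned n, unsigned r) {
  double c = 1.0;
  for (unsigned i = 1; i <= r; ++i) c = c * (n - r + i) / i;
  return c;
}

// E[X^k], following the formulas documented for analytical_raw_moment.
double raw_moment_sketch(const Spec &d, unsigned k) {
  if (k == 0) return 1.0;
  switch (d.kind) {
    case Kind::Normal: {
      // Sum over even j of C(k,j) * mu^(k-j) * sigma^j * (j-1)!!,
      // with the convention (-1)!! = 1 for the j = 0 term.
      double sum = 0.0, dfact = 1.0;
      for (unsigned j = 0; j <= k; j += 2) {
        if (j >= 2) dfact *= (j - 1);
        sum += binom(k, j) * std::pow(d.p1, k - j) * std::pow(d.p2, j) * dfact;
      }
      return sum;
    }
    case Kind::Uniform:
      // (b^(k+1) - a^(k+1)) / ((k+1)(b-a))
      return (std::pow(d.p2, k + 1) - std::pow(d.p1, k + 1)) /
             ((k + 1) * (d.p2 - d.p1));
    case Kind::Exponential: {
      // k! / lambda^k, accumulated factor by factor.
      double m = 1.0;
      for (unsigned i = 1; i <= k; ++i) m *= i / d.p1;
      return m;
    }
    case Kind::Erlang: {
      // s(s+1)...(s+k-1) / lambda^k with shape s = p1 and rate lambda = p2.
      double m = 1.0;
      for (unsigned i = 0; i < k; ++i) m *= (d.p1 + i) / d.p2;
      return m;
    }
  }
  return std::nan("");  // unreachable for valid enum values
}

// Usage: the fourth moment of a standard normal is 3.
void check_raw_moment_sketch() {
  const Spec z{Kind::Normal, 0.0, 1.0};
  assert(std::fabs(raw_moment_sketch(z, 4) - 3.0) < 1e-12);
}
```

A useful sanity check is the identity E[X²] = Var(X) + E[X]²: for Erlang with shape 3 and rate 2, the second raw moment is 3·4/4 = 3, which matches 0.75 + 1.5².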