ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
How.h
Go to the documentation of this file.
1/**
2 * @file semiring/How.h
3 * @brief How-provenance m-semiring (canonical polynomial provenance ℕ[X]).
4 *
5 * The **how-provenance** semiring is the universal commutative semiring
6 * for provenance [Green, Karvounarakis, Tannen, *Provenance Semirings*,
7 * PODS'07]: every commutative-semiring provenance value can be obtained
8 * from the how-provenance polynomial by the unique homomorphism that
9 * substitutes each input label with its semiring image.
10 *
11 * The carrier is the multivariate polynomial semiring
12 * @f$\mathbb{N}[X]@f$ over the set @f$X@f$ of base-tuple labels: each
13 * value is a *multiset of multisets* of labels, equivalently a sum of
14 * monomials with non-negative integer coefficients. We store it in
15 * canonical sum-of-products form as
16 * @c std::map<std::map<std::string,unsigned>,unsigned>: the outer map
17 * sends a monomial (variable→exponent) to its coefficient, and ordered
18 * containers give canonical equality on semantically-equal polynomials,
19 * which is the value-add over @c sr_formula.
20 *
21 * Operations:
22 * - @c zero() → @f$0@f$ (empty polynomial)
23 * - @c one() → @f$1@f$ (the constant 1, monomial of degree 0)
24 * - @c plus() → polynomial addition (coefficient-wise sum)
25 * - @c times() → polynomial multiplication (Cauchy product)
26 * - @c monus() → coefficient-wise truncated subtraction
27 * (per the Lean @c coeff_sub theorem)
28 * - @c delta() → @f$0@f$ if @f$x = 0@f$, else @f$1@f$ (support
29 * indicator, mirroring @c Counting::delta)
30 *
31 * This semiring is **not** idempotent and **not** absorptive
32 * (@c absorptive() returns @c false), so the evaluator does not
33 * deduplicate plus operands. It also does not satisfy
34 * left-distributivity of multiplication over monus, contradicting an
35 * earlier claim in the literature; see Lean @c How.not_idempotent,
36 * @c How.not_absorptive, and @c How.not_mul_sub_left_distributive.
37 *
38 * @see https://provsql.org/lean-docs/Provenance/Semirings/How.html
39 * Lean 4 verified instance: the @c MvPolynomial X ℕ
40 * @c SemiringWithMonus instance, with proofs of @c How.universal,
41 * @c How.not_idempotent, @c How.not_absorptive, and
42 * @c How.not_mul_sub_left_distributive.
43 */
44#ifndef HOW_H
45#define HOW_H
46
47#include <cstring>
48#include <map>
49#include <sstream>
50#include <string>
51#include <vector>
52
53#include "Semiring.h"
54
55namespace semiring {
56
57/** @brief A single label identifying a base tuple. */
58using how_label_t = std::string;
59/** @brief A monomial: each variable mapped to its (positive) exponent. */
60using how_monomial_t = std::map<how_label_t, unsigned>;
61/** @brief How-provenance value: each monomial mapped to its (positive) coefficient. */
62using how_provenance_t = std::map<how_monomial_t, unsigned>;
63
64/**
65 * @brief How-provenance m-semiring over @f$\mathbb{N}[X]@f$.
66 *
67 * Each gate evaluates to a @c how_provenance_t, a polynomial in
68 * canonical sum-of-products form. Two semantically-equal polynomials
69 * compare equal under @c std::map's lexicographic equality, so this
70 * semiring supports provenance-aware query equivalence.
71 */
72class How : public Semiring<how_provenance_t> {
73public:
74value_type zero() const override {
75 return {};
76}
77
78value_type one() const override {
79 return { { how_monomial_t{}, 1u } };
80}
81
82value_type plus(const std::vector<value_type> &vec) const override {
83 value_type result;
84 for (const auto &p : vec) {
85 for (const auto &[mono, coeff] : p) {
86 result[mono] += coeff;
87 }
88 }
89 return result;
90}
91
92value_type times(const std::vector<value_type> &vec) const override {
93 if (vec.empty()) return one();
94
95 value_type result = vec[0];
96 for (size_t i = 1; i < vec.size(); ++i) {
97 value_type next;
98 for (const auto &[mono1, c1] : result) {
99 for (const auto &[mono2, c2] : vec[i]) {
100 how_monomial_t combined = mono1;
101 for (const auto &[var, exp] : mono2) {
102 combined[var] += exp;
103 }
104 next[combined] += c1 * c2;
105 }
106 }
107 result = std::move(next);
108 }
109 return result;
110}
111
112value_type monus(value_type x, value_type y) const override {
113 for (const auto &[mono, c2] : y) {
114 auto it = x.find(mono);
115 if (it == x.end()) continue;
116 if (it->second <= c2)
117 x.erase(it);
118 else
119 it->second -= c2;
120 }
121 return x;
122}
123
124value_type delta(value_type x) const override {
125 return x.empty() ? zero() : one();
126}
127
128/**
129 * @brief No semiring homomorphism @c BoolFunc(Y) →+* ℕ[X] exists, so
130 * the safe-query Boolean rewrite is unsound under
131 * how-provenance (which is sensitive to derivation
132 * multiplicities and monomial structure that the read-once
133 * rewrite collapses). Inherits the @c false default from
134 * @c Semiring; this override exists for documentation.
135 *
136 * Lean: @c Provenance.Semirings.How.no_hom_from_BoolFunc
137 * (provenance-lean/Provenance/Semirings/How.lean).
138 */
139virtual bool compatibleWithBooleanRewrite() const override {
140 return false;
141}
142
143/**
144 * @brief Parse a leaf value into a how-provenance polynomial.
145 *
146 * Accepted input formats (round-trip with @c to_text):
147 * - @c "0" → zero polynomial
148 * - Bare label: @c "Alice" → @c {Alice}
149 * - Constant: @c "5" → @c 5 (polynomial @c 5⋅1)
150 * - Monomial: @c "2⋅Alice⋅Bob^2"
151 * - Sum of monomials: @c "2⋅Alice⋅Bob^2 + 3⋅Charlie"
152 *
153 * Variables may contain any character except @c '⋅', @c '+', @c '^'.
154 * The @c " + " separator must use ASCII spaces around the @c '+'.
155 */
156value_type parse_leaf(const char *v) const {
157 static const std::string DOT = "\xE2\x8B\x85"; // ⋅ U+22C5
158 static const std::string PLUS_SEP = " + ";
159
160 value_type result;
161 std::string s(v);
162
163 if(s == "0")
164 return result;
165
166 size_t pos = 0;
167 while(pos <= s.size()) {
168 size_t mono_end = s.find(PLUS_SEP, pos);
169 bool last = (mono_end == std::string::npos);
170 if(last)
171 mono_end = s.size();
172
173 std::string mono_s = s.substr(pos, mono_end - pos);
174 if(mono_s.empty())
175 throw SemiringException("How: empty monomial");
176
177 std::vector<std::string> parts;
178 size_t mp = 0;
179 while(mp <= mono_s.size()) {
180 size_t end = mono_s.find(DOT, mp);
181 bool plast = (end == std::string::npos);
182 if(plast) end = mono_s.size();
183 parts.push_back(mono_s.substr(mp, end - mp));
184 if(plast) break;
185 mp = end + DOT.size();
186 }
187
188 unsigned coeff = 1;
189 size_t first_factor = 0;
190 if(!parts.empty()) {
191 const std::string &first = parts[0];
192 bool is_num = !first.empty();
193 for(char c : first) {
194 if(c < '0' || c > '9') { is_num = false; break; }
195 }
196 if(is_num) {
197 coeff = static_cast<unsigned>(std::stoul(first));
198 first_factor = 1;
199 }
200 }
201
202 how_monomial_t mono;
203 for(size_t i = first_factor; i < parts.size(); ++i) {
204 const std::string &factor = parts[i];
205 if(factor.empty())
206 throw SemiringException("How: empty factor");
207 size_t caret = factor.find('^');
208 std::string var;
209 unsigned exp = 1;
210 if(caret == std::string::npos) {
211 var = factor;
212 } else {
213 var = factor.substr(0, caret);
214 if(var.empty())
215 throw SemiringException("How: empty variable name");
216 try {
217 exp = static_cast<unsigned>(std::stoul(factor.substr(caret + 1)));
218 } catch(...) {
219 throw SemiringException("How: invalid exponent");
220 }
221 }
222 if(var.empty())
223 throw SemiringException("How: empty variable name");
224 mono[var] += exp;
225 }
226
227 if(coeff > 0)
228 result[std::move(mono)] += coeff;
229
230 if(last) break;
231 pos = mono_end + PLUS_SEP.size();
232 }
233
234 return result;
235}
236
237std::string to_text(const value_type &prov) const {
238 std::ostringstream oss;
239 if (prov.empty()) {
240 oss << "0";
241 } else {
242 bool firstMono = true;
243 for (const auto &[mono, coeff] : prov) {
244 if (!firstMono) oss << " + ";
245 firstMono = false;
246 bool need_dot = false;
247 if (coeff != 1 || mono.empty()) {
248 oss << coeff;
249 need_dot = true;
250 }
251 for (const auto &[var, exp] : mono) {
252 if (need_dot) oss << "⋅";
253 need_dot = true;
254 oss << var;
255 if (exp != 1) oss << "^" << exp;
256 }
257 }
258 }
259 return oss.str();
260}
261
262};
263
264}
265
266#endif // HOW_H
Abstract semiring interface for provenance evaluation.
How-provenance m-semiring over .
Definition How.h:72
value_type times(const std::vector< value_type > &vec) const override
Apply the multiplicative operation to a list of values.
Definition How.h:92
virtual bool compatibleWithBooleanRewrite() const override
No semiring homomorphism BoolFunc(Y) →+* ℕ[X] exists, so the safe-query Boolean rewrite is unsound un...
Definition How.h:139
std::string to_text(const value_type &prov) const
Definition How.h:237
value_type monus(value_type x, value_type y) const override
Apply the monus (m-semiring difference) operation.
Definition How.h:112
value_type zero() const override
Return the additive identity .
Definition How.h:74
value_type parse_leaf(const char *v) const
Parse a leaf value into a how-provenance polynomial.
Definition How.h:156
value_type one() const override
Return the multiplicative identity .
Definition How.h:78
value_type plus(const std::vector< value_type > &vec) const override
Apply the additive operation to a list of values.
Definition How.h:82
value_type delta(value_type x) const override
Apply the operator.
Definition How.h:124
Exception thrown when a semiring operation is not supported.
Definition Semiring.h:51
Abstract base class for (m-)semirings.
Definition Semiring.h:93
std::map< how_label_t, unsigned > how_monomial_t
A monomial: each variable mapped to its (positive) exponent.
Definition How.h:60
std::string how_label_t
A single label identifying a base tuple.
Definition How.h:58
std::map< how_monomial_t, unsigned > how_provenance_t
How-provenance value: each monomial mapped to its (positive) coefficient.
Definition How.h:62