ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
Which.h
Go to the documentation of this file.
1/**
2 * @file semiring/Which.h
3 * @brief Which-provenance (lineage) m-semiring.
4 *
5 * The **which-provenance** semiring (also known as *lineage*) records,
6 * for each derivation of a result tuple, the set of base-tuple labels
7 * that contributed to it. Unlike why-provenance, only one such set is
8 * tracked per gate: the union of all witnesses across all derivations.
9 *
10 * Formally, the carrier set is
11 * @f$\mathcal{P}(\text{Labels}) \uplus \{\bot\}@f$, where @f$\bot@f$
12 * (the additive identity) represents "no derivation". Implemented as
13 * @c which_provenance_t = @c std::optional<std::set<std::string>>
14 * (an empty optional is @f$\bot@f$).
15 *
16 * Operations:
17 * - @c zero() → @f$\bot@f$ (no derivations)
18 * - @c one() → @f$\emptyset@f$ (a derivation requiring no witnesses)
19 * - @c plus() → union of witness sets; @f$\bot@f$ acts as identity
20 * - @c times() → union of witness sets; @f$\bot@f$ is absorbing
21 * - @c monus() → set difference, @f$\bot@f$ when @f$x \subseteq y@f$
22 * - @c delta() → identity (@f$\bot \mapsto \bot@f$, @f$x \mapsto x@f$)
23 *
24 * The semiring is idempotent (set union is idempotent: @f$a \oplus a = a@f$)
25 * but **not** absorptive in the @f$\mathbb{1} \oplus a = \mathbb{1}@f$ sense
26 * used by @c absorptive(): @f$\emptyset \oplus \{x\} = \{x\} \neq \emptyset@f$.
27 * Times also does not left-distribute over monus as long as the label
28 * universe is non-empty.
29 *
30 * @see https://provsql.org/lean-docs/Provenance/Semirings/Which.html
31 * Lean 4 verified instance: @c instSemiringWithMonusWhich, with
32 * proofs of @c Which.idempotent, @c Which.not_absorptive, and
33 * @c Which.not_mul_sub_left_distributive.
34 */
35#ifndef WHICH_H
36#define WHICH_H
37
38#include <cstring>
39#include <optional>
40#include <set>
41#include <sstream>
42#include <string>
43#include <vector>
44
45#include "Semiring.h"
46
47namespace semiring {
48
49/** @brief Which-provenance value: a set of labels, or @f$\bot@f$ (empty optional). */
50using which_provenance_t = std::optional<std::set<std::string> >;
51
52/**
53 * @brief Which-provenance (lineage) semiring.
54 *
55 * Each gate evaluates to a @c which_provenance_t (set of contributing
56 * labels, or @f$\bot@f$).
57 */
58class Which : public Semiring<which_provenance_t> {
59public:
60value_type zero() const override {
61 return std::nullopt;
62}
63
64value_type one() const override {
65 return std::set<std::string>{};
66}
67
68value_type plus(const std::vector<value_type> &vec) const override {
69 value_type result = std::nullopt;
70 for (const auto &v : vec) {
71 if(!v.has_value())
72 continue;
73 if(!result.has_value())
74 result = std::set<std::string>{};
75 result->insert(v->begin(), v->end());
76 }
77 return result;
78}
79
80value_type times(const std::vector<value_type> &vec) const override {
81 if(vec.empty())
82 return one();
83
84 value_type result = std::set<std::string>{};
85 for (const auto &v : vec) {
86 if(!v.has_value())
87 return std::nullopt;
88 result->insert(v->begin(), v->end());
89 }
90 return result;
91}
92
93value_type monus(value_type x, value_type y) const override {
94 if(!x.has_value())
95 return std::nullopt;
96 if(!y.has_value())
97 return x;
98 if(std::includes(y->begin(), y->end(), x->begin(), x->end()))
99 return std::nullopt;
100 std::set<std::string> diff;
101 std::set_difference(
102 x->begin(), x->end(),
103 y->begin(), y->end(),
104 std::inserter(diff, diff.end())
105 );
106 return diff;
107}
108
109value_type delta(value_type x) const override {
110 return x;
111}
112
113/**
114 * @brief No semiring homomorphism @c BoolFunc(Y) →+* Which exists, so
115 * the safe-query Boolean rewrite is unsound under
116 * which-provenance. (Lineage tracks the union of contributing
117 * labels per derivation; the read-once rewrite collapses
118 * alternative derivations to a single shared label set, which
119 * is not in general the union produced by the original
120 * circuit.) Inherits the @c false default from @c Semiring;
121 * this override exists for documentation.
122 *
123 * Lean: @c Provenance.Semirings.Which.no_hom_from_BoolFunc
124 * (provenance-lean/Provenance/Semirings/Which.lean).
125 */
126virtual bool compatibleWithBooleanRewrite() const override {
127 return false;
128}
129
130/**
131 * @brief Parse a leaf value into a which-provenance set.
132 *
133 * Accepted input formats (round-trip with @c to_text):
134 * - @c "⊥" → zero (no derivation)
135 * - Bare label (no leading @c '{'): @c "Alice" → @c {Alice}
136 * - Empty set: @c "{}" → one (a derivation requiring no witnesses)
137 * - Full set: @c "{a,b,c}" → @c {a,b,c}
138 *
139 * Labels may contain any character except @c ',', @c '{', @c '}'.
140 */
141value_type parse_leaf(const char *v) const {
142 if(strcmp(v, "⊥") == 0)
143 return std::nullopt;
144
145 if(*v != '{') {
146 std::set<std::string> single;
147 single.insert(std::string(v));
148 return value_type(std::move(single));
149 }
150
151 const char *p = v + 1;
152 std::set<std::string> result;
153
154 if(*p == '}') {
155 if(*(p+1) != '\0')
156 throw SemiringException("Which: trailing junk after closing '}'");
157 return value_type(std::move(result));
158 }
159
160 while(true) {
161 const char *start = p;
162 while(*p && *p != ',' && *p != '}' && *p != '{')
163 ++p;
164 if(p == start)
165 throw SemiringException("Which: empty label");
166 result.insert(std::string(start, p - start));
167 if(*p == ',') { ++p; continue; }
168 if(*p == '}') {
169 if(*(p+1) != '\0')
170 throw SemiringException("Which: trailing junk after closing '}'");
171 return value_type(std::move(result));
172 }
173 throw SemiringException("Which: unterminated set");
174 }
175}
176
177std::string to_text(const value_type &prov) const {
178 std::ostringstream oss;
179 if(!prov.has_value()) {
180 oss << "⊥";
181 } else {
182 oss << "{";
183 bool first = true;
184 for (const auto &label : *prov) {
185 if (!first) oss << ",";
186 first = false;
187 oss << label;
188 }
189 oss << "}";
190 }
191 return oss.str();
192}
193};
194
195}
196
197#endif // WHICH_H
Abstract semiring interface for provenance evaluation.
Exception thrown when a semiring operation is not supported.
Definition Semiring.h:51
Abstract base class for (m-)semirings.
Definition Semiring.h:93
Which-provenance (lineage) semiring.
Definition Which.h:58
value_type times(const std::vector< value_type > &vec) const override
Apply the multiplicative operation to a list of values.
Definition Which.h:80
value_type plus(const std::vector< value_type > &vec) const override
Apply the additive operation to a list of values.
Definition Which.h:68
value_type zero() const override
Return the additive identity .
Definition Which.h:60
std::string to_text(const value_type &prov) const
Definition Which.h:177
value_type monus(value_type x, value_type y) const override
Apply the monus (m-semiring difference) operation.
Definition Which.h:93
value_type parse_leaf(const char *v) const
Parse a leaf value into a which-provenance set.
Definition Which.h:141
value_type delta(value_type x) const override
Apply the operator.
Definition Which.h:109
value_type one() const override
Return the multiplicative identity .
Definition Which.h:64
virtual bool compatibleWithBooleanRewrite() const override
No semiring homomorphism BoolFunc(Y) →+* Which exists, so the safe-query Boolean rewrite is unsound u...
Definition Which.h:126
std::optional< std::set< std::string > > which_provenance_t
Which-provenance value: a set of labels, or (empty optional).
Definition Which.h:50