ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
WhereCircuit.h
/**
 * @file WhereCircuit.h
 * @brief Where-provenance circuit tracking column-level data origin.
 *
 * Where-provenance records not just which base tuples contributed to a
 * query result, but also *which attribute values* in those base tuples
 * gave rise to each output column.
 *
 * @c WhereCircuit extends @c Circuit<WhereGate> with per-gate metadata
 * that describes:
 * - **Input gates** – the source table, row UUID, and number of columns.
 * - **Projection gates** – the list of attribute positions being projected.
 * - **Equality gates** – the pair of attribute positions being joined.
 *
 * The @c evaluate() method traverses the circuit and returns, for each
 * output position, a set of @c Locator values identifying the base
 * (table, tuple, column) triples that the value was copied from.
 */
#ifndef WHERE_CIRCUIT_H
#define WHERE_CIRCUIT_H

#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

#include "Circuit.hpp"

/**
 * @brief Gate types for a where-provenance circuit.
 *
 * - @c UNDETERMINED  Placeholder; should not appear in a complete circuit.
 * - @c TIMES         Product (conjunction) of child where-provenance sets.
 * - @c PLUS          Sum (disjunction) of child where-provenance sets.
 * - @c EQ            Equijoin gate recording the joined attribute pair.
 * - @c PROJECT       Projection gate recording which attributes are kept.
 * - @c IN            Input gate for a single base-relation tuple.
 */
enum class WhereGate {
  UNDETERMINED, ///< Placeholder; should not appear in a complete circuit
  TIMES,        ///< Product (conjunction) of child where-provenance sets
  PLUS,         ///< Sum (disjunction) of child where-provenance sets
  EQ,           ///< Equijoin gate recording the joined attribute pair
  PROJECT,      ///< Projection gate recording which attributes are kept
  IN            ///< Input gate for a single base-relation tuple
};

/**
 * @brief Circuit encoding where-provenance (column-level data origin).
 *
 * Stores extra per-gate metadata that relates each gate to its position
 * in the original query plan (table name, row UUID, column index).
 */
class WhereCircuit : public Circuit<WhereGate> {
private:
  std::unordered_map<gate_t, uuid> input_token; ///< UUID of the source tuple for each IN gate
  std::unordered_map<gate_t, std::pair<std::string,int>> input_info; ///< (table name, nb_columns) for each IN gate
  std::unordered_map<gate_t, std::vector<int>> projection_info; ///< Projected attribute positions for PROJECT gates
  std::unordered_map<gate_t, std::pair<int,int>> equality_info; ///< Joined attribute pair (pos1, pos2) for EQ gates

public:
  /** @copydoc Circuit::setGate(const uuid&, gateType) */
  gate_t setGate(const uuid &u, WhereGate type) override;

  /**
   * @brief Create an input gate for a specific table row.
   *
   * @param u           UUID of the source tuple (row token).
   * @param table       Name of the source relation.
   * @param nb_columns  Number of columns in the source tuple.
   * @return            Gate identifier.
   */
  gate_t setGateInput(const uuid &u, std::string table, int nb_columns);

  /**
   * @brief Create a projection gate with column mapping.
   *
   * @param u      UUID to associate with this gate.
   * @param infos  List of attribute positions surviving the projection.
   * @return       Gate identifier.
   */
  gate_t setGateProjection(const uuid &u, std::vector<int> &&infos);

  /**
   * @brief Create an equality (equijoin) gate for two attribute positions.
   *
   * @param u     UUID to associate with this gate.
   * @param pos1  Left-side attribute position.
   * @param pos2  Right-side attribute position.
   * @return      Gate identifier.
   */
  gate_t setGateEquality(const uuid &u, int pos1, int pos2);

  /** @copydoc Circuit::toString() */
  std::string toString(gate_t g) const override;

  /**
   * @brief Describes the origin of a single attribute value.
   *
   * A @c Locator identifies a specific attribute of a specific tuple in a
   * specific base table as the origin of an output value.
   */
  struct Locator {
    std::string table; ///< Name of the source relation
    uuid tid;          ///< UUID (row token) of the source tuple
    int position;      ///< Zero-based column index within the tuple

    /**
     * @brief Construct a @c Locator.
     * @param t  Source table name.
     * @param u  Source tuple UUID.
     * @param i  Column position.
     */
    Locator(std::string t, uuid u, int i) : table(t), tid(u), position(i) {}
    /**
     * @brief Lexicographic ordering for use in @c std::set.
     * @param that  Other @c Locator.
     * @return @c true if @c *this is less than @p that.
     */
    bool operator<(Locator that) const;
    /**
     * @brief Return a human-readable representation of this locator.
     * @return String of the form "table[tid].position".
     */
    std::string toString() const;
  };

  /**
   * @brief Evaluate the where-provenance circuit at gate @p g.
   *
   * Returns one set of @c Locator values per output column position.
   * Each @c Locator in the set identifies a base-relation cell that
   * contributed the value at that output position.
   *
   * @param g  Root gate to evaluate.
   * @return   Vector (indexed by output column) of sets of @c Locator values.
   */
  std::vector<std::set<Locator>> evaluate(gate_t g) const;
};

#endif /* WHERE_CIRCUIT_H */
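The header only declares `Locator::operator<` and `Locator::toString()`. As a rough illustration of what a lexicographic ordering and the documented "table[tid].position" format could look like, here is a minimal standalone sketch. `LocatorSketch` and `distinctLocators` are hypothetical names, `uuid` is assumed to be `std::string`, and the comparison order (table, then tid, then position) is an assumption, not something the header confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>

// Hypothetical stand-in for WhereCircuit::Locator (uuid assumed to be std::string).
struct LocatorSketch {
  std::string table;
  std::string tid;
  int position;

  // Lexicographic comparison; the field order (table, tid, position) is an assumption.
  bool operator<(const LocatorSketch &that) const {
    return std::tie(table, tid, position) <
           std::tie(that.table, that.tid, that.position);
  }

  // Formats as "table[tid].position", the form stated in the header comment.
  std::string toString() const {
    return table + "[" + tid + "]." + std::to_string(position);
  }
};

// Inserts three locators, two of them identical, and returns the set size.
inline std::size_t distinctLocators() {
  std::set<LocatorSketch> s;
  s.insert(LocatorSketch{"person", "tid-a", 0});
  s.insert(LocatorSketch{"person", "tid-a", 0}); // duplicate, ignored by std::set
  s.insert(LocatorSketch{"person", "tid-a", 1});
  assert(s.begin()->toString() == "person[tid-a].0");
  return s.size();
}
```

Because `std::set` treats two elements as equivalent when neither compares less than the other, a strict ordering over all three fields is what lets `evaluate()` return duplicate-free locator sets per output column.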
gate_t: Strongly-typed gate identifier. Definition Circuit.h:48
Circuit.hpp: Out-of-line template method implementations for Circuit<gateType>.
Circuit: Generic template base class for provenance circuits. Definition Circuit.h:61
std::string uuid: UUID type used in this circuit (always std::string). Definition Circuit.h:64
gate_t setGate(const uuid &u, WhereGate type) override: Create or update the gate associated with UUID u.
std::string toString(gate_t g) const override: Return a textual description of gate g for debugging.
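To make the "one set of locators per output column" result of `evaluate()` more concrete, the following toy model mimics the gate types on plain vectors of sets. It is a sketch under stated assumptions, not ProvSQL's implementation: every name (`Loc`, `Columns`, `inputTuple`, `product`, `sum`, `project`, `equate`, `demo`) is hypothetical, and the semantics chosen for TIMES (column concatenation), PLUS (column-wise union), and EQ (the two joined columns share their origins) are guesses based only on the gate descriptions in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical miniature locator; not the library's Locator type.
struct Loc {
  std::string table, tid;
  int position;
  bool operator<(const Loc &o) const {
    return std::tie(table, tid, position) < std::tie(o.table, o.tid, o.position);
  }
};

using Columns = std::vector<std::set<Loc>>; // one locator set per output column

// IN: one singleton set per column of the base tuple (zero-based positions).
Columns inputTuple(const std::string &table, const std::string &tid, int nb_columns) {
  Columns v;
  for (int i = 0; i < nb_columns; ++i)
    v.push_back(std::set<Loc>{Loc{table, tid, i}});
  return v;
}

// TIMES (assumed semantics): concatenate the columns of both children.
Columns product(Columns a, const Columns &b) {
  a.insert(a.end(), b.begin(), b.end());
  return a;
}

// PLUS (assumed semantics): union the locator sets column by column.
Columns sum(Columns a, const Columns &b) {
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
    a[i].insert(b[i].begin(), b[i].end());
  return a;
}

// PROJECT: keep the child columns at the listed zero-based positions, in order.
Columns project(const Columns &child, const std::vector<int> &positions) {
  Columns v;
  for (int p : positions)
    v.push_back(child.at(static_cast<std::size_t>(p)));
  return v;
}

// EQ (assumed semantics): both joined columns end up with the merged origins.
Columns equate(Columns child, int pos1, int pos2) {
  std::set<Loc> merged = child.at(static_cast<std::size_t>(pos1));
  const std::set<Loc> &other = child.at(static_cast<std::size_t>(pos2));
  merged.insert(other.begin(), other.end());
  child.at(static_cast<std::size_t>(pos1)) = merged;
  child.at(static_cast<std::size_t>(pos2)) = merged;
  return child;
}

// Join a two-column "person" tuple with a two-column "city" tuple on
// person column 1 = city column 0, then keep the first two columns.
Columns demo() {
  Columns joined = product(inputTuple("person", "p1", 2), inputTuple("city", "c1", 2));
  assert(joined.size() == 4);
  return project(equate(joined, 1, 2), {0, 1});
}
```

In this model, the first output column of `demo()` traces back to a single cell of the `person` tuple, while the second traces back to two cells, one in each table, because the equality merges the origins of the joined attributes.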