ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
classify_query.h
/**
 * @file classify_query.h
 * @brief Public surface of the query-time TID / BID / OPAQUE classifier.
 *
 * The classifier is invoked by @c provsql_planner on the top-level
 * @c Query when the @c provsql.classify_top_level GUC is on, and emits a
 * @c NOTICE carrying the certified kind and the set of
 * provenance-tracked base relations the query touches.
 * Independent-TID join inference, BID block-key preservation under
 * projection and @c GROUP @c BY, transitive ancestor-set computation
 * through the per-relation registry, view descent, and ANSI @c INNER /
 * @c CROSS @c JOIN handling all live in this same file's helpers
 * and the matching propagation pre-passes in @c src/safe_query.c.
 */
#ifndef CLASSIFY_QUERY_H
#define CLASSIFY_QUERY_H

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"

#include "MMappedTableInfo.h"

/** @brief GUC: emit a classification @c NOTICE on every top-level SELECT. */
extern bool provsql_classify_top_level;

/**
 * @brief Result of @c provsql_classify_query.
 *
 * @c kind is the certified result-relation kind under the existing
 * @c provsql_table_kind taxonomy (TID, BID, OPAQUE).
 *
 * @c source_relids is a @c List of @c pg_class @c Oids of the
 * provenance-tracked base relations the query touches, built via
 * @c lappend_oid. Reported even when @c kind is OPAQUE so the caller
 * can attribute the non-certifiability to specific tables. Allocated
 * in the current memory context; callers may free with @c list_free.
 */
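typedef struct ProvSQLClassification {
  provsql_table_kind kind;
  List *source_relids;
} ProvSQLClassification;
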
/**
 * @brief Classify the result relation of a parsed top-level @c Query.
 *
 * Scope:
 *
 * - Single-source classification: a flat @c fromlist of
 *   @c RangeTblRefs, with no kind-altering features (@c SubLinks,
 *   modifying @c CTEs, @c cteList, @c DISTINCT, @c GROUP @c BY,
 *   @c HAVING, aggregates, window functions, set-returning
 *   functions in the target list). Zero or one provenance-tracked
 *   base relation is reached, either directly (@c RTE_RELATION) or
 *   through any depth of @c RTE_SUBQUERY entries (view bodies
 *   after PG rewriting, inline @c FROM subqueries). The PG 18
 *   virtual @c RTE_GROUP is skipped transparently. When a single
 *   tracked base relation is reached, the source's recorded kind
 *   is preserved verbatim. @c ORDER @c BY, @c LIMIT, @c OFFSET
 *   do not change row lineages and are therefore transparent.
 * - @c UNION @c ALL specialisation: a fully-UNION-ALL tree of
 *   subquery legs, each of which classifies as TID over a
 *   base-relid set disjoint from every other leg's, promotes
 *   to TID with the cumulative source list.
 * - Zero tracked sources: trivially deterministic, reported as
 *   TID with an empty source list.
 * - Multi-source TID promotion: @c n_meta @c >= @c 2 promotes
 *   to TID when every classifier-reported source is TID and the
 *   registered ancestor sets are pairwise disjoint.
 * - BID projection preservation: the single-source BID branch
 *   walks the outer target list (transitively through any depth
 *   of @c RTE_SUBQUERY TLE projection) and downgrades to OPAQUE
 *   when any block-key column is dropped.
 * - BID @c GROUP @c BY block-key promotion: a pre-dispatch
 *   special case promotes @c SELECT @c k @c FROM @c bid_t
 *   @c GROUP @c BY @c k to TID (each output row is one block).
 * - ANSI @c INNER / @c CROSS @c JOIN: the shape gate accepts
 *   @c JoinExpr fromlist entries when @c jointype @c ==
 *   @c JOIN_INNER, recursing into both arms.
 * - Everything else is reported as OPAQUE.
 *
 * @param q   Parsed @c Query to classify. Read-only; not mutated.
 * @param out Output struct. @c source_relids is built in the current
 *            memory context.
 */
extern void provsql_classify_query(Query *q, ProvSQLClassification *out);

/**
 * @brief Render the result of @c provsql_classify_query as a @c NOTICE.
 *
 * Formats:
 * @verbatim
 * ProvSQL: query result is TID (sources: schema.t1, schema.t2)
 * ProvSQL: query result is BID (sources: schema.t1)
 * ProvSQL: query result is TID (no provenance-tracked sources)
 * ProvSQL: query result is OPAQUE
 * @endverbatim
 *
 * The OPAQUE form omits the parenthetical because the rtable
 * walk only reaches syntactically visible sources: when the
 * shape gate trips on a sublink, set operation, GROUP BY, ...
 * the list would be partial and falsely suggest completeness.
 */
extern void provsql_classify_emit_notice(const ProvSQLClassification *c);

#endif /* CLASSIFY_QUERY_H */
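
The TID promotion rule described for UNION ALL legs (every leg is TID and no base relation is shared between legs) can be illustrated outside PostgreSQL. The following is a minimal standalone sketch, not ProvSQL's implementation: the real classifier works on Query trees and the per-relation registry, whereas here SketchOid, SketchKind, SketchLeg, and sketch_union_all_kind are hypothetical stand-ins chosen so the example compiles without server headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's Oid. */
typedef unsigned int SketchOid;

/* Hypothetical stand-in for the TID / BID / OPAQUE taxonomy. */
typedef enum { SKETCH_TID, SKETCH_BID, SKETCH_OPAQUE } SketchKind;

/* Hypothetical: one UNION ALL leg, its own kind, and the base relations it reaches. */
typedef struct SketchLeg {
    SketchKind kind;
    const SketchOid *relids;
    size_t n_relids;
} SketchLeg;

/* True when the two legs share at least one base relation. */
static bool legs_overlap(const SketchLeg *a, const SketchLeg *b)
{
    for (size_t i = 0; i < a->n_relids; i++)
        for (size_t j = 0; j < b->n_relids; j++)
            if (a->relids[i] == b->relids[j])
                return true;
    return false;
}

/* True when no base relation is reached by more than one leg. */
static bool legs_pairwise_disjoint(const SketchLeg *legs, size_t n_legs)
{
    assert(legs != NULL || n_legs == 0);
    for (size_t i = 0; i < n_legs; i++)
        for (size_t j = i + 1; j < n_legs; j++)
            if (legs_overlap(&legs[i], &legs[j]))
                return false;
    return true;
}

/* TID only when every leg is TID and the relid sets are pairwise disjoint;
 * any other combination falls back to OPAQUE in this sketch. */
static SketchKind sketch_union_all_kind(const SketchLeg *legs, size_t n_legs)
{
    for (size_t i = 0; i < n_legs; i++)
        if (legs[i].kind != SKETCH_TID)
            return SKETCH_OPAQUE;
    return legs_pairwise_disjoint(legs, n_legs) ? SKETCH_TID : SKETCH_OPAQUE;
}
```

The quadratic comparison keeps the sketch short; it is adequate for the handful of relations a single query typically touches, and says nothing about how the actual helpers compute or compare ancestor sets.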
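
The four NOTICE formats listed in the header can be reproduced with plain string building. The sketch below is an illustration under stated assumptions, not the body of provsql_classify_emit_notice: it writes into a caller-supplied buffer instead of raising a PostgreSQL NOTICE, takes already-resolved relation names instead of Oids, and NoticeKind, sketch_append, and sketch_format_notice are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the TID / BID / OPAQUE taxonomy. */
typedef enum { NOTICE_TID, NOTICE_BID, NOTICE_OPAQUE } NoticeKind;

/* Append text at offset `used`, truncating if the buffer is full.
 * Returns the new length, never more than size - 1. */
static size_t sketch_append(char *buf, size_t size, size_t used, const char *text)
{
    if (used < size - 1)
        snprintf(buf + used, size - used, "%s", text);
    size_t len = used + strlen(text);
    return len < size - 1 ? len : size - 1;
}

/* Build the message text for a classification result. */
static void sketch_format_notice(char *buf, size_t size, NoticeKind kind,
                                 const char *const *names, size_t n_names)
{
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    size_t used = sketch_append(buf, size, 0, "ProvSQL: query result is ");

    if (kind == NOTICE_OPAQUE) {
        /* OPAQUE never lists sources: the list could be partial. */
        sketch_append(buf, size, used, "OPAQUE");
        return;
    }

    used = sketch_append(buf, size, used, kind == NOTICE_TID ? "TID" : "BID");
    if (n_names == 0) {
        sketch_append(buf, size, used, " (no provenance-tracked sources)");
        return;
    }

    used = sketch_append(buf, size, used, " (sources: ");
    for (size_t i = 0; i < n_names; i++) {
        if (i > 0)
            used = sketch_append(buf, size, used, ", ");
        used = sketch_append(buf, size, used, names[i]);
    }
    sketch_append(buf, size, used, ")");
}
```

Passing NOTICE_OPAQUE together with a non-empty name list still yields the bare OPAQUE message, mirroring the rationale given in the header comment for omitting the parenthetical.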