Query Rewriting Pipeline
This page is a detailed walkthrough of provsql.c, the core of
ProvSQL. It describes how every SQL query is intercepted and rewritten
to propagate provenance tokens.
For the high-level architecture and data-flow overview, see Architecture Overview.
Entry Point: provsql_planner
PostgreSQL’s planner hook allows extensions to intercept every query
before planning. provsql_planner is installed by
_PG_init and is called for every query:
INSERT ... SELECT(CMD_INSERT): delegates toprocess_insert_selectto propagate provenance from the sourceSELECTinto the target table’sprovsqlcolumn.SELECT(CMD_SELECT): checks whether any relation in the range table carries aprovsqlcolumn (has_provenance). If so, callsprocess_queryto rewrite the query tree.After rewriting, the (possibly modified) query is passed to the previous planner hook or
standard_planner.
When provsql.verbose_level >= 20 (PostgreSQL 15+), the full query
text is logged before and after rewriting. At level ≥ 40, the time
spent in the rewriter is logged.
Query-Time TID / BID Classifier
A second, read-only analysis runs from the same planner-hook entry,
guarded by the provsql.classify_top_level
GUC. When on, provsql_classify_query (in
classify_query.c) inspects the user’s parsed Query and
emits a NOTICE certifying the kind of the result relation under
the existing provsql_table_kind taxonomy (TID / BID /
OPAQUE) together with the provenance-tracked base relations the
query touches.
The classifier runs on the user’s original parse tree, before any rewriting, so the reported kind reflects the SQL the user wrote. It is purely informational: no side effect on the query tree, no behavioural change downstream.
Decision rules. A shape gate first rejects any query carrying
any of hasSubLinks, hasModifyingCTE, a non-empty cteList,
a non-NULL setOperations, or a fromlist entry that is neither
a RangeTblRef nor an INNER / CROSS JoinExpr (recursed
into via classify_fromlist_shape_ok, which descends
through each JoinExpr’s two arms to reach the underlying
RangeTblRefs). The range table is then walked, collecting
RTE_RELATION entries with provsql_table_info metadata as
sources; RTE_SUBQUERY entries (the form views take after PG
rewriting, as well as inline FROM-clause subqueries) are
descended into recursively under the same shape gate, so a tracked
base relation reached through any depth of subquery nesting still
contributes to the accumulator; the PostgreSQL 18 synthetic
RTE_GROUP and the RTE_JOIN synthetic alias rows PG appends
alongside every JoinExpr are skipped transparently; any other
rtekind (RTE_VALUES, RTE_FUNCTION, RTE_CTE …) trips
the shape gate. Outer joins (LEFT / RIGHT / FULL)
stay OPAQUE because their NULL-padding rows break per-row
independence; USING and aliased joins are also OPAQUE (they
would require resolving columns through joinaliasvars).
The kind is then decided from the cumulative tracked-source count:
shape failed →
OPAQUE(hidden structure may carry correlations the classifier cannot rule out);shape ok and zero tracked sources →
TID(no provenance involved, trivially deterministic);shape ok and exactly one tracked source → that source’s recorded kind, refined by the BID projection check below;
shape ok and two or more tracked sources → conservative multi-source TID promotion (see below) or fall through to
OPAQUE.
BID projection preservation. In the single-source BID case,
bid_block_key_preserved walks the outer target list and
requires every block-key column of the source to survive – matched
on the underlying Var (resolved transitively through any depth
of RTE_SUBQUERY TargetEntry projection, ignoring renames)
rather than on the output column name. If any block-key column is
dropped from the projection, the result downgrades to OPAQUE:
the mutually-exclusive partitioning the user could observe is no
longer visible in the output. Whole-table BID
(block_key_n == 0) is trivially preserved.
GROUP BY block-key promotion. A pre-dispatch special case
recognises queries of shape SELECT k FROM bid_t GROUP BY k
(and the multi-column-key generalisation): each output row is
exactly one BID block, and the OR over the block’s gate_mulinput
children reduces to the block’s key token (an independent
gate_input), so the output is TID under the cumulative source
list. Conservative gate: no aggregates / window functions /
sublinks / HAVING / DISTINCT / set operations, a single
RangeTblRef fromlist pointing at a BID relation, and the
groupClause matching the source’s block-key set exactly (no
extra group columns, no missing keys). PG 18’s synthetic
RTE_GROUP entry is resolved through inline by
resolve_through_group_rte.
Multi-source TID promotion. The n_meta >= 2 branch no
longer collapses to OPAQUE. try_classify_multi_source_tid
promotes the result to TID when every classifier-reported source
is itself TID and the registered ancestor sets (see
TID / BID Propagation Through Derived Relations below) are pairwise disjoint. Any
failure (a non-TID source, no registry entry, or any pair of
ancestor sets overlapping) keeps the OPAQUE default. The
hierarchical-CQ structure is not inspected here – the
Safe-Query Rewriter runs the full check downstream; the
classifier’s job is only to certify the per-row independence the
user-visible NOTICE pill advertises.
TID and BID NOTICEs carry a complete source list. OPAQUE NOTICEs
deliberately omit it: when the shape gate trips on a sublink, set
operation, HAVING, aggregates, window functions, or SRFs in
the target list, the rtable walk only reaches the syntactically
visible RTEs, so any list would be partial and falsely suggest
completeness. The user already has the query text in front of
them and can identify the involved relations without our help.
Executor-depth gating. ProvSQL’s rewriter inserts calls to
PL/pgSQL helpers (provenance_times, provenance_plus,
provenance_aggregate, …) into the rewritten query. Each of
those helpers contains internal SELECT statements that PL/pgSQL
plans on first invocation, and that planning fires the planner hook
again. Without a gate, the classifier would emit one extra
NOTICE per such internal plan. We track Executor nesting via
an ExecutorStart_hook / ExecutorEnd_hook pair that
increments / decrements a file-local depth counter
(provsql_executor_depth); the classifier only emits when
provsql_executor_depth == 0 at planner-hook entry, which
corresponds to the user’s outermost statement (the helper plans run
during execution, hence at depth ≥ 1).
NOTICE format. Rendered by provsql_classify_emit_notice
with schema-qualified, quote_identifier-quoted relation names:
ProvSQL: query result is TID (sources: public.personnel)
ProvSQL: query result is BID (sources: public.personnel)
ProvSQL: query result is TID (no provenance-tracked sources)
ProvSQL: query result is OPAQUE
ProvSQL Studio enables the GUC automatically and parses the
NOTICE into a kind-aware pill on the result-table provsql
column; see ProvSQL Studio.
process_query: The Main Rewriting Function
process_query is the heart of ProvSQL. It receives a
Query tree, rewrites it to carry provenance, and returns the
modified tree. The function is recursive: subqueries and CTEs are
processed by re-entering process_query.
The function proceeds in the following order:
Step 0: Early Exit
If the query has no FROM clause (q->rtable == NULL), there is
nothing to track – return immediately.
Step 1: CTE Inlining
Non-recursive common table expressions (WITH clauses) are
inlined as subqueries in the range table via inline_ctes.
This converts RTE_CTE entries to RTE_SUBQUERY so that the
subsequent recursive processing can track provenance through them.
Recursive CTEs are not supported and raise an error.
Inlining happens before set-operation handling because UNION /
EXCEPT branches may reference CTEs.
Step 2: Strip Existing Provenance Columns
remove_provenance_attributes_select scans the target list and
removes any existing provsql UUID columns (which might be inherited
from base tables or subqueries). These are stripped so that the
rewriter can append a single, freshly computed provenance expression at
the end. Matching GROUP BY / ORDER BY references and
set-operation column lists are also adjusted.
Step 3: Set-Operation Handling
If the query has setOperations (UNION, EXCEPT):
Non-ALL variants (
UNION,EXCEPTwithoutALL):rewrite_non_all_into_external_group_bywraps the set operation in a new outer query withGROUP BYon all columns. This implements duplicate elimination as provenance addition (⊕). The function then re-entersprocess_queryon the wrapper.UNION ALL: each branch is processed independently. The provenance tokens from different branches are combined with ⊕ (provenance_plus) byprocess_set_operation_union.EXCEPT ALL:transform_except_into_joinrewritesA EXCEPT ALL Bas aLEFT JOINwith aprovenance_monus(⊖) gate, plus a filter removing zero-provenance tuples.INTERSECTis not supported (raises an error).
Step 4: Aggregate DISTINCT Rewrite
If the query has aggregates with DISTINCT (e.g.,
COUNT(DISTINCT x)), rewrite_agg_distinct performs a
structural rewrite: the DISTINCT inside the aggregate is moved to
an inner subquery with GROUP BY, and the outer query aggregates
over the deduplicated results. The function returns a new query tree,
and process_query re-enters on it.
Step 5: Discovery – get_provenance_attributes
get_provenance_attributes walks every entry in the range table
and collects Var nodes pointing to provenance columns:
RTE_RELATION (base table): scans column names for one called
provsqlof type UUID.RTE_SUBQUERY: recursively calls
process_queryon the subquery. If the subquery’s rewritten target list contains aprovsqlcolumn, aVarpointing to it is added to the parent’s provenance list. OuterVarattribute numbers are adjusted to account for any columns removed from the inner query.RTE_JOIN: for inner, left, right, and full joins, nothing is collected directly – the underlying base-table RTEs supply the tokens. Semi-joins and anti-joins raise an error.
RTE_FUNCTION: if the function returns a single UUID column named
provsql, it is collected.RTE_VALUES / RTE_GROUP: no provenance to collect.
If no provenance attributes are found, the query is returned unmodified.
Step 6: Unsupported Feature Checks
Before proceeding, the function checks for:
Sublinks (
EXISTS,IN, scalar subqueries): not supported.DISTINCT ON: not supported.DISTINCT(plain): converted toGROUP BYviatransform_distinct_into_group_by.GROUPING SETS/CUBE/ROLLUP: not supported (except the trivialGROUP BY ()).
Step 7: Build Column Map
build_column_map creates a per-RTE mapping from column
attribute numbers to global column identifiers. This is used by
where-provenance to record which columns participate in equijoin
conditions and projections.
Step 8: Aggregation Rewriting
If the query has aggregates, replace_aggregations_by_provenance_aggregate
walks the target list and replaces each standard aggregate (e.g.,
SUM, COUNT) with a provenance_aggregate call that wraps
the original aggregate result and the provenance of the aggregated
rows. The provenance of grouped rows is combined with ⊕
(provenance_plus) via array_agg + provenance_plus.
After aggregation rewriting:
migrate_probabilistic_qualsmoves anyWHEREcomparisons on aggregate results toHAVING(and liftsWHEREcomparisons onrandom_variablecolumns into the per-tuple provenance), because aggregate-typed and continuous-RV values both need post-classification routing the executor cannot do directly. See Probabilistic-Qual Classifier below for the routing matrix.insert_agg_token_castsinserts type casts foragg_tokenvalues used in arithmetic or window functions.
See Aggregation Provenance for the semantics of the agg /
semimod / value gates produced here, and for the exact
shape of the provenance_aggregate call that replaces each
Aggref.
Step 9: Expression Building – make_provenance_expression
make_provenance_expression combines the collected provenance
Var nodes into a single expression tree:
Combining tokens from multiple tables (the ⊗ / ⊕ / ⊖ choice):
SR_TIMES(default forSELECT ... FROM ... WHERE): wraps the list of provenance tokens inprovenance_times(ARRAY[...])to create a multiplication gate.SR_PLUS(UNION ALL): uses the single provenance token from the union directly (each branch already has its own token).SR_MONUS(EXCEPT ALL): wraps the two tokens inprovenance_monus(left, right).
If a single table is in the FROM clause, no combining function is
needed – the token is used as-is.
GROUP BY / aggregation: When group_by_rewrite or
aggregation is true, the combined token is wrapped in
provenance_plus(array_agg(token)) – this sums the provenance of
all tuples in each group.
Delta gate: For queries with aggregation but no HAVING
clause, a provenance_delta gate is added. This implements the
δ-semiring operator that normalizes aggregate provenance.
HAVING: When a HAVING clause is present, the lift is gated
by the needs_having_lift walker, which returns true only
when the qual references an agg_token Var or a
provenance_aggregate wrapper. On that path,
having_Expr_to_provenance_cmp translates the predicate
into a provenance_cmp gate tree and the original
havingQual is removed from the query (its semantics are now
captured in the provenance circuit). On the deterministic-outcome
path (e.g. HAVING expected(avg(rv)) > 20, where the outer
predicate collapses to a plain Boolean), the qual is left for
PostgreSQL to evaluate natively on the surviving groups while a
per-group provenance_delta wrapper is still emitted, so the
surviving rows carry the expected provenance shape.
Where-provenance (when provsql.where_provenance is enabled):
Equijoin gates:
add_eq_from_Quals_to_ExprscansJOIN ... ONconditions andWHEREequalities, wrapping the provenance expression inprovenance_eqgates that record which column pairs were compared.Projection gates: if the output columns are a strict subset of the input columns (or are reordered), a
provenance_projectgate is added with an integer array recording the column mapping.
Step 10: Splicing – add_to_select
add_to_select appends the provenance expression to the
query’s target list as a new column named provsql. If the query
has GROUP BY, the column is added to the groupClause as well.
Step 11: Replace provenance() Calls
replace_provenance_function_by_expression walks the target
list looking for calls to the provenance SQL function. Each
occurrence is replaced with the computed provenance expression, so
that SELECT provenance() FROM ... returns the actual provenance
token.
Rewriting Rules and Formal Semantics
The rewriting implemented by provsql.c realises the
rewriting rules (R1)–(R5) from the ICDE 2026 paper
[Sen et al., 2026], which is the authoritative reference for
the formal semantics and correctness results. The rules are stated
over an extended relational algebra on annotated relations;
provsql.c has to reproduce them on PostgreSQL Query
trees, so the mapping below is approximate rather than literal –
the pipeline phases described earlier on this page interleave the
rules with PostgreSQL-specific bookkeeping (range-table surgery,
target-list rewriting, HAVING handling, where-provenance…).
Rule |
Algebra operator |
Concrete realisation in |
|---|---|---|
(R1) |
Projection |
The provenance token column is kept on the target list.
|
(R2) |
Cross product / join |
The |
(R3) |
Duplicate elimination |
|
(R4) |
Multiset difference |
|
(R5) |
Aggregation |
|
Two algebra operators are deliberately not rewritten, matching the paper:
Selection
–
WHEREclauses pass through unchanged because selection does not affect provenance (with where-provenance enabled, the rewriter additionally wraps the result inprovenance_eq/provenance_projectgates, but that is orthogonal to R1–R5).Multiset sum
–
UNION ALLis left alone byprocess_set_operation_union; the tokens from each branch flow through independently.
Formal Verification
The correctness of rules (R1)–(R5) is fully machine-checked
in the ProvSQL Lean 4 library. The unified
statement covering all five rules is
Query.rewriting_valid_full
in the Provenance.QueryEvaluateInVK module, stated against
the V_K-lifted evaluator
Query.evaluateAnnotatedFull:
for every supported query q and every annotated database
d, evaluating q against the annotated semantics and
then projecting to the composite (tuple + annotation)
representation yields the same result as evaluating the
rewritten query against the composite database. The V_K
interpretation (values lifted with their K-semimodule
annotation, Definition 7 of the ICDE 2026 paper) is what makes
the aggregation rule (R5) statable alongside (R1)–(R4) in a
single theorem.
A companion theorem,
Query.evaluateAnnotatedFull_hom
in the Provenance.QueryEvaluateInVKHom module, proves
that query evaluation commutes with m-semiring homomorphisms
on the full algebra including aggregation (the formal analogue
of [Green et al., 2007], Proposition 3.5, and
[Geerts and Poggi, 2010], Proposition 1, lifted to
m-semirings via SemiringWithMonusHom and extended to
aggregation via the K-semimodule structure formalised in
Provenance.KSemiModule
and
Provenance.LiftedTK).
The restriction of this theorem to the non-aggregation
fragment is
Query.evaluateAnnotated_hom
in the Provenance.QueryAnnotatedDatabaseHom module.
This is the formal counterpart of ProvSQL’s architectural
choice: a single persistent provenance circuit is kept once,
and each sr_* semiring evaluator is the realisation of one
m-semiring homomorphism out of that circuit; the theorem says
that running the homomorphism on the inputs and then evaluating
the query yields the same result as evaluating the query first
and applying the homomorphism to the annotations.
Probabilistic-Qual Classifier
The single walker migrate_probabilistic_quals covers both
the historical agg_token HAVING surface and the random_variable
WHERE/JOIN surface introduced by the continuous-distribution work.
It routes every qual into one of four mutually-exclusive classes
(the qual_class enum), plus a short tail of mixed-error
classes that raise a clean diagnostic rather than producing a
malformed circuit:
|
Routing |
|---|---|
|
The qual is built only from |
|
The qual is built only from |
|
Ordinary SQL, left untouched. |
Mixed-error classes |
Quals that conjoin a probabilistic comparator with another in the same node, or that compare an RV against an agg_token, raise a structured error so users see the offending shape rather than a downstream evaluation failure. |
For QUAL_PURE_RV, the planner-hook path also covers the
corner case of WHERE rv > 2 on a FROM-less SELECT: there
is no row provenance to conjoin into, so the rewriter synthesises
a single-row FROM-less host so probability_evaluate
reads a well-formed circuit. The dispatch in
make_aggregation_expression keys on the aggregate’s
result type (OID_TYPE_RANDOM_VARIABLE vs
OID_TYPE_AGG_TOKEN) so the same path covers any future
RV-returning aggregate. See Continuous Distributions for
the broader architecture.
Safe-Query Rewriter
The safe-query rewriter (src/safe_query.{c,h}) is an opt-in
pre-pass invoked from process_query in provsql.c when the
GUC provsql.boolean_provenance is on (see
Probabilities). It recognises the safe class of
Dalvi and Suciu [Dalvi and Suciu, 2012], namely
self-join-free hierarchical conjunctive queries, and rewrites them
with per-atom
SELECT DISTINCT projections so the resulting provenance circuit
is read-once. A read-once circuit is probability-evaluated in
linear time by BooleanCircuit::independentEvaluation,
which replaces the fallback to tree decomposition or external
knowledge compilation that the unrewritten circuit would require.
Two normalisation pre-passes run at the head of
try_safe_query_rewrite so the detector and rewriter see a flat
fromlist of base RTE_RELATION entries regardless of how the
user wrote the query:
try_flatten_inner_joins(q)dissolves everyINNER/CROSSJoinExprin any fromlist into flatRangeTblRefs plus AND-mergedON-clauses, recursing throughRTE_SUBQUERYbodies so a wrapped INNER JOIN gets flattened in the inner body before the inlining pre-pass picks it up. Refuses outer joins (NULL-padding rows break per-row independence), aliased joins (JOIN ... AS jwould require resolving columns throughjoinaliasvars), andUSINGclauses. Orphaned syntheticRTE_JOINentries are dropped via the sharedcompact_orphan_rteshelper.try_inline_simple_subqueries(q)then inlines every simpleRTE_SUBQUERYfromlist entry (PG-rewritten view bodies, inlineFROM (SELECT …)subqueries) into the outer rtable. A subquery is “simple” when it is a flat conjunctiveSELECTwith no kind-altering features (noDISTINCT/GROUP BY/HAVING/ aggregates / window functions / set operations / sublinks / CTEs /ORDER BY/LIMIT/OFFSET/ SRFs), noLATERALor security-barrier member RTEs, and a target list of plain base-levelVars (possibly throughRelabelTypewrappers). Fixed-point loop handles nested views.
The disjoint-base-ancestor property is enforced downstream by the candidate gate’s existing shared-relid bail (modulo the PK / disjoint-constant self-join rescues): two fromlist entries that inline to the same base relid produce duplicates the gate refuses. A second, finer-grained check (see TID / BID Propagation Through Derived Relations below) consults the per-relation ancestor registry to refuse joins whose registered ancestor sets overlap through different relids.
After normalisation, two stages, both static helpers in
src/safe_query.c:
is_safe_query_candidate(q): cheap shape gate. RejectshasAggs/hasWindowFuncs/LIMIT/OFFSET/groupingSets/ sublinks / set operations / explicitJoinExpr; rejects anyRangeTblEntrythat is notRTE_RELATION(the normalisation pre-passes above remove the legitimateRTE_SUBQUERY/RTE_JOINcases) ; requires an outerGROUP BYor top-levelDISTINCTso the per-atomDISTINCTwraps do not change the user-visible row count ; requires every tracked base relation to carry a TID or BID classification in the per-databaseprovsql_table_inforegistry ; requires the registered ancestor sets of any two distinct-relid RTEs to be disjoint (catches CTAS-derived TID joined with one of its sources, or two CTAS-derived TIDs sharing a base ancestor through different relids – same-relid pairs are deliberately exempted, those go through the existing PK-unification / disjoint-constant rescues). The OPAQUE classification (which bails the gate) covers tables where the user has bypassed the planner’s UUID injection by writing a non-freshprovsqlvalue directly : the rewriter cannot assume input independence for such tuples, so any query touching them falls through to the standard pipeline.find_hierarchical_root_atoms(q): variable-equivalence analysis on the surviving atoms. Walks every equijoin predicate, unions the variables it links via a union-find structure, and looks for a root variable: one whose atom-set covers every atom (or, in the multi-level case, every atom of an inner group whose remaining variables also admit a root). Cases handled:Single-component fully covered. Every atom has the root variable in a known column. The rewriter emits one wrapped subquery per atom, joining on the root.
Multi-level partial coverage. Some atoms do not bind the root. The rewriter groups the missing atoms by the variables they actually share and recurses on each group as an inner wrapped subquery, which on re-entry is rewritten again by the same pipeline:
get_provenance_attributesre-runsprocess_queryon each subquery, so no explicit recursion is needed in the rewriter itself.Disconnected components. Atoms decompose into connected components by shared variables ; each component becomes an independent wrapped subquery joined by Cartesian product at the outer level.
Head variables. A head variable that happens to also appear in atoms is propagated through the wrapping. Several positional sub-cases are pinned (single-atom head Vars, head Vars on grouped atoms in or out of first member position). When the head Var falls on no atom with a defined slot (the bridge case), the rewriter currently bails rather than rewriting incorrectly.
rewrite_hierarchical_cq(q, atoms, groups, residual): the actual surgery. For every atom in the partition, builds anRTE_SUBQUERYwhose body isSELECT DISTINCT <root-binding cols>, provsql FROM A WHERE <atom-local predicates>, replaces the originalRTE_RELATIONinq->rtable, rebuilds thejointreefromlistaccordingly, and rewrites any WHERE / join predicate so Vars refer to the new subqueries’ output columns. The detector’s BID block-key alignment check ensures that the chosen root-binding columns in every BID atom are a superset of that atom’s block_key, so themulinputstructure survives the projection.
When the rewriter accepts a query, it returns a new Query that
is re-entered into process_query from the top (same recursion
pattern as rewrite_agg_distinct). When it rejects, it returns
NULL and the planner falls through to the existing pipeline
unchanged.
The root gate of every rewritten circuit is wrapped in a
gate_assumed_boolean marker (see Semiring Evaluation)
so that semirings whose algebra is not Boolean-faithful refuse to
evaluate it at runtime.
FD-Aware Extensions
Six extensions layered onto the base detector recover safety for
query shapes that the raw atom-set covers every atom criterion
refuses. All operate through the same DETERMINED matrix and
the per-atom atom_anchor_class selection in the fd_aware_mode
branch of find_hierarchical_root_atoms; the theoretical
framework is the induced-FD construction Γp(q) of Dalvi
and Suciu [Dalvi and Suciu, 2007], the dissociation
framework of Gatterbauer and Suciu
[Gatterbauer and Suciu, 2015], and the textbook
treatment in [Suciu et al., 2011] (Chapter 4).
The extensions and their interaction with the base rewriter:
Constant-selection elimination. A WHERE conjunct of shape
Var = Constinduces the FD∅ → Var.attno; the equijoin closure propagates the literal to every variable in the same union-find class. Implemented asapply_constant_selection_fd_pass(a pre-pass betweensafe_split_qualsand the multi-component dispatcher), which propagates synthesisedVar = constconjuncts to every atom touched by the class and drops the now-redundant equijoin conjuncts from the residual. Atoms whose every Var is in a constant class become disconnected from the rest of the residual and route through the multi-component path, which emits them as independentgate_plussubqueries Cartesian-joined at the outer level. That factoring is what preserves read-once across multiple-match rows on the rest of the query.Primary-key / NOT-NULL UNIQUE FDs. A relation with primary key
KcarriesK → Afor every non-key attributeA; NOT-NULL UNIQUE is FD-equivalent. A separate per-backend cache (provsql_lookup_relation_keysinprovsql_utils.c) scanspg_constraintfiltered tocontype IN ('p','u'), joinspg_indexfor column lists, and verifiespg_attribute.attnotnullon every UNIQUE column. Invalidation hooks intoCacheRegisterRelcacheCallbackwith its own registration flag soALTER TABLE ADD/DROP CONSTRAINTrefreshes the next lookup without polling. The detector applies each FD once: when every key column’s union-find class is multi-atom (the determinant is pinned by some equijoin to a Var on another RTE), every non-key column’s class is taggedDETERMINED(c, rte)and drops from the class’s FD-aware atom set. Composite PKs require every column to satisfy the multi-atom check (the soundness trap on partial coverage). When no single class covers every atom under the raw count but the FD-reduced atom sets are pairwise nested-or-disjoint, the rewriter falls intofd_aware_modeand uses a per-atom local-root anchor instead of a global root. For the canonicalR(x), S(x,y), T(y)shape (two join classes,S’s key determiningy) the flat per-atom wrap is read-once only when the determined valueyis distinct across keys; when several keys collide on oneyit would reuse the sharedT(y)leaf. The detector recognises this and instead folds the determining component{R, S}into an inner sub-query grouped on the determined valuey(projectingxout), so the shared leaf factors out and the lineage is read-once for all data – the Dalvi-Suciu safe plan that applies the independent project to{R, S}before joiningT. Exercised by the collision case intest/sql/safe_query_pk_fd.sql.Deterministic-relation transparency. A relation that is not provenance-tracked (no
provsqlcolumn and no metadata entry) contributes probability-1 tuples; dissociating tuples in it does not change the query’s probability [Gatterbauer and Suciu, 2015]. The detector treats such atoms as structurally transparent for atom-set purposes by settingDETERMINED(c, rte)for every union-find class on the deterministic RTE, so each class drops the atom from its FD-aware set. Soundness guards onpg_class.relkind = 'r'andhas_superclass = falseexclude views, foreign tables, and inheritance children; the residual CTAS-correlation trap (CREATE TABLE foo AS SELECT * FROM <tracked>withoutadd_provenance) is closed by the lineage hook (see TID / BID Propagation Through Derived Relations) and the ancestry-based disjointness gate. An anchor-fallback pass selects any anchored multi-atom class as the local root for deterministic atoms (the FD-aware preference for non-determined classes leaves them orphan otherwise). Each deterministic atom’s wrap uses the standardSELECT DISTINCT slots FROM dim WHERE <filter>shape with noprovsqlcolumn;process_query’s recursion finds no provenance to add, and the outergate_timessimply skips the missing entry.FD closure. Detector-only: accept any query whose FD-reduced atom-sets are pairwise nested-or-disjoint and whose existing single-level wrap is read-once. Delivered by the cumulative constant-selection + PK-FD + deterministic-transparency passes – each FD application in the current rule set is independent of the others, so no fixpoint iteration is needed. The canonical example is the triangle CQ with PKs on two of its three relations: each PK FD is applied once and the FD-aware atom sets become disjoint pairwise. The FD-induced nested rewrite for shapes where the single-level wrap is not read-once (the function/free split) is deliberately deferred.
PK-unifiable self-joins. When two (or more) RTEs over the same relation have all PK / NOT-NULL UNIQUE columns equated through the union-find closure, the key proves they refer to the same tuple.
try_pk_self_join_unificationruns beforeis_safe_query_candidateand merges every fully-unifiable group into a single survivor:rtableis compacted,jointree->fromlistrenumbered, and every Var’svarno(and the parallelvarnosynforruleutils.c-style deparsing) is rewritten throughsafe_unify_remap_mutator. Partial unification (a 3-RTE group with one outlier) bails the entire group: full unification or full bail. The candidate gate’s shared-relid bail then sees one RTE per surviving relid and accepts.Disjoint-constant self-joins. When two same-relid RTEs carry mutually exclusive
Var = Constconjuncts on the same column, their tuple-sets are disjoint and theprovsqltokens never overlap.try_disjoint_constant_self_join_splitruns after the PK-unification pass and certifies eligible relids in aBitmapset;is_safe_query_candidateconsults the set and skips the shared-relid bail for those relids. Pairwise disjointness usesdatumIsEqualon matchingconsttypeto prove distinct literals; non-equality predicates and constants on different columns do not contribute. The rewriter is unchanged: each RTE becomes its ownSELECT DISTINCT slots FROM R WHERE <filter>wrap, with the disjoint partition guaranteeing read-once.
The interaction between these extensions and the base rewriter is
ordered as: PG-18 group-RTE strip → INNER / CROSS JoinExpr
flattening → simple-subquery inlining → PK self-join unification
→ disjoint-constant certification → candidate gate (shape /
metadata / ancestry disjointness) → safe_split_quals →
constant-selection pre-pass → multi-component dispatch →
find_hierarchical_root_atoms (PK FDs, deterministic
transparency, fd_aware_mode) → rewrite_hierarchical_cq.
The detector’s DETERMINED matrix accumulates contributions
from the PK-FD pass and the deterministic-transparency pass; the
atom_anchor_class selection makes a first FD-aware preference
pass followed by a fallback for atoms whose every anchored class
is FD-determined (needed for the deterministic case to land its
local-root choice).
Each extension’s soundness argument is recorded in inline comments
in safe_query.c alongside the corresponding block; the
regression tests live in test/sql/safe_query_const_sel.sql,
safe_query_pk_fd.sql, safe_query_deterministic.sql,
safe_query_self_join_pk.sql, safe_query_fd_closure.sql
(cumulative regression checks for the FD closure),
safe_query_self_join_disjoint.sql,
safe_query_view_descent.sql (subquery-inlining pre-pass),
safe_query_inner_join.sql (JoinExpr flattening), and
safe_query_ancestry_disjoint.sql (ancestry-based disjointness
gate).
TID / BID Propagation Through Derived Relations
The per-relation metadata that gates the safe-query rewriter now
flows through view / CTAS / matview / SELECT INTO derivations
so a query that reaches a tracked base relation through any depth
of indirection is recognised by the rewriter and the classifier.
Per-table base-ancestor registry. The
ProvenanceTableInfo record (in MMappedTableInfo.h)
carries two halves: the kind / block_key half (set by
add_provenance / repair_key / set_table_info) and the
ancestor half (ancestor_n, ancestors[PROVSQL_TABLE_INFO_MAX_ANCESTORS])
storing the sorted-deduplicated pg_class OIDs of the original
add_provenance / repair_key relations a derived table’s
atoms ultimately come from. Base tables seed {self};
CTAS-derived tables inherit the transitive union of their source
ancestor sets via the lineage hook below. Three IPC opcodes (A
set ancestry, a get ancestry, R remove ancestry) and a
parallel provsql_lookup_ancestry backend cache in
provsql_utils.c keep the registry off the planner’s hot
path while propagating ALTER TABLE / DROP TABLE /
add_provenance / repair_key invalidations through the
existing relcache-invalidation channel. The SQL surface
(set_ancestors / remove_ancestors / get_ancestors)
mirrors the kind-half API.
CTAS / SELECT INTO / matview lineage hook. A
ProcessUtility_hook in provsql.c intercepts every
CreateTableAsStmt (covering CREATE TABLE AS,
CREATE MATERIALIZED VIEW, and SELECT INTO: PG’s parser
transforms the third into the same statement node). Pre-pass:
provsql_classify_query runs on the inner Query;
when the classifier returns TID or BID and the target list
projects a provsql column from a tracked, non-OPAQUE source,
the hook captures the inherited kind plus the transitive ancestor
union (via provsql_lookup_ancestry on each
classifier-reported source). After delegating to
standard_ProcessUtility, the post-pass resolves the new
relation via RangeVarGetRelid, aligns BID block-key columns
to their output resno through a target-list walk (demotes to
TID when any key column is dropped), calls set_table_info /
set_ancestors via DirectFunctionCall, and installs
provenance_guard via SPI. Matviews skip the trigger install
(PG forbids triggers on them; matview content only changes
through REFRESH MATERIALIZED VIEW, which re-runs the inner
SELECT and carries the same lineage). The hook fires only when
the inner SELECT projects a provsql column from a tracked
source – otherwise the new relation has no provsql column
and lineage metadata would be operationally pointless.
Ancestry-based disjointness gate. After the existing
per-RTE metadata gate, is_safe_query_candidate computes
each RTE’s ancestor set (registry lookup, fallback to {self}
on the rare race where the registry has no entry) and rejects
any pair of RTEs with different relids whose ancestor sets
overlap. Catches the case the syntactic shared-relid bail
misses: a CTAS-derived table joined with one of its source
tables, or two CTAS-derived tables sharing a base ancestor
through different relids. Same-relid pairs are deliberately
exempted – those go through the existing PK-unification /
disjoint-constant rescues, which prove disjointness at the gate
level on a same-relid basis (a coarser ancestry overlap check
would undo those rescues).