ProvSQL C/C++ API
Adding support for provenance and uncertainty management to PostgreSQL databases
Loading...
Searching...
No Matches
safe_query.c File Reference

Hierarchical-CQ rewriter for the provsql.boolean_provenance GUC. More...

#include "postgres.h"
#include "fmgr.h"
#include "pg_config.h"
#include "access/htup_details.h"
#include "catalog/pg_class.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_inherits_fn.h"
#include "catalog/pg_type.h"
#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"
#include "optimizer/clauses.h"
#include "parser/parse_oper.h"
#include "utils/builtins.h"
#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
#include "compatibility.h"
#include "provsql_mmap.h"
#include "provsql_utils.h"
#include "safe_query.h"
Include dependency graph for safe_query.c:

Go to the source code of this file.

Classes

struct  safe_proj_slot
 One projected column of an atom's wrapping subquery. More...
struct  safe_rewrite_atom
 Per-atom rewrite metadata discovered by the hierarchy detector. More...
struct  safe_inner_group
 Descriptor for an inner sub-Query introduced when one or more shared classes have partial coverage. More...
struct  safe_collect_vars_ctx
 Walker context for safe_collect_vars_walker. More...
struct  safe_collect_varnos_ctx
 Walker context for safe_collect_varnos_walker. More...
struct  safe_pushed_remap_ctx
 Mutator context for safe_pushed_remap_mutator. More...
struct  safe_remap_ctx
 Mutator context for safe_remap_vars_mutator. More...
struct  safe_inner_varno_remap_ctx
 Mutator context for safe_inner_varno_remap_mutator. More...
struct  safe_outer_te_remap_ctx
 Mutator context for safe_outer_te_remap_mutator. More...
struct  safe_unify_remap_ctx
 Mutator context for safe_unify_remap_mutator. More...
struct  safe_inline_shift_ctx
 Walker context for safe_inline_shift_mutator. More...
struct  safe_inline_subst_ctx
 Walker context for safe_inline_subst_mutator. More...
struct  safe_inline_compact_ctx
 Walker context for safe_inline_compact_mutator. More...
struct  safe_flatten_join_ctx
 Walker context for safe_flatten_join_arm. More...

Macros

#define ANCHOR(c, atom_idx)
#define DETERMINED(c, atom_idx)

Functions

static bool is_safe_query_candidate (const constants_t *constants, Query *q, Bitmapset *approved_self_join_relids)
 Walk a Query and reject anything outside the safe-query scope.
static bool safe_collect_vars_walker (Node *node, safe_collect_vars_ctx *ctx)
 Tree walker that collects every distinct base-level Var node.
static int safe_var_index (List *vars, Index varno, AttrNumber varattno)
 Find the position of a Var inside the vars list (matched on (varno, varattno)).
static bool safe_is_var_equality (Expr *qual, Var **l, Var **r)
 Recognise a top-level WHERE conjunct that equates two base Vars.
static bool safe_is_var_const_equality (Expr *qual, Var **var, Const **konst)
 Recognise OpExpr nodes of shape Var = Const.
static void safe_collect_equalities (Node *quals, List **out)
 Walk quals as a tree of AND, collecting equality conjuncts.
static bool safe_collect_varnos_walker (Node *node, safe_collect_varnos_ctx *ctx)
 Collect the distinct base-level varno values referenced by an expression sub-tree.
static void safe_flatten_and (Node *n, List **out)
 Flatten the top-level AND tree of a WHERE clause into a flat list of conjunct Node *.
static void safe_split_quals (Node *quals, int natoms, List **per_atom_out, Node **out_residual)
 Partition top-level WHERE conjuncts into atom-local quals (pushed into the matching atom's inner wrap) and the residual cross-atom quals (which stay in the outer query).
static void safe_partition_residual (Node *residual, List *atoms, List *groups, Node **outer_residual_out)
 Partition the cross-atom residual into per-group conjuncts and a new outer residual.
static Node * safe_pushed_remap_mutator (Node *node, safe_pushed_remap_ctx *ctx)
 Rewrite Var.varno from the outer atom rtindex to 1, the sole RTE of the inner wrap subquery.
static List * find_hierarchical_root_atoms (const constants_t *constants, Query *q, Node *quals, List **groups_out)
 Run the hierarchy detector on q, returning per-atom rewrite info.
static Node * safe_remap_vars_mutator (Node *node, safe_remap_ctx *ctx)
 Rewrite Var nodes in the outer query after each base RTE has been wrapped as a DISTINCT subquery projecting one or more slot columns.
static Query * safe_build_inner_wrap (Query *outer_src, RangeTblEntry *base_rte, List *proj_slots, Index outer_rtindex, List *pushed_quals)
 Build the inner Query that projects every slot in proj_slots of base_rte under SELECT DISTINCT.
static Node * safe_inner_varno_remap_mutator (Node *node, safe_inner_varno_remap_ctx *ctx)
 Rewrite base-level Var.varno from the outer atom rtindex to the corresponding inner-sub-Query rtindex.
static Query * safe_build_group_subquery (Query *outer_src, safe_inner_group *gr, List *atoms)
 Build the inner sub-Query that aggregates a group of partial-coverage atoms over their non-root shared variables.
static Query * rewrite_hierarchical_cq (const constants_t *constants, Query *q, List *atoms, List *groups, Node *residual)
 Apply the (multi-level when needed) hierarchical-CQ rewrite.
static int compute_atom_components (Query *q, Node *quals, int *atom_to_comp)
 Compute atom-level connected components.
static Node * safe_outer_te_remap_mutator (Node *node, safe_outer_te_remap_ctx *ctx)
 Rewrite Vars in the outer targetList for the multi-component rewrite.
static Query * rewrite_multi_component (const constants_t *constants, Query *q, Node *residual, List **per_atom_quals, int *atom_to_comp, int ncomp)
 Apply the multi-component rewrite.
static void apply_constant_selection_fd_pass (Query *q, List **per_atom_quals, Node **residual_in_out)
 Constant-selection elimination pre-pass.
static Node * safe_unify_remap_mutator (Node *node, safe_unify_remap_ctx *ctx)
 Tree mutator that renumbers Var.varno and RangeTblRef.rtindex through the PK-unifiable self-join map.
static Query * try_pk_self_join_unification (Query *q)
 PK-unifiable self-join detection and unification.
static Bitmapset * try_disjoint_constant_self_join_split (Query *q)
 Disjoint-constant self-join certification.
static Node * safe_inline_shift_mutator (Node *node, safe_inline_shift_ctx *ctx)
 Add offset to the varno of every base-level (varlevelsup == 0) Var and the rtindex of every RangeTblRef in node.
static Node * safe_inline_subst_mutator (Node *node, safe_inline_subst_ctx *ctx)
 Replace every outer-scope Var pointing at the inlined subquery RTE with a shifted copy of the matching target-list entry's expression.
static bool is_inlineable_subquery (Query *sub)
 Decide whether sub may be inlined.
static void inline_one_subquery (Query *q, int target_rti)
 Inline the subquery RTE at target_rti into q in place.
static Node * safe_inline_compact_mutator (Node *node, safe_inline_compact_ctx *ctx)
 Renumber Vars / RangeTblRefs via old_to_new.
static void compact_orphan_rtes (Query *q, Bitmapset *orphans)
 Drop orphan RTEs (the inlined subquery slots) from q->rtable and renumber every surviving Var / RangeTblRef.
static Query * try_inline_simple_subqueries (Query *q)
 Subquery-inlining pre-pass.
static List * safe_flatten_join_arm (Node *arm, safe_flatten_join_ctx *ctx)
 Recursively flatten a fromlist arm into a list of RangeTblRef nodes, appending each JoinExpr's ON clause to ctx->merged_quals along the way.
static bool safe_has_join_expr_anywhere (Query *q)
 Recursively scan q (and any subquery body in its rtable) for a JoinExpr in any fromlist.
static bool safe_flatten_inner_joins_inplace (Query *q)
 In-place flattener for the Query at any nesting depth.
static Query * try_flatten_inner_joins (Query *q)
 Pre-pass driver.
Query * try_safe_query_rewrite (const constants_t *constants, Query *q)
 Top-level entry point for the safe-query rewrite.

Variables

int provsql_verbose
 Verbosity level; controlled by the provsql.verbose_level GUC.

Detailed Description

Hierarchical-CQ rewriter for the provsql.boolean_provenance GUC.

Opt-in pre-pass invoked by process_query in provsql.c. Rewrites SELECT-FROM-WHERE conjunctive queries with a hierarchical structure (every shared variable's atom-set is either fully covered or fits into a multi-level inner-group decomposition) into a form whose provenance circuit is read-once. The result lets the linear-time BooleanCircuit::independentEvaluation method handle queries that would otherwise fall through to the dDNNF / tree-decomposition / external-knowledge-compiler pipeline.

The hierarchical-CQ class and its read-once decomposability are the "safe queries" of Dalvi and Suciu, "The Dichotomy of Probabilistic Inference for Unions of Conjunctive Queries", J. ACM 59(6), 2012 (doi:10.1145/2395116.2395119) ; the dichotomy theorem in that paper is the theoretical foundation for the rewrite this file implements.

Entry point: try_safe_query_rewrite (see safe_query.h).

The bulk of the file is detector + rewriter helpers. All non-API symbols are static.

Definition in file safe_query.c.

Macro Definition Documentation

◆ ANCHOR

#define ANCHOR ( c,
atom_idx )
Value:
class_atom_anchor_attno[(c) * natoms + (atom_idx)]

◆ DETERMINED

#define DETERMINED ( c,
atom_idx )
Value:
determined_in[(c) * natoms + (atom_idx)]

Function Documentation

◆ apply_constant_selection_fd_pass()

void apply_constant_selection_fd_pass ( Query * q,
List ** per_atom_quals,
Node ** residual_in_out )
static

Constant-selection elimination pre-pass.

Implements Dalvi & Suciu 2007 §5.1's induced-FD construction ( R.a from a R.a = c conjunct), specialised to the safe-query rewriter's representation:

  • Build a Var-level union-find from the equijoin conjuncts in *residual_in_out. Every pair of Vars that share an equijoin (transitively, through the closure) lands in the same equivalence class.
  • Scan per_atom_quals[i] (atom-local conjuncts) and *residual_in_out (cross-atom conjuncts) for Var = Const matches. Mark the matched Var's class repr as constant- pinned, recording one of the literals for propagation.
  • For every Var in a constant-pinned class, synthesise the corresponding Var = const conjunct on the Var's atom's per_atom_quals list (when not already present, dedup'd by (varno,varattno)). After this step every atom touching the class carries the local filter, so the standard atom-local pushdown path materialises it in the wrap.
  • Drop top-level AND conjuncts of *residual_in_out whose every base-level Var is in a constant-pinned class. These are the equijoin conjuncts that brought constant atoms together (e.g. R.x = S.x under S.x = 42); after propagation each side carries its own Var = const filter, so the original equijoin is redundant and would only prevent the rewriter from resolving columns the constant-pinned atoms' wraps no longer project.

Effect on the rest of try_safe_query_rewrite: with cross-atom equijoin links to constant-pinned atoms removed, those atoms become their own connected components, and the existing multi-component path in try_safe_query_rewrite handles them by emitting a separate inner sub-Query per component. The recursive process_query re-entry then collapses each constant-pinned atom to a single aggregated gate_plus token, while the remaining atoms keep the standard single-component hierarchical shape. This is the read-once factoring constant-pinning prescribes – the pinned atom's contribution factors out as an independent gate_times child of the result.

Definition at line 3179 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ compact_orphan_rtes()

void compact_orphan_rtes ( Query * q,
Bitmapset * orphans )
static

Drop orphan RTEs (the inlined subquery slots) from q->rtable and renumber every surviving Var / RangeTblRef.

Definition at line 4325 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ compute_atom_components()

int compute_atom_components ( Query * q,
Node * quals,
int * atom_to_comp )
static

Compute atom-level connected components.

Two atoms belong to the same component iff there is a chain of equality conjuncts in quals that connects one of their Vars to one of the other's Vars. Uses a quick atom-level union-find driven by the equality pairs already extracted by safe_collect_equalities, then compacts representatives into 0-based component ids written into atom_to_comp.

Returns
Number of distinct components.

Definition at line 2646 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ find_hierarchical_root_atoms()

List * find_hierarchical_root_atoms ( const constants_t * constants,
Query * q,
Node * quals,
List ** groups_out )
static

Run the hierarchy detector on q, returning per-atom rewrite info.

Builds the variable equivalence relation induced by WHERE-clause Var = Var equalities, identifies a "root variable" (a class whose member Vars touch every base RTE), and decides how each remaining shared class is materialised. Fully-covered non-root classes become extra projection slots in the per-atom inner wrap. Partial-coverage shared classes – those touching at least two but not all atoms – trigger the multi-level path: the affected atoms are bundled into an inner sub-Query whose GROUP BY folds the partial-coverage variables before the outer join with the remaining atoms. Returns a list of safe_rewrite_atom * (one per q->rtable entry, in rtable order) plus, via groups_out, the list of safe_inner_group * the rewriter must build.

Returns NIL when the query is not in the currently-supported shape:

  • fewer than two atoms (single-relation query needs no rewrite);
  • no root variable within the (single) connected component. Disconnected components are handled upstream in try_safe_query_rewrite via rewrite_multi_component, so this bail covers only the "connected but no variable touches every atom" case ;
  • an atom whose root binding spans more than one column. The rewrite would have to push an intra-atom equality (e.g. A(x,x) when x is the root) into the inner subquery ; the current rewriter does not synthesise such an equality and bails ;
  • a Var in quals or q->targetList that does not fit any of the slot kinds the rewriter knows how to expose : the fully-covered class (extra outer slot), a partial-coverage class whose atom landed in an inner group (inner-group slot), or a single-atom head Var on an atom whose wrap can carry an extra projection slot. Body-only Vars on a single-atom class and partial-coverage classes that the multi-group / bridge merger cannot route to any group fall outside this set and trigger the bail.

The shape gate in is_safe_query_candidate has already enforced self-join-free, no aggs / windows / sublinks etc.; this function only adds the hierarchy-specific checks.

Parameters
constantsCached extension OIDs (unused here; reserved for future class/type lookups).
qInput Query; the detector only reads it.
qualsResidual WHERE quals (post-split): the cross-atom conjunction that the union-find must reason about. Single-atom conjuncts have been extracted upstream and stored separately per atom.
groups_outOut: list of safe_inner_group * produced when the partial-coverage path fires; NIL when the rewriter only needs single-level outer wraps.

Definition at line 808 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ inline_one_subquery()

void inline_one_subquery ( Query * q,
int target_rti )
static

Inline the subquery RTE at target_rti into q in place.

Caller is responsible for compaction (the orphan RTE is left in q->rtable so other still-pending rtindex references don't shift mid-pass).

Definition at line 4186 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ is_inlineable_subquery()

bool is_inlineable_subquery ( Query * sub)
static

Decide whether sub may be inlined.

See the chapter comment above for the predicate; the recursion through nested RTE_SUBQUERY entries is bounded by the input query's syntactic nesting depth.

Definition at line 4113 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ is_safe_query_candidate()

bool is_safe_query_candidate ( const constants_t * constants,
Query * q,
Bitmapset * approved_self_join_relids )
static

Walk a Query and reject anything outside the safe-query scope.

Accepts only:

  • self-join-free conjunctive queries
  • no aggregation, window functions, DISTINCT ON, LIMIT/OFFSET, sublinks, or top-level set operations. Top-level UCQs (UNION / EXCEPT / INTERSECT) are processed branch-by-branch by the planner's recursive process_query, so each branch reaches this gate on its own and the outer set-operation node bails here.
  • an outer GROUP BY or top-level DISTINCT. Without one, the per-atom SELECT DISTINCT wraps would shrink the user- visible row count, so the rewrite would change the result set.
  • all base relations have a provenance metadata entry, none are OPAQUE. BID atom block-key validation is deferred to the rewriter (we cannot check it without knowing the root variable).
Returns
true iff q is a candidate for the safe-query rewrite.

Definition at line 94 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ rewrite_hierarchical_cq()

Query * rewrite_hierarchical_cq ( const constants_t * constants,
Query * q,
List * atoms,
List * groups,
Node * residual )
static

Apply the (multi-level when needed) hierarchical-CQ rewrite.

Walks the outer rtable in original order. Each atom is replaced by an RTE_SUBQUERY. Atoms with group_id == -1 get a direct outer wrap (SELECT DISTINCT on their projection slots). The first atom of each inner group emits the group's sub-Query (safe_build_group_subquery), and subsequent group members are skipped from the outer rtable – they live inside the inner sub-Query. The outer rtable therefore has one entry per outer-wrap atom plus one entry per inner group, generally fewer than the original.

The remap pass then rewrites every base Var in the outer targetList and residual WHERE. Both outer-wrap and grouped Vars resolve by scanning the atom's proj_slots for the matching base_attno: the new varno is the atom's (or its group's) outer_rtindex, and the new varattno is the slot's 1-based position in proj_slots (which matches the output column of the outer wrap or of the inner sub-Query's targetList).

Definition at line 2419 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ rewrite_multi_component()

Query * rewrite_multi_component ( const constants_t * constants,
Query * q,
Node * residual,
List ** per_atom_quals,
int * atom_to_comp,
int ncomp )
static

Apply the multi-component rewrite.

Assumes atom_to_comp partitions the q->rtable atoms into ncomp connected components (ncomp >= 2) and that every TargetEntry in q->targetList has all its Vars in a single component. Builds one inner Query per component, each carrying:

  • the component's atoms as RTE_RELATION clones,
  • the cross-atom WHERE conjuncts and atom-local pushed quals confined to those atoms,
  • the slice of q->targetList referencing this component's atoms (fresh ressortgroupref) plus matching groupClause, and assembles an outer Query whose rtable is the list of inner sub-Queries. Each output row's provenance is the gate_times of the per-component provsqls; Choice A re-entry lets the single-component rewriter handle each component independently.

Returns NULL to fall through when any component has no Var- carrying TargetEntry to anchor its inner sub-Query (the all- constant case, e.g. SELECT DISTINCT 1 FROM A,B, is deferred).

Definition at line 2778 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_build_group_subquery()

Query * safe_build_group_subquery ( Query * outer_src,
safe_inner_group * gr,
List * atoms )
static

Build the inner sub-Query that aggregates a group of partial-coverage atoms over their non-root shared variables.

The sub-Query's rtable contains a clone of each member atom's RangeTblEntry, in original-rtindex order. Its WHERE is the AND of gr->inner_quals (cross-atom conjuncts within the group) and every member atom's pushed_quals; each conjunct's Var.varno is remapped from the outer atom rtindex to the inner rtindex via safe_inner_varno_remap_mutator. The targetList exposes a single column carrying the root-class binding of the first member; the groupClause aggregates the per-group provenance over the inner shared variables. When process_query re-enters on this sub-Query, the hierarchical-CQ rewriter fires again and wraps each member atom with its own SELECT DISTINCT.

Definition at line 2217 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_build_inner_wrap()

Query * safe_build_inner_wrap ( Query * outer_src,
RangeTblEntry * base_rte,
List * proj_slots,
Index outer_rtindex,
List * pushed_quals )
static

Build the inner Query that projects every slot in proj_slots of base_rte under SELECT DISTINCT.

One TargetEntry and one SortGroupClause are emitted per slot, in proj_slots order; the root-class slot is conventionally first, so it always ends up at output attno 1. The recursive process_query call on this Query will discover the provsql column on base_rte and append it to the inner target list, so the wrapping outer query gets the slot columns at attno 1..N and the provsql column at attno N+1.

Definition at line 2048 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_collect_equalities()

void safe_collect_equalities ( Node * quals,
List ** out )
static

Walk quals as a tree of AND, collecting equality conjuncts.

Returns the matched (Var *, Var *) pairs as a flat list, with each pair contributed as two consecutive list entries (left, right). Non-equality conjuncts are silently ignored (they do not contribute to the variable equivalence relation but do not prevent the rewrite either; the detector will fall back to bail if no root variable is found). OR / NOT are not traversed – they are not equijoins, and accepting them here would be unsound.

Definition at line 479 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_collect_varnos_walker()

bool safe_collect_varnos_walker ( Node * node,
safe_collect_varnos_ctx * ctx )
static

Collect the distinct base-level varno values referenced by an expression sub-tree.

Used by safe_split_quals to decide whether a WHERE conjunct is "atom-local" (single varno ⇒ pushable) or "cross-atom" (≥ 2 varnos ⇒ stays in the residual). Outer Vars (varlevelsup > 0) are ignored; the safe-query candidate gate has already rejected sublinks, so they cannot legitimately appear here.

Definition at line 521 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_collect_vars_walker()

bool safe_collect_vars_walker ( Node * node,
safe_collect_vars_ctx * ctx )
static

Tree walker that collects every distinct base-level Var node.

"Base level" means varlevelsup == 0; outer references are ignored. Vars are deduplicated by (varno, varattno).

Definition at line 333 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_flatten_and()

void safe_flatten_and ( Node * n,
List ** out )
static

Flatten the top-level AND tree of a WHERE clause into a flat list of conjunct Node *.

Mirrors safe_collect_equalities' AND-tree recursion: a top-level List is treated as an implicit AND (as FromExpr->quals can be), a BoolExpr with AND_EXPR has its args recursed into, and any other node is treated as a single leaf conjunct (including OR / NOT / NULL-tests / equalities / arbitrary OpExpr). The resulting list shares pointers with the original tree; callers must copyObject before mutating.

Definition at line 546 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_flatten_inner_joins_inplace()

bool safe_flatten_inner_joins_inplace ( Query * q)
static

In-place flattener for the Query at any nesting depth.

Walks q's fromlist (replacing JoinExprs with their flat arms and AND-merging their ON clauses), then recurses into every RTE_SUBQUERY body. Returns false when any unsupported JoinExpr shape is hit (outer join, alias, USING), which causes the whole pre-pass to bail.

The caller is responsible for having copyObject'd q first. Recursion runs the subquery-body in-place mutation on the same copy.

Definition at line 4584 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_flatten_join_arm()

List * safe_flatten_join_arm ( Node * arm,
safe_flatten_join_ctx * ctx )
static

Recursively flatten a fromlist arm into a list of RangeTblRef nodes, appending each JoinExpr's ON clause to ctx->merged_quals along the way.

On failure (an unsupported JoinExpr shape), sets ctx->failed and returns NIL ; the caller bails out via the failed flag.

Definition at line 4513 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_has_join_expr_anywhere()

bool safe_has_join_expr_anywhere ( Query * q)
static

Recursively scan q (and any subquery body in its rtable) for a JoinExpr in any fromlist.

Used as a quick-exit gate before the (more expensive) copy-and-flatten path.

Definition at line 4555 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_inline_compact_mutator()

Node * safe_inline_compact_mutator ( Node * node,
safe_inline_compact_ctx * ctx )
static

Renumber Vars / RangeTblRefs via old_to_new.

Slots mapped to 0 (the inlined-orphan rtindexes) would signal a live reference into a dropped RTE – defensively, the node passes through unchanged so the downstream candidate gate notices and refuses the query.

Definition at line 4283 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_inline_shift_mutator()

Node * safe_inline_shift_mutator ( Node * node,
safe_inline_shift_ctx * ctx )
static

Add offset to the varno of every base-level (varlevelsup == 0) Var and the rtindex of every RangeTblRef in node.

Used when relocating a subquery's rtable entries into the tail of the outer query's rtable.

Outer Vars (varlevelsup > 0) and outer RangeTblRefs cannot legitimately appear in an inlineable subquery – the inlineable predicate refuses LATERAL RTEs and sublinks – but we leave them alone defensively.

Definition at line 4038 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_inline_subst_mutator()

Node * safe_inline_subst_mutator ( Node * node,
safe_inline_subst_ctx * ctx )
static

Replace every outer-scope Var pointing at the inlined subquery RTE with a shifted copy of the matching target-list entry's expression.

The substituted expression is copyObject'd before its base-level Vars / RangeTblRefs are renumbered by outer_offset, so the inlined subquery's targetList is left intact for any other outer-scope Var still referencing it.

Definition at line 4082 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_inner_varno_remap_mutator()

Node * safe_inner_varno_remap_mutator ( Node * node,
safe_inner_varno_remap_ctx * ctx )
static

Rewrite base-level Var.varno from the outer atom rtindex to the corresponding inner-sub-Query rtindex.

Applied to each conjunct that the partition pass routed into an inner group (and to every pushed-down atom-local qual of every grouped atom) before injection into the inner sub-Query's jointree->quals. varattno is unchanged – the inner sub-Query's RTEs are fresh clones of the same base relations, so the base attribute numbers carry over.

Definition at line 2176 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_is_var_const_equality()

bool safe_is_var_const_equality ( Expr * qual,
Var ** var,
Const ** konst )
static

Recognise OpExpr nodes of shape Var = Const.

Mirrors safe_is_var_equality but matches an equality between a base-level Var and a planner-time Const literal (in either argument order), again stripping RelabelType wrappers to see through binary- coercion casts. This is the recogniser used by the constant-selection elimination pass (Dalvi & Suciu 2007 §5.1, induced FD R.a) to flag union-find roots as pinned to a literal. Volatile predicates never reach this point: safe_split_quals routes them to the residual, and the constant-selection scan walks both per_atom and the residual but stops at any OpExpr that is not a plain Var/Const equality.

On match, *var receives the base-level Var and *konst the literal; on no match, returns false without touching them. NULL constants (constisnull) do not yield an FD – equality to NULL is never satisfied in standard SQL semantics, so the planner would have already short-circuited the row at executor time, but a Var = NULL conjunct is still SQL-legal and we conservatively reject it as non-FD-inducing.

Definition at line 429 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_is_var_equality()

bool safe_is_var_equality ( Expr * qual,
Var ** l,
Var ** r )
static

Recognise a top-level WHERE conjunct that equates two base Vars.

Matches OpExpr nodes whose operator is the canonical equality for the two operand types and whose operands strip down to base-level Var nodes (possibly through RelabelType wrappers, which carry binary-coercion casts). Returns true and fills *l, *r when matched.

Definition at line 377 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_outer_te_remap_mutator()

Node * safe_outer_te_remap_mutator ( Node * node,
safe_outer_te_remap_ctx * ctx )
static

Rewrite Vars in the outer targetList for the multi-component rewrite.

Each base-level Var(varno=v, varattno=a) in the user's targetList is looked up in atom_to_inner_attno[v-1] to find which output column of the matching component's inner sub-Query carries the Var, then rewritten to point at that component's RTE_SUBQUERY in the outer rtable. A Var whose atom_to_inner_attno entry is 0 (i.e. the detector did not pick this column for its inner sub-Query) indicates a bug or a query the caller should have refused; we provsql_error to surface it.

Definition at line 2720 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_partition_residual()

void safe_partition_residual ( Node * residual,
List * atoms,
List * groups,
Node ** outer_residual_out )
static

Partition the cross-atom residual into per-group conjuncts and a new outer residual.

For every top-level AND conjunct of residual:

  • if every base-level Var it references points at an atom in some inner group's member set, the conjunct moves into that group's inner_quals (in original-varno space; the rewriter remaps to inner varnos when it builds the sub-Query);
  • otherwise the conjunct stays in the rebuilt outer residual.

Volatile conjuncts always stay in the outer residual: collapsing the row count inside a sub-Query with an aggregating GROUP BY would change how many times the volatile function runs.

Definition at line 646 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_pushed_remap_mutator()

Node * safe_pushed_remap_mutator ( Node * node,
safe_pushed_remap_ctx * ctx )
static

Rewrite Var.varno from the outer atom rtindex to 1, the sole RTE of the inner wrap subquery.

Applied to each pushed conjunct before it is AND-injected into the inner Query's jointree->quals. varattno is preserved (the inner subquery's RTE is a fresh clone of the same base relation).

Definition at line 724 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_remap_vars_mutator()

Node * safe_remap_vars_mutator ( Node * node,
safe_remap_ctx * ctx )
static

Rewrite Var nodes in the outer query after each base RTE has been wrapped as a DISTINCT subquery projecting one or more slot columns.

For each base-level Var (varno, varattno), the matching atom is atoms[varno-1]. We scan its proj_slots in order, looking for a slot with base_attno == varattno, and remap the Var to the 1-based output position of that slot. A Var with no matching slot indicates the detector accepted a query it shouldn't have; we provsql_error to surface the bug rather than silently emit a broken plan.

Definition at line 1975 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_split_quals()

void safe_split_quals ( Node * quals,
int natoms,
List ** per_atom_out,
Node ** out_residual )
static

Partition top-level WHERE conjuncts into atom-local quals (pushed into the matching atom's inner wrap) and the residual cross-atom quals (which stay in the outer query).

per_atom_out is an array of length natoms (caller-allocated, zero-initialised) of List *. out_residual receives the residual conjunction, rebuilt as a single Node *: NULL when no residual conjuncts remain, the bare conjunct when exactly one, or a fresh BoolExpr(AND_EXPR) otherwise.

A conjunct is pushed when its base-level Vars all reference the same varno and the conjunct contains no volatile function calls. Volatile predicates are kept in the outer scope because the inner SELECT DISTINCT can collapse the evaluation count – we deliberately avoid changing the number of times such functions run.

The pushed conjuncts are still shared with the original q->jointree->quals tree; safe_build_inner_wrap copyObject's them before performing the varno remap.

Definition at line 585 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_unify_remap_mutator()

Node * safe_unify_remap_mutator ( Node * node,
safe_unify_remap_ctx * ctx )
static

Tree mutator that renumbers Var.varno and RangeTblRef.rtindex through the PK-unifiable self-join map.

Applied to every node of the unified Query : the targetList, the jointree (which itself contains RangeTblRef leaves referring to surviving RTEs as well as expression subtrees in the quals), the havingQual when present, etc. Vars with varlevelsup > 0 are outer references and left alone – the candidate gate has already rejected sublinks, so they cannot legitimately appear, but the guard keeps the mutator local.

varnosyn / varattnosyn (PG 13+ "syntactic" parallel of the semantic rtindex used by ruleutils.c's deparser) are kept in sync; without that, pg_get_querydef on the unified query stack-overflows because the syntactic dereference and the semantic one disagree.

Definition at line 3411 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ safe_var_index()

int safe_var_index ( List * vars,
Index varno,
AttrNumber varattno )
static

Find the position of a Var inside the vars list (matched on (varno, varattno)).

Returns -1 if not present.

Definition at line 356 of file safe_query.c.

Here is the caller graph for this function:

◆ try_disjoint_constant_self_join_split()

Bitmapset * try_disjoint_constant_self_join_split ( Query * q)
static

Disjoint-constant self-join certification.

When two (or more) RTEs over the same relation each carry a Var = Const conjunct on the same column with provably-different literals, their tuple-sets are disjoint: a single base-relation row can satisfy at most one of the constant predicates, so the provsql tokens never overlap across the RTEs. The shared-relid bail in is_safe_query_candidate then becomes too conservative – the standard per-atom SELECT DISTINCT wrap on each RTE (with its constant predicate pushed in) factors the relation into disjoint virtual partitions, each acting as an independent atom.

This pre-pass runs after try_pk_self_join_unification and before is_safe_query_candidate. For each same-relid group of >1 RTE remaining in q->rtable, it checks pairwise whether every pair has Var = Const conjuncts on the same varattno with provably distinct literal values. When the entire group satisfies the check, the relid is added to the returned Bitmapset; the candidate gate consults that set and skips the shared-relid bail for those relids.

"Provably distinct" uses datumIsEqual on the Const values after matching consttype: two literals of the same type with different constvalue are guaranteed different at executor time. Conservative: when types disagree or when datumIsEqual cannot decide (TOAST'ed varlena where the stored representation differs from the logical value), the pair is treated as NOT provably-disjoint – the certification simply doesn't fire on that group, and the candidate gate's existing shared-relid bail refuses the query as before.

Soundness traps:

  • Disjointness on the same column (varattno match). A pair like r1.kind = 'A' AND r2.color = 'B' is NOT disjoint – an R-tuple with kind = 'A' and color = 'B' satisfies both.
  • Pairwise across every pair: a 3-RTE group with two disjoint pairs but one non-disjoint pair stays not certified; partial certification would mean the candidate gate still finds two RTEs of the same relid that are NOT provably disjoint, and the rewrite would be unsound on the rows where both predicates can match.
  • Equality-to-literal only: inequalities (r.kind <> 'A') do not pin a column to a single value and do not contribute to provable disjointness. safe_is_var_const_equality enforces this through the operator-OID check.
  • Transitive disjointness via FDs (e.g. kind category, with r1.category = 'X' / r2.category = 'Y') is deferred to the general FD closure follow-up.

Definition at line 3823 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ try_flatten_inner_joins()

Query * try_flatten_inner_joins ( Query * q)
static

Pre-pass driver.

Returns NULL when nothing flattened; else a fresh Query with INNER JoinExprs at every nesting depth dissolved into flat RangeTblRefs, their ON clauses AND-merged into the corresponding quals, and the rtable compacted to drop orphan RTE_JOIN entries.

Definition at line 4652 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ try_inline_simple_subqueries()

Query * try_inline_simple_subqueries ( Query * q)
static

Subquery-inlining pre-pass.

See the chapter comment. Returns NULL when nothing inlined (caller keeps the original q); else a fresh Query with the inlining and compaction baked in.

Definition at line 4374 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ try_pk_self_join_unification()

Query * try_pk_self_join_unification ( Query * q)
static

PK-unifiable self-join detection and unification.

A query of shape R r1, R r2 WHERE r1.x = r2.x with PRIMARY KEY (x) on R forces r1 and r2 to refer to the same tuple. The two RTEs collapse to one single-atom CQ; the safe-query candidate gate's "no two RTEs may share a relid" bail becomes a missed-opportunity bail when the PK proves the shared-relid pair is non-self-joining at the tuple level.

This pre-pass runs before is_safe_query_candidate. It returns NULL when no unification fires; otherwise it returns a fresh Query in which:

  • For every group of same-relid RTEs whose pairwise PK columns are equated through the union-find closure of the residual equijoins, all but one member (the lowest-rtindex survivor) are dropped from rtable.
  • jointree->fromlist's RangeTblRef entries are renumbered or dropped to match. Multiple original entries pointing at the same survivor are deduplicated.
  • Every Var.varno (and parallel varnosyn) in targetList and jointree->quals is rewritten to point at the survivor's new (compacted) rtindex.

Soundness traps:

  • Composite PK requires every column to be equated; the pairwise check uses the union-find closure so transitive equijoins (e.g. r1.x = r3.x AND r2.x = r3.x) suffice.
  • Partial unification (3 RTEs of the same relid where only two have their PK columns equated) bails the entire group: the candidate gate would otherwise still refuse the surviving duplicates. Full unification or full bail.
  • NOT-NULL UNIQUE is FD-equivalent to PRIMARY KEY (the PK-FD pass above treats them identically); the same key cache feeds this pass, and the same NOT-NULL guard excludes nullable UNIQUEs.

Definition at line 3485 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

◆ try_safe_query_rewrite()

Query * try_safe_query_rewrite ( const constants_t * constants,
Query * q )

Top-level entry point for the safe-query rewrite.

Top-level entry point for the hierarchical-CQ rewriter.

Runs the shape gate then the hierarchy detector. If both accept, applies the single-level rewrite and returns the rewritten Query; the caller (process_query) feeds it back from the top so that inner subqueries are themselves re-considered (multi-level recursion via Choice A). Returns NULL to fall through to the existing pipeline.

Definition at line 4674 of file safe_query.c.

Here is the call graph for this function:
Here is the caller graph for this function:

Variable Documentation

◆ provsql_verbose

int provsql_verbose
extern

Verbosity level; controlled by the provsql.verbose_level GUC.

Global variable that indicates the verbosity level set by the provsql.verbose_level run-time configuration parameter was set.

Definition at line 78 of file provsql.c.