ProvSQL is a PostgreSQL extension that adds semiring provenance and uncertainty management to SQL queries. It is implemented as a PostgreSQL planner hook that transparently rewrites queries – no changes to the application or schema are required.

For a full introduction to the concepts and capabilities, see the Introduction in the user documentation.

Query Rewriting

When a table is registered for provenance tracking via add_provenance(), each tuple gains a provsql UUID column. ProvSQL’s planner hook intercepts every query involving such tables and rewrites it to compute a provenance circuit over those UUIDs, appending the result UUID to the SELECT list.

The rewriter handles:

  • SELECT-FROM-WHERE, JOIN, nested subqueries
  • GROUP BY, SELECT DISTINCT
  • UNION / UNION ALL / EXCEPT
  • VALUES
  • UPDATE / INSERT / DELETE (when provsql.update_provenance is enabled)

Semiring evaluations, probability computation, and Shapley/Banzhaf values are described in the user documentation.

Lean Formalization

Key parts of the algebraic framework underlying ProvSQL – m-semirings, annotated databases, and relational algebra semantics – have been formally verified in Lean 4. See the Lean formalization page for details.

Architecture and More

Get Started