Welcome to ProvSQL’s documentation!

ProvSQL is a PostgreSQL extension that adds semiring provenance and uncertainty management to SQL queries. It transparently rewrites each query to track which input tuples contribute to each result, then evaluates that provenance in a semiring of the user's choice (Boolean reachability, counting, and more) or uses it to compute probabilities and Shapley values.
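To make the idea of evaluating provenance in a chosen semiring concrete, here is a minimal conceptual sketch in Python. It does not use ProvSQL's API and does not reflect its internal representation: the `Semiring` class, the `evaluate` function, the tuple-based expression encoding, and the tuple names `t1`, `t2`, `t3` are all hypothetical, chosen only for illustration. The sketch assumes a result tuple that can be derived either by joining `t1` with `t2` or from `t3` alone.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    """A semiring: two identities and two binary operations (hypothetical helper)."""
    zero: T                      # identity for plus (no derivation)
    one: T                       # identity for times (empty join)
    plus: Callable[[T, T], T]    # combines alternative derivations
    times: Callable[[T, T], T]   # combines jointly used tuples


def evaluate(expr, semiring, labels):
    """Evaluate a provenance expression encoded as nested tuples.

    ("var", name) looks up the label of an input tuple;
    ("plus", e1, e2, ...) and ("times", e1, e2, ...) fold their children
    with the corresponding semiring operation.
    """
    kind = expr[0]
    if kind == "var":
        return labels[expr[1]]
    if kind == "plus":
        result = semiring.zero
        for child in expr[1:]:
            result = semiring.plus(result, evaluate(child, semiring, labels))
        return result
    if kind == "times":
        result = semiring.one
        for child in expr[1:]:
            result = semiring.times(result, evaluate(child, semiring, labels))
        return result
    raise ValueError(f"unknown node kind: {kind!r}")


# Two standard semirings.
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)

# Provenance of one result tuple: (t1 joined with t2) or t3.
provenance = ("plus", ("times", ("var", "t1"), ("var", "t2")), ("var", "t3"))

# Boolean semiring: is the result still derivable if t2 is absent?
derivable = evaluate(provenance, BOOLEAN, {"t1": True, "t2": False, "t3": True})
print(derivable)  # True, because t3 alone suffices

# Counting semiring: how many derivations, given input multiplicities?
count = evaluate(provenance, COUNTING, {"t1": 2, "t2": 3, "t3": 1})
print(count)  # 7, i.e. 2 * 3 + 1
```

The same expression yields different answers depending only on the semiring and the labels attached to the input tuples, which is why the provenance can be tracked once and evaluated in several ways afterwards.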

A companion Python web UI, ProvSQL Studio, provides interactive provenance inspection on top of any ProvSQL-enabled database.

This documentation is organized into four parts:

  • The User Guide explains how to install, configure, and use ProvSQL from the SQL level and through ProvSQL Studio. Start here if you are new to ProvSQL.

  • The Case Studies present extended worked examples of ProvSQL applied to realistic scenarios, covering a broader range of features than the introductory tutorial.

  • The Developer Guide describes ProvSQL’s internal architecture and is aimed at contributors. It covers the PostgreSQL extension concepts ProvSQL relies on, the architecture and component map, the query rewriting pipeline, memory management, the where-provenance and data-modification subsystems, aggregation and semiring evaluation, probability computation, coding conventions, testing, debugging, the build system, and ProvSQL Studio’s architecture.

  • The API Reference provides reference documentation for the SQL and C/C++ APIs, generated automatically with Doxygen.
