Build System
This page describes how ProvSQL is built, how PostgreSQL version compatibility is handled, and how CI is configured.
Makefile Structure
ProvSQL uses two Makefiles:
- Makefile (top-level) – user-facing targets: make, make test, make docs, make website, make deploy. This file delegates to Makefile.internal for the actual build.
- Makefile.internal – the real build file, based on PostgreSQL’s PGXS (PostgreSQL Extension Building Infrastructure). It includes the makefile reported by $(PG_CONFIG) --pgxs to inherit compiler flags, install paths, and the pg_regress test runner.
Key variables in Makefile.internal:
- MODULE_big = provsql – the shared library name.
- OBJS – the list of object files to compile (both C and C++).
- DATA – SQL files installed to the extension directory.
- REGRESS – test names for pg_regress.
- PG_CPPFLAGS – extra compiler flags (C++17, Boost headers).
- LINKER_FLAGS – -lstdc++ -lboost_serialization.
LLVM JIT is explicitly disabled (with_llvm = no) due to known
PostgreSQL bugs with C++ extensions.
PostgreSQL Version Compatibility
ProvSQL supports PostgreSQL 10 through 18. Version-specific C code
uses the PG_VERSION_NUM macro (from PostgreSQL’s own pg_config.h):
#if PG_VERSION_NUM >= 140000
// PostgreSQL 14+ specific code
#endif
compatibility.c / compatibility.h provide shim
functions for APIs that changed across PostgreSQL versions (e.g.,
list_insert_nth, list_delete_cell).
Generated SQL
The SQL layer is assembled from two hand-edited sources:
sql/provsql.common.sql– functions for all PostgreSQL versions.sql/provsql.14.sql– functions requiring PostgreSQL 14+.
The Makefile concatenates them into an intermediate
provsql.sql based on the major version reported by
pg_config:
sql/provsql.sql: sql/provsql.*.sql
	cat sql/provsql.common.sql > sql/provsql.sql
	if [ $(PGVER_MAJOR) -ge 14 ]; then \
		cat sql/provsql.14.sql >> sql/provsql.sql; \
	fi
provsql.sql is then copied to
sql/provsql--<version>.sql. The two files have identical content,
but the version-suffixed name is the one PostgreSQL’s extension
machinery expects: CREATE EXTENSION provsql looks up
provsql--<version>.sql in the extensions directory. The
unsuffixed provsql.sql also serves as the Doxygen input
file for the SQL API reference.
Do not edit either generated file directly. Edit the source SQL
files (provsql.common.sql / provsql.14.sql) instead.
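The version test in the rule above needs a bare integer. How Makefile.internal actually derives PGVER_MAJOR is not shown on this page; the following is only a minimal shell sketch of one way to extract a major version from the output of pg_config --version. The helper name pg_major is hypothetical.

```shell
# Hypothetical helper (not part of ProvSQL): extract the major version
# from a `pg_config --version` string such as "PostgreSQL 14.5".
pg_major() {
  # Keep only the first run of digits that follows "PostgreSQL ".
  printf '%s\n' "$1" | sed 's/^PostgreSQL \([0-9][0-9]*\).*/\1/'
}

# Example: decide whether the PostgreSQL 14+ SQL file would be appended.
version_string="PostgreSQL 16.3 (Debian 16.3-1.pgdg120+1)"
if [ "$(pg_major "$version_string")" -ge 14 ]; then
  echo "append sql/provsql.14.sql"
fi
```

On a real system the argument would come from the installed server, for example pg_major "$(pg_config --version)".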
Extension Upgrades
ProvSQL supports in-place upgrades between released versions via PostgreSQL’s standard mechanism:
ALTER EXTENSION provsql UPDATE;
When this statement is issued, PostgreSQL looks for a chain of
upgrade scripts named provsql--<from>--<to>.sql in the
extensions directory and applies them in order. ProvSQL’s upgrade
scripts live under sql/upgrades/ in the source tree and are
installed alongside the main provsql--<version>.sql file by the
DATA variable in Makefile.internal:
UPGRADE_SCRIPTS = $(wildcard sql/upgrades/$(EXTENSION)--*--*.sql)
DATA = sql/$(EXTENSION)--$(EXTVERSION).sql $(UPGRADE_SCRIPTS)
Upgrade support starts with 1.2.1: back-upgrade scripts were
created for 1.0.0 → 1.1.0 → 1.2.0 → 1.2.1, and every release
since 1.2.1 has its own committed upgrade script, so users on any
supported version can run a single ALTER EXTENSION provsql UPDATE
to reach the current release.
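To check by hand that a chain of upgrade scripts connects two versions, something like the sketch below can help. It assumes a strictly linear chain (one script per starting version) and simply follows file names; PostgreSQL's own path search is more general. The function name upgrade_chain is hypothetical.

```shell
# Hypothetical helper: print the upgrade steps from one version to another
# by following provsql--<from>--<to>.sql file names in a directory.
# Assumes a linear chain; the first matching script per version is used.
upgrade_chain() {
  dir=$1
  cur=$2
  target=$3
  while [ "$cur" != "$target" ]; do
    next=
    for f in "$dir"/provsql--"$cur"--*.sql; do
      [ -e "$f" ] || continue     # glob matched nothing
      base=${f##*/}               # e.g. provsql--1.0.0--1.1.0.sql
      base=${base%.sql}           # e.g. provsql--1.0.0--1.1.0
      next=${base##*--}           # e.g. 1.1.0
      break
    done
    if [ -z "$next" ]; then
      echo "no upgrade script starting at $cur" >&2
      return 1
    fi
    echo "$cur -> $next"
    cur=$next
  done
}
```

For example, upgrade_chain sql/upgrades 1.0.0 1.2.1 prints one line per step and returns a non-zero status if the chain is broken before reaching the target.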
Writing an Upgrade Script
Each upgrade script is a hand-written delta file that re-runs the
SQL changes made during one release cycle. Since every release so
far has consisted of in-place CREATE OR REPLACE FUNCTION rewrites
or purely additive CREATE FUNCTION / CREATE CAST statements,
writing an upgrade script is usually a matter of copy-pasting the
modified function bodies from sql/provsql.common.sql (or
sql/provsql.14.sql) into a new file under sql/upgrades/:
/**
* @file
* @brief ProvSQL upgrade script: A.B.C → X.Y.Z
*
* <one-paragraph summary of SQL-surface changes>
*/
SET search_path TO provsql;
CREATE OR REPLACE FUNCTION ... ;
-- or CREATE FUNCTION, CREATE CAST, etc.
The script runs inside an implicit transaction (the whole
ALTER EXTENSION UPDATE is one transaction), so any error rolls
everything back. The script is executed once when transitioning
from A.B.C to X.Y.Z; it does not have to be idempotent on its own.
Non-idempotent statements such as CREATE SCHEMA or CREATE TABLE
(without IF NOT EXISTS), CREATE TYPE, CREATE OPERATOR,
CREATE CAST, and CREATE AGGREGATE are allowed only if the
upgrade is the first to introduce the corresponding object –
PostgreSQL will not re-run the script against an already-upgraded
installation.
If a release genuinely introduces no SQL-surface change, the upgrade
script still has to exist so that PostgreSQL can offer the update
path, but it may be a no-op (just the header comment and a
SET search_path). The release.sh script (see below)
auto-generates such a no-op file when it detects no SQL diff since
the previous tag.
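A no-op upgrade script of the kind described above can be produced with a few lines of shell. This is a sketch only: the actual logic in release.sh is not reproduced here, and the function name write_noop_upgrade and the summary sentence in the header are assumptions.

```shell
# Hypothetical helper: write a no-op upgrade script
# provsql--<prev>--<new>.sql into the given directory and print its path.
write_noop_upgrade() {
  out="$1/provsql--$2--$3.sql"
  cat > "$out" <<EOF
/**
 * @file
 * @brief ProvSQL upgrade script: $2 → $3
 *
 * No SQL-surface changes in this release.
 */
SET search_path TO provsql;
EOF
  echo "$out"
}
```

Called as write_noop_upgrade sql/upgrades 1.2.1 1.2.2, it would create sql/upgrades/provsql--1.2.1--1.2.2.sql containing only the header comment and the SET search_path statement.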
The On-Disk mmap ABI
In-place upgrades only cover the SQL catalog state. ProvSQL’s
persistent circuit lives in four memory-mapped files per database
(provsql_gates.mmap, provsql_wires.mmap,
provsql_mapping.mmap, provsql_extra.mmap), each under
$PGDATA/base/<db_oid>/. Every file starts with a 16-byte header
(magic, version, element size) that is validated on open – see
the Memory Management chapter for details.
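When debugging a header mismatch, it can be useful to look at the raw header bytes. The sketch below only dumps the first 16 bytes of a file as hexadecimal; it makes no assumption about the width or byte order of the magic, version, and element-size fields, which are documented in the Memory Management chapter. The function name mmap_header_hex is hypothetical.

```shell
# Hypothetical helper: print the first 16 bytes of a file as one line of
# lowercase hexadecimal digits (two digits per byte).
mmap_header_hex() {
  od -A n -t x1 -N 16 "$1" | tr -d ' \n'
  echo
}
```

It would typically be pointed at one of the four files, for example mmap_header_hex "$PGDATA/base/<db_oid>/provsql_gates.mmap", while the server is stopped.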
An upgrade also requires that the binary layout of
GateInformation (in MMappedCircuit.h), the
gate_type enum (in provsql_utils.h), and the
MMappedUUIDHashTable slot structure are all byte-compatible
between the two versions. Since 1.0.0 these layouts have been
deliberately frozen: no commit has changed any of them. The block
comments at the top of those files explicitly call this out.
If a future contribution has to touch the on-disk layout, the upgrade story has to change:
- Bump the on-disk format version. Increment the version field in the 16-byte header, write a migration pass in the background worker or as a standalone tool, and mention the one-time migration cost in the upgrade script.
- Break upgrades for that release. Document it in the release notes and ship a hard failure at worker startup if the existing *.mmap files carry an unrecognised version – safer than silently misreading the bytes. Users then go through DROP EXTENSION provsql CASCADE; CREATE EXTENSION provsql, losing only the circuit data (their base tables are unaffected).
Migration from pre-1.3.0 (flat file layout). Before 1.3.0, all
databases shared a single set of files directly in $PGDATA/
(without the base/<db_oid>/ prefix and without a format header).
The standalone tool provsql_migrate_mmap (built with
make provsql_migrate_mmap) migrates those old flat files to the
new per-database layout. It must be run as the postgres user
before restarting the server with the 1.3.0 binaries:
provsql_migrate_mmap -D $PGDATA -c "host=/var/run/postgresql"
The tool connects via libpq to enumerate databases, collects root
UUIDs from provenance-tracked tables, BFS-traverses the old circuit,
writes per-database files, and deletes the old flat files on success.
The upgrade script provsql--1.2.3--1.3.0.sql raises a
WARNING if old flat files are still present when
ALTER EXTENSION provsql UPDATE is run.
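Before running the upgrade, an administrator can check from the shell whether old flat files are still lying around. The sketch below looks only for provsql_*.mmap directly in the data directory and ignores the per-database files under base/<db_oid>/. The function name has_flat_mmap is hypothetical, and this is not the check the upgrade script itself performs.

```shell
# Hypothetical helper: succeed if any pre-1.3.0 flat file
# (provsql_*.mmap directly under the data directory) is still present.
has_flat_mmap() {
  for f in "$1"/provsql_*.mmap; do
    if [ -e "$f" ]; then
      return 0
    fi
  done
  return 1
}
```

A typical use would be: if has_flat_mmap "$PGDATA"; then echo "run provsql_migrate_mmap first"; fi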
Workflow During Development
Release-engineering for upgrades follows this rhythm:
- When a contributor adds or modifies SQL in provsql.common.sql or provsql.14.sql, they do not have to write a committed upgrade script themselves. The maintainer who cuts the next release is responsible for writing it. During the dev cycle, the Makefile auto-generates an empty dev-cycle upgrade script (see The Auto-Generated Dev-Cycle Upgrade Script) so that ALTER EXTENSION provsql UPDATE is structurally reachable to default_version; that file is purely a placeholder and does not replay the SQL deltas introduced during the cycle.
- When cutting a release, release.sh checks for sql/upgrades/provsql--<prev>--<new>.sql and refuses to proceed if it is missing, unless git diff shows no SQL source changes since the previous tag (in which case it auto-generates a no-op script and commits it).
- When touching any mmap-serialised struct or enum, the contributor must either preserve binary compatibility (e.g., by appending a new gate_type enumerator at the end, never in the middle) or coordinate a format-version bump with the maintainer – see the warning block at the top of provsql_utils.h and MMappedCircuit.h.
Automated Testing of the Upgrade Path
The upgrade chain is exercised end-to-end by a pg_regress test,
test/sql/extension_upgrade.sql. Because the test is
destructive (it DROPs the extension CASCADE to replay
the upgrade from a clean 1.0.0 state), it must run strictly
after every other test in the suite, including the
schedule.14-only tests that follow schedule.common on
PostgreSQL 14+. To guarantee that ordering regardless of which
source schedule files are active, the Makefile.internal
rule that assembles test/schedule appends a single
test: extension_upgrade line after concatenating
schedule.common and (where applicable) schedule.14:
test/schedule: $(wildcard test/schedule.*)
	cat test/schedule.common > test/schedule
	if [ $(PGVER_MAJOR) -ge 14 ]; then \
		cat test/schedule.14 >> test/schedule; \
	fi
	echo "test: extension_upgrade" >> test/schedule
so the final schedule always ends with extension_upgrade.
The test is therefore not listed in either schedule.common
or schedule.14 source files.
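The ordering guarantee can be spot-checked on a generated schedule file. The sketch below verifies two properties implied by the rule above: the last line is the upgrade test, and extension_upgrade is mentioned on exactly one line. The function name check_schedule is hypothetical and nothing like it is claimed to exist in the repository.

```shell
# Hypothetical helper: succeed only if the schedule file ends with the
# upgrade test and mentions extension_upgrade on exactly one line.
check_schedule() {
  # The last line must be the upgrade test...
  [ "$(tail -n 1 "$1")" = "test: extension_upgrade" ] || return 1
  # ...and no other line may mention it.
  [ "$(grep -c 'extension_upgrade' "$1")" -eq 1 ]
}
```

After a build, check_schedule test/schedule should succeed; a failure would suggest that the test was also added to one of the source schedule files.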
The test itself:
1. Drops the current provsql extension (CASCADE destroys all provenance-tracked state from preceding tests).
2. Installs the oldest supported version via CREATE EXTENSION provsql VERSION '1.0.0'. PostgreSQL picks up provsql--1.0.0.sql from the extensions directory – a frozen install-script fixture generated at build time from sql/fixtures/provsql--1.0.0-common.sql and sql/fixtures/provsql--1.0.0-14.sql (themselves exact copies of the historical 1.0.0 source files, extracted via git show v1.0.0:sql/...).
3. Runs ALTER EXTENSION provsql UPDATE (no TO clause), so PostgreSQL advances the extension all the way to whatever the current default_version is – walking the full chain of committed upgrade scripts under sql/upgrades/ and, on a development build, the auto-generated empty dev-cycle upgrade script described below.
4. Asserts that the post-upgrade extversion equals default_version (read from pg_available_extensions) via a boolean comparison, so the expected output never contains a hard-coded version string and the test stays correct as master advances.
5. Runs a tiny smoke query against the upgraded extension to confirm that the query rewriter, add_provenance, create_provenance_mapping, and a compiled semiring evaluator still work.
The test runs on every PostgreSQL version in the CI matrix
(10 through 18), because it lives inside the standard pg_regress
suite. It catches regressions in: committed upgrade scripts,
the DATA wildcard in Makefile.internal that ships them,
and the binary stability of the mmap format across versions.
The Auto-Generated Dev-Cycle Upgrade Script
Between releases, HEAD’s default_version is a dev version
(e.g., 1.3.0-dev) for which no committed upgrade script
exists. Rather than maintaining a hand-written dev-cycle script
on master, Makefile.internal detects dev versions and
generates an empty upgrade script at build time:
ifneq ($(findstring -dev,$(EXTVERSION)),)
LATEST_RELEASE = $(shell git describe --tags --abbrev=0 --match 'v[0-9]*' \
2>/dev/null | sed 's/^v//')
ifneq ($(LATEST_RELEASE),)
DEV_UPGRADE = sql/$(EXTENSION)--$(LATEST_RELEASE)--$(EXTVERSION).sql
endif
endif
If a committed upgrade script already exists for the same
LATEST_RELEASE → BARE_VERSION pair (i.e.
sql/upgrades/provsql--<prev>--<bare>.sql without the -dev
suffix), the Makefile copies it to the dev script so that content
such as migration warnings is exercisable during the dev cycle.
Otherwise the file is created by a touch $@ recipe. Either way
the file is included in DATA so make install ships it, and it
is matched by the existing sql/provsql--*.sql gitignore pattern
so it never lands in git.
Actual SQL changes made during the dev cycle are not captured in
this auto-generated file; they are captured by the hand-written
upgrade script that release.sh creates (or auto-generates as a
no-op) at release time. An upgrade from the previous release
directly to an intermediate dev commit therefore works at the
ALTER EXTENSION mechanism level but does not replay the
SQL deltas introduced during the dev cycle – master users who
need a functionally-complete upgrade should wait for the release
tag and use the committed upgrade script.
On release builds (where EXTVERSION does not end in -dev)
or on dev tarballs without a reachable git tag, DEV_UPGRADE
expands to the empty string and the Makefile falls back to the
committed upgrade scripts only.
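The decision encoded in the Makefile conditionals above can be restated in shell for clarity. This is a sketch that mirrors the snippet shown earlier in this section, not code taken from the repository; the function name dev_upgrade_path is hypothetical.

```shell
# Hypothetical helper mirroring the DEV_UPGRADE logic:
#   $1 = extension name, $2 = latest release (may be empty), $3 = EXTVERSION.
# Prints the dev-cycle upgrade script path, or nothing when no such script
# applies (release build, or no reachable release tag).
dev_upgrade_path() {
  extension=$1
  latest_release=$2
  extversion=$3
  case "$extversion" in
    *-dev*)
      if [ -n "$latest_release" ]; then
        echo "sql/${extension}--${latest_release}--${extversion}.sql"
      fi
      ;;
  esac
}
```

For example, dev_upgrade_path provsql 1.2.3 1.3.0-dev prints sql/provsql--1.2.3--1.3.0-dev.sql, while a release version such as 1.3.0 prints nothing.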
Manual Testing
To exercise the same upgrade path interactively:
# Install the current build (which now ships the 1.0.0 fixture
# and all upgrade scripts alongside the current install script).
# Run as a user with write access to the PostgreSQL directories
make && make install
systemctl restart postgresql
# Install the extension at the old version and populate state.
psql <<'SQL'
CREATE DATABASE upgrade_test;
\c upgrade_test
CREATE EXTENSION provsql VERSION '1.0.0';
SELECT extversion FROM pg_extension WHERE extname='provsql';
-- ... exercise the extension, populate some provenance state ...
SQL
# Apply the full upgrade chain to the current version.
psql -d upgrade_test -c "ALTER EXTENSION provsql UPDATE;"
psql -d upgrade_test -c "SELECT extversion FROM pg_extension WHERE extname='provsql';"
# ... verify the extension still works; mmap state is intact ...
Standalone Tools
make tdkc builds the standalone Tree-Decomposition Knowledge
Compiler. It links the C++ circuit and tree-decomposition code into
an independent binary (no PostgreSQL dependency) that reads a circuit
from a text file and outputs its probability.
make provsql_migrate_mmap builds the mmap migration tool (requires
libpq-dev). It migrates old flat $PGDATA/provsql_*.mmap files
(pre-1.3.0 format) to the new per-database layout under
$PGDATA/base/<db_oid>/. See The On-Disk mmap ABI for usage details.
Documentation Build
make docs builds the full documentation:
1. Generates provsql.sql from the SQL source files.
2. Runs Doxygen twice (Doxyfile-c for C/C++, Doxyfile-sql for SQL).
3. Post-processes the SQL Doxygen HTML (C-to-SQL terminology).
4. Runs Sphinx to build the user guide and developer guide.
5. Runs the coherence checker (check-doc-links.py) to validate all sqlfunc and cfunc cross-references.
Releases
release.sh <version> (e.g. ./release.sh 1.2.0) automates
creating a new release. It:
1. Checks that gh, gpg, and a configured GPG signing key are available, and that the version string is newer than any existing vX.Y.Z tag.
2. Opens $EDITOR to collect release notes (a pre-filled template; leaving it unchanged aborts).
3. Checks that sql/upgrades/provsql--<prev>--<new>.sql exists (auto-generating a no-op script if the SQL sources have not changed since the previous tag, aborting otherwise).
4. Updates default_version in provsql.common.control, version: and date-released: in CITATION.cff, the top-level version and the provides.provsql.version in META.json (the PGXN Meta Spec file), and prepends a new entry to both website/_data/releases.yml and CHANGELOG.md (the repo-root changelog mirrors the website release-notes block, with the first ## What's new in <version> heading stripped to avoid duplicating the release heading).
5. Commits the bumped files, creates a GPG-signed annotated tag vX.Y.Z, and offers to push the commit and tag.
6. Offers to create a GitHub Release via gh release create using the collected notes.
7. Offers a post-release bump of default_version on master to the next X.(Y+1).0-dev (or a user-provided NEXT_VERSION).
The signed tag push is what the Linux CI workflow keys on to build
and push the inriavalda/provsql Docker image for the new version.
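The default post-release bump from X.Y.Z to X.(Y+1).0-dev is simple string arithmetic. The following sketch shows one way to compute it in POSIX shell; it is not copied from release.sh, and the function name next_dev_version is hypothetical.

```shell
# Hypothetical helper: given a release version X.Y.Z, print X.(Y+1).0-dev.
next_dev_version() {
  major=${1%%.*}     # text before the first dot
  rest=${1#*.}       # text after the first dot, e.g. "2.3"
  minor=${rest%%.*}  # text before the next dot
  echo "${major}.$((minor + 1)).0-dev"
}

next_dev_version 1.2.3   # prints 1.3.0-dev
```

The patch component is deliberately discarded, since the next dev version always restarts at patch level 0.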
Studio Releases
ProvSQL Studio (studio/) is released independently of the
extension, as the provsql-studio package on PyPI, from
studio-vX.Y.Z git tags. Studio’s version stream is independent of
the extension’s; the compatibility table in ProvSQL Studio
records each Studio release’s minimum required extension version.
For the Studio source-code architecture, see ProvSQL Studio.
To cut a Studio release:
1. Bump __version__ in studio/provsql_studio/__init__.py to the new X.Y.Z (the wheel’s version is dynamic and reads from that string; pyproject.toml does not need a manual edit).
2. Bump STUDIO_VERSION in docker/Dockerfile to the same X.Y.Z so the next inriavalda/provsql:<extension-version> image installs the matching Studio from PyPI. Forgetting this bump leaves the Docker image installing a stale Studio. (We do not derive this from __version__ because between releases that string is something like 1.1.0.dev0, which is not a PyPI artifact: hardcoding the last-released version is the lesser evil.)
3. Write the changelog entry in studio/CHANGELOG.md: prepend a new ## [X.Y.Z] section above any prior versioned section, listing user-visible changes under the conventional sub-headings (Highlights / Added / Fixed / Changed / Removed). PRs do not modify this file; the maintainer assembles the section from the merged-since-last-release PR descriptions when cutting the release. The release workflow extracts the section matching the tag’s version and embeds it under “What’s changed” in the GitHub release notes; if the section is missing or empty, the workflow aborts before publishing.
4. Commit the version bumps + changelog entry, push, and let studio.yml confirm the matrix is green on the resulting commit.
5. Tag studio-vX.Y.Z and push the tag. studio-release.yml takes over from there.
The release workflow runs four jobs:
- gate: a single-cell sanity test (Py 3.12 × PG 16) that mirrors studio.yml’s lint + pytest steps, run against a checkout of the tagged commit. Cheaper than re-running the full studio.yml matrix at release time, and catches a regressing tag that was not first verified on push.
- build: python -m build produces dist/*.tar.gz (sdist) and dist/*-py3-none-any.whl (wheel) under studio/dist/. The job verifies that the produced filenames carry the version parsed from the tag, so a stale __version__ fails loudly instead of publishing a mismatched wheel.
- publish: uploads the artifacts to PyPI via pypa/gh-action-pypi-publish using OIDC (no API token in repo secrets). The environment: pypi declaration on this job is what binds it to the Pending Publisher configured at pypi.org/manage/account/publishing/: PyPI accepts the upload only when the workflow file (studio-release.yml), the environment (pypi), and the tag pattern (studio-v*) all match the publisher record. The matching pypi GitHub environment also enforces a tag rule (studio-v*) so an unrelated workflow cannot reach the publisher.
- release: extracts the version’s section from studio/CHANGELOG.md, builds the release notes (install command, “What’s changed”, requirements, docs link), and runs gh release create studio-vX.Y.Z dist/* to create the GitHub release with the sdist + wheel attached as assets.
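Extracting one version's section from a changelog that uses ## [X.Y.Z] headings can be done with a short awk filter. The sketch below illustrates the idea only; the command actually used in studio-release.yml is not shown on this page, and the function name changelog_section is hypothetical.

```shell
# Hypothetical helper: print the body of the "## [<version>]" section of a
# changelog file, without the heading line itself.
#   $1 = changelog path, $2 = version (for example 1.1.0)
changelog_section() {
  awk -v ver="$2" '
    # Every versioned heading either opens the wanted section or closes it.
    /^## \[/ { in_section = (index($0, "## [" ver "]") == 1); next }
    in_section { print }
  ' "$1"
}
```

An empty result corresponds to the "missing or empty" case in which the workflow is described as aborting before publishing.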
The studio-only studio-vX.Y.Z tag does not trigger
pgxn.yml (which only fires on extension-style vX.Y.Z tags)
or the Docker image push (gated on extension tags too).
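The two tag families can be told apart with plain shell patterns. The sketch below uses studio-v[0-9]* and v[0-9]* as approximations of the tag filters; the exact filters configured in the workflow files are not reproduced here, and the function name tag_kind is hypothetical.

```shell
# Hypothetical helper: classify a git tag name as a Studio release tag,
# an extension release tag, or something else.
tag_kind() {
  case "$1" in
    studio-v[0-9]*) echo studio ;;
    v[0-9]*)        echo extension ;;
    *)              echo other ;;
  esac
}

tag_kind studio-v1.0.0   # prints studio
tag_kind v1.3.0          # prints extension
```

Because the Studio pattern is tested first and an extension tag must start with v followed by a digit, a studio-vX.Y.Z tag can never be mistaken for an extension tag.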
Website and Deployment
make website builds the public-facing Jekyll site under
website/. It depends on make docs, copies the Sphinx HTML
output and the two Doxygen outputs into the Jekyll source tree
(website/docs/, website/doxygen-sql/html/,
website/doxygen-c/html/), copies shared branding assets, and
runs jekyll build to produce website/_site/.
make deploy builds the website and then rsyncs
website/_site/ to the live server (provsql.org). The rsync
uses --checksum so that files Jekyll rewrote without content
change are not retransferred.
CI Workflows
Eight GitHub Actions workflows are defined:
| Workflow | What it does |
|---|---|
| build_and_test | Builds and tests on Linux with PostgreSQL 10–18 (Docker-based). Also builds and pushes the Docker image on tagged releases. Runs on every push. |
| macos | Builds and tests on macOS with Homebrew PostgreSQL. Runs on every push. |
| wsl | Builds and tests on Windows Subsystem for Linux. Runs on every push. |
| docs | Builds the full documentation (Doxygen + Sphinx + coherence check). Uses the |
| codeql | Static analysis via GitHub CodeQL. Runs on every push. |
| pgxn | Publishes the release to PGXN. Runs only on version tags ( |
| studio | Lint, package smoke, and pytest+Playwright e2e on a Py 3.10/3.11/3.12/3.13 × PG 14/15/16 matrix for ProvSQL Studio. Runs on every push that touches |
| studio-release | Tag-driven PyPI publish for ProvSQL Studio. Builds sdist + wheel, publishes via Trusted Publisher (no API token), and creates a GitHub release with the artifacts attached and notes assembled from |
The five non-release workflows that run on every applicable push
(build_and_test, macos, wsl, docs, codeql, plus
studio for Studio-touching pushes) must pass before merging to
master.