Build System
This page describes how ProvSQL is built, how PostgreSQL version compatibility is handled, and how CI is configured.
Makefile Structure
ProvSQL uses two Makefiles:
Makefile (top-level) – user-facing targets: make, make test, make docs, make website, make deploy. This file delegates to Makefile.internal for the actual build.
Makefile.internal – the real build file, based on PostgreSQL’s PGXS (PostgreSQL Extension Building Infrastructure). It includes $(PG_CONFIG) --pgxs to inherit compiler flags, install paths, and the pg_regress test runner.
Key variables in Makefile.internal:
MODULE_big = provsql – the shared library name.
OBJS – the list of object files to compile (both C and C++).
DATA – SQL files installed to the extension directory.
REGRESS – test names for pg_regress.
PG_CPPFLAGS – extra compiler flags (C++17, Boost headers).
LINKER_FLAGS – -lstdc++ -lboost_serialization.
LLVM JIT is explicitly disabled (with_llvm = no) due to known
PostgreSQL bugs with C++ extensions.
PostgreSQL Version Compatibility
ProvSQL supports PostgreSQL 10 through 18. Version-specific C code
uses the PG_VERSION_NUM macro (from PostgreSQL’s own pg_config.h):
#if PG_VERSION_NUM >= 140000
// PostgreSQL 14+ specific code
#endif
compatibility.c / compatibility.h provide shim
functions for APIs that changed across PostgreSQL versions (e.g.,
list_insert_nth, list_delete_cell).
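For reference, since PostgreSQL 10 the PG_VERSION_NUM value is the major version times 10000 plus the minor version, which is why 140000 is the threshold for PostgreSQL 14. The shell sketch below only reproduces that arithmetic; the helper name pg_version_num is hypothetical and not part of ProvSQL.

```shell
# Hypothetical helper (not part of ProvSQL): convert a "MAJOR.MINOR" version
# string into the integer encoding used by PG_VERSION_NUM on PostgreSQL 10+.
pg_version_num() {
    major=${1%%.*}                 # text before the first dot
    case $1 in
        *.*) minor=${1#*.} ;;      # text after the first dot
        *)   minor=0 ;;            # no minor component given
    esac
    echo $((major * 10000 + minor))
}

pg_version_num 14.0    # prints 140000
pg_version_num 17.2    # prints 170002
```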
Generated SQL
The SQL layer is assembled from two hand-edited sources:
sql/provsql.common.sql – functions for all PostgreSQL versions.
sql/provsql.14.sql – functions requiring PostgreSQL 14+.
The Makefile concatenates them into an intermediate
provsql.sql based on the major version reported by
pg_config:
sql/provsql.sql: sql/provsql.*.sql
cat sql/provsql.common.sql > sql/provsql.sql
if [ $(PGVER_MAJOR) -ge 14 ]; then \
cat sql/provsql.14.sql >> sql/provsql.sql; \
fi
provsql.sql is then copied to
sql/provsql--<version>.sql. The two files have identical content,
but the version-suffixed name is the one PostgreSQL’s extension
machinery expects: CREATE EXTENSION provsql looks up
provsql--<version>.sql in the extensions directory. The
unsuffixed provsql.sql also serves as the Doxygen input
file for the SQL API reference.
Do not edit either generated file directly. Edit the source SQL
files (provsql.common.sql / provsql.14.sql) instead.
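The concatenation step can be tried outside of make with a small shell sketch. The function names pg_major and assemble_sql are hypothetical, and the way the Makefile actually derives PGVER_MAJOR is not shown in this page, so the version parsing here is an assumption.

```shell
# Hypothetical helpers mirroring the Makefile rule above; names are illustrative.

# Extract the major version from `pg_config --version` output,
# e.g. "PostgreSQL 14.11" or "PostgreSQL 18beta1".
pg_major() {
    v=${1#PostgreSQL }      # drop the leading product name
    echo "${v%%[!0-9]*}"    # keep the leading run of digits
}

# Concatenate the hand-edited sources in directory $1 into $1/provsql.sql
# for the PostgreSQL major version $2.
assemble_sql() {
    cat "$1/provsql.common.sql" > "$1/provsql.sql"
    if [ "$2" -ge 14 ]; then
        cat "$1/provsql.14.sql" >> "$1/provsql.sql"
    fi
}

pg_major "PostgreSQL 14.11"   # prints 14
```

A typical call would be assemble_sql sql "$(pg_major "$(pg_config --version)")", run from the top of the source tree.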
Extension Upgrades
ProvSQL supports in-place upgrades between released versions via PostgreSQL’s standard mechanism:
ALTER EXTENSION provsql UPDATE;
When this statement is issued, PostgreSQL looks for a chain of
upgrade scripts named provsql--<from>--<to>.sql in the
extensions directory and applies them in order. ProvSQL’s upgrade
scripts live under sql/upgrades/ in the source tree and are
installed alongside the main provsql--<version>.sql file by the
DATA variable in Makefile.internal:
UPGRADE_SCRIPTS = $(wildcard sql/upgrades/$(EXTENSION)--*--*.sql)
DATA = sql/$(EXTENSION)--$(EXTVERSION).sql $(UPGRADE_SCRIPTS)
Upgrade support was introduced in 1.2.1: the committed upgrade scripts
form a chain 1.0.0 → 1.1.0 → 1.2.0 → 1.2.1, so users on any earlier
version in that chain can run a single
ALTER EXTENSION provsql UPDATE to reach 1.2.1.
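To see which versions are reachable from a given starting point, the installed script names can be followed one hop at a time. The sketch below is a simplification: PostgreSQL itself searches for the shortest update path, whereas this hypothetical walk_chain helper assumes a strictly linear chain and follows the first matching file.

```shell
# Hypothetical helper (not part of ProvSQL): print the linear upgrade chain
# found in directory $1, starting from version $2.
walk_chain() {
    dir=$1
    cur=$2
    printf '%s' "$cur"
    while :; do
        next=
        for f in "$dir"/provsql--"$cur"--*.sql; do
            [ -e "$f" ] || continue          # glob matched nothing
            base=${f##*/}                    # e.g. provsql--1.0.0--1.1.0.sql
            next=${base#provsql--"$cur"--}   # strip the "from" part
            next=${next%.sql}                # strip the extension
            break
        done
        [ -n "$next" ] || break              # no further hop: chain ends here
        printf ' -> %s' "$next"
        cur=$next
    done
    printf '\n'
}
```

For an installed build, the directory would typically be "$(pg_config --sharedir)/extension".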
Writing an Upgrade Script
Each upgrade script is a hand-written delta file that re-runs the
SQL changes made during one release cycle. Since every release so
far has consisted of in-place CREATE OR REPLACE FUNCTION rewrites
or purely additive CREATE FUNCTION / CREATE CAST statements,
writing an upgrade script is usually a matter of copy-pasting the
modified function bodies from sql/provsql.common.sql (or
sql/provsql.14.sql) into a new file under sql/upgrades/:
/**
* @file
* @brief ProvSQL upgrade script: A.B.C → X.Y.Z
*
* <one-paragraph summary of SQL-surface changes>
*/
SET search_path TO provsql;
CREATE OR REPLACE FUNCTION ... ;
-- or CREATE FUNCTION, CREATE CAST, etc.
The script runs inside an implicit transaction (the whole
ALTER EXTENSION UPDATE is one transaction), so any error rolls
everything back. The script is executed once when transitioning
from A.B.C to X.Y.Z; it does not have to be idempotent on its own.
Non-idempotent statements such as CREATE SCHEMA, CREATE TYPE
(without IF NOT EXISTS), CREATE TABLE, CREATE OPERATOR,
CREATE CAST, and CREATE AGGREGATE are allowed only if the
upgrade is the first to introduce the corresponding object –
PostgreSQL will not re-run the script against an already-upgraded
installation.
If a release genuinely introduces no SQL-surface change, the upgrade
script still has to exist so that PostgreSQL can offer the update
path, but it may be a no-op (just the header comment and a
SET search_path). The release.sh script (see below)
auto-generates such a no-op file when it detects no SQL diff since
the previous tag.
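A no-op upgrade script of this shape can be produced with a few lines of shell. This is only a sketch of the idea: the function name write_noop_upgrade and the summary sentence are assumptions, and release.sh may generate its file differently.

```shell
# Hypothetical sketch; release.sh's real implementation may differ.
# Usage: write_noop_upgrade <prev-version> <new-version> <directory>
# Writes a no-op upgrade script and prints its path.
write_noop_upgrade() {
    out="$3/provsql--$1--$2.sql"
    cat > "$out" <<EOF
/**
 * @file
 * @brief ProvSQL upgrade script: $1 → $2
 *
 * No SQL-surface changes in this release.
 */
SET search_path TO provsql;
EOF
    echo "$out"
}
```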
The On-Disk mmap ABI
In-place upgrades only cover the SQL catalog state. ProvSQL’s
persistent circuit lives in four memory-mapped files
(provsql_gates.mmap, provsql_wires.mmap,
provsql_mapping.mmap, provsql_extra.mmap) that are pure
plain-old-data dumps of C and C++ structs – no format version
header, no magic bytes, no schema.
An upgrade therefore requires that the binary layout of
GateInformation (in MMappedCircuit.h), the
gate_type enum (in provsql_utils.h), and the
MMappedUUIDHashTable slot structure are all byte-compatible
between the two versions. Since 1.0.0 these layouts have been
deliberately frozen: zero commits to any of them. The block
comments at the top of those files explicitly call this out.
If a future contribution has to touch the on-disk layout, the upgrade story has to change. Two reasonable options:
Bump the on-disk format version. Add a small format-version header to each of the four *.mmap files, write a migration pass that rewrites the old format to the new one at worker startup, and gate the new pass on detecting an old header. The upgrade script for that release also needs to mention the one-time migration cost.
Break upgrades for that release. Document it in the release notes and ship a hard failure at worker startup if the existing *.mmap files look like an old format – safer than silently misreading the bytes. Users then go through DROP EXTENSION provsql CASCADE; CREATE EXTENSION provsql, losing only the circuit data (their base tables are unaffected).
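Either option relies on being able to tell a headered file from a headerless one. The following shell sketch illustrates such a check from the outside; the magic string PROVSQL2 and both helper names are invented for illustration, since the current files carry no header at all, and a real check would live in the worker's C++ startup code.

```shell
# Hypothetical sketch: the magic string and helper names are invented for
# illustration; ProvSQL's current *.mmap files carry no such header.
MMAP_MAGIC=PROVSQL2

# Succeeds if file $1 starts with the 8-byte magic string.
has_format_header() {
    head8=$(dd if="$1" bs=8 count=1 2>/dev/null | od -An -c | tr -d ' \n')
    [ "$head8" = "$MMAP_MAGIC" ]
}

# Refuse to continue when a file looks like the old, headerless format.
check_mmap_file() {
    if ! has_format_header "$1"; then
        echo "error: $1 has no format header (old layout?)" >&2
        return 1
    fi
}
```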
Workflow During Development
Release-engineering for upgrades follows this rhythm:
When a contributor adds or modifies SQL in provsql.common.sql or provsql.14.sql, they do not have to write a committed upgrade script themselves. The maintainer who cuts the next release is responsible for writing it. During the dev cycle, the Makefile auto-generates an empty dev-cycle upgrade script (see below) so that ALTER EXTENSION provsql UPDATE is structurally reachable to default_version; that file is purely a placeholder and does not replay the SQL deltas introduced during the cycle.
When cutting a release, release.sh checks for sql/upgrades/provsql--<prev>--<new>.sql and refuses to proceed if it is missing unless git diff shows no SQL source changes since the previous tag (in which case it auto-generates a no-op script and commits it).
When touching any mmap-serialised struct or enum, the contributor must either preserve binary compatibility (e.g., by appending a new gate_type enumerator at the end, never in the middle) or coordinate a format-version bump with the maintainer – see the warning block at the top of provsql_utils.h and MMappedCircuit.h.
Automated Testing of the Upgrade Path
The upgrade chain is exercised end-to-end by a pg_regress test,
test/sql/extension_upgrade.sql. Because the test is
destructive (it DROPs the extension CASCADE to replay
the upgrade from a clean 1.0.0 state), it must run strictly
after every other test in the suite, including the
schedule.14-only tests that follow schedule.common on
PostgreSQL 14+. To guarantee that ordering regardless of which
source schedule files are active, the Makefile.internal
rule that assembles test/schedule appends a single
test: extension_upgrade line after concatenating
schedule.common and (where applicable) schedule.14:
test/schedule: $(wildcard test/schedule.*)
cat test/schedule.common > test/schedule
if [ $(PGVER_MAJOR) -ge 14 ]; then \
cat test/schedule.14 >> test/schedule; \
fi
echo "test: extension_upgrade" >> test/schedule
so the final schedule always ends with extension_upgrade.
For that reason the test is not listed in either the schedule.common
or the schedule.14 source file.
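The ordering guarantee can be checked on a generated schedule with a short shell function. This is a hypothetical sanity check, not something the ProvSQL test suite is known to include.

```shell
# Hypothetical check, not part of the ProvSQL test suite.
# Succeeds if schedule file $1 mentions extension_upgrade exactly once,
# and on its last line.
schedule_ends_with_upgrade() {
    [ "$(tail -n 1 "$1")" = "test: extension_upgrade" ] &&
    [ "$(grep -c 'extension_upgrade' "$1")" -eq 1 ]
}
```

After a build, it would be called as schedule_ends_with_upgrade test/schedule.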
The test itself:
Drops the current provsql extension (CASCADE destroys all provenance-tracked state from preceding tests).
Installs the oldest supported version via CREATE EXTENSION provsql VERSION '1.0.0'. PostgreSQL picks up provsql--1.0.0.sql from the extensions directory – a frozen install-script fixture generated at build time from sql/fixtures/provsql--1.0.0-common.sql and sql/fixtures/provsql--1.0.0-14.sql (themselves exact copies of the historical 1.0.0 source files, extracted via git show v1.0.0:sql/...).
Runs ALTER EXTENSION provsql UPDATE (no TO clause), so PostgreSQL advances the extension all the way to whatever the current default_version is – walking the full chain of committed upgrade scripts under sql/upgrades/ and, on a development build, the auto-generated empty dev-cycle upgrade script described below.
Asserts that the post-upgrade extversion equals default_version (read from pg_available_extensions) via a boolean comparison, so the expected output never contains a hard-coded version string and the test stays correct as master advances.
Runs a tiny smoke query against the upgraded extension to confirm that the query rewriter, add_provenance, create_provenance_mapping, and a compiled semiring evaluator still work.
The test runs on every PostgreSQL version in the CI matrix
(10 through 18), because it lives inside the standard pg_regress
suite. It catches regressions in: committed upgrade scripts,
the DATA wildcard in Makefile.internal that ships them,
and the binary stability of the mmap format across versions.
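The version assertion can also be run by hand against an upgraded database. The query below is a sketch of a boolean comparison between pg_extension and pg_available_extensions; the function name and the column alias are assumptions, and the actual statement in test/sql/extension_upgrade.sql may be written differently.

```shell
# Hypothetical sketch of the version assertion; the actual query in
# test/sql/extension_upgrade.sql may be written differently.
upgrade_check_sql() {
    cat <<'SQL'
SELECT e.extversion = a.default_version AS upgraded_to_default
FROM pg_extension e
JOIN pg_available_extensions a ON a.name = e.extname
WHERE e.extname = 'provsql';
SQL
}

# Usage (requires a running server and a database with ProvSQL installed):
#   psql -d upgrade_test -At -c "$(upgrade_check_sql)"
```

Because the result is a boolean rather than a version string, the expected output stays the same from one release to the next.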
The Auto-Generated Dev-Cycle Upgrade Script
Between releases, HEAD’s default_version is a dev version
(e.g., 1.3.0-dev) for which no committed upgrade script
exists. Rather than maintaining a hand-written dev-cycle script
on master, Makefile.internal detects dev versions and
generates an empty upgrade script at build time:
ifneq ($(findstring -dev,$(EXTVERSION)),)
LATEST_RELEASE = $(shell git describe --tags --abbrev=0 --match 'v[0-9]*' \
2>/dev/null | sed 's/^v//')
ifneq ($(LATEST_RELEASE),)
DEV_UPGRADE = sql/$(EXTENSION)--$(LATEST_RELEASE)--$(EXTVERSION).sql
endif
endif
The file is created by a one-line touch $@ recipe, included in
DATA so make install ships it, and matched by the existing
sql/provsql--*.sql gitignore pattern so it never lands in git.
Its sole purpose is to give ALTER EXTENSION provsql UPDATE a
reachable path from the last release to default_version during
the dev cycle.
Actual SQL changes made during the dev cycle are not captured in
this auto-generated file; they are captured by the hand-written
upgrade script that release.sh creates (or auto-generates as a
no-op) at release time. An upgrade from the previous release
directly to an intermediate dev commit therefore works at the
ALTER EXTENSION mechanism level but does not replay the
SQL deltas introduced during the dev cycle – master users who
need a functionally-complete upgrade should wait for the release
tag and use the committed upgrade script.
On release builds (where EXTVERSION does not end in -dev)
or on dev tarballs without a reachable git tag, DEV_UPGRADE
expands to the empty string and the Makefile falls back to the
committed upgrade scripts only.
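The decision logic of the Makefile fragment above can be restated in shell to make the three cases explicit. The function name dev_upgrade_path is hypothetical; the latest release is passed in as an argument instead of being read from git describe.

```shell
# Hypothetical shell rendering of the Makefile logic above.
# Usage: dev_upgrade_path <extension> <extversion> <latest-release>
# Prints the dev-cycle upgrade script path, or nothing when none applies.
dev_upgrade_path() {
    case $2 in
        *-dev*) ;;            # dev version: keep going
        *) return 0 ;;        # release build: no dev-cycle script
    esac
    [ -n "$3" ] || return 0   # no reachable release tag
    echo "sql/$1--$3--$2.sql"
}

dev_upgrade_path provsql 1.3.0-dev 1.2.1   # prints sql/provsql--1.2.1--1.3.0-dev.sql
```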
Manual Testing
To exercise the same upgrade path interactively:
# Install the current build (which now ships the 1.0.0 fixture
# and all upgrade scripts alongside the current install script).
# Run as a user with write access to the PostgreSQL directories
make && make install
systemctl restart postgresql
# Install the extension at the old version and populate state.
psql <<'SQL'
CREATE DATABASE upgrade_test;
\c upgrade_test
CREATE EXTENSION provsql VERSION '1.0.0';
SELECT extversion FROM pg_extension WHERE extname='provsql';
-- ... exercise the extension, populate some provenance state ...
SQL
# Apply the upgrade chain.
psql -d upgrade_test -c "ALTER EXTENSION provsql UPDATE TO '1.2.1';"
psql -d upgrade_test -c "SELECT extversion FROM pg_extension WHERE extname='provsql';"
# ... verify the extension still works; mmap state is intact ...
The tdkc Tool
make tdkc builds the standalone Tree-Decomposition Knowledge
Compiler. It links the C++ circuit and tree-decomposition code into
an independent binary (no PostgreSQL dependency) that reads a circuit
from a text file and outputs its probability.
Documentation Build
make docs builds the full documentation:
Generates provsql.sql from the SQL source files.
Runs Doxygen twice (Doxyfile-c for C/C++, Doxyfile-sql for SQL).
Post-processes the SQL Doxygen HTML (C-to-SQL terminology).
Runs Sphinx to build the user guide and developer guide.
Runs the coherence checker (check-doc-links.py) to validate all sqlfunc and cfunc cross-references.
Releases
release.sh <version> (e.g. ./release.sh 1.2.0) automates
creating a new release. It:
Checks that gh, gpg, and a configured GPG signing key are available, and that the version string is newer than any existing vX.Y.Z tag.
Opens $EDITOR to collect release notes (a pre-filled template; leaving it unchanged aborts).
Checks that sql/upgrades/provsql--<prev>--<new>.sql exists (auto-generating a no-op script if the SQL sources have not changed since the previous tag, aborting otherwise).
Updates default_version in provsql.common.control, version: and date-released: in CITATION.cff, the top-level version and the provides.provsql.version in META.json (the PGXN Meta Spec file), and prepends a new entry to both website/_data/releases.yml and CHANGELOG.md (the repo-root changelog mirrors the website release-notes block, with the first ## What's new in <version> heading stripped to avoid duplicating the release heading).
Commits the bumped files, creates a GPG-signed annotated tag vX.Y.Z, and offers to push the commit and tag.
Offers to create a GitHub Release via gh release create using the collected notes.
Offers a post-release bump of default_version on master to the next X.(Y+1).0-dev (or a user-provided NEXT_VERSION).
The signed tag push is what the Linux CI workflow keys on to build
and push the inriavalda/provsql Docker image for the new version.
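Two of the version computations in the steps above, the "newer than any existing tag" check and the post-release dev bump, are easy to sketch in portable shell. Both function names are hypothetical, both assume plain X.Y.Z version strings, and release.sh may implement these steps differently.

```shell
# Hypothetical helpers; release.sh's real implementation may differ.

# Succeeds if X.Y.Z version $1 is strictly newer than X.Y.Z version $2.
version_gt() {
    a=$1
    b=$2
    for i in 1 2 3; do
        x=${a%%.*}                      # leading component of each version
        y=${b%%.*}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
        a=${a#*.}                       # drop the compared component
        b=${b#*.}
    done
    return 1                            # versions are equal
}

# Print the default next development version, X.(Y+1).0-dev.
next_dev_version() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    echo "$major.$((minor + 1)).0-dev"
}

next_dev_version 1.2.1   # prints 1.3.0-dev
```

The comparison is numeric per component, so 1.10.0 is correctly treated as newer than 1.9.9, which a plain string comparison would get wrong.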
Website and Deployment
make website builds the public-facing Jekyll site under
website/. It depends on make docs, copies the Sphinx HTML
output and the two Doxygen outputs into the Jekyll source tree
(website/docs/, website/doxygen-sql/html/,
website/doxygen-c/html/), copies shared branding assets, and
runs jekyll build to produce website/_site/.
make deploy builds the website and then rsyncs
website/_site/ to the live server (provsql.org). The rsync
uses --checksum so that files Jekyll rewrote without content
change are not retransferred.
CI Workflows
Five GitHub Actions workflows run on every push:
| Workflow | What it does |
|---|---|
| | Builds and tests on Linux with PostgreSQL 10–18 (Docker-based). Also builds and pushes the Docker image on tagged releases. |
| | Builds and tests on macOS with Homebrew PostgreSQL. |
| | Builds and tests on Windows Subsystem for Linux. |
| | Builds the full documentation (Doxygen + Sphinx + coherence check). Uses the … |
| | Static analysis via GitHub CodeQL. |
All five must pass before merging to master.