Data Modification Tracking
ProvSQL can track the provenance of data-modification operations –
INSERT, UPDATE, and DELETE – when run on provenance-enabled
tables [Bourhis et al., 2020].
Note
Data modification tracking requires PostgreSQL ≥ 14.
Enabling Update Provenance
Update-provenance tracking is disabled by default. Enable it for a session:
SET provsql.update_provenance = on;
Or permanently in postgresql.conf:
provsql.update_provenance = on
INSERT
When update_provenance is enabled, inserting a row into a
provenance-enabled table creates a new input gate for that row, just
as if the row had been present when provenance was enabled on the table:
INSERT INTO employees(name, dept)
VALUES ('Alice', 'R&D');
-- The new row already has a provenance token
SELECT name, provenance() FROM employees WHERE name = 'Alice';
DELETE
Deleting a row does not remove it from the table, but the provenance is changed to mark the deletion, allowing hypothetical reasoning.
The
update_undo mechanism (see below) relies on this.
DELETE FROM employees WHERE name = 'Alice';
UPDATE
An UPDATE is modelled as a DELETE followed by an INSERT. The
new row gets a fresh provenance token; the old token continues to exist in
the circuit.
UPDATE employees SET dept = 'Sales' WHERE name = 'Bob';
Undoing Updates
ProvSQL provides an undo function that rolls back the
provenance effects of a specific logged modification. Every
provenance-enabled DML statement is recorded in the update_provenance
table; pass its provsql token to undo to reverse its effect:
CREATE TABLE t(id INT PRIMARY KEY);
SELECT add_provenance('t');
INSERT INTO t VALUES (1), (2), (3);
DELETE FROM t WHERE id = 3;
-- Row 3 is gone; undo the DELETE to restore it
SELECT undo(provsql)
FROM update_provenance
WHERE query = 'DELETE FROM t WHERE id = 3;';
Limitations
Update tracking is still experimental, both in terms of operation support and of performance.