Provenance in databases #
This Lean4 library aims at providing formal definitions and proofs relevant for provenance in databases.
This is work in progress. For now:
Provenance.SemiringWithMonuscontains the definition of a semiring with monus (or m-semiring), along with some classical and useful theorems- We include proofs that some common provenance m-semirings are indeed m-semirings:
Provenance.Semirings.Bool: the Boolean m-semiringProvenance.Semirings.BoolFunc: the Bool[X] m-semiring of Boolean functions over a set X of Boolean variablesProvenance.Semirings.How: the ℕ[X] m-semiring of multivariate polynomials with natural integer coefficients, sometimes called the How[X] m-semiring; it is the m-semiring extension of the universal provenance semiringProvenance.Semirings.Lukasiewicz: the Łukasiewicz semiringProvenance.Semirings.MinMax: the min-max semiring over any bounded linear order, such as the security semiring or (the dual of) the fuzzy semiringProvenance.Semirings.Nat: the counting m-semiringProvenance.Semirings.Tropical: the tropical m-semiring (for any linearly ordered commutative monoid with an additively absorbing ⊤ element, e.g., natural integers or reals with ∞ as ⊤)Provenance.Semirings.Viterbi: the Viterbi m-semiringProvenance.Semirings.Which: the Which[X] m-semiring (also called lineage or Lin[X])Provenance.Semirings.Why: the Why[X] m-semiring
Provenance.Databasedefines tuples, relations, and (regular) databasesProvenance.Querydefines the relational algebraProvenance.AnnotatedDatabasedefines annotated databasesProvenance.QueryAnnotatedDatabasedefines the semantics of relational algebra queries over annotated databases through m-semiringsProvenance.QueryRewritingdefines an approach to query evaluation on annotated relations through query rewriting; a proof (partially written at this point) that this rewriting is correct is provided
See Provenance.Example for an example computation.