Documentation

Provenance.Having

Algebraic identities behind HAVING (count) provenance #

This file gathers query-free algebraic identities, in an arbitrary commutative m-semiring, that underpin the correctness of the possible-world semantics for HAVING (count op C) predicates.

For a finite ambient set U : Finset ι and a family α : ι → K, we define

The main results are include/exclude-style recurrences for S and F and a bounding lemma A_V ≤ ⊕_{V ⊆ W ⊆ U} T_U(W) in an idempotent m-semiring.

def Having.A {ι K : Type} [CommSemiringWithMonus K] (α : ιK) (W : Finset ι) :
K

Monomial annotation of a subset: A_W = ∏_{x ∈ W} α x, with the convention A_∅ = 𝟙.

Equations
Instances For
    def Having.T {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (U W : Finset ι) :
    K

    T_U(W) = A_W ⊖ ⊕_{x ∈ U \ W} A_{W ∪ {x}}: the “exactly-W” contribution that removes from A_W all one-step extensions of W inside U.

    Equations
    Instances For
      def Having.S {ι K : Type} [CommSemiringWithMonus K] (α : ιK) (U : Finset ι) (C : ) :
      K

      S_C(U) = ⊕_{W ⊆ U, |W| = C} A_W: the JOIN-based provenance for a HAVING count = C aggregate (up to surface-level reindexing).

      Equations
      Instances For
        def Having.F {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (U : Finset ι) (C : ) :
        K

        F_C(U) = ⊕_{W ⊆ U, |W| ≥ C} T_U(W): the possible-world provenance for a HAVING count ≥ C predicate.

        Equations
        Instances For
          theorem Having.T_eq_mul_one_monus_sum {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (h_distrib : mul_sub_left_distributive K) (U W : Finset ι) :
          T α U W = A α W * (1 - xU \ W, α x)

          Alternative form T_U(W) = A_W ⊗ (𝟙 ⊖ ⊕_{x ∈ U \ W} α x). This is the shape in which T_U(W) first arises from the possible-world semantics; the definition of T is the rewritten form obtained via distributivity of over and over . Holds in any commutative m-semiring with mul_sub_left_distributive.

          theorem Having.SC_recurrence {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) {U : Finset ι} {u : ι} (hu : u U) (C : ) :
          S α U (C + 1) = S α (U.erase u) (C + 1) + S α (U.erase u) C * α u

          Include/exclude recurrence for the JOIN-based provenance S: S_{C+1}(U) = S_{C+1}(U \ {u}) ⊕ S_C(U \ {u}) ⊗ α u. The proof partitions (insert u U').powersetCard (C+1) into subsets that do not contain u and images of C-sized subsets of U' under insert u.

          theorem Having.upward_expansion {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (h_idem : idempotent K) (U V : Finset ι) :
          V UA α V WU.powerset with V W, T α U W

          Upward expansion bound: in an idempotent m-semiring, the monomial of any subset V ⊆ U is bounded above by the sum of T_U(W) over all supersets W ⊇ V inside U. The proof is by strong induction on (U \ V).card, using le_plus_monus for the inductive step and the auxiliary sum_le_of_forall_le to collapse multiplicities by idempotence.

          theorem Having.FC_recurrence {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (h_idem : idempotent K) (h_distrib : mul_sub_left_distributive K) {U : Finset ι} {u : ι} (hu : u U) (C : ) :
          F α U (C + 1) = F α (U.erase u) (C + 1) + F α (U.erase u) C * α u

          Include/exclude recurrence for the possible-world provenance F: F_{C+1}(U) = F_{C+1}(U \ {u}) ⊕ F_C(U \ {u}) ⊗ α u, in any idempotent commutative m-semiring with left-distributivity of over . The proof splits the powerset of U by whether u ∈ W, simplifies the u ∈ W part to F_C(U') ⊗ α u, and combines two opposite inequalities using the upward expansion bound upward_expansion.

          Upward-closed family collapse in absorptive m-semirings #

          The provenance of a finite family F of subsets, weighted by A, agrees with the provenance of any subfamily M ⊆ F such that every element of F contains some element of M. When F is upward-closed under inclusion, the canonical such M is the set of minimal elements of F.

          theorem Having.absorbing_subfamily {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (α : ιK) (h_abs : absorptive K) {F M : Finset (Finset ι)} (hM_sub : M F) (hcover : WF, W'M, W' W) :
          WF, A α W = WM, A α W

          Upward-closed family collapse: in an absorptive commutative m-semiring, the A-weighted sum over a finite family F equals the A-weighted sum over any subfamily M ⊆ F such that every element of F is a superset of some element of M. Taking M = the minimal elements of F (when F is upward-closed) is the canonical application: the provenance of an upward-closed family of worlds collapses to the provenance of its minimal worlds.

          F equals S: algebraic skeleton of HAVING count ≥ C #

          The possible-world provenance F_C(U) agrees with the join-based provenance S_C(U) for all C ≥ 1, in any absorptive commutative m-semiring with left-distributivity of over . Proof by induction on U.card driven by the FC_recurrence and SC_recurrence recurrences; the C = 1 base of the recurrence (where F α U' 0 and S α U' 0 appear on the right-hand side) is closed by the auxiliary fact F_zero_eq_one.

          theorem Having.F_zero_eq_one {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (h_idem : idempotent K) (h_abs : absorptive K) (α : ιK) (U : Finset ι) :
          F α U 0 = 1

          In an absorptive idempotent m-semiring, F α U 0 = 𝟙: the unconstrained possible-world provenance collapses to 𝟙. Lower bound from upward_expansion with V = ∅; upper bound from T α U W ≤ A α W and A α W ≤ 𝟙 (the latter via prod_le_one_absorptive).

          theorem Having.F_eq_S {ι : Type} [DecidableEq ι] {K : Type} [CommSemiringWithMonus K] (h_abs : absorptive K) (h_distrib : mul_sub_left_distributive K) (α : ιK) (U : Finset ι) (C : ) :
          F α U (C + 1) = S α U (C + 1)

          Algebraic skeleton of Theorem 1(i) for HAVING count ≥ C: in an absorptive commutative m-semiring with mul_sub_left_distributive, the possible-world provenance F_C(U) equals the join-based provenance S_C(U) for all C ≥ 1. Absorptive is a strictly stronger hypothesis than the bare “idempotent + distributive” combination one might wish for, but it is what makes the C = 1 base of the recurrence-driven induction go through, via F_zero_eq_one. The absorptive hypothesis is essential: Provenance.Semirings.Tropical.TropicalR.F_ne_S exhibits a non-absorptive (but idempotent and distributive) instance – Tropical (WithTop ℝ) – for which the conclusion fails. The idempotent m-semirings in the library with mul_sub_left_distributive that are absorptive (Bool, BoolFunc, IntervalUnion, Tropical (WithTop ℕ), Viterbi, Łukasiewicz) all satisfy the conclusion.