Documentation

Provenance.QueryAggregation

Annotated semantics of the aggregation operator #

This file gives a first-cut formalisation of the aggregation operator Query.Agg on K-annotated relations, following Section IV-B (Definition 7) of Sen, Maniu & Senellart, ProvSQL and Amsterdamer, Deutch & Tannen, Provenance for aggregate queries.

Scope of this first cut:

For each group g (determined by the grouping indices is), the evaluator produces one output row whose data part is Fin.append g v (the grouping key plus the aggregated values v) and whose annotation is, for each aggregated column k, a pair (value, tensor) where:

The bilinear quotient identifying K ⊗ T with Mathlib's TensorProduct is not enforced in KTensor; the un-quotiented representation is what the evaluator builds. Correctness theorems (commutation with random worlds, with rewriting, etc.) will require either the quotient or careful handling of bilinear equivalence; that work is deferred.

References #

def Query.aggSumRow {T : Type} [ValueType T] {K : Type} {m n₁ n₂ : } (is : Tuple (Fin m) n₁) (ts : Tuple (Term T m) n₂) (r : Multiset (Tuple T m × K)) (g : Tuple T n₁) :
Tuple T n₁ × Tuple (T × KTensor K T) n₂

The aggregator body for a fixed group key g: filter the inner relation to tuples matching the grouping condition, then compute the per-aggregated- column (value, K-tensor) pair. Factored out of evaluateAggSum to allow proofs (notably hom commutation) to reason about the inner body separately from the outer dedup ∘ map structure.

Equations
  • One or more equations did not get rendered due to their size.
Instances For
    def Query.evaluateAggSum {T : Type} [ValueType T] {K : Type} [CommSemiringWithMonus K] [DecidableEq K] {m n₁ n₂ : } (is : Tuple (Fin m) n₁) (ts : Tuple (Term T m) n₂) (q : Query T m) (hq : q.noAgg) (d : AnnotatedDatabase T K) :
    Multiset (Tuple T n₁ × Tuple (T × KTensor K T) n₂)

    Group-by + sum aggregation on a K-annotated relation.

    Given a non-aggregating inner query q on an annotated database, an indexing tuple is selecting the grouping columns, and a tuple of aggregation terms ts, returns one row per distinct group key. Each row carries:

    • the group key (a Tuple T n₁);
    • for each aggregated column k: the aggregated value AggFunc.sum.eval (m.map (ts k).eval) and the K-tensor annotation ∑_{(u, α) ∈ m} α ⊗ (ts k).eval u, where m is the multiset of annotated tuples in the group.

    This corresponds to Query.Agg is ts ![AggFunc.sum, …, AggFunc.sum] q evaluated on the annotated database d, with the annotation in the free formal-sum representation KTensor K T.

    Equations
    • One or more equations did not get rendered due to their size.
    Instances For