Documentation

Provenance.QueryRewriting

Query evaluation by rewriting #

This file provides an alternative approach to evaluating queries on annotated databases: instead of directly interpreting operators over annotated tuples, a query on T is rewritten into a query on T ⊕ K that operates on plain tuples whose values encode both data and provenance.

The rewriting implemented here realises rules (R1)–(R5) from Sen, Maniu & Senellart, ProvSQL: A General System for Keeping Track of the Provenance and Probability of Data.

A correctness proof that Query.rewriting agrees with Query.evaluateAnnotated is fully formalised for rules (R1)–(R4): each operator is machine-checked end-to-end. The Diff case splits into an unmatched_eq half (proved via the semijoin identity Multiset.semijoin_proj_eq_filter, after bridging the LinearOrder.toDecidableEq vs instDecidableEqSum mismatch on the inner dedup via Query.rewriting_valid_diff_inner_dd_inst) and a matched_eq half (proved via the keyed-projection semijoin Multiset.semijoin_keyed_proj_eq_filter, after substituting the inner aggregation with the closed-form Query.evaluate_agg_rewriting_eq). The (R5) aggregation correctness lives in Provenance.QueryEvaluateInVK as Query.rewriting_valid_full, with its V_K interpretation; the syntactic (R5) rewriting itself is in this file as Query.rewritingAgg.

References #

def Query.rewriting {T : Type} {n : } {K : Type} [ValueType T] (q : Query T n) (hq : q.noAgg) :
Query (T K) (n + 1)
Equations
Instances For
    theorem Query.rewriting_valid_prod_heqn {n₁ n₂ n : } (hn : n₁ + n₂ = n) :
    n₁ + 1 + (n₂ + 1) = n + 2
    theorem Query.rewriting_valid_prod0 {K T : Type} [Mul K] {n₁ n₂ n : } (hn : n₁ + n₂ = n) (heq : (Fin (n₁ + n₂)T) = (Fin nT)) (ar₁ : AnnotatedRelation T K n₁) (ar₂ : AnnotatedRelation T K n₂) :
    AnnotatedRelation.toComposite (Multiset.map (fun (x : AnnotatedTuple T K n₁ × AnnotatedTuple T K n₂) => (cast heq (Fin.append x.1.1 x.2.1), x.1.2 * x.2.2)) (Multiset.product ar₁ ar₂)) = Relation.cast (AnnotatedRelation.toComposite (Multiset.map (fun (x : AnnotatedTuple T K n₁ × AnnotatedTuple T K n₂) => (Fin.append x.1.1 x.2.1, x.1.2 * x.2.2)) (Multiset.product ar₁ ar₂)))
    theorem cast_apply {T : Type} {n : } {α : Sort u_1} {m : } (f : Tuple T nα) (t : Tuple T m) (hn : n = m) :
    cast f t = f (Tuple.cast t)
    theorem Query.rewriting_valid_prod1 {T K : Type} {n₂ n₁ n : } [ValueType (T K)] (hn : n₁ + 1 + (n₂ + 1) = n + 2) (f : Tuple (T K) (n + 2)Tuple (T K) (n + 1)) (r : Relation (T K) (n₁ + 1 + (n₂ + 1))) :
    Multiset.map f (Relation.cast hn r) = Multiset.map (fun (t : Tuple (T K) (n₁ + 1 + (n₂ + 1))) => f (Tuple.cast hn t)) r
    theorem Query.rewriting_append_left {T : Type} {n₁ n₂ n : } (t₁ : Tuple T n₁) (t₂ : Tuple T n₂) (hn : n₁ + n₂ = n) (k : Fin n) (hk : k < n₁) :
    Eq.rec (motive := fun (x : ) (h : n₁ + n₂ = x) => Fin xT) (Fin.append t₁ t₂) hn k = t₁ (k.castLT hk)
    theorem Query.rewriting_append_right {T : Type} {n₁ n₂ n : } (t₁ : Tuple T n₁) (t₂ : Tuple T n₂) (hn : n₁ + n₂ = n) (k : Fin n) (hk : ¬k < n₁) :
    Eq.rec (motive := fun (x : ) (h : n₁ + n₂ = x) => Fin xT) (Fin.append t₁ t₂) hn k = t₂ k - n₁,
    theorem Query.tupleCast_append_left {T : Type} {n₁ n₂ n : } (t₁ : Tuple T n₁) (t₂ : Tuple T n₂) (hn : n₁ + n₂ = n) (k : Fin n) (hk : k < n₁) :
    Tuple.cast hn (Fin.append t₁ t₂) k = t₁ (k.castLT hk)

    Tuple.cast-flavoured variant of rewriting_append_left. Both Tuple.cast's and 's Eq.rec motives must syntactically agree for rw to fire on Lean v4.29; this version matches the motive produced by Tuple.cast.

    theorem Query.tupleCast_append_right {T : Type} {n₁ n₂ n : } (t₁ : Tuple T n₁) (t₂ : Tuple T n₂) (hn : n₁ + n₂ = n) (k : Fin n) (hk : ¬k < n₁) :
    Tuple.cast hn (Fin.append t₁ t₂) k = t₂ k - n₁,

    Tuple.cast-flavoured variant of rewriting_append_right.

    Helper lemmas for the Dedup case of rewriting_valid #

    theorem Multiset.fold_addFn_map_inr {T K : Type} [ValueType T] [SemiringWithMonus K] [HasAltLinearOrder K] (m : Multiset K) :
    fold addFn 0 (map (fun (k : K) => Sum.inr k) m) = Sum.inr m.sum

    Folding addFn over a multiset of Sum.inr k values in T⊕K reduces to the Multiset.sum in K, wrapped in Sum.inr.

    theorem AnnotatedRelation.toComposite_filter_map_last {T K : Type} [ValueType T] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) (v : Tuple T n) :
    Multiset.map (fun (u : Tuple (T K) (n + 1)) => u (Fin.last n)) (Multiset.filter (fun (u : Tuple (T K) (n + 1)) => ∀ (k' : Fin n), u (Fin.castLE k') = Sum.inl (v k')) ar.toComposite) = Multiset.map (fun (p : AnnotatedTuple T K n) => Sum.inr p.2) (Multiset.filter (fun (p : AnnotatedTuple T K n) => p.1 = v) ar)

    Filtering ar.toComposite by "first-n columns match Sum.inl ∘ v" and projecting to the last column yields the Sum.inr-wrapped annotations of the matching entries of ar.

    theorem AnnotatedRelation.dedup_toComposite_proj_first_n {T K : Type} [ValueType T] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) (h : n n + 1) :
    (Multiset.map (fun (u : Tuple (T K) (n + 1)) (k : Fin n) => u (Fin.castLE h k)) ar.toComposite).dedup = Multiset.map (fun (v : Tuple T n) (k : Fin n) => Sum.inl (v k)) (Multiset.map Prod.fst ar).dedup

    The dedup of the first-n projection of ar.toComposite is the Sum.inl-image of the dedup of the first-projection of ar.

    theorem groupByKey_key_iff {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) (v : Tuple T n) :
    (∃ (w : K), (v, w) (groupByKey ar)) v Multiset.map Prod.fst ar

    Auxiliary: key set of groupByKey ar equals first-projection keys of ar.

    theorem groupByKey_value {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) (v : Tuple T n) (w : K) :
    (v, w) (groupByKey ar)w = (Multiset.map Prod.snd (Multiset.filter (fun (p : AnnotatedTuple T K n) => p.1 = v) ar)).sum

    Auxiliary: if (v, w) ∈ (groupByKey ar).val, then w is the semiring-sum of annotations of entries in ar with key v.

    theorem groupByKey_multiset_eq {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) :
    (groupByKey ar) = Multiset.map (fun (v : Tuple T n) => (v, (Multiset.map Prod.snd (Multiset.filter (fun (p : AnnotatedTuple T K n) => p.1 = v) ar)).sum)) (Multiset.map Prod.fst ar).dedup

    groupByKey ar, as a multiset, equals the dedup of the first-projection of ar, with each key paired with the semiring-sum of annotations sharing that key.

    Helper lemmas for the Diff case of rewriting_valid #

    theorem Multiset.dedup_eq_of_instances {α : Type u_1} (inst₁ inst₂ : DecidableEq α) (m : Multiset α) :

    Multiset.dedup only depends on the DecidableEq instance up to subsingleton equality.

    theorem Multiset.filter_eq_of_instances {α : Type u_1} (p : αProp) (inst₁ inst₂ : DecidablePred p) (m : Multiset α) :
    filter p m = filter p m

    Multiset.filter only depends on the DecidablePred instance up to subsingleton equality.

    theorem Filter.eval_foldr_and_map {T : Type} [ValueType T] {N : } {α : Type u_1} (list : List α) (f : αFilter T N) (t : Tuple T N) :
    (List.foldr (fun (t t' : Filter T N) => t.And t') True (List.map f list)).eval t xlist, (f x).eval t

    Folded Filter.And over a mapped list is equivalent to the universal conjunction.

    theorem Query.rewriting_valid_joinCond_eval {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] [HasAltLinearOrder K] {N n : } [NeZero N] (t : Tuple (T K) N) :
    (List.foldr (fun (t t' : Filter (T K) N) => t.And t') Filter.True (List.map (fun (k : ) => Filter.BT ((fun (x y : Term (T K) N) => BoolTerm.EQ x y) #(Fin.ofNat N k) #(Fin.ofNat N (k + n + 1)))) (List.range n))).eval t ∀ (k : Fin n), t (Fin.ofNat N k) = t (Fin.ofNat N (k + n + 1))

    The folded join condition (#k == #(k+n+1)) for k ∈ List.range n evaluates true iff the tuple's values at indices ofNat k and ofNat (k+n+1) agree for every k < n.

    theorem Query.rewriting_valid_find_getD_eq_sum {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] {n : } (ar : AnnotatedRelation T K n) (u : Tuple T n) :
    (Option.map Prod.snd (List.find? (fun (x : Tuple T n × K) => decide (x.1 = u)) (groupByKey ar))).getD 0 = (Multiset.map Prod.snd (Multiset.filter (fun (p : AnnotatedTuple T K n) => p.1 = u) ar)).sum

    Semiring-sum over the filter, via groupByKey.find?-based lookup.

    Subtraction distributes over Sum.inr in T⊕K.

    theorem AnnotatedRelation.toComposite_proj_first_n {T K : Type} [ValueType T] {n : } (ar : AnnotatedRelation T K n) (h : n n + 1) :
    Multiset.map (fun (u : Tuple (T K) (n + 1)) (k : Fin n) => u (Fin.castLE h k)) ar.toComposite = Multiset.map (fun (v : Tuple T n) (k : Fin n) => Sum.inl (v k)) (Multiset.map Prod.fst ar)

    Non-dedup form of dedup_toComposite_proj_first_n: the first-n projection of ar.toComposite is the Sum.inl-lift of the first-projection of ar.

    theorem Sum.inl_lift_injective {T K : Type} {n : } :
    Function.Injective fun (v : Tuple T n) (k : Fin n) => inl (v k)

    Sum.inl-lift of tuples is injective.

    theorem Multiset.filter_notMem_map_of_injective {α : Type u_1} {β : Type u_2} [DecidableEq α] [DecidableEq β] {f : αβ} (hf : Function.Injective f) (m s : Multiset α) :
    filter (fun (b : β) => bmap f s) (map f m) = map f (filter (fun (a : α) => as) m)

    Filtering by "not a member of an injective image" pulls through the map.

    theorem AnnotatedTuple.toComposite_castLE {T K : Type} [Zero K] {n : } (p : AnnotatedTuple T K n) (k : Fin n) :
    p.toComposite (Fin.castLE k) = Sum.inl (p.1 k)

    Helper: the data part Tuple.fromComposite and AnnotatedTuple.toComposite agree on data.

    theorem AnnotatedTuple.toComposite_last {T K : Type} [Zero K] {n : } (p : AnnotatedTuple T K n) :

    The annotation part of p.toComposite is Sum.inr p.2.

    Roundtrip: Tuple.fromCompositeAnnotatedTuple.toComposite = id. The composite encoding loses no information: peeling the data columns and the annotation column back out reconstructs the original annotated tuple.

    Pushforward version of Tuple.fromComposite_toComposite: mapping Tuple.fromComposite over a composite-encoded annotated relation recovers the original annotated relation.

    theorem Query.rewriting_valid_diff_inner_dd {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] [HasAltLinearOrder K] {n : } (AR₁ AR₂ : AnnotatedRelation T K n) :
    (Multiset.filter (fun (u : Tuple (T K) n) => uMultiset.map (fun (u' : Tuple (T K) (n + 1)) (k : Fin n) => u' (Fin.castLE k)) AR₂.toComposite) (Multiset.map (fun (u' : Tuple (T K) (n + 1)) (k : Fin n) => u' (Fin.castLE k)) AR₁.toComposite)).dedup = Multiset.map (fun (v : Tuple T n) (k : Fin n) => Sum.inl (v k)) (Multiset.filter (fun (v : Tuple T n) => vMultiset.map Prod.fst AR₂) (Multiset.map Prod.fst AR₁)).dedup

    Reduction of the inner Dedup ∘ Diff ∘ Proj block of the Diff rewriting: deduping the difference of first-n projections of AR₁.toComposite and AR₂.toComposite yields the Sum.inl-lift of the deduped "unmatched-keys" filter over the data part. Stated using Fin.castLE (function form) and dot notation (.dedup) so the LHS pattern matches what simp only [evaluate] produces in the Diff case of rewriting_valid.

    theorem Relation.cast_eq_map {T : Type} {n m : } (h : n = m) (r : Relation T n) :

    Relation.cast rewrites to a Multiset.map of Tuple.cast.

    theorem proj_outer_cast_append_eq_fst {α : Type} {n : } (h : n + 1 + n = 2 * n + 1) (p : Tuple α (n + 1)) (q : Tuple α n) :
    (fun (k : Fin (n + 1)) => Tuple.cast h (Fin.append p q) (Fin.castLE k)) = p

    Projecting the first n+1 columns of Tuple.cast h (Fin.append p q) (for p : Tuple α (n+1), q : Tuple α n, h : n+1+n = 2*n+1) returns p.

    theorem cast_append_at_ofNat_left {α : Type} {n : } (h : n + 1 + n = 2 * n + 1) (p : Tuple α (n + 1)) (q : Tuple α n) (k : Fin n) [NeZero (2 * n + 1)] :
    Tuple.cast h (Fin.append p q) (Fin.ofNat (2 * n + 1) k) = p (Fin.castLE k)

    Reading Tuple.cast h (Fin.append p q) at index Fin.ofNat _ k.val (for k : Fin n) returns p k.castSucc.

    theorem cast_append_at_ofNat_right {α : Type} {n : } (h : n + 1 + n = 2 * n + 1) (p : Tuple α (n + 1)) (q : Tuple α n) (k : Fin n) [NeZero (2 * n + 1)] :
    Tuple.cast h (Fin.append p q) (Fin.ofNat (2 * n + 1) (k + n + 1)) = q k

    Reading Tuple.cast h (Fin.append p q) at index Fin.ofNat _ (k.val+n+1) (for k : Fin n) returns q k.

    theorem selFilter_cast_append_iff {T K : Type} [ValueType T] [SemiringWithMonus K] [HasAltLinearOrder K] {n : } (h : n + 1 + n = 2 * n + 1) (p : Tuple (T K) (n + 1)) (q : Tuple (T K) n) [NeZero (2 * n + 1)] :
    (List.foldr (fun (t t' : Filter (T K) (2 * n + 1)) => t.And t') Filter.True (List.map (fun (k : ) => Filter.BT ((fun (x y : Term (T K) (2 * n + 1)) => BoolTerm.EQ x y) #(Fin.ofNat (2 * n + 1) k) #(Fin.ofNat (2 * n + 1) (k + n + 1)))) (List.range n))).eval (Tuple.cast h (Fin.append p q)) (fun (k : Fin n) => p (Fin.castLE k)) = q

    selFilter on Tuple.cast h (Fin.append p q) characterises the first-n projection equality between p and q.

    theorem cast_append_2n2_at_ofNat_left {α : Type} {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple α (n + 1)) (k : Fin n) [NeZero (2 * n + 2)] :
    Tuple.cast h (Fin.append p q) (Fin.ofNat (2 * n + 2) k) = p (Fin.castLE k)

    Arity-(2n+2) analogue of cast_append_at_ofNat_left: reading Tuple.cast h (Fin.append p q) at index Fin.ofNat _ k.val (for k : Fin n) returns p (k.castLE (Nat.le_succ n)). Here q : Tuple α (n+1) (rather than Tuple α n).

    theorem cast_append_2n2_at_ofNat_right {α : Type} {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple α (n + 1)) (k : Fin n) [NeZero (2 * n + 2)] :
    Tuple.cast h (Fin.append p q) (Fin.ofNat (2 * n + 2) (k + n + 1)) = q (Fin.castLE k)

    Arity-(2n+2) analogue of cast_append_at_ofNat_right: reading Tuple.cast h (Fin.append p q) at index Fin.ofNat _ (k.val+n+1) (for k : Fin n) returns q (k.castLE (Nat.le_succ n)).

    theorem cast_append_2n2_at_ofNat_n {α : Type} {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple α (n + 1)) [NeZero (2 * n + 2)] :
    Tuple.cast h (Fin.append p q) (Fin.ofNat (2 * n + 2) n) = p (Fin.last n)

    Arity-(2n+2) helper: reading Tuple.cast h (Fin.append p q) at index Fin.ofNat _ n returns p (Fin.last n).

    theorem cast_append_2n2_at_last {α : Type} {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple α (n + 1)) :
    Tuple.cast h (Fin.append p q) (Fin.last (2 * n + 1)) = q (Fin.last n)

    Arity-(2n+2) helper: reading Tuple.cast h (Fin.append p q) at index Fin.last (2*n+1) (the last index of Fin (2*n+2)) returns q (Fin.last n).

    theorem proj_outer_2n2_cast_append_eq_fst {α : Type} {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple α (n + 1)) (k : Fin (n + 1)) :
    Tuple.cast h (Fin.append p q) (Fin.castLE k) = p k

    Arity-(2n+2) projection helper: reading Tuple.cast h (Fin.append p q) at index k.castLE _ (for k : Fin (n+1)) returns p k. This is the analogue of proj_outer_cast_append_eq_fst for the 2n+2 case (i.e., q : Tuple α (n+1)).

    theorem selFilter_cast_append_2n2_iff {T K : Type} [ValueType T] [SemiringWithMonus K] [HasAltLinearOrder K] {n : } (h : n + 1 + (n + 1) = 2 * n + 2) (p q : Tuple (T K) (n + 1)) [NeZero (2 * n + 2)] :
    (List.foldr (fun (t t' : Filter (T K) (2 * n + 2)) => t.And t') Filter.True (List.map (fun (k : ) => Filter.BT ((fun (x y : Term (T K) (2 * n + 2)) => BoolTerm.EQ x y) #(Fin.ofNat (2 * n + 2) k) #(Fin.ofNat (2 * n + 2) (k + n + 1)))) (List.range n))).eval (Tuple.cast h (Fin.append p q)) (fun (k : Fin n) => p (Fin.castLE k)) = fun (k : Fin n) => q (Fin.castLE k)

    Arity-(2n+2) analogue of selFilter_cast_append_iff: the join condition on Tuple.cast h (Fin.append p q) with q : Tuple (T⊕K) (n+1) characterises equality of the first-n projections of p and q.

    theorem AnnotatedRelation.toComposite_filter {T K : Type} [ValueType T] [Zero K] {n : } (ar : AnnotatedRelation T K n) (pred : AnnotatedTuple T K nProp) [DecidablePred pred] :
    toComposite (Multiset.filter pred ar) = Multiset.filter (fun (t : Tuple (T K) (n + 1)) => pred t.fromComposite) ar.toComposite

    Filter pushes through AnnotatedRelation.toComposite via the Tuple.fromCompositeAnnotatedTuple.toComposite = id roundtrip: filtering before taking the composite encoding equals filtering the composite encoding by the same predicate composed with Tuple.fromComposite.

    theorem Multiset.semijoin_proj_eq_filter {α : Type u_1} {β : Type u_2} [DecidableEq β] (r : Multiset α) (s : Multiset β) (g : αβ) (hs : s.Nodup) :
    map Prod.fst (filter (fun (pair : α × β) => g pair.1 = pair.2) (r.product s)) = filter (fun (a : α) => g a s) r

    Semijoin reduction. Given multisets r : Multiset α and s : Multiset β and a key function g : α → β, with s Nodup, the projection-after-filter of the cartesian product (keeping pairs whose g-image matches) coincides with filtering r to those a whose g a belongs to s. This is the multiset version of the relational semijoin and is the structural identity behind the unmatched_eq half of the Diff-case rewriting correctness.

    theorem Multiset.semijoin_keyed_proj_eq_filter {α : Type u_1} {γ : Type u_2} {δ : Type u_3} {β : Type u_4} [DecidableEq β] (r : Multiset α) (S : Multiset β) (val : βγ) (key_r : αβ) (key_s : γβ) (mk : αγδ) (hS : S.Nodup) (h_val : vS, key_s (val v) = v) :
    map (fun (pair : α × γ) => mk pair.1 pair.2) (filter (fun (pair : α × γ) => key_r pair.1 = key_s pair.2) (r.product (map val S))) = map (fun (a : α) => mk a (val (key_r a))) (filter (fun (a : α) => key_r a S) r)

    Keyed-projection semijoin. Generalizes Multiset.semijoin_proj_eq_filter in two directions: the right multiset is the image S.map val of a Nodup keyset S under a value function val : β → γ, and the projection is an arbitrary mk : α → γ → δ rather than Prod.fst. The compatibility hypothesis h_val asserts that key_s ∘ val is the identity on S (i.e., val reconstructs an element whose key_s-image is the original key). The semijoin then reduces to filtering r by key_r a ∈ S and projecting through mk a (val (key_r a)) (the unique matching γ-value). This is the structural identity behind the matched_eq half of the Diff-case rewriting correctness.

    The Agg of q.rewriting (the inner aggregation used in both the Dedup and Diff rewritings) evaluates to a map over the deduped data-projection of the inner annotated relation, with each row paired (via AnnotatedTuple.toComposite) with the semiring sum of the matching annotations.

    theorem Query.rewriting_valid_diff_inner_dd_inst {T K : Type} [ValueType T] [SemiringWithMonus K] [DecidableEq K] [HasAltLinearOrder K] {n : } (AR₁ AR₂ : AnnotatedRelation T K n) (instA : DecidableEq (Tuple (T K) n)) (instDP : DecidablePred fun (u : Tuple (T K) n) => uMultiset.map (fun (u' : Tuple (T K) (n + 1)) (k : Fin n) => u' (Fin.castLE k)) AR₂.toComposite) :
    (Multiset.filter (fun (u : Tuple (T K) n) => uMultiset.map (fun (u' : Tuple (T K) (n + 1)) (k : Fin n) => u' (Fin.castLE k)) AR₂.toComposite) (Multiset.map (fun (u' : Tuple (T K) (n + 1)) (k : Fin n) => u' (Fin.castLE k)) AR₁.toComposite)).dedup = Multiset.map (fun (v : Tuple T n) (k : Fin n) => Sum.inl (v k)) (Multiset.filter (fun (v : Tuple T n) => vMultiset.map Prod.fst AR₂) (Multiset.map Prod.fst AR₁)).dedup

    Instance-polymorphic restatement of Query.rewriting_valid_diff_inner_dd. Inside the Diff case of rewriting_valid, Lean's instance synthesis picks inconsistent DecidableEq (T⊕K) instances at different positions in the goal: the inner Multiset.dedup is elaborated with LinearOrder.toDecidableEq (via ValueType (T⊕K)), while the surrounding Multiset.filter's decidableMem uses instDecidableEqSum. This wrapper accepts both as explicit parameters and bridges to the canonical helper via Subsingleton.elim.

    (R5) Rewriting of top-level aggregation #

    The aggregation rewriting rule (R5) of Sen, Maniu & Senellart, ProvSQL:

    γ_{i₁,…,iₘ}[t₁ : f₁, …, tₙ : fₙ](q) is rewritten to γ_{i₁,…,iₘ}[t₁ * #(k+1) : f̂₁, …, tₙ * #(k+1) : f̂ₙ, #(k+1) : δ(⊕)](q).

    Unlike (R1)–(R4), which keep the rewriting target in Query (T ⊕ K) and the standard evaluate semantics, (R5)'s rewritten query is interpreted in the K-semimodule V_K – the per-column term t_j * #(k+1) evaluates to a K-tensor monomial α ⊗ v_j, not to a plain T ⊕ K value. The companion evaluator Query.evaluateInVK (in Provenance.QueryEvaluateInVK) carries that interpretation.

    Query.rewritingAgg here implements the rewriting syntactically as a Query (T ⊕ K). Its semantic correctness – the analogue of rewriting_valid stating that ⟪Agg ...⟫_Î matches evaluateInVK (rewritingAgg ...) Î.toComposite – is proved as the (R5) case of Query.rewriting_valid_full (in Provenance.QueryEvaluateInVK), packaged together with the (R1)–(R4) correctness. The R4 sorries in Query.rewriting_valid for the diff case are carried over there as the only remaining gap.

    def Query.rewritingAgg {T K : Type} [ValueType T] {m n₁ n₂ : } (is : Tuple (Fin m) n₁) (ts : Tuple (Term T m) n₂) (as : Tuple AggFunc n₂) (q_inner : Query T m) (hq_inner : q_inner.noAgg) :
    Query (T K) (n₁ + n₂ + 1)

    (R5) Top-level aggregation rewriting. Produces a plain Query (T ⊕ K) representing γ_{is}[t_j * #(k+1) : f̂_j, #(k+1) : δ(⊕)](q.rewriting).

    The inner query q is required to be noAgg (the ICDE paper imposes aggregation-at-root); q.rewriting handles its (R1)–(R4) rewriting and the resulting query operates on tuples of arity m+1 (the original m data columns plus one annotation column). The output Agg has n₁ grouping columns, n₂+1 aggregated columns (the original n₂ plus the δ(⊕) annotation column at the end), and arity n₁ + n₂ + 1.

    Equations
    • One or more equations did not get rendered due to their size.
    Instances For

      Unified rewriting for queries with at-most-top-level aggregation #

      def Query.wellFormed {T : Type} {n : } :
      Query T nProp

      A query is well-formed for rewriting if it has no aggregation at all (corresponding to the (R1)–(R4) cases) or if it is a top-level aggregation whose inner query has no aggregation (the (R5) case). This is the structural precondition imposed by the ICDE paper, which restricts the aggregation operator to the root of a query plan.

      Equations
      Instances For
        def Query.rewritingFull {T K : Type} [ValueType T] {n : } (q : Query T n) :
        q.wellFormedQuery (T K) (n + 1)

        Unified rewriting: dispatches between the (R1)–(R4) rewriting for non-aggregating queries and the (R5) rewriting for top-level aggregations. The single function realises the rewriting rules (R1)–(R5) of Sen, Maniu & Senellart together.

        Equations
        Instances For