Documentation

Provenance.LiftedTK

The `V_K`-lifted value type #

This file defines LiftedTK T K, a three-constructor sum that extends the composite encoding T ⊕ K with K-tensor monomials. It is the target value type for evaluating the rewriting of aggregate queries (rule (R5) of Sen, Maniu & Senellart, ProvSQL).

Why a new value type #

The existing ValueType (T ⊕ K) instance has the unfortunate property that multiplying a data value with an annotation drops the annotation:

Sum.inl v * Sum.inr α = Sum.inl v

That is fine for the rules (R1)–(R4), which only multiply same-kind values (data × data in the projected products, annotation × annotation when combining two tuples' K-coordinates), but it loses information in (R5). The rewritten aggregate column term t_j * #(k+1) evaluates to Sum.inl v_j * Sum.inr α, which the current Mul collapses to Sum.inl v_j – the K-side α is gone before the aggregator f̂_j ever sees it.

The paper resolves this by interpreting the rewritten aggregation in the semimodule V_K, which can represent formal sums Σ vᵢ ⊗ αᵢ. LiftedTK T K gives us a Lean encoding of that domain:

LiftedTK.data v – a plain value v ∈ T;
LiftedTK.ann α – a plain annotation α ∈ K;
LiftedTK.ktensor t – a formal K-tensor sum t : KTensor K T.

The fixed Mul produces a ktensor monomial when given a data and an annotation:

data v * ann α = ktensor (embed α v)

so the rewritten R5 aggregation now sees per-tuple K-tensor monomials, and its aggregator f̂_j = sum (multiset union in KTensor K T) accumulates them into the formal sum the paper prescribes.

Notes #

LiftedTK is not (yet) made into a ValueType – ValueType requires a LinearOrder, and a decidable total order on KTensor K T (a Multiset) is not naturally available. We provide the algebraic operations (Zero, Add, Sub, Mul) directly and skip the order requirement; the rewriting evaluator (Query.evaluateInVK, defined elsewhere) does not need to compare values via ≤.
The lifting T ⊕ K → LiftedTK T K is not a Mul-homomorphism on mixed cases – that is precisely the point. It is a homomorphism on same-kind cases, which is what justifies that evaluateInVK agrees with the standard evaluate on the non-aggregate fragment.

References #

Sen, Maniu & Senellart, ProvSQL (Section IV-B, R5)
Amsterdamer, Deutch & Tannen

inductive LiftedTK (T K : Type) :

The V_K-lifted value type: a plain T-value, a plain K-annotation, or a formal K-tensor sum.

data {T K : Type} : T → LiftedTK T K
A plain data value v ∈ T.
ann {T K : Type} : K → LiftedTK T K
A plain annotation α ∈ K.
ktensor {T K : Type} : KTensor K T → LiftedTK T K
A formal K-tensor sum Σ αᵢ ⊗ vᵢ : KTensor K T.

Instances For

def LiftedTK.ofSum {T K : Type} (x : T ⊕ K) :

Embed a T ⊕ K value into LiftedTK T K.

Equations

LiftedTK.ofSum (Sum.inl t) = LiftedTK.data t
LiftedTK.ofSum (Sum.inr k) = LiftedTK.ann k

Instances For

Zero #

@[implicit_reducible]

instance LiftedTK.instZero {T K : Type} [Zero K] :

Zero (LiftedTK T K)

Equations

LiftedTK.instZero = { zero := LiftedTK.ann 0 }

@[simp]

theorem LiftedTK.zero_def {T K : Type} [Zero K] :

0 = ann 0

Addition #

The same-kind cases use the underlying additive structure on T, K, or KTensor K T (multiset union). The mixed cases follow the same “data wins” convention as the existing ValueType (T ⊕ K) Add, with 0 (= ann 0) absorbing into the other operand so that 0 + x = x for every kind. This is enough for the aggregation evaluator: the aggregator f̂_j = sum operates within a single kind (the per-column term t_j * #(k+1) always produces a ktensor; the per-row annotation #(k+1) : δ(⊕) always produces an ann), and the fold's initial value is 0 = ann 0 which is absorbed by the first non-zero summand.

@[implicit_reducible]

instance LiftedTK.instAdd {T K : Type} [Add T] [Add K] :

Add (LiftedTK T K)

Equations

One or more equations did not get rendered due to their size.

@[simp]

theorem LiftedTK.add_data_data {T K : Type} [Add T] [Add K] (x y : T) :

data x + data y = data (x + y)

@[simp]

theorem LiftedTK.add_ann_ann {T K : Type} [Add T] [Add K] (α β : K) :

ann α + ann β = ann (α + β)

@[simp]

theorem LiftedTK.add_ktensor_ktensor {T K : Type} [Add T] [Add K] (t t' : KTensor K T) :

ktensor t + ktensor t' = ktensor (t + t')

@[simp]

theorem LiftedTK.add_data_ann {T K : Type} [Add T] [Add K] (x : T) (α : K) :

data x + ann α = data x

@[simp]

theorem LiftedTK.add_ann_data {T K : Type} [Add T] [Add K] (α : K) (x : T) :

ann α + data x = data x

@[simp]

theorem LiftedTK.add_ktensor_ann {T K : Type} [Add T] [Add K] (t : KTensor K T) (α : K) :

ktensor t + ann α = ktensor t

@[simp]

theorem LiftedTK.add_ann_ktensor {T K : Type} [Add T] [Add K] (α : K) (t : KTensor K T) :

ann α + ktensor t = ktensor t

@[simp]

theorem LiftedTK.add_data_ktensor {T K : Type} [Add T] [Add K] (x : T) (t : KTensor K T) :

data x + ktensor t = data x

@[simp]

theorem LiftedTK.add_ktensor_data {T K : Type} [Add T] [Add K] (t : KTensor K T) (x : T) :

ktensor t + data x = data x

instance LiftedTK.instCommutativeHAdd {T K : Type} [AddCommSemigroup T] [AddCommSemigroup K] :

Std.Commutative fun (x1 x2 : LiftedTK T K) => x1 + x2

The Add on LiftedTK T K is commutative whenever T and K are. The mixed cases give the same result either way (data wins over ann and ktensor; ktensor wins over ann), and same-kind cases inherit commutativity.

instance LiftedTK.instAssociativeHAdd {T K : Type} [AddCommSemigroup T] [AddCommSemigroup K] :

Std.Associative fun (x1 x2 : LiftedTK T K) => x1 + x2

The Add on LiftedTK T K is associative whenever T and K are. The mixed cases follow the priority order “data > ktensor > ann”; same-kind cases inherit associativity from the underlying carriers.

Subtraction #

LiftedTK inherits Sub componentwise on same-kind operands; mixed cases mirror the additive convention. R5 does not exercise Sub on mixed kinds, so the specific choices on those cases are not load-bearing for correctness; they are picked to keep Sub a total function.

@[implicit_reducible]

instance LiftedTK.instSub {T K : Type} [Sub T] [Sub K] :

Sub (LiftedTK T K)

Equations

One or more equations did not get rendered due to their size.

Multiplication – the fix for R5 #

Same-kind multiplication uses the underlying Mul on T, K, or KTensor (the last being the KTensor.smul-style action; here we take KTensor * KTensor to be 0 since R5 does not exercise it, and we already know the un-quotiented Multiset (K × M) is not closed under a meaningful "tensor of tensors" operation).

The mixed data × ann case is the one that distinguishes LiftedTK from T ⊕ K: instead of dropping α, it produces a single-monomial K-tensor α ⊗ v.

@[implicit_reducible]

instance LiftedTK.instMul {T K : Type} [CommSemiringWithMonus K] [Mul T] :

Mul (LiftedTK T K)

Equations

One or more equations did not get rendered due to their size.

Decidable equality #

Decidable equality for LiftedTK T K requires it on T, K, and on KTensor K T. The last reduces to decidable equality on Multiset (K × T), which is decidable from DecidableEq K and DecidableEq T.

@[implicit_reducible]

instance LiftedTK.instDecidableEq {T K : Type} [DecidableEq T] [DecidableEq K] :

DecidableEq (LiftedTK T K)

Equations

(LiftedTK.data x_2).instDecidableEq (LiftedTK.data y) = decidable_of_iff (x_2 = y) ⋯
(LiftedTK.ann α).instDecidableEq (LiftedTK.ann β) = decidable_of_iff (α = β) ⋯
(LiftedTK.ktensor t).instDecidableEq (LiftedTK.ktensor t') = match Multiset.decidableEq t t' with | isTrue h => isTrue ⋯ | isFalse h => isFalse ⋯
(LiftedTK.data a).instDecidableEq (LiftedTK.ann a_1) = isFalse ⋯
(LiftedTK.data a).instDecidableEq (LiftedTK.ktensor a_1) = isFalse ⋯
(LiftedTK.ann a).instDecidableEq (LiftedTK.data a_1) = isFalse ⋯
(LiftedTK.ann a).instDecidableEq (LiftedTK.ktensor a_1) = isFalse ⋯
(LiftedTK.ktensor a).instDecidableEq (LiftedTK.data a_1) = isFalse ⋯
(LiftedTK.ktensor a).instDecidableEq (LiftedTK.ann a_1) = isFalse ⋯

Lifted view of an annotated relation #

AnnotatedRelation.toLifted is the LiftedTK-tuple view of an annotated relation: a row (t, α) becomes the (n+1)-tuple whose first n entries are data (t i) and whose last entry is ann α. This is the "Definition 7" representation of a non-aggregate K-annotated relation in V_K-lifted form, mirroring AnnotatedRelation.toComposite (which produces a T ⊕ K-tuple in the same shape).

The bridge toLifted_eq_map_ofSum_toComposite says that lifting an annotated relation via toLifted agrees pointwise with “go through toComposite and then LiftedTK.ofSum each entry”, which is exactly how Query.evaluateInVK interprets the non-Agg cases. This lets us upgrade the existing Query.rewriting_valid (stated in toComposite form) to the unified Query.rewriting_valid_full (stated in toLifted form).

def AnnotatedTuple.toLifted {T K : Type} {n : ℕ} (p : AnnotatedTuple T K n) :

Tuple (LiftedTK T K) (n + 1)

Lifted view of an AnnotatedTuple: the n data entries become data v_i, with the annotation appended as ann α in the last slot.

Equations

p.toLifted = Fin.append (fun (k : Fin n) => LiftedTK.data (p.1 k)) ![LiftedTK.ann p.2]

Instances For

def AnnotatedRelation.toLifted {T K : Type} {n : ℕ} (ar : AnnotatedRelation T K n) :

Multiset (Tuple (LiftedTK T K) (n + 1))

Lifted view of an AnnotatedRelation: pointwise toLifted on tuples.

Equations

ar.toLifted = Multiset.map AnnotatedTuple.toLifted ar

Instances For

theorem AnnotatedTuple.toLifted_eq_ofSum_toComposite {T K : Type} {n : ℕ} (p : AnnotatedTuple T K n) :

p.toLifted = fun (k : Fin (n + Nat.succ 0)) => LiftedTK.ofSum (p.toComposite k)

The bridge: lifting an annotated tuple agrees pointwise with composite- encoding followed by LiftedTK.ofSum.

theorem AnnotatedRelation.toLifted_eq_map_ofSum_toComposite {T K : Type} {n : ℕ} (ar : AnnotatedRelation T K n) :

ar.toLifted = Multiset.map (fun (t : Tuple (T ⊕ K) (n + 1)) (k : Fin (n + 1)) => LiftedTK.ofSum (t k)) ar.toComposite

The bridge at the multiset level: toLifted agrees with toComposite ∘ map ofSum-on-tuples.

Hom pushforward on `LiftedTK` #

The pointwise pushforward of a function K → K' on a LiftedTK T K value: data v is preserved, ann α becomes ann (h α), and a tensor sum ktensor t becomes ktensor (mapHom h t) (each monomial's K-coefficient through h).

def LiftedTK.mapHomFn {T K K' : Type} (h : K → K') :

LiftedTK T K → LiftedTK T K'

Pointwise hom pushforward on LiftedTK.

Equations

LiftedTK.mapHomFn h (LiftedTK.data v) = LiftedTK.data v
LiftedTK.mapHomFn h (LiftedTK.ann α) = LiftedTK.ann (h α)
LiftedTK.mapHomFn h (LiftedTK.ktensor t) = LiftedTK.ktensor (KTensor.mapHom h t)

Instances For

@[simp]

theorem LiftedTK.mapHomFn_ofSum {T K K' : Type} (h : K → K') (x : T ⊕ K) :

mapHomFn h (ofSum x) = ofSum (Sum.map id h x)