feat: query complexity model for algorithms#275
feat: query complexity model for algorithms#275Shreyas4991 wants to merge 154 commits intoleanprover:mainfrom
Conversation
Co-authored-by: Sorrachai Yingchareonthawornhcai <sorrachai@users.noreply.github.com>
Co-authored-by: Sorrachai Yingchareonthawornhcai <sorrachai@users.noreply.github.com>
…ity-freeM-shreyas
…ity-freeM-shreyas
…ity-freeM-shreyas
| @[grind =] | ||
| lemma SortModel_insertHeadquery_iff [LinearOrder α] (l : List α) (x : α) : | ||
| (sortModel α).evalQuery (insertHead x l) = x :: l := by | ||
| simp [sortModel] | ||
|
|
||
| @[simp] | ||
| lemma cost_cmpLT_compares [LinearOrder α] : ((sortModel α).2 (cmpLE head x)).compares = 1 := by | ||
| simp [sortModel] | ||
|
|
||
| @[simp] | ||
| lemma cost_cmpLT_inserts [LinearOrder α] : | ||
| ((sortModel α).2 (cmpLE head x)).inserts = 0 := by | ||
| simp [sortModel] | ||
|
|
||
| @[simp] | ||
| lemma cost_insertHead_compares [LinearOrder α] : | ||
| ((sortModel α).2 (insertHead x l)).compares = 0 := by | ||
| simp [sortModel] |
There was a problem hiding this comment.
Does putting simps on def sortModel generate all these lemmas? (you can check what it generates with simps? or whatsnew in)
| the index type depends. This way, any instance parameters of `α` can be used easily | ||
| for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed. | ||
| So you can't supply instances for the index type there. | ||
| 2. Define a `Model Q C` type instance |
There was a problem hiding this comment.
This is no longer an instance.
I think you're missing the step where you choose a cost model.
| 1. Define an inductive type of queries which carries. This type must at least one index parameter | ||
| which denotes the output type of the query. Additionally it helps to have a parameter `α` on which | ||
| the index type depends. This way, any instance parameters of `α` can be used easily | ||
| for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed. |
There was a problem hiding this comment.
| for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed. | |
| for the output types. The signatures of `Model.evalQuery` and `Model.cost` are fixed. |
| FreeM.liftM (m := Id) (fun {ι} q => (sortModelNat α).evalQuery q) | ||
| (mergeSort (xs.take (xs.length / 2))) = | ||
| mergeSortNaive (xs.take (xs.length / 2)) := by |
There was a problem hiding this comment.
Flagging again that this is type-incorrect; this should be
| FreeM.liftM (m := Id) (fun {ι} q => (sortModelNat α).evalQuery q) | |
| (mergeSort (xs.take (xs.length / 2))) = | |
| mergeSortNaive (xs.take (xs.length / 2)) := by | |
| Id.run (FreeM.liftM (fun {ι} q => pure <| (sortModelNat α).evalQuery q) | |
| (mergeSort (xs.take (xs.length / 2)))) = | |
| mergeSortNaive (xs.take (xs.length / 2)) := by |
| /-- | ||
| The vanilla-lean version of `merge` that merges two lists. When the two lists | ||
| are sorted, so is the merged list. | ||
| -/ | ||
| def mergeNaive [LinearOrder α] (x y : List α) : List α := | ||
| match x,y with | ||
| | [], ys => ys | ||
| | xs, [] => xs | ||
| | x :: xs', y :: ys' => | ||
| if x ≤ y then | ||
| let rest := mergeNaive xs' (y :: ys') | ||
| x :: rest | ||
| else | ||
| let rest := mergeNaive (x :: xs') ys' | ||
| y :: rest |
There was a problem hiding this comment.
What happened to using List.merge here?
There was a problem hiding this comment.
I'd rather not make this change. It has a visual purpose. It shows how to do things like define a function looking identical to the Prog version. Also List.merge has no fun_induction induction principle and needs unnecessary digressions into boolean valued comparisons, so it is a pain to work with. Working with DecidableEq everywhere is more consistent. We are doing algorithms theory here anyway. Not producing efficient code.
There was a problem hiding this comment.
I've not tried any of this myself, but even if fun_induction doesn't work, can you write your own induction principle that matches what you're doing here?
There was a problem hiding this comment.
That would be a mathlib PR. Given mathlib-PR lifecycle, I would not push for that within this PR.
There was a problem hiding this comment.
I don't think cslib should be in the business of duplicating pure functions from Lean core.
There was a problem hiding this comment.
@Shreyas4991 I don't understand your meaning. Besides that this is from core, not Mathlib, we often write extensions and then later find the right place to upstream them. If there's a usability issue, let's figure out what we need instead of duplicating core.
There was a problem hiding this comment.
I agree, but I have a very specific purpose here. To have a nearly identical looking function just next to the Prog version.
There was a problem hiding this comment.
If you want that purpose then I recommend writing
private theorem merge_eq :
List.merge x y
=
match x,y with
| [], ys => ys
| xs, [] => xs
| x :: xs', y :: ys' =>
if x ≤ y then
let rest := List.merge xs' (y :: ys')
x :: rest
else
let rest := List.merge (x :: xs') ys'
y :: rest
which avoids creating a duplicate definition.
There was a problem hiding this comment.
@chenson2018 List.merge can generate induct lemmas. It works if I copy-paste that definition in my file. For some reason the core version doesn't. Proving a whole induction lemma cannot be the answer since that is also pointless duplication of what the compiler should already by proving
There was a problem hiding this comment.
and fwiw, mergeSortNaive is also functionally equivalent to List.mergeSort. However I am keeping this function as it is because I want its recursive structure to match my mergeSort. It is labelled private. If it is considered worthwhile to match the libary's mergeSort, then I leave it to a future PR author.
| /-- | ||
| A cost type for counting the operations of `SortOps` with separate fields for | ||
| counting calls to `cmpLT` and `insertHead` | ||
| -/ | ||
| @[ext, grind] | ||
| structure SortOpsCost where | ||
| /-- `compares` counts the number of calls to `cmpLT` -/ | ||
| compares : ℕ | ||
| /-- `inserts` counts the number of calls to `insertHead` -/ | ||
| inserts : ℕ |
There was a problem hiding this comment.
It looks like this is quite a common pattern, of "just count how many times each operation was called". Perhaps we should add a TODO about autogenerating this machinery?
Another more abstract option would be to encourage using FreeAddMonoid (SortOps α) as the cost, then the user can write their own cost function to accumulate all the operations later; in this case it would be
abbref SortOpsCost := FreeAddMonoid (SortOps α)
def SortOpsCost.compares := SortOpsCost.lift fun | cmpLE _ _ => 1 | _ => 0
def SortOpsCost.inserts := SortOpsCost.lift fun | inserts Head _ _ => 1 | _ => 0
This gives you the additive structure for free, but not the order.
I think it's worth discussion these options in a docstring or on Zulip.
There was a problem hiding this comment.
For basic models, I want to fix cost functions in a file in the Models folder. For now I'd suggest sticking with this. We can have a discussion on zulip and handle this in a future PR.
| lemma merge_is_mergeNaive [LinearOrder α] (x y : List α) : | ||
| (merge x y).eval (sortModelNat α) = mergeNaive x y := by | ||
| fun_induction mergeNaive | ||
| lemma merge_eval_eq_list_merge [LinearOrder α] (x y : List α) : |
There was a problem hiding this comment.
The naming convention says
| lemma merge_eval_eq_list_merge [LinearOrder α] (x y : List α) : | |
| lemma merge_eval_eq_listMerge [LinearOrder α] (x y : List α) : |
Generalizing @tannerduve 's free monad take on the query complexity idea. This model of algorithmic complexity provides a lightweight approach to complexity verification of algorithms, similar to #165
However it offers several improvements over #165 :
Update to the below drawback : The model requires users to choose a cost for each pure operation upfront by a typeclass instance for a given type ofCost. This is a departure fromTimeMs model. But with custom Cost types with a specific field forpure, these costs can be separated from that of calls to queries.One drawback : It is still possible to sneak in free operations inside
pure. However this is unavoidable in any lightweight monadic approach. A deeply embedded DSL is the only foolproof way to avoid this. Nevertheless this approach removes annotation and review burden and ensures that any actual call to a query will be counted. Thus it is easy to notice when a monadic operation is not being called.Zulip Discussion Thread: https://leanprover.zulipchat.com/#narrow/channel/513188-CSLib/topic/Query.20model.20for.20Algorithms.20complexity.20.3A.20updates/near/569606456