Skip to content

Comments

feat: query complexity model for algorithms#275

Open
Shreyas4991 wants to merge 154 commits intoleanprover:mainfrom
Shreyas4991:query-complexity-freeM-shreyas
Open

feat: query complexity model for algorithms#275
Shreyas4991 wants to merge 154 commits intoleanprover:mainfrom
Shreyas4991:query-complexity-freeM-shreyas

Conversation

@Shreyas4991
Copy link
Contributor

@Shreyas4991 Shreyas4991 commented Jan 21, 2026

Generalizing @tannerduve 's free monad take on the query complexity idea. This model of algorithmic complexity provides a lightweight approach to complexity verification of algorithms, similar to #165

However it offers several improvements over #165 :

  1. No explicit ticks needed, so the chances of making mistakes like forgetting to count specific calls is removed.
  2. The operations which are counted are made explicit as an inductive type and a model of this type.
  3. The model of a query type accepts custom cost structures so long as you can define addition and a 0 operation on them (basically additive monoids suffice).
  4. With a notion of reductions between models, this approach is more modular for the algorithm specifier, allowing high level specs of operations which can be translated into simpler query models later.

Update to the below drawback : The model requires users to choose a cost for each pure operation upfront by a typeclass instance for a given type of Cost. This is a departure from TimeMs model. But with custom Cost types with a specific field for pure, these costs can be separated from that of calls to queries.

One drawback : It is still possible to sneak in free operations inside pure. However this is unavoidable in any lightweight monadic approach. A deeply embedded DSL is the only foolproof way to avoid this. Nevertheless this approach removes annotation and review burden and ensures that any actual call to a query will be counted. Thus it is easy to notice when a monadic operation is not being called.

Zulip Discussion Thread: https://leanprover.zulipchat.com/#narrow/channel/513188-CSLib/topic/Query.20model.20for.20Algorithms.20complexity.20.3A.20updates/near/569606456

@Shreyas4991 Shreyas4991 changed the title Query complexity free m shreyas Query complexity formalisation for describing algorithmic models Jan 22, 2026
Comment on lines 120 to 137
@[grind =]
lemma SortModel_insertHeadquery_iff [LinearOrder α] (l : List α) (x : α) :
(sortModel α).evalQuery (insertHead x l) = x :: l := by
simp [sortModel]

@[simp]
lemma cost_cmpLT_compares [LinearOrder α] : ((sortModel α).2 (cmpLE head x)).compares = 1 := by
simp [sortModel]

@[simp]
lemma cost_cmpLT_inserts [LinearOrder α] :
((sortModel α).2 (cmpLE head x)).inserts = 0 := by
simp [sortModel]

@[simp]
lemma cost_insertHead_compares [LinearOrder α] :
((sortModel α).2 (insertHead x l)).compares = 0 := by
simp [sortModel]
Copy link
Contributor

@eric-wieser eric-wieser Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does putting simps on def sortModel generate all these lemmas? (you can check what it generates with simps? or whatsnew in)

the index type depends. This way, any instance parameters of `α` can be used easily
for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed.
So you can't supply instances for the index type there.
2. Define a `Model Q C` type instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer an instance.

I think you're missing the step where you choose a cost model.

1. Define an inductive type of queries which carries. This type must at least one index parameter
which denotes the output type of the query. Additionally it helps to have a parameter `α` on which
the index type depends. This way, any instance parameters of `α` can be used easily
for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for the output types. The signatures of `Model.evalQuery` and `Model.Cost` are fixed.
for the output types. The signatures of `Model.evalQuery` and `Model.cost` are fixed.

Comment on lines 135 to 137
FreeM.liftM (m := Id) (fun {ι} q => (sortModelNat α).evalQuery q)
(mergeSort (xs.take (xs.length / 2))) =
mergeSortNaive (xs.take (xs.length / 2)) := by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flagging again that this is type-incorrect; this should be

Suggested change
FreeM.liftM (m := Id) (fun {ι} q => (sortModelNat α).evalQuery q)
(mergeSort (xs.take (xs.length / 2))) =
mergeSortNaive (xs.take (xs.length / 2)) := by
Id.run (FreeM.liftM (fun {ι} q => pure <| (sortModelNat α).evalQuery q)
(mergeSort (xs.take (xs.length / 2)))) =
mergeSortNaive (xs.take (xs.length / 2)) := by

Comment on lines 19 to 33
/--
The vanilla-lean version of `merge` that merges two lists. When the two lists
are sorted, so is the merged list.
-/
def mergeNaive [LinearOrder α] (x y : List α) : List α :=
match x,y with
| [], ys => ys
| xs, [] => xs
| x :: xs', y :: ys' =>
if x ≤ y then
let rest := mergeNaive xs' (y :: ys')
x :: rest
else
let rest := mergeNaive (x :: xs') ys'
y :: rest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to using List.merge here?

Copy link
Contributor Author

@Shreyas4991 Shreyas4991 Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not make this change. It has a visual purpose. It shows how to do things like define a function looking identical to the Prog version. Also List.merge has no fun_induction induction principle and needs unnecessary digressions into boolean valued comparisons, so it is a pain to work with. Working with DecidableEq everywhere is more consistent. We are doing algorithms theory here anyway. Not producing efficient code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not tried any of this myself, but even if fun_induction doesn't work, can you write your own induction principle that matches what you're doing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a mathlib PR. Given mathlib-PR lifecycle, I would not push for that within this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think cslib should be in the business of duplicating pure functions from Lean core.

Copy link
Collaborator

@chenson2018 chenson2018 Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shreyas4991 I don't understand your meaning. Besides that this is from core, not Mathlib, we often write extensions and then later find the right place to upstream them. If there's a usability issue, let's figure out what we need instead of duplicating core.

Copy link
Contributor Author

@Shreyas4991 Shreyas4991 Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I have a very specific purpose here. To have a nearly identical looking function just next to the Prog version.

Copy link
Contributor

@eric-wieser eric-wieser Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want that purpose then I recommend writing

private theorem merge_eq :
  List.merge x y
  =
  match x,y with
  | [], ys => ys
  | xs, [] => xs
  | x :: xs', y :: ys' =>
      if x ≤ y then
        let rest := List.merge xs' (y :: ys')
        x :: rest
      else
        let rest := List.merge (x :: xs') ys'
        y :: rest

which avoids creating a duplicate definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenson2018 List.merge can generate induct lemmas. It works if I copy-paste that definition in my file. For some reason the core version doesn't. Proving a whole induction lemma cannot be the answer since that is also pointless duplication of what the compiler should already by proving

Copy link
Contributor Author

@Shreyas4991 Shreyas4991 Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and fwiw, mergeSortNaive is also functionally equivalent to List.mergeSort. However I am keeping this function as it is because I want its recursive structure to match my mergeSort. It is labelled private. If it is considered worthwhile to match the libary's mergeSort, then I leave it to a future PR author.

Comment on lines +32 to +41
/--
A cost type for counting the operations of `SortOps` with separate fields for
counting calls to `cmpLT` and `insertHead`
-/
@[ext, grind]
structure SortOpsCost where
/-- `compares` counts the number of calls to `cmpLT` -/
compares : ℕ
/-- `inserts` counts the number of calls to `insertHead` -/
inserts : ℕ
Copy link
Contributor

@eric-wieser eric-wieser Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is quite a common pattern, of "just count how many times each operation was called". Perhaps we should add a TODO about autogenerating this machinery?

Another more abstract option would be to encourage using FreeAddMonoid (SortOps α) as the cost, then the user can write their own cost function to accumulate all the operations later; in this case it would be

abbref SortOpsCost := FreeAddMonoid (SortOps α)
def SortOpsCost.compares := SortOpsCost.lift fun | cmpLE _ _ => 1 | _ => 0
def SortOpsCost.inserts := SortOpsCost.lift fun | inserts Head _ _ => 1 | _ => 0

This gives you the additive structure for free, but not the order.

I think it's worth discussion these options in a docstring or on Zulip.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For basic models, I want to fix cost functions in a file in the Models folder. For now I'd suggest sticking with this. We can have a discussion on zulip and handle this in a future PR.

@eric-wieser eric-wieser mentioned this pull request Feb 24, 2026
lemma merge_is_mergeNaive [LinearOrder α] (x y : List α) :
(merge x y).eval (sortModelNat α) = mergeNaive x y := by
fun_induction mergeNaive
lemma merge_eval_eq_list_merge [LinearOrder α] (x y : List α) :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming convention says

Suggested change
lemma merge_eval_eq_list_merge [LinearOrder α] (x y : List α) :
lemma merge_eval_eq_listMerge [LinearOrder α] (x y : List α) :

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants