Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: faster, linear HashMap.alter and modify #6573

Merged
merged 18 commits into from
Jan 14, 2025
Merged

Conversation

datokrat
Copy link
Contributor

@datokrat datokrat commented Jan 8, 2025

This PR replaces the existing implementations of (D)HashMap.alter and (D)HashMap.modify with primitive, more efficient ones and in particular provides proofs that they yield well-formed hash maps (WF typeclass).

@datokrat datokrat added the WIP This is work in progress, and only a PR for the sake of CI or sharing. No review yet. label Jan 8, 2025
@datokrat datokrat requested a review from TwoFX as a code owner January 8, 2025 08:25
@datokrat datokrat marked this pull request as draft January 8, 2025 08:40
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 9, 2025 09:37 Inactive
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Jan 9, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 9, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 9, 2025
@leanprover-community-bot leanprover-community-bot added the builds-mathlib CI has verified that Mathlib builds against this PR label Jan 9, 2025
@leanprover-community-bot
Copy link
Collaborator

leanprover-community-bot commented Jan 9, 2025

Mathlib CI status (docs):

  • ✅ Mathlib branch lean-pr-testing-6573 has successfully built against this PR. (2025-01-09 10:36:44) View Log
  • ✅ Mathlib branch lean-pr-testing-6573 has successfully built against this PR. (2025-01-09 13:45:31) View Log
  • ✅ Mathlib branch lean-pr-testing-6573 has successfully built against this PR. (2025-01-10 10:00:43) View Log
  • 🟡 Mathlib branch lean-pr-testing-6573 build against this PR was cancelled. (2025-01-10 13:30:15) View Log
  • ✅ Mathlib branch lean-pr-testing-6573 has successfully built against this PR. (2025-01-10 14:11:16) View Log
  • ✅ Mathlib branch lean-pr-testing-6573 has successfully built against this PR. (2025-01-13 14:12:48) View Log
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 05aa256c99ca16ac90536c5ea1322df37b24e958 --onto d2c4471cfa4611977bf4927b5cd849df1a4272b7. (2025-01-14 07:38:34)

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 9, 2025 12:56 Inactive
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 9, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 9, 2025
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 10, 2025 08:53 Inactive
@datokrat datokrat changed the title Implement faster, linear HashMap.alter feat: faster, linear HashMap.alter and modify Jan 10, 2025
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 10, 2025 09:07 Inactive
@datokrat datokrat added awaiting-review Waiting for someone to review the PR and removed WIP This is work in progress, and only a PR for the sake of CI or sharing. No review yet. labels Jan 10, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 10, 2025
src/Std/Data/DHashMap/Internal/AssocList/Lemmas.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/WF.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Basic.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/AssocList/Lemmas.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/List/Associative.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/List/Associative.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/List/Associative.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/Model.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/WF.lean Outdated Show resolved Hide resolved
src/Std/Data/DHashMap/Internal/WF.lean Outdated Show resolved Hide resolved
@TwoFX
Copy link
Member

TwoFX commented Jan 10, 2025

Might want to remove "linear" from the PR title to avoid the interpretation that there was previously a linearity problem.

@kim-em There was, in fact, a linearity problem. Here is the code for alter:

@[inline] def alter [LawfulBEq α] (m : DHashMap α β) (a : α) (f : Option (β a) → Option (β a)) : DHashMap α β :=
  match m.get? a with
  | none =>
    match f none with
    | none => m
    | some b => m.insert a b
  | some b =>
    match f (some b) with
    | none => m.erase a
    | some b => m.erase a |>.insert a b

In the very bottom branch, we verified using the IR and using benchmarks that the compiler decided to first run f (some b) and then call m.erase a, so when b was passed into f it was in fact shared, leading to quadratic runtimes for example if b is an array and f calls b.push.

@datokrat
Copy link
Contributor Author

datokrat commented Jan 10, 2025

(Update: I didn't see Markus' comment before writing this)

Might want to remove "linear" from the PR title to avoid the interpretation that there was previously a linearity problem.

Indeed, it turned out that alter accidentally wasn't linear before (in the case where the map is List-valued and we modify the contained lists), so this interpretation isn't wrong for alter.

I'm a little surprised at the decision to implement modify at the bottom level, rather than just defining it in terms of alter "further out". Perhaps it is a tiny bit faster, but it seems like a lot of extra work.

In my benchmarks, modify is about 30% faster than doing the same with alter. As far as I can tell, this is because we do not need to recompute the size, since modify never changes the size. Luckily, it wasn't hard to show that modify equals alter, so I was able to avoid most of the work.

@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 10, 2025 13:04 Inactive
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 10, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 10, 2025
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 10, 2025 13:20 Inactive
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 10, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 10, 2025
@kim-em
Copy link
Collaborator

kim-em commented Jan 11, 2025

In my benchmarks, modify is about 30% faster than doing the same with alter.

Awesome! Glad I asked.

Copy link
Member

@TwoFX TwoFX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the final few remarks, feel free to merge after fixing.

src/Std/Data/DHashMap/Internal/WF.lean Outdated Show resolved Hide resolved
@datokrat datokrat added awaiting-author Waiting for PR author to address issues and removed awaiting-review Waiting for someone to review the PR labels Jan 13, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/batteries that referenced this pull request Jan 13, 2025
leanprover-community-mathlib4-bot added a commit to leanprover-community/mathlib4 that referenced this pull request Jan 13, 2025
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 13, 2025 14:41 Inactive
@datokrat datokrat removed the awaiting-author Waiting for PR author to address issues label Jan 14, 2025
@datokrat datokrat marked this pull request as ready for review January 14, 2025 07:15
@datokrat datokrat added the changelog-library Library label Jan 14, 2025
@github-actions github-actions bot temporarily deployed to lean-lang.org/lean4/doc January 14, 2025 07:25 Inactive
@datokrat datokrat added this pull request to the merge queue Jan 14, 2025
Merged via the queue into master with commit 821c9b7 Jan 14, 2025
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
builds-mathlib CI has verified that Mathlib builds against this PR changelog-library Library toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants