-
Notifications
You must be signed in to change notification settings - Fork 447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: embed and check githash in .olean #2766
Conversation
|
Options for a successful PR run:
|
Note: this will break |
While we're changing the olean format, I also have a request: add a version field, pinned to I don't want to start the discussion of ABI versioning yet, but it would be nice if we had a little foresight... |
I have a patch for leantar up at https://github.com/digama0/leangz/tree/olean_v1 . It is backward compatible with earlier versions, so I will make a release for it as soon as the new olean format is finalized and this PR is merged. |
I'll repeat my request for adding a version field. |
I added a version field for the .olean header format itself but I don't see how we could realistically version the payload format itself given that any change to a user environment extension can be an ABI change. And even if we restricted ourselves to core, it seems infeasible to me without some automation double-checking the contract.
It doesn't have to, |
Thanks. A "good enough" solution is fine for the leantar usecase, as if it gets the format wrong the worst that happens is the compression ratio goes way down. The main idea here is that when we anticipate breakage due to use cases in the wild we have a number we can bump to fix them. For lean itself it makes sense that you would want to use the exact githash.
The problem here is that one would have to know which leantar version to use given an olean, and unless it is being distributed with the toolchain and being released with every commit I don't how to do that. The number of olean version changes that are relevant to leantar that have happened in the last year can be counted on one hand, so it is perfectly feasible to release new leantar versions when needed and have complete backward compatibility to every change made to the format so far. Maybe this won't be the case forever but the changes so far have been quite tame and predictable. I have some ideas for how to do complete olean versioning using "schema generation" from the lean environment; a lot of this is already implemented in oleandump, which reads lean declarations to determine how to deserialize oleans. That would make it possible to detect and automatically bump the version whenever there is an ABI breaking change. But I'm not proposing that at this time, only a version that is manually bumped to avoid the worst fallout of ABI breaks. |
Why does my suggestion of hardcoding the leantar version in the |
Oh I see, yes that would work, you are saying that the "olean version" would be tracked in the cache source code instead of the lean core source code. This can be done without needing to pin leantar versions too, cache could pass an olean version as a command line argument. I would go for all of the above: a lean embedded olean version number, a cache version command line arg, and leantar versioned releases. The reason is because these each have slightly different development processes and this gives us flexibility to detect and correct issues caused by any of these places. |
(ready for review but I want to do one final test round before merge) |
Updated https://github.com/digama0/leangz/tree/olean_v1 , I will make a release once this PR is merged. Fortunately or unfortunately, old leantar versions will detect that these oleans have a funny format and will instead use a generic (and much less efficient) compression format, so if we are not on the ball bumping the leantar version at the right time we will simply have some large cache files. The new leantar is backward compatible though, so I will try to get it merged in advance of this change landing in mathlib, and then hopefully it will not be an issue. |
709cb77
to
a3793f4
Compare
Mathlib used parts of ProofWidgets in tests, that were not being built by |
After leanprover/lean4#2766 this becomes critical. Co-authored-by: Scott Morrison <[email protected]>
After leanprover/lean4#2766 this becomes critical. Co-authored-by: Scott Morrison <[email protected]>
After leanprover/lean4#2766 this becomes critical. Co-authored-by: Scott Morrison <[email protected]>
@digama0 It's merged! |
After leanprover/lean4#2766 lands, the git hash of will be embedded in every olean. `lean4checker` will thus reject oleans unless it was compiled on the same toolchain as they were! We may want to relax this later, but for now, since we are building `lean4checker` in CI anyway, let's just make sure it is on the same toolchain as mathlib. Co-authored-by: Scott Morrison <[email protected]>
leanprover/lean4#2766 changed the format of .olean files, so `leantar` also needs to be updated to support it. `leantar 0.1.9` supports both the new format and the old one, and is fully backward compatible, so it is safe to land this in advance of the next RC bump. (This change has already been tested on [`lean-pr-testing-2766`](0c9d123).) In fact, leantar is also forward-compatible in this release: old leantar can pack and unpack oleans made by new lean, but it does not recognize the files as oleans, so it falls back to plain old gzip (with a significant loss in compression quality). But it's probably best to land this first so we don't waste space and download time. Co-authored-by: Mario Carneiro <[email protected]>
After leanprover/lean4#2766 this becomes critical. Co-authored-by: Scott Morrison <[email protected]>
After leanprover/lean4#2766 lands, the git hash of will be embedded in every olean. `lean4checker` will thus reject oleans unless it was compiled on the same toolchain as they were! We may want to relax this later, but for now, since we are building `lean4checker` in CI anyway, let's just make sure it is on the same toolchain as mathlib. Co-authored-by: Scott Morrison <[email protected]>
leanprover/lean4#2766 changed the format of .olean files, so `leantar` also needs to be updated to support it. `leantar 0.1.9` supports both the new format and the old one, and is fully backward compatible, so it is safe to land this in advance of the next RC bump. (This change has already been tested on [`lean-pr-testing-2766`](0c9d123).) In fact, leantar is also forward-compatible in this release: old leantar can pack and unpack oleans made by new lean, but it does not recognize the files as oleans, so it falls back to plain old gzip (with a significant loss in compression quality). But it's probably best to land this first so we don't waste space and download time. Co-authored-by: Mario Carneiro <[email protected]>
This is an additional safety net on top of #2749: it protects users that circumvent the build system (e.g. with
lake env
) as well as obviates the need for TOCTOU-like race condition checks in the build system.The check is activated by
CHECK_OLEAN_VERSION=ON
, which now defaults toOFF
as the sensible default for local development. When activated,USE_GITHASH=ON
is also force-enabled for stage 0 in order to make sure that stage 1 can load its own core library.