Consider pex instead of pip for locking #601

Closed
phaer opened this issue Jul 26, 2023 · 18 comments
Labels
enhancement New feature or request python

Comments

Member

phaer commented Jul 26, 2023

We are currently using pip install --dry-run --report in fetchPipMetadata to create our Python lock files. While this works better than earlier FOD-based approaches for many use cases, it has (at least) two major issues:

  • cross-platform locking: We can't currently use that approach to e.g. create aarch64-darwin lock files on x86_64-linux, because we can't easily fake platform environment markers; see pypa/pip#11664 (comment), "Various package-index filtering flags do not affect the environment markers".
  • evaluation of extras & environment markers currently happens at lock-time, not at build-time. While this approach was easier to implement because we could do it in Python, it means that
    • the lock file becomes much larger as we need to lock each platform-extra combination ahead of time
    • it can result in incorrect cross-platform locks for legacy sdists which use setup.py. This is because we would need to prepare metadata by executing the setup.py on the build-platform. If the code in setup.py introspects the interpreter to determine the platform that could result in wrong metadata being output.
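
To give a feel for the lock-size point, here is a rough count of how many resolved dependency sets a lock has to carry when markers are evaluated at lock-time. The platform and extra names are made up for illustration; they are not taken from a real dream2nix lock file.

```python
from itertools import combinations

# Hypothetical targets and extras, chosen only for illustration.
platforms = ["x86_64-linux", "aarch64-linux", "x86_64-darwin", "aarch64-darwin"]
extras = ["brotli", "socks", "zstd"]


def extra_subsets(names):
    """Yield every subset of extras, including the empty one."""
    for size in range(len(names) + 1):
        yield from combinations(names, size)


# Lock-time evaluation needs one resolved set per (platform, extras) pair.
lock_time_entries = [
    (platform, subset)
    for platform in platforms
    for subset in extra_subsets(extras)
]

# 4 platforms * 2**3 extra subsets = 32 entries
print(len(lock_time_entries))
```

With unevaluated markers, the lock would instead carry each requirement once, and the choice would be made per target at build-time.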

#583 (comment) recommended https://pex.readthedocs.io as an alternative to pip to tackle the first problem (and support cyclic dependencies, as that's what the linked issue was about).

pex lock files include unevaluated environment markers, same as pip's installation report, e.g. "brotli>=1.0.9; platform_python_implementation == \"CPython\" and extra == \"brotli\"". So we would need to evaluate those in nix. Prior work for that includes poetry2nix pep425.nix.
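
As a sketch of what "evaluating" such a requirement means, here is the same string checked against a few target environments in Python. It assumes the third-party `packaging` library (pip ships a vendored copy); a nix implementation would have to reproduce these semantics, and the helper name `is_needed` is hypothetical.

```python
try:
    from packaging.requirements import Requirement
except ImportError:  # fall back to the copy vendored by pip
    from pip._vendor.packaging.requirements import Requirement

req = Requirement(
    'brotli>=1.0.9; platform_python_implementation == "CPython" and extra == "brotli"'
)


def is_needed(requirement, target_env):
    """Return True if the requirement applies to the given marker environment."""
    if requirement.marker is None:
        return True
    # Keys passed here override the values detected for the running interpreter.
    return requirement.marker.evaluate(target_env)


cpython_with_extra = {"platform_python_implementation": "CPython", "extra": "brotli"}
pypy_with_extra = {"platform_python_implementation": "PyPy", "extra": "brotli"}
cpython_no_extra = {"platform_python_implementation": "CPython", "extra": ""}

print(is_needed(req, cpython_with_extra))  # True
print(is_needed(req, pypy_with_extra))     # False
print(is_needed(req, cpython_no_extra))    # False
```

Because the target environment is passed in explicitly, the same check can be run for a platform other than the one doing the evaluation.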

What do you think, is that something we should try? It would obsolete my work in https://github.com/phaer/untangled_snakes/ but I am happy about that as long as it improves our Python support.

@DavHau @chaoflow

phaer added the enhancement and python labels Jul 26, 2023

jsirois commented Jul 26, 2023

pex lock files include unevaluated environment markers, same as pip's installation report, e.g. "brotli>=1.0.9; platform_python_implementation == "CPython" and extra == "brotli"". So we would need to evaluate those in nix.

What is the transform you do at a high level? From lock file to ... venv? sys.path entry (flat directory of installed wheels) or something else? Pex has a lot of tools to work with a lock file, and those tools evaluate env markers to make the products appropriate for the target (whether a local interpreter or a foreign one described by a complete platform JSON generated on the foreign target with pex3 interpreter inspect --markers --tags).
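
For readers unfamiliar with what the "markers" half of such a target description contains, the following is a minimal sketch that collects most PEP 508 environment marker values for the running interpreter using only the standard library. It is not pex's output format, and the function name is hypothetical; it only illustrates the kind of data that has to come from the foreign target.

```python
import json
import os
import platform
import sys


def current_marker_environment():
    """Collect PEP 508 marker values for the running interpreter.

    This covers the commonly used markers; implementation_version is omitted.
    """
    return {
        "implementation_name": sys.implementation.name,
        "os_name": os.name,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_full_version": platform.python_version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "sys_platform": sys.platform,
    }


print(json.dumps(current_marker_environment(), indent=2))
```

Running this on the foreign machine and saving the JSON is one way to describe that target without having it available at lock-time.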

Contributor

yajo commented Jul 27, 2023

Would it still work with the max-date setting that's currently supported by the pip locker? That's very handy!

Member Author

phaer commented Jul 27, 2023

What is the transform you do at a high level? From lock file to ... venv? sys.path entry (flat directory of installed wheels) or something else?

I'd say "something else", but a sys.path entry sounds close: we eventually put each Python package into a separate derivation.
Those derivations can be built "purely", meaning they get all their info from inputs declared in the repo (lock file, nixpkgs, etc.) but don't have network access at build-time.

With the current approach, we just evaluated the markers during lock-time and wrote the evaluated/"effective" dependencies into the lock file, see e.g.

for what that looks like atm.
This has the significant drawback that we'd need to lock each extra/platform combination separately, so moving marker evaluation to build-time seems like the right way forward to me.

Poetry2nix re-implements pep425 and pep508 in nix.
Another alternative would be to call pex again during build-time (pex shouldn't need network connectivity or so for those commands).
The latter has the advantage of probably having a better tested implementation, but the disadvantage of adding pex to our build-closure.

I'll try to write a proof-of-concept for the latter approach next week or so :)

Member Author

phaer commented Jul 27, 2023

Would it still work with the max-date setting that's currently supported by the pip locker? That's very handy!

@DavHau and I recently wondered whether there are any real use cases for that, so I'm curious what you are using it for. In any case, that feature works by mitm-proxying PyPI, so it should be doable with pex as well if needed.
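
The core idea behind the date cap can be sketched as a filter over index entries by upload time. This is not the actual proxy code: the entry layout, field names and helper names below are assumptions for illustration, and a real proxy would rewrite the index responses rather than filter an in-memory list.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def files_before(files, max_date):
    """Keep only the files uploaded at or before max_date."""
    return [f for f in files if parse_timestamp(f["upload_time"]) <= max_date]


# Made-up index entries for a made-up project.
files = [
    {"filename": "example-1.0.tar.gz", "upload_time": "2023-01-10T12:00:00Z"},
    {"filename": "example-1.1.tar.gz", "upload_time": "2023-06-01T08:30:00Z"},
    {"filename": "example-2.0.tar.gz", "upload_time": "2023-09-15T17:45:00Z"},
]

max_date = datetime(2023, 7, 1, tzinfo=timezone.utc)
visible = [f["filename"] for f in files_before(files, max_date)]
print(visible)  # ['example-1.0.tar.gz', 'example-1.1.tar.gz']
```

Since the resolver only ever sees releases from before the cutoff, re-running the lock with the same date yields the same result regardless of which locker sits behind the proxy.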

Contributor

yajo commented Jul 27, 2023

what you are using for?

I'm packaging Odoo. It's a mastodon composed of many pieces and tons of dependencies (js, binary, ruby, python...). This beast moves fast in some points and slow in some others. Their approach towards dependencies is "we officially support those supported by current debian stable at the time of launching Odoo". Odoo releases have a lifespan of 3 years.

As you can imagine, the dependency diff that accumulates between nixpkgs and Debian stable over 3 years is huge. Besides, a new dependency release can easily break Odoo in some way. One recent example: odoo/odoo#124351 (all CI is ❌ there).

Our derivations include many more modules. Those modules evolve at a different pace and may introduce new dependencies.

As you can imagine, keeping the balance between having a stable and an updated-enough deployment is no easy task.

So, summarizing, all I use this feature for is to be able to tell my devs: "if you add a new addon that has new dependencies, just run nix run .#refresh and commit those locks". And I like the peace of mind that the date capping gives me. That command will only add new stuff for that new module, but will not update any other dependencies that may break some other portion of the system. Then I can set up a renovate action that upgrades that date and refreshes the lock files, to let CI detect incompatibilities if any.

Member

DavHau commented Jul 29, 2023

I think the max-date setting is very valuable as of now as it allows us to update individual dependencies. If we instead use a third party tool to manage the lock file, then this tool might already come with the ability to only update a single dependency, so we might not need the max-date feature anymore.

Member Author

phaer commented Jul 31, 2023

Hello,

I am giving a python module for dream2nix based on pex lock files and pyproject.nix a try - see #611 for details.


jsirois commented Aug 3, 2023

If we instead use a third party tool to manage the lock file, then this tool might already come with the ability to only update a single dependency, so we might not need the max-date feature anymore.

This is in fact true for Pex. You can use pex3 lock update -p "just-this<3" lock.json to update the existing lock.json if possible by keeping everything the same but trying to bump the "just-this" project to the maximum compatible version less than 3.

Contributor

yajo commented Sep 14, 2023

I've tested pex3. Indeed, the lock file is quite cool! I tested it with this command FWIW:

pex3 lock create --transitive --indent 2 --style universal -o lock.json --resolver-version pip-2020-resolver copier pdm
cat lock.json

It seems like plugging that into dream2nix would be a piece of cake.

However, regarding UX, until now, dream2nix provides the interface to refresh the lock file:

nix run -L .#package.config.lock.refresh

This, together with the mitmproxy, gives a reproducible output based on the date. In practice, this means the target package can be packaged no matter which framework it uses: poetry, pdm, flit, setuptools, or just a raw requirements.txt file. As long as the dependencies are properly fed into dream2nix, it will produce the correct lock file.

I'm not talking only about times when you have to install dependencies, but also times where you use dream2nix to maintain a python project itself. In your example, it's using setuptools. But if I were using poetry, it'd work too.

If we rely on pex3 to maintain the lock file by following the strategy you're saying, then dream2nix users would need to manually keep it up to date, instead of the current process of:

  1. Update your dependency.
  2. Re-lock.

Given the use case I have (#601 (comment)), I really appreciate the current simplicity of that process.

FWIW this is the requirements.txt file I feed into dream2nix. Pretty please don't make me maintain manually a lock file based on those requirements. 🙏🏼 😓

Member Author

phaer commented Sep 14, 2023

Hello & thank you for the detailed description, but could you clarify the following part:

If we rely on pex3 to maintain the lock file by following the strategy you're saying, then dream2nix users would need to manually keep it up to date, instead of the current process of:

I am not sure I follow here: You'd like to have a mode where you update the lock file to the newest packages that satisfy the constraints in your pyproject.toml/requirements.txt/setup.py?

If so, wouldn't it work like that if you just set the snapshot date to null?

Or are you talking about the incremental aspects of #601 (comment), where one could re-use the existing lock file?
Agree that this would be a nice feature to have and plan to do so, but it's not implemented yet.

Contributor

yajo commented Sep 22, 2023

You'd like to have a mode where you update the lock file to the newest packages that satisfy the constraints in your pyproject.toml/requirements.txt/setup.py?

Yes.

If so, wouldn't it work like that if you just set the snapshot date to null?

Well, the difference is that each time you run it, you'd get different results. Right now, OTOH, each time you run it, you get the same results because of the mitmproxy + pypi snapshot date.

Imagine the problem when one dependency fails; then you patch it; then you re-lock but another dependency slips in and breaks differently. I've been there when using poetry and poetry2nix: poetry has the nasty habit of updating dependencies every time it can. It's really frustrating.

So, the current approach on dream2nix is awesome. That's what I meant! ❤️ The only thing it's lacking is multi-arch locking. So, if possible, please fix just that (maybe with pex) and leave the rest of the UX as it is, because it's great!

It's so great that I've stopped using poetry and went back to setuptools on a project because poetry now adds no value. It's so great because I could use hatch, setuptools, or flit, or poetry, or whatever... and the workflow would be exactly the same.

Contributor

yajo commented Feb 23, 2024

TIL about pip-tools... would this help? https://pip-tools.readthedocs.io/en/stable/#using-hashes
It seems to produce a hashed lock file from any requirements.txt

Member

DavHau commented Feb 25, 2024

I think we should just focus on stabilizing the pdm module. It fixes most of the major issues we currently have with pip.

Member

DavHau commented Feb 25, 2024

@yajo the pdm module does allow you to update individual dependencies via the pdm cli and it also has multi platform lock files. Maybe you want to give it a shot? I could use someone thoroughly testing it.

Contributor

yajo commented Feb 27, 2024

I'll give it a shot then and report. However I can't promise it'll be soon 😅

phaer closed this as completed Apr 18, 2024
Contributor

yajo commented Apr 18, 2024

I still didn't get a chance to test the pdm module. Is that the final answer to this issue?

Member Author

phaer commented Apr 18, 2024

No! Sorry for the confusion, I just noticed that this issue was still open while I had closed the PR #611 a while ago because I lack time/priority on that.

Happy to re-open if you think that's still useful and/or want to work on it?

Contributor

yajo commented Apr 18, 2024

No, don't worry. I think the issue title is a bit misleading because it focuses on the solution rather than the problem.

The problem is that dream2nix can't build multiarch flakes. If that's solved it's OK closing it AFAIK.

4 participants