Option to exclude dependencies from pex file #2097
Comments
You'd have to find a way to ensure the version already present in the target env is compatible with the version yanked from the pex, or you might find yourself with hard-to-debug issues.
Yeah, as @kaos points out, this is definitely buyer beware. It at least requires the Such that This would allow not using
Ok, I've spent some time prototyping this (the advanced implementation described above). This works great but it's also very sensitive. It turns out two installs of the same wheel can lead to different hashes due to ~inconsequential differences. For example:
So, in this case the cowsay installed in a venv was installed by a Pip that used a newer version of wheel to do the install work than Pex did, and that version of wheel both leaves its version in metadata files and handles some (~inconsequential) metadata differently. My experiment looks like so when this happens:
Running it yields a failure since the cowsay wheel is excluded:
But letting the PEX get cowsay from another venv works with warning:
I'll think a bit on how to boil this all down into a sane set of defaults and options to control all this.
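The hash sensitivity described above can be illustrated with a small sketch. It is not Pex's actual fingerprinting code: `fingerprint` and `fake_install` are hypothetical helpers, the `cowsay-0.0.dist-info` layout and the wheel version numbers are made up, and it assumes the only difference between the two installs is the `Generator` line of the `WHEEL` metadata file.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(installed_dir: Path) -> str:
    """Hash every file (relative path plus bytes) under an installed distribution."""
    digest = hashlib.sha256()
    for path in sorted(p for p in installed_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(installed_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def fake_install(root: Path, generator: str) -> Path:
    """Lay out a toy install whose only variable part is the WHEEL Generator line."""
    dist_info = root / "cowsay-0.0.dist-info"  # Made-up version, for illustration.
    dist_info.mkdir(parents=True)
    (root / "cowsay.py").write_text("def cow(text):\n    return text\n")
    (dist_info / "WHEEL").write_text(
        "Wheel-Version: 1.0\n"
        f"Generator: {generator}\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    )
    return root


def demo() -> tuple:
    """Return (same_tool_matches, different_tool_matches) for three toy installs."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        a = fingerprint(fake_install(base / "a", "bdist_wheel (0.1.0)"))
        b = fingerprint(fake_install(base / "b", "bdist_wheel (0.1.0)"))
        c = fingerprint(fake_install(base / "c", "bdist_wheel (0.2.0)"))
        return a == b, a == c
```

The code files are byte-identical in all three toy installs, yet the fingerprint changes as soon as the installer tooling stamps a different version into the metadata. Any real check would need to decide which files count toward the hash.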
Ok, a conversation over in Pants slack has re-awakened the possibility of bloated 3rdparty deps and the need to sometimes say: I know better, exclude this because I know this dep is not actually used. The case in that discussion was So I think there are two types of excludes to support:
Basically, case 1 is where you know the code goes unused at runtime - you know better. Case 2 is where you need the code, but it's a waste to ship in the PEX because you know the PEX will only run in environments that provide that code in pre-installed packages. Implementing case 1 solves both use cases and could ship before implementing case 2. You have to be very careful with it under use case 2 though setting up Implementing case 2 is what I explored above with nice fail-fast behavior and no need for
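To make the fail-fast idea behind case 2 concrete, here is a minimal sketch of a boot-time check. It is an assumption about how such a check could look, not how Pex implements or will implement it: `check_provided` and `MissingProvidedDistributions` are hypothetical names, and the sketch only verifies that each project is installed in the runtime environment, not that its version or contents match what was resolved.

```python
from importlib import metadata
from typing import Callable, Dict, Iterable


class MissingProvidedDistributions(RuntimeError):
    """Raised when the runtime environment lacks a distribution it should provide."""


def check_provided(
    projects: Iterable[str],
    find_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return {project: version} for every project, or fail fast listing all misses."""
    found: Dict[str, str] = {}
    missing = []
    for project in projects:
        try:
            found[project] = find_version(project)
        except metadata.PackageNotFoundError:
            missing.append(project)
    if missing:
        raise MissingProvidedDistributions(
            "Not found in the runtime environment: " + ", ".join(sorted(missing))
        )
    return found
```

The `find_version` parameter is injectable so the check can be exercised without depending on what happens to be installed. Collecting every miss before raising gives one actionable error instead of a sequence of import failures later on.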
When excluding a requirement from a PEX, any resolved distribution matching that requirement, as well as any of its transitive dependencies not also needed by non-excluded requirements, are elided from the PEX. At runtime these missing dependencies will not trigger boot resolve errors, but they will cause errors if anything attempts to import the modules they would have provided. If the intention is to load the modules from the runtime environment, then the `--pex-inherit-path` / `PEX_INHERIT_PATH` or `PEX_EXTRA_SYS_PATH` knobs must be used to allow the PEX to see distributions installed in the runtime environment. Clearly, you must know what you're doing to use this option without hitting import errors at runtime. Beware! A forthcoming `--provided` option, with similar effects on the PEX contents, will both automatically inherit any needed missing distributions from the runtime environment and require that all missing distributions be found, failing fast if they are not. Work towards pex-tool#2097.
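The elision rule described above amounts to a reachability computation over the resolved dependency graph. The following is a minimal sketch of that rule, not Pex's implementation: `reachable` and `split_excluded` are hypothetical helpers, and the project names and edges in the usage note are invented for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def reachable(
    graph: Dict[str, List[str]], roots: Iterable[str], excluded: Set[str]
) -> Set[str]:
    """Return every project reachable from roots without passing through an exclude."""
    seen: Set[str] = set()
    queue = deque(root for root in roots if root not in excluded)
    while queue:
        project = queue.popleft()
        if project in seen:
            continue
        seen.add(project)
        for dep in graph.get(project, []):
            if dep not in excluded and dep not in seen:
                queue.append(dep)
    return seen


def split_excluded(
    graph: Dict[str, List[str]], roots: Iterable[str], excluded: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (kept, elided): what stays in the PEX and what gets dropped from it."""
    roots = list(roots)
    kept = reachable(graph, roots, excluded)
    everything = reachable(graph, roots, set())
    return kept, everything - kept


# Toy resolve: every name and edge here is hypothetical.
GRAPH = {
    "big-sdk": ["sdk-core", "shared-http"],
    "sdk-core": ["sdk-only-util"],
    "web-client": ["shared-http"],
}
KEPT, ELIDED = split_excluded(GRAPH, ["big-sdk", "web-client"], {"big-sdk"})
```

In this toy resolve, excluding `big-sdk` also drops `sdk-core` and `sdk-only-util`, since nothing else needs them, while `shared-http` stays because the non-excluded `web-client` still depends on it.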
Ok, item 1. -
I have not started work on @AlexTereshenkov, I'm not sure if you took my advice here: #455 (comment), but, if not, you should be able to use exhaustive excludes if that is somehow easier for you.
Still some issues to work through on that PR #2409. I may not get back to this until the 29th or so.
I'm commenting here since this is still open and you have #2409 in progress. I can open a new issue if preferred though. I tried using the
I can get around this by adding
@TonySherman can you provide more context for your case? Notably, it seems to involve Pants and, fwict maybe involves Pants performing a subset of a PEX repository? For example, creating a PEX using exclude works fine afaict:
It's only when you try to use code that needs the excluded dep, that you hit issues - as you should:
And, which you can correct by using either
@TonySherman actually a PEX repository subset works fine. Using the
So I'll need very detailed information to flesh out your Pants error message. In particular, the Pex command line it runs that leads to that error. The Pex version in use would be helpful too.
Sorry about the lack of information. I recently worked on pantsbuild/pants#20939 which passes additional arguments to the My use case is building AWS Lambda function zips which don't need to have boto3/botocore in the package because they are in the runtime.
Alright. The now polished
Pex currently relies on Pip to resolve dependencies and build the PEX file. However, there are cases where users may want to exclude certain packages, either because they are already available on the target environment or because they are too large to include in the PEX file.
The proposed solution is to add an exclude flag to Pex, which would allow users to specify a list of packages to exclude from the PEX file. During the assembly phase, Pex would exclude these packages and their transitive dependencies, resulting in a smaller PEX file.
I believe this would be a useful feature for many users. We would want to use it with Pants to exclude dependencies such as pyspark for an environment that includes pyspark already.
This issue follows up on the comment here: #2082 (comment)
Other related issues have been filed previously in Pants and Pex as well: #1146 and pantsbuild/pants#11324.