Allow building AWS Lambda Layers #18880
Some prior art from @bobthemighty: https://github.com/bobthemighty/pants-lambda-layer
huonw
added a commit
that referenced
this issue
May 20, 2023
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level, e.g. the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl` (plus the dynamic PEX initialisation).

This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation.

This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed).
- increase build times.

For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size | 24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.)

This functionality is driven by adding a new `layout` field. It defaults to `lambdex` (retaining the current code paths), but also supports `zip`, which keys into the functionality above.

I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. It comes in three phases:

1. Add the `pex_venv.py` util rules for running `pex3 venv create ...`. Currently this only supports a limited subset of the functionality there, but can presumably be expanded freely as required. (First commit)
2. Do some minor refactoring. (Commits labelled "refactor: ...")
3. Draw the rest of the owl. (The others.)

I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up:

- deprecate `layout="lambdex"` (in favour of `layout="zip"` and/or normal `pex_binary`) (#19032)
- add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027)
- adjust documentation
- other improvements like #18195 and #18880
- improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first
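To make the difference between the two layouts concrete, here is a minimal sketch that builds two tiny in-memory zips and lists their top-level entries. The member names and the helpers `make_zip` and `top_level_entries` are illustrative assumptions loosely modelled on the description above; they are not real PEX or Pants output.

```python
import io
import zipfile


def make_zip(members: dict[str, str]) -> bytes:
    """Build an in-memory zip from a mapping of member path -> text content."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, content in members.items():
            zf.writestr(path, content)
    return buf.getvalue()


def top_level_entries(zip_bytes: bytes) -> set[str]:
    """Return the first path component of every member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {name.split("/", 1)[0] for name in zf.namelist()}


# Hypothetical PEX-style package: dependencies live under `.deps/` and need
# the `.bootstrap/` machinery to be unpacked when the function starts.
pex_style = make_zip({
    ".bootstrap/pex/__init__.py": "",
    ".deps/cowsay-....whl/cowsay/__init__.py": "",
    "handler.py": "",
})

# Hypothetical flat package: everything is importable from the zip root.
flat_style = make_zip({
    "cowsay/__init__.py": "",
    "handler.py": "",
})

print(sorted(top_level_entries(pex_style)))
print(sorted(top_level_entries(flat_style)))
```

In the flat case the dependency's package directory is itself a top-level entry, which is what lets the FaaS runtime import it without any start-up step.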
This was referenced May 20, 2023
thejcannon
pushed a commit
that referenced
this issue
May 23, 2023
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation.

This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed).
- increase build times.

For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size | 24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults to `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above.

I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently.

I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up:

- add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027)
- adjust documentation #19067
- other improvements like #18195 and #18880
- improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first

This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be:

| release | supports lambdex? | supports zip? | default layout | deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
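The phasing table can be read as a small decision function. The following is only a sketch of that reading, not the Pants implementation: the function name `resolve_layout`, the lookup tables, and the warning wording are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical encoding of the phasing table: which layouts each release
# supports, and which one it uses when `layout` is not set explicitly.
SUPPORTED = {
    "2.17": {"lambdex", "zip"},
    "2.18": {"lambdex", "zip"},
    "2.19": {"zip"},
}
DEFAULT = {"2.17": "lambdex", "2.18": "zip", "2.19": "zip"}


def resolve_layout(release: str, configured: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (effective layout, deprecation warning or None) for a release."""
    layout = configured if configured is not None else DEFAULT[release]
    if layout not in SUPPORTED[release]:
        raise ValueError(f"layout {layout!r} is not supported in {release}")

    warning = None
    if release == "2.17" and configured is None:
        # The implicit default is about to change, so ask for an explicit choice.
        warning = 'set `layout` explicitly: "zip" is recommended, "lambdex" is allowed'
    elif release == "2.18" and layout == "lambdex":
        # lambdex can only be reached here by configuring it explicitly.
        warning = 'remove `layout = "lambdex"` and switch to "zip"'
    return layout, warning


print(resolve_layout("2.17", None))
print(resolve_layout("2.18", "lambdex"))
print(resolve_layout("2.19", None))
```

Users who set `layout = "zip"` explicitly never see a warning in any of the three releases under this reading.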
WorkerPants
pushed a commit
that referenced
this issue
May 23, 2023
huonw
added a commit
to huonw/pants
that referenced
this issue
May 23, 2023
…d#19076)
huonw
added a commit
that referenced
this issue
May 23, 2023
…ck of #19076) (#19120)
This was referenced May 23, 2023
huonw
added a commit
that referenced
this issue
May 30, 2023
This fixes #18880 by making pants able to create an AWS Lambda package with the layout expected by a "Layer" (https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html). For Python, this is the same as a normal Lambda, except within a `python/` directory: `python/cowsay/__init__.py` will be importable via `import cowsay`.

The big win here is easily separating requirements from sources:

```python
# path/to/BUILD
python_sources()

# Function package: first-party sources only.
python_awslambda(
    name="lambda",
    entry_point="./foo.py:handler",
    include_requirements=False,
)

# Layer package: third-party requirements only.
python_aws_lambda_layer(
    name="layer",
    dependencies=["./foo.py"],
    include_requirements=True,
    include_sources=False,
)
```

Packaging that will result in `path.to/lambda.zip` that contains only first-party sources, and `path.to/layer.zip` that contains only third-party requirements. This results in faster builds, less cache usage, and smaller deploy packages when only changing first-party sources.

This side-steps some of the slowness in #19076. For the example in that PR:

| metric | Lambdex | zip, no layers | using zip + layers (this PR) |
|---|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s | not yet tested |
| compressed size | 24.6MB | 23.8MB | 28KB (lambda), 23.8MB (layer) |
| uncompressed size | 117.8MB | 115.8MB | 72KB (lambda), 115.7MB (layer) |
| PEX-construction build time | ~5s | ~5s | ~1.5s (lambda), ~5s (layer) |
| PEX-postprocessing build time | 0.14s | 4.8s | 1s (lambda), ~5s (layer) |

That is, the first-party-only lambda package is ~1000× smaller, and is ~2× faster to build than even the Lambdex version.

This uses a separate target, `python_aws_lambda_layer`. The target has its inputs configured using the `dependencies=[...]` field. For instance, the example above is saying create a layer using all of the third-party requirements required (transitively) by `./foo.py`, and none of the first-party sources.

(The initial implementation bdbc1cb just added a `layout="layer"` option to the existing `python_awslambda` target, but, per the discussion in this PR, that was deemed unnecessarily confusing, e.g. it'd keep the `handler` field around, which is meaningless for a layer.)

Follow-up not handled here:

- for 2.18:
  - documentation
  - renaming the `python_awslambda` target to `python_aws_lambda_function` to be clearer (NB. I'm proposing also taking the chance to add an underscore in the name, which I've done with the `python_aws_lambda_layer` target. Let me know if there's a strong reason for the `awslambda` name without an underscore. I note the GCF target is `python_google_cloud_function`)
- not necessarily for 2.18:
  - potentially, adding "sizzle" like the ability to have `python_aws_lambda_function` specify which layers it will be used with, and thus automatically exclude the contents of the layer from the function (e.g. the example above could hypothetically replace `include_requirements=False` with `layers=[":layer"]`)
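As a rough illustration of why the `python/` prefix matters, the sketch below builds a tiny layer-shaped zip, extracts it, and imports a package from the extracted `python/` directory. It assumes the runtime extracts the layer and puts that `python/` directory on `sys.path`; the package name `example_layer_dep` and both helper functions are hypothetical stand-ins, not anything Pants or AWS provides.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Hypothetical package name, chosen to avoid clashing with anything installed.
PACKAGE = "example_layer_dep"


def build_layer_zip(zip_path: Path) -> None:
    """Write a zip whose contents all sit under a top-level `python/` directory."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(f"python/{PACKAGE}/__init__.py", "GREETING = 'moo'\n")


def import_from_layer(zip_path: Path, extract_root: Path):
    """Extract the layer and import the package via `<extract_root>/python`."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(extract_root)
    # Assumption: the runtime adds the layer's `python/` directory to sys.path.
    sys.path.insert(0, str(extract_root / "python"))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(PACKAGE)
    finally:
        sys.path.pop(0)


with tempfile.TemporaryDirectory() as tmp:
    layer_zip = Path(tmp) / "layer.zip"
    build_layer_zip(layer_zip)
    module = import_from_layer(layer_zip, Path(tmp) / "opt")
    print(module.GREETING)
```

The function package itself never contains the dependency; it only needs the extracted `python/` directory to be importable at run-time.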
huonw
added a commit
that referenced
this issue
May 30, 2023
This adjusts the AWS Lambda and Google Cloud Function documentation for the new Zip layout, added in #19076 and targeted for 2.17. This PR is just what's required for 2.17, ready to cherry-pick. The "Migrating" section is written with this in mind. It will require adjustment for 2.18 to reflect the change in defaults, and, hopefully, support for AWS Lambda Layers (#18880, #19123). Fixes #19067
huonw
added a commit
to huonw/pants
that referenced
this issue
May 30, 2023
huonw
added a commit
that referenced
this issue
May 31, 2023
huonw
added a commit
that referenced
this issue
Jun 16, 2023
…9314)

This allows the `runtime` argument to `python_aws_lambda_function`, `python_aws_lambda_layer` and `python_google_cloud_function` to be inferred from the relevant interpreter constraints, when they cover only one major/minor version. For instance, having `==3.9.*` will infer `runtime="python3.9"` for AWS Lambda.

The inference is powered by checking for two patterns of interpreter constraints that limit to a single major version: equality `==3.9.*` (implies 3.9) and range `>=3.10,<3.11` (implies 3.10). This inference doesn't always work: when it doesn't work, the user gets an error message to clarify by providing the `runtime` field explicitly. Failure cases:

- if the interpreter constraints are too wide (for instance, `>=3.7,<3.9` covering 2 versions, or `>=3.11` that'll eventually include many versions), we can't be sure which is meant
- if the interpreter constraints limit the patch versions (for instance, `==3.8.9` matching a specific version, or `==3.9.*,!=3.9.10` excluding one), we can't be sure the cloud environment runs that version, so inferring the runtime would be misleading
- if the interpreter constraints are non-obvious (for instance, `>=3.7,<3.10,!=3.9.*` is technically 3.8 only), we don't try _too_ hard to handle it. We can expand the inference if required in future.

For instance, if one has set `[python].interpreter_constraints = ["==3.9.*"]` in `pants.toml`, one can build a lambda artefact like (and similarly for a GCF artifact):

```python
python_sources()

python_aws_lambda_function(name="func", entry_point="./foo.py:handler")
```

This is the final piece* of my work to improve the FaaS backends in Pants 2.18:

- using the simpler "zip" layout as recommended by AWS and GCF, deprecating Lambdex (#18879)
- support for AWS Lambda layers (#18880)
- Pants-provided complete platforms JSON files* when specifying a known `runtime` (#18195)
- this PR, inferring the `runtime` from ICs, when unambiguous (including using the new Pants-provided complete platform when available) (#19304)

(* The fixed complete platform files are currently only provided for AWS Lambda, not GCF. #18195.)

The commits are individually reviewable.

Fixes #19304
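The two recognised patterns can be sketched with a pair of regular expressions. This is a simplified illustration over a single constraint string, not the code in the PR: `infer_runtime` and the regex names are hypothetical, and returning `None` stands in for the error the user would see.

```python
import re
from typing import Optional

# Equality pattern, e.g. `==3.9.*`
_EQUALITY = re.compile(r"^==(\d+)\.(\d+)\.\*$")
# Range pattern, e.g. `>=3.10,<3.11`
_RANGE = re.compile(r"^>=(\d+)\.(\d+),<(\d+)\.(\d+)$")


def infer_runtime(constraint: str) -> Optional[str]:
    """Return e.g. "python3.9" when the constraint pins one minor version, else None."""
    normalized = constraint.replace(" ", "")

    match = _EQUALITY.match(normalized)
    if match:
        return f"python{match.group(1)}.{match.group(2)}"

    match = _RANGE.match(normalized)
    if match:
        lo_major, lo_minor, hi_major, hi_minor = map(int, match.groups())
        # Only a range spanning exactly one minor version is unambiguous.
        if lo_major == hi_major and hi_minor == lo_minor + 1:
            return f"python{lo_major}.{lo_minor}"

    # Too wide, patch-specific, or non-obvious: require an explicit `runtime`.
    return None


for ic in ["==3.9.*", ">=3.10,<3.11", ">=3.7,<3.9", "==3.8.9", ">=3.7,<3.10,!=3.9.*"]:
    print(ic, "->", infer_runtime(ic))
```

Anchoring both patterns to the whole string is what makes the failure cases fall through: any extra clause such as `!=3.9.10` prevents a match, so no runtime is inferred.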
huonw
added a commit
that referenced
this issue
Aug 8, 2023
In yet more follow up to #19123 and #19550 for #18880, this ensures the `PythonAwsLambdaLayerFieldSet` is part of the `PackageFieldSet` union, so that `pants package path/to:some-layer` actually works. I clearly didn't test this properly in #19123 or #19550, but now I have: in a separate repo `PANTS_SOURCE=~/... pants package path/to:some-layer` produces `dist/path.to/some-layer.zip`, with the expected contents. 🎉
Is your feature request related to a problem? Please describe.
AWS Lambda has a concept of a layer, which is a (potentially) shared chunk of dependencies that are loaded into the environment, along with the main deploy package. This allows potential optimisations:
Describe the solution you'd like
Create a new `python_awslambda_layer` target that uses `pex3 venv create --layout=flat-zipped --prefix=python ...` (pex-tool/pex#2140, in 2.1.135) to export deps/src in the format expected: https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-path
Some questions/notes:
- … `handler` at all? Or just require a user to specify `dependencies=["..."]` and `include_requirements`/`include_sources` as desired? I'm thinking just `dependencies`
- … `--layout=flat` to support Avoid Zipping AWS Lambda #18282.
Describe alternatives you've considered
Reusing the `python_awslambda` target with a flag like `format="deployment" | "layer"`, which seems needlessly confusing.
Additional context
N/A