Implement layout="zip" for Lambda/GCF, deprecating lambdex (Cherry-pick of #19076) #19120

Merged
merged 1 commit into from
May 23, 2023

@huonw huonw commented May 23, 2023

This fixes #18879 by allowing the python_awslambda and python_google_cloud_function FaaS artefacts to be generated in "simple" format, using the pex3 venv create --layout=flat-zipped functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the zip contains cowsay/__init__.py etc., rather than .deps/cowsay-....whl. This avoids the need to do the dynamic PEX initialisation/venv creation.
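To make the layout difference concrete, here is a minimal sketch that builds two tiny in-memory archives and compares their top-level entries. The file lists are hypothetical stand-ins (including the `lambda_function.py` handler name and the `example.whl` placeholder), not the contents of a real Pants-built artefact.

```python
import io
import zipfile


def build_zip(paths: list[str]) -> bytes:
    """Build an in-memory zip with empty files at the given paths (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in paths:
            zf.writestr(path, "")
    return buf.getvalue()


def top_level_entries(zip_bytes: bytes) -> set[str]:
    """Return the set of first path components found in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {name.split("/", 1)[0] for name in zf.namelist()}


# Hypothetical file lists; real archives contain many more entries.
flat_zip = build_zip(["cowsay/__init__.py", "lambda_function.py"])
pex_style_zip = build_zip(
    [".bootstrap/pex/__init__.py", ".deps/example.whl/cowsay/__init__.py"]
)

flat_top = top_level_entries(flat_zip)       # importable packages sit at the root
pex_top = top_level_entries(pex_style_zip)   # packages are nested under .deps/
```

In the flat case the runtime can import `cowsay` directly from the archive root, which is why no PEX bootstrapping step is needed.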

This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can:

  • reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up.
  • reduce package size somewhat (the PEX .bootstrap/ folder seems to be about 2MB uncompressed, ~1MB compressed).
  • increase build times.

For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size | 24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the Setting up handler (lambdex) or Build python_awslambda (pex3 venv create) process, computed by running pants --keep-sandboxes=always package ... for each layout, and then hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh. This doesn't include the time to construct the input PEX, which is the same for both.)
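As a quick check of the deltas quoted in the table, the arithmetic can be reproduced as below. This is only a sketch: the dictionary keys are made-up names, and the cold-start ranges are collapsed to their midpoints, which is an assumption rather than something measured.

```python
# Figures copied from the table above (one Python 3.9 Lambda).
# Cold-start ranges are reduced to midpoints for illustration.
before = {
    "cold_start_s": (2.3 + 2.5) / 2,
    "compressed_mb": 24.6,
    "uncompressed_mb": 117.8,
    "postprocess_s": 0.14,
}
after = {
    "cold_start_s": (1.3 + 1.4) / 2,
    "compressed_mb": 23.8,
    "uncompressed_mb": 115.8,
    "postprocess_s": 4.8,
}

# Negative values are savings; positive values are extra cost.
deltas = {key: round(after[key] - before[key], 2) for key in before}

for key, delta in deltas.items():
    print(f"{key}: {delta:+.2f}")
```

The trade is roughly one second saved on every cold start in exchange for a few extra seconds of post-processing per build.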


This functionality is driven by adding a new value to the [lambdex].layout option added in #19074. In #19074 (targeted for 2.17), it defaults to lambdex (retaining the current code paths). This PR flips the default to the new value zip, which enables the functionality above. I've tried to keep the non-lambdex implementation generally separate from the lambdex one, rather than reusing all of the code that currently happens to be common, because it would make sense to deprecate and remove the lambdex functionality, so I feel it's best for this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently.

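I think this is an acceptable MVP for this functionality, but there are various bits of follow-up:

  • add a warning about files being loaded into these packages, which has been temporarily lost (#19027)
  • adjust documentation (#19067)
  • other improvements like #18195 and #18880
  • improve performance, e.g. potentially pex3 venv create ... could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first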

This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be:

| release | supports lambdex? | supports zip? | default layout | deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
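The phasing can be read as a small decision function. The sketch below is purely illustrative: `layout_warning` is a hypothetical helper, not code from Pants, and it assumes that an explicitly configured `zip` never triggers a warning, which the table does not state.

```python
from typing import Optional


def layout_warning(release: str, configured_layout: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return the deprecation message for a release, or None.

    `configured_layout` is None when the user has not set [lambdex].layout.
    """
    if release == "2.17":
        if configured_layout is None:
            # Implicit default (lambdex): ask the user to choose explicitly.
            return 'set [lambdex].layout explicitly: "zip" is recommended, "lambdex" is still allowed'
        return None
    if release == "2.18":
        if configured_layout == "lambdex":
            return 'remove layout = "lambdex" and switch to "zip"'
        return None
    if release == "2.19":
        return None  # migration over, nothing left to warn about
    raise ValueError(f"release not covered by the phasing table: {release}")


# Example: an unconfigured repo on 2.17 gets a nudge, but not on 2.18.
print(layout_warning("2.17", None))
print(layout_warning("2.18", None))
```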

…d#19076)

This fixes pantsbuild#18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new value to the
`[lambdex].layout` option added in pantsbuild#19074. In pantsbuild#19074 (targeted for
2.17), it defaults to `lambdex` (retaining the current code paths). This PR
flips the default to the new value `zip`, which enables the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (pantsbuild#19027)
- adjust documentation pantsbuild#19067
- other improvements like pantsbuild#18195 and pantsbuild#18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of pantsbuild#19022 with a simpler approach to deprecation, as
discussed in
pantsbuild#19074 (comment)
and
pantsbuild#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout | deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
@huonw huonw merged commit 3453187 into pantsbuild:2.17.x May 23, 2023
@huonw huonw deleted the cherry-pick-19076-to-2.17.x branch May 23, 2023 22:24