Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image.add_local_python_source #2574

Merged
merged 5 commits into from
Dec 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 0 additions & 6 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,12 +67,6 @@ New minor client version `0.67.x` comes with an internal data model change for h



### 0.66.29 (2024-11-21)

* Adds `Image.add_local_python_packages` which works similarly to `Mount.from_local_python_packages` but for images.



### 0.66.12 (2024-11-19)

`Sandbox.exec` now accepts arguments `text` and `bufsize` for streaming output, which controls text output and line buffering.
Expand Down
21 changes: 13 additions & 8 deletions modal/image.py
Original file line number Diff line number Diff line change
Expand Up @@ -677,13 +677,13 @@ def build_dockerfile(version: ImageBuilderVersion) -> DockerfileSpec:
context_mount=mount,
)

def _add_local_python_packages(self, *packages: str, copy: bool = False) -> "_Image":
"""Adds Python package files to containers
def add_local_python_source(self, *modules: str, copy: bool = False) -> "_Image":
"""Adds locally available Python packages/modules to containers

Adds all files from the specified Python packages to containers running the Image.
Adds all files from the specified Python package or module to containers running the Image.

Packages are added to the `/root` directory of containers, which is on the `PYTHONPATH`
of any executed Modal Functions.
of any executed Modal Functions, enabling import of the module by that name.

By default (`copy=False`), the files are added to containers on startup and are not built into the actual Image,
which speeds up deployment.
Expand All @@ -693,9 +693,14 @@ def _add_local_python_packages(self, *packages: str, copy: bool = False) -> "_Im
required if you want to run additional build steps after this one.

**Note:** This excludes all dot-prefixed subdirectories or files and all `.pyc`/`__pycache__` files.
To add full directories with finer control, use `.add_local_dir()` instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wonder if we should change the default condition here to include only .py files, or maybe only directories containing .py files or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to it! We need to add easy to override filters for users though - @op just ran into issues with this and resorted back to copy_mount(Mount...) to get around the limitations :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just changed this to only use .py-files. @kramstrom is working on docker glob parsing, and as a followup we could add some sdk-level globbing/include/exclude syntax too to make condition creation simpler

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used something like this:

def allow_globs(*patterns: str):
    def condition(file_path: str) -> bool:
        path = Path(file_path).relative_to(repo_root)
        for pattern in patterns:
            if fnmatch(str(path), pattern):
                return True
        return False

    return condition


modal_mount = Mount.from_local_dir(
    ".",
    condition=allow_globs(
        "Cargo.toml",
        "ffi/**/*.rs",
        "crates/**/*.rs",
        "crates/**/*.proto",
        "crates/**/*.sql",
        "**/Cargo.toml",
        "**/requirements.dev.txt",
    ),
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget that we now also have a dockerignore-compliant pattern matcher in our library: https://github.com/modal-labs/modal-client/blob/main/modal/_utils/pattern_matcher.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow I wasn't aware of this! @kramstrom this sounds very relevant to the stuff you're working on!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we'd just held off on figuring out exactly what the API should be (i.e. are we using it for exclusion or inclusion or both? do you pass a list of string patterns or a file? etc.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pitched to Kasper that maybe the easiest for users would to have it always specify inclusion, but have "set arithmetic" operators (+ and - for simplicity) that lets you combine them in any way you please in order to build complex include/excludes.

E.g.

expr = files("**/*.py", "*.data") - files("bigblob.data")

To add full directories with finer control, use `.add_local_dir()` instead and specify `/root` as
the destination directory.
"""
mount = _Mount.from_local_python_packages(*packages)

def only_py_files(filename):
return filename.endswith(".py")

mount = _Mount.from_local_python_packages(*modules, condition=only_py_files)
return self._add_mount_layer_or_copy(mount, copy=copy)

def copy_local_dir(self, local_path: Union[str, Path], remote_path: Union[str, Path] = ".") -> "_Image":
Expand Down Expand Up @@ -1005,8 +1010,8 @@ def poetry_install_from_file(
If not provided as argument the path to the lockfile is inferred. However, the
file has to exist, unless `ignore_lockfile` is set to `True`.

Note that the root project of the poetry project is not installed,
only the dependencies. For including local packages see `modal.Mount.from_local_python_packages`
Note that the root project of the poetry project is not installed, only the dependencies.
For including local python source files see `add_local_python_source`
"""

def build_dockerfile(version: ImageBuilderVersion) -> DockerfileSpec:
Expand Down
18 changes: 9 additions & 9 deletions test/image_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -1311,13 +1311,13 @@ def hydrate_image(img, client):

def test_add_local_lazy_vs_copy(client, servicer, set_env_client, supports_on_path):
deb = Image.debian_slim()
image_with_mount = deb._add_local_python_packages("pkg_a")
image_with_mount = deb.add_local_python_source("pkg_a")

hydrate_image(image_with_mount, client)
assert image_with_mount.object_id == deb.object_id
assert len(image_with_mount._mount_layers) == 1

image_additional_mount = image_with_mount._add_local_python_packages("pkg_b")
image_additional_mount = image_with_mount.add_local_python_source("pkg_b")
hydrate_image(image_additional_mount, client)
assert len(image_additional_mount._mount_layers) == 2 # another mount added to lazy layer
assert len(image_with_mount._mount_layers) == 1 # original image should not be affected
Expand All @@ -1328,12 +1328,12 @@ def test_add_local_lazy_vs_copy(client, servicer, set_env_client, supports_on_pa
# error about using non-copy add commands before other build steps
hydrate_image(image_non_mount, client)

image_with_copy = deb._add_local_python_packages("pkg_a", copy=True)
image_with_copy = deb.add_local_python_source("pkg_a", copy=True)
hydrate_image(image_with_copy, client)
assert len(image_with_copy._mount_layers) == 0

# do the same exact image using copy=True
image_with_copy_and_commands = deb._add_local_python_packages("pkg_a", copy=True).run_commands("echo 'hello'")
image_with_copy_and_commands = deb.add_local_python_source("pkg_a", copy=True).run_commands("echo 'hello'")
hydrate_image(image_with_copy_and_commands, client)
assert len(image_with_copy_and_commands._mount_layers) == 0

Expand All @@ -1351,7 +1351,7 @@ def test_add_local_lazy_vs_copy(client, servicer, set_env_client, supports_on_pa

def test_add_locals_are_attached_to_functions(servicer, client, supports_on_path):
deb_slim = Image.debian_slim()
img = deb_slim._add_local_python_packages("pkg_a")
img = deb_slim.add_local_python_source("pkg_a")
app = App("my-app")
control_fun: modal.Function = app.function(serialized=True, image=deb_slim, name="control")(
dummy
Expand All @@ -1368,7 +1368,7 @@ def test_add_locals_are_attached_to_functions(servicer, client, supports_on_path

def test_add_locals_are_attached_to_classes(servicer, client, supports_on_path, set_env_client):
deb_slim = Image.debian_slim()
img = deb_slim._add_local_python_packages("pkg_a")
img = deb_slim.add_local_python_source("pkg_a")
app = App("my-app")
control_fun: modal.Function = app.function(serialized=True, image=deb_slim, name="control")(
dummy
Expand Down Expand Up @@ -1399,7 +1399,7 @@ class A:
@skip_windows("servicer sandbox implementation not working on windows")
def test_add_locals_are_attached_to_sandboxes(servicer, client, supports_on_path):
deb_slim = Image.debian_slim()
img = deb_slim._add_local_python_packages("pkg_a")
img = deb_slim.add_local_python_source("pkg_a")
app = App("my-app")
with app.run(client=client):
modal.Sandbox.create(image=img, app=app, client=client)
Expand All @@ -1418,15 +1418,15 @@ def empty_fun():

def test_add_locals_build_function(servicer, client, supports_on_path):
deb_slim = Image.debian_slim()
img = deb_slim._add_local_python_packages("pkg_a")
img = deb_slim.add_local_python_source("pkg_a")
img_with_build_function = img.run_function(empty_fun)
with pytest.raises(InvalidError):
# build functions could still potentially rewrite mount contents,
# so we still require them to use copy=True
# TODO(elias): what if someone wants do use an equivalent of `run_function(..., mounts=[...]) ?
hydrate_image(img_with_build_function, client)

img_with_copy = deb_slim._add_local_python_packages("pkg_a", copy=True)
img_with_copy = deb_slim.add_local_python_source("pkg_a", copy=True)
hydrate_image(img_with_copy, client) # this is fine


Expand Down
2 changes: 1 addition & 1 deletion test/watcher_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ def dummy():

def test_add_local_mount_included_in_serve_watchers(servicer, client, supports_on_path, disable_auto_mount):
deb_slim = modal.Image.debian_slim()
img = deb_slim._add_local_python_packages("pkg_a")
img = deb_slim.add_local_python_source("pkg_a")
app = modal.App()

@app.function(serialized=True, image=img)
Expand Down
Loading