Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image.add_local_python_source #2574

Merged
merged 5 commits into from
Dec 5, 2024
Merged

Image.add_local_python_source #2574

merged 5 commits into from
Dec 5, 2024

Conversation

freider
Copy link
Contributor

@freider freider commented Nov 26, 2024

Reshape _add_local_python_packages into add_local_python_dir and to make it "public".

Won't merge until feedback has been collected and docs updated

Changelog

  • Adds Image.add_local_python_source which works similarly to the old and soon-to-be-deprecated Mount.from_local_python_packages but for images. One notable difference is that the new add_local_python_source only includes .py-files by default

@freider freider changed the title Freider/add local python dir Add local python dir Nov 26, 2024
modal/image.py Outdated
@@ -666,13 +666,13 @@ def build_dockerfile(version: ImageBuilderVersion) -> DockerfileSpec:
context_mount=mount,
)

def _add_local_python_packages(self, *packages: Union[str, Path], copy: bool = False) -> "_Image":
def add_local_python_dir(self, module_name: str, *, copy: bool = False) -> "_Image":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just being annoying but is it a problem that "module" technically corresponds to a single file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely a problem... The naming of this function and argument is super hard :(

A single file module could also be considered a package right, if it's the only top level entry of a project. So maybe add_local_python_package(+-s) is good after all? But it's still potentially confusing since it doesn't "install" anything...

@@ -682,9 +682,10 @@ def _add_local_python_packages(self, *packages: Union[str, Path], copy: bool = F
required if you want to run additional build steps after this one.

**Note:** This excludes all dot-prefixed subdirectories or files and all `.pyc`/`__pycache__` files.
To add full directories with finer control, use `.add_local_dir()` instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still wonder if we should change the default condition here to include only .py files, or maybe only directories containing .py files or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to it! We need to add easy to override filters for users though - @op just ran into issues with this and resorted back to copy_mount(Mount...) to get around the limitations :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just changed this to only use .py-files. @kramstrom is working on docker glob parsing, and as a followup we could add some sdk-level globbing/include/exclude syntax too to make condition creation simpler

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used something like this:

def allow_globs(*patterns: str):
    def condition(file_path: str) -> bool:
        path = Path(file_path).relative_to(repo_root)
        for pattern in patterns:
            if fnmatch(str(path), pattern):
                return True
        return False

    return condition


modal_mount = Mount.from_local_dir(
    ".",
    condition=allow_globs(
        "Cargo.toml",
        "ffi/**/*.rs",
        "crates/**/*.rs",
        "crates/**/*.proto",
        "crates/**/*.sql",
        "**/Cargo.toml",
        "**/requirements.dev.txt",
    ),
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget that we now also have a dockerignore-compliant pattern matcher in our library: https://github.com/modal-labs/modal-client/blob/main/modal/_utils/pattern_matcher.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow I wasn't aware of this! @kramstrom this sounds very relevant to the stuff you're working on!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we'd just held off on figuring out exactly what the API should be (i.e. are we using it for exclusion or inclusion or both? do you pass a list of string patterns or a file? etc.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pitched to Kasper that maybe the easiest for users would to have it always specify inclusion, but have "set arithmetic" operators (+ and - for simplicity) that lets you combine them in any way you please in order to build complex include/excludes.

E.g.

expr = files("**/*.py", "*.data") - files("bigblob.data")

modal/image.py Outdated
Note that the root project of the poetry project is not installed,
only the dependencies. For including local packages see `modal.Mount.from_local_python_packages`
Note that the root project of the poetry project is not installed, only the dependencies.
For including local packages see `add_local_python_dir`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is confusing since usually the root project would be what gets included through the modal source mount right? Or maybe not if we start disabling automount.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends. If this is a separate python project from the current one, it wouldn't be included by automounts unless it's imported in global scope locally. If it's the project that the modal app's module is included in, it would be automounted if the modal app is is deployed/run via module path syntax, but not necessarily if it's included via file/path.py.

All of these inconsistencies would get somewhat fixed by removing automount except for the function file itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the expense of having to be a bit more explicit whenever your modal function references other local files 😬

@freider freider changed the title Add local python dir Image.add_local_python_sources Dec 3, 2024
@freider freider changed the title Image.add_local_python_sources Image.add_local_python_source Dec 5, 2024
@freider freider requested a review from mwaskom December 5, 2024 14:22
@freider
Copy link
Contributor Author

freider commented Dec 5, 2024

@prbot approve

Copy link

@modal-pr-review-automation modal-pr-review-automation bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved 👍. @mwaskom will follow-up review this.

@freider freider merged commit 0aa0977 into main Dec 5, 2024
22 checks passed
@freider freider deleted the freider/add-local-python-dir branch December 5, 2024 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants