-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial support for environment.yml files #457
base: main
Are you sure you want to change the base?
Conversation
c6cca06
to
5e5455e
Compare
2de6d56
to
88c9ba0
Compare
c5325d9
to
7c2b9a2
Compare
a72e423
to
02c28ba
Compare
9f73f59
to
c70fc36
Compare
4ede02f
to
de059c7
Compare
c70fc36
to
f5460cb
Compare
de059c7
to
838231f
Compare
f5460cb
to
63dd427
Compare
838231f
to
fc39cf4
Compare
63dd427
to
2ed9c81
Compare
fc39cf4
to
4da7004
Compare
2ed9c81
to
aa5ca07
Compare
4da7004
to
09e43d8
Compare
aa5ca07
to
d89f6bd
Compare
09e43d8
to
d76d45e
Compare
d89f6bd
to
ea9e35d
Compare
d76d45e
to
8198e99
Compare
ea9e35d
to
466baa9
Compare
8198e99
to
f773397
Compare
466baa9
to
8dc8c89
Compare
8dc8c89
to
3265f9c
Compare
fawltydeps/extract_deps.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks lighter that way!
What do you think about the following code organizing pattern:
└── extract_deps
├── __init__.py
├── pixi_parser.py
├── pyproject_parser.py
├── requirements_parser.py
└── setup_parser.py
The content of the current extract_deps.py
would be in __init__.py
.
We would improve the logical structure of the package with purpose segregation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this idea. And finally implemented in the latest rebase. 😄
fawltydeps/extract_declared_dependencies.py has been growing longer, and is about to grow even longer with the introduction of parsing environment.yml files. Now is the time to split this up into a subpackage containing multiple modules that each focus on one particular (group of) dependency declaration formats. Also, take the opportunity to rename extract_declared_depenedencies into extract_deps: We use the term "deps" everywhere (including in our project name), so there is no need to spell out "dependencies" in all of these filenames. No functional changes in this commit. It's all just moving code between files, and fixing up associated references accordingly.
This is needed to start parsing environment.yml files from the Conda ecosystem. v6.0.1 is the last version to support Python v3.7, so this is our minmum version for now. Adding a yaml dependency to FawltyDeps makes it available inside the Python environment that is used by the packages.SysPathPackageResolver. This changes the behavior of a couple of real_projects tests, where we previously expected yaml/PyYAML to be reported as undeclared/unused when no Python environment with PyYAML installed was present. Now, we can assume that PyYAML -> yaml is always correctly mapped via sys.path. Also, a potential future issue to be solved is made apparent by our need to declare types-PyYAML in ignore_unused: I suspect the "types-$PACKAGE" form is quite common, and should be handled in the same way as "-stubs" suffixes inside Package.has_type_stubs().
Include support for parsing Pip dependencies embedded within environment.yml files, like this: ```yaml name: example dependencies: - jupyterlab=1.0 - pip=19.1 - pip: - kaggle==1.5 - yellowbrick==0.9 ``` Also add test cases, mirroring the structure we already use to test our pyproject.toml and pixi.toml parsers.
In tests/test_deps_parser_determination.py, test_explicit_parse_strategy__mismatch_yields_appropriate_logging() tests what happens when the --deps-parser-choice option is (ab)used to parse a deps file - e.g. setup.cfg - with the "wrong" parser - e.g. parse_setup_py(). The test verifies that we print a warning message when such mismatch occurs. However, the test also _assumes_ that when no mismatch is present, the empty file being passed to the appropriate parser would yield zero warning messages. Although this happens to be true for our existing parsers, we are about to add a new parser for environment.yml files, where an empty file is _not_ silently accepted, but rather produces this error message: Failed to parse path/to/environment.yml: No top-level mapping found! I believe the implicit assumption in this test is wrong: an empty file could very well produce warnings/errors, and we should instead make sure that the files being passed in the test are actually formatted according to their filenames (e.g. a setup.cfg file should be formatted as a valid setup.cfg file). Fortunately, we already have the fake_project() fixture which is able to produce valid deps file in all our supported formats, so this is fairly straightforward to fix. After fixing this, however, I discover a new issue: When passing e.g. a valid setup.cfg file to the parse_setup_py() function, it (obviously, in hindsight) fails with a SyntaxError: parse_setup_py() expects to find valid Python code, and the setup.cfg format is certainly not that. (The same goes for other file formats passed to parse_setup_py().) This SyntaxError is NOT caught by the parser, and causes FawltyDeps to abort. The fix is straightforward - catch the exception and print an error mesage - and the end result is that we now subject our parsers to a bit more scrutiny: - When there is a mismatch between the chosen parser and the given file, we verify that a warning about this mismatch is printed (along with any other warnings/errors that result from applying the "wrong" parser to this file). - When there is no mismatch between the chosen parser and the given file, we verify that a correctly formatted file (courtesy of fake_project()) generates no warnings/errors at all.
Now that we can parse dependency declarations from environment.yml files, we need to expose this functionality in our docs + CLI, as well as automatically find environment.yml files while traversing projects.
This was created with the following commands: - conda create --name conda_example python=3.12 - conda activate conda_example - conda install requests - conda env export --from-history > environment.yml - echo "import requests" > main.py followed by copying the corresponding Conda environment from ~/.conda/envs/conda_example into this sample project directory, and then removing/truncating files that I do not currently consider to be important for FawltyDeps to make sense of this project. The result is a simple Conda project with two declared Conda dependencies (Python itself + requests), along with a single source file that imports requests. The expected result of running FawltyDeps here is to report no undeclared or unused dependencies. Please note that since Conda puts environments under ~/.conda, and NOT inside the project directory, this example is somewhat contrived: We can not expect the Conda environment to be automatically discovered by FawltyDeps (unless FD itself is installed and running from there), but we must instead rely on the use of --pyenv to point to ~/.conda/envs/conda_example (in the general case, or tests/sample_project/conda_example/.conda/envs/conda_example in this test case)
3265f9c
to
31ec342
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jherland , this is a great improvement on both, the code readability side and on the features side.
environment.yml was a declaration file asked by the users to be supported ❤️
✔️ I checked the changes, looked if there are no places left with the old code structure references.
🛑 Then I ran the newest version on a randomly chosen project that uses environment.yaml files to declare dependencies. Parsed names are trailed with >
ehn there was some version specification of a package. To recreate, run in Cartopy project:
fawltydeps environment.yml --list-deps
The output I got is:
cython>
matplotlib-base>
numpy>
owslib>
packaging>
pillow>
pre-commit
pydata-sphinx-theme
pykdtree
pyproj>
pyshp>
pytest
pytest-mpl
pytest-xdist
ruff
scipy>
setuptools_scm
shapely>
sphinx
sphinx-gallery
Could you check if this is something we will not support for now or a parsing bug?
The code for extracting the package name from a Conda dependency specifier was overly simplistic: it assumed that any version/build info in the string always started with a "=" character. This is obviously incorrect (e.g. "numpy>=1.23" is a simple counter-example). Fix the parser to look for the same characters that the upstream looks for, when separating package name from version + build info (see https://github.com/conda/conda/blob/aa0fb6f3ae669a5ade575d340555aa6a9f71ef5e/conda/models/match_spec.py#L716 for details). Add a test case copied from the cartopy project to illustrate the correct use of such specifiers, and also add a couple of tests to verify that we properly deal with invalid package names in dependency specifiers (both Conda and Pip) inside an environment.yml file.
af03b40
to
4e75de3
Compare
Thanks for finding this! I had overlooked the proper parsing of the package name from the version + build info in a Conda dependency specifier, and I obviously also had no test case to catch this. Commit 4e75de3 should fix it: I added the With this added, I also needed to properly handle the case when a package name is missing or malformed. I added a couple of test cases for this, and made sure that we raise and handle the appropriate exceptions, so that we can print an error message about the faulty dependency and then keep on parsing the rest of the dependencies. |
This adds support for Conda projects that use environment.yml for declaring dependencies.
NOTE: The same Conda caveat as for Pixi still applies: We do not currently differentiate between Conda dependencies and PyPI dependencies, meaning that we assume that a Conda dependency named FOO will map one-to-one to a Python package named FOO. This is certainly not true for Conda dependencies that are not Python packages, and it probably isn't even true for all Conda dependencies that do indeed include Python packages.
Commits:
environment.yml
filesenvironment.yml
support to the rest of FawltyDepssample_projects
: Add example Conda projecttest_traverse_project
: Add tests using the Conda sample project