warn if dependencies file does not exist #33

Merged
merged 4 commits into from
May 15, 2024

jameslamb
Member

@jameslamb jameslamb commented May 10, 2024

Description

Contributes to rapidsai/build-planning#31.

Related to #24.

This proposes the following changes:

  • if the file indicated by dependencies-file does not exist, emit a warning

Notes for Reviewers

Why add this warning?

While testing rapidsai/cudf#15245, I observed rapids-build-backend==0.1.1 failing to update the build dependencies of wheels.

This showed up with errors like the following:

ERROR: Could not find a version that satisfies the requirement rmm==24.6.* (from versions: 0.0.1)

It took me maybe an hour to figure out what was going on.... I wasn't setting dependencies-file in the [tool.rapids-build-backend] configuration.

cudf is laid out like this:

cudf/
|___python/
       |_____cudf/
                  |_________pyproject.toml
       |_____dask_cudf/
                  |_________pyproject.toml

Not setting dependencies-file meant that rapids-build-backend defaulted to "dependencies.yaml" (e.g. cudf/python/cudf/dependencies.yaml).

That file doesn't exist, and so it just silently skipped all dependencies.yaml-related stuff via this:

try:
    parsed_config = rapids_dependency_file_generator.load_config_from_file(
        config.dependencies_file
    )
except FileNotFoundError:
    parsed_config = None

With the warning added in this PR, I would have at least had a hint from CI logs pointing me at the potential problem.
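A minimal sketch of the behavior described above, not the code merged in this PR. `load_config_or_warn`, its `loader` parameter, and the warning text are hypothetical; the stand-in loader replaces `rapids_dependency_file_generator.load_config_from_file` only so that the example runs on its own.

```python
import warnings
from pathlib import Path


def _read_text_loader(path):
    # Stand-in for the real loader (an assumption for this sketch):
    # like the real one, it raises FileNotFoundError if the file is missing.
    return Path(path).read_text()


def load_config_or_warn(dependencies_file, loader=_read_text_loader):
    """Return the loaded config, or None (with a warning) if the file is missing."""
    try:
        return loader(dependencies_file)
    except FileNotFoundError:
        # A missing file is still tolerated, but it is no longer silent.
        warnings.warn(
            f"File not found: '{dependencies_file}'. "
            "Set 'dependencies-file' in [tool.rapids-build-backend] "
            "to a path that exists if dependencies should be updated.",
            stacklevel=2,
        )
        return None
```

With something like this in place, a misconfigured path would show up in CI logs as a warning, while the build itself continues as before.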

Why change the default?

edited: removed (click me)

"./dependencies.yaml" (e.g., dependencies.yaml located next to the wheel's pyproject.toml) is not a working default for any RAPIDS project, as far as I know.

In my experience, most of them are either laid out like cuml:

cuml/
|____dependencies.yaml
|____python/
         |_____pyproject.toml

Or like rmm:

rmm/
|____dependencies.yaml
|____python/
         |_____librmm
                    |_____pyproject.toml
         |_____rmm
                    |______pyproject.toml

See https://github.com/search?q=%22%5Bproject%5D%22+org%3Arapidsai+path%3Apyproject.toml&type=code.

Changing to ../dependencies.yaml would at least automatically work for some projects (like cuml) with minimal risk of reaching up outside of the repo to some other project's dependencies.yaml (which would be more of a risk with, say, ../../dependencies.yaml).
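To make the comparison concrete, here is a rough sketch that checks which candidate default would find a file in a cuml-style layout. It assumes the path is resolved relative to the directory containing pyproject.toml; `resolve_dependencies_file` is a hypothetical helper, and the layout is created in a temporary directory.

```python
import tempfile
from pathlib import Path


def resolve_dependencies_file(project_dir, dependencies_file):
    """Resolve dependencies-file relative to the directory holding pyproject.toml."""
    return (Path(project_dir) / dependencies_file).resolve()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    # cuml-style layout: dependencies.yaml at the repo root, pyproject.toml in python/
    project_dir = root / "cuml" / "python"
    project_dir.mkdir(parents=True)
    (root / "cuml" / "dependencies.yaml").touch()
    (project_dir / "pyproject.toml").touch()

    candidates = ("dependencies.yaml", "../dependencies.yaml", "../../dependencies.yaml")
    # Map each candidate default to whether it points at an existing file.
    found = {c: resolve_dependencies_file(project_dir, c).is_file() for c in candidates}

print(found)
```

In this layout only "../dependencies.yaml" resolves to an existing file; "../../dependencies.yaml" already points outside the repository.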

Could we just make this a hard error instead of a warning?

I think we should!

I'd personally support being even stricter at this point in development and making an existing dependencies.yaml file a hard requirement for running rapids-build-backend, meaning:

  • no default value for dependencies-file
  • loud runtime error if the file pointed to by dependencies-file doesn't exist

If we can't think of an existing case that requires running this without a dependencies.yaml, then I think we should implement this stricter interface. Making something more flexible later is always easier than making it stricter, and this strictness right now will save some developer time (e.g. would have saved me some time working on cudf today).
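A sketch of what that stricter interface might look like, under the assumption that the [tool.rapids-build-backend] table is available as a plain dict. `require_dependencies_file` and the error messages are hypothetical and are not part of rapids-build-backend.

```python
from pathlib import Path


def require_dependencies_file(tool_config, project_dir="."):
    """Return the configured path, raising if it is unset or does not exist."""
    dependencies_file = tool_config.get("dependencies-file")
    if dependencies_file is None:
        # No default value: the project must say where its dependencies file lives.
        raise ValueError(
            "'dependencies-file' must be set in [tool.rapids-build-backend]"
        )
    path = Path(project_dir) / dependencies_file
    if not path.is_file():
        # Loud failure instead of silently skipping the dependency update.
        raise FileNotFoundError(
            f"'dependencies-file' points to a file that does not exist: {path}"
        )
    return path
```

Loosening this later (for example, by reintroducing a default or downgrading the error to a warning) would not break any project that already passes the strict check.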

@jameslamb jameslamb added the "breaking" (Introduces a breaking change) and "improvement" (Improves an existing functionality) labels May 10, 2024
@jameslamb jameslamb requested a review from a team as a code owner May 10, 2024 22:28
@jameslamb
Member Author

@vyasr @KyleFromNVIDIA @bdice I think it'd be worth talking through this synchronously next week. That will probably go faster than using GitHub comments.

@vyasr
Contributor

vyasr commented May 10, 2024

Sure, happy to link you to some relevant information. The tl;dr is I'd be fine with warning, but I don't think we want to hard error.

@jameslamb jameslamb changed the title from "warn if dependencies file does not exist, change default for 'dependencies-file'" to "warn if dependencies file does not exist" May 13, 2024
@jameslamb
Member Author

Thanks, I see now from the links you shared that having this try-catch was an explicit design choice.

Specifically this quote:

...if dependencies.yaml is not present, then do nothing. ...[if it encounters an] exception that dependencies.yaml cannot be found then we skip on updating dependencies but continue the build

from #17 (comment)

I wasn't involved in that conversation, didn't know about that. Sorry for bringing up something that's already covered.

I do still think at least the warning would be nice to have at this stage of development, where rapids-build-backend is only being used to build wheels, not sdists. I think we should continue with this PR to add this warning, to make this current phase of development a bit easier, then maybe rip it out in a future where we have sdists or some other specific use-case where dependencies.yaml not existing is acceptable.

I've changed this PR in the following ways:

  • removed the stuff about changing the dependencies.yaml default location
  • added a comment on this try-catch around dependencies.yaml explaining that it's intentional

Contributor

@KyleFromNVIDIA KyleFromNVIDIA left a comment

I saw you changed the default location but then changed it back - I presume this is because it broke the tests. I think the new default location that you proposed is better, and we should change the tests to match.

Comment on lines +171 to +172
# "dependencies.yaml" might not exist in sdists and wouldn't need to... so don't
# raise an exception if that file can't be found when this runs
Contributor

I think we should distinguish between "dependencies.yaml doesn't exist because it's an sdist" and "dependencies.yaml doesn't exist because the developer forgot to configure it properly." The former is easy to mark with some sort of marker file or a setting that gets written to pyproject.toml.

Contributor

a setting that gets written to pyproject.toml

Maybe we should have this anyway, instead of assuming that the user doesn't want dependencies.yaml just because they forgot to include it.

Member Author

Other things about this project would need to change to support sdists too.

The relative paths that reach up above pyproject.toml, like this:

[tool.rapids-build-backend]
dependencies-file = "../../dependencies.yaml"

are unlikely to work in an sdist, where I'd expect pyproject.toml to be at the root of the file layout.
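One way to see the problem, as a small sketch: if pyproject.toml sits at the root of the unpacked sdist, a path such as "../../dependencies.yaml" resolves to a location outside that root. `escapes_project_root` is a hypothetical helper for illustration, and the directory names in the usage lines are made up.

```python
from pathlib import Path


def escapes_project_root(project_dir, dependencies_file):
    """Return True if dependencies_file resolves to a location outside project_dir."""
    root = Path(project_dir).resolve()
    target = (root / dependencies_file).resolve()
    # The target is inside the root only if the root is one of its ancestors.
    return target != root and root not in target.parents


# Hypothetical unpacked-sdist directory; it does not need to exist for the check.
print(escapes_project_root("/tmp/unpacked-sdist", "../../dependencies.yaml"))
print(escapes_project_root("/tmp/unpacked-sdist", "dependencies.yaml"))
```

The first call reports that the path escapes the root and the second that it does not, which matches the expectation that only a file shipped alongside pyproject.toml could be found from an sdist.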

This is just another place that's exposing the tension between building from an sdist vs. the file layout in source control (#17 (comment)).

From my perspective, having rapids-build-backend be stricter here (raising an error if the file that dependencies-file points to does not exist) is useful at this stage of development, where we're only building wheels and don't know the exact shape of what building RAPIDS sdists would even look like (#17 (comment)).

But if we don't want to do that, then I think raising this warning is sufficient (to at least help with debugging mistakes like the one I made in cudf) and that we shouldn't try to add configuration to differentiate between the sdist vs misconfiguration distinction. I think any attempt to do that is going to be difficult to review without knowing what we want out of sdist support, and that sdist support isn't a high priority right now.

Contributor

I raised the sdist discussions, but I am also leaning towards us not needing to plan for them right now. I think it's causing additional complexity that isn't merited at this stage. I'd be fine with being stricter now and loosening up restrictions if and when we decide that we want to support sdists. The current state is a bit confusing because the backend does wrap the wrapped backend's build_sdist call correctly right now, which makes it seem like we support sdists, but I don't want to remove that stuff. For now we can leave it in whatever half-supported state is convenient.

Member Author

Ok thanks for that!

I think we should proceed right now with adding this warning, and defer any other changes related to this.

I spent an hour debugging issues in my cudf PR that had the root cause "misconfigured dependencies-file". Having this case be an error would have saved me 59 of those 60 minutes... having this result in a warning would have saved me 58 of those 60 minutes. So the warning gets most of the benefit I wanted out of this discussion, and leaves the library in a state where it's possible to experiment with sdists.

@KyleFromNVIDIA I just updated this to latest main now that you merged #30 and #32. I think we should merge this and then cut a release.

@jameslamb
Member Author

I saw you changed the default location but then changed it back - I presume this is because it broke the tests

I changed it back for 2 reasons:

  1. to reduce the scope of this PR, so the default-location discussion and the what-if-dependencies.yaml-does-not-exist discussion could be decoupled
  2. because I looked back at https://github.com/search?q=%22%5Bproject%5D%22+org%3Arapidsai+path%3Apyproject.toml&type=code and realized I was wrong... there ARE some projects across RAPIDS with dependencies.yaml and pyproject.toml at the same level, like:

@KyleFromNVIDIA
Contributor

there ARE some projects across RAPIDS with dependencies.yaml and pyproject.toml at the same level

Indeed there are, but I think ../dependencies.yaml is more common. OTOH, it's also more error-prone (it reaches outside the directory). I agree that changing it (if we decide to change it at all) is best left for a separate PR. I'm fine either way, whether we change it or not.

@jameslamb jameslamb added the "non-breaking" (Introduces a non-breaking change) label and removed the "breaking" (Introduces a breaking change) label May 15, 2024
@KyleFromNVIDIA KyleFromNVIDIA merged commit 1dd6a83 into rapidsai:main May 15, 2024
7 checks passed
@jameslamb jameslamb deleted the warn-missing-file branch May 15, 2024 20:00
Labels
"improvement" (Improves an existing functionality), "non-breaking" (Introduces a non-breaking change)

3 participants