Special treatment of pre-installed packages by the solver #9669
Comments
Branch which re-enables `--upgrade-dependencies`: […]

It seems sensible to me to choose pre-installed packages, so that you don't have to build them again. If the version constraints don't disallow it, then the solver could choose that install plan anyway. If you really don't want a package to be part of the install plan, then perhaps instead you want a means to instruct the solver to never choose a particular version, as an additional constraint form.
Regardless of the outcome of the discussion, it would make sense to find the PR that commented out `--upgrade-dependencies`.
@mpickering I'm worried about defaults here. I don't think there's an easy way to tell my users that there was a subtle bug in a pre-installed package. What if this is a security bug? What do I do? I have no communication channels.

This is how all Linux distributions work, afaik. "Saving compilation time" seems like a questionable priority, imo (as a default).
@mpickering also says about […]
I agree that `--prefer-newest` (so, no special-casing of the boot packages) would be a cleaner default. Perhaps there should be a `--prefer-installed` to get the current default, if we change the default to `--prefer-newest`?
While I like the default @hasufell proposed better, for the reasons given and also because it's more uniform, I worry it would hit hard those users with big sets of installed packages, which may include Nix users, Linux distribution users and v1-/cabal-env users (whether for teaching purposes or otherwise). So we'd need a good backward-compatibility scheme.
Another solution, which unfortunately increases complication, might be to treat libraries installed together with GHC (and perhaps all libraries installed not by the user directly, but by install/upgrade scripts) specially, and apply the prefer-installed behaviour only to them.

Edit: this somehow agrees with how we treat local packages even if newer versions are on Hackage, local packages being "user-installed" and so automatic upgrades being disabled (I think?).
CCing @Ericson2314 @angerman wrt Nix |
What happens if you deprecate the version with the bug (the installed version)? Will cabal still prefer it?
Re Nix, I would like to have no notion of "preinstalled dependencies" because one should not be "preinstalling" things with Nix. So I like this.
In conclusion, Proud Nix Hater @hasufell has proposed something that I think is actually great for Nix. Thank you! :)
I'm a bit confused. In fact, the way we use nix at work involves "pre-installing" everything, in the sense that everything goes into a package database which nix provides, instead of the cabal store, no? So with the current behavior, if I am building […] (the workaround for this, which is possible but slightly irritating, is to use a cabal.project or the like to disable hackage or any other package repository for packages developed in a nix-provided environment).
All that said, I modestly prefer the current behavior, in part because I'm afraid of changing this sort of stuff given the large and unpredictable effects it may have on many users, and in part because users will expect that if they have a package already installed, it will be used.

I do think an explicit flag for the new behaviour would be a fine addition.
I don't think cabal has any obligation to "not break nix". It's the nix packagers' obligation to keep it working. Changes like the proposed one would be communicated early enough, with a migration period, so that users can adapt and opt out of the changed behavior. This is the same as the v1 vs v2 change that the cabal team executed over several years, except this one seems much less disruptive. I can't see how the current behavior is a sensible default from any angle, if it causes average users to miss bugfixes. It is not safe.
@gbaz What I mean is that in Haskell.nix-style approaches, planning takes place with no (or an empty) package database, ideally even for […].

The current trick of "re-planning" in the Nix shell, hoping it solves for as few not-already-built things as possible, is comparatively gross, and (IIRC) runs into issues when sources are funny (e.g. modified local packages).

(That said, the "already installed" constraint is useful for the above hack, and I imagine also useful for anyone wondering why their boot packages aren't being used under this issue's proposal.)
Speaking as a member of the Haskell Security Response Team: our hope is that cabal-install will be enhanced to directly use the data from the advisory database, and either omit affected packages from build plans by default, or alert users when build plans contain affected packages. This issue poses some good questions, but I don't think the SRT would have an opinion on it one way or the other, given the objective of more explicit cabal-install features/behaviour regarding known security issues.
The issue with forcing the newest possible dependencies is that your library/app might end up with a very different set of dependencies than your tests. If tests involve `doctest`, they have to build against the package versions GHC itself ships with. I think making this behaviour configurable would help.
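For the situation above, where tests must build against the same versions that are already installed, cabal's existing `installed` constraint form can pin selected packages to the versions in the package db. A sketch (the package names chosen here are illustrative):

```cabal
-- cabal.project fragment: force the already-installed (GHC-shipped)
-- versions of specific packages, e.g. when running doctest.
packages: .

constraints:
  filepath installed,
  containers installed
```

This keeps the rest of the plan free while freezing only the packages the test setup is sensitive to.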
What is perplexing for me is the following. Suppose we have `filepath-1.4.2.2` pre-installed (it ships with GHC), while a newer `filepath-1.4.100.1` is available on Hackage.

Question: when installing a package that depends on `filepath`, should cabal pick the pre-installed version or the newer one from Hackage?

I am baffled about why we could possibly want to treat the pre-installed `filepath` differently from any other package.

Which choice is best isn't obvious to me. But I can't see any justification for treating the two differently.
This is mostly correct, with the caveat that cabal treats any package that is in the global package db specially. It just so happens that the packages shipped with GHC end up there.

That, imo, makes it even worse. There are many other mechanisms to avoid cabal rebuilds (e.g. just don't run `cabal update`).
I understand doctest is a special case, but I do not believe this justifies having the current default.
It is the maintainer's responsibility to ensure testing across multiple setups. If they can't do that, then their cabal version bounds are simply wrong or their test suite is sub-par. For anyone wondering, there are two practical solutions to avoid doctest.
I don't want to digress too much on this, but I'm rather surprised by this sentiment and disagree rather strongly (wrt this being enough). I'll put my response in a collapsible section to keep the thread clean. I'm happy to continue that discussion privately or on the security response team issue tracker.

Wrt security updates: To my knowledge, there are currently two main definitions of "software security" (or "insecurity"): roughly, the absence of unexpected computation, and the economic cost of mounting an attack.
For the first definition there are a variety of technical models, but most of them are rather hard to apply. E.g. there is "An Attack Surface Metric" by Pratyusa K. Manadhata, which attempts to model a system based on I/O automata and the syscalls a user can trigger.

The second definition allows more insight into the whole business of redistribution, software maintenance, supply-chain issues, buying 0-day exploits or DDoS attacks on the darkweb, etc. But most importantly: packaging and update policies.

The easiest way to make attacks more expensive is to always update to the latest versions of all packages. The reason is that an attacker has to invest more time in studying/keeping up with new versions of software packages and coming up with attacks (or buying them somewhere), compared to versions that have been around for 2 years. There has just been less time to adapt. Software updates are disruptive for attackers. In that sense, the most secure model is a "rolling release distro".

In addition, there is no clear definition in research of what a "security bug" is, and the idea of one has also been rejected numerous times by the Linux kernel (see various interviews and ML posts by Greg Kroah-Hartman, the maintainer of the stable Linux branch). The Linux kernel backports any bugfix, because any bug can (as per the "unexpected computation" definition of insecurity) potentially lead to a security vulnerability, even if none is known yet. You are not safe just because you're running a version with no publicly known vulnerabilities.

All this said... we want to optimize an ecosystem of packages in a way that forces an attacker to invest a lot of resources to drive attacks on users. And the most important way to do this is to be as aggressive as possible with software updates, even if there are no CVEs that mandate an update. This conflicts (heavily) with the current default of cabal.
I think modern cabal works rather differently. The new cabal v2-build (or v2-install) builds wombat-2.7.2 locally and does not "install" it, at least not in the same sense in which a GHC distribution installs the packages it comes with. Modern cabal discourages installing any libraries and encourages building them anew (with smart caching via the "store"). In fact, if GHC stopped providing/exposing the bundled packages, the problem of the exceptional treatment of installed packages would be immediately gone (until the user insists on manually installing some other packages, which is discouraged and hard to do properly).

If GHC ships with the packages so that the user saves on compilation, it's no wonder cabal tries to accommodate it. However, I'm guessing GHC exposes the packages because the `ghc` library itself needs them. Therefore, the inconsistent cabal behaviour may be caused primarily by how GHC is distributed.

Am I anywhere close to the root cause of keeping this old functionality in modern cabal? Can cabal handle GHC in some alternative way without incurring this irregular behaviour? E.g., what can go wrong if GHC renames all the packages it bundles (in the package db and/or on Hackage) so that they can't be reinstalled at all?

Edit: actually, what happens if cabal reinstalls a dependency of a non-reinstallable package?
Yes indeed: […]
But that is a simple consequence of depending on a non-reinstallable package. But let's suppose that your build plan does not depend on one.
I think this is not quite right. I don't think anyone is asking to change the behaviour when you depend on a non-reinstallable package: there we really do have to go with what GHC ships, and that's that. I think the request is about packages that are reinstallable, but happen to have versions in the global package-db, like `filepath`.
The reason that things in the global package db are treated specially is […]. In general, it is a bit of an issue that this behaviour is hard-wired rather than configurable.
My guess is that cabal covers the case of dependencies of non-reinstallable packages in a lazy way: by treating specially all packages that reside in the relevant package DB [edit: and regardless of the build plan]. This has several advantages: simplicity of implementation, simplicity of configuration (though, as @mpickering states, this may be too hard-wired at this point), an extra benefit of backward compatibility for other legacy workflows using a central package DB, and simplicity of conveying the behaviour to the user (though it's probably not conveyed yet, or not well enough).

Edit: one more advantage: this primitive solution does not increase the coupling of GHC and cabal, because the list of non-reinstallable packages that changes between GHC versions (#9092) does not need to be used for yet another purpose in cabal code. Which is why I'm considering the other option: changing the behaviour of GHC, not of cabal. Or even of the GHC installer, e.g., making it rename all the packages it installs [edit: a less brutal variant: install them to another package db, if that matters]. That's probably absurd for fundamental reasons, but I'd like to improve my understanding of the situation by learning why exactly.

Edit: to be fair, the improvement of analysing the build plan (and auto-upgrading if the package in question is not a dependency of a non-reinstallable package) would not detract from the ease of configuration (though it would eliminate the backward-compatibility side-benefit). However, I'm not able to predict what interplay it could have with the solver (because deciding to auto-upgrade changes the build plan, so perhaps we need to solve anew to verify all constraints are respected? What if the new solution implies the package should not be automatically upgraded?).
The comments here are swaying me towards supporting a change here. However, I really do feel it needs to be flag-controlled, and I'm definitely worried that any deep change like this could well confuse users and disrupt workflows in a way that is very unexpected and hard to diagnose, especially for those who don't read release notes.
There seems to be some suggestion in https://discourse.haskell.org/t/is-cabal-install-stable-enough-yet-that-ghc-should-be-inflicting-it-on-newbies/9979/52 that […]. However, to my understanding that is false, because […]. If I am mistaken about this, then a reproducer which demonstrates the issue would be much appreciated.
There's a clear reproducer for the original topic: https://github.com/hasufell/toto.git
I'm not sure if anyone suggested that, and it seems out of scope of this ticket anyway. This ticket is about the solver.
It seems that I misunderstood that suggestion in the thread (it is quite complicated, with all the discussions going on). Thank you for clarifying that. In particular, in the comment "Do v2 builds even look at the globally installed packages?", "globally installed packages" might mean packages installed in the global package database OR packages installed in the user package database by […].
What's the status of this?
I think @michaelpj's suggestion is great. Though I think it's a bit tempting to just make the entire change in one release. I struggle to see how we could communicate this "deprecation" to users, and I imagine we won't get much usage until it becomes the default -- I think the users who would test this feature would just as easily test it from a nightly build. I would be really keen to make progress here, because this looks to be one of the issues blocking decoupling GHC upgrades from boot library upgrades, and I think that would be a huge improvement (though maybe I'm misunderstanding this issue?).
This issue is about default behavior. The example "reproducer" https://github.com/hasufell/toto.git has no lower bound on `filepath`.
Good point. I should've been more precise. Here is the sort of scenario I had in mind (assuming the package uses cabal-install and doesn't depend on a non-reinstallable package):

If I want to submit a PR to bump the bound for a non-boot library, I can just update the bound and be confident that it will be tested by CI. If I want to bump a boot library bound, then I need to apply this workaround where I forbid the old version. I think most of the community is understandably not aware that this is even possible. Even if someone was aware, I think most maintainers wouldn't want to deal with this extra complexity. They are likely to instead wait for a version of GHC where that library is bundled. So, this issue blocks updating the ecosystem to "reinstallable" boot library bumps before a GHC comes out with that version bundled.
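The workaround mentioned above (forbidding the old version) can be expressed as an ordinary constraint. A sketch using the `filepath` versions from this thread's reproducer description:

```cabal
-- cabal.project fragment: refuse the GHC-bundled filepath-1.4.2.2 so the
-- solver has to pick a newer version from Hackage.
packages: .

constraints: filepath > 1.4.2.2
```

The same line can also be passed on the command line via `--constraint='filepath > 1.4.2.2'`.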
@Mikolaj wrote early on in the discussion:

> Regardless of the outcome of the discussion, it would make sense to find the PR that commented […]
Git blame points us to this commit by @dcoutts: 324b324#diff-e2de3403daa75f77ddd177d0a040f0547097abddb360328293ba46da21673a4e

And as far as I can tell, that code path only affects v1-style commands. So it sounds like these flags have never been applicable to v2-style commands.
You correctly blame Duncan. It might have made sense to get something done quickly, but in the long term that was a bad choice.
Well... you have to keep testing against the old versions too, because in a larger setup there might be […]. But I agree, having a flag to prefer installed, or not (or oldest; a many-way choice) would be helpful. Then good CI setups could test against all the options. (Note to myself to add […].)

IMHO, the default choice doesn't really make a good argument for CI: you should test against as many "naturally occurring" dependency sets as possible. Building against bundled boot libs will occur as long as […].

So I'm 👍 to having more options for the solver. I actually don't care what the default is. (If I don't like it, I hope I can change it in the global cabal-install config.)
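The many-way CI testing suggested above can be approximated today with per-leg project files. A sketch (the file name is made up, and `prefer-oldest` assumes a cabal-install recent enough to support it):

```cabal
-- cabal.project.oldest: a CI leg that solves for the oldest versions the
-- bounds allow. Run with: cabal build --project-file=cabal.project.oldest
import: cabal.project
prefer-oldest: True
```

Another leg could instead forbid the GHC-bundled versions via `constraints:`, exercising the newest ones.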
This is my general stance on these "let's change the default" proposals. Make options available, make them configurable, and then you can (discuss the) change of the default.
Yes, this definitely should be (globally) configurable. That really lowers the cost of changing the default, as you say -- users can just override it.
I wholeheartedly agree. Given that this discussion took so long and brought us neither consensus nor even a review of the available options, I'd propose to implement the new behaviour as a flag. I hope we can merge such a PR without as long a delay, and afterwards let's resume discussing the default, with a very concrete and tested flag in hand as the candidate for the new default.
From the distro perspective I think the current behaviour is good, so I am not at all convinced changing the default behaviour is desirable, but having a flag is fine of course. If a dependency is already satisfied why should it have to be updated by default? Modern compiled languages already burn far too much energy with needless constant rebuilding... |
That's an odd thing to say, since every distro I know has the opposite behavior: if you install a package and there are updates for its dependencies, they will get pulled in.

Can I get a proper decision from the Cabal team on the way forward?
You might be interested to know that GHC HQ is considering my suggestion to ship a list of packages that can't be upgraded, and possibly a list of packages that "freeze" the entire set of bootlibs, which would allow the solver to determine when it has to behave the way it does now and otherwise treat bootlibs like normal. (This would work better if ghcup could retrofit this data into older ghc versions which won't include it.) That said, @grayjay told me this may be difficult for the solver to use; it's possible that the current situation is the best the current solver can do.
That won't be easily possible due to haskell/ghcup-hs#361, which is a major rewrite that has stalled.
In the absence of a PR, if you'd like the Cabal team as a whole to make a decision, please add this topic to the agenda of the fortnightly cabal devs chat at https://hackmd.io/X62yS0d6RxW3ybh8AmRqlw or, even better, please come to the meeting. Until we have such a decision or any objection, my proposal from a few comments back stands.
Yes, please add the topic to the agenda. I don't have capacity to be involved in cabal meetings, thanks.
I think it's reasonable that distros (and nix dev shells) want to provide a set of packages and force cabal to use those. But I don't think that means other users need to be stuck with the current behaviour. I feel like cabal-install should provide configuration options allowing system config to provide these packages, but shouldn't force boot packages on users. So, I don't think there's necessarily a conflict between these two desires.
And note that this isn't the ideal thing for Nix either. When we hop into a dev shell we shouldn't be re-planning at all. Forcing cabal to choose installed packages is just a hack around not being able to use the original plan from which the nix shell dependencies were chosen.
There could be a greedy pick-installed-only solver for Nix (and similar) use cases. But the cabal-install developers are going to remove support for different solvers (e.g. #9206), so there's no going back. (Constructing an install plan, even from an existing one, is still "constructing an install plan"; arguably […].)
@geekosaur I thought about the freezing behavior that you mentioned, and I realized that I'm not sure I understand. Would the solver need any additional logic beyond the existing requirement that an installed package depend on the exact installed versions of the dependencies that it was built against? I would expect that including the installed package in the plan would already freeze its dependencies.
I think the issue is that, since it's a "special" package that is dependent on the exact packages shipped with ghc (more specifically, on the exact packages that GHC itself was built against), …
cabal already only considers the installed versions of the non-reinstallable packages; see cabal/cabal-install/src/Distribution/Client/Dependency.hs, lines 451 to 476 at commit 63c486a.
Choosing the installed […]
It's already doing the freezing behavior; the problem is that it's freezing too much, and we need a way to say "only these packages need to be frozen, unless the plan includes one of the packages that freeze the entire bootlib set".
To be more clear: certain packages are hard-coded into cabal as being completely non-reinstallable. Any other package in the global package db is "soft non-reinstallable", i.e. merely preferred by the solver.
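The two-tier treatment described above can be sketched as a toy model. This is illustrative only, not cabal's actual code; the package lists below are assumptions:

```haskell
-- Toy model of the two tiers described above; not cabal's real logic.
module Main where

data Treatment
  = HardFrozen      -- non-reinstallable: must use the installed version
  | PreferInstalled -- in the global db: installed version preferred
  | Normal          -- solver free to pick any version
  deriving (Eq, Show)

-- Hypothetical hard-coded list of non-reinstallable packages.
nonReinstallable :: [String]
nonReinstallable = ["ghc", "base", "ghc-prim", "template-haskell"]

-- Example contents of a global package db (GHC-bundled packages end up here).
globalDb :: [String]
globalDb = nonReinstallable ++ ["filepath", "bytestring", "containers"]

classify :: String -> Treatment
classify pkg
  | pkg `elem` nonReinstallable = HardFrozen
  | pkg `elem` globalDb         = PreferInstalled
  | otherwise                   = Normal

main :: IO ()
main = mapM_ (\p -> putStrLn (p ++ ": " ++ show (classify p)))
             ["base", "filepath", "aeson"]
```

Under this issue's proposal, `filepath` would move from `PreferInstalled` to `Normal`, while the `HardFrozen` tier would stay as it is.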
The issue for amending the list of non-reinstallable packages is #10087, perhaps we could move conversation related to that into that issue considering this thread is already really long |
The cabal solver seems to treat pre-installed packages specially (e.g. those shipped with GHC).
To reproduce: build https://github.com/hasufell/toto.git with ghc-9.4.8.
This should cause a failure, because ghc-9.4.8 ships with filepath-1.4.2.2, but the package above uses modules from 1.4.100.1. The package has no upper bounds on filepath. For any other non-pre-installed package, the solver would pick the latest.
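A hypothetical reconstruction of the reproducer's shape (the actual contents of the repository may differ): no bounds on `filepath`, while the code imports a module that only exists in the newer release:

```cabal
-- toto.cabal (illustrative): build-depends places no bounds on filepath,
-- but the library imports System.OsPath, which first appeared in
-- filepath-1.4.100. With ghc-9.4.8 the solver nevertheless picks the
-- bundled filepath-1.4.2.2, and compilation fails.
cabal-version:      2.4
name:               toto
version:            0.1.0.0

library
  exposed-modules:  Toto
  build-depends:    base, filepath
  default-language: Haskell2010
```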
I understand that this is by design, but I question this design here, because: […]
@mpickering found out that there used to be an `--upgrade-dependencies` switch, which is now disabled. I argue that the default should be to pick the latest possible version anyway.
CCing some potentially interested parties: @simonpj @frasertweedale