From 0d768980e83faf8605b78aa36ab6c66f315d2876 Mon Sep 17 00:00:00 2001
From: John Ericson
Date: Fri, 29 May 2015 05:44:46 +0000
Subject: [PATCH] Rewrite after talking to @acrichto on IRC.

tl;dr:
 - Motivation is significantly expanded
 - Phase 1 is cut for being too half-assed
---
 text/0000-cargo-libstd-awareness.md | 178 ++++++++++++++++++----------
 1 file changed, 118 insertions(+), 60 deletions(-)

diff --git a/text/0000-cargo-libstd-awareness.md b/text/0000-cargo-libstd-awareness.md
index 911f36da78d..1ad4c4bffb5 100644
--- a/text/0000-cargo-libstd-awareness.md
+++ b/text/0000-cargo-libstd-awareness.md
@@ -5,92 +5,150 @@
 
 # Summary
 
-Currently, all packages implicitly depend on libstd. This makes Cargo unsuitable for packages that
-need a custom-built libstd, or otherwise depend on crates with the same names as libstd and the
-crates behind the facade. The proposed fixes also open the door to a future where libstd can be
-Cargoized.
+Currently, Cargo doesn't know whether packages depend on libstd. This makes Cargo unsuitable for
+packages that need a cross-compiled or custom libstd, or otherwise depend on crates with the same
+names as libstd and the crates behind the facade. The proposed fixes also open the door to a future
+where libstd can be Cargoized.
+
 
 # Motivation
 
-Bare-metal work cannot use a standard build of libstd. But since any crate built with Cargo can link
-with a system-installed libstd if the target matches, using Cargo for such projects can be irksome
-or impossible.
+First some background. The current situation seems to be more of an accident of `rustc`'s pre-Cargo
+history than an explicit design decision. Cargo passes the location and name of all depended-on
+crates to `rustc`. This method is good for a number of reasons stemming from its fine granularity,
+such as:
 
-Cargoizing libstd also generally simplifies the infrastructure, and makes cross compiling much
-slicker, but that is a separate discussion.
+ - No undeclared dependencies can be used
 
-Finally, I first raised this issue here: https://github.com/rust-lang/Cargo/issues/1096 Also, there
-are some (heavily bit-rotted) projects at https://github.com/RustOS-Fork-Holding-Ground that depend
-on each other in the way this RFC would make much more feasible.
+ - Conversely, `rustc` can warn against *unused* declared dependencies
 
-# Detailed design
+ - Crate/symbol names are frobbed so that packages with overlapping names don't conflict
+
+
+However, rather than passing in libstd and its deps, Cargo lets the compiler look for them as needed in
+the compiler's sysroot [specifically `/lib/`]. This is quite coarse in comparison,
+and we lose all the advantages of the previous method:
+
+ - Packages may link or not link against libs in that directory as they please, with Cargo being
+   none the wiser.
+
+ - Cargo-built crates with the same name as those in there will collide, as the sysroot libs don't
+   have their names frobbed.
 
-The current situation seems to be more of an accident of `rustc`'s pre-Cargo history than an
-explicit design decision. Cargo passes the location and name of all depended on crates to `rustc`.
-This is good because it means that that no undeclared dependencies on other Cargo packages can leak
-through. However, it also passes in `--sysroot /path/to/some/libdir`, the directory being were
-libstd is. This means packages are free to use libstd, the crates behind the facade, or none of the
-above, with Cargo being none the wiser.
+ - Cross compiling may fail at build-time (as opposed to the much shorter
+   "gather-dependencies-time") because of missing packages
 
-The only new interface proposed is a boolean field to the package meta telling Cargo that the
-package does not depend on libstd by default. This need not imply Rust's `no_std`, as one might want
-to `use` their own build of libstd by default. To disambiguate, this field is called
-`implicit-deps`; please, go ahead and bikeshead the name. `implicit-deps` is true by default to
-maintain compatibility with existing packages.
 
-The meaning of this flag is defined in 3 phases, where each phase extends the last. The idea being
-is that while earlier phases are easier to implement, later phases yield a more elegant system.
+Cargo doesn't look inside the sysroot to see what is or isn't there, but it would hardly help if it
+did, because it doesn't know what any package needs. Assuming all packages need libstd, for example,
+means Cargo just flat-out won't build freestanding packages that just use libcore on a platform that
+doesn't support libstd.
 
-## Phase 1
+For an anecdote: in https://github.com/RustOS-Fork-Holding-Ground I tried to rig up Cargo to cross
+compile libstd for me. Since I needed to use an unstable compiler anyways, it was possible in
+principle to build absolutely everything I needed with the same `rustc` version. Because of some
+trouble with Cargo and target JSONs, I didn't use a custom target specification, and just used
+`x86_64-gnu-linux`, meaning that depending on the platform I was compiling on, I may or may not have been
+cross-compiling. In the case where I wasn't, I couldn't complete the build because `rustc`
+complained about the libstd I was building overlapping with the libstd in the sysroot.
 
-Add a `--use-sysroot=` flag to `rustc`, where true is the default. Make Cargo pass
-`--use-sysroot=false` to `rustc` is the case that `implicit-deps` is false.
+For these reasons, most freestanding projects I know of avoid Cargo altogether, and just include
+rust as a submodule and run make in that. Cargo can still be used if one manages to get the requisite
+libraries in the sysroot. But this is a tedious operation that individual projects shouldn't need to
+reimplement, and one that has serious security implications if the normal libstd is modified.
 
-This hotfix is enough to allow us bare-metal devs to use Cargo for our own projects, but doesn't
-suffice for creating an ecosystem of packages that depend on crates behind the facade but not libstd
-itself. This is because the choices are all or nothing: Either one implicitly depends on libstd or
-the crates behind the facade, or they don't depend on them at all.
+The fundamental plan proposed in this RFC is to make sure that anything Cargo builds never blindly
+links against libraries in the sysroot. This is achieved by making Cargo aware of all dependencies,
+including those on libstd or its backing crates. That way, these problems are avoided.
 
-## Phase 2
+For the record, I first raised this issue [here](https://github.com/rust-lang/Cargo/issues/1096).
 
-Since, passing in a directory of crates is inherently more fragile than passing in a crate itself,
-make Cargo use `--use-sysroot=false` in all cases.
 
-Cargo would special case package names corresponding to the crates behind the facade, such that if
-the package don't exist, it would simply pass the corresponding system crate to `rustc`. I assume
-the names are blacklisted on crates.io already, so by default the packages won't exist. But users
-can use config files to extend the namespace so their own modded libstds can be used instead. Even
-if they don't want to change libstd but just cross-compile it, this is frankly the easiest way as
-Cargo will seemliest cross compile both their project and it's transitive dependencies.
+# Detailed design
+
+The only new interface proposed is a boolean field in `Cargo.toml` specifying that the package does
+not depend on libstd by default. Note that this is technically orthogonal to Rust's `no_std`, as one
+might want to `use` their own build of libstd by default, or implicitly depend on it but not
+glob-import the prelude. To disambiguate, this field is called `implicit-deps`; please, go ahead and
+bikeshed the name. `implicit-deps` is true by default to maintain compatibility with existing
+packages. When true, "std" will be implicitly appended to the list of dependencies.
+
+When Cargo sees a package name it cannot resolve, it will query `rustc` for the default sysroot, and
+look inside to see if it can find a matching rlib. [It is necessary to query `rustc` because the
+`rustc` directory layout is not stabilized and `rustc` and Cargo are versioned independently. The
+same version issues make giving Cargo a whitelist of potential standard library crate-names
+risky.] If a matching rlib is successfully found, Cargo will copy it (or symlink it) into the
+project's build directory as if it built the rlib. Each rlib in the sysroot must be paired with some
+sort of manifest listing its dependencies, so Cargo can copy those too.
 
-In this way we can put packages on crates.io that depend on the crates behind the facade. Some
-packages that already exist, like liblog and libbitflags, should be given features that optionally
-allow them to avoid libstd and just depend directly on the crates behind the facade they really
-need.
+`rustc` will have a new `--use-sysroot=` flag. When Cargo builds a package, it will
+always pass `--use-sysroot=false` to `rustc`, as any rlibs it needs will have been copied to the
+build directory. Cargo can and will then pass those rlibs directly just as it does with normal Cargo
+deps.
 
-## Phase 3
+If Cargo cannot find the libraries it needs in the sysroot, or a library's dependency manifest is
+missing, it will complain that the standard libraries needed for the current job are missing and
+give up.
 
-If/when the standard library is built with Cargo and put on crates.io, all the specially-cased
-package names can be treated normally,
+## Future Compatibility
 
-The standard library is downloaded and built from crates.io. Or equivalently, Cargo comes with a
-cache of that build, as Cargo should be able cache builds between projects at this point. Just as in
-phase 2, `implicit-deps = false` just prevents libstd from implicitly being appended to the list of
-dependencies.
+In the future, rather than giving up if libraries are missing, Cargo could attempt to download them
+from some build cache. In the farther future, the stdlib libraries may be Cargoized, and Cargo able
+to query pre-built binaries for any arbitrary package. In that scenario, we can remove all code
+relating to falling back on the sysroot to look for rlibs.
+
+In the meantime, developers living dangerously with an unstable compiler can package the standard
+library themselves, and use their Cargo config file to get Cargo to cross compile libstd for them.
 
-Again, to make this as least controversial as possible, this RFC does not propose outright that the
-standard library should be Cargoized. This 3rd phases just describes how this feature would work
-were that to happen.
 
 # Drawbacks
 
-I really don't know of any. Development for hosted environments would hardly be very affected.
+Cargo does more work than is strictly necessary for rlibs installed in the sysroot; some more metadata
+must be maintained by `rustc` or its installation.
+
+ - But in a future where Cargo can build stdlib like any other, all this cruft goes away.
+
 
 # Alternatives
 
-Make it so all dependencies, even libstd, must be explicit. C.f. Cabal and base.
+ - Simply have `implicit-deps = false` make Cargo pass `--use-sysroot=false` to `rustc`.
+
+   - This doesn't by itself provide a way for a package to depend on only some of the crates behind the
+     facade. That, in turn, means Cargo is little better at cross compiling those than before.
+
+   - While unstable compiler users can just package the standard library and depend on it as a
+     normal crate, it would be weird to have freestanding projects coalesce around some bootleg
+     libcore on crates.io.
+
+ - Make it so all dependencies, even libstd, must be explicit. C.f. Cabal and base. Slightly
+   simpler, but breaks nearly all existing packages.
+
+ - Don't track stdlib dependencies. Then, in the future when Cargo tries to obtain libs for cross
+   compiling, stick them in the sysroot instead. Cargo either assumes a package needs all of stdlib,
+   or examines the target to see what crates behind the facade are buildable and just goes for those.
+
+   - Cargo does extra work if you need less of the stdlib
+
+   - No nice migration into a world where Cargo can build stdlib without hacks.
+
 
 # Unresolved questions
 
-There are multiple lists of dependencies for different things (e.g. tests), Should libstd be append
-to all of them in phases 2 and 3?
+ - There are multiple lists of dependencies for different things (e.g. tests). Should libstd be
+   appended to all of them in phases 2 and 3?
+
+ - Should rlibs in the sysroot respect Cargo name-frobbing conventions? If they don't, should Cargo
+   frob the name when it copies it (e.g. with `ld -i`)?
+
+ - Just as we make libstd a real dependency, we can make `rustc` a real dev dependency. The standard
+   library can thus be built with Cargo by depending on the associated unstable compiler. There are
+   some challenges to be overcome, including:
+
+   - Teaching Cargo and its frobber an "x can build for y" relation for stable/unstable compiler
+     compatibility, rather than simply assuming all distinct compilers are mutually incompatible.
+
+   - Coalescing a "virtual package" out of many different packages with disjoint dependencies. This
+     is needed because different `rustc` versions have different library implementations that
+     present the same interface.
+
+   This almost certainly is better addressed in a later RFC.
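
The sysroot fallback described in the new "Detailed design" text of this patch can be sketched roughly as follows. This is only an illustration under stated assumptions: `SysrootIndex`, `effective_deps`, and `sysroot_crates_to_copy` are hypothetical names, and the in-memory maps merely stand in for the sysroot contents and the per-rlib dependency manifests. Neither Cargo nor `rustc` exposes such an API, and the RFC does not specify a manifest format.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the sysroot: crate name -> names from its dependency manifest.
/// `None` models an rlib that is present but whose manifest is missing.
type SysrootIndex = BTreeMap<String, Option<Vec<String>>>;

/// Builds the effective dependency list: "std" is appended when `implicit_deps` is true.
fn effective_deps(declared: &[&str], implicit_deps: bool) -> Vec<String> {
    let mut deps: Vec<String> = declared.iter().map(|d| d.to_string()).collect();
    if implicit_deps && !deps.iter().any(|d| d == "std") {
        deps.push("std".to_string());
    }
    deps
}

/// Returns the sysroot crates that would be copied (or symlinked) into the build directory.
/// Names Cargo can resolve normally are skipped; unresolved names are looked up in the
/// sysroot, and the dependencies listed in each manifest are followed transitively.
fn sysroot_crates_to_copy(
    deps: &[String],
    resolvable: &BTreeSet<String>,
    sysroot: &SysrootIndex,
) -> Result<BTreeSet<String>, String> {
    let mut to_copy = BTreeSet::new();
    let mut pending: Vec<String> = deps
        .iter()
        .filter(|d| !resolvable.contains(*d))
        .cloned()
        .collect();
    while let Some(name) = pending.pop() {
        if to_copy.contains(&name) {
            continue; // already scheduled for copying
        }
        match sysroot.get(&name) {
            // Library not in the sysroot: complain and give up.
            None => return Err(format!("standard library crate `{}` is missing", name)),
            // Library found, but no manifest: also give up.
            Some(None) => return Err(format!("dependency manifest for `{}` is missing", name)),
            Some(Some(manifest_deps)) => {
                pending.extend(manifest_deps.iter().cloned());
                to_copy.insert(name);
            }
        }
    }
    Ok(to_copy)
}
```

With a package that declares only a hypothetical `log` dependency and leaves `implicit-deps` at its default, the sketch schedules `std` and everything its manifest names for copying, while a freestanding package with `implicit-deps = false` that names only `core` pulls in `core` alone. Each copied rlib would then be passed to `rustc` explicitly alongside `--use-sysroot=false`.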