Make Cargo aware of standard library dependencies #1133

Closed
Changes from 6 commits
66 commits
7f3d678
Copy in template
Ericson2314 May 25, 2015
4c0ea2a
First version
Ericson2314 May 26, 2015
eee7dba
Fix were/where Typo
Ericson2314 May 26, 2015
9da1b02
Use true defaults to avoid double negatives as suggested by @Valloric
Ericson2314 May 26, 2015
630ac97
Typo
Ericson2314 May 28, 2015
a7425e6
Rewrite after talking to @alexcrichton on IRC.
Ericson2314 May 29, 2015
37e9246
Messed up target triple in anecdote
Ericson2314 May 30, 2015
d549d15
Clarify that stable Rust won't allow linking any library in the sysroot
Ericson2314 Jun 15, 2015
b3476e1
Motivate incrementally
Ericson2314 Aug 16, 2015
c22463f
Nit: Fix Some wrongly-capitalized "Cargo"s
Ericson2314 Aug 31, 2015
c257f34
Detailed design for overrides
Ericson2314 Sep 3, 2015
4a8bf9b
Overhaul and expand on detailed design: core is stable and [reaplace]…
Ericson2314 Apr 25, 2016
4b4d8a3
Two more drawbacks
Ericson2314 Apr 25, 2016
1ee980f
Fix typo
Ericson2314 May 11, 2016
f0a0b3b
Rewrite to use upcoming Cargo registries.
Ericson2314 Jul 12, 2016
3422763
Fix broken hyperlink
Ericson2314 Jul 12, 2016
8392c4a
Typos caught by @eternaleye, thanks!
Ericson2314 Jul 12, 2016
ffc442b
Clarify what implicit deps are, including unresolved issues
Ericson2314 Jul 12, 2016
1c53ae1
Motivate the use of registries, and move the speculative `[ideally..]…
Ericson2314 Jul 12, 2016
cc8a1f2
Unresolved question on `cargo new`
Ericson2314 Jul 13, 2016
71ba640
Prevent snooping binaries from the sysroot; sysroot binaries not copied
Ericson2314 Jul 13, 2016
9a61baa
`test` is an implicit dev-dependency.
Ericson2314 Jul 13, 2016
89a461a
Whether implicit test and dev dependencies are mandatory is unresolved
Ericson2314 Jul 13, 2016
4656eb9
Typos
Ericson2314 Jul 13, 2016
984bcb5
Add `language-version` key, and alternative for compiler-rt handling
Ericson2314 Jul 13, 2016
af9442b
Add unresolved questions for fine-grained configuration (e.g. exact CPU)
Ericson2314 Jul 13, 2016
d16f8c3
Either both the sysroot source and binaries can resolved, or neither
Ericson2314 Jul 13, 2016
77be419
Add parenthetical.
Ericson2314 Jul 13, 2016
afae118
Use `stdlib = true|false` rather than crates.io fallback to begin with.
Ericson2314 Jul 27, 2016
a1f8ab1
Clarify the forward compatibility section
Ericson2314 Jul 27, 2016
2334770
Copy-edit 0000-cargo-libstd-awareness.md
jethrogb Aug 1, 2016
61cd8ca
Revise a view corrections / add more of my own.
Ericson2314 Aug 1, 2016
fac0c9b
By the time this lands, Compiler-rt will be tamed.
Ericson2314 Aug 4, 2016
a062095
Extra 'a' typo
Ericson2314 Aug 5, 2016
bbf1ec6
"bounds" -> "semver requirements"
Ericson2314 Aug 6, 2016
81a923a
Clarify "at the time": "in Rust 1.0"
Ericson2314 Aug 6, 2016
51cfdb6
Remove stale alternative
Ericson2314 Aug 6, 2016
a36a46b
Clarify drawback on unconstrained interfaces
Ericson2314 Aug 6, 2016
e6aae38
Clarify drawback and fix typo
Ericson2314 Aug 6, 2016
7e73557
More eloquence and details about how sysroot binaries will be used.
Ericson2314 Aug 6, 2016
f7fd3df
Fix sysroot paths
Ericson2314 Aug 6, 2016
b327075
Add missing closing paren
Ericson2314 Aug 6, 2016
d38daad
Clarify how implicit dependencies are disabled
Ericson2314 Aug 6, 2016
f8859d5
Grammar for implicit dep disabling
Ericson2314 Aug 6, 2016
d208a5f
`core` is the only crate needing to use the `implicit-dependencies` key
Ericson2314 Aug 6, 2016
cb67deb
Clarify stdlib deps vs rust language dep
Ericson2314 Aug 6, 2016
e8b1aa2
For now, "stdlib = true" is a source
Ericson2314 Aug 6, 2016
477d8bc
Finally figure out how deps are passed to rustc
Ericson2314 Aug 8, 2016
b9aa2da
Add alternatives for "sysroot = true" and language version
Ericson2314 Aug 12, 2016
26e1b6e
Purge language about registries, misc clarifications of surrounding text
Ericson2314 Aug 13, 2016
d35237f
Split huge paragraph
Ericson2314 Aug 13, 2016
28353b2
Clamp down on dependencies; mention unstability
Ericson2314 Aug 13, 2016
44f1e84
Talk about `version = "*"`
Ericson2314 Aug 13, 2016
2ea3e83
the -> a
Ericson2314 Aug 13, 2016
324e225
Add @brson's text on sysroot binary filenames
Ericson2314 Aug 13, 2016
02889b8
Remove obsolete unresolved question
Ericson2314 Aug 13, 2016
bbc503b
Talk about lockfiles.
Ericson2314 Aug 13, 2016
b1e1b4b
Remove rustbuild ideas, and talk about implementation roadmap instead
Ericson2314 Aug 13, 2016
50a0862
Slightly update the roadmap
Ericson2314 Aug 19, 2016
e8f14b7
Typo
Ericson2314 Aug 23, 2016
c501fd9
Prevent stdlib deps as replacements
Ericson2314 Aug 30, 2016
2719c44
No `stdlib = false`
Ericson2314 Aug 30, 2016
54aa001
Include the name of the temporary stdlib deps pruning option
Ericson2314 Aug 30, 2016
2774668
Talk about `custom-implicit-stdlib-dependencies` and hack flags in ge…
Ericson2314 Aug 31, 2016
d85091a
Fix typo
Ericson2314 Oct 10, 2016
ee4104f
Fix typos found by @est31 and @JinShil. Thanks!
Ericson2314 Jan 7, 2017
154 changes: 154 additions & 0 deletions text/0000-cargo-libstd-awareness.md
@@ -0,0 +1,154 @@
- Feature Name: cargo_libstd_awareness
- Start Date: 2015-05-26
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Currently, Cargo doesn't know whether packages depend on libstd. This makes Cargo unsuitable for
packages that need a cross-compiled or custom libstd, or otherwise depend on crates with the same
names as libstd and the crates behind the facade. The proposed fixes also open the door to a future
where libstd can be Cargoized.


# Motivation

First some background. The current situation seems to be more of an accident of `rustc`'s pre-Cargo
history than an explicit design decision. Cargo passes the location and name of all depended-on
crates to `rustc`. This method is good for a number of reasons stemming from its fine granularity,
such as:

- No undeclared dependencies can be used

- Conversely, `rustc` can warn against *unused* declared dependencies

- Crate/symbol names are frobbed so that packages with overlapping names don't conflict


However, rather than passing in libstd and its deps, Cargo lets the compiler look for them as needed in
the compiler's sysroot [specifically `<sysroot>/lib/<target>`]. This is quite coarse in comparison,
Member

I think it's important to spell out here that it's far from standard practice to dump libraries in the sysroot, and the only stable library in the sysroot today is libstd. We have been very hesitant to stabilize any more than precisely one library for many of these reasons.

Contributor Author

I'm a bit confused what you'd like me to elaborate on. I already wanted to emphasize we just do this in the case of libstd and its dependencies---in other words that we are so close---just 1 library away!---from not linking with the sysroot at all.

Member

Hm re-reading I'm not quite sure what I was thinking... It may have been from the aspect that "and its deps" isn't so relevant in stable Rust today as libstd is the only library that can be implicitly linked to.

Contributor Author

Ah OK. I'll clarify the situation for stable Rust.

and we loose all the advantages of the previous method:

Nit: s/loose/lose/


- Packages may link or not link against libs in that directory as they please, with Cargo being
none the wiser.

- Cargo-built crates with the same name as those in there will collide, as the sysroot libs don't
Member

This isn't quite true, for example liblibc exists in both the sysroot and on crates.io. Cargo will pass --extern libc=... which overrides everything (including the sysroot).

Contributor Author

Hmm, I'll need to update my code so I can see why the build is failing, or whether it does today.

have their names frobbed.

- Cross compiling may fail at build-time (as opposed to the much shorter
"gather-dependencies-time") because of missing packages


Cargo doesn't look inside the sysroot to see what is or isn't there, but it would hardly help if it
did, because it doesn't know what any package needs. Assuming all packages need libstd, for example,
means Cargo just flat-out won't build freestanding packages that just use libcore on a platform that
doesn't support libstd.

For an anecdote: in https://github.com/RustOS-Fork-Holding-Ground I tried to rig up Cargo to cross
compile libstd for me. Since I needed to use an unstable compiler anyways, it was possible in
principle to build absolutely everything I needed with the same `rustc` version. Because of some
trouble with Cargo and target JSONs, I didn't use a custom target specification, and just used
`x86_64-gnu-linux`, meaning that depending on the platform I was compiling on, I may or may not have been
Member

s/x86_64-gnu-linux/x86_64-unknown-linux-gnu/

Contributor Author

Oops, thanks!

cross-compiling. In the case where I wasn't, I couldn't complete the build because `rustc`
complained about the libstd I was building overlapping with the libstd in the sysroot.

For these reasons, most freestanding projects I know of avoid Cargo altogether, and just include
rust as a submodule and run make in it. Cargo can still be used if one manages to get the requisite
libraries into the sysroot. But this is a tedious operation that individual projects shouldn't need to
reimplement, and one that has serious security implications if the normal libstd is modified.

The fundamental plan proposed in this RFC is to make sure that anything Cargo builds never blindly
links against libraries in the sysroot. This is achieved by making Cargo aware of all dependencies,
including libstd and its backing crates. That way, these problems are avoided.

For the record, I first raised this issue [here](https://github.com/rust-lang/Cargo/issues/1096).


# Detailed design

The only new interface proposed is a boolean field in `Cargo.toml` specifying that the package does
not depend on libstd by default. Note that this is technically orthogonal to Rust's `no_std`, as one
might want to `use` their own build of libstd by default, or implicitly depend on it but not
glob-import the prelude. To disambiguate, this field is called `implicit-deps`; please, go ahead and
Member

Looking at this from another angle, the only implicitly available crate that is available in stable Rust is std. This means that we have quite a bit of freedom when considering the other crates distributed with Rust itself. For example this field could in theory just be implicit-std = false which passes a flag to the compiler disabling the implicit usage of std, and then the compiler will implicitly deny access to all other crates by default (e.g. even libcore).

Just a note that we have very few constraints today (just the name "std"), and we can do whatever we like with the other deps.

Contributor Author

I'm confused, I think the semantics you are describing are exactly what I proposed. implicit-deps controls access to std and its dependencies---whatever those may be.

I went with -deps and not -std because of unstable Rust. But if we want to gear the name around stable Rust, sure.

Member

Oh yes we're definitely thinking of the same thing, I was just wondering if the name implicit-std was better. For example the name implicit-deps seems kinda scary that any crate could be an implicit dependency, when in fact there is only one crate in the stable world that can be an implicit dependency -- libstd.

I agree there are more crates that can be implicitly depended upon, but none of them are stable today, so we may not need to consider them. I think I was just somewhat startled at how this may imply that implicit dependencies are allowed from anywhere (when it's in fact just the sysroot)

Contributor Author

I see. Somebody beginning Rust, or even just beginning unstabilized Rust, has no idea what dependencies would normally be implicit. On the other hand, implicit-std sounds like one is just getting access to std, since Cargo normally doesn't allow one to extern transitive dependencies lest a direct dependency change its dependencies. A name like implicit-std-and-its-deps would be the clearest, but my, is it wordy.

Another option is to change it so that implicit-std = true only gives access to std itself, and dependencies on the crates behind the facade must always be explicit. This would also help on the off-chance that we want to version unstable std independently from its backing crates. Unfortunately, it also breaks backwards compatibility with existing packages, but only those using unstable Rust.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another, more concise phrasing might be implicit-stdlibs = true - the 's' punches well above its weight class.

Contributor Author

Not bad! I tried to think of a short name to capture all that, but came up short.

bikeshed the name. `implicit-deps` is true by default to maintain compatibility with existing
packages. When true, "std" will be implicitly appended to the list of dependencies.
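
As a rough illustration of the default described above, the following sketch computes the dependency names Cargo would go on to resolve. The function name and the `Option<bool>` encoding of an absent key are hypothetical, and skipping the append when "std" is already declared is an assumption that the text does not spell out.

```rust
/// Hypothetical helper, not part of Cargo: returns the dependency names that
/// would be resolved for a package. `implicit_deps` models the proposed
/// `implicit-deps` key, with `None` meaning the key is absent from Cargo.toml.
fn effective_dependencies(implicit_deps: Option<bool>, declared: &[&str]) -> Vec<String> {
    let mut deps: Vec<String> = declared.iter().map(|d| d.to_string()).collect();

    // The key defaults to true so that existing packages keep building.
    let implicit = implicit_deps.unwrap_or(true);

    // Assumption: an explicitly declared "std" is not appended a second time.
    if implicit && !deps.iter().any(|d| d == "std") {
        deps.push("std".to_string());
    }
    deps
}
```

Under these assumptions, a package that omits the key and declares only `libc` ends up with both `libc` and `std`, while a package with `implicit-deps = false` gets exactly what it declared.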

When Cargo sees a package name it cannot resolve, it will query `rustc` for the default sysroot, and
Member

Could you clarify what this means for "a package name it cannot resolve"? For example Cargo does not attempt to resolve the name "std" in any way today, so I'm not sure where this sort of resolution failure will start from.

Member

Reading a little more, my interpretation is that you're proposing that a crate explicitly declares that it depends on core and std (if the boolean field above is specified), is that right? If so, can you go into some more detail about what the syntax for doing so might be?

Contributor Author

Basically, I want it so

implicit-deps = false

[dependencies]
core = "*" # Or some more appropriate version specifier
alloc = "*"
# ...other crates behind the facade....
std = "*"

and

implicit-deps = true

and

# implicit-deps is true by default

all mean the same thing.

Member

Hm ok, your first snippet has a bit of a different interpretation because it means the dependencies like std come from crates.io, which probably isn't going to happen any time soon. Put another way there's no way to express a dependency on std in Cargo.toml today because we distribute it in binary form instead of on crates.io.

I think the reason I'm somewhat uneasy to add implicit-deps is that if you specify implicit-deps there's no way for you to actually link to the standard library. Did you have something in mind for doing that?

Contributor Author

Mmm, this RFC changes the way cargo works so that depended-on crates not on crates.io are looked for in the sysroot instead, precisely so we can continue distributing those crates the same way for the time being. This allows:

  • A clear migration path to a future where those crates are defined on crates.io
  • A finer way to distinguish what functionality libraries intended for kernel use need
  • Unstable rust users to locally define those crates so cargo cross-compiles them with the rest of their program.

Member

Ah I see what this is saying now, although unfortunately I feel like that's a little too much magic going on under the hood. Cargo understands the "source" for any particular package, and it needs to understand if that source is crates.io or the sysroot ahead of time. Along those lines I think that this needs to have some new source syntax, such as:

[dependencies]
std = { rustc-sysroot = true }

Contributor Author

While yes, that does make Cargo's life easier, does this information really belong in the package metadata? A package just cares what version std is, not how cargo obtained it. Also, if we switch to deploying these crates via crates.io before the end of 1.0, we wouldn't want packages to break because they mandated that std must come from the sysroot.

That said, I'd still rather have that than the status quo. IIRC local packages (with cargo config) override no matter the version so this doesn't prohibit my third bullet point in my previous post.

look inside to see if it can find a matching rlib. [It is necessary to query `rustc` because the
`rustc` directory layout is not stabilized and `rustc` and Cargo are versioned independently. The
same version issues make giving Cargo a whitelist of potential standard library crate names
risky.] If a matching rlib is successfully found, Cargo will copy it (or symlink it) into the
project's build directory as if it had built the rlib. Each rlib in the sysroot must be paired with some
sort of manifest listing its dependencies, so Cargo can copy those too.
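
The lookup-and-copy step might be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the library directory is passed in by the caller because the sysroot layout is not stable, the `lib<name>.rlib` / `lib<name>-<suffix>.rlib` filename pattern is assumed, and the per-rlib dependency manifest is left out because its format is not specified here. The only real interface used is `rustc --print sysroot`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask `rustc` for its sysroot instead of hard-coding a location.
fn query_sysroot() -> io::Result<PathBuf> {
    let out = Command::new("rustc").arg("--print").arg("sysroot").output()?;
    Ok(PathBuf::from(String::from_utf8_lossy(&out.stdout).trim()))
}

/// Look in `lib_dir` for `lib<name>.rlib` or `lib<name>-<suffix>.rlib`.
/// The filename pattern is an assumption made for this sketch.
fn find_rlib(lib_dir: &Path, crate_name: &str) -> io::Result<Option<PathBuf>> {
    let prefix = format!("lib{}", crate_name);
    for entry in fs::read_dir(lib_dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) != Some("rlib") {
            continue;
        }
        let stem = match path.file_stem().and_then(|s| s.to_str()) {
            Some(s) => s,
            None => continue,
        };
        // Require an exact name or a `-` separator so that a request for
        // `core` does not pick up something like `libcoretest`.
        if let Some(rest) = stem.strip_prefix(prefix.as_str()) {
            if rest.is_empty() || rest.starts_with('-') {
                return Ok(Some(path));
            }
        }
    }
    Ok(None)
}

/// Copy the matching rlib into `build_dir`, as if Cargo had built it there.
/// Fails with `NotFound` when the sysroot has no matching library.
fn stage_rlib(lib_dir: &Path, build_dir: &Path, crate_name: &str) -> io::Result<PathBuf> {
    let src = find_rlib(lib_dir, crate_name)?.ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::NotFound,
            format!("standard library crate `{}` not found in sysroot", crate_name),
        )
    })?;
    let dest = build_dir.join(src.file_name().expect("rlib path has a file name"));
    fs::copy(&src, &dest)?;
    Ok(dest)
}
```

A caller would join the result of `query_sysroot()` with whatever library subdirectory `rustc` reports for the target, then call `stage_rlib` once per needed crate; the error case corresponds to Cargo giving up when a library is missing.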

`rustc` will have a new `--use-sysroot=<true|false>` flag. When Cargo builds a package, it will
always pass `--use-sysroot=false` to `rustc`, as any rlibs it needs will have been copied to the
build directory. Cargo can and will then pass those rlibs directly just as it does with normal Cargo
deps.
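
To make the invocation concrete, here is a minimal sketch of how the extra `rustc` arguments could be assembled. `--extern name=path` is the existing mechanism Cargo uses for ordinary dependencies; `--use-sysroot=false` is the flag proposed in this RFC and does not exist in `rustc`. The function name and the example paths are hypothetical.

```rust
use std::path::PathBuf;

/// Hypothetical helper: build the dependency-related `rustc` arguments for
/// one compilation, given (crate name, rlib path) pairs that were staged in
/// the build directory.
fn rustc_dep_args(deps: &[(String, PathBuf)]) -> Vec<String> {
    // Proposed by this RFC, not an existing rustc flag: stop the compiler
    // from looking in the sysroot on its own.
    let mut args = vec!["--use-sysroot=false".to_string()];

    // Standard library crates are passed exactly like normal Cargo deps.
    for (name, path) in deps {
        args.push("--extern".to_string());
        args.push(format!("{}={}", name, path.display()));
    }
    args
}
```

With a single staged `std` rlib, this yields `--use-sysroot=false --extern std=<path to the staged rlib>`, so nothing reaches the compiler that Cargo did not explicitly hand over.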

If Cargo cannot find the libraries it needs in the sysroot, or a library's dependency manifest is
missing, it will complain that the standard libraries needed for the current job are missing and
give up.

## Future Compatibility

In the future, rather than giving up if libraries are missing Cargo could attempt to download them
from some build cache. In the farther future, the stdlib libraries may be Cargoized, and Cargo able
Member

Ah one point I forgot about previously, which is probably pretty relevant to this, is: the compiler can only link against libraries it previously built. This means that we would need a build cache per-revision of the compiler, which unfortunately makes this much more infeasible :(

Contributor Author

Hmm, we over at NixOS maintain a build cache for Haskell and it works. If you only focus on the slower release channels, and prioritize popular packages, it can still be useful.

The idea of a stable ABI scares me, but if/when it happens, that problem goes away too.

Member

Ah yeah we could definitely pre-cache builds of std for each official release of the compiler, for example, but it may want to be mentioned here as a potential downside. For example all custom builds of the compiler (e.g. nightlies) will not have access to pre-built archives.

Contributor Author

I still don't see the downside. Today, the compiler and library are built together, so we are already building and storing std for each prebuilt compiler. Whether or not std is downloaded with a prebuilt compiler or separately from a crates.io build cache, the build time and storage requirements are the same.

Contributor

Downloading std separately will stop those who have installed rust to look at it during their commute (case in point: me, a few weeks ago). Not everyone has a fast internet connection everywhere, so there may be other hidden downsides.

Contributor Author

Mmm, once std is downloaded once, it doesn't in principle need to be downloaded again. Assuming a local build cache shared between projects is implemented at this point in the future, the compiler's install script could set it up and pre-populate it with std. That way nobody forgets std on their commute :).

to query pre-built binaries for any arbitrary package. In that scenario, we can remove all code
relating to falling back on the sysroot to look for rlibs.

In the meantime, developers living dangerously with an unstable compiler can package the standard
library themselves, and use their Cargo config file to get Cargo to cross compile libstd for them.


# Drawbacks

Cargo does more work than is strictly necessary for rlibs installed in sysroot; some more metadata
must be maintained by `rustc` or its installation.

- But in a future where Cargo can build stdlib like any other, all this cruft goes away.


# Alternatives

- Simply have `implicit-deps = false` make Cargo pass `--use-sysroot=false` to `rustc`.

- This doesn't by itself provide a way for a package to depend on only some of the crates behind the
facade. That, in turn, means Cargo is little better at cross compiling those than before.

- While unstable compiler users can just package the standard library and depend on it as a
normal crate, it would be weird to have freestanding projects coalesce around some bootleg
libcore on crates.io.

- Make it so all dependencies, even libstd, must be explicit. Cf. Cabal and base. Slightly
simpler, but breaks nearly all existing packages.

- Don't track stdlib dependencies. Then, in the future when Cargo tries to obtain libs for cross
compiling, stick them in the sysroot instead. Cargo either assumes the package needs all of stdlib,
or examines the target to see what crates behind the facade are buildable and just goes for those.

- Cargo does extra work if you need less of the stdlib

- No nice migration into a world where Cargo can build stdlib without hacks.


# Unresolved questions

- There are multiple lists of dependencies for different things (e.g. tests). Should libstd be
appended to all of them in phases 2 and 3?

- Should rlibs in the sysroot respect Cargo name-frobbing conventions? If they don't, should Cargo
frob the name when it copies it (e.g. with `ld -i`)?

- Just as we make libstd a real dependency, we can make `rustc` a real dev dependency. The standard
library can thus be built with Cargo by depending on the associated unstable compiler. There are
some challenges to be overcome, including:

- Teaching Cargo and its frobber an "x can build for y" relation for stable/unstable compiler
compatibility, rather than simply assuming all distinct compilers are mutually incompatible.

- Coalescing a "virtual package" out of many different packages with disjoint dependencies. This
is needed because different `rustc` versions have different library implementations that
present the same interface.

This almost certainly is better addressed in a later RFC.