Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

&A as *const _ == &A as *const _ not guaranteed for const A #131

Closed
oli-obk opened this issue Feb 9, 2017 · 21 comments
Closed

&A as *const _ == &A as *const _ not guaranteed for const A #131

oli-obk opened this issue Feb 9, 2017 · 21 comments
Labels
A-interpreter Area: affects the core interpreter C-bug Category: This is a bug.

Comments

@oli-obk
Copy link
Contributor

oli-obk commented Feb 9, 2017

miri creates two allocations for the *b"hi" in A.
see MIR in https://is.gd/bCUD7p
This makes sense, since there's _1 = A; _2 = &_1 instead of _2 = &A.

Should this

a) be fixed in rustc, or
b) should we cache all literals (we have FIXMEs for that), or
c) should we add some Cow-like structure to lvalues, so they can refer to other lvalues "ByVal"?
d) is the test ( https://github.com/rust-lang/rust/blob/master/src/test/run-pass/const-str-ptr.rs ) wrong?

@eddyb: any preferences?

Found more evidence for b and against d:

#![feature(const_fn)]

const fn foo() -> *const i8 {
    b"foo" as *const _ as *const i8
}

const fn bar() -> i32 {
    *&{(1, 2, 3).1}
}

fn main() {
    assert_eq!(foo(), b"foo" as *const _ as *const i8);
    assert_eq!(bar(), 2);
}
@solson
Copy link
Member

solson commented Feb 9, 2017

@nikomatsakis and @eddyb were just talking about a way to change how promoteds work that I think might fix this problem. (I.e. I'm leaning a. However, I think we have other reasons to do b anyway.)

@solson solson added the C-bug Category: This is a bug. label Feb 9, 2017
@eddyb
Copy link
Member

eddyb commented Feb 9, 2017

Specifically, changing promoted fragments to be lvalues instead of constant rvalues and moving the reference out, so they're (inside a function) runtime borrows of static lvalues instead of copies of constant rvalues, and then we also want borrows inside const and static to also be explicitly promoted.

@nikomatsakis likes this and I agree it should work better, at least in the sense that you're encoding the "right" semantics, as you likely want to allow &0 and ban { let x = 0; &x } in constants.
We can fully define it in terms of rvalue "drop scopes" but those are not computed for constants atm AFAIK.

@solson
Copy link
Member

solson commented Feb 10, 2017

I implemented b in 6ffd700 but we'll still need the fix for promoteds.

@oli-obk
Copy link
Contributor Author

oli-obk commented Jun 19, 2017

Another example from the rustc test suite:

use std::{str, string};

const A: [u8; 2] = ['h' as u8, 'i' as u8];
const B: &'static [u8; 2] = &A;
const C: *const u8 = B as *const u8;

pub fn main() {
    unsafe {
        let foo = &A as *const u8;
        assert_eq!(foo, C);
        assert_eq!(str::from_utf8_unchecked(&A), "hi");
        assert_eq!(*C, A[0]);
        assert_eq!(*(&B[0] as *const u8), A[0]);
    }
}

@RalfJung
Copy link
Member

This test also differs between rust (it passes) and miri (it fails):

fn main() {
    assert!(&A as *const A as *const () == &() as *const _);
    assert!(&A as *const A == &A as *const A);
}

@RalfJung
Copy link
Member

This seems to be just an instance of "we should not really allocate ZSTs", right? I.e., fixing this should also fix the above test.

@oli-obk
Copy link
Contributor Author

oli-obk commented Sep 13, 2017

We'll still have some zsts whose address would differ, e.g. because they are in a field of another type. I don't feel too strongly about any variant, but I'm not sure what semantics we want in const eval

@RalfJung
Copy link
Member

Actually this has nothing to do with ZST allocation. rustc somehow makes it so that with

promoted0 in main: &() = {
    let mut _0: &();                     // return pointer
    let mut _1: ();

    bb0: {
        _1 = ();                         // scope 0 at src/main.rs:10:45: 10:47
        _0 = &_1;                        // scope 0 at src/main.rs:10:44: 10:47
        return;                          // scope 0 at src/main.rs:10:44: 10:47
    }
}

promoted1 in main: &A = {
    let mut _0: &A;                      // return pointer
    let mut _1: A;

    bb0: {
        _1 = A::{{constructor}};         // scope 0 at src/main.rs:10:14: 10:15
        _0 = &_1;                        // scope 0 at src/main.rs:10:13: 10:15
        return;                          // scope 0 at src/main.rs:10:13: 10:15
    }
}

the code

        _8 = promoted1;                  // scope 0 at src/main.rs:10:13: 10:15
        _7 = _8;                         // scope 0 at src/main.rs:10:13: 10:15
        _6 = _7 as *const A (Misc);      // scope 0 at src/main.rs:10:13: 10:15
        _5 = _6;                         // scope 0 at src/main.rs:10:13: 10:27
        _4 = _5 as *const () (Misc);     // scope 0 at src/main.rs:10:13: 10:40
        _12 = promoted0;                 // scope 0 at src/main.rs:10:44: 10:47
        _11 = _12;                       // scope 0 at src/main.rs:10:44: 10:47
        _10 = _11 as *const () (Misc);   // scope 0 at src/main.rs:10:44: 10:47
        _9 = _10;                        // scope 0 at src/main.rs:10:44: 10:59
        _3 = Ne(_4, _9);                 // scope 0 at src/main.rs:10:13: 10:59
        _2 = Not(_3);                    // scope 0 at <assert macros>:2:4: 2:12

ends up saying that the two pointers are equal. Is it going deduplication of promoted constants or how is that working?

@eddyb
Copy link
Member

eddyb commented Sep 25, 2017

All immutable constants are deduplicated in librustc_trans - basically, interned.

@RalfJung
Copy link
Member

TBH I do not feel "bug" is the right label for this; we don't guarantee this deduplication (I would think) so what miri does seems reasonable to me. For example, does it happen across crates?

We could also make it error out when comparing, because we don't want to predict run-time behavior. Not sure if that would be better.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 27, 2018

So... I've come to the conclusion that we can only compare

  • statics with interior mutability or static mut
  • live heap allocations
  • live mutable locals

without erroring, because any other comparisons might differ between execution and interpretation.

Of course pointers into the same allocation can always be compared.

@eddyb
Copy link
Member

eddyb commented Nov 27, 2018

Even immutable statics are still guaranteed distinct address, though?

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 27, 2018

Indeed.

I have now built the following comparison table that I think represents everything we want:

heap static constant/ promoted local mutable local heap (deallocated) local (deallocated) fn
heap == false false false false ERROR false false
static == false false false false false false
constant/ promoted MAYBE MAYBE false false ERROR false
local MAYBE false false ERROR false
mutable local == false ERROR false
heap (deallocated) ERROR false false
local (deallocated) ERROR false
fn ERROR

MAYBE means "if the value is equal { Err(..) } else { Ok(false) }"

The reasoning behind MAYBE is that you can't deduplicate constants with different values.

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 28, 2018

Ok... it's not as clear cut as I assumed.

E.g. slice comparison has an optimization where if the length is the same and the address is the same, we don't compare the contents at all. With the rules in the table above, let x = ["foo"; 8]; assert!(x == x); would error.

This is a similar situation to the problems we've had around from_utf8 having an optimization around the alignment of the elements of the array to be checked for utf8 validity.

Can we assume that named and live local variables are guaranteed distinct, even if the compiler can figure out that they are immutable and have the same value?

Basically:

let x = 42;
let y = foo();
if x == y {
    // is it a misoptimization to free the stack slot of either x or y to
    // be reused and all further mentions of `x` and `y` refer to the same
    // stack slot?
    return;
}

@RalfJung
Copy link
Member

I think we should take a step back and think about the goals here. All these examples here are essentially cases where whether the physical values of the pointers are equal is non-deterministic. Do we want to make miri detect all such possible non-determinism and stop execution? There is at least one more case: Comparing the pointer to the beginning of one allocation and one-past-the-end of another allocation could actually return true at run-time if the two allocations sit right next to each other and LLVM doesn't optimize the comparison away at compile-time.

Right now, I think being more aggressive than we already are about ruling out non-determinism will quickly make us bail out on existing code. I see two options, in principle:

  • Decide we want to rule out all-non-determinism. Then we will likely need a new intrinsic or so, weak_ptr_eq, that compares pointers for equality but may spuriously declare pointers unequal even if they are equal. It can be used for optimizations such as the slice comparison you mentioned. In LLVM this would just compile to ptr equality, but miri would execute it as compare_ptrs(...).unwrap_or(false) -- assuming compare_ptrs returns Option<bool>, with None indicating non-determinism. I am not sure if this will cover all cases, but it might be worth a try.
  • Decide we are fine fixing some non-determinism in miri, even if things might behave differently in LLVM. For miri-the-tool I think this is fine, we'll never be able to cover all the non-deterministic branches of an execution anyway. Can this lead to problems in CTFE, particularly around promotion?

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 29, 2018

particularly around promotion?

Only if we start allowing casting to raw pointers in promoteds or allow calling arbitrary const functions

Can this lead to problems in CTFE

Right now we bail out for any pointer comparison in CTFE, so we're good and can consider this independently.

For miri-the-tool I think this is fine, we'll never be able to cover all the non-deterministic branches of an execution anyway.

Yes I think that's a good solution. Pointers are not equal if there's any optimization that may consider them unequal.

This means we can simplify the rules to "if both are heap or both are static or both are mut local, compare Pointer, otherwise, false"

@RalfJung
Copy link
Member

RalfJung commented Nov 29, 2018

Only if we start allowing casting to raw pointers in promoteds or allow calling arbitrary const functions

Long-term we want to allow the latter with explicit promotion, right? So, doesn't that mean it will be a problem?

Right now we bail out for any pointer comparison in CTFE, so we're good and can consider this independently.

True.

Yes I think that's a good solution. Pointers are not equal if there's any optimization that may consider them unequal.

Do you mean "any possible execution"?

I don't think mut should have any semantic effect, so there's just heap, stack, static, const. However TBH I do not see how considering these classes is useful.

The problem with comparing out-of-bounds pointers of different allocations is that if we declare them never equal, we get strange effects. Basically, the following function will return true with any input on any possible execution of real Rust (well, hopefully... pointer comparison is tricky and LLVM performs some strange optimizations here), and hence should return true in miri (or error as "unsupported"):

fn compare_all_offsets(x: *const u8, y: *const u8) -> bool {
  (0..=usize::MAX).any(|offset| x.wrapping_offset(offset) == y)
}

@oli-obk
Copy link
Contributor Author

oli-obk commented Nov 29, 2018

TBH I do not see how considering these classes is useful.

It is not. We should simply ensure that every creation of a function pointer gets a new AllocId, even if they point to the same function.

Basically, the following function will return true with any input on any possible execution of real Rust

On 64 bit that program is basically an infinite loop

But I agree, we should keep the code as it is. The obvious issues are addressible at the time the AllocId is created

@RalfJung
Copy link
Member

RalfJung commented Dec 4, 2018

We should simply ensure that every creation of a function pointer gets a new AllocId, even if they point to the same function.

Oh right, with function pointers we currently have spurious equality even when they can be inequal at run-time. Seems preferrable to only ever err on one side (spurious inequality).

We could also special-case function pointers during comparison, though. Not sure what would work better / be more efficient / more clean.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 4, 2018

I'm addressing this directly on the rustc side: rust-lang/rust@f6b3eb0

@RalfJung
Copy link
Member

Closing in favor of #1955

@RalfJung RalfJung closed this as not planned Won't fix, can't repro, duplicate, stale Jul 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-interpreter Area: affects the core interpreter C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

4 participants