Invalid Memory Reference on --release builds (only) #73

Closed
jramapuram opened this issue Jun 1, 2016 · 10 comments
@jramapuram
Member

Note: This happens only with cargo run --release

extern crate arrayfire as af;

use af::{Array, Dim4, DType};

pub fn assert_types(v: Vec<&Array>){
  let base_type = v[0].get_type();
  for i in 1..v.len() {
    let cur_type = v[i].get_type();
    assert!(cur_type == base_type
            , "type mismatch detected: {:?} vs {:?}"
            , cur_type, base_type);
  }
}

fn main() {
  for _ in 0..100000 {
    let a = af::constant(3.0f32, Dim4::new(&[128, 128, 1, 1]));
    let b =  af::constant(3.0f32, Dim4::new(&[128, 128, 1, 1]));
    assert_types(vec![&a, &b]);
  }
}

Stack trace:

(lldb) r
Process 71790 launched: '/Users/jramapuram/Dropbox/projects/af_playground/target/release/af_playground' (x86_64)
warning: (x86_64) /usr/local/cuda/lib/libcublas.7.5.dylib empty dSYM file detected, dSYM was created with an executable with no debug info.
warning: (x86_64) /usr/local/cuda/lib/libcufft.7.5.dylib empty dSYM file detected, dSYM was created with an executable with no debug info.
warning: (x86_64) /Developer/NVIDIA/CUDA-7.5/nvvm/lib/libnvvm.3.0.0.dylib empty dSYM file detected, dSYM was created with an executable with no debug info.
warning: (x86_64) /usr/local/cuda/lib/libcusolver.7.5.dylib empty dSYM file detected, dSYM was created with an executable with no debug info.
Process 71790 stopped
* thread #1: tid = 0x591924, 0x0000000100001696 af_playground`main::h935482300789cbfaWda + 326, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x7fff5effffd7)
    frame #0: 0x0000000100001696 af_playground`main::h935482300789cbfaWda + 326
af_playground`main::h935482300789cbfaWda:
->  0x100001696 <+326>: movb   %r15b, -0x29(%rbp)
    0x10000169a <+330>: movq   0x8(%r14), %rdi
    0x10000169e <+334>: callq  0x1000018c0               ; array::Array::get_type::h6963b010eb7489401ka
    0x1000016a3 <+339>: movb   %al, -0x2a(%rbp)

Multiple runs cause different issues:

➜  af_playground git:(master) ✗ cargo run --release --verbose
       Fresh rustc-serialize v0.3.19
       Fresh lazy_static v0.2.1
       Fresh libc v0.1.12
       Fresh winapi-build v0.1.1
       Fresh winapi v0.2.7
       Fresh semver v0.1.20
       Fresh custom_derive v0.1.5
       Fresh libc v0.2.11
       Fresh num-traits v0.1.32
       Fresh rustc_version v0.1.7
       Fresh conv v0.3.3
       Fresh rand v0.3.14
       Fresh num-integer v0.1.32
       Fresh num-complex v0.1.32
       Fresh num-iter v0.1.32
       Fresh num-bigint v0.1.32
       Fresh kernel32-sys v0.2.2
       Fresh num-rational v0.1.32
       Fresh time v0.1.35
       Fresh num v0.1.32
       Fresh arrayfire v3.3.0
       Fresh af_playground v0.1.0 (file:///Users/jramapuram/Dropbox/projects/af_playground)
     Running `target/release/af_playground`
error: Process didn't exit successfully: `target/release/af_playground` (signal: 11, SIGSEGV: invalid memory reference)
➜  af_playground git:(master) ✗ cargo run --release --verbose
       Fresh custom_derive v0.1.5
       Fresh libc v0.2.11
       Fresh winapi-build v0.1.1
       Fresh num-traits v0.1.32
       Fresh semver v0.1.20
       Fresh rustc-serialize v0.3.19
       Fresh lazy_static v0.2.1
       Fresh winapi v0.2.7
       Fresh conv v0.3.3
       Fresh rand v0.3.14
       Fresh num-integer v0.1.32
       Fresh rustc_version v0.1.7
       Fresh num-complex v0.1.32
       Fresh libc v0.1.12
       Fresh num-iter v0.1.32
       Fresh num-bigint v0.1.32
       Fresh kernel32-sys v0.2.2
       Fresh num-rational v0.1.32
       Fresh time v0.1.35
       Fresh num v0.1.32
       Fresh arrayfire v3.3.0
       Fresh af_playground v0.1.0 (file:///Users/jramapuram/Dropbox/projects/af_playground)
     Running `target/release/af_playground`
thread '<main>' panicked at 'Error message: Unknown Error', /Users/jramapuram/.cargo/registry/src/github.com-88ac128001ac3a9a/arrayfire-3.3.0/src/error.rs:47
note: Run with `RUST_BACKTRACE=1` for a backtrace.
fatal runtime error: Could not unwind stack, error = 5
error: Process didn't exit successfully: `target/release/af_playground` (signal: 4, SIGILL: illegal instruction)
@9prady9
Member

9prady9 commented Jun 1, 2016

Quick debugging points to the member function Array::get_type(), which fails in release mode but not in debug mode.

@9prady9 9prady9 self-assigned this Jun 2, 2016
@9prady9
Member

9prady9 commented Jun 7, 2016

Had the first breakthrough on this today.

Adding the following to Cargo.toml makes the program run fine.

[profile.release]
opt-level = 0

@jramapuram So our initial guess, that something is being moved around by a compiler optimization, is the most probable reason behind this issue.

@jramapuram
Member Author

This is definitely not a solution though. We can't be removing optimizations.

@9prady9
Member

9prady9 commented Jun 7, 2016

Of course! This just confirms that optimization is what triggers it. Whatever the optimization level (1, 2 or 3), it crashes as long as the level is not zero.

@jramapuram
Member Author

I believe that we are improperly using the static assignment methods though. So in reality we are doing something incorrectly. Something the compiler is designed to optimize away is in fact being optimized away.

@9prady9
Member

9prady9 commented Jun 7, 2016

Most probably. That is what I am trying to figure out: whether it is the statics, the references, or the combination of the two.

@9prady9
Member

9prady9 commented Jun 8, 2016

@jramapuram Woohoo! Looks like it is fixed. The problem was what I suspected initially: the function get_type was using an 8-bit integer to receive the data type (an enum) from the FFI call af_get_type, whereas the C function expects the output variable to be 32 bits wide. Once I fixed that, the example you reported above runs fine. I will check the other enums for similar issues now.
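
For readers following along, the width mismatch described above can be sketched roughly as follows. This is a minimal illustration, not the wrapper's actual code: mock_get_type is a hypothetical stand-in for a C function such as af_get_type, and MockDType with its numeric values is invented for the example. The point is only that the out variable handed to the C side must be as wide as what the callee writes. With a u8 there, the callee stores four bytes into a one-byte slot, which is undefined behavior and can overwrite neighbouring stack data.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for a C function like af_get_type: it writes a
/// full C int through the out pointer and returns an error code (0 = ok).
unsafe extern "C" fn mock_get_type(out: *mut c_int) -> c_int {
    unsafe { *out = 2 };
    0
}

/// Hypothetical data-type enum; the names and values are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MockDType {
    F32,
    C32,
    F64,
}

/// Convert the raw C value into the Rust enum, rejecting unknown values.
pub fn dtype_from_raw(raw: c_int) -> Option<MockDType> {
    match raw {
        0 => Some(MockDType::F32),
        1 => Some(MockDType::C32),
        2 => Some(MockDType::F64),
        _ => None,
    }
}

pub fn get_type() -> Option<MockDType> {
    // The out variable is a c_int, matching what the callee writes.
    // Declaring it as u8 and casting the pointer would compile, but the
    // callee would then write past the end of the variable.
    let mut raw: c_int = 0;
    let err = unsafe { mock_get_type(&mut raw) };
    if err != 0 {
        return None;
    }
    dtype_from_raw(raw)
}

fn main() {
    println!("{:?}", get_type());
}
```

Receiving the value as a c_int and converting it afterwards keeps the narrowing step on the Rust side, where an unexpected value can be rejected instead of silently corrupting memory.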

@9prady9 9prady9 closed this as completed in cb7a6f4 Jun 8, 2016
@jramapuram
Member Author

Nice work @9prady9 ! I'm still confused why this was happening while using the static error handling stuff though!

@pavanky
Member

pavanky commented Jun 8, 2016

Can you check if u8 is being used for enums elsewhere?

@9prady9
Member

9prady9 commented Jun 8, 2016

@pavanky I already changed the others before publishing the 3.3.1 crate.

@jramapuram Maybe no one had ever used this function (get_type) in release mode before.

Edited:
We have to address similar cases as part of #37. What I mean is that we have to add some kind of testing mechanism that can catch similar problems across the entire wrapper.
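
One possible shape for such a check, offered only as a sketch and not as what #37 ended up doing, is a size assertion on every type that crosses the FFI boundary as a C enum. MockBackend and same_size_as_c_int below are hypothetical names invented for the example, and the check assumes a common desktop target where a C enum is the size of a C int.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Hypothetical wrapper enum mirroring a C enum. repr(C) asks Rust to lay
/// it out the way the platform's C ABI lays out an enum.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MockBackend {
    Default = 0,
    Cpu = 1,
    Cuda = 2,
    OpenCl = 4,
}

/// Returns true when T occupies exactly as many bytes as a C int.
pub fn same_size_as_c_int<T>() -> bool {
    size_of::<T>() == size_of::<c_int>()
}

fn main() {
    // A repr(C) enum is expected to pass on common desktop targets.
    assert!(same_size_as_c_int::<MockBackend>());
    // A u8 out variable, as in the bug above, would be flagged.
    assert!(!same_size_as_c_int::<u8>());
    println!("layout checks passed");
}
```

A check like this only covers sizes. It would not catch a wrong enum value mapping, so it complements tests that call the wrapped functions in release mode and does not replace them.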
