argon2-opencl fails on CPU and MIC #5417
Comments
FWIW, the contents of So, not surprisingly, the main issue appears to be beyond pre-processing. This is consistent with this format already failing on CPUs before @alainesp moved the pre-processing from host to device. |
Overriding these didn't make a difference (still works on GPUs, fails on CPUs):
#define upsample(a, b) (((ulong)(a) << 32) | (b))
#define mul_hi(a, b) ((ulong)(a) * (b) >> 32)
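For reference, here is a host-side C sketch of what the two overrides compute for 32-bit unsigned operands. The names `my_upsample`, `my_mul_hi`, `join_words` and `product_high` are hypothetical (renamed so they cannot clash with anything a real OpenCL header defines), and `uint64_t`/`uint32_t` stand in for OpenCL's `ulong`/`uint`.

```c
#include <stdint.h>

/* Same expressions as the overrides quoted above, with fixed-width host types. */
#define my_upsample(a, b) (((uint64_t)(a) << 32) | (b))
#define my_mul_hi(a, b)   ((uint64_t)(a) * (b) >> 32)

/* Combine a high and a low 32-bit word into one 64-bit value. */
static uint64_t join_words(uint32_t hi, uint32_t lo)
{
	return my_upsample(hi, lo);
}

/* High 32 bits of the 64-bit product of two 32-bit values. */
static uint32_t product_high(uint32_t a, uint32_t b)
{
	return (uint32_t)my_mul_hi(a, b);
}
```

Note that this form of `mul_hi` only matches the built-in when both operands are 32-bit unsigned values; with 64-bit operands the cast would not widen anything and the shift would return the wrong bits.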
With the below hack and
- uint warp = (get_local_id(1) * get_local_size(0) + get_local_id(0)) / THREADS_PER_LANE;
+ uint warp = (get_global_id(1) * get_global_size(0) + get_global_id(0)) / THREADS_PER_LANE;
So the issue is probably not specific to behavior of
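To make the difference between the two lines of the hack concrete, here is a host-side C sketch that evaluates both index expressions with the work-item IDs passed in as plain parameters. The function names are hypothetical, and `THREADS_PER_LANE` is set to 32 purely as an assumption for illustration; the kernel's real value is not shown in this thread.

```c
/* Assumed value for illustration; the kernel's real THREADS_PER_LANE is not shown here. */
#define THREADS_PER_LANE 32u

/* Index computed from work-group-local IDs (the original line).
   It restarts from 0 in every work-group. */
static unsigned warp_from_local(unsigned lid0, unsigned lid1, unsigned lsize0)
{
	return (lid1 * lsize0 + lid0) / THREADS_PER_LANE;
}

/* Index computed from global IDs (the hack).
   It is unique across the whole NDRange. */
static unsigned warp_from_global(unsigned gid0, unsigned gid1, unsigned gsize0)
{
	return (gid1 * gsize0 + gid0) / THREADS_PER_LANE;
}
```

For example, with a 32x2 work-group and a 32x4 global size, the work-item with local ID (0, 1) in the second work-group has global ID (0, 3): the local form gives 1, while the global form gives 3. What the kernel then does with `warp` is not shown in this thread, so this only illustrates how the two values diverge.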
Maybe we should print a warning to the user when we detect a CPU or Intel GPU, in addition to the self-test failure, to explain the situation a little more?
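A minimal sketch of what such a check could look like, in host-side C. Everything here is hypothetical: the helper name `argon2_device_warning`, the message strings, and the bit values of `DEV_CPU`, `DEV_GPU` and `DEV_INTEL` are assumptions for illustration, not the project's actual definitions.

```c
#include <stddef.h>

/* Assumed bit layout for illustration only; the project's headers are authoritative. */
#define DEV_CPU   (1 << 0)
#define DEV_GPU   (1 << 1)
#define DEV_INTEL (1 << 5)

/* Hypothetical helper: returns a warning string for device classes on which
   the format is reported to fail in this thread, or NULL otherwise. */
static const char *argon2_device_warning(unsigned int device_info)
{
	if (device_info & DEV_CPU)
		return "argon2-opencl is known to fail self-test on CPU devices";
	if ((device_info & DEV_GPU) && (device_info & DEV_INTEL))
		return "argon2-opencl is known to fail self-test on Intel GPUs";
	return NULL;
}
```

A caller would print the returned string, if any, before running the self-test, so the user sees the explanation next to the FAILED line.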
In #5420, @magnumripper shows a macOS system where the format works for the first few test vectors on HD Graphics (edit: specifically, on |
I stumbled upon some things like this
A very weird detail is that code path should not be active on that device:
#if !gpu_nvidia(DEVICE_INFO) && !gpu_amd(DEVICE_INFO)
	barrier(CLK_LOCAL_MEM_FENCE);
#elif !__OS_X__ && gpu_amd(DEVICE_INFO) && DEV_VER_MAJOR < 2500
	asm("" ::: "memory");
#endif
It's not AMD!? Or maybe it is (the runtime) but then it's not a GPU!
That's weird, because we do check for GPU:
#define gpu(n) ((n & DEV_GPU) == (DEV_GPU))
#define gpu_amd(n) ((n & DEV_AMD) && gpu(n))
I tried reviewing the host code as well and don't see a bug that would cause
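The branch selection can be checked on the host with a small C sketch. The `gpu` and `gpu_amd` macros are the ones quoted above; the bit values of `DEV_GPU`, `DEV_AMD` and `DEV_NVIDIA`, the `gpu_nvidia` definition, and the two helper functions are assumptions for illustration and may not match the project's headers.

```c
/* Assumed bit values for illustration only; the project's headers are authoritative. */
#define DEV_GPU    (1 << 1)
#define DEV_AMD    (1 << 3)
#define DEV_NVIDIA (1 << 4)

/* The two macros quoted above, unchanged. */
#define gpu(n)     ((n & DEV_GPU) == (DEV_GPU))
#define gpu_amd(n) ((n & DEV_AMD) && gpu(n))
/* Assumed to mirror gpu_amd; not quoted in this thread. */
#define gpu_nvidia(n) ((n & DEV_NVIDIA) && gpu(n))

/* Returns 1 if the first branch of the kernel's #if (the barrier) would be selected. */
static int takes_barrier_branch(int device_info)
{
	return !gpu_nvidia(device_info) && !gpu_amd(device_info);
}

/* Returns 1 if the gpu_amd() part of the #elif (the asm statement) holds,
   ignoring the __OS_X__ and DEV_VER_MAJOR parts of that condition. */
static int is_amd_gpu(int device_info)
{
	return gpu_amd(device_info);
}
```

If the assumed bit values are right, the `-DDEVICE_INFO=138` seen in the build options quoted elsewhere in this thread decodes to an AMD GPU, so the `asm` branch is the one compiled, while an AMD runtime on a non-GPU device would fall into the barrier branch.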
Good catch, I'll fix that right away (although I'm not expecting it to be the problem here). |
There was a |
There's also this weirdly formatted comment (is interpreted as a
//** Get execution time **//
and weird line wrap here:
if gpu_amd
(device_info[sequential_id]) {
BTW in my output above we can see |
And no outer parens... the macro makes up for that but I'll be fixing that code so it doesn't look so weird. |
Yes, and also |
Yup, this was a red herring. Here's a snippet from my terminal history:
Device 6: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
(...)
Testing: rar-opencl, RAR3 (length 5) [SHA1 OpenCL AES]... (8xOMP) Device 3: Tahiti [AMD Radeon HD 7900 Series]
Testing: AndroidBackup-opencl [PBKDF2-SHA1 AES OpenCL]... PASS
Testing: agilekeychain-opencl, 1Password Agile Keychain [PBKDF2-SHA1 AES OpenCL]... PASS
Testing: ansible-opencl, Ansible Vault [PBKDF2-SHA256 HMAC-SHA256 OpenCL]... FAILED (cmp_all(1))
Testing: argon2-opencl, Argon2 [BlaMka OpenCL]... Options used: -I opencl -cl-mad-enable -D__GPU__ -DDEVICE_INFO=138 -D__SIZEOF_HOST_SIZE_T__=8 -DDEV_VER_MAJOR=1800 -DDEV_VER_MINOR=5 -D_OPENCL_COMPILER -DUSE_WARP_SHUFFLE=0 ../run/opencl/argon2_kernel.cl
Build time: 19.913 ms
Build log: "/tmp/OCLrVNPK3.cl", line 101: error: an "asm" declaration is not allowed here
asm("" ::: "memory");
Looking closer, in the middle of one of the first lines (rar-opencl) it switches to the Tahiti Device 3 after a segfault (not caught by console redirection) with device 6 (I was running something like It shouldn't fail on Tahiti though, was that a known problem?
Not a known problem. In fact, I think I had added this |
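For context, `asm("" ::: "memory")` is the usual GCC-style compiler barrier. A host-side C sketch of the idiom follows; the macro name `COMPILER_BARRIER`, the no-op fallback and the demo function are hypothetical. Whether a given OpenCL compiler accepts this syntax is implementation-specific, as the build log above shows.

```c
/* GCC/Clang extension: an empty asm statement with a "memory" clobber.
   It emits no instruction but stops the compiler from caching memory
   values in registers or reordering memory accesses across it. */
#if defined(__GNUC__) || defined(__clang__)
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")
#else
#define COMPILER_BARRIER() ((void)0) /* no-op fallback, an assumption for portability */
#endif

static int shared_value;

/* Store a value, then force the compiler to reload it from memory. */
static int store_then_load(int v)
{
	shared_value = v;
	COMPILER_BARRIER(); /* compiler must assume shared_value may have changed */
	return shared_value;
}
```

This only constrains the compiler. It is not a hardware fence and does not synchronize work-items the way `barrier(CLK_LOCAL_MEM_FENCE)` does.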
Also fix a few weird syntaxes in opencl_common.c See openwall#5417
... I've just retested, and it just works on well's devices 3, 4, 5 from my account. It also builds for devices 1, 2, 6 but fails self-test. |
Yeah I can't reproduce. Doing some forensics it looks like I was on the branch for #5638 which is now merged and works fine. I have no idea what happened there but let's assume something was off with my testing, or perhaps a "missing" kernel-cache-clean somehow led to an odd state of things? |
A known shortcoming/bug of the argon2-opencl format is that it fails self-test on CPU(-like) devices, as tested with ancient Intel OpenCL and AMD APP SDK that we have on our online dev boxes and with recent Intel OpenCL that @alainesp has on his laptop. We don't know exactly why - a guess is this has something to do with our usage of local memory.
The format works on most GPUs, the only exception identified so far being Intel HD Graphics, where it also fails.
The failures on CPUs and Intel GPU are FAILED (cmp_one(1)). The failure on MIC includes segfaults.