Issue with --local command line parameter. #49

DitiD · 2022-10-21T10:52:23Z

While running the blocked_transform test case, which is located under runtime (https://github.com/bcosenza/sycl-bench/blob/master/runtime/blocked_transform.cpp), we get a core dump.

Command used to execute - ./blocked_transform --device=gpu

Output -

********** Results for Runtime_BlockedTransform_iter_64_blocksize_0**********
problem-size: 3072
local-size: 1024
device-name: NVIDIA RTX A4000
sycl-implementation: LLVM CUDA (Codeplay)
blocked_transform: /tmp/llvm-sycl-nightly-20220222/sycl/source/detail/scheduler/commands.cpp:1826: void cl::sycl::detail::adjustNDRangePerKernel(cl::sycl::detail::NDRDescT&, cl::sycl::detail::pi::PiKernel, const cl::sycl::detail::device_impl&): Assertion `NDR.NumWorkGroups[0] != 0 && NDR.LocalSize[0] == 0' failed.
Aborted (core dumped)

However, when we explicitly set the --local parameter to 256 (which is the default value) at runtime, it executes without any errors.

Command used to execute - ./blocked_transform --device=gpu --local=256

Is there a fix for this issue? If so, where can we get the revised code?

DitiD · 2022-10-27T05:25:42Z

Hi, we would like to know if there are any solutions for the above-mentioned issue. Thanks.

illuhad · 2022-10-27T15:25:46Z

I cannot reproduce the issue with hipSYCL. Your output indicates that somehow a block size of 0 enters the benchmark. This value is derived from the local size. I had a quick look at the code paths, and I don't understand how this could happen - it does not for me.
There's an assert that checks that the block size is non-zero. Can you check what happens when compiling with debug assertions enabled?

DitiD · 2022-10-31T05:53:37Z

Hi, we are not working with hipSYCL. The issue we are facing occurs at runtime. The test case fails to execute when we do not pass the local parameter (that is, when it should take the default value of 256).

Command being used to execute - ./blocked_transform --device=gpu

However, it works fine when we explicitly set the local parameter to 256 at runtime.

Command being used to execute - ./blocked_transform --device=gpu --local=256

We are not sure why this issue occurs.

Thanks.

illuhad · 2022-10-31T12:55:10Z

> Hi, we are not working with hipSYCL. The issue that we are facing is occurring during runtime.

I'm aware of this, but I don't have an installation of the DPC++ SYCL implementation with the CUDA backend here. I'm just saying I cannot reproduce this with my setup. And I don't understand why DPC++ and hipSYCL would behave differently here anyway. The error does not seem to be related to SYCL-specific functionality.

> The test case is failing to execute when we are not passing the local parameter (as in, when it is taking the value of local parameter as 256 by default).
> Command being used to execute - ./blocked_transform --device=gpu
> However, it is working fine when we are explicitly defining the local parameter to 256 during runtime.
> Command being used to execute - ./blocked_transform --device=gpu --local=256

I understood this. As I've said, I cannot reproduce it here. Command line option handling is the same for DPC++ and hipSYCL. To investigate the issue further, I asked you the following:

> There's an assert that checks that the block size is non-zero. Can you check what happens when compiling with debug assertions enabled?

i.e. make sure that the NDEBUG macro is not set when building.
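
A minimal sketch of what "assertions enabled" means here, assuming only standard C++ `<cassert>` behavior: `assert` is compiled out whenever `NDEBUG` is defined at the point where `<cassert>` is included, so adding `#define NDEBUG` to a source file turns the check off rather than on. `assertions_enabled` and `checked_block_size` are hypothetical helpers for illustration, not functions from sycl-bench.

```cpp
#undef NDEBUG      // make sure assert() is active in this translation unit
#include <cassert>
#include <cstddef>

// Returns true only when assert() is compiled in: the assignment inside the
// assert expression is evaluated only if NDEBUG was not defined above.
inline bool assertions_enabled() {
  bool evaluated = false;
  assert((evaluated = true));
  return evaluated;
}

// Hypothetical stand-in for a "block size must be non-zero" check.
// With assertions enabled, a zero block size aborts with a diagnostic here
// instead of failing later inside the SYCL runtime.
inline std::size_t checked_block_size(std::size_t block_size) {
  assert(block_size > 0 && "block size must be non-zero");
  return block_size;
}
```

If the build uses CMake, a Debug configuration (`-DCMAKE_BUILD_TYPE=Debug`) normally leaves `NDEBUG` undefined, while Release configurations typically pass `-DNDEBUG` to the compiler.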

DitiD · 2022-11-07T11:52:29Z

Hi, as suggested, I've added the following to blocked_transform.cpp and rebuilt it.

#include<assert.h>
#define NDEBUG

It seems that, by default, the local size is taken as 1024 (please see the attached screenshot below).

However, when I explicitly set '--local' to either 256 (the default value) or 1024, it works fine.

Command being used:

./blocked_transform --device=gpu --local=256
./blocked_transform --device=gpu --local=1024

Could this be a bug in the code?
Thanks.
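
One way to narrow this down is to print the local size that is actually in effect when `--local` is omitted and compare it with the expected default. The following is only a sketch under stated assumptions: `effective_local_size` is a hypothetical helper, not sycl-bench's actual argument parser, and the fallback value is whatever the caller believes the default to be.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of sycl-bench): returns the value given as
// "--local=<n>" if such an argument is present, otherwise the fallback.
// std::stoull throws if <n> is not a number.
inline std::size_t effective_local_size(const std::vector<std::string>& args,
                                        std::size_t fallback) {
  assert(fallback > 0 && "fallback local size must be non-zero");
  const std::string prefix = "--local=";
  for (const std::string& arg : args) {
    // compare() clamps the length, so arguments shorter than the prefix are safe
    if (arg.compare(0, prefix.size(), prefix) == 0) {
      return static_cast<std::size_t>(std::stoull(arg.substr(prefix.size())));
    }
  }
  return fallback;
}
```

Logging this value next to the `local-size` line in the benchmark output would show whether the explicit and implicit paths really end up with the same number.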

DitiD · 2022-11-29T04:09:43Z

Hi, is there any update regarding this issue? Thanks.

DitiD changed the title ~~Issue with --local command line paramete.~~ Issue with --local command line parameter. Oct 21, 2022
