test18 failures with -fupc-pts=struct on OpenBSD X86 #66

Open
PHHargrove opened this issue Jul 9, 2014 · 4 comments

@PHHargrove
Collaborator

I now have OpenBSD testers for clang-upc on both amd64 and i386, and have chosen to configure with --with-upc-pts=struct for more coverage.

In conducting the initial "smoke test" run of the Intrepid suite I encountered failures of test18 only on the i386 system. In a debug build I get the following failure:

[intrepid/test18_st02]   0sec  20140709_160046  FAILED (CRASH=SIGTERM/NEW)
commandline: [env  UPC_QUIET=1 ./test18_st02 -n 2 ]
PassExpr: passed
FailExpr: rror
--- App stdout ---
--- App stderr ---
./test18_st02: UPC error: Thread number in shared address is out of range
thread 1 terminated with signal: 'Abort trap'

While a non-debug build gets a SEGV instead:

[intrepid/test18_st02]   0sec  20140709_160046  FAILED (CRASH=SIGTERM/NEW)
commandline: [env  UPC_QUIET=1 ./test18_st02 -n 2 ]
PassExpr: passed
FailExpr: rror
--- App stdout ---
--- App stderr ---
./test18_st02: UPC error: Thread number in shared address is out of range
thread 1 terminated with signal: 'Abort trap'

The outputs above are from the static-threads builds of the test, but the dynamic-threads builds fail in the same manner.

I went on to investigate other 32-bit platforms and found the majority to fail test18 with the struct PTS.

On an x86 build on FreeBSD I get a SEGV:

$ ./a.out -n2
thread 1 terminated with signal: 'Segmentation fault'
Terminated

On an "-m32" build on Mac OS X I see a different failure mode:

$ ./a.out -n2
./a.out: UPC error: Invalid conversion of shared address to local pointer;
thread does not have affinity to shared address
thread 0 terminated with signal: 'Abort trap'
Terminated: 15

On an x86 build on NetBSD I don't see any error.

I don't presently have any 32-bit builds for Linux.

In all of the cases reported above as failing, I have verified that there is no error with the packed PTS representation.

@nenadv

nenadv commented Jul 13, 2014

Confirmed the issue on my VM too. I was not able to reproduce it on a 32-bit Linux machine, which I thought would be useful because I could compare the generated code. It turns out the code is completely different: Linux uses xmm registers in the generated code while FreeBSD does not. The error can be reproduced with -O0 and only one thread, which is good for debugging.

The error can be reproduced with this code:

shared [5] int a_blk5[10*THREADS];
shared [5] int *ptr_to_blk5;

void
test18()
{
  int got;
  int expected;
  /* bug 52: upc_resetphase unimplemented */
  ptr_to_blk5 = upc_resetphase (&a_blk5[1]);
  got = upc_phaseof (ptr_to_blk5);
  expected = 0;
  upc_barrier;
}
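
For readers unfamiliar with the phase field, here is a minimal plain-C sketch of what the reproducer exercises. It assumes the usual round-robin block layout for `shared [5] int`, and it relies on the UPC specification's description of upc_resetphase as returning its argument with the phase set to zero. The `model_pts` struct and the `model_*` helpers are hypothetical names for illustration only; they are not clang-upc's actual struct PTS layout or runtime API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a pointer-to-shared; NOT clang-upc's real layout. */
struct model_pts {
    size_t   offset;  /* byte offset within the owning thread's local part */
    unsigned thread;  /* owning thread, valid range 0 .. threads-1 */
    unsigned phase;   /* position within the current block, 0 .. block_size-1 */
};

/* Model of &a[index] for 'shared [block_size] T a[...]' with round-robin blocks. */
struct model_pts model_address_of(size_t index, size_t block_size,
                                  unsigned threads, size_t elem_size)
{
    struct model_pts p;
    size_t block;

    assert(block_size > 0 && threads > 0);
    block    = index / block_size;
    p.phase  = (unsigned)(index % block_size);
    p.thread = (unsigned)(block % threads);
    p.offset = ((block / threads) * block_size + p.phase) * elem_size;
    return p;
}

/* Model of upc_resetphase: same pointer, phase forced to zero. */
struct model_pts model_resetphase(struct model_pts p)
{
    p.phase = 0;
    return p;
}

/* Model of the range check behind "Thread number in shared address is out of range". */
int model_thread_in_range(struct model_pts p, unsigned threads)
{
    return p.thread < threads;
}
```

In this model, `&a_blk5[1]` has thread 0 and phase 1, and resetting the phase leaves the thread untouched. A correct call therefore cannot produce an out-of-range thread number, which is consistent with the returned value being corrupted somewhere on the way back to the caller.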

I think the issue is related to an optimization where FreeBSD does not save/use the frame pointer. Instead, the stack pointer is used for the register spill:

        movl    %eax, 20(%esp)          # 4-byte Spill
        calll   upc_resetphase
[...]
        subl    $4, %esp
[...]
        movl    20(%esp), %eax          # 4-byte Reload

This looks like a code generation bug, and we might be able to create a C test case for it.
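
The offset arithmetic behind that suspicion can be sketched with a toy stack model in C. This is only an illustration: it assumes, hypothetically, that nothing in the elided `[...]` regions changes %esp other than the `subl $4, %esp` shown. The `toy_stack` type and `replay_spill_reload` function are invented for this sketch and are not part of any real tool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_WORDS = 64 };

/* Hypothetical model of a 32-bit stack; 'esp' is a byte offset into 'mem'. */
struct toy_stack {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;
};

static void store32(struct toy_stack *s, uint32_t off, uint32_t value)
{
    uint32_t addr = s->esp + off;
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    s->mem[addr / 4] = value;
}

static uint32_t load32(const struct toy_stack *s, uint32_t off)
{
    uint32_t addr = s->esp + off;
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return s->mem[addr / 4];
}

/* Replays: spill to 20(%esp), subl $4 from %esp, reload from reload_off(%esp). */
uint32_t replay_spill_reload(uint32_t spilled, uint32_t reload_off)
{
    struct toy_stack s;

    memset(s.mem, 0, sizeof s.mem);
    s.esp = 128;                    /* arbitrary starting stack pointer */
    store32(&s, 20, spilled);       /* movl %eax, 20(%esp)   # spill  */
    s.esp -= 4;                     /* subl $4, %esp                  */
    return load32(&s, reload_off);  /* movl N(%esp), %eax    # reload */
}
```

Under that assumption, `replay_spill_reload(v, 20)` reads the slot 4 bytes below the spilled value, while `replay_spill_reload(v, 24)` returns `v`. Whether the real code has this mismatch depends on what the elided instructions, and the call itself, do to %esp.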

@nenadv

nenadv commented Aug 13, 2014

I tried to create a C test case for this, without any luck.

nenadv changed the title from "test18 failures with -fupc-pts=struct on x86" to "test18 failures with -fupc-pts=struct on OpenBSD X86" on Aug 20, 2014

@PHHargrove
Collaborator Author

Today I retested clang-upc on openbsd-i386 configured using --with-upc-pts=struct.
The failures below were observed at runtime and are not present with --with-upc-pts=packed.

run.rpt:[bugzilla/bug276_st04]   0sec  20150224_150819  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug276]   0sec  20150224_150820  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase1_st04]   0sec  20150224_151129  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase1]   0sec  20150224_151130  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase2_st04]   0sec  20150224_151130  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[guts_main/resetphase2]   0sec  20150224_151130  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[intrepid/test18_st04]   0sec  20150224_151306  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[intrepid/test18]   1sec  20150224_151307  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug88_st02]   0sec  20150224_152645  FAILED (CRASH=SIGTERM/NEW)
run.rpt:[bugzilla/bug88]   1sec  20150224_152646  FAILED (CRASH=SIGTERM/NEW)

All failures produced the same message:

[testname]: UPC error: Thread number in shared address is out of range

There was no difference between the -g and -O runs in terms of which tests failed (though the -g run did have one test time out).

@PHHargrove
Collaborator Author

I have again tried the struct PTS representation on OpenBSD and this error is still present.
