MIPS64 barrier failures #95

Open
PHHargrove opened this issue May 24, 2016 · 7 comments

Comments

@PHHargrove
Collaborator

I am currently able to build clang-upc for MIPS64, both big- and little-endian, on Linux.
Currently libupc only builds for the "n64" ABI.

Running clang-upc "native" (not w/ the Berkeley UPCR) with the Berkeley UPC test harness reveals that over 300 tests fail at runtime with messages like the following:

./cg-W: UPC error: UPC barrier identifier mismatch
thread 0 terminated with signal: 'Aborted'

This is reproducible on gcc22 (big-endian) and gcc{23,24} (little-endian) of the GCC CFarm.
On the little-endian systems, I built with:

cmake \
        -DCMAKE_INSTALL_PREFIX:PATH=<SOMETHING> \
        -DLLVM_TARGETS_TO_BUILD:=Mips \
        -DCMAKE_BUILD_TYPE:=MinSizeRel \
        -DCMAKE_C_COMPILER=mipsel-linux-gnu-gcc-4.9 \
        -DCMAKE_CXX_COMPILER=mipsel-linux-gnu-g++-4.9 \
        -DCMAKE_C_FLAGS=-mabi=64 \
        -DCMAKE_CXX_FLAGS=-mabi=64 \
        -DCMAKE_ASM_FLAGS=-mabi=64 \
        -DLLVM_DEFAULT_TARGET_TRIPLE=mips64el-linux-gnu

On the big-endian system (gcc22) the system gcc/g++ is 4.6, which is too old to build clang-3.8.
I had to build a newer gcc/g++ (I chose 4.9 to match the little-endian systems).
That required that I build gmp, mpfr and mpc.
That, in turn, required that I track down a patch to fix builds of mpfr on MIPS w/ gcc-4.
So, you probably want to avoid trying to reproduce there.
If you do want to try, I can probably open perms on my install of gcc-4.9 for you.

@PHHargrove PHHargrove self-assigned this May 24, 2016
@PHHargrove
Collaborator Author

I am looking into this issue.

I have GNU UPC tests running on the same platforms right now, and think it likely the same bug is present there.

An initial look at upc_sync.h as compared to the nearest equivalents in GASNet, GCC's sync atomics, and the Linux kernel suggests that libupc is incorrect in its assumption that MIPS does not require a Read Fence.
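
For readers unfamiliar with what a read fence protects against, here is a minimal sketch in portable C11. It is not the macro from upc_sync.h; the names `ready`, `payload`, `publish` and `try_consume` are hypothetical and only illustrate the flag-then-data pattern where a weakly ordered CPU needs a fence on the reader side as well as the writer side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag/payload pair; these names are not from libupc. */
static atomic_int ready;
static int payload;

/* Writer: store the payload, then raise the flag. */
void publish(int value)
{
  payload = value;
  /* Write fence: the payload store must become visible before the flag. */
  atomic_thread_fence(memory_order_release);
  atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader: returns true and fills *out once the flag has been observed. */
bool try_consume(int *out)
{
  if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
    return false;
  /* Read fence: without it, a weakly ordered CPU may satisfy the
     payload load before the flag load and return stale data. */
  atomic_thread_fence(memory_order_acquire);
  *out = payload;
  return true;
}
```

The compiler lowers the two fences to whatever barrier instruction the target requires, so the same source is safe on strongly and weakly ordered machines.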

@PHHargrove
Collaborator Author

So far I only have ABI=n64 builds of clang-upc and ABI=n32 builds of GNU UPC.
The GNU UPC builds are not showing this error, but since the ABIs are different I cannot yet be sure if that is meaningful.

I continue to investigate.

@PHHargrove
Collaborator Author

With ABI=n64 builds of GNU UPC I still don't see this error, despite the fact that upc_sync.h and upc_barrier.upc in the respective runtimes are nearly identical. I may have to give up on this issue as being too far outside my expertise (and because MIPS is likely to be of relatively low importance).

@PHHargrove PHHargrove removed their assignment Jun 1, 2016
@PHHargrove
Collaborator Author

I have completed testing GNU UPC and found no equivalent of this issue.

Adding the possibly-missing read fence in runtime/libupc/smp/upc_sync.h does not resolve this problem.

@nenadv

nenadv commented Jun 27, 2016

Right now libupc is being compiled with -Os (optimization for size). I was easily able to reproduce the problem with intrepid's 'test17', and it seems that the barrier fails on negative ID values (in this case -1, which is used in the UPC lock implementation; test17 is the first test to use locks).

The test passes if libupc is compiled with -O0.

The test fails even with only one thread.

GDB does not work on MIPS gcc23 (I'll try to build a new one), so I was not able to simply debug it. However, after adding some print statements, it seems that this trivial line of code fails:

upc_barrier.upc

  /* Check the barrier ID with the one from the notify phase.  */
  if (barrier_id != INT_MIN && __upc_barrier_id != INT_MIN &&
      __upc_barrier_id != barrier_id)
    {
      __upc_fatal ("UPC barrier identifier mismatch");
    }
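
As a sanity check on the logic itself, here is a minimal standalone sketch of the same condition written as a predicate. The function name `barrier_ids_match` is hypothetical and not part of libupc; it only restates the quoted test with INT_MIN treated as a wildcard on either side.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when the wait-phase ID is consistent with the
   notify-phase ID.  INT_MIN acts as a wildcard on either side,
   mirroring the condition quoted above. */
bool barrier_ids_match(int notify_id, int wait_id)
{
  if (wait_id == INT_MIN || notify_id == INT_MIN)
    return true;
  return notify_id == wait_id;
}
```

At the C level a negative ID such as -1 compares equal to itself, so a mismatch reported for matching negative IDs points at code generation rather than at this source line.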

@nenadv

nenadv commented Jun 29, 2016

I tried to rearrange the code with no success. The code generated in a bad case:

   1200039f4:   0240282d        move    a1,s2
   1200039f8:   8e220000        lw      v0,0(s1)
   1200039fc:   10500006        beq     v0,s0,120003a18 <$BB2_4>

I wonder whether the processor has a load delay slot, so that v0 is not yet loaded when the conditional branch reads it. The optimized GUPC version has another instruction between lw and beq.

Maybe we are not building clang correctly.

@nenadv

nenadv commented Jul 13, 2016

Since gdb was segfaulting on gcc23, I had to add segfault instructions (*((volatile int *)0) = 0), generate a core, and review the registers, stack, etc. The core of the problem is in this example:

extern void x (int);
int
main ()
{
  x (-3);
  upc_barrier (-3);
}

that generates this code on MIPS:

        ld      $25, %call16(my_upc_barrier)($gp)
        jalr    $25
        daddiu  $4, $zero, -3

        daddiu  $1, $zero, 1
        dsll    $1, $1, 32
        ld      $25, %call16(__upc_barrier)($gp)
        jalr    $25
        daddiu  $4, $1, -3

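The call to __upc_barrier passes -3 in the lower 32 bits of the register, but the upper bits (32-63) are zero: the 1 placed at bit 32 cancels the sign extension when -3 is added, giving 0x00000000FFFFFFFD. The first call, by contrast, passes the properly sign-extended value 0xFFFFFFFFFFFFFFFD.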
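
The register arithmetic can be checked with a minimal sketch in plain C that emulates the two instruction sequences with 64-bit integers. The helper names `daddiu`, `dsll`, `arg_first_call` and `arg_second_call` are hypothetical stand-ins for the instructions in the listing, not real APIs.

```c
#include <stdint.h>

/* daddiu rt, rs, imm: 64-bit add of a sign-extended 16-bit immediate. */
static uint64_t daddiu(uint64_t rs, int16_t imm)
{
  return rs + (uint64_t)(int64_t)imm;
}

/* dsll rd, rt, sa: 64-bit logical shift left. */
static uint64_t dsll(uint64_t rt, unsigned sa)
{
  return rt << sa;
}

/* First call:  daddiu $4, $zero, -3 */
uint64_t arg_first_call(void)
{
  return daddiu(0, -3);
}

/* Second call: daddiu $1, $zero, 1
                dsll   $1, $1, 32
                daddiu $4, $1, -3 */
uint64_t arg_second_call(void)
{
  uint64_t r1 = daddiu(0, 1);
  r1 = dsll(r1, 32);
  return daddiu(r1, -3);
}
```

Under this model `arg_first_call()` yields 0xFFFFFFFFFFFFFFFD and `arg_second_call()` yields 0x00000000FFFFFFFD: the same low 32 bits, but different 64-bit register contents.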

The LLVM code is like this:

define i32 @upc_main() #0 {
  call void @x(i32 signext -3)
  call void @__upc_barrier(i32 -3)
  ret i32 0
}

On x86_64 there is no difference in the LLVM IR between these two calls.
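
Why the missing signext matters on MIPS64 can be illustrated with a small model of the lw/beq pair quoted earlier in this thread. This is a sketch under two assumptions: that the stored barrier ID reaches the comparison through lw, and that the incoming ID is compared in the register exactly as the caller left it. The helper names `lw`, `beq_taken` and `ids_compare_equal` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* lw: load a 32-bit word and sign-extend it into a 64-bit register. */
static uint64_t lw(const int32_t *addr)
{
  return (uint64_t)(int64_t)*addr;
}

/* beq: taken when the two full 64-bit registers are equal. */
static bool beq_taken(uint64_t rs, uint64_t rt)
{
  return rs == rt;
}

/* Hypothetical model of the check: the stored ID is loaded from
   memory with lw, while the incoming ID is whatever 64-bit pattern
   the caller placed in the argument register. */
bool ids_compare_equal(int32_t stored_id, uint64_t arg_register)
{
  return beq_taken(lw(&stored_id), arg_register);
}
```

With a sign-extended argument (0xFFFFFFFFFFFFFFFD) the model reports equality for a stored -3; with the zero-extended pattern (0x00000000FFFFFFFD) it does not, even though both are -3 when viewed as 32-bit values. Non-negative IDs are unaffected because sign and zero extension agree for them.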
