Buffer overflow in PotentialTree with OpenMP #93

Closed
rtobar opened this issue Jun 23, 2021 · 3 comments
Labels
bug Something isn't working

Comments


rtobar commented Jun 23, 2021

When running with the inputs and configuration from #87/#88, using 4 OpenMP threads and 1 MPI rank, and compiling with clang's address sanitizer, I got the following output:

[0000] [  87.191] [debug] unbind.cxx:284 Unbinding 1521 groups ...
=================================================================
==182466==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020001e99a8 at pc 0x000000a457dc bp 0x7ff6f2c55470 sp 0x7ff6f2c55468
WRITE of size 8 at 0x6020001e99a8 thread T5
    #0 0xa457db in .omp_outlined._debug__.7 /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:1240:19
    #1 0xa46cfb in .omp_outlined..8 /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:1228:1
    #2 0x7ff7114e3ca2 in __kmp_invoke_microtask (/usr/lib/x86_64-linux-gnu/libomp.so.5+0xabca2)
    #3 0x7ff7114789c2  (/usr/lib/x86_64-linux-gnu/libomp.so.5+0x409c2)
    #4 0x7ff7114775f9  (/usr/lib/x86_64-linux-gnu/libomp.so.5+0x3f5f9)
    #5 0x7ff7114cb149  (/usr/lib/x86_64-linux-gnu/libomp.so.5+0x93149)
    #6 0x7ff71153f58f in start_thread nptl/pthread_create.c:463:8
    #7 0x7ff71134c222 in clone misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

0x6020001e99a8 is located 16 bytes to the right of 8-byte region [0x6020001e9990,0x6020001e9998)
allocated by thread T0 here:
    #0 0x53ca8d in operator new[](unsigned long) (/home/rtobar/scm/git/VELOCIraptor-STF/builds/88-debug-addrsan/stf+0x53ca8d)
    #1 0xa40b32 in PotentialTree(Options&, long long, NBody::Particle*&, NBody::KDTree*&) /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:1185:11
    #2 0xa3db16 in Potential(Options&, long long, NBody::Particle*) /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:960:5
    #3 0xa5357e in CalculatePotentials(Options&, NBody::Particle**, long long&, long long*) /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:434:13
    #4 0xa38e98 in Unbind(Options&, NBody::Particle**, long long&, long long*, long long*, long long**, int) /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:793:5
    #5 0xa48efd in CheckUnboundGroups(Options, long long, NBody::Particle*, long long&, long long*&, long long*, long long**, int, long long*) /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:350:18
    #6 0x8b040f in SearchBaryons(Options&, long long&, NBody::Particle*&, long long, std::vector<NBody::Particle, std::allocator<NBody::Particle> >&, long long*&, long long&, long long&, int, int, PropData*) /home/rtobar/scm/git/VELOCIraptor-STF/src/search.cxx:3840:13
    #7 0x547208 in main /home/rtobar/scm/git/VELOCIraptor-STF/src/main.cxx:463:13
    #8 0x7ff71125bcb1 in __libc_start_main csu/../csu/libc-start.c:314:16

Thread T5 created by T0 here:
    #0 0x4f770a in pthread_create (/home/rtobar/scm/git/VELOCIraptor-STF/builds/88-debug-addrsan/stf+0x4f770a)
    #1 0x7ff7114ca823  (/usr/lib/x86_64-linux-gnu/libomp.so.5+0x92823)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/rtobar/scm/git/VELOCIraptor-STF/src/unbind.cxx:1240:19 in .omp_outlined._debug__.7
Shadow bytes around the buggy address:
  0x0c04800352e0: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c04800352f0: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c0480035300: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c0480035310: fa fa fd fa fa fa fd fa fa fa fd fa fa fa fd fa
  0x0c0480035320: fa fa fd fa fa fa fd fa fa fa fd fa fa fa 00 fa
=>0x0c0480035330: fa fa 00 fa fa[fa]fa fa fa fa fa fa fa fa fa fa
  0x0c0480035340: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 00 fa
  0x0c0480035350: fa fa 00 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480035360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480035370: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480035380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==182466==ABORTING
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[57249,1],0]
  Exit code:    1
--------------------------------------------------------------------------
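For illustration: the 8-byte allocation and the 8-byte write landing 16 bytes past its end are consistent with a one-element per-thread array of 8-byte elements being indexed with a thread id of 3. A minimal, hypothetical sketch of that failure mode (names, sizes and the num_threads clause are illustrative, not the actual PotentialTree code):

#include <omp.h>

int main() {
    int nthreads = 1;                        // wrong: the parallel region below runs 4 threads
    double *scratch = new double[nthreads];  // 8-byte allocation, as in the ASan report above

    #pragma omp parallel num_threads(4)
    {
        int tid = omp_get_thread_num();
        scratch[tid] = tid;                  // tid 1..3 write past the end of the buffer
    }

    delete[] scratch;
    return 0;
}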
rtobar added a commit that referenced this issue Jun 23, 2021
There were a couple of problems with the old version of this code.
Firstly, it had an unused maxthreads variable, which has now been
removed. Secondly, and more importantly, it used a complex and incorrect
way of calculating the number of threads (stored in the nthreads
variable) that would be used in new "omp parallel" blocks: instead of
directly calling omp_get_max_threads and writing its result into
nthreads (the new behavior) it opened an "omp parallel" block and called
omp_get_num_threads, only on the first thread of the newly created
thread team. The result was then written back to nthreads.

Although this seems fine, it creates a race condition, as there is no
guarantee that thread #0 in the newly formed team is the same thread
from which the team was created. This in turn means that the write to
nthreads is not necessarily visible immediately from the parent thread.
The nthreads value is later used to allocate memory to be used by
"omp parallel" regions; having a wrong value (indeed the original 1 to
which nthreads is initialised at declaration time) leads to
wrongly-sized memory blocks, and to invalid writes down the line.

We saw this problem when running with clang's address sanitizer, which
led us to find this particular bug in the code.

This problem was reported in #93.

Signed-off-by: Rodrigo Tobar <[email protected]>
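As a concrete illustration of the two patterns described in the commit message, here is a minimal sketch (not the actual unbind.cxx code; function names are invented for the example) of the old, indirect thread-count calculation next to the direct call that replaces it:

#include <omp.h>

// Old pattern: nthreads is only updated by thread 0 of a newly created team,
// from inside a parallel region.
int count_threads_old() {
    int nthreads = 1;
    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0) nthreads = omp_get_num_threads();
    }
    return nthreads;
}

// New behavior: ask the runtime directly, on the calling thread, how many
// threads a subsequent "omp parallel" region may use.
int count_threads_new() {
    return omp_get_max_threads();
}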

rtobar commented Jun 23, 2021

As mentioned in 511767e, this was a race condition in the calculation of the available OpenMP threads. With the commit above I don't get the errors anymore, so I'll merge to master straight away after double-checking that the Travis runs don't fail either.

rtobar added the bug label on Jun 23, 2021

rtobar commented Jun 23, 2021

BTW, I also thought that this might have been related to the unsolved problem of #53, but that's not the case: I can still reproduce #53 after applying this fix.


rtobar commented Jun 23, 2021

This has been solved now and the fix has been merged to the master branch.

rtobar closed this as completed Jun 23, 2021
rtobar added a commit that referenced this issue Jun 24, 2021
This is similar to the error detected in #93 and fixed in 511767e, but
with the difference that in this case it doesn't seem to cause issues in
practice: at this point of the execution, thread #0 of the newly created
team most probably corresponds to the main thread executing the rest of
the code, so the write to nthreads *is* visible.

Signed-off-by: Rodrigo Tobar <[email protected]>