forked from pelahi/VELOCIraptor-STF
More potential issues hidden in MPISendReceive*InfoBetweenThreads functions #89
pelahi added a commit to pelahi/VELOCIraptor-STF that referenced this issue on Jan 4, 2024:
* **Bug fix for MPI communication of extra properties.** Fixes issue identified in ICRAR#89, ICRAR#87. Still needs testing to verify changes are correct.
* **Code clean-up.** Added changes pulled from the ICRAR fork. Key additions here are clean-up of writing HDF5 properties files, code clean-up of the Options structure, and some associated small changes in a few other files to account for changes from `*` to `std::vector` and removal of unused variables. Also some minor code clean-up of the fitting routines.
* **Fix mismatch in header and properties written.** Data sets written in the properties file updated to match the data sets outlined by the properties header in allvars.cxx.
* **Improve how compilation options are set and stored.** Combination of commits from ICRAR master 8fb8b64 and cc4f717. VR stores compilation options as a string and reports this at runtime.
* **Minor fix when parsing config.** As highlighted in c2202da of the icrar/master fork.
* **Add Timer class for easy time tracking.** This was taken almost verbatim from shark, where we've been using it (successfully) for years. Signed-off-by: Rodrigo Tobar <[email protected]>
* **Add utility code for formatting time and memory.** This code will give us proper formatting for memory and time amounts. Taken from shark, where we have been using it for a while, plus some minor simplifications and modifications.
* **Fixed mismatch in return type between definition and implementation.** CalcGravitationalConstant is declared with return type Double_t but the implementation has double. This is fine in default compilations, as Double_t = double, but not when running in single precision. Fixed.
* **Update to ioutils.**
* **Move non-positive particle density errors up.** Based on icrar/master fork commit a9c6285, with minor changes to name the exceptions file explicitly vr_exceptions. Original commit comment: "After some experiments I was able to reproduce the non-positive density errors earlier on in the code, right after all particles have their densities calculated. I thus decided to refactor the creation of the error message out of its previous place (localbgcomp.cxx) and move it into an exception that can be easily thrown from various places in the code."
* **Adding logger class.** Addition of a logger class; the code requires further updates to use the logger and timer classes. First commit is from icrar/master 93d6384.
* **Update to user interface.** Fixed a typo and updated the code to use the logger class.
* **Added Logger, Timer, and MemReport classes.** Changed logging from cout to the LogStatement class, and removed explicit use of chrono in favour of the Timer class. Memory usage reporting also updated to use the LogStatement formatting. Minor bug fix in writing profiles in io.cxx, which had incorrectly used the wrong set of MPI tasks to write meta information when doing parallel HDF5. Based on multiple commits in icrar/master that updated the logging of VR.
* **Avoid memory leak on MPISetFilesRead calls.** Fixes a minor memory leak; from icrar/master commit b3619fa. Almost all calls to MPISetFilesRead resulted in small memory leaks because callers were allocating the array that was passed as an argument (as a pointer reference), which was then internally allocated again, creating a leak even if the caller was careful to deallocate what they had allocated. To make things simpler in the long run, all these arrays were replaced by std::vector objects, ensuring their memory is automatically managed through RAII, and cleaning up the interface of MPISetFilesRead a bit. This addresses #65, as well as similar memory leaks that were hidden in the code handling the other input formats (RAMSES, Gadget, NChilada).
* **Minor bug fixes.** Cleaned up some minor memory leaks and an incorrect function definition. Also a minor update to cmake related to logging, and removal of unused memory reporting functions.
* **Minor code clean-up of io routines.** Added logging, fixed some tabbing, and added inclusion of io.h to the io routines.
* **Update to HDF5 interface based on commits from the icrar fork.** These are fairly simple but do the work. In time we might want to add more bits of code like this for other containers:
  - Add verbosity flag to H5OutputFile class: allows setting the flag on a per-file basis whenever we want to print more information about the output of a particular file.
  - Output written dataset names and dimensionality.
  - Simplify safe_hdf5 function usage: the previous incarnation of this function required callers to specify the function return type, but safe_hdf5 already has all the information required to calculate this automatically. The new version deduces the return type and also prints the HDF5 function name in the error message. To do so without much friction, safe_hdf5 is actually a macro that feeds _safe_hdf5, the actual function.
  - Use safe_hdf5 in write_dataset_nd: most of the HDF5 operations performed within write_dataset_nd had their error code or result unchecked, so we could fail silently before the actual data writing happens. This adds safe_hdf5 to the HDF5 operations occurring in write_dataset_nd, potentially alerting us to otherwise silent errors.
  - Rework bits of H5OutputFile: the H5OutputFile class contained some pieces of logic around the create/append/close methods that were very hard to follow. Also, these methods were defined in the header file itself, which only adds complexity to the header and increases compilation time. This re-implements some of the H5OutputFile methods (the create/append/close public methods, plus an associated private member) and moves their definitions to a new h5_output_file.cxx module, leaving the header thinner and easier to follow. The implementation of H5OutputFile uses the FileExists function, which was only declared in proto.h, a single, big pool of function declarations. We have been trying to move away from that and into per-module header files, so this opportunity was used to create a new io.h file and move the declaration of FileExists into it, including the new header in the few affected modules.
  - Simplify FileExists implementation: there were a few unused or unnecessary variables; the implementation is now a two-liner.
  - Move safe_hdf5 into new h5_utils module: having an h5_utils header+implementation module is in general probably a good idea, as it will help move some of the more general utilities out of hdfitems.h, which currently defines loads of code inline.
  - Add new dataspace_information function: allows us to inspect more closely the datasets we have opened, either for reading or for writing.
  - Inspect datasets before they are written: this will hopefully help identify what is going on.
  - Remove redundant parameters from write_dataset: the only "flag" parameter of the write_dataset family of functions that was actively used was flag_parallel; the rest were always left at their default value (true) and therefore served only to pollute the interface. These parameters are removed and defined as local, fixed variables in the final routine that performs the writing. Later on we can simplify the code in that function based on these fixed values.
  - Cleanup and move main write_dataset_nd function: the implementation of the main write_dataset_nd function was slightly confusing, as it contained some code duplication and too many #ifdef macros spread around the code; it also still used the different "flag" variables that we now know are always true. This re-implements the routine. The overall structure and logic remain the same, but variables have been renamed for clarity, common code has been factored out into reusable functions, #ifdef usage has been greatly reduced, and the code has been simplified after removing all usage of the "flag" variables. Since this is a big change, the opportunity was taken to move the implementation of this function down to the new h5_output_file.cxx file, avoiding re-compiling it each time hdfitems.h is included.
  - Cleanup and move rest of write_dataset functions: now that the main write_dataset_nd function has been cleaned up and moved to h5_output_file.cxx, the rest of the functions that depend on it can move too. During this move some small duplicate checks were removed, indentation was adjusted, and asserts were added for otherwise unnecessary arguments. The same treatment was given to the templated write_dataset functions, which must remain in the header.
  - Unify write_attribute method implementations: there were two very similar implementations of this functionality, one for plain scalar values and another for strings, but in the end they were very much the same. The common bits of both implementations now live in a private method, and the two existing public methods invoke the private one with the correct arguments. In addition, the string version of write_attribute was failing to close the new data type it created, leading to object identifier leaks in HDF5; this is now fixed. Finally, scalar attributes can (and in VR are) only written in serial; a new assertion to this effect has been added.
  - Write HDF5 datasets in 2 GB chunks: HDF5 doesn't properly support writing >= 2 GB of data in a single I/O operation. Even though it was reported that 1.10.2 fixed this limitation, we have seen several situations in which this limit breaks our code in strange ways. Instead of writing the given data in a single H5Dwrite call, we now perform writes of at most 2 GB at a time. Chunking is done only in the first dimension, which is also the dimension that is written in parallel from multiple MPI ranks. Some variable names were also adjusted and the code in write_dataset_nd simplified. This is the main change required to remove the problems reported in #88.
  - Close objects in exact reverse order: this might not be a problem in reality, but better safe than sorry.
  - Add test program for H5OutputFile: this is in general a useful tool to have (and even the beginning of a unit test suite), and in particular it helps reproduce #88 much faster.
  - Fix incorrect assertion: when writing HDF5 files in parallel, attribute writing must happen in parallel. Sometimes it's not possible to bypass this limitation, and therefore attribute writing must be done by a single rank even in MPI-able scenarios; the H5OutputFile class has been designed with this usage in mind. While refactoring the code as part of #88, assertions were added to make sure this held true from the caller site (see e703f6f). That assertion is faulty, however, as it only applies when compiling against MPI-enabled HDF5 libraries, which is what this commit fixes. The faulty assertion was reported in #113.
* **Bug fix.** Minor bug fix resulting from a rebase, plus minor updates to the set of commits in icrar master related to the HDF5 update.
* **Fixed improper const.**
* **Restructure SO information collection and writing.** Spherical overdensity information (particle IDs and types) is collected by two different routines: GetInclusiveMasses and GetSOMasses. Both routines use the following technique: first, for "ngroup" groups, a C array-of-vectors (AOV) of "ngroup + 1" elements is allocated via "new" (and eventually delete[]'ed). The vectors for each group are filled independently as groups are processed; the loop over groups is 1-indexed, and indexing into these AOVs uses the same iteration variable, meaning their element 0 is skipped. Finally, these AOVs are passed to WriteSOCatalog for writing. WriteSOCatalog is aware of the 1-based indexing, and additionally flattens the AOVs into a single vector (thus duplicating their memory requirement) to finally write the data into the output HDF5 file. This commit originally aimed to reduce the memory overhead of the final writing of data into the HDF5 file (see #71). The main change required is to perform the flattening of the data at separate times, so that particle IDs and types are not flattened at the same time but one after the other, with memory from the first flattening operation released before the second happens, reducing the maximum memory requirement of the application. This goal was achieved. However, while performing these changes two things became clear: firstly, that a vector-of-vectors (VOV) is a better interface than an AOV (due to automatic memory management), and secondly, that the 1-based indexing of the original AOVs introduced much complexity in the code. The scope was then broadened to cover these two extra changes, so the commit grew considerably in size. In particular, the 0-indexing of the VOVs allowed easier use of std algorithms that clarify the intent in certain places of the code. Other minor changes are also included, mostly to reduce variable scopes, reduce code duplication, and the like; assertions have also been sprinkled here and there for further assurance that the code works as expected. As an unintended side effect, this commit also fixed the wrongly-calculated Offset dataset, which was off by one index in the original values; this problem was reported in #108 and seems to have always been there.
* **Adding logging information in local field.** Minor updates related to logging information, along with some code clean-up.
* **Split memory usage collection and reporting.** It will be useful in some places of the code to measure memory usage without necessarily displaying it as a normal "report" in the logs, so splitting metrics collection from reporting helps reuse the collection bits elsewhere. While doing this, the collection itself was also simplified: we don't need that many fields, as we are usually only interested in VM/RSS current/peak usage, and we can use a single file as the source of the information, which should also speed up the data collection.
  - Use simpler function names in memory reports: pretty names are unnecessarily long and don't really give any useful information, especially because we already have the location in the file where the report is generated.
  - Simplify memory usage reporting calls: memory reporting doesn't need the Options object anymore (it was previously used to keep track of peak memory usage, which the kernel already does for us); additionally we already have the file/line location showing up in the log statement when we generate a memory report, so there's no need to have the information twice on the same line.
  - Remove `memuse_*` fields from Options class: these were not used in reality, or were previously used incorrectly, so there's no need to keep them around. If we need some memory tracking functionality in the future, we can reinstate them in a different form.
  - Finish VR gracefully on uncaught exceptions: VR's main function didn't have a high-level try/catch block to handle uncaught exceptions rising from the code, leaving the behaviour of the program to whatever default the C++ runtime library provides. This adds such a block: when an uncaught exception is received, an error is logged, and then either an MPI_Abort is issued on COMM_WORLD, or a simple exit is invoked on non-MPI builds.
  - Log memory requirements for data transfers: this might help us understand what exactly is causing #118 (if logs are correctly flushed and don't get too scrambled).
* **Minor updates.** Cleaned up some HDF5 code and some other minor clean-up. May need to check the unbind code for nested parallelism.
* **Adding most bound BH ID and position to output.** Added the position and ID of the BH that is closest to the most bound dark matter particle. This was tested in COLIBRE and works well.
* **Update to store entropy and temperature.** Storing temperature in gas particles requires an update to the Particle class in NBodylib as well as an update to the HDF5 interface.
* **Update to SO calculations.** Based on commits merged in 21ffc70. These updates are based on several commits by Claudia Lagos to improve and extend the SO and half-mass radii calculations:
  - The code has been updated to compute the radii of different spherical overdensities in log10(rho)-natural radii space, as done by Roi Kugel. Once radii are computed, the enclosed masses are calculated in log10(M)-radii space.
  - Small modification to the SO calculations to follow what Pascal has in his branch so-gasstar-masses, but keeping the interpolation in log10 space. With this, the M200crit halo mass function of EAGLE between VELOCIraptor and SUBFIND is in perfect agreement down to ~32 particles, which is the limit imposed in SUBFIND to find subhalos.
  - Interpolation is no longer used to compute all half-mass radii; this is instead done by looping over particles ordered in radius.
  - The previous calculation of Rhalfmass_star was wrong while the aperture r50 values were correct (determined by comparing the VR catalogues with those of SUBFIND in EAGLE). This was because the particle rc was defined differently in Rhalfmass_star than in the aperture calculation, and because Rhalfmass_star wasn't using interpolation. The `Rhalfmass_*` calculations now follow the aperture ones.
* **Port SWIFT interface to use better logging.** This is by no means complete, but is a step in the right direction.
* **De-duplicate `InitVelociraptor*` functions.** The bodies of the two existing functions were almost identical, with InitVelociraptorExtra *not* initialising a couple of things, which looks like a bug. Their major difference was the Options variable they worked on, so instead of duplicating the code it's much easier to reuse it and pass the corresponding Options object as an argument.
* **Further logging improvements on swiftinterface.cxx.**
* **Updates to swift interface.** Update to logging and timing in the SWIFT interface.
* **Bug fix for spherical overdensity calculation of subhalos.** SO masses and related quantities for subhalos were incorrectly calculated.
* **Feature/core binding report (#110):**
  - Added report of core binding: VR reports where MPI ranks and OpenMP threads are bound. Initially tested on OSX; needs testing on Linux.
  - Update to formatting of core binding.
  - Minor change, removing an unnecessary for loop.
  - Cleaned up the MPI report of which nodes the code is running on.
  - Bug fix to HDF5 reading; improved error handling of HDF5 reads.
  - Logging now also reports the function from which it is called.
  - Bug fix for OpenMP FOF when running with MPI.
  - Added the C++ standard to use.
  - Bug fix for sending extra properties between MPI ranks.
  - Minor fix to ensure MPI calls all occur before finalize.
  - Added clang formatting standard.
  - Minor update to reading other formats related to MPI decomposition.
  - Minor bug fix introduced in the previous commit.
  - Update to allow VR to period-wrap particles outside the periodic domain for gadget input.
  - Fixes for z-mesh scaling with the number of MPI processes.
  - Update to period wrap of input for gadget input.
  - Minor update to binding report.
* **Update to NBodylib used (#114).**

Signed-off-by: Rodrigo Tobar <[email protected]>
Co-authored-by: Rodrigo Tobar <[email protected]>
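The "write HDF5 datasets in 2 GB chunks" commit above boils down to splitting a write into pieces along the first dimension so no single H5Dwrite call exceeds the limit. A minimal sketch of that splitting logic, with illustrative names and no actual HDF5 calls (this is not the VELOCIraptor code, just the idea):

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Given a dataset of `nrows` rows where each row occupies `row_bytes` bytes,
// split the write along the first dimension so that each piece stays at or
// under `max_bytes` (~2 GB in VR's case). Each pair in the result is
// (starting row, number of rows) for one hyperslab-sized write.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
chunk_first_dimension(std::uint64_t nrows, std::uint64_t row_bytes,
                      std::uint64_t max_bytes)
{
    std::vector<std::pair<std::uint64_t, std::uint64_t>> chunks;
    if (nrows == 0 || row_bytes == 0) {
        return chunks;
    }
    // Rows that fit in one piece; at least one row per piece so that a
    // single oversized row is still attempted rather than skipped.
    std::uint64_t rows_per_chunk =
        std::max<std::uint64_t>(1, max_bytes / row_bytes);
    for (std::uint64_t start = 0; start < nrows; start += rows_per_chunk) {
        chunks.emplace_back(start, std::min(rows_per_chunk, nrows - start));
    }
    return chunks;
}
```

Chunking only the first dimension keeps each piece contiguous in row-major storage, which is also why it composes naturally with parallel writes split across MPI ranks along that same dimension.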
In #87 it was found that the four `MPISendReceiveFOF*InfoBetweenThreads` functions all had a bug in their logic (which was duplicated across all four) where an input buffer was not sized correctly for the amount of data it received via MPI. A fix was issued that both solved the buffer-size issue and removed the code duplication by providing a single function that performs the data exchange. It was also found that #54 had already fixed one of those functions, but had failed to identify the broader problem affecting all four.

After fixing #87 I went and had another look at the rest of the functions in this file (`mpiroutines.cxx`). I realised there are several `MPISendReceive*InfoBetweenThreads` families of functions, namely:

* `MPISendReceive<component>InfoBetweenThreads`
* `MPISendReceiveBuffWith<component>InfoBetweenThreads`
* `MPISendReceiveFOF<component>InfoBetweenThreads`

(where `<component>` is `Hydro`, `Star`, `BH` or `ExtraDM`)

From this list, the last family comprises the functions fixed in #87. The rest, however, seem to follow a similar structure, and from a quick overview they also contain a copy of the same data exchange pattern (some with and some without the buffer-sizing bug) that was fixed and consolidated into a reusable function in #87. We should revisit these functions and reuse the new function where possible, thus minimising the chances of running into memory corruption issues again.
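The consolidated pattern can be sketched as follows. This is an illustrative, MPI-free mock with hypothetical names, not the actual fix from #87: the key point is that each receive buffer is sized from the count announced by the sender, not from the local send count, which is the mistake behind the original buffer-sizing bug. In the real code the count and payload exchanges would be MPI calls (e.g. MPI_Sendrecv, or MPI_Probe followed by MPI_Get_count).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simulates a pairwise exchange between two "ranks" a and b. Each rank sends
// its vector and receives into the other rank's receive buffer.
template <typename T>
void exchange_between_ranks(const std::vector<T> &send_a, std::vector<T> &recv_a,
                            const std::vector<T> &send_b, std::vector<T> &recv_b)
{
    // Step 1: exchange element counts (stands in for an MPI_Sendrecv of the
    // sizes, or MPI_Probe + MPI_Get_count on the receiving side).
    std::size_t count_from_b = send_b.size();
    std::size_t count_from_a = send_a.size();

    // Step 2: size each receive buffer from the *incoming* count. The bug
    // fixed in #87 was effectively sizing this buffer from the local count,
    // corrupting memory whenever the remote rank sent more data.
    recv_a.resize(count_from_b);
    recv_b.resize(count_from_a);

    // Step 3: transfer the payloads (stands in for the MPI data exchange).
    std::copy(send_b.begin(), send_b.end(), recv_a.begin());
    std::copy(send_a.begin(), send_a.end(), recv_b.begin());
}
```

With a single templated helper like this, each `MPISendReceive*InfoBetweenThreads` variant only needs to prepare its component-specific send buffer and hand it over, so the sizing logic exists (and is fixed) in exactly one place.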