cache file handles in HDF5 output to avoid opening and closing the same file multiple times #6

Open

wants to merge 4 commits into master

Conversation

rhaas80 (Member) commented Aug 8, 2024

@rhaas80's pull request 34 https://bitbucket.org/eschnett/carpet/pull-requests/34

  • CarpetIOHDF5: keep outND_var files open until all data is written

    This reduces file-system open/close calls from once per variable per refinement level per output iteration to once per variable per output iteration, or fewer if one_file_per_group or one_file_per_proc is used (a rough sketch of the caching idea follows this list).

    It does, however, keep more files open, which can be a problem if the OS limit on open files is too small. That limit is usually per process, so as long as no single process opens many files, the simulation as a whole can still have very many files open.

  • CarpetIOHDF5: be more careful closing all open hdf5 files
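
As a rough illustration of the caching idea only: below is a minimal sketch of a per-process file-handle cache built on the HDF5 C API. The names GetFileHandle and CloseAllFileHandles are hypothetical; the actual CarpetIOHDF5 change keeps its handles inside the thorn's own I/O data structures rather than in a global map.

#include <hdf5.h>
#include <map>
#include <string>

// Cache of open HDF5 file handles, keyed by file name (hypothetical helper,
// not the real CarpetIOHDF5 implementation).
static std::map<std::string, hid_t> open_files;

// Return a cached handle, touching the filesystem only on first use.
hid_t GetFileHandle(const std::string &filename) {
  auto it = open_files.find(filename);
  if (it != open_files.end())
    return it->second;               // reuse: no extra open/close call

  hid_t file = H5Fopen(filename.c_str(), H5F_ACC_RDWR, H5P_DEFAULT);
  if (file < 0)                      // file does not exist yet: create it
    file = H5Fcreate(filename.c_str(), H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

  open_files[filename] = file;
  return file;
}

// Close every cached handle once all data for this output iteration has been
// written (or right away when a flush_to_disk = "immediate" style parameter
// asks for it), instead of closing after every variable and refinement level.
void CloseAllFileHandles() {
  for (auto &entry : open_files)
    if (entry.second >= 0)
      H5Fclose(entry.second);
  open_files.clear();
}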

rhaas80 (Member, Author) left a comment

transferred comments from bitbucket

if (CCTK_EQUALS(flush_to_disk, "immediate"))
  error_count += CloseFile(cctkGH, file);

if (nioprocs > 1) {
rhaas80 (Member, Author):

@eschnett says

This would do the wrong thing if there are n processes, but only 1 I/O process. MPI_Allreduce for dist::comm() is superfluous only if nprocs==1.

rhaas80 (Member, Author):

Not quite sure I follow. If (I quote) the Allreduce “is superfluous only if nprocs==1”, and if (nioprocs > 1) is the condition used (and nioprocs >= 1 always), isn't if (nioprocs > 1) the same as if (nioprocs == 1)?
I suspect that for this I/O thorn either nioprocs == 1 or nioprocs == nprocs. In the first case the ioproc already has “good” values in io_files; at least that seems to be the assumption in the place where I copied this code from, namely:
I would think that the Allreduce can be safe if io_files and io_bytes are initialized to 0 on all ranks. Should have tests.
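
A minimal sketch of that zero-initialization argument, assuming MPI_COMM_WORLD stands in for dist::comm() and plain long long counters (the helper name ReduceIOStats is hypothetical):

#include <mpi.h>

// Per-process I/O statistics. Ranks that performed no I/O leave these at 0,
// so a sum-reduction over *all* ranks still produces the correct totals,
// whether nioprocs == 1 or nioprocs == nprocs.
static long long io_files = 0;
static long long io_bytes = 0;

// Must be called collectively on every rank of comm (no nioprocs guard).
void ReduceIOStats(MPI_Comm comm) {
  long long counts[2] = {io_files, io_bytes};
  MPI_Allreduce(MPI_IN_PLACE, counts, 2, MPI_LONG_LONG, MPI_SUM, comm);
  io_files = counts[0]; // global number of files written
  io_bytes = counts[1]; // global number of bytes written
}

Because the non-I/O ranks only contribute zeros, the reduction is harmless for them and a guard on nioprocs becomes unnecessary; a test with nprocs > 1 and a single I/O process would cover exactly the case @eschnett is worried about.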

rhaas80 (Member, Author):

This code looks correct to me, or at least as correct as it was when it was still part of the CloseFile function (which is where it was moved from).
