Enable support for internally parallelized evaluation of the loss function using multi-rank workers #110

Merged
oskar-taubert merged 52 commits into master from feature/multi_rank_workers
Mar 13, 2024

@mcw92 mcw92 commented Mar 13, 2024

This PR mainly implements multi-rank workers for internally parallelized evaluation of the loss function. In the future, this will be particularly useful for data-parallel training of individual neural networks during the Propulate optimization.

In addition to the intra-island communicator and the Propulate optimization communicator (which previously corresponded to MPI.COMM_WORLD), this implementation introduces a third communicator at the level of each multi-rank worker, worker_sub_comm. It is created by splitting MPI.COMM_WORLD in the islands.py module, which also required changes to how the other communicators are set up. Note that the worker sub communicator must also be passed to the loss function to enable parallelized evaluation.
An example script, multi_rank_worker_example.py, is available in the tutorials/ folder.
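
To make the communicator hierarchy more concrete, here is a minimal pure-Python sketch of how global ranks could be grouped into islands and multi-rank workers, and how one loss evaluation could be shared among the ranks of a worker. It does not use MPI and is not the code from islands.py. The names `RankLayout`, `layout_for_rank`, and `partial_sphere_loss`, the contiguous block layout, and the sphere loss are all assumptions made for illustration.

```python
from typing import NamedTuple, Sequence


class RankLayout(NamedTuple):
    island: int       # island index of this rank
    worker: int       # worker index within the island
    worker_rank: int  # rank within the worker's sub communicator


def layout_for_rank(world_rank: int, ranks_per_worker: int, workers_per_island: int) -> RankLayout:
    """Map a global rank to its island, worker, and rank within the worker.

    Assumes contiguous blocks of ranks (an illustrative choice, not
    necessarily the layout used in islands.py).
    """
    if world_rank < 0 or ranks_per_worker < 1 or workers_per_island < 1:
        raise ValueError("rank must be >= 0 and block sizes must be >= 1")
    global_worker = world_rank // ranks_per_worker
    return RankLayout(
        island=global_worker // workers_per_island,
        worker=global_worker % workers_per_island,
        worker_rank=world_rank % ranks_per_worker,
    )


def partial_sphere_loss(params: Sequence[float], worker_rank: int, worker_size: int) -> float:
    """Sum of squares over the strided slice of params owned by one rank."""
    return sum(x * x for x in params[worker_rank::worker_size])


if __name__ == "__main__":
    # 8 ranks, 2 ranks per worker, 2 workers per island -> 2 islands.
    for rank in range(8):
        print(rank, layout_for_rank(rank, ranks_per_worker=2, workers_per_island=2))

    params = [1.0, 2.0, 3.0, 4.0]
    # With real MPI, each rank would compute one partial value and a reduction
    # over the worker sub communicator would sum them. Here a loop stands in
    # for that reduction.
    total = sum(partial_sphere_loss(params, r, 2) for r in range(2))
    print(total)
```

In an actual MPI program, the global worker index computed above could serve as the `color` argument of mpi4py's `Comm.Split(color, key)` to obtain one sub communicator per worker. Whether Propulate derives its colors this way is not stated in this PR description.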

In addition, the following things were fixed "on the fly":

  • Clean up CMA-ES:
    • Write clean docstrings.
    • Remove unnecessary getter and setter functions.
    • Rename variables so their names speak for themselves.
    • Make serial CMA-ES reproducible by passing the Propulate RNG correctly.
  • Improve README:
    • Add comment about new Propulate mailing list and GitHub discussions.
    • Add minimum working example to Quickstart section (see also tutorials/minimum_working_example).
  • Fix docstrings and type hints wherever problems were encountered.
  • Basic update of ReadTheDocs pages:
    • Add minimum working example to Quickstart section.
    • Update simple Propulator example with logging functionality.
    • Minor changes for the islands and HPO tutorial.
    • Add a stub for the multi-rank worker tutorial.
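
The reproducibility fix for serial CMA-ES mentioned in the list above rests on a general pattern: pass one seeded random number generator explicitly to every component that draws random numbers, instead of relying on global random state. The following is a generic sketch of that pattern using Python's standard `random.Random`. It is not Propulate's CMA-ES code, and `sample_candidate` and `run` are hypothetical names.

```python
import random
from typing import List


def sample_candidate(rng: random.Random, dim: int) -> List[float]:
    """Draw a candidate from the RNG passed in, never from the global one."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def run(seed: int, steps: int = 3, dim: int = 2) -> List[List[float]]:
    """Create one seeded RNG and hand it to every sampling call."""
    rng = random.Random(seed)
    return [sample_candidate(rng, dim) for _ in range(steps)]


if __name__ == "__main__":
    # Two runs with the same seed produce identical candidates.
    print(run(42) == run(42))
```

Because all randomness flows through the single `rng` object, repeating a run with the same seed yields the same sequence of candidates.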

Things to do and fix in future PRs:

  • Write integration test for multi-rank worker functionality. This will be done once the already available tests are transformed to MPI-enabled tests.
  • Add tutorials for PSO, CMA-ES, and multi-rank workers to ReadTheDocs.

Closes #24, #102, #107

@mcw92 mcw92 added the enhancement label Mar 13, 2024
@mcw92 mcw92 linked an issue Mar 13, 2024 that may be closed by this pull request
@mcw92 mcw92 requested a review from oskar-taubert March 13, 2024 12:28
@mcw92 mcw92 self-assigned this Mar 13, 2024
@oskar-taubert oskar-taubert merged commit 547c1fb into master Mar 13, 2024
1 check passed
@mcw92 mcw92 deleted the feature/multi_rank_workers branch March 28, 2024 09:38