feat: refactor core/thread logic for mpibackend

This takes George's old GUI-specific `_available_cores()` method, moves it, and greatly expands it to include updates to the core and hardware-threading logic that previously lived inside `MPIBackend.__init__()`. This was necessary because of the number of common but different outcomes that depend on platform, architecture, hardware-threading support, and user choice.

These changes do not involve very many lines of code, but a good amount of thought and testing has gone into them. Importantly, these `MPIBackend` API changes are backwards-compatible, and no changes to current usage code are needed. I suggest you read the long comments in `parallel_backends.py::_determine_cores_hwthreading()` outlining how each variation is handled.

Previously, if the user did not provide the number of MPI processes they wanted to use, `MPIBackend` assumed that the number of detected "logical" cores would suffice. As George previously showed, this does not work in HPC environments such as OSCAR, where the only number of cores that we are actually allowed to use is the "affinity" core number, given by `len(psutil.Process().cpu_affinity())`. Besides "logical" and "affinity", a third kind of core count is important: "physical".

However, there was an additional problem here that was still unaddressed: hardware-threading. Different platforms and situations report different numbers of logical, affinity, and physical CPU cores. One of the factors that affects this is whether hardware-threading, such as Intel Hyperthreading, is present on the machine. For example, on a Linux laptop with an Intel chip that has Hyperthreading, the logical and physical core counts differ: logical includes Hyperthreads (e.g. `psutil.cpu_count(logical=True)` reports 8 cores), but physical does not (e.g. `psutil.cpu_count(logical=False)` reports 4 cores).
If we tell MPI to use 8 cores ("logical"), then we also need to tell it to enable the hardware-threading option. However, if the user does not want to enable hardware-threading, then we need to make this an option, tell MPI to use 4 cores ("physical"), and tell MPI not to use the hardware-threading option.

The "affinity" core number makes things even more complicated: in the Linux laptop example, it is equal to the logical core number. However, on OSCAR, it is very different from the logical core number, and on macOS, it is not available at all.

In the lengthy comments of `_determine_cores_hwthreading()`, I have thought through each common scenario and, I believe, resolved what to do for each with respect to the number of cores to use and whether or not to use hardware-threading. These scenarios include: the user choosing to use hardware-threading (default) or not, across macOS variations with and without hardware-threading, Linux local computer variations with and without hardware-threading, and Linux HPC (e.g. OSCAR) variations, which appear to never support hardware-threading. In the Windows case, due to both jonescompneurolab#589 and the currently-untested MPI integration on Windows, I always report the machine as not having hardware-threading.

Additionally, previously, if the user did provide a number of MPI processes, `MPIBackend` used some "heuristics" to decide whether to use MPI oversubscription and/or hardware-threading, but the user could not override these heuristics. Now, when a user instantiates an `MPIBackend` with `__init__()` and uses the defaults, hardware-threading is detected more robustly and enabled by default, and oversubscription is enabled based on its own heuristics; this is the case when the new arguments `hwthreading` and `oversubscribe` are left at their default value of `None`. However, if the user knows what they're doing, they can also pass either `True` or `False` to either of these options to force them on or off.
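
As a rough illustration of the `None` / `True` / `False` behavior described above, here is a minimal sketch. The helper names `resolve_option` and `build_mpi_flags`, and the keyword arguments `hw_detected` and `oversub_needed`, are hypothetical and are not part of `MPIBackend`. The flags `--use-hwthread-cpus` and `--oversubscribe` are Open MPI `mpirun` options; that these are the flags toggled by `hwthreading` and `oversubscribe` is an assumption made for the example.

```python
from typing import List, Optional


def resolve_option(user_choice: Optional[bool], detected: bool) -> bool:
    """Return the detected value when the user passes None, else the user's choice."""
    return detected if user_choice is None else user_choice


def build_mpi_flags(
    hwthreading: Optional[bool] = None,
    oversubscribe: Optional[bool] = None,
    *,
    hw_detected: bool,
    oversub_needed: bool,
) -> List[str]:
    """Hypothetical helper: turn the two tri-state options into mpirun flags.

    hw_detected and oversub_needed stand in for whatever the backend's own
    detection and heuristics would report (assumed inputs for this sketch).
    """
    flags = []
    if resolve_option(hwthreading, hw_detected):
        flags.append("--use-hwthread-cpus")  # Open MPI option (assumed mapping)
    if resolve_option(oversubscribe, oversub_needed):
        flags.append("--oversubscribe")  # Open MPI option (assumed mapping)
    return flags


# Defaults (None) defer to detection and heuristics.
default_flags = build_mpi_flags(hw_detected=True, oversub_needed=False)

# Explicit True/False overrides whatever was detected.
forced_flags = build_mpi_flags(
    hwthreading=False, oversubscribe=True, hw_detected=True, oversub_needed=False
)
```

The point of the tri-state is that `None` means "let the backend decide", which keeps existing calls that pass neither argument working unchanged.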
Furthermore, in the case of `hwthreading`, if the user indicates they do not want to use it, then `_determine_cores_hwthreading()` correctly returns the number of non-hardware-threaded cores for MPI's use, instead of the core count including hardware-threads.

I have also modified and expanded the relevant tests to account for these changes.

Note that this does NOT change the default number of jobs to use for the GUI if MPI is detected. Such a change breaks the current `test_gui.py` testing: see jonescompneurolab#960
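
To make the interplay between "logical", "physical", and "affinity" core counts in this message more concrete, here is a deliberately simplified sketch. The function `choose_cores` is hypothetical and is not the actual logic of `_determine_cores_hwthreading()`, which handles more scenarios (see its comments). It assumes that an affinity count smaller than the logical count indicates a restricted, HPC-like allocation without hardware-threading.

```python
from typing import Optional, Tuple


def choose_cores(
    logical: int,
    physical: int,
    affinity: Optional[int],
    enable_hwthreading: bool = True,
) -> Tuple[int, bool]:
    """Hypothetical sketch: pick (n_cores, use_hwthreading) from the three counts.

    affinity is None where the platform cannot report it (e.g. macOS).
    """
    # Restricted allocation (HPC-like): only the affinity cores are usable,
    # and this sketch assumes hardware-threading is unavailable there.
    if affinity is not None and affinity < logical:
        return affinity, False

    # Hardware-threading is present when logical cores outnumber physical ones.
    has_hwthreading = logical > physical
    if has_hwthreading and enable_hwthreading:
        return logical, True

    # Otherwise fall back to the non-hardware-threaded core count.
    return physical, False


# Linux laptop from the example: 8 logical, 4 physical, affinity equals logical.
laptop_default = choose_cores(8, 4, 8)
laptop_no_hw = choose_cores(8, 4, 8, enable_hwthreading=False)

# macOS-like case: no affinity information is available.
mac_default = choose_cores(8, 4, None)

# Made-up HPC allocation: a large node, but the job is allowed only 4 cores.
hpc_default = choose_cores(48, 48, 4)
```

In real use, the three inputs would come from `psutil.cpu_count(logical=True)`, `psutil.cpu_count(logical=False)`, and `len(psutil.Process().cpu_affinity())` where the platform supports it.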
asoplata · Dec 13, 2024 · 25c78c0
1 parent a67b026
Showing 3 changed files with 368 additions and 91 deletions.
diff --git a/hnn_core/gui/gui.py b/hnn_core/gui/gui.py
@@ -8,8 +8,6 @@
 import logging
 import mimetypes
 import numpy as np
-import platform
-import psutil
 import sys
 import json
 import urllib.parse
@@ -36,7 +34,9 @@
                                      get_L5Pyr_params_default)
 from hnn_core.hnn_io import dict_to_network, write_network_configuration
 from hnn_core.cells_default import _exp_g_at_dist
-from hnn_core.parallel_backends import _has_mpi4py, _has_psutil
+from hnn_core.parallel_backends import (_determine_cores_hwthreading,
+                                        _has_mpi4py,
+                                        _has_psutil)
 
 hnn_core_root = Path(hnn_core.__file__).parent
 default_network_configuration = (hnn_core_root / 'param' /
@@ -347,7 +347,10 @@ def __init__(self, theme_color="#802989",
         self.params = self.load_parameters(network_configuration)
 
         # Number of available cores
-        self.n_cores = self._available_cores()
+        [self.n_cores, _] = _determine_cores_hwthreading(
+            enable_hwthreading=False,
+            sensible_default_cores=True,
+        )
 
         # In-memory storage of all simulation and visualization related data
         self.simulation_data = defaultdict(lambda: dict(net=None, dpls=list()))
@@ -407,7 +410,8 @@ def __init__(self, theme_color="#802989",
         self.widget_mpi_cmd = Text(value='mpiexec',
                                    placeholder='Fill if applies',
                                    description='MPI cmd:', disabled=False)
-        self.widget_n_jobs = BoundedIntText(value=1, min=1,
+        self.widget_n_jobs = BoundedIntText(value=1,
+                                            min=1,
                                             max=self.n_cores,
                                             description='Cores:',
                                             disabled=False)
@@ -496,22 +500,6 @@ def __init__(self, theme_color="#802989",
         self._init_ui_components()
         self.add_logging_window_logger()
 
-    @staticmethod
-    def _available_cores():
-        """Return the number of available cores to the process.
-
-        This is important for systems where the number of available cores is
-        partitioned such as on HPC systems. Linux and Windows can return cpu
-        affinity, which is the number of available cores. MacOS can only return
-        total system cores.
-        """
-        # For macos
-        if platform.system() == 'Darwin':
-            return psutil.cpu_count()
-        # For Linux and Windows
-        else:
-            return len(psutil.Process().cpu_affinity())
-
     @staticmethod
     def _check_backend():
         """Checks for MPI and returns the default backend name"""
@@ -2108,7 +2096,10 @@ def run_button_clicked(widget_simulation_name, log_out, drive_widgets,
         if backend_selection.value == "MPI":
             backend = MPIBackend(
                 n_procs=n_jobs.value,
-                mpi_cmd=mpi_cmd.value)
+                mpi_cmd=mpi_cmd.value,
+                hwthreading=False,
+                oversubscribe=False,
+            )
         else:
             backend = JoblibBackend(n_jobs=n_jobs.value)
             print(f"Using Joblib with {n_jobs.value} core(s).")