diff --git a/joss/paper.bib b/joss/paper.bib
index 3b0c063..a82e64d 100644
--- a/joss/paper.bib
+++ b/joss/paper.bib
@@ -29,7 +29,7 @@ @article{Mortensen:2019
 number = {36},
 pages = {1340},
 author = {Mortensen, Mikael and Dalcin, Lisandro and Keyes, David Elliot},
-title = {mpi4py-fft: Parallel Fast Fourier Transforms with MPI for Python},
+title = {mpi4py-fft: Parallel Fast {F}ourier Transforms with MPI for {P}ython},
 journal = {Journal of Open Source Software}
 }
 
@@ -42,7 +42,7 @@ @article{Morgan:2024
 number = {96},
 pages = {6415},
 author = {Joanna Piper Morgan and Ilham Variansyah and Samuel L. Pasmann and Kayla B. Clements and Braxton Cuneo and Alexander Mote and Charles Goodman and Caleb Shaw and Jordan Northrop and Rohan Pankaj and Ethan Lame and Benjamin Whewell and Ryan G. McClarren and Todd S. Palmer and Lizhong Chen and Dmitriy Y. Anistratov and C. T. Kelley and Camille J. Palmer and Kyle E. Niemeyer},
-title = {Monte Carlo / Dynamic Code (MC/DC): An accelerated Python package for fully transient neutron transport and rapid methods development},
+title = {{M}onte {C}arlo / Dynamic Code (MC/DC): An accelerated {P}ython package for fully transient neutron transport and rapid methods development},
 journal = {Journal of Open Source Software}
 }
 
diff --git a/joss/paper.md b/joss/paper.md
index 37c289b..557643b 100644
--- a/joss/paper.md
+++ b/joss/paper.md
@@ -39,18 +39,18 @@ in scientific inverse problems can be decomposed into a series of computational
 
 When addressing distributed inverse problems, we identify three distinct families of problems:
 
-- **1. Fully distributed models and data**: Both model and data are split across nodes, with each node processing its own portion of the model and data. This leads to minimal
+1. **Fully distributed models and data**: Both model and data are split across nodes, with each node processing its own portion of the model and data. This leads to minimal
 communication, mainly when performing dot products in the solver or in the regularization terms.
 
-- **2. Distributed data, model available on all nodes**: Data is distributed across nodes, whilst the model is available on all nodes.
+2. **Distributed data, model available on all nodes**: Data is distributed across nodes, whilst the model is available on all nodes.
 Communication happens during the adjoint pass to sum models and in the solver for data vector operations.
 
-- **3. Model and data available on all nodes**: All nodes have identical copies of the data and model. Communication only happens within
+3. **Model and data available on all nodes**: All nodes have identical copies of the data and model. Communication only happens within
 the operator, with no communication in solver needed.
 
 MPI for Python (mpi4py [@Dalcin:2021]) provides Python bindings for the MPI standard, allowing applications to leverage multiple
 processors. Projects like mpi4py-fft [@Mortensen:2019], mcdc [@Morgan:2024], and mpi4jax [@mpi4jax]
-utilize mpi4py to provide distributed computing capabilities. Similarly, PyLops-MPI, which is built on top of PyLops [@Ravasi:2020] leverages mpi4py to solve large-scale problems in a distributed fashion.
+utilize mpi4py to provide distributed computing capabilities. Similarly, PyLops-MPI, which is built on top of PyLops [@Ravasi:2020], leverages mpi4py to solve large-scale problems in a distributed fashion.
-Its intuitive API provide functionalities to scatter and broadcast data and model vector across nodes and allows various mathematical operations (e.g., summation, subtraction, norms)
+Its intuitive API provides functionalities to scatter and broadcast data and model vectors across nodes and allows various mathematical operations (e.g., summation, subtraction, norms)
 to be performed. Additionally, a suite of MPI-powered linear operators and solvers is offered, and its flexible design eases the integration of custom operators and solvers.
 
@@ -59,7 +59,7 @@
 PyLops-MPI is designed to tackle large-scale linear inverse problems that are difficult to solve using a single process
 (due to either extremely high computational cost or memory requirements).
 
-![Software Framework representation of the ``PyLops-MPI`` API.](figs/soft_framework.png)
+![Software framework representation of the ``PyLops-MPI`` API.](figs/soft_framework.png)
 
 Fig. 1 illustrates the main components of the library, emphasizing the relationship between the DistributedArray class,
 stacked operators, and MPI-powered solvers.
@@ -71,8 +71,8 @@ NumPy [@Harris:2020] or CuPy [@cupy] arrays across multiple processes. It also s
 
 ## HStack, VStack, BlockDiag Operators
 
 PyLops facilitates the combinations of multiple linear operators via horizontal, vertical, or diagonal stacking. PyLops-MPI provides
-distributed versions of these operations-e.g., `pylops_mpi.MPIBlockDiag` applies different operators in parallel on separate portions of the model
-and data. `pylops_mpi.MPIVStack` applies multiple operators in parallel to the whole model, with its adjoint applies the adjoint of each individual operator to portions of the data vector and sums the individual output. `pylops_mpi.MPIHStack` is the adjoint of MPIVStack.
+distributed versions of these operations. Examples include `pylops_mpi.MPIBlockDiag`, which applies different operators in parallel on separate portions of the model
+and data, `pylops_mpi.MPIVStack`, which applies multiple operators in parallel to the whole model, while its adjoint applies the adjoint of each individual operator to portions of the data vector and sums the individual outputs, and `pylops_mpi.MPIHStack`, which is the adjoint of `pylops_mpi.MPIVStack`.
 
 ## Halo Exchange
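
Since the hunks above describe the `DistributedArray` and stacked-operator API only in prose, here is a minimal sketch of how the pieces fit together, following the public names in the PyLops-MPI documentation (`DistributedArray`, `Partition`, `MPIBlockDiag`). The diagonal-operator setup and exact argument usage are illustrative assumptions, not text from the paper:

```python
# Minimal sketch (not part of the paper): one diagonal block per MPI rank,
# illustrating family 1 (fully distributed model and data) via MPIBlockDiag.
# Run with e.g.: mpiexec -n 4 python example.py
# Names follow the PyLops-MPI docs; treat exact signatures as assumptions.
import numpy as np
from mpi4py import MPI
import pylops
import pylops_mpi

comm = MPI.COMM_WORLD
n = 10  # size of the model/data portion owned by each rank

# Each rank contributes its own operator to the block-diagonal stack
Dop = pylops.Diagonal(np.full(n, comm.Get_rank() + 1.0))
BDiag = pylops_mpi.MPIBlockDiag(ops=[Dop])

# Model vector scattered across ranks; each rank fills its local portion
x = pylops_mpi.DistributedArray(global_shape=n * comm.Get_size(),
                                partition=pylops_mpi.Partition.SCATTER)
x[:] = np.ones(n)

# Forward and adjoint act block-wise, with no inter-rank communication
y = BDiag @ x
xadj = BDiag.H @ y

if comm.Get_rank() == 0:
    print(xadj.asarray())  # gather the global result and print it once
```

Because each rank only ever touches its own block and its own slice of the distributed vectors, the forward and adjoint passes here exhibit the minimal-communication pattern that family 1 describes; communication would only appear in solver dot products over the `DistributedArray`.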