Merge pull request #6 from LSSTDESC/user/aimalz/renaming

naming consistency/clarity within src/rail/estimation
LSSTDESC · Jul 14, 2023 · 7e40056
2 parents 60f1a68 + ed30d3f
commit 7e40056
Showing 6 changed files with 21 additions and 21 deletions.
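
For downstream code that still uses the old names, the renames in this merge reduce to three identifier substitutions. The following is a minimal sketch, not part of the commit: the helper `migrate_source` and the `RENAMES` table are hypothetical names, and the mapping is read off the changed lines in the diff. It matches on word boundaries so that the package name `rail_gpz_v1` and the pip name `pz-rail-gpz-v1`, which this merge does not change, are left alone.

```python
import re

# Old -> new identifiers, read off the changed lines of this merge.
# (Hypothetical helper table; not part of the commit itself.)
RENAMES = {
    "rail.estimation.algos.gpz_v1": "rail.estimation.algos.gpz",
    "Inform_GPz_v1": "GPzInformer",
    "GPz_v1": "GPzEstimator",
}

# Longest keys first so the module path is tried before the shorter names.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(RENAMES, key=len, reverse=True))
    + r")\b"
)


def migrate_source(text: str) -> str:
    """Rewrite old GPz identifiers in `text` to their new names."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


old_import = "from rail.estimation.algos.gpz_v1 import Inform_GPz_v1, GPz_v1"
print(migrate_source(old_import))
# from rail.estimation.algos.gpz import GPzInformer, GPzEstimator
```

A blind rewrite like this also touches string literals, so a label such as the `"GPz_v1"` argument that the test still passes to `one_algo` would need to be excluded by hand.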
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -37,7 +37,7 @@ jobs:
         if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
     - name: Test with pytest
       run: |
-        python -m pytest --cov=rail.estimation.algos.gpz_v1 --cov-report=xml
+        python -m pytest --cov=rail.estimation.algos.gpz --cov-report=xml
     - name: Upload coverage to Codecov
       uses: codecov/codecov-action@v1
       with:

diff --git a/README.md b/README.md
@@ -12,7 +12,7 @@ If subsequent versions of GPz with either improved features or faster performanc
 
 Any use of `rail_gpz_v1` in a paper or report should cite [Almosallam et al. 2016](https://ui.adsabs.harvard.edu/abs/2016MNRAS.462..726A/abstract)
 
-There are several free parameters that can be set via the `config_params` in `Inform_GPz_v1` that will be described in brief below, See [Almosallam et al. 2016](https://ui.adsabs.harvard.edu/abs/2016MNRAS.462..726A/abstract) for more details on the parameters, their meanings, and their effects :<br>
+There are several free parameters that can be set via the `config_params` in `GPzInformer` that will be described in brief below, See [Almosallam et al. 2016](https://ui.adsabs.harvard.edu/abs/2016MNRAS.462..726A/abstract) for more details on the parameters, their meanings, and their effects :<br>
 `gpz_method` (str): this parameter takes a str argument that sets how the length scale and covariance of the radial basis functions behave in the Gaussian process.  Valid options are `GL`, `VL`, `GD`, `VD`, `GC`, and `VC`, and give the following behavior:<br>
 - `GL`: "global length scale", all basis functions share a single length scale.<br>
 - `VL`: "variable length scale", each basis function has its own length scale.<br>

diff --git a/added_examples/GPz_estimation_example.ipynb b/added_examples/GPz_estimation_example.ipynb
@@ -5,16 +5,16 @@
    "id": "69a40421-e7b3-4a7d-9a97-b70bf6cb8f28",
    "metadata": {},
    "source": [
-    "# GPz_v1 example notebook\n",
+    "# GPzEstimator example notebook\n",
     "\n",
-    "A quick demo of running gpz_v1 on the typical test data.  You should have installed both RAIL and rail_gpz_v1 (we highly recommend that you do this from within a custom conda environment so that all dependencies for package versions are met), either by cloning and installing from github, or with:\n",
+    "A quick demo of running GPz on the typical test data.  You should have installed rail_gpz_v1 (we highly recommend that you do this from within a custom conda environment so that all dependencies for package versions are met), either by cloning and installing from github, or with:\n",
     "```\n",
     "pip install pz-rail-gpz-v1\n",
     "```\n",
     "\n",
-    "As RAIL is a namespace package, installing rail_gpz_v1 will make `Inform_GPz_v1` and `GPz_v1` available, and they can be imported via:<br>\n",
+    "As RAIL is a namespace package, installing rail_gpz_v1 will make `GPzInformer` and `GPzEstimator` available, and they can be imported via:<br>\n",
     "```\n",
-    "from rail.estimation.algos.gpz_v1 import Inform_GPz_v1, GPz_v1\n",
+    "from rail.estimation.algos.gpz import GPzInformer, GPzEstimator\n",
     "```\n",
     "\n",
     "Let's start with all of our necessary imports:"
@@ -34,7 +34,7 @@
     "import qp\n",
     "from rail.core.data import TableHandle\n",
     "from rail.core.stage import RailStage\n",
-    "from rail.estimation.algos.gpz_v1 import Inform_GPz_v1, GPz_v1"
+    "from rail.estimation.algos.gpz import GPzInformer, GPzEstimator"
    ]
   },
   {
@@ -98,15 +98,15 @@
    "outputs": [],
    "source": [
     "# set up the stage to run our GPZ_training\n",
-    "pz_train = Inform_GPz_v1.make_stage(name=\"GPz_Train\", model=\"GPz_model.pkl\", **gpz_train_dict)"
+    "pz_train = GPzInformer.make_stage(name=\"GPz_Train\", model=\"GPz_model.pkl\", **gpz_train_dict)"
    ]
   },
   {
    "cell_type": "markdown",
    "id": "2cceb899-0acb-448d-8a58-7a61227547a9",
    "metadata": {},
    "source": [
-    "We are now ready to run the stage to create the model.  We will use the training data from `test_dc2_training_9816.hdf5`, which contains 10,225 galaxies drawn from healpix 9816 from the cosmoDC2_v1.1.4 dataset, to train the model.  Note that we read this data in called `train_data` in the DataStore.  Note that we set `trainfrac` to 0.8, so 80% of the data will be used in the \"main\" training, but 20% will be reserved by `Inform_GPz_v1` to determine a SIGMA parameter.  We set `max_iter` to 150, so we will see 150 steps where the stage tries to maximize the likelihood. We run the stage as follows:"
+    "We are now ready to run the stage to create the model.  We will use the training data from `test_dc2_training_9816.hdf5`, which contains 10,225 galaxies drawn from healpix 9816 from the cosmoDC2_v1.1.4 dataset, to train the model.  Note that we read this data in called `train_data` in the DataStore.  Note that we set `trainfrac` to 0.8, so 80% of the data will be used in the \"main\" training, but 20% will be reserved by `GPzInformer` to determine a SIGMA parameter.  We set `max_iter` to 150, so we will see 150 steps where the stage tries to maximize the likelihood. We run the stage as follows:"
    ]
   },
   {
@@ -125,7 +125,7 @@
    "id": "c5c7a409-1919-49c6-a258-4af05fe30e00",
    "metadata": {},
    "source": [
-    "This should have taken about 30 seconds on a typical desktop computer, and you should now see a file called `GPz_model.pkl` in the directory.  This model file is used by the `GPz_v1` stage to determine our redshift PDFs for the test set of galaxies.  Let's set up that stage, again defining a dictionary of variables for the config params:"
+    "This should have taken about 30 seconds on a typical desktop computer, and you should now see a file called `GPz_model.pkl` in the directory.  This model file is used by the `GPzEstimator` stage to determine our redshift PDFs for the test set of galaxies.  Let's set up that stage, again defining a dictionary of variables for the config params:"
    ]
   },
   {
@@ -137,7 +137,7 @@
    "source": [
     "gpz_test_dict = dict(hdf5_groupname=\"photometry\", model=\"GPz_model.pkl\")\n",
     "\n",
-    "gpz_run = GPz_v1.make_stage(name=\"gpz_run\", **gpz_test_dict)"
+    "gpz_run = GPzEstimator.make_stage(name=\"gpz_run\", **gpz_test_dict)"
    ]
   },
   {
@@ -197,7 +197,7 @@
    "id": "093dd6f9-f935-4aa5-9898-1c52b3bef6d1",
    "metadata": {},
    "source": [
-    "GPz_v1 parameterizes each PDF as a single Gaussian, here we see a few examples of Gaussians of different widths.  Now let's grab the mode of each PDF (stored as ancil data in the ensemble) and compare to the true redshifts from the test_data file:"
+    "GPzEstimator parameterizes each PDF as a single Gaussian, here we see a few examples of Gaussians of different widths.  Now let's grab the mode of each PDF (stored as ancil data in the ensemble) and compare to the true redshifts from the test_data file:"
    ]
   },
   {

diff --git a/src/rail/estimation/algos/GPz.py b/src/rail/estimation/algos/_gpz_util.py
diff --git a/src/rail/estimation/algos/gpz_v1.py b/src/rail/estimation/algos/gpz.py
@@ -7,7 +7,7 @@
 from ceci.config import StageParameter as Param
 from rail.core.common_params import SHARED_PARAMS
 from rail.estimation.estimator import CatEstimator, CatInformer
-from .GPz import GP, getOmega
+from ._gpz_util import GP, getOmega
 import qp
 
 
@@ -33,7 +33,7 @@ def _prepare_data(data_dict, bands, err_bands, nondet_val, maglims, logflag):
     return data
 
 
-class Inform_GPz_v1(CatInformer):
+class GPzInformer(CatInformer):
     """Inform stage for GPz_v1
     Parameters
     ----------
@@ -44,7 +44,7 @@ class Inform_GPz_v1(CatInformer):
       model file containing the trained GPz model to be used in estimate
       stage
     """
-    name = "Inform_GPz_v1"
+    name = "GPzInformer"
     config_options = CatInformer.config_options.copy()
     config_options.update(nondetect_val=SHARED_PARAMS,
                           mag_limits=SHARED_PARAMS,
@@ -115,10 +115,10 @@ def run(self):
         self.add_data('model', self.model)
 
 
-class GPz_v1(CatEstimator):
-    """GPz_v1 estimator
+class GPzEstimator(CatEstimator):
+    """ Estimate stage for GPz_v1
     """
-    name = "GPz_v1"
+    name = "GPzEstimator"
     config_options = CatEstimator.config_options.copy()
     config_options.update(zmin=SHARED_PARAMS,
                           zmax=SHARED_PARAMS,

diff --git a/tests/test_gpz.py b/tests/test_gpz.py
@@ -3,7 +3,7 @@
 from rail.core.stage import RailStage
 from rail.core.algo_utils import one_algo
 from rail.core.utils import RAILDIR
-from rail.estimation.algos.gpz_v1 import Inform_GPz_v1, GPz_v1
+from rail.estimation.algos.gpz import GPzInformer, GPzEstimator
 import scipy.special
 sci_ver_str = scipy.__version__.split(".")
 
@@ -18,8 +18,8 @@ def test_gpz_v1():
     train_config_dict = {"hdf5_groupname": "photometry", "max_iter": 30, "max_attempt": 25,
                          "model": "gpz_default.pkl"}
     estim_config_dict = {"hdf5_groupname": "photometry", "model": "gpz_default.pkl"}
-    train_algo = Inform_GPz_v1
-    pz_algo = GPz_v1
+    train_algo = GPzInformer
+    pz_algo = GPzEstimator
     zb_expected = np.array([0.12, 0.13, 0.12, 0.14, 0.07, 0.13, 0.14, 0.13,
                             0.06, 0.12])
     results, rerun_results, _ = one_algo("GPz_v1", train_algo, pz_algo,