Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new wrapper to create all of the zcatalogs with a single command #2410

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

akremin
Copy link
Member

@akremin akremin commented Nov 8, 2024

Overview

This addresses #2360.

This creates a new script, currently called desi_zcatalog_wrapper, that will use the tiles-SPECPROD.fits file in a production to identify which SURVEY+PROGRAM pairs exist and produce zcatalogs for each of them. If all are created successfully, then the zall-* catalog is created with all of the individual zcatalogs.

Required arguments are:
--groupto specify whether you're running healpix redshifts or cumulative redshifts. It should also work for other tile-based flavors but I have not tested that.
--cat-version OR --outdir to specify where to save the zcatalogs. If --cat-version is specified then it is saved under $DESI_SPECTRO_REDUX/$SPECPROD/zcatalog/<cat_version>/.

The user can specify --indir, but it is optional and defaults to the typical places for a production.

All of the boolean flags accepted by desi_zcatalog are also accepted by this script, which propagates them to desi_zcatalog. Though I did not thoroughly vet my implementation of these.

This PR also includes updates to desispec.io.meta.findfile to include the ztile, zpix, and zall files. That required adding a new function argument version that is only used for these redshift catalogs at the moment but could be used for other things in the future. Because there is ongoing work on the zcatalogs v2, I did not edit the desi_zcatalog code itself to use findfile, but that should be updated to use findfile at some point during the v2 refactor.

Tests

Recommended usage, which uses only --cat-version and --group:

desi_zcatalog_wrapper -g healpix --cat-version=v1 --nproc=2 -v
desi_zcatalog_wrapper -g cumulative --cat-version=v1 --nproc=2 -v

Alternate method that uses --outdir

desi_zcatalog_wrapper -g healpix --outdir=/global/cfs/cdirs/desi/users/kremin/PRs/zcat_maker/test --nproc=2
desi_zcatalog_wrapper -g cumulative --outdir=/global/cfs/cdirs/desi/users/kremin/PRs/zcat_maker/test --nproc=2

Alternate method that uses cat-version for output but specify input

desi_zcatalog_wrapper -g cumulative --cat-version=v2 --nproc=2 -i /global/cfs/cdirs/desi/spectro/redux/loa/tiles/cumulative
desi_zcatalog_wrapper -g healpix --cat-version=v2 --nproc=2 -i /global/cfs/cdirs/desi/spectro/redux/loa/healpix

Alternate method that uses --indir and --outdir

desi_zcatalog_wrapper -g healpix -i /global/cfs/cdirs/desi/spectro/redux/loa/healpix -outdir=/global/cfs/cdirs/desi/users/kremin/PRs/zcat_maker/test2 --nproc=2
desi_zcatalog_wrapper -g cumulative -i /global/cfs/cdirs/desi/spectro/redux/loa/tiles/cumulative --outdir=/global/cfs/cdirs/desi/users/kremin/PRs/zcat_maker/test2 --nproc=2

These all work. I also linked the zcatalogs from loa and ran again, and the zall generation appears to work as expected.

@coveralls
Copy link

coveralls commented Nov 8, 2024

Coverage Status

coverage: 30.078% (-0.07%) from 30.145%
when pulling 5e3f3cb on zcat_maker
into 08d4acd on main.

@akremin
Copy link
Member Author

akremin commented Nov 8, 2024

The doc generation passes on my checkout at NERSC and there is nothing informative I have been able to glean from the GitHub run logs.

Trivial wrapper around create_summary_catalog that takes a dict
and passes the key-value pairs to create_summary_catalog as keyword arguments
"""
return create_summary_catalog(**kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that this function exists only so that desispec.scripts.zcatalog_wrap can call it with runcmd(create_summary_catalog_wrapper, args=kwargs, ...) because runcmd only supports a list of args and not a dict of kwargs. Ideally runcmd would also support kwargs, but that is admittedly kind of messy given everything else that it is doing (function vs. script, MPI vs. not...). Lacking that, I suggest moving this function into desispec/scripts/zcatalog_wrap.py as a _hidden function for this one use-case and not making it part of the desispec.zcatalog namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't agree more about your runcmd comments. That was what I initially wanted to do before realizing it didn't accept kwargs. That change would be subtle and feature creep on this PR, so I've taken the second suggestion of making it a _hidden function in zcatalog_wrap.py.

py/desispec/scripts/zcatalog_wrap.py Outdated Show resolved Hide resolved
## but since we are only using findfile for the filename and not the
## directory tree, the version isn't actually relevant
if args.cat_version is None:
args.cat_version = os.path.basename(args.outdir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default --cat-version logic is a big subtle, turning the basename into the catalog version which then propagates into the output filenames. Consider

  • making --cat-version a required parameter (a slight hassle but safer, and potentially mitigated by making a -V shortcut, or -v and making --verbose be the long thing without a -v shortcut because it is afterall, verbose, like this comment...); or
  • logging a warning message about how cat_version is being derived given that it wasn't specified on the command line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a -V shortcut and made it required.

@sbailey
Copy link
Contributor

sbailey commented Nov 19, 2024

Looking at /global/cfs/cdirs/desi/users/kremin/PRs/zcat_maker/test/zcatalog/v1/zpix-cmx-other.fits, it appears that this didn't add units. I made the same mistake when running Loa by hand (thanks to @weaverba137 for noticing; I'll post-facto fix that), but let's get this wrapper updated so that it automatically includes units too.

@akremin
Copy link
Member Author

akremin commented Nov 19, 2024

I had assumed this was fixed downstream since #2364 was closed. The wrapper includes the --add-units option, just as desi_zcatalog does.

Given that the issue isn't resolved, I am happy to fix it in this PR. I can think of two options:

  1. Remove the --add-unit option for both the wrapper and desi_zcatalog and always add units.
  2. Flip the option to be opt-out, i.e. --do-not-add-units for both the wrapper and desi_zcatalog

Also happy to hear other ideas

@weaverba137
Copy link
Member

@akremin, I know this is somewhat deeply buried, but in order for --add-unit to do anything at all, module load desidatamodel must be run before the script starts.

@akremin
Copy link
Member Author

akremin commented Nov 26, 2024

@sbailey I believe all changes have been addressed, including making the addition of units opt-out rather than opt-in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants