Remodel entry points and Command Line Interface to use "Swiss Army knife" approach #517

Open
4 of 13 tasks
ns-rse opened this issue Mar 27, 2023 · 4 comments · Fixed by #540

ns-rse commented Mar 27, 2023

I think it would be useful to revise the Command Line Interface (CLI) entry point to TopoStats. Currently there are two, run_topostats and toposum, but to make this extensible I feel we should adopt what is termed the "Swiss Army knife" approach to Command Line Interfaces.

This is what programmes such as git, pre-commit and many others use. They have a single command for invocation followed by a sub-command. Using pre-commit as an example (as it's written in Python and provides a good pattern to emulate)...

The main pre-commit command has the following help...

❱ pre-commit --help
usage: pre-commit [-h] [-V]
                  {autoupdate,clean,gc,init-templatedir,install,install-hooks,migrate-config,run,sample-config,try-repo,uninstall,validate-config,validate-manifest,help,hook-impl}
                  ...

positional arguments:
  {autoupdate,clean,gc,init-templatedir,install,install-hooks,migrate-config,run,sample-config,try-repo,uninstall,validate-config,validate-manifest,help,hook-impl}
    autoupdate          Auto-update pre-commit config to the latest repos' versions.
    clean               Clean out pre-commit files.
    gc                  Clean unused cached repos.
    init-templatedir    Install hook script in a directory intended for use with `git config init.templateDir`.
    install             Install the pre-commit script.
    install-hooks       Install hook environments for all environments in the config file. You may find `pre-commit install --install-hooks` more useful.
    migrate-config      Migrate list configuration to new map configuration.
    run                 Run hooks.
    sample-config       Produce a sample .pre-commit-config.yaml file
    try-repo            Try the hooks in a repository, useful for developing new hooks.
    uninstall           Uninstall the pre-commit script.
    validate-config     Validate .pre-commit-config.yaml files
    validate-manifest   Validate .pre-commit-hooks.yaml files
    help                Show help for a specific command.

options:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit

That is, the positional arguments are sub-commands. If you want to use run, for example, it has the following options...

❱ pre-commit run --help
usage: pre-commit run [-h] [--color {auto,always,never}] [-c CONFIG] [--verbose] [--all-files | --files [FILES ...]] [--show-diff-on-failure]
                      [--hook-stage {commit,merge-commit,prepare-commit-msg,commit-msg,post-commit,manual,post-checkout,push,post-merge,post-rewrite}]
                      [--remote-branch REMOTE_BRANCH] [--local-branch LOCAL_BRANCH] [--from-ref FROM_REF] [--to-ref TO_REF] [--commit-msg-filename COMMIT_MSG_FILENAME]
                      [--prepare-commit-message-source PREPARE_COMMIT_MESSAGE_SOURCE] [--commit-object-name COMMIT_OBJECT_NAME] [--remote-name REMOTE_NAME]
                      [--remote-url REMOTE_URL] [--checkout-type CHECKOUT_TYPE] [--is-squash-merge IS_SQUASH_MERGE] [--rewrite-command REWRITE_COMMAND]
                      [hook]

positional arguments:
  hook                  A single hook-id to run

options:
  -h, --help            show this help message and exit
  --color {auto,always,never}
                        Whether to use color in output. Defaults to `auto`.
  -c CONFIG, --config CONFIG
                        Path to alternate config file
  --verbose, -v
  --all-files, -a       Run on all the files in the repo.
  --files [FILES ...]   Specific filenames to run hooks on.
  --show-diff-on-failure
                        When hooks fail, run `git diff` directly afterward.
  --hook-stage {commit,merge-commit,prepare-commit-msg,commit-msg,post-commit,manual,post-checkout,push,post-merge,post-rewrite}
                        The stage during which the hook is fired. One of commit, merge-commit, prepare-commit-msg, commit-msg, post-commit, manual, post-checkout, push, post-
                        merge, post-rewrite
  --remote-branch REMOTE_BRANCH
                        Remote branch ref used by `git push`.
  --local-branch LOCAL_BRANCH
                        Local branch ref used by `git push`.
  --from-ref FROM_REF, --source FROM_REF, -s FROM_REF
                        (for usage with `--to-ref`) -- this option represents the original ref in a `from_ref...to_ref` diff expression. For `pre-push` hooks, this represents the
                        branch you are pushing to. For `post-checkout` hooks, this represents the branch that was previously checked out.
  --to-ref TO_REF, --origin TO_REF, -o TO_REF
                        (for usage with `--from-ref`) -- this option represents the destination ref in a `from_ref...to_ref` diff expression. For `pre-push` hooks, this
                        represents the branch being pushed. For `post-checkout` hooks, this represents the branch that is now checked out.
  --commit-msg-filename COMMIT_MSG_FILENAME
                        Filename to check when running during `commit-msg`
  --prepare-commit-message-source PREPARE_COMMIT_MESSAGE_SOURCE
                        Source of the commit message (typically the second argument to .git/hooks/prepare-commit-msg)
  --commit-object-name COMMIT_OBJECT_NAME
                        Commit object name (typically the third argument to .git/hooks/prepare-commit-msg)
  --remote-name REMOTE_NAME
                        Remote name used by `git push`.
  --remote-url REMOTE_URL
                        Remote url used by `git push`.
  --checkout-type CHECKOUT_TYPE
                        Indicates whether the checkout was a branch checkout (changing branches, flag=1) or a file checkout (retrieving a file from the index, flag=0).
  --is-squash-merge IS_SQUASH_MERGE
                        During a post-merge hook, indicates whether the merge was a squash merge
  --rewrite-command REWRITE_COMMAND
                        During a post-rewrite hook, specifies the command that invoked the rewrite

I envisage replacing existing commands with the following (see also table below for further thoughts/details)...

| Current | Proposed |
|---------|----------|
| `run_topostats` | `topostats process` |
| `toposum` | `topostats summarise` |
| `run_topostats --create-config-file` | `topostats config` |

Implementation

Following the example of pre-commit, this would entail introducing a topostats/main.py module to provide an entry point of topostats.main:main.

topostats/main.py then imports the various commands from sub-modules under topostats/commands/*.py, one for each command.

The arguments for each sub-command are defined within main.py, in what may be an Abstract Factory design pattern (not quite sure on this front yet!).
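
A minimal sketch of how such an entry point could look using argparse sub-parsers. This is only an illustration under assumptions: the function bodies are placeholders, the option names (--config-file, --input-csv, --file) are hypothetical, and in the proposed layout each command function would live in its own module under topostats/commands/ rather than in one file.

```python
"""Sketch of a single `topostats` entry point with sub-commands (illustrative only)."""
import argparse


def process(args: argparse.Namespace) -> str:
    """Placeholder standing in for what run_topostats does today."""
    return f"processing with config={args.config_file}"


def summarise(args: argparse.Namespace) -> str:
    """Placeholder standing in for what toposum does today."""
    return f"summarising {args.input_csv}"


def config(args: argparse.Namespace) -> str:
    """Placeholder standing in for run_topostats --create-config-file."""
    return f"writing sample config to {args.file}"


def create_parser() -> argparse.ArgumentParser:
    """Define the top-level parser and one sub-parser per command."""
    parser = argparse.ArgumentParser(prog="topostats")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Option names below are hypothetical, chosen only for this sketch.
    process_parser = subparsers.add_parser("process", help="Run processing.")
    process_parser.add_argument("-c", "--config-file", default=None)
    process_parser.set_defaults(func=process)

    summarise_parser = subparsers.add_parser("summarise", help="Summarise results.")
    summarise_parser.add_argument("--input-csv", default="all_statistics.csv")
    summarise_parser.set_defaults(func=summarise)

    config_parser = subparsers.add_parser("config", help="Write a sample config.")
    config_parser.add_argument("--file", default="sample_config.yaml")
    config_parser.set_defaults(func=config)

    return parser


def main(argv=None) -> str:
    """Parse the command line and dispatch to the selected sub-command."""
    args = create_parser().parse_args(argv)
    return args.func(args)
```

With an entry point of topostats.main:main, an invocation such as `topostats process --config-file my_config.yaml` would dispatch to the process function. Binding each sub-parser to its function with set_defaults(func=...) keeps the dispatch in main() independent of how many commands are added later.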

Additional Changes

In addition all documentation (README.md, docs/usage.md etc.) would also require updating to reflect these changes.

Modules to Add

The groundwork for this has been laid thanks to @SylviaWhittle's work in #540. We now need to modularise the CLI further, with an individual sub-command for each processing step. The steps correspond to the classes, as these are how we delineate the processing stages in the code. Each step saves its results for use by subsequent steps, courtesy of #613, which introduced io.save_topostats_file() to save the current state to an HDF5 file.


ns-rse commented Apr 21, 2023

Documenting possible structure/options

| Command | Option(s) | Description |
|---------|-----------|-------------|
| `config` | `--copy` | Make a straight copy of topostats/default_config.yaml. This will include the field descriptors and satisfy #536. |
| | `--create` | Load topostats/default_config.yaml and update any options with those specified on the command line, e.g. `--output ~/somewhere/else` would update the output value. This would lose the field descriptors requested in #253. |
| | `--plotting-dictionary` | Generate a sample plotting dictionary from topostats/plotting_dictionary.yaml. |
| | `--file` | File to write output to (default would be sample_config.yaml for default_config.yaml variants, or plotting_config.yaml if `--plotting-dictionary` is requested). |
| `process` | `<config_options>` | Run topostats, modifying topostats/default_config.yaml with any specified command line options. |
| `filter` | `<config_options>` | Run just the filtering stage of processing. |
| `grains` | `<config_options>` | Run grain detection on filtered NumPy arrays. |
| `grain_stats` | `<config_options>` | Run grain statistics calculations on grain-detected NumPy arrays. |
| `dnatracing` | `<config_options>` | Run tracing on detected grains. |
| `curvature` | `<config_options>` | Calculate curvature from traced NumPy arrays. |
| `summarise` | `<config_options>` | Run summary plot generation along with specific options. |
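
The idea of loading the default configuration and updating it with command line options could be sketched roughly as below. This is a sketch under assumptions, not the TopoStats implementation: DEFAULT_CONFIG is a hypothetical stand-in for the contents of topostats/default_config.yaml, and the --cores option and the update_config() helper are invented for illustration (only --output is mentioned above).

```python
"""Sketch: override default config values with command line options."""
import argparse
import copy

# Hypothetical stand-in for the dictionary loaded from default_config.yaml.
DEFAULT_CONFIG = {"output": "./output", "cores": 2}


def create_parser() -> argparse.ArgumentParser:
    """Options default to None so 'not given' can be told apart from a value."""
    parser = argparse.ArgumentParser(prog="topostats config")
    parser.add_argument("--output", default=None)
    parser.add_argument("--cores", type=int, default=None)  # hypothetical option
    return parser


def update_config(config: dict, args: argparse.Namespace) -> dict:
    """Return a copy of config with any options given on the command line applied."""
    updated = copy.deepcopy(config)
    for key, value in vars(args).items():
        # Only override keys that exist in the config and were explicitly set.
        if key in updated and value is not None:
            updated[key] = value
    return updated


args = create_parser().parse_args(["--output", "~/somewhere/else"])
new_config = update_config(DEFAULT_CONFIG, args)
```

Defaulting every option to None means unspecified options leave the values from the default configuration untouched. Because the result is a plain dictionary, writing it back out to YAML would drop the comments in the original file, which is the loss of field descriptors noted for --create.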

ns-rse commented May 23, 2023

Remember to ensure basename is derived for grainstats_df as well as tracing_stats_df.


ns-rse commented Nov 30, 2023

Re-opening to undertake further modularisation as per table.

Individual issues have been created to address each step in the processing. This issue will serve as an Epic and be closed when each is completed.

@ns-rse ns-rse reopened this Nov 30, 2023
ns-rse added a commit that referenced this issue Oct 28, 2024
As a step towards #517 this commit removes the legacy entry points and their associated tests.
@ns-rse ns-rse added the v2.3.1 label Nov 26, 2024
ns-rse added a commit that referenced this issue Jan 6, 2025
Closes #1067

Modifies `LoadScan.load_topostats()` to take an argument `extract: str = "all"` so that by default the cleaned
image (post filter) that is stored at `image`, `px_to_nm_scaling` and `data` that are stored in `.topostats` HDF5 are
returned.

To assist with #517, though, it is also possible to specify other data to extract, such as `raw` to get the original
image array and `pixel_to_nm_scaling` should the user want to re-run the `Filter` stage, and `filter` should the user
wish to re-run the grain detection on the cleaned (post-Filter) array.

The user options are mapped to the keys used in the HDF5 structure by means of a dictionary (which is local to the
`.load_topostats()` function) and will be extended as required in subsequent work.

Tests are expanded.
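
The dictionary-based mapping described in the commit message above might look something like the following sketch. It is not the code from that commit: a plain dictionary stands in for the HDF5 file, the key names in EXTRACT_KEYS are hypothetical, and it assumes one reading of the message, namely that `raw` returns both the original image and the scaling.

```python
"""Sketch: map a user-facing `extract` option to stored keys (illustrative only)."""

# Hypothetical mapping from user option to the keys holding the data.
EXTRACT_KEYS = {
    "raw": ["image_original", "pixel_to_nm_scaling"],
    "filter": ["image"],
}


def extract_data(stored: dict, extract: str = "all") -> dict:
    """Return the subset of `stored` selected by `extract`; everything if "all"."""
    if extract == "all":
        return dict(stored)
    try:
        keys = EXTRACT_KEYS[extract]
    except KeyError as err:
        valid = ["all", *EXTRACT_KEYS]
        raise ValueError(f"Unknown extract option '{extract}', expected one of {valid}") from err
    return {key: stored[key] for key in keys}


# A plain dictionary standing in for the contents of a .topostats file.
stored = {
    "image_original": [[0.0, 1.0], [2.0, 3.0]],
    "image": [[0.0, 0.5], [1.0, 1.5]],
    "pixel_to_nm_scaling": 0.5,
}
raw_only = extract_data(stored, extract="raw")
```

Keeping the mapping in one dictionary means a new processing stage only needs a new entry there, which fits the stated intention to extend it in subsequent work.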