ENH Support variable substitutions in YAML files #21

Merged
merged 23 commits into from
May 23, 2024

Conversation

gadorlhiac
Collaborator

@gadorlhiac gadorlhiac commented May 1, 2024

Description

This PR lets users specify input parameters in terms of other parameters (or environment variables) directly in the configuration YAML file. It extends the existing use of Pydantic validators, which define default values in terms of other parameters, to arbitrary combinations specified at run time.

Substitution uses a minimal subset of the Jinja syntax.

  • To use the value of another parameter specified in the YAML file: {{ other_param }}
  • To use an environment variable: {{ $ENV_VAR }}

An example use case would be to define paths in terms of experiment or run number:

MyTask:
  input_file: /path/to/my/{{ experiment }}/{{ run }}/file.inp
  # Or in terms of environment variables
  other_file: /path/to/my/{{ $EXPERIMENT }}/{{ $RUN_NUM }}/other_file.inp
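
For readers who want a feel for the mechanics, the two substitution forms above can be sketched in a few lines of Python. This is a minimal illustration, not the code in this PR: the function name `substitute`, the regular expression, and the choice to leave unknown names untouched are all assumptions made for the example.

```python
import os
import re

# Matches "{{ name }}" or "{{ $NAME }}"; whitespace inside the braces is optional.
_PATTERN = re.compile(r"\{\{\s*(\$?)([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def substitute(text, params, env=None):
    """Replace {{ name }} with params[name] and {{ $NAME }} with env[NAME]."""
    env = os.environ if env is None else env

    def _replace(match):
        is_env = match.group(1) == "$"
        name = match.group(2)
        source = env if is_env else params
        if name not in source:
            # Assumption: unknown names are left as-is instead of raising.
            return match.group(0)
        return str(source[name])

    return _PATTERN.sub(_replace, text)


params = {"experiment": "TestYAMLSubs", "run": 12}
print(substitute("/path/to/my/{{ experiment }}/{{ run }}/file.inp", params))
# prints: /path/to/my/TestYAMLSubs/12/file.inp
```

Passing `env` explicitly, rather than always reading `os.environ`, keeps the sketch easy to test.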

This PR also adds a unified mechanism for early program exits controlled by environment variables. These are intended for debugging: setting an environment variable makes the program exit (and eventually, pause) at specific markers in the code.
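
As an illustration of the idea, a debug exit marker of this kind might look like the sketch below. The helper name `debug_exit`, the `LUTE_DEBUG_EXIT_AT_<marker>` naming pattern (generalized from the `LUTE_DEBUG_EXIT_AT_YAML` variable used in the Testing section), and the rule that an empty value or `0` disables the exit are assumptions for this example, not the implementation merged here.

```python
import inspect
import os
import sys


def debug_exit(marker, env=None):
    """Exit early if LUTE_DEBUG_EXIT_AT_<marker> is set to something other than 0.

    Hypothetical helper: the variable naming scheme and the truthiness rule
    are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    flag = env.get(f"LUTE_DEBUG_EXIT_AT_{marker}", "0")
    if flag in ("", "0"):
        return  # Marker disabled: continue normal execution.
    # Report where the marker was placed, using the caller's frame.
    caller = inspect.stack()[1]
    print(
        f"LUTE_DEBUG_EXIT - {caller.filename}, line: {caller.lineno}",
        file=sys.stderr,
    )
    sys.exit(0)


# With no variable set, the marker is a no-op and execution continues.
debug_exit("YAML", env={})
```

Placing a call such as `debug_exit("YAML")` right after the substitution step would give the behaviour described in the Testing section: the program stops there only when the matching variable is set.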

Checklist

  • Variable substitutions during YAML parsing
    • Substitute other variables specified in the YAML file
    • Substitute environment variables
    • Support limited string formatting
      • E.g. {{ $MY_VAR:04d }}
  • Documentation updates
  • Test examples
  • Environment variables to control early exit (eventually pause) at debug markers in the program.
  • Move krtc import into function - allows running in more standard environments.
  • Change name of BaseBinaryParameters and BinaryTask to ThirdPartyParameters and ThirdPartyTask
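
The "limited string formatting" item in the checklist can be pictured as splitting the expression at the first colon and handing the remainder to Python's built-in `format`. The sketch below is an assumption about how this could work, not the PR's code; `split_expr` and `apply_format` are hypothetical names. Environment variables are always strings, so the sketch converts them before applying a numeric format code.

```python
_INT_CODES = set("bcdoxX")
_FLOAT_CODES = set("eEfFgG%")


def split_expr(expr):
    """Split 'name:spec' into (name, spec); spec is '' when no colon is present."""
    name, _, spec = expr.strip().partition(":")
    return name.strip(), spec.strip()


def apply_format(value, spec):
    """Format value with a Python format spec such as '04d'."""
    if not spec:
        return str(value)
    if isinstance(value, str):
        # Strings (e.g. environment variables) need converting before a
        # numeric format code can be applied; int() or float() raises
        # ValueError if the text is not numeric.
        if spec[-1] in _INT_CODES:
            value = int(value)
        elif spec[-1] in _FLOAT_CODES:
            value = float(value)
    return format(value, spec)


print(split_expr(" $MY_VAR:04d "))    # ('$MY_VAR', '04d')
print(apply_format(12, "04d"))        # 0012
print(apply_format("54321", "04d"))   # 54321
```

Note that `04d` sets a minimum width only, so a five-digit value is not truncated.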

PR Type:

  • New feature/Enhancement

Address issues:

  • NA

Testing

A debug exit point has been placed immediately after variable substitution. A test YAML file is also provided, so the substituted output can be inspected by setting a few environment variables.
The input YAML is as follows:

%YAML 1.3
---
title: "Configuration to Test YAML Substitution"
experiment: "TestYAMLSubs"
run: 12
date: "2024/05/01"
lute_version: 0.1
task_timeout: 600
work_dir: "/sdf/scratch/users/d/dorlhiac"
...
---
OtherTask:
  useful_other_var: "USE ME!"

NonExistentTask:
  test_sub: "/path/to/{{ experiment }}/file_r{{ run:04d }}.input"
  test_env_sub: "/path/to/{{ $EXPERIMENT }}/file.input"
  test_nested:
    a: "outfile_{{ run }}_one.out"
    b:
      c: "outfile_{{ run }}_two.out"
      d: "{{ OtherTask.useful_other_var }}"
  test_fmt: "{{ run:04d }}"
  test_env_fmt: "{{ $RUN:04d }}"
...

The following command can be run to explore the output:

# Run the line below
> RUN=54321 EXPERIMENT=ENVEXPSUB LUTE_DEBUG_EXIT_AT_YAML=1 python -B run_task.py -t Tester -c config/test_var_subs.yaml
DEBUG:lute.execution.ipc:SocketCommunicator defines socket_path: /tmp/lute_1b478283a028457b9637518759139791.sock
INFO:lute.execution.executor:Cannot source environment from /sdf/group/lcls/ds/tools/ccp4-8.0/bin/ccp4.setup-sh!
INFO:lute.execution.executor:Cannot source environment from /sdf/group/lcls/ds/tools/ccp4-8.0/bin/ccp4.setup-sh!
DEBUG:lute.execution.executor:Absolute path to subprocess_task.py not found.
DEBUG:lute.execution.ipc:PipeCommunicator (Executor) - Set _use_pickle=False
INFO:lute.execution.executor:LUTE_DEBUG_EXIT - lute/lute/io/config.py, line: 186
{'NonExistentTask': {'test_env_fmt': '54321',
                     'test_env_sub': '/path/to/ENVEXPSUB/file.input',
                     'test_fmt': '0012',
                     'test_nested': {'a': 'outfile_12_one.out',
                                     'b': {'c': 'outfile_12_two.out',
                                           'd': 'USE ME!'}},
                     'test_sub': '/path/to/TestYAMLSubs/file_r0012.input'},
 'OtherTask': {'useful_other_var': 'USE ME!'}}

INFO:lute.io.db:Unable to access TaskParameters object. Likely wasn't created. Cannot store result.
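
The mapping from the test YAML to the printed dictionary can be reproduced with a short recursive sketch. It is an illustration under assumptions, not the code in `lute/io/config.py`: the function names are hypothetical, the two YAML documents are written out as plain dictionaries (`header` and `tasks`) rather than parsed, dotted names such as `OtherTask.useful_other_var` are resolved by walking nested keys, and only a single substitution pass is made.

```python
import os
import re

_PATTERN = re.compile(r"\{\{\s*(.+?)\s*\}\}")


def _lookup(name, header, tasks):
    """Resolve a plain or dotted name, e.g. 'run' or 'OtherTask.useful_other_var'."""
    node = {**header, **tasks}
    for key in name.split("."):
        node = node[key]
    return node


def _resolve(text, header, tasks, env):
    def _replace(match):
        name, _, spec = match.group(1).partition(":")
        name, spec = name.strip(), spec.strip()
        if name.startswith("$"):
            value = env[name[1:]]
            if spec and spec[-1] in "bdoxX":
                value = int(value)  # Environment variables arrive as strings.
        else:
            value = _lookup(name, header, tasks)
        return format(value, spec)

    return _PATTERN.sub(_replace, text)


def substitute_all(node, header, tasks, env=None):
    """Walk nested dictionaries and substitute every string value."""
    env = os.environ if env is None else env
    if isinstance(node, dict):
        return {k: substitute_all(v, header, tasks, env) for k, v in node.items()}
    if isinstance(node, str):
        return _resolve(node, header, tasks, env)
    return node


header = {"experiment": "TestYAMLSubs", "run": 12}
tasks = {
    "OtherTask": {"useful_other_var": "USE ME!"},
    "NonExistentTask": {
        "test_sub": "/path/to/{{ experiment }}/file_r{{ run:04d }}.input",
        "test_env_sub": "/path/to/{{ $EXPERIMENT }}/file.input",
        "test_nested": {
            "a": "outfile_{{ run }}_one.out",
            "b": {
                "c": "outfile_{{ run }}_two.out",
                "d": "{{ OtherTask.useful_other_var }}",
            },
        },
        "test_fmt": "{{ run:04d }}",
        "test_env_fmt": "{{ $RUN:04d }}",
    },
}
env = {"RUN": "54321", "EXPERIMENT": "ENVEXPSUB"}
result = substitute_all(tasks, header, tasks, env)
print(result["NonExistentTask"]["test_sub"])
# prints: /path/to/TestYAMLSubs/file_r0012.input
```

With these inputs the sketch yields the same values as the dictionary printed in the log above, including `'test_env_fmt': '54321'`, where the five-digit value exceeds the four-character minimum width.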


@gadorlhiac gadorlhiac requested a review from valmar May 3, 2024 19:41
@gadorlhiac gadorlhiac marked this pull request as ready for review May 3, 2024 19:41
Contributor

@valmar valmar left a comment

Looks good!

@valmar valmar merged commit 6b54dde into slac-lcls:dev May 23, 2024
1 check passed