ENH Support variable substitutions in YAML files #21

Merged: 23 commits, May 23, 2024
e5cf918
SKL Begin support for variable substitution in config YAML
gadorlhiac May 1, 2024
bafd152
MNT Type hints and extraneous prints
gadorlhiac May 1, 2024
2b631ad
ENH Debug break points. Variable substitution minus string formating.
gadorlhiac May 2, 2024
2342c41
ENH Support substituting variables from other Tasks
gadorlhiac May 2, 2024
bebe21e
ENH Substitutions with string formatting. Better environment var support
gadorlhiac May 2, 2024
9295791
TST Config YAML for testing substitutions
gadorlhiac May 2, 2024
0a4b2ec
DOC Begin documentation on LUTE usage, including YAML subs. Modify he…
gadorlhiac May 2, 2024
3d07241
DOC Add more YAML substitution docs
gadorlhiac May 2, 2024
809462a
DOC Hopefully table formatting?
gadorlhiac May 2, 2024
7fa20c6
DOC Edit import in docs
gadorlhiac May 2, 2024
05716b2
BUG Handle case of multiple substitutions in single parameter
gadorlhiac May 2, 2024
3b59475
Merge branch 'dev' into ENH/yaml_substitutions
gadorlhiac May 2, 2024
62a7106
MNT Move import of krtc to within little used function.
gadorlhiac May 3, 2024
d229324
DOC Add more information on utilities, and cloning repo etc.
gadorlhiac May 3, 2024
0f2d04d
DOC Add some information on eLog
gadorlhiac May 3, 2024
8c8c2c8
ENH Add support for substitutions in header. Convert back to numeric …
gadorlhiac May 3, 2024
9a0bcfa
DOC Notice about inability to run Airflow DAGs from the command-line
gadorlhiac May 3, 2024
d7f116e
Merge branch 'dev' into ENH/yaml_substitutions
gadorlhiac May 6, 2024
a62e92b
DOC Extra use case for YAML substitutions
gadorlhiac May 14, 2024
0e9b959
DOC Reminder about YAML substitution order
gadorlhiac May 15, 2024
0e09321
DOC Comment on type casting during YAML substitutions if required by …
gadorlhiac May 16, 2024
6a560a0
MNT Change name from BinaryTask, etc., to ThirdPartyTask
gadorlhiac May 17, 2024
1659b63
Merge branch 'dev' into ENH/yaml_substitutions
gadorlhiac May 17, 2024
25 changes: 25 additions & 0 deletions config/test_var_subs.yaml
@@ -0,0 +1,25 @@
%YAML 1.3
---
title: "Configuration to Test YAML Substitution"
experiment: "TestYAMLSubs"
run: 12
date: "2024/05/01"
lute_version: 0.1
task_timeout: 600
work_dir: "/sdf/scratch/users/d/dorlhiac"
...
---
OtherTask:
  useful_other_var: "USE ME!"

NonExistentTask:
  test_sub: "/path/to/{{ experiment }}/file_r{{ run:04d }}.input"
  test_env_sub: "/path/to/{{ $EXPERIMENT }}/file.input"
  test_nested:
    a: "outfile_{{ run }}_one.out"
    b:
      c: "outfile_{{ run }}_two.out"
      d: "{{ OtherTask.useful_other_var }}"
  test_fmt: "{{ run:04d }}"
  test_env_fmt: "{{ $RUN:04d }}"
...
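
The test configuration above exercises four forms of substitution: a header value (`{{ experiment }}`), a header value with a format specifier (`{{ run:04d }}`), an environment variable (`{{ $EXPERIMENT }}`), and a parameter taken from another `Task` (`{{ OtherTask.useful_other_var }}`). The following is a minimal sketch of how such a resolver could work. It is not the implementation from this PR: the function names (`substitute`, `substitute_tree`), the regular expression, and the decision to cast numeric-looking strings before applying a format specifier are all assumptions made for illustration.

```python
import os
import re
from typing import Any, Mapping

# Matches "{{ name }}", "{{ name:spec }}", "{{ $ENV_VAR }}" and "{{ Task.param }}".
# (Assumed syntax, inferred from the test YAML; not taken from the LUTE source.)
_SUBSTITUTION = re.compile(r"\{\{\s*(\$?[\w.]+)\s*(?::\s*([^}\s]+))?\s*\}\}")


def _to_number(value: Any) -> Any:
    """Cast numeric-looking strings so that specs such as ``04d`` can be applied."""
    if isinstance(value, str):
        for cast in (int, float):
            try:
                return cast(value)
            except ValueError:
                continue
    return value


def _lookup(key: str, header: Mapping[str, Any], tasks: Mapping[str, Any],
            env: Mapping[str, str]) -> Any:
    """Resolve one key from the environment, another Task, or the header."""
    if key.startswith("$"):
        return env[key[1:]]
    if "." in key:
        node: Any = tasks
        for part in key.split("."):
            node = node[part]
        return node
    return header[key]


def substitute(text: str, header: Mapping[str, Any], tasks: Mapping[str, Any],
               env: Mapping[str, str] = os.environ) -> str:
    """Replace every ``{{ ... }}`` marker in ``text``; several may appear per string."""

    def _replace(match: re.Match) -> str:
        value = _lookup(match.group(1), header, tasks, env)
        spec = match.group(2)
        if spec:
            return format(_to_number(value), spec)
        return str(value)

    return _SUBSTITUTION.sub(_replace, text)


def substitute_tree(node: Any, header: Mapping[str, Any], tasks: Mapping[str, Any],
                    env: Mapping[str, str] = os.environ) -> Any:
    """Walk nested dictionaries and lists, substituting inside every string."""
    if isinstance(node, dict):
        return {k: substitute_tree(v, header, tasks, env) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute_tree(v, header, tasks, env) for v in node]
    if isinstance(node, str):
        return substitute(node, header, tasks, env)
    return node


if __name__ == "__main__":
    header = {"experiment": "TestYAMLSubs", "run": 12}
    tasks = {"OtherTask": {"useful_other_var": "USE ME!"}}
    section = {
        "test_sub": "/path/to/{{ experiment }}/file_r{{ run:04d }}.input",
        "test_nested": {"b": {"d": "{{ OtherTask.useful_other_var }}"}},
    }
    print(substitute_tree(section, header, tasks))
```

Under this sketch, `test_sub` resolves to `/path/to/TestYAMLSubs/file_r0012.input` for the header values in the test file. Environment variables arrive as strings, so the sketch casts them to a number before applying a specifier such as `04d`; whether LUTE does the same is not shown in this diff.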
26 changes: 13 additions & 13 deletions docs/tutorial/new_task.md
@@ -22,7 +22,7 @@ A brief overview of parameters objects will be provided below. The following inf
**`Task`s and `TaskParameter`s**

All `Task`s have a corresponding `TaskParameters` object. These objects are linked **exclusively** by a named relationship. For a `Task` named `MyThirdPartyTask`, the parameters object **must** be named `MyThirdPartyTaskParameters`. For third-party `Task`s there are a number of additional requirements:
-- The model must inherit from a base class called `BaseBinaryParameters`.
+- The model must inherit from a base class called `ThirdPartyParameters`.
- The model must have one field specified called `executable`. The presence of this field indicates that the `Task` is a third-party `Task` and the specified executable must be called. This allows all third-party `Task`s to be defined exclusively by their parameters model. A single `ThirdPartyTask` class handles execution of **all** third-party `Task`s.

All models are stored in `lute/io/models`. For any given `Task`, a new model can be added to an existing module contained in this directory or to a new module. If creating a new module, make sure to add an import statement to `lute.io.models.__init__`.
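
The naming rule and the `executable` convention described above can be sketched as follows. This is an illustrative stand-in that uses plain classes rather than pydantic models; the helper names (`parameters_class_name`, `find_parameters_model`, `is_third_party`), the `registry` mapping, and the executable path are hypothetical and are not part of LUTE's API.

```python
from typing import Mapping, Type, get_type_hints


class ThirdPartyParameters:
    """Stand-in for the real base model (a pydantic model in LUTE)."""


class MyThirdPartyTaskParameters(ThirdPartyParameters):
    # Hypothetical path; the presence of this field marks a third-party Task.
    executable: str = "/path/to/binary"


def parameters_class_name(task_name: str) -> str:
    """Apply the naming rule: Task name plus the suffix 'Parameters'."""
    return f"{task_name}Parameters"


def find_parameters_model(task_name: str, registry: Mapping[str, Type]) -> Type:
    """Look up the model for a Task purely by name, as the text describes."""
    name = parameters_class_name(task_name)
    if name not in registry:
        raise ValueError(f"No parameters model named {name!r} for Task {task_name!r}")
    return registry[name]


def is_third_party(model: Type) -> bool:
    """A model with an 'executable' field describes a third-party Task."""
    return "executable" in get_type_hints(model)


if __name__ == "__main__":
    registry = {"MyThirdPartyTaskParameters": MyThirdPartyTaskParameters}
    model = find_parameters_model("MyThirdPartyTask", registry)
    print(model.__name__, is_third_party(model))
```

Because the link is by name only, a misspelled class name would surface as a failed lookup rather than an import error, which is why the sketch raises a descriptive `ValueError`.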
@@ -39,13 +39,13 @@ from pydantic import Field, validator
# Also include any pydantic type specifications - Pydantic has many custom
# validation types already, e.g. types for constrained numberic values, URL handling, etc.

-from .base import BaseBinaryParameters
+from .base import ThirdPartyParameters

 # Change class name as necessary
-class RunTaskParameters(BaseBinaryParameters):
+class RunTaskParameters(ThirdPartyParameters):
     """Parameters for RunTask..."""

-    class Config(BaseBinaryParameters.Config): # MUST be exactly as written here.
+    class Config(ThirdPartyParameters.Config): # MUST be exactly as written here.
         ...
         # Model-wide configuration will go here

@@ -83,10 +83,10 @@ As an example, we can again consider defining a model for a `RunTask` `Task`. Co

A model specification for this `Task` may look like:
```py
-class RunTaskParameters(BaseBinaryParameters):
+class RunTaskParameters(ThirdPartyParameters):
     """Parameters for the runtask binary."""

-    class Config(BaseBinaryParameters.Config):
+    class Config(ThirdPartyParameters.Config):
         long_flags_use_eq: bool = True # For the --method parameter

     # Prefer using full/absolute paths where possible.
@@ -144,7 +144,7 @@ For example, consider the `method_param1` field defined above for `RunTask`. We

```py
 from pydantic import Field, validator, ValidationError
-class RunTaskParameters(BaseBinaryParameters):
+class RunTaskParameters(ThirdPartyParameters):
     """Parameters for the runtask binary."""

     # [...]
@@ -205,10 +205,10 @@ Parameters used to run a `Task` are recorded in a database for every `Task`. It
```py
 from pydantic import Field, validator

-from .base import BaseBinaryParameters
+from .base import ThirdPartyParameters
 from ..db import read_latest_db_entry

-class RunTask2Parameters(BaseBinaryParameters):
+class RunTask2Parameters(ThirdPartyParameters):
     input: str = Field("", description="Input file.", flag_type="--")

     @validator("input")
@@ -241,8 +241,8 @@ After a pydantic model has been created, the next required step is to define a *
As mentioned, for most cases you can setup a third-party `Task` to use the first type of `Executor`. If, however, your third-party `Task` uses MPI, you can use either. When using the standard `Executor` for a `Task` requiring MPI, the `executable` in the pydantic model must be set to `mpirun`. For example, a third-party `Task` model, that uses MPI but can be run with the `Executor` may look like the following. We assume this `Task` runs a Python script using MPI.

```py
-class RunMPITaskParameters(BaseBinaryParameters):
-    class Config(BaseBinaryParameters.Config):
+class RunMPITaskParameters(ThirdPartyParameters):
+    class Config(ThirdPartyParameters.Config):
         ...

     executable: str = Field("mpirun", description="MPI executable")
@@ -297,14 +297,14 @@ LUTE provides two additional base models which are used for template parsing in
- `TemplateParameters` objects which hold parameters which will be used to render a portion of a template.
- `TemplateConfig` objects which hold two strings: the name of the template file to use and the full path (including filename) of where to output the rendered result.

-`Task` models which inherit from the `BaseBinaryParameters` model, as all third-party `Task`s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as `TemplateParameters` objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such the **only** requirement on the Python-side when adding template rendering functionality to the `Task` is the addition of one parameter - an instance of `TemplateConfig`. The instance **MUST** be called `lute_template_cfg`.
+`Task` models which inherit from the `ThirdPartyParameters` model, as all third-party `Task`s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as `TemplateParameters` objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such the **only** requirement on the Python-side when adding template rendering functionality to the `Task` is the addition of one parameter - an instance of `TemplateConfig`. The instance **MUST** be called `lute_template_cfg`.
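
The split between declared fields and extra arguments described above can be illustrated with a small standard-library sketch. It only shows the idea of separating parameters the model declares from those that would be handed to template rendering. The `TemplateConfig` field names (`template_name`, `output_path`), the helper `split_extra_arguments`, and the example values are assumptions, and LUTE's actual handling is done through pydantic rather than this function.

```python
from dataclasses import dataclass
from typing import Any, Dict, Set, Tuple


@dataclass
class TemplateConfig:
    """Stand-in holding the two strings described above (field names assumed)."""

    template_name: str  # Name of the template file to use
    output_path: str  # Full path, including filename, for the rendered result


def split_extra_arguments(
    task_config: Dict[str, Any], declared_fields: Set[str]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate declared model fields from extra (template) arguments."""
    declared = {k: v for k, v in task_config.items() if k in declared_fields}
    extra = {k: v for k, v in task_config.items() if k not in declared_fields}
    return declared, extra


if __name__ == "__main__":
    # Hypothetical Task section as it might be read from the configuration YAML.
    task_config = {
        "input": "data.in",
        "lute_template_cfg": TemplateConfig("template.json", "/path/to/rendered.json"),
        "threshold": 5,  # Not declared on the model, so treated as a template value
    }
    declared, extra = split_extra_arguments(task_config, {"input", "lute_template_cfg"})
    print(sorted(declared), extra)
```

In this sketch only `lute_template_cfg` needs to be declared for rendering to be configured; every undeclared key falls through to the template side.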

```py
 from pydantic import Field, validator

 from .base import TemplateConfig

-class RunTaskParamaters(BaseBinaryParameters):
+class RunTaskParamaters(ThirdPartyParameters):
     ...
     # This parameter MUST be called lute_template_cfg!
     lute_template_cfg: TemplateConfig = Field(