Commit

SKL Basic idea for ThirdPartyShellParameters model
gadorlhiac committed May 13, 2024
1 parent 925b597 commit 8a3e8fb
Showing 10 changed files with 88 additions and 49 deletions.
26 changes: 13 additions & 13 deletions docs/tutorial/new_task.md
Expand Up @@ -22,7 +22,7 @@ A brief overview of parameters objects will be provided below. The following inf
**`Task`s and `TaskParameter`s**

All `Task`s have a corresponding `TaskParameters` object. These objects are linked **exclusively** by a named relationship. For a `Task` named `MyThirdPartyTask`, the parameters object **must** be named `MyThirdPartyTaskParameters`. For third-party `Task`s there are a number of additional requirements:
- The model must inherit from a base class called `BaseBinaryParameters`.
- The model must inherit from a base class called `ThirdPartyParameters`.
- The model must have one field specified called `executable`. The presence of this field indicates that the `Task` is a third-party `Task` and the specified executable must be called. This allows all third-party `Task`s to be defined exclusively by their parameters model. A single `ThirdPartyTask` class handles execution of **all** third-party `Task`s.

All models are stored in `lute/io/models`. For any given `Task`, a new model can be added to an existing module contained in this directory or to a new module. If creating a new module, make sure to add an import statement to `lute.io.models.__init__`.
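The naming rule and the `executable` requirement described above can be sketched in a few lines. This is only an illustration under assumptions: `parameters_name_for`, `is_third_party`, and the stand-in classes below are hypothetical and not part of LUTE, and the way LUTE actually resolves a model from a `Task` name may differ.

```python
class ThirdPartyParameters:
    """Stand-in for LUTE's base model (hypothetical, for illustration only)."""


class MyThirdPartyTaskParameters(ThirdPartyParameters):
    """Parameters for a hypothetical `MyThirdPartyTask`."""

    executable: str = "/path/to/my_binary"  # placeholder path, not a real binary


def parameters_name_for(task_name: str) -> str:
    """Return the parameters-model name that the naming rule requires."""
    return f"{task_name}Parameters"


def is_third_party(model: type) -> bool:
    """Treat a model as third-party if it inherits the base and defines `executable`."""
    return issubclass(model, ThirdPartyParameters) and hasattr(model, "executable")


name = parameters_name_for("MyThirdPartyTask")
print(name)
print(is_third_party(MyThirdPartyTaskParameters))
```

The point of the sketch is that nothing besides the name links a `Task` to its model, so a misspelled class name would silently break the pairing.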
Expand All @@ -39,13 +39,13 @@ from pydantic import Field, validator
# Also include any pydantic type specifications - Pydantic has many custom
# validation types already, e.g. types for constrained numeric values, URL handling, etc.

from .base import BaseBinaryParameters
from .base import ThirdPartyParameters

# Change class name as necessary
class RunTaskParameters(BaseBinaryParameters):
class RunTaskParameters(ThirdPartyParameters):
"""Parameters for RunTask..."""

class Config(BaseBinaryParameters.Config): # MUST be exactly as written here.
class Config(ThirdPartyParameters.Config): # MUST be exactly as written here.
...
# Model-wide configuration will go here

Expand Down Expand Up @@ -83,10 +83,10 @@ As an example, we can again consider defining a model for a `RunTask` `Task`. Co

A model specification for this `Task` may look like:
```py
class RunTaskParameters(BaseBinaryParameters):
class RunTaskParameters(ThirdPartyParameters):
"""Parameters for the runtask binary."""

class Config(BaseBinaryParameters.Config):
class Config(ThirdPartyParameters.Config):
long_flags_use_eq: bool = True # For the --method parameter

# Prefer using full/absolute paths where possible.
Expand Down Expand Up @@ -144,7 +144,7 @@ For example, consider the `method_param1` field defined above for `RunTask`. We

```py
from pydantic import Field, validator, ValidationError
class RunTaskParameters(BaseBinaryParameters):
class RunTaskParameters(ThirdPartyParameters):
"""Parameters for the runtask binary."""

# [...]
Expand Down Expand Up @@ -205,10 +205,10 @@ Parameters used to run a `Task` are recorded in a database for every `Task`. It
```py
from pydantic import Field, validator

from .base import BaseBinaryParameters
from .base import ThirdPartyParameters
from ..db import read_latest_db_entry

class RunTask2Parameters(BaseBinaryParameters):
class RunTask2Parameters(ThirdPartyParameters):
input: str = Field("", description="Input file.", flag_type="--")

@validator("input")
Expand Down Expand Up @@ -241,8 +241,8 @@ After a pydantic model has been created, the next required step is to define a *
As mentioned, for most cases you can set up a third-party `Task` to use the first type of `Executor`. If, however, your third-party `Task` uses MPI, you can use either. When using the standard `Executor` for a `Task` requiring MPI, the `executable` in the pydantic model must be set to `mpirun`. For example, a third-party `Task` model that uses MPI but can be run with the `Executor` may look like the following. We assume this `Task` runs a Python script using MPI.

```py
class RunMPITaskParameters(BaseBinaryParameters):
class Config(BaseBinaryParameters.Config):
class RunMPITaskParameters(ThirdPartyParameters):
class Config(ThirdPartyParameters.Config):
...

executable: str = Field("mpirun", description="MPI executable")
Expand Down Expand Up @@ -297,14 +297,14 @@ LUTE provides two additional base models which are used for template parsing in
- `TemplateParameters` objects which hold parameters which will be used to render a portion of a template.
- `TemplateConfig` objects which hold two strings: the name of the template file to use and the full path (including filename) of where to output the rendered result.

`Task` models which inherit from the `BaseBinaryParameters` model, as all third-party `Task`s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as `TemplateParameters` objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such the **only** requirement on the Python-side when adding template rendering functionality to the `Task` is the addition of one parameter - an instance of `TemplateConfig`. The instance **MUST** be called `lute_template_cfg`.
`Task` models which inherit from the `ThirdPartyParameters` model, as all third-party `Task`s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as `TemplateParameters` objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such the **only** requirement on the Python-side when adding template rendering functionality to the `Task` is the addition of one parameter - an instance of `TemplateConfig`. The instance **MUST** be called `lute_template_cfg`.

```py
from pydantic import Field, validator

from .base import TemplateConfig

class RunTaskParamaters(BaseBinaryParameters):
class RunTaskParamaters(ThirdPartyParameters):
...
# This parameter MUST be called lute_template_cfg!
lute_template_cfg: TemplateConfig = Field(
Expand Down
49 changes: 44 additions & 5 deletions lute/io/models/base.py
Expand Up @@ -7,7 +7,7 @@
TaskParameters(BaseSettings): Base class for Task parameters. Subclasses
specify a model of parameters and their types for validation.
BaseBinaryParameters(TaskParameters): Base class for Third-party, binary
ThirdPartyParameters(TaskParameters): Base class for Third-party, binary
executable Tasks.
TemplateParameters: Dataclass to represent parameters of binary
Expand All @@ -22,12 +22,12 @@
"AnalysisHeader",
"TemplateConfig",
"TemplateParameters",
"BaseBinaryParameters",
"ThirdPartyParameters",
]
__author__ = "Gabriel Dorlhiac"

import os
from typing import Dict, Any, Union
from typing import Dict, Any, Union, List

from pydantic import (
BaseModel,
Expand Down Expand Up @@ -139,7 +139,7 @@ class TemplateParameters:
params: Any


class BaseBinaryParameters(TaskParameters):
class ThirdPartyParameters(TaskParameters):
"""Base class for third party task parameters.
Contains special validators for extra arguments and handling of parameters
Expand All @@ -156,14 +156,53 @@ class Config(TaskParameters.Config):
# lute_template_cfg: TemplateConfig

@root_validator(pre=False)
def extra_fields_to_thirdparty(cls, values):
def extra_fields_to_thirdparty(cls, values) -> Dict[str, Any]:
for key in values:
if key not in cls.__fields__:
values[key] = TemplateParameters(values[key])

return values


class ThirdPartyShellParameters(ThirdPartyParameters):
"""Class for ThirdPartyTask's that need to be run within a shell.
This TaskParameters model will convert all parameters such that they can be
called with a command as `bash -c "$EXECUTABLE $PARAM1 $PARAM2 ..."`.
All parameters MUST appear in the order they need to appear in the command.
The actual executable to call must also be provided, however, it must use a
different parameter name. E.g. to execute `my_binary` within the shell,
a `my_binary`
"""

executable: str = Field("/bin/bash", description="Shell to use.", flag_type="")

bash_cmd: str = Field(
"",
description="Command to run in shell. This is populated automatically.",
flag_type="-",
rename_param="c",
)

@root_validator(pre=False)
def validate_bash_command(cls, values: Dict[str, Any]) -> Dict[str, Any]:
formatted_cmd: str = ""
ignored_keys: List[str] = ["lute_config", "executable"]
for key in values:
if key in ignored_keys or isinstance(values[key], TemplateParameters):
continue
# Create a new formatted command with all parameters
formatted_cmd = " ".join((formatted_cmd, values[key]))
# Then set each parameter to None so it gets ignored
values[key] = None

if values["bash_cmd"] == "":
values["bash_cmd"] = formatted_cmd

return values


class TemplateConfig(BaseModel):
"""Parameters used for templating of third party configuration files.
Expand Down
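The command assembly described in the `ThirdPartyShellParameters` docstring (`bash -c "$EXECUTABLE $PARAM1 $PARAM2 ..."`) can be sketched outside pydantic as a plain function over an ordered dictionary of values. This is an illustration under stated assumptions, not the code in this commit: `build_shell_command` is a hypothetical name, the example keys `my_binary` and `n_events` are invented, and skipping the `bash_cmd` key itself and casting each value with `str` are choices made here for the sketch. The `TemplateParameters` check from the validator is left out for brevity.

```python
from typing import Any, Dict, List


def build_shell_command(values: Dict[str, Any]) -> Dict[str, Any]:
    """Join parameter values, in insertion order, into one string for `bash -c`."""
    # Keys that are not part of the command. Including "bash_cmd" here is an
    # assumption of this sketch, so the target field is not consumed itself.
    ignored_keys: List[str] = ["lute_config", "executable", "bash_cmd"]
    parts: List[str] = []
    out: Dict[str, Any] = dict(values)  # work on a copy; leave the input untouched
    for key, value in values.items():
        if key in ignored_keys or value is None:
            continue
        parts.append(str(value))  # cast so non-string parameters can be joined
        out[key] = None  # blank the parameter so it is not passed separately
    if not out.get("bash_cmd"):
        # Only fill in the command if the user did not provide one.
        out["bash_cmd"] = " ".join(parts)
    return out


values = {
    "executable": "/bin/bash",
    "bash_cmd": "",
    "my_binary": "/path/to/my_binary",  # hypothetical parameter
    "n_events": 10,  # hypothetical parameter
}
result = build_shell_command(values)
print(result["bash_cmd"])
```

Because the command is built by iterating in order, the sketch relies on the same constraint the docstring states: parameters must be declared in the order they should appear on the command line.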
4 changes: 2 additions & 2 deletions lute/io/models/sfx_find_peaks.py
Expand Up @@ -4,7 +4,7 @@

from pydantic import BaseModel, Field, PositiveInt, validator

from .base import BaseBinaryParameters, TaskParameters, TemplateConfig
from .base import ThirdPartyParameters, TaskParameters, TemplateConfig


class FindPeaksPyAlgosParameters(TaskParameters):
Expand Down Expand Up @@ -125,7 +125,7 @@ def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:
return out_file


class FindPeaksPsocakeParameters(BaseBinaryParameters):
class FindPeaksPsocakeParameters(ThirdPartyParameters):

class SZParameters(BaseModel):
compressor: Literal["qoz", "sz3"] = Field(
Expand Down
8 changes: 4 additions & 4 deletions lute/io/models/sfx_index.py
@@ -1,7 +1,7 @@
"""Models for serial femtosecond crystallography indexing.
Classes:
IndexCrystFELParameters(BaseBinaryParameters): Perform indexing of hits/peaks using
IndexCrystFELParameters(ThirdPartyParameters): Perform indexing of hits/peaks using
CrystFEL's `indexamajig`.
"""

Expand All @@ -23,18 +23,18 @@
)

from ..db import read_latest_db_entry
from .base import BaseBinaryParameters, TaskParameters
from .base import ThirdPartyParameters, TaskParameters


class IndexCrystFELParameters(BaseBinaryParameters):
class IndexCrystFELParameters(ThirdPartyParameters):
"""Parameters for CrystFEL's `indexamajig`.
There are many parameters, and many combinations. For more information on
usage, please refer to the CrystFEL documentation, here:
https://www.desy.de/~twhite/crystfel/manual-indexamajig.html
"""

class Config(BaseBinaryParameters.Config):
class Config(ThirdPartyParameters.Config):
long_flags_use_eq: bool = True
"""Whether long command-line arguments are passed like `--long=arg`."""

Expand Down
20 changes: 10 additions & 10 deletions lute/io/models/sfx_merge.py
@@ -1,13 +1,13 @@
"""Models for merging reflections in serial femtosecond crystallography.
Classes:
MergePartialatorParameters(BaseBinaryParameters): Perform merging using
MergePartialatorParameters(ThirdPartyParameters): Perform merging using
CrystFEL's `partialator`.
CompareHKLParameters(BaseBinaryParameters): Calculate figures of merit using
CompareHKLParameters(ThirdPartyParameters): Calculate figures of merit using
CrystFEL's `compare_hkl`.
ManipulateHKLParameters(BaseBinaryParameters): Perform transformations on
ManipulateHKLParameters(ThirdPartyParameters): Perform transformations on
lists of reflections using CrystFEL's `get_hkl`.
"""

Expand All @@ -23,19 +23,19 @@

from pydantic import Field, validator

from .base import BaseBinaryParameters
from .base import ThirdPartyParameters
from ..db import read_latest_db_entry


class MergePartialatorParameters(BaseBinaryParameters):
class MergePartialatorParameters(ThirdPartyParameters):
"""Parameters for CrystFEL's `partialator`.
There are many parameters, and many combinations. For more information on
usage, please refer to the CrystFEL documentation, here:
https://www.desy.de/~twhite/crystfel/manual-partialator.html
"""

class Config(BaseBinaryParameters.Config):
class Config(ThirdPartyParameters.Config):
long_flags_use_eq: bool = True
"""Whether long command-line arguments are passed like `--long=arg`."""

Expand Down Expand Up @@ -209,15 +209,15 @@ def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:
return out_file


class CompareHKLParameters(BaseBinaryParameters):
class CompareHKLParameters(ThirdPartyParameters):
"""Parameters for CrystFEL's `compare_hkl` for calculating figures of merit.
There are many parameters, and many combinations. For more information on
usage, please refer to the CrystFEL documentation, here:
https://www.desy.de/~twhite/crystfel/manual-partialator.html
"""

class Config(BaseBinaryParameters.Config):
class Config(ThirdPartyParameters.Config):
long_flags_use_eq: bool = True
"""Whether long command-line arguments are passed like `--long=arg`."""

Expand Down Expand Up @@ -337,7 +337,7 @@ def validate_shell_file(cls, shell_file: str, values: Dict[str, Any]) -> str:
return shell_file


class ManipulateHKLParameters(BaseBinaryParameters):
class ManipulateHKLParameters(ThirdPartyParameters):
"""Parameters for CrystFEL's `get_hkl` for manipulating lists of reflections.
This Task is predominantly used internally to convert `hkl` to `mtz` files.
Expand All @@ -347,7 +347,7 @@ class ManipulateHKLParameters(BaseBinaryParameters):
https://www.desy.de/~twhite/crystfel/manual-partialator.html
"""

class Config(BaseBinaryParameters.Config):
class Config(ThirdPartyParameters.Config):
long_flags_use_eq: bool = True
"""Whether long command-line arguments are passed like `--long=arg`."""

Expand Down
10 changes: 5 additions & 5 deletions lute/io/models/sfx_solve.py
@@ -1,7 +1,7 @@
"""Models for structure solution in serial femtosecond crystallography.
Classes:
DimpleSolveParameters(BaseBinaryParameters): Perform structure solution
DimpleSolveParameters(ThirdPartyParameters): Perform structure solution
using CCP4's dimple (molecular replacement).
"""

Expand All @@ -18,11 +18,11 @@

from pydantic import Field, validator, PositiveFloat, PositiveInt

from .base import BaseBinaryParameters, TaskParameters
from .base import ThirdPartyParameters, TaskParameters
from ..db import read_latest_db_entry


class DimpleSolveParameters(BaseBinaryParameters):
class DimpleSolveParameters(ThirdPartyParameters):
"""Parameters for CCP4's dimple program.
There are many parameters. For more information on
Expand Down Expand Up @@ -202,7 +202,7 @@ def validate_out_dir(cls, out_dir: str, values: Dict[str, Any]) -> str:
return out_dir


class RunSHELXCParameters(BaseBinaryParameters):
class RunSHELXCParameters(ThirdPartyParameters):
"""Parameters for CCP4's SHELXC program.
SHELXC prepares files for SHELXD and SHELXE.
Expand Down Expand Up @@ -317,7 +317,7 @@ def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:
return out_file


class RunSHELXDParameters(BaseBinaryParameters):
class RunSHELXDParameters(ThirdPartyParameters):
"""Parameters for CCP4's SHELXD program.
SHELXD performs a heavy atom search. Input files can be created by SHELXC.
Expand Down
6 changes: 3 additions & 3 deletions lute/io/models/smd.py
@@ -1,7 +1,7 @@
"""Models for smalldata_tools Tasks.
Classes:
SubmitSMDParameters(BaseBinaryParameters): Parameters to run smalldata_tools
SubmitSMDParameters(ThirdPartyParameters): Parameters to run smalldata_tools
to produce a smalldata HDF5 file.
FindOverlapXSSParameters(TaskParameters): Parameter model for the
Expand All @@ -24,10 +24,10 @@
validator,
)

from .base import TaskParameters, BaseBinaryParameters, TemplateConfig
from .base import TaskParameters, ThirdPartyParameters, TemplateConfig


class SubmitSMDParameters(BaseBinaryParameters):
class SubmitSMDParameters(ThirdPartyParameters):
"""Parameters for running smalldata to produce reduced HDF5 files."""

executable: str = Field("mpirun", description="MPI executable.", flag_type="")
Expand Down
8 changes: 4 additions & 4 deletions lute/io/models/tests.py
Expand Up @@ -4,7 +4,7 @@
TestParameters(TaskParameters): Model for most basic test case. Single
core first-party Task. Uses only communication via pipes.
TestBinaryParameters(BaseBinaryParameters): Parameters for a simple multi-
TestBinaryParameters(ThirdPartyParameters): Parameters for a simple multi-
threaded binary executable.
TestSocketParameters(TaskParameters): Model for first-party test requiring
Expand Down Expand Up @@ -35,7 +35,7 @@
validator,
)

from .base import TaskParameters, BaseBinaryParameters
from .base import TaskParameters, ThirdPartyParameters
from ..db import read_latest_db_entry


Expand All @@ -53,12 +53,12 @@ class CompoundVar(BaseModel):
throw_error: bool = False


class TestBinaryParameters(BaseBinaryParameters):
class TestBinaryParameters(ThirdPartyParameters):
executable: str = "/sdf/home/d/dorlhiac/test_tasks/test_threads"
p_arg1: int = 1


class TestBinaryErrParameters(BaseBinaryParameters):
class TestBinaryErrParameters(ThirdPartyParameters):
"""Same as TestBinary, but exits with non-zero code."""

executable: str = "/sdf/home/d/dorlhiac/test_tasks/test_threads_err"
Expand Down