diff --git a/dev/search/search_index.json b/dev/search/search_index.json index 97c63c4d..2bd46214 100644 --- a/dev/search/search_index.json +++ b/dev/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Setup","text":"
LUTE is publicly available on GitHub. To run it, the first step is to clone the repository:
# Navigate to the directory of your choice.\ngit clone git@github.com:slac-lcls/lute\n
The repository directory structure is as follows:
lute\n |--- config # Configuration YAML files (see below) and templates for third party config\n |--- docs # Documentation (including this page)\n |--- launch_scripts # Entry points for using SLURM and communicating with Airflow\n |--- lute # Code\n |--- run_task.py # Script to run an individual managed Task\n |--- ...\n |--- utilities # Help utility programs\n |--- workflows # This directory contains workflow definitions. It is synced elsewhere and not used directly.\n\n
In general, most interactions with the software will be through scripts located in the launch_scripts
directory. Some users (for certain use-cases) may also choose to run the run_task.py
script directly - its location is highlighted within the hierarchy above. To begin with, you will need a YAML file, templates for which are available in the config
directory. The structure of the YAML file and how to use the various launch scripts are described in more detail below.
In the utilities
directory there are two useful programs to provide assistance with using the software:
utilities/dbview
: LUTE stores all parameters for every analysis routine it runs (as well as results) in a database. This database is stored in the work_dir
defined in the YAML file (see below). The dbview
utility is a TUI application (Text-based user interface) which runs in the terminal. It allows you to navigate a LUTE database using the arrow keys, etc. Usage is: utilities/dbview -p <path/to/lute.db>
.utilities/lute_help
: This utility provides help and usage information for running LUTE software. E.g., it provides access to parameter descriptions to assist in properly filling out a configuration YAML. Its usage is described in slightly more detail below.LUTE runs code as Task
s that are managed by an Executor
. The Executor
provides modifications to the environment the Task
runs in, as well as controls details of inter-process communication, reporting results to the eLog, etc. Combinations of specific Executor
s and Task
s are already provided, and are referred to as managed Task
s. Managed Task
s are submitted as a single unit. They can be run individually, or a series of independent steps can be submitted all at once in the form of a workflow, or directed acyclic graph (DAG). This latter option makes use of Airflow to manage the individual execution steps.
Running analysis with LUTE is the process of submitting one or more managed Task
s. This is generally a two step process.
Task
s which you may run.Task
submission, or workflow (DAG) submission.These two steps are described below.
"},{"location":"#preparing-a-configuration-yaml","title":"Preparing a Configuration YAML","text":"All Task
s are parameterized through a single configuration YAML file - even third party code which requires its own configuration files is managed through this YAML file. The basic structure is split into two documents, a brief header section which contains information that is applicable across all Task
s, such as the experiment name, run numbers and the working directory, followed by per Task
parameters:
%YAML 1.3\n---\ntitle: \"Some title.\"\nexperiment: \"MYEXP123\"\n# run: 12 # Does not need to be provided\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nTaskOne:\n param_a: 123\n param_b: 456\n param_c:\n sub_var: 3\n sub_var2: 4\n\nTaskTwo:\n new_param1: 3\n new_param2: 4\n\n# ...\n...\n
In the first document, the header, it is important that the work_dir
is properly specified. This is the root directory from which Task
outputs will be written, and the LUTE database will be stored. It may also be desirable to modify the task_timeout
parameter which defines the time limit for individual Task
jobs. By default it is set to 10 minutes, although this may not be sufficient for long running jobs. This value will be applied to all Task
s so should account for the longest running job you expect.
The actual analysis parameters are defined in the second document. As these vary from Task
to Task
, a full description will not be provided here. An actual template with real Task
parameters is available in config/test.yaml
. Your analysis POC can also help you set up and choose the correct Task
s to include as a starting point. The template YAML file has further descriptions of what each parameter does and how to fill it out. You can also refer to the lute_help
program described under the following sub-heading.
Some things to consider and possible points of confusion:
Task
s, the parameters are defined at the Task
level. I.e. the managed Task
and Task
itself have different names, and the names in the YAML refer to the latter. This is because a single Task
can be run using different Executor
configurations, but using the same parameters. The list of managed Task
s is in lute/managed_tasks.py
. A table is also provided below for some routines of interest.Task
The Task
it Runs Task
Description SmallDataProducer
SubmitSMD
Smalldata production CrystFELIndexer
IndexCrystFEL
Crystallographic indexing PartialatorMerger
MergePartialator
Crystallographic merging HKLComparer
CompareHKL
Crystallographic figures of merit HKLManipulator
ManipulateHKL
Crystallographic format conversions DimpleSolver
DimpleSolve
Crystallographic structure solution with molecular replacement PeakFinderPyAlgos
FindPeaksPyAlgos
Peak finding with PyAlgos algorithm. PeakFinderPsocake
FindPeaksPsocake
Peak finding with psocake algorithm. StreamFileConcatenator
ConcatenateStreamFiles
Stream file concatenation."},{"location":"#how-do-i-know-what-parameters-are-available-and-what-they-do","title":"How do I know what parameters are available, and what they do?","text":"A summary of Task
parameters is available through the lute_help
program.
> utilities/lute_help -t [TaskName]\n
Note, some parameters may say \"Unknown description\" - this either means they are using an old-style definition that does not include parameter help, or they may have some internal use. In particular, you will see this for lute_config
on every Task
; this parameter is filled in automatically and should be ignored. For example:
> utilities/lute_help -t IndexCrystFEL\nINFO:__main__:Fetching parameter information for IndexCrystFEL.\nIndexCrystFEL\n-------------\nParameters for CrystFEL's `indexamajig`.\n\nThere are many parameters, and many combinations. For more information on\nusage, please refer to the CrystFEL documentation, here:\nhttps://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n\n\nRequired Parameters:\n--------------------\n[...]\n\nAll Parameters:\n-------------\n[...]\n\nhighres (number)\n Mark all pixels greater than `x` as bad.\n\nprofile (boolean) - Default: False\n Display timing data to monitor performance.\n\ntemp_dir (string)\n Specify a path for the temp files folder.\n\nwait_for_file (integer) - Default: 0\n Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\n\nno_image_data (boolean) - Default: False\n Load only the metadata, no images. Can check indexability without high data requirements.\n\n[...]\n
"},{"location":"#running-managed-tasks-and-workflows-dags","title":"Running Managed Task
s and Workflows (DAGs)","text":"After a YAML file has been filled in you can run a Task
. There are multiple ways to submit a Task
, but the following 3 are the most common:
Task
interactively by running python ...
Task
as a batch job (e.g. on S3DF) via a SLURM submission submit_slurm.sh ...
Task
s).These will be covered in turn below; however, in general all methods will require two parameters: the path to a configuration YAML file, and the name of the managed Task
or workflow you want to run. When submitting via SLURM or submitting an entire workflow there are additional parameters to control these processes.
Task
s interactively","text":"The simplest submission method is just to run Python interactively. In most cases this is not practical for long-running analysis, but may be of use for short Task
s or when debugging. From the root directory of the LUTE repository (or after installation) you can use the run_task.py
script:
> python -B [-O] run_task.py -t <ManagedTaskName> -c </path/to/config/yaml>\n
The command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-O
is the flag controlling whether you run in debug or non-debug mode. By default, i.e. if you do NOT provide this flag, you will run in debug mode, which enables verbose printing. Passing -O
will turn off debug to minimize output.-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.Task
as a batch job","text":"On S3DF you can also submit individual managed Task
s to run as batch jobs. To do so use launch_scripts/submit_slurm.sh
> launch_scripts/submit_slurm.sh -t <ManagedTaskName> -c </path/to/config/yaml> [--debug] $SLURM_ARGS\n
As before, command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.--debug
is the flag to control whether or not to run in debug mode.In addition to the LUTE-specific arguments, SLURM arguments must also be provided ($SLURM_ARGS
above). You can provide as many as you want; however, you will need to provide at least:
--partition=<partition/queue>
- The queue to run on, in general for LCLS this is milano
--account=lcls:<experiment>
- The account to use for batch job accounting.You will likely also want to provide at a minimum:
--ntasks=<...>
to control the number of cores allocated.In general, it is best to prefer the long form of SLURM arguments (--arg=<...>
) in order to avoid potential clashes with present or future LUTE arguments.
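As a concrete illustration, a full submission command might look like the following - the managed Task name, experiment account, and core count here are placeholder values for the example and should be adapted to your own analysis:
> launch_scripts/submit_slurm.sh -t CrystFELIndexer -c /path/to/config.yaml --partition=milano --account=lcls:<experiment> --ntasks=64\n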
Finally, you can submit a full workflow (e.g. SFX analysis, smalldata production and summary results, geometry optimization...). This can be done using a single script, submit_launch_airflow.sh
, similarly to the SLURM submission above:
> launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -c </path/to/yaml.yaml> -w <dag_name> [--debug] [--test] [-e <exp>] [-r <run>] $SLURM_ARGS\n
The submission process is slightly more complicated in this case. A more in-depth explanation is provided under \"Airflow Launch Steps\" in the advanced usage section below, if interested. The parameters are as follows - as before, command-line arguments in square brackets []
are optional, while those in <>
must be provided:
launch_scripts/launch_airflow.py
script located in whatever LUTE installation you are running. All other arguments can come afterwards in any order.-c </path/...>
is the path to the configuration YAML to use.-w <dag_name>
is the name of the DAG (workflow) to run. This replaces the task name provided when using the other two methods above. A DAG list is provided below.-W
(capital W) followed by the path to the workflow instead of -w
. See below for further discussion on this use case.--debug
controls whether to use debug mode (verbose printing)--test
controls whether to use the test or production instance of Airflow to manage the DAG. The instances are running identical versions of Airflow, but the test
instance may have \"test\" DAGs or more bleeding-edge development DAGs.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.The $SLURM_ARGS
must be provided in the same manner as when submitting an individual managed Task
by hand to be run as batch job with the script above. Note that these parameters will be used as the starting point for the SLURM arguments of every managed Task
in the DAG; however, individual steps in the DAG may have overrides built-in where appropriate to make sure that step is not submitted with potentially incompatible arguments. For example, a single threaded analysis Task
may be capped to running on one core, even if in general everything should be running on 100 cores, per the SLURM argument provided. These caps are added during development and cannot be disabled through configuration changes in the YAML.
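As a concrete illustration, submitting one of the DAGs listed below from the command-line might look like the following - the configuration path, experiment, run number, and account are placeholder values for the example:
> launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -c /path/to/config.yaml -w pyalgos_sfx -e <exp> -r <run> --partition=milano --account=lcls:<experiment>\n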
DAG List
find_peaks_index
psocake_sfx_phasing
pyalgos_sfx
eLog
","text":"You can use the script in the previous section to submit jobs through the eLog. To do so navigate to the Workflow > Definitions
tab using the blue navigation bar at the top of the eLog. On this tab, in the top-right corner (underneath the help and zoom icons) you can click the +
sign to add a new workflow. This will bring up a \"Workflow definition\" UI window. When filling out the eLog workflow definition the following fields are needed (all of them):
Name
: You can name the workflow anything you like. It should probably be something descriptive, e.g. if you are using LUTE to run smalldata_tools, you may call the workflow lute_smd
.Executable
: In this field you will put the full path to the submit_launch_airflow.sh
script: /path/to/lute/launch_scripts/submit_launch_airflow.sh
.Parameters
: You will use the parameters as described above. Remember the first argument will be the full path to the launch_airflow.py
script (this is NOT the same as the bash script used in the executable!): /full/path/to/lute/launch_scripts/launch_airflow.py -c <path/to/yaml> -w <dag_name> [--debug] [--test] $SLURM_ARGS
Location
: Be sure to set to S3DF
.Trigger
: You can have the workflow trigger automatically or manually. Which option to choose will depend on the type of workflow you are running. In general the options Manually triggered
(which displays as MANUAL
on the definitions page) and End of a run
(which displays as END_OF_RUN
on the definitions page) are safe options for ALL workflows. The latter will be automatically submitted for you when data acquisition has finished. If you are running a workflow with managed Task
s that work as data is being acquired (e.g. SmallDataProducer
), you may also select Start of a run
(which displays as START_OF_RUN
on the definitions page).Upon clicking create you will see a new entry in the table on the definitions page. In order to run MANUAL
workflows, or re-run automatic workflows, you must navigate to the Workflows > Control
tab. For each acquisition run you will find a drop down menu under the Job
column. To submit a workflow you select it from this drop down menu by the Name
you provided when creating its definition.
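For reference, a filled-in definition for a smalldata workflow might look like the following - the paths and DAG name are illustrative and should be replaced with those of your own LUTE installation and analysis:
Name: lute_smd\nExecutable: /path/to/lute/launch_scripts/submit_launch_airflow.sh\nParameters: /path/to/lute/launch_scripts/launch_airflow.py -c /path/to/config.yaml -w <dag_name> --partition=milano --account=lcls:<experiment>\nLocation: S3DF\nTrigger: END_OF_RUN\n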
Using validator
s, it is possible to define (generally, default) model parameters for a Task
in terms of other parameters. It is also possible to use validated Pydantic model parameters to substitute values into a configuration file required to run a third party Task
(e.g. some Task
s may require their own JSON, TOML files, etc. to run properly). For more information on these types of substitutions, refer to the new_task.md
documentation on Task
creation.
These types of substitutions, however, have a limitation in that they are not easily adapted at run time. They therefore address only a small number of the possible combinations in the dependencies between different input parameters. In order to support more complex relationships between parameters, variable substitutions can also be used in the configuration YAML itself. Using a syntax similar to Jinja
templates, you can define values for YAML parameters in terms of other parameters or environment variables. The values are substituted before Pydantic attempts to validate the configuration.
It is perhaps easiest to illustrate with an example. A test case is provided in config/test_var_subs.yaml
and is reproduced here:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nOtherTask:\n useful_other_var: \"USE ME!\"\n\nNonExistentTask:\n test_sub: \"/path/to/{{ experiment }}/file_r{{ run:04d }}.input\" # Substitute `experiment` and `run` from header above\n test_env_sub: \"/path/to/{{ $EXPERIMENT }}/file.input\" # Substitute from the environment variable $EXPERIMENT\n test_nested:\n a: \"outfile_{{ run }}_one.out\" # Substitute `run` from header above\n b:\n c: \"outfile_{{ run }}_two.out\" # Also substitute `run` from header above\n d: \"{{ OtherTask.useful_other_var }}\" # Substitute `useful_other_var` from `OtherTask`\n test_fmt: \"{{ run:04d }}\" # Substitute `run` and format as 0012\n test_env_fmt: \"{{ $RUN:04d }}\" # Substitute environment variable $RUN and pad to 4 w/ zeros\n...\n
Input parameters in the config YAML can be substituted with either other input parameters or environment variables, with or without limited string formatting. All substitutions occur between double curly brackets: {{ VARIABLE_TO_SUBSTITUTE }}
. Environment variables are indicated by $
in front of the variable name. Parameters from the header, i.e. the first YAML document (top section) containing the run
, experiment
, version fields, etc. can be substituted without any qualification. If you want to use the run
parameter, you can substitute it using {{ run }}
. All other parameters, i.e. from other Task
s or within Task
s, must use a qualified name. Nested levels are delimited using a .
. E.g. consider a structure like:
Task:\n param_set:\n a: 1\n b: 2\n c: 3\n
In order to use parameter c
, you would use {{ Task.param_set.c }}
as the substitution.
Take care when using substitutions! This process will not try to guess for you. When a substitution is not available, e.g. due to misspelling, one of two things will happen:
param: /my/failed/{{ $SUBSTITUTION }}
as your parameter. This may or may not fail the model validation step, but is likely not what you intended.Defining your own parameters
The configuration file is not validated in its totality, only on a Task
-by-Task
basis, but it is read in its totality. E.g. when running MyTask
only that portion of the configuration is validated even though the entire file has been read, and is available for substitutions. As a result, it is safe to introduce extra entries into the YAML file, as long as they are not entered under a specific Task
's configuration. This may be useful to create your own global substitutions, for example if there is a key variable that may be used across different Task
s. E.g. Consider a case where you want to create a more generic configuration file where a single variable is used by multiple Task
s. This single variable may be changed between experiments, for instance, but is likely static for the duration of a single set of analyses. In order to avoid a mistake when changing the configuration between experiments you can define this special variable (or variables) as a separate entry in the YAML, and make use of substitutions in each Task
's configuration. This way the variable only needs to be changed in one place.
# Define our substitution. This is only for substitutions!\nMY_SPECIAL_SUB: \"EXPMT_DEPENDENT_VALUE\" # Can change here once per experiment!\n\nRunTask1:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_1: 1\n var_2: \"a\"\n # ...\n\nRunTask2:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_3: \"abcd\"\n var_4: 123\n # ...\n\nRunTask3:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n #...\n\n# ... and so on\n
"},{"location":"#gotchas","title":"Gotchas!","text":"Order matters
While in general you can use parameters that appear later in a YAML document to substitute for values of parameters that appear earlier, the substitutions themselves will be performed in order of appearance. It is therefore NOT possible to correctly use a later parameter as a substitution for an earlier one, if the later one itself depends on a substitution. The YAML document, however, can be rearranged without error. The order in the YAML document has no effect on execution order which is determined purely by the workflow definition. As mentioned above, the document is not validated in its entirety so rearrangements are allowed. For example consider the following situation which produces an incorrect substitution:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will incorrectly be \"{{ work_dir }}/additional_path/{{ run }}\"\n # ...\n\nRunTaskTwo:\n # Remember `work_dir` and `run` come from the header document and don't need to\n # be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n...\n
This configuration can be rearranged to achieve the desired result:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskTwo:\n # Remember `work_dir` comes from the header document and doesn't need to be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will now be /sdf/data/lcls/ds/exp/experiment/scratch/additional_path/12\n # ...\n...\n
On the other hand, relationships such as these may point to inconsistencies in the dependencies between Task
s which may warrant a refactor.
Found unhashable key
To avoid YAML parsing issues when using the substitution syntax, be sure to quote your substitutions. Before substitution is performed, a dictionary is first constructed by the pyyaml
package which parses the document - it may fail to parse the document and raise an exception if the substitutions are not quoted. E.g.
# USE THIS\nMyTask:\n var_sub: \"{{ other_var:04d }}\"\n\n# **DO NOT** USE THIS\nMyTask:\n var_sub: {{ other_var:04d }}\n
During validation, Pydantic will by default cast variables if possible; because of this, it is generally safe to use strings for substitutions. E.g. if your parameter is expecting an integer, and after substitution you pass \"2\"
, Pydantic will cast this to the int
2
, and validation will succeed. As part of the substitution process limited type casting will also be handled if it is necessary for any formatting strings provided. E.g. \"{{ run:04d }}\"
requires that run be an integer, so it will be treated as such in order to apply the formatting.
In most cases, standard DAGs should be called as described above. However, Airflow also supports the dynamic creation of DAGs, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Consider a simplified serial femtosecond crystallography DAG which runs peak finding through merging and then calculates some statistics. I.e. we want an execution order that looks like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n slurm_params: ''\n next:\n - task_name: PartialatorMerger\n slurm_params: ''\n next:\n - task_name: HKLComparer\n slurm_params: ''\n next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
. This name must be identical to a managed Task
defined in the LUTE installation you are using.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the next
list are run in parallel.As a second example, to run task1
followed by task2
and task3
in parallel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined in this way, we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
Special markers have been inserted at certain points in the execution flow for LUTE. These can be enabled by setting the environment variables detailed below. These are intended to allow developers to exit the program at certain points to investigate behaviour or a bug. For instance, when working on configuration parsing, an environment variable can be set which exits the program after passing this step. This allows you to run LUTE otherwise as normal (described above), without having to modify any additional code or insert your own early exits.
Types of debug markers:
LUTE_DEBUG_EXIT
: Will exit the program at this point if the corresponding environment variable has been set.Developers can insert these markers as needed into their code to add new exit points, although as a rule of thumb they should be used sparingly, and generally only after major steps in the execution flow (e.g. after parsing, after beginning a task, after returning a result, etc.).
In order to include a new marker in your code:
from lute.execution.debug_utils import LUTE_DEBUG_EXIT\n\ndef my_code() -> None:\n # ...\n LUTE_DEBUG_EXIT(\"MYENVVAR\", \"Additional message to print\")\n # If MYENVVAR is not set, the above function does nothing\n
You can enable a marker by setting the corresponding environment variable to 1, e.g. to enable the example marker above while running Tester
:
MYENVVAR=1 python -B run_task.py -t Tester -c config/test.yaml\n
"},{"location":"#currently-used-environment-variables","title":"Currently used environment variables","text":"LUTE_DEBUG_EXIT_AT_YAML
: Exits the program after reading in a YAML configuration file and performing variable substitutions, but BEFORE Pydantic validation.LUTE_DEBUG_BEFORE_TPP_EXEC
: Exits the program after a ThirdPartyTask has prepared its submission command, but before exec
is used to run it.The Airflow launch process actually involves a number of steps, and is rather complicated. There are two wrapper steps prior to getting to the actual Airflow API communication.
launch_scripts/submit_launch_airflow.sh
is run./sdf/group/lcls/ds/tools/lute_launcher
with all the same parameters that it was called with.lute_launcher
runs the launch_scripts/launch_airflow.py
script which was provided as the first argument. This is the true launch scriptlaunch_airflow.py
communicates with the Airflow API, requesting that a specific DAG be launched. It then continues to run, and gathers the individual logs and the exit status of each step of the DAG.launch_scripts/submit_slurm.sh
.There are some specific reasons for this complexity:
submit_launch_airflow.sh
as a thin-wrapper around lute_launcher
is to allow the true Airflow launch script to be a long-lived job. This is for compatibility with the eLog and the ARP. When run from the eLog as a workflow, the job submission process must occur within 30 seconds due to a timeout built into the system. This is fine when submitting jobs to run on the batch-nodes, as the submission to the queue takes very little time. So here, submit_launch_airflow.sh
serves as a thin script to have lute_launcher
run as a batch job. It can then run as a long-lived job (for the duration of the entire DAG) collecting log files all in one place. This allows the log for each stage of the Airflow DAG to be inspected in a single file, and through the eLog browser interface.lute_launcher
as a wrapper around launch_airflow.py
is to manage authentication and credentials. The launch_airflow.py
script requires loading credentials in order to authenticate against the Airflow API. For the average user this is not possible, unless the script is run from within the lute_launcher
process.LUTE is publicly available on GitHub. To run it, the first step is to clone the repository:
# Navigate to the directory of your choice.\ngit clone git@github.com:slac-lcls/lute\n
The repository directory structure is as follows:
lute\n |--- config # Configuration YAML files (see below) and templates for third party config\n |--- docs # Documentation (including this page)\n |--- launch_scripts # Entry points for using SLURM and communicating with Airflow\n |--- lute # Code\n |--- run_task.py # Script to run an individual managed Task\n |--- ...\n |--- utilities # Help utility programs\n |--- workflows # This directory contains workflow definitions. It is synced elsewhere and not used directly.\n\n
In general, most interactions with the software will be through scripts located in the launch_scripts
directory. Some users (for certain use-cases) may also choose to run the run_task.py
script directly - its location is highlighted within the hierarchy above. To begin with, you will need a YAML file, templates for which are available in the config
directory. The structure of the YAML file and how to use the various launch scripts are described in more detail below.
In the utilities
directory there are two useful programs to provide assistance with using the software:
utilities/dbview
: LUTE stores all parameters for every analysis routine it runs (as well as results) in a database. This database is stored in the work_dir
defined in the YAML file (see below). The dbview
utility is a TUI application (Text-based user interface) which runs in the terminal. It allows you to navigate a LUTE database using the arrow keys, etc. Usage is: utilities/dbview -p <path/to/lute.db>
.utilities/lute_help
: This utility provides help and usage information for running LUTE software. E.g., it provides access to parameter descriptions to assist in properly filling out a configuration YAML. Its usage is described in slightly more detail below.LUTE runs code as Task
s that are managed by an Executor
. The Executor
provides modifications to the environment the Task
runs in, as well as controls details of inter-process communication, reporting results to the eLog, etc. Combinations of specific Executor
s and Task
s are already provided, and are referred to as managed Task
s. Managed Task
s are submitted as a single unit. They can be run individually, or a series of independent steps can be submitted all at once in the form of a workflow, or directed acyclic graph (DAG). This latter option makes use of Airflow to manage the individual execution steps.
Running analysis with LUTE is the process of submitting one or more managed Task
s. This is generally a two step process.
Task
s which you may run.Task
submission, or workflow (DAG) submission.These two steps are described below.
"},{"location":"usage/#preparing-a-configuration-yaml","title":"Preparing a Configuration YAML","text":"All Task
s are parameterized through a single configuration YAML file - even third party code which requires its own configuration files is managed through this YAML file. The basic structure is split into two documents, a brief header section which contains information that is applicable across all Task
s, such as the experiment name, run numbers and the working directory, followed by per Task
parameters:
%YAML 1.3\n---\ntitle: \"Some title.\"\nexperiment: \"MYEXP123\"\n# run: 12 # Does not need to be provided\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nTaskOne:\n param_a: 123\n param_b: 456\n param_c:\n sub_var: 3\n sub_var2: 4\n\nTaskTwo:\n new_param1: 3\n new_param2: 4\n\n# ...\n...\n
In the first document, the header, it is important that the work_dir
is properly specified. This is the root directory from which Task
outputs will be written, and the LUTE database will be stored. It may also be desirable to modify the task_timeout
parameter which defines the time limit for individual Task
jobs. By default it is set to 10 minutes, although this may not be sufficient for long running jobs. This value will be applied to all Task
s so should account for the longest running job you expect.
The actual analysis parameters are defined in the second document. As these vary from Task
to Task
, a full description will not be provided here. An actual template with real Task
parameters is available in config/test.yaml
. Your analysis POC can also help you set up and choose the correct Task
s to include as a starting point. The template YAML file has further descriptions of what each parameter does and how to fill it out. You can also refer to the lute_help
program described under the following sub-heading.
Some things to consider and possible points of confusion:
Task
s, the parameters are defined at the Task
level. I.e. the managed Task
and Task
itself have different names, and the names in the YAML refer to the latter. This is because a single Task
can be run using different Executor
configurations, but using the same parameters. The list of managed Task
s is in lute/managed_tasks.py
. A table is also provided below for some routines of interest.Task
The Task
it Runs Task
Description SmallDataProducer
SubmitSMD
Smalldata production CrystFELIndexer
IndexCrystFEL
Crystallographic indexing PartialatorMerger
MergePartialator
Crystallographic merging HKLComparer
CompareHKL
Crystallographic figures of merit HKLManipulator
ManipulateHKL
Crystallographic format conversions DimpleSolver
DimpleSolve
Crystallographic structure solution with molecular replacement PeakFinderPyAlgos
FindPeaksPyAlgos
Peak finding with PyAlgos algorithm. PeakFinderPsocake
FindPeaksPsocake
Peak finding with psocake algorithm. StreamFileConcatenator
ConcatenateStreamFiles
Stream file concatenation."},{"location":"usage/#how-do-i-know-what-parameters-are-available-and-what-they-do","title":"How do I know what parameters are available, and what they do?","text":"A summary of Task
parameters is available through the lute_help
program.
> utilities/lute_help -t [TaskName]\n
Note, some parameters may say \"Unknown description\" - this either means they are using an old-style definition that does not include parameter help, or they may have some internal use. In particular, you will see this for lute_config
on every Task
; this parameter is filled in automatically and should be ignored. For example:
> utilities/lute_help -t IndexCrystFEL\nINFO:__main__:Fetching parameter information for IndexCrystFEL.\nIndexCrystFEL\n-------------\nParameters for CrystFEL's `indexamajig`.\n\nThere are many parameters, and many combinations. For more information on\nusage, please refer to the CrystFEL documentation, here:\nhttps://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n\n\nRequired Parameters:\n--------------------\n[...]\n\nAll Parameters:\n-------------\n[...]\n\nhighres (number)\n Mark all pixels greater than `x` as bad.\n\nprofile (boolean) - Default: False\n Display timing data to monitor performance.\n\ntemp_dir (string)\n Specify a path for the temp files folder.\n\nwait_for_file (integer) - Default: 0\n Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\n\nno_image_data (boolean) - Default: False\n Load only the metadata, no images. Can check indexability without high data requirements.\n\n[...]\n
"},{"location":"usage/#running-managed-tasks-and-workflows-dags","title":"Running Managed Task
s and Workflows (DAGs)","text":"After a YAML file has been filled in you can run a Task
. There are multiple ways to submit a Task
, but the following 3 are the most common:
Task
interactively by running python ...
Task
as a batch job (e.g. on S3DF) via a SLURM submission submit_slurm.sh ...
Task
s).These will be covered in turn below; however, in general all methods will require two parameters: the path to a configuration YAML file, and the name of the managed Task
or workflow you want to run. When submitting via SLURM or submitting an entire workflow there are additional parameters to control these processes.
Task
s interactively","text":"The simplest submission method is just to run Python interactively. In most cases this is not practical for long-running analysis, but may be of use for short Task
s or when debugging. From the root directory of the LUTE repository (or after installation) you can use the run_task.py
script:
> python -B [-O] run_task.py -t <ManagedTaskName> -c </path/to/config/yaml>\n
The command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-O
is the flag controlling whether you run in debug or non-debug mode. By default, i.e. if you do NOT provide this flag, you will run in debug mode, which enables verbose printing. Passing -O
will turn off debug to minimize output.-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.Task
as a batch job","text":"On S3DF you can also submit individual managed Task
s to run as batch jobs. To do so use launch_scripts/submit_slurm.sh
> launch_scripts/submit_slurm.sh -t <ManagedTaskName> -c </path/to/config/yaml> [--debug] $SLURM_ARGS\n
As before, command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.--debug
is the flag to control whether or not to run in debug mode.In addition to the LUTE-specific arguments, SLURM arguments must also be provided ($SLURM_ARGS
above). You can provide as many as you want; however, you will need to provide at least:
--partition=<partition/queue>
- The queue to run on, in general for LCLS this is milano
--account=lcls:<experiment>
- The account to use for batch job accounting.You will likely also want to provide at a minimum:
--ntasks=<...>
to control the number of cores allocated.In general, it is best to prefer the long form of SLURM arguments (--arg=<...>
) in order to avoid potential clashes with present or future LUTE arguments.
Finally, you can submit a full workflow (e.g. SFX analysis, smalldata production and summary results, geometry optimization...). This can be done using a single script, submit_launch_airflow.sh
, similarly to the SLURM submission above:
> launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -c </path/to/yaml.yaml> -w <dag_name> [--debug] [--test] [-e <exp>] [-r <run>] $SLURM_ARGS\n
The submission process is slightly more complicated in this case. A more in-depth explanation is provided under \"Airflow Launch Steps\" in the advanced usage section below, if interested. The parameters are as follows - as before, command-line arguments in square brackets []
are optional, while those in <>
must be provided:
launch_scripts/launch_airflow.py
script located in whatever LUTE installation you are running. All other arguments can come afterwards in any order.-c </path/...>
is the path to the configuration YAML to use.-w <dag_name>
is the name of the DAG (workflow) to run. This replaces the task name provided when using the other two methods above. A DAG list is provided below.-W
(capital W) followed by the path to the workflow instead of -w
. See below for further discussion on this use case.--debug
controls whether to use debug mode (verbose printing)--test
controls whether to use the test or production instance of Airflow to manage the DAG. The instances are running identical versions of Airflow, but the test
instance may have \"test\" DAGs or more bleeding-edge development DAGs.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.The $SLURM_ARGS
must be provided in the same manner as when submitting an individual managed Task
by hand to be run as batch job with the script above. Note that these parameters will be used as the starting point for the SLURM arguments of every managed Task
in the DAG; however, individual steps in the DAG may have overrides built-in where appropriate to make sure that step is not submitted with potentially incompatible arguments. For example, a single threaded analysis Task
may be capped to running on one core, even if in general everything should be running on 100 cores, per the SLURM argument provided. These caps are added during development and cannot be disabled through configuration changes in the YAML.
DAG List
find_peaks_index
psocake_sfx_phasing
pyalgos_sfx
eLog
","text":"You can use the script in the previous section to submit jobs through the eLog. To do so navigate to the Workflow > Definitions
tab using the blue navigation bar at the top of the eLog. On this tab, in the top-right corner (underneath the help and zoom icons) you can click the +
sign to add a new workflow. This will bring up a \"Workflow definition\" UI window. When filling out the eLog workflow definition the following fields are needed (all of them):
Name
: You can name the workflow anything you like. It should probably be something descriptive, e.g. if you are using LUTE to run smalldata_tools, you may call the workflow lute_smd
.Executable
: In this field you will put the full path to the submit_launch_airflow.sh
script: /path/to/lute/launch_scripts/submit_launch_airflow.sh
.Parameters
: You will use the parameters as described above. Remember the first argument will be the full path to the launch_airflow.py
script (this is NOT the same as the bash script used in the executable!): /full/path/to/lute/launch_scripts/launch_airflow.py -c <path/to/yaml> -w <dag_name> [--debug] [--test] $SLURM_ARGS
Location
: Be sure to set to S3DF
.Trigger
: You can have the workflow trigger automatically or manually. Which option to choose will depend on the type of workflow you are running. In general the options Manually triggered
(which displays as MANUAL
on the definitions page) and End of a run
(which displays as END_OF_RUN
on the definitions page) are safe options for ALL workflows. The latter will be automatically submitted for you when data acquisition has finished. If you are running a workflow with managed Task
s that work as data is being acquired (e.g. SmallDataProducer
), you may also select Start of a run
(which displays as START_OF_RUN
on the definitions page).Upon clicking create you will see a new entry in the table on the definitions page. In order to run MANUAL
workflows, or re-run automatic workflows, you must navigate to the Workflows > Control
tab. For each acquisition run you will find a drop down menu under the Job
column. To submit a workflow you select it from this drop down menu by the Name
you provided when creating its definition.
Using validator
s, it is possible to define (generally, default) model parameters for a Task
in terms of other parameters. It is also possible to use validated Pydantic model parameters to substitute values into a configuration file required to run a third party Task
(e.g. some Task
s may require their own JSON, TOML files, etc. to run properly). For more information on these types of substitutions, refer to the new_task.md
documentation on Task
creation.
These types of substitutions, however, have a limitation in that they are not easily adapted at run time. They therefore address only a small number of the possible combinations in the dependencies between different input parameters. In order to support more complex relationships between parameters, variable substitutions can also be used in the configuration YAML itself. Using a syntax similar to Jinja
templates, you can define values for YAML parameters in terms of other parameters or environment variables. The values are substituted before Pydantic attempts to validate the configuration.
It is perhaps easiest to illustrate with an example. A test case is provided in config/test_var_subs.yaml
and is reproduced here:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nOtherTask:\n useful_other_var: \"USE ME!\"\n\nNonExistentTask:\n test_sub: \"/path/to/{{ experiment }}/file_r{{ run:04d }}.input\" # Substitute `experiment` and `run` from header above\n test_env_sub: \"/path/to/{{ $EXPERIMENT }}/file.input\" # Substitute from the environment variable $EXPERIMENT\n test_nested:\n a: \"outfile_{{ run }}_one.out\" # Substitute `run` from header above\n b:\n c: \"outfile_{{ run }}_two.out\" # Also substitute `run` from header above\n d: \"{{ OtherTask.useful_other_var }}\" # Substitute `useful_other_var` from `OtherTask`\n test_fmt: \"{{ run:04d }}\" # Substitute `run` and format as 0012\n test_env_fmt: \"{{ $RUN:04d }}\" # Substitute environment variable $RUN and pad to 4 w/ zeros\n...\n
Input parameters in the config YAML can be substituted with either other input parameters or environment variables, with or without limited string formatting. All substitutions occur between double curly brackets: {{ VARIABLE_TO_SUBSTITUTE }}
. Environment variables are indicated by $
in front of the variable name. Parameters from the header, i.e. the first YAML document (top section) containing the run
, experiment
, version fields, etc. can be substituted without any qualification. If you want to use the run
parameter, you can substitute it using {{ run }}
. All other parameters, i.e. from other Task
s or within Task
s, must use a qualified name. Nested levels are delimited using a .
. E.g. consider a structure like:
Task:\n param_set:\n a: 1\n b: 2\n c: 3\n
In order to use parameter c
, you would use {{ Task.param_set.c }}
as the substitution.
Take care when using substitutions! This process will not try to guess for you. When a substitution is not available, e.g. due to misspelling, one of two things will happen:
param: /my/failed/{{ $SUBSTITUTION }}
as your parameter. This may or may not fail the model validation step, but is likely not what you intended.Defining your own parameters
The configuration file is not validated in its totality, only on a Task
-by-Task
basis, but it is read in its totality. E.g. when running MyTask
only that portion of the configuration is validated even though the entire file has been read, and is available for substitutions. As a result, it is safe to introduce extra entries into the YAML file, as long as they are not entered under a specific Task
's configuration. This may be useful to create your own global substitutions, for example if there is a key variable that may be used across different Task
s. E.g. Consider a case where you want to create a more generic configuration file where a single variable is used by multiple Task
s. This single variable may be changed between experiments, for instance, but is likely static for the duration of a single set of analyses. In order to avoid a mistake when changing the configuration between experiments you can define this special variable (or variables) as a separate entry in the YAML, and make use of substitutions in each Task
's configuration. This way the variable only needs to be changed in one place.
# Define our substitution. This is only for substitutions!\nMY_SPECIAL_SUB: \"EXPMT_DEPENDENT_VALUE\" # Can change here once per experiment!\n\nRunTask1:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_1: 1\n var_2: \"a\"\n # ...\n\nRunTask2:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_3: \"abcd\"\n var_4: 123\n # ...\n\nRunTask3:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n #...\n\n# ... and so on\n
"},{"location":"usage/#gotchas","title":"Gotchas!","text":"Order matters
While in general you can use parameters that appear later in a YAML document to substitute for values of parameters that appear earlier, the substitutions themselves will be performed in order of appearance. It is therefore NOT possible to correctly use a later parameter as a substitution for an earlier one, if the later one itself depends on a substitution. The YAML document, however, can be rearranged without error. The order in the YAML document has no effect on execution order which is determined purely by the workflow definition. As mentioned above, the document is not validated in its entirety so rearrangements are allowed. For example consider the following situation which produces an incorrect substitution:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will incorrectly be \"{{ work_dir }}/additional_path/{{ run }}\"\n # ...\n\nRunTaskTwo:\n # Remember `work_dir` and `run` come from the header document and don't need to\n # be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n...\n
This configuration can be rearranged to achieve the desired result:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskTwo:\n # Remember `work_dir` comes from the header document and doesn't need to be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will now be /sdf/data/lcls/ds/exp/experiment/scratch/additional_path/12\n # ...\n...\n
On the other hand, relationships such as these may point to inconsistencies in the dependencies between Task
s which may warrant a refactor.
Found unhashable key
To avoid YAML parsing issues when using the substitution syntax, be sure to quote your substitutions. Before substitution is performed, a dictionary is first constructed by the pyyaml
package which parses the document - it may fail to parse the document and raise an exception if the substitutions are not quoted. E.g.
# USE THIS\nMyTask:\n var_sub: \"{{ other_var:04d }}\"\n\n# **DO NOT** USE THIS\nMyTask:\n var_sub: {{ other_var:04d }}\n
During validation, Pydantic will by default cast variables if possible; because of this, it is generally safe to use strings for substitutions. E.g. if your parameter is expecting an integer, and after substitution you pass \"2\"
, Pydantic will cast this to the int
2
, and validation will succeed. As part of the substitution process limited type casting will also be handled if it is necessary for any formatting strings provided. E.g. \"{{ run:04d }}\"
requires that run be an integer, so it will be treated as such in order to apply the formatting.
In most cases, standard DAGs should be called as described above. However, Airflow also supports the dynamic creation of DAGs, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Consider a simplified serial femtosecond crystallography DAG which runs peak finding through merging and then calculates some statistics. I.e. we want an execution order that looks like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n slurm_params: ''\n next:\n - task_name: PartialatorMerger\n slurm_params: ''\n next:\n - task_name: HKLComparer\n slurm_params: ''\n next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
. This name must be identical to a managed Task
defined in the LUTE installation you are using.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the next
list are run in parallel.As a second example, to run task1
followed by task2
and task3
in parallel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined in this way, we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
Special markers have been inserted at certain points in the execution flow for LUTE. These can be enabled by setting the environment variables detailed below. These are intended to allow developers to exit the program at certain points to investigate behaviour or a bug. For instance, when working on configuration parsing, an environment variable can be set which exits the program after passing this step. This allows you to run LUTE otherwise as normal (described above), without having to modify any additional code or insert your own early exits.
Types of debug markers:
LUTE_DEBUG_EXIT
: Will exit the program at this point if the corresponding environment variable has been set.Developers can insert these markers as needed into their code to add new exit points, although as a rule of thumb they should be used sparingly, and generally only after major steps in the execution flow (e.g. after parsing, after beginning a task, after returning a result, etc.).
In order to include a new marker in your code:
from lute.execution.debug_utils import LUTE_DEBUG_EXIT\n\ndef my_code() -> None:\n # ...\n LUTE_DEBUG_EXIT(\"MYENVVAR\", \"Additional message to print\")\n # If MYENVVAR is not set, the above function does nothing\n
You can enable a marker by setting the corresponding environment variable to 1, e.g. to enable the example marker above while running Tester
:
MYENVVAR=1 python -B run_task.py -t Tester -c config/test.yaml\n
"},{"location":"usage/#currently-used-environment-variables","title":"Currently used environment variables","text":"LUTE_DEBUG_EXIT_AT_YAML
: Exits the program after reading in a YAML configuration file and performing variable substitutions, but BEFORE Pydantic validation.LUTE_DEBUG_BEFORE_TPP_EXEC
: Exits the program after a ThirdPartyTask has prepared its submission command, but before exec
is used to run it.The Airflow launch process actually involves a number of steps, and is rather complicated. There are two wrapper steps prior to getting to the actual Airflow API communication.
launch_scripts/submit_launch_airflow.sh
is run./sdf/group/lcls/ds/tools/lute_launcher
with all the same parameters that it was called with.lute_launcher
runs the launch_scripts/launch_airflow.py
script which was provided as the first argument. This is the true launch scriptlaunch_airflow.py
communicates with the Airflow API, requesting that a specific DAG be launched. It then continues to run, and gathers the individual logs and the exit status of each step of the DAG.launch_scripts/submit_slurm.sh
.There are some specific reasons for this complexity:
submit_launch_airflow.sh
as a thin wrapper around lute_launcher
is to allow the true Airflow launch script to be a long-lived job. This is for compatibility with the eLog and the ARP. When run from the eLog as a workflow, the job submission process must occur within 30 seconds due to a timeout built into the system. This is fine when submitting jobs to run on the batch nodes, as the submission to the queue takes very little time. So here, submit_launch_airflow.sh
serves as a thin script to have lute_launcher
run as a batch job. It can then run as a long-lived job (for the duration of the entire DAG) collecting log files all in one place. This allows the log for each stage of the Airflow DAG to be inspected in a single file, and through the eLog browser interface.lute_launcher
as a wrapper around launch_airflow.py
is to manage authentication and credentials. The launch_airflow.py
script requires loading credentials in order to authenticate against the Airflow API. For the average user this is not possible, unless the script is run from within the lute_launcher
process.madr_template.md
for creating new ADRs. This template was adapted from the MADR template (MIT License).Task
s inherit from a base class Accepted 2 2023-11-06 Analysis Task
submission and communication is performed via Executor
s Accepted 3 2023-11-06 Executor
s will run all Task
s via subprocess Proposed 4 2023-11-06 Airflow Operator
s and LUTE Executor
s are separate entities. Proposed 5 2023-12-06 Task-Executor IPC is Managed by Communicator Objects Proposed 6 2024-02-12 Third-party Config Files Managed by Templates Rendered by ThirdPartyTask
s Proposed 7 2024-02-12 Task
Configuration is Stored in a Database Managed by Executor
s Proposed 8 2024-03-18 Airflow credentials/authorization requires special launch program. Proposed 9 2024-04-15 Airflow launch script will run as long lived batch job. Proposed"},{"location":"adrs/MADR_LICENSE/","title":"MADR LICENSE","text":"Copyright 2022 ADR Github Organization
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \u201cSoftware\u201d), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED \u201cAS IS\u201d, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"},{"location":"adrs/adr-1/","title":"[ADR-1] All Analysis Tasks Inherit from a Base Class","text":"Date: 2023-11-06
"},{"location":"adrs/adr-1/#status","title":"Status","text":"Accepted
"},{"location":"adrs/adr-1/#context-and-problem-statement","title":"Context and Problem Statement","text":"btx
tasks had heterogenous interfaces.Task
s simultaneously.Date: 2023-11-06
"},{"location":"adrs/adr-2/#status","title":"Status","text":"Accepted
"},{"location":"adrs/adr-2/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
code itself provides a separation of concerns allowing Task
s to run indepently of execution environment.Executor
can prepare environment, submission requirements, etc.Executor
classes avoids maintaining that code independently for each task (cf. alternatives considered).Executor
level and immediately applied to all Task
s.Task
code.btx
tasks. E.g. task timeout leading to failure of a processing pipeline even if substantial work had been done and subsequent tasks could proceed.Task
submission already exist in the original btx
but the methods were not fully standardized.JobScheduler
submission vs direct submission of the task.Task
class interface as pre/post analysis operations.Task
subclasses for different execution environments.Task
class.Task
code independent of execution environment.Task
failure.Executor
s as the \"Managed Task\"Task
s will not be submitted independently.Executor
s will run all Task
s via subprocess","text":"Date: 2023-11-06
"},{"location":"adrs/adr-3/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-3/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
s from within the Executor
(cf. ADR-2)Task
s, at all locations, but at the very least all Task
s at a single location (e.g. S3DF, NERSC)Task
submission, but have to submit both first-party and third-party code.JobScheduler
for btx
multiprocessing
at the Python level.Operator
s and LUTE Executor
s are Separate Entities","text":"Date: 2023-11-06
"},{"location":"adrs/adr-4/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-4/#context-and-problem-statement","title":"Context and Problem Statement","text":"Executor
which in turn submits the Task
*
"},{"location":"adrs/adr-4/#considered-options","title":"Considered Options","text":"*
"},{"location":"adrs/adr-4/#consequences","title":"Consequences","text":"*
"},{"location":"adrs/adr-4/#compliance","title":"Compliance","text":""},{"location":"adrs/adr-4/#metadata","title":"Metadata","text":""},{"location":"adrs/adr-5/","title":"[ADR-5] Task-Executor IPC is Managed by Communicator Objects","text":"Date: 2023-12-06
"},{"location":"adrs/adr-5/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-5/#context-and-problem-statement","title":"Context and Problem Statement","text":"Communicator
objects which maintain simple read
and write
mechanisms for Message
objects. These latter can contain arbitrary Python objects. Task
s do not interact directly with the communicator, but rather through specific instance methods which hide the communicator interfaces. Multiple Communicators can be used in parallel. The same Communicator
objects are used identically at the Task
and Executor
layers - any changes to communication protocols are not transferred to the calling objects.
Task
output needs to be routed to other layers of the software, but the Task
s themselves should have no knowledge of where the output ends up.subprocess
Task
and Executor
layers.Communicator
: Abstract base class - defines interfacePipeCommunicator
: Manages communication through pipes (stderr
and stdout
)SocketCommunicator
: Manages communication through Unix socketsTask
and Executor
side, IPC is greatly simplifiedCommunicator
Communicator
objects are non-public. Their interfaces (already limited) are handled by simple methods in the base classes of Task
s and Executor
s.Communicator
should have no need to be directly manipulated by callers (even less so by users)ThirdPartyTask
s","text":"Date: 2024-02-12
"},{"location":"adrs/adr-6/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-6/#context-and-problem-statement","title":"Context and Problem Statement","text":"Templates will be used for the third party configuration files. A generic interface to heterogenous templates will be provided through a combination of pydantic models and the ThirdPartyTask
implementation. The pydantic models will label extra arguments to ThirdPartyTask
s as being TemplateParameters
. I.e. any extra parameters are considered to be for a templated configuration file. The ThirdPartyTask
will find the necessary template and render it if any extra parameters are found. This puts the burden of correct parsing on the template definition itself.
Task
interface as possible - but due to the above, need a way of handling multiple output files.Task
to be run before the main ThirdPartyTask
.Task
.ThirdPartyTask
s to be run as instances of a single class.Task
Configuration is Stored in a Database Managed by Executor
s","text":"Date: 2024-02-12
"},{"location":"adrs/adr-7/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-7/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
parameter configurations.Task
's code is designed to be independent of other Task
's aside from code shared by inheritance.Task
s are intended to be defined only at the level of workflows.Task
s may have implicit dependencies on others. E.g. one Task
may use the output files of another, and so could benefit from having knowledge of where they were written.Upon Task
completion the managing Executor
will write the AnalysisConfig
object, including TaskParameters
, results and generic configuration information to a database. Some entries from this database can be retrieved to provide default files for TaskParameter
fields; however, the Task
itself has no knowledge of the database and does not access it.\n
Task
s while allowing information to be shared between them.Task
-independent IO be managed solely at the Executor
level.Task
s write the database.Task
s pass information through other mechanisms, such as Airflow.sqlite
which should make everything transferrable.Task
s without any explicit code dependencies/linkages between them.Date: 2024-03-18
"},{"location":"adrs/adr-8/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-8/#context-and-problem-statement","title":"Context and Problem Statement","text":"A closed-source lute_launcher
program will be used to run the Airflow launch scripts. This program accesses credentials with the correct permissions. Users should otherwise not have access to the credentials. This will help ensure the credentials can be used by everyone but only to run workflows and not perform restricted admin activities.
Date: 2024-04-15
"},{"location":"adrs/adr-9/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-9/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
will produce its own log file.The Airflow launch script will be a long lived process, running for the duration of the entire DAG. It will provide basic status logging information, e.g. what Task
s are running, if they succeed or failed. Additionally, at the end of each Task
job, the launch job will collect the log file from that job and append it to its own log.
As the Airflow launch script is an entry point used from the eLog, only its log file is available to users using that UI. By converting the launch script into a long-lived monitoring job it allows the log information to be easily accessible.
In order to accomplish this, the launch script must be submitted as a batch job so that it complies with the 30 second timeout imposed on jobs run by the ARP. This necessitates providing an additional wrapper script.
"},{"location":"adrs/adr-9/#decision-drivers","title":"Decision Drivers","text":"--open-mode=append
for SLURM)submit_launch_airflow.sh
which submits the launch_airflow.py
script (run by lute_launcher
) as a batch job.launch_airflow.py
) and 1 for the Executor
process. {ADR #X : Short description/title of feature/decision}
Date:
"},{"location":"adrs/madr_template/#status","title":"Status","text":"{Accepted | Proposed | Rejected | Deprecated | Superseded} {If this proposal supersedes another, please indicate so, e.g. \"Status: Accepted, supersedes [ADR-3]\"} {Likewise, if this proposal was superseded, e.g. \"Status: Superseded by [ADR-2]\"}
"},{"location":"adrs/madr_template/#context-and-problem-statement","title":"Context and Problem Statement","text":"{Describe the problem context and why this decision has been made/feature implemented.}
"},{"location":"adrs/madr_template/#decision","title":"Decision","text":"{Describe how the solution was arrived at in the manner it was. You may use the sections below to help.}
"},{"location":"adrs/madr_template/#decision-drivers","title":"Decision Drivers","text":"{Short description of anticipated consequences} * {Anticipated consequence 1} * {Anticipated consequence 2}
"},{"location":"adrs/madr_template/#compliance","title":"Compliance","text":"{How will the decision/implementation be enforced. How will compliance be validated?}
"},{"location":"adrs/madr_template/#metadata","title":"Metadata","text":"{Any additional information to include}
"},{"location":"design/database/","title":"LUTE Configuration Database Specification","text":"Date: 2024-02-12 VERSION: v0.1
"},{"location":"design/database/#basic-outline","title":"Basic Outline","text":"Executor
level code.Executor
configurationlute.io.config.AnalysisHeader
)Task
Task
tables by pointing/linking to the entry ids in the above two tables.gen_cfg
table","text":"The general configuration table contains entries which may be shared between multiple Task
s. The format of the table is:
These parameters are extracted from the TaskParameters
object. Each of those contains an AnalysisHeader
object stored in the lute_config
variable. For a given experimental run, this value will be shared across any Task
s that are executed.
id
ID of the entry in this table. title
Arbitrary description/title of the purpose of analysis. E.g. what kind of experiment is being conducted experiment
LCLS Experiment. Can be a placeholder if debugging, etc. run
LCLS Acquisition run. Can be a placeholder if debugging, testing, etc. date
Date the configuration file was first setup. lute_version
Version of the codebase being used to execute Task
s. task_timeout
The maximum amount of time in seconds that a Task
can run before being cancelled."},{"location":"design/database/#exec_cfg-table","title":"exec_cfg
table","text":"The Executor
table contains information on the environment provided to the Executor
for Task
execution, the polling interval used for IPC between the Task
and Executor
and information on the communicator protocols used for IPC. This information can be shared between Task
s or between experimental runs, but not necessarily every Task
of a given run will use exactly the same Executor
configuration and environment.
id
ID of the entry in this table. env
Execution environment used by the Executor and by proxy any Tasks submitted by an Executor matching this entry. Environment is stored as a string with variables delimited by \";\" poll_interval
Polling interval used for Task monitoring. communicator_desc
Description of the Communicators used. NOTE: The env
column currently only stores variables related to SLURM
or LUTE
itself.
Task
tables","text":"For every Task
a table of the following format will be created. The exact number of columns will depend on the specific Task
, as the number of parameters can vary between them, and each parameter gets its own column. Within a table, multiple experiments and runs can coexist. The experiment and run are not recorded directly. Instead, the first two columns point to the id of entries in the general configuration and Executor
tables respectively. The general configuration table entry will contain the experiment and run information.
Parameter sets which can be described as nested dictionaries are flattened and then delimited with a .
to create column names. Parameters which are lists (or Python tuples, etc.) have a column for each entry with names that include an index (counting from 0). E.g. consider the following dictionary of parameters:
param_dict: Dict[str, Any] = {\n \"a\": { # First parameter a\n \"b\": (1, 2),\n \"c\": 1,\n # ...\n },\n \"a2\": 4, # Second parameter a2\n # ...\n}\n
The dictionary a
will produce columns: a.b[0]
, a.b[1]
, a.c
, and so on.
id
ID of the entry in this table. CURRENT_TIMESTAMP
Full timestamp for the entry. gen_cfg_id
ID of the entry in the general config table that applies to this Task
entry. That table has, e.g., experiment and run number. exec_cfg_id
The ID of the entry in the Executor
table which applies to this Task
entry. P1
- Pn
The specific parameters of the Task
. The P{1..n}
are replaced by the actual parameter names. result.task_status
Reported exit status of the Task
. Note that the output may still be labeled invalid by the valid_flag
(see below). result.summary
Short text summary of the Task
result. This is provided by the Task
, or sometimes the Executor
. result.payload
Full description of result from the Task
. If the object is incompatible with the database, will instead be a pointer to where it can be found. result.impl_schemas
A string of semi-colon separated schema(s) implemented by the Task
. Schemas describe conceptually the type output the Task
produces. valid_flag
A boolean flag for whether the result is valid. May be 0
(False) if e.g., data is missing, or corrupt, or reported status is failed. NOTE: The result.payload
may be distinct from the output files. Payloads can be specified in terms of output parameters, specific output files, or are an optional summary of the results provided by the Task
. E.g. this may include graphical descriptions of results (plots, figures, etc.). In many cases, however, the output files will most likely be pointed to by a parameter in one of the columns P{1...n}
- if properly specified in the TaskParameters
model the value of this output parameter will be replicated in the result.payload
column as well..
This API is intended to be used at the Executor
level, with some calls intended to provide default values for Pydantic models. Utilities for reading and inspecting the database outside of normal Task
execution are addressed in the following subheader.
record_analysis_db(cfg: DescribedAnalysis) -> None
: Writes the configuration to the backend database.read_latest_db_entry(db_dir: str, task_name: str, param: str) -> Any
: Retrieve the most recent entry from a database for a specific Task.invalidate_entry
: Marks a database entry as invalid. Common reason to use this is if data has been deleted, or found to be corrupted.dbview
: TUI for database inspection. Read only.LUTE Managed Tasks.
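As a sketch of how the read API can be used from Executor-level code to seed a default value (the import path, Task name, and parameter name below are illustrative assumptions, not a documented interface):
# NOTE: the module path is an assumption for illustration - check lute/io for the actual location.\nfrom lute.io.db import read_latest_db_entry\n\n# Look up the most recent value of a (hypothetical) 'out_file' parameter recorded for FindPeaksPyAlgos.\nlast_peaks_file = read_latest_db_entry('/path/to/work_dir', 'FindPeaksPyAlgos', 'out_file')\n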
Executor-managed Tasks with specific environment specifications are defined here.
"},{"location":"source/managed_tasks/#managed_tasks.BinaryErrTester","title":"BinaryErrTester = Executor('TestBinaryErr')
module-attribute
","text":"Runs a test of a third-party task that fails.
"},{"location":"source/managed_tasks/#managed_tasks.BinaryTester","title":"BinaryTester: Executor = Executor('TestBinary')
module-attribute
","text":"Runs a basic test of a multi-threaded third-party Task.
"},{"location":"source/managed_tasks/#managed_tasks.CrystFELIndexer","title":"CrystFELIndexer: Executor = Executor('IndexCrystFEL')
module-attribute
","text":"Runs crystallographic indexing using CrystFEL.
"},{"location":"source/managed_tasks/#managed_tasks.DimpleSolver","title":"DimpleSolver: Executor = Executor('DimpleSolve')
module-attribute
","text":"Solves a crystallographic structure using molecular replacement.
"},{"location":"source/managed_tasks/#managed_tasks.HKLComparer","title":"HKLComparer: Executor = Executor('CompareHKL')
module-attribute
","text":"Runs analysis on merge results for statistics/figures of merit..
"},{"location":"source/managed_tasks/#managed_tasks.HKLManipulator","title":"HKLManipulator: Executor = Executor('ManipulateHKL')
module-attribute
","text":"Performs format conversions (among other things) of merge results.
"},{"location":"source/managed_tasks/#managed_tasks.MultiNodeCommunicationTester","title":"MultiNodeCommunicationTester: MPIExecutor = MPIExecutor('TestMultiNodeCommunication')
module-attribute
","text":"Runs a test to confirm communication works between multiple nodes.
"},{"location":"source/managed_tasks/#managed_tasks.PartialatorMerger","title":"PartialatorMerger: Executor = Executor('MergePartialator')
module-attribute
","text":"Runs crystallographic merging using CrystFEL's partialator.
"},{"location":"source/managed_tasks/#managed_tasks.PeakFinderPsocake","title":"PeakFinderPsocake: Executor = Executor('FindPeaksPsocake')
module-attribute
","text":"Performs Bragg peak finding using psocake - DEPRECATED.
"},{"location":"source/managed_tasks/#managed_tasks.PeakFinderPyAlgos","title":"PeakFinderPyAlgos: MPIExecutor = MPIExecutor('FindPeaksPyAlgos')
module-attribute
","text":"Performs Bragg peak finding using the PyAlgos algorithm.
"},{"location":"source/managed_tasks/#managed_tasks.ReadTester","title":"ReadTester: Executor = Executor('TestReadOutput')
module-attribute
","text":"Runs a test to confirm database reading.
"},{"location":"source/managed_tasks/#managed_tasks.SHELXCRunner","title":"SHELXCRunner: Executor = Executor('RunSHELXC')
module-attribute
","text":"Runs CCP4 SHELXC - needed for crystallographic phasing.
"},{"location":"source/managed_tasks/#managed_tasks.SmallDataProducer","title":"SmallDataProducer: Executor = Executor('SubmitSMD')
module-attribute
","text":"Runs the production of a smalldata HDF5 file.
"},{"location":"source/managed_tasks/#managed_tasks.SocketTester","title":"SocketTester: Executor = Executor('TestSocket')
module-attribute
","text":"Runs a test of socket-based communication.
"},{"location":"source/managed_tasks/#managed_tasks.StreamFileConcatenator","title":"StreamFileConcatenator: Executor = Executor('ConcatenateStreamFiles')
module-attribute
","text":"Concatenates results from crystallographic indexing of multiple runs.
"},{"location":"source/managed_tasks/#managed_tasks.Tester","title":"Tester: Executor = Executor('Test')
module-attribute
","text":"Runs a basic test of a first-party Task.
"},{"location":"source/managed_tasks/#managed_tasks.WriteTester","title":"WriteTester: Executor = Executor('TestWriteOutput')
module-attribute
","text":"Runs a test to confirm database writing.
"},{"location":"source/execution/debug_utils/","title":"debug_utils","text":"Functions to assist in debugging execution of LUTE.
Functions:
Name DescriptionLUTE_DEBUG_EXIT
str, str_dump: Optional[str]): Exits the program if the provided env_var
is set. Optionally, also prints a message if provided.
Raises:
Type DescriptionValidationError
Error raised by pydantic during data validation. (From Pydantic)
"},{"location":"source/execution/executor/","title":"executor","text":"Base classes and functions for handling Task
execution.
Executors run a Task
as a subprocess and handle all communication with other services, e.g., the eLog. They accept specific handlers to override default stream parsing.
Event handlers/hooks are implemented as standalone functions which can be added to an Executor.
Classes:
Name DescriptionAnalysisConfig
Data class for holding a managed Task's configuration.
BaseExecutor
Abstract base class from which all Executors are derived.
Executor
Default Executor implementing all basic functionality and IPC.
BinaryExecutor
Can execute any arbitrary binary/command as a managed task within the framework provided by LUTE.
"},{"location":"source/execution/executor/#execution.executor--exceptions","title":"Exceptions","text":""},{"location":"source/execution/executor/#execution.executor.BaseExecutor","title":"BaseExecutor
","text":" Bases: ABC
ABC to manage Task execution and communication with user services.
When running in a workflow, \"tasks\" (not the class instances) are submitted as Executors
. The Executor manages environment setup, the actual Task submission, and communication regarding Task results and status with third party services like the eLog.
Attributes:
Methods:
Name Descriptionadd_hook
str, hook: Callable[[None], None]) -> None: Create a new hook to be called each time a specific event occurs.
add_default_hooks
Populate the event hooks with the default functions.
update_environment
Dict[str, str], update_path: str): Update the environment that is passed to the Task subprocess.
execute_task
Run the task as a subprocess.
Source code inlute/execution/executor.py
class BaseExecutor(ABC):\n \"\"\"ABC to manage Task execution and communication with user services.\n\n When running in a workflow, \"tasks\" (not the class instances) are submitted\n as `Executors`. The Executor manages environment setup, the actual Task\n submission, and communication regarding Task results and status with third\n party services like the eLog.\n\n Attributes:\n\n Methods:\n add_hook(event: str, hook: Callable[[None], None]) -> None: Create a\n new hook to be called each time a specific event occurs.\n\n add_default_hooks() -> None: Populate the event hooks with the default\n functions.\n\n update_environment(env: Dict[str, str], update_path: str): Update the\n environment that is passed to the Task subprocess.\n\n execute_task(): Run the task as a subprocess.\n \"\"\"\n\n class Hooks:\n \"\"\"A container class for the Executor's event hooks.\n\n There is a corresponding function (hook) for each event/signal. Each\n function takes two parameters - a reference to the Executor (self) and\n a reference to the Message (msg) which includes the corresponding\n signal.\n \"\"\"\n\n def no_pickle_mode(self: Self, msg: Message): ...\n\n def task_started(self: Self, msg: Message): ...\n\n def task_failed(self: Self, msg: Message): ...\n\n def task_stopped(self: Self, msg: Message): ...\n\n def task_done(self: Self, msg: Message): ...\n\n def task_cancelled(self: Self, msg: Message): ...\n\n def task_result(self: Self, msg: Message): ...\n\n def __init__(\n self,\n task_name: str,\n communicators: List[Communicator],\n poll_interval: float = 0.05,\n ) -> None:\n \"\"\"The Executor will manage the subprocess in which `task_name` is run.\n\n Args:\n task_name (str): The name of the Task to be submitted. Must match\n the Task's class name exactly. The parameter specification must\n also be in a properly named model to be identified.\n\n communicators (List[Communicator]): A list of one or more\n communicators which manage information flow to/from the Task.\n Subclasses may have different defaults, and new functionality\n can be introduced by composing Executors with communicators.\n\n poll_interval (float): Time to wait between reading/writing to the\n managed subprocess. In seconds.\n \"\"\"\n result: TaskResult = TaskResult(\n task_name=task_name, task_status=TaskStatus.PENDING, summary=\"\", payload=\"\"\n )\n task_parameters: Optional[TaskParameters] = None\n task_env: Dict[str, str] = os.environ.copy()\n self._communicators: List[Communicator] = communicators\n communicator_desc: List[str] = []\n for comm in self._communicators:\n comm.stage_communicator()\n communicator_desc.append(str(comm))\n\n self._analysis_desc: DescribedAnalysis = DescribedAnalysis(\n task_result=result,\n task_parameters=task_parameters,\n task_env=task_env,\n poll_interval=poll_interval,\n communicator_desc=communicator_desc,\n )\n\n def add_hook(self, event: str, hook: Callable[[Self, Message], None]) -> None:\n \"\"\"Add a new hook.\n\n Each hook is a function called any time the Executor receives a signal\n for a particular event, e.g. Task starts, Task ends, etc. Calling this\n method will remove any hook that currently exists for the event. I.e.\n only one hook can be called per event at a time. 
Creating hooks for\n events which do not exist is not allowed.\n\n Args:\n event (str): The event for which the hook will be called.\n\n hook (Callable[[None], None]) The function to be called during each\n occurrence of the event.\n \"\"\"\n if event.upper() in LUTE_SIGNALS:\n setattr(self.Hooks, event.lower(), hook)\n\n @abstractmethod\n def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n ...\n\n def update_environment(\n self, env: Dict[str, str], update_path: str = \"prepend\"\n ) -> None:\n \"\"\"Update the stored set of environment variables.\n\n These are passed to the subprocess to setup its environment.\n\n Args:\n env (Dict[str, str]): A dictionary of \"VAR\":\"VALUE\" pairs of\n environment variables to be added to the subprocess environment.\n If any variables already exist, the new variables will\n overwrite them (except PATH, see below).\n\n update_path (str): If PATH is present in the new set of variables,\n this argument determines how the old PATH is dealt with. There\n are three options:\n * \"prepend\" : The new PATH values are prepended to the old ones.\n * \"append\" : The new PATH values are appended to the old ones.\n * \"overwrite\" : The old PATH is overwritten by the new one.\n \"prepend\" is the default option. If PATH is not present in the\n current environment, the new PATH is used without modification.\n \"\"\"\n if \"PATH\" in env:\n sep: str = os.pathsep\n if update_path == \"prepend\":\n env[\"PATH\"] = (\n f\"{env['PATH']}{sep}{self._analysis_desc.task_env['PATH']}\"\n )\n elif update_path == \"append\":\n env[\"PATH\"] = (\n f\"{self._analysis_desc.task_env['PATH']}{sep}{env['PATH']}\"\n )\n elif update_path == \"overwrite\":\n pass\n else:\n raise ValueError(\n (\n f\"{update_path} is not a valid option for `update_path`!\"\n \" Options are: prepend, append, overwrite.\"\n )\n )\n os.environ.update(env)\n self._analysis_desc.task_env.update(env)\n\n def shell_source(self, env: str) -> None:\n \"\"\"Source a script.\n\n Unlike `update_environment` this method sources a new file.\n\n Args:\n env (str): Path to the script to source.\n \"\"\"\n import sys\n\n if not os.path.exists(env):\n logger.info(f\"Cannot source environment from {env}!\")\n return\n\n script: str = (\n f\"set -a\\n\"\n f'source \"{env}\" >/dev/null\\n'\n f'{sys.executable} -c \"import os; print(dict(os.environ))\"\\n'\n )\n logger.info(f\"Sourcing file {env}\")\n o, e = subprocess.Popen(\n [\"bash\", \"-c\", script], stdout=subprocess.PIPE\n ).communicate()\n new_environment: Dict[str, str] = eval(o)\n self._analysis_desc.task_env = new_environment\n\n def _pre_task(self) -> None:\n \"\"\"Any actions to be performed before task submission.\n\n This method may or may not be used by subclasses. 
It may be useful\n for logging etc.\n \"\"\"\n # This prevents the Executors in managed_tasks.py from all acquiring\n # resources like sockets.\n for communicator in self._communicators:\n communicator.delayed_setup()\n # Not great, but experience shows we need a bit of time to setup\n # network.\n time.sleep(0.1)\n # Propagate any env vars setup by Communicators - only update LUTE_ vars\n tmp: Dict[str, str] = {key: os.environ[key] for key in os.environ if \"LUTE_\" in key}\n self._analysis_desc.task_env.update(tmp)\n\n def _submit_task(self, cmd: str) -> subprocess.Popen:\n proc: subprocess.Popen = subprocess.Popen(\n cmd.split(),\n stdout=subprocess.PIPE,\n stderr=subprocess.PIPE,\n env=self._analysis_desc.task_env,\n )\n os.set_blocking(proc.stdout.fileno(), False)\n os.set_blocking(proc.stderr.fileno(), False)\n return proc\n\n @abstractmethod\n def _task_loop(self, proc: subprocess.Popen) -> None:\n \"\"\"Actions to perform while the Task is running.\n\n This function is run in the body of a loop until the Task signals\n that its finished.\n \"\"\"\n ...\n\n @abstractmethod\n def _finalize_task(self, proc: subprocess.Popen) -> None:\n \"\"\"Any actions to be performed after the Task has ended.\n\n Examples include a final clearing of the pipes, retrieving results,\n reporting to third party services, etc.\n \"\"\"\n ...\n\n def _submit_cmd(self, executable_path: str, params: str) -> str:\n \"\"\"Return a formatted command for launching Task subprocess.\n\n May be overridden by subclasses.\n\n Args:\n executable_path (str): Path to the LUTE subprocess script.\n\n params (str): String of formatted command-line arguments.\n\n Returns:\n cmd (str): Appropriately formatted command for this Executor.\n \"\"\"\n cmd: str = \"\"\n if __debug__:\n cmd = f\"python -B {executable_path} {params}\"\n else:\n cmd = f\"python -OB {executable_path} {params}\"\n\n return cmd\n\n def execute_task(self) -> None:\n \"\"\"Run the requested Task as a subprocess.\"\"\"\n self._pre_task()\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n if lute_path is None:\n logger.debug(\"Absolute path to subprocess_task.py not found.\")\n lute_path = os.path.abspath(f\"{os.path.dirname(__file__)}/../..\")\n self.update_environment({\"LUTE_PATH\": lute_path})\n executable_path: str = f\"{lute_path}/subprocess_task.py\"\n config_path: str = self._analysis_desc.task_env[\"LUTE_CONFIGPATH\"]\n params: str = f\"-c {config_path} -t {self._analysis_desc.task_result.task_name}\"\n\n cmd: str = self._submit_cmd(executable_path, params)\n proc: subprocess.Popen = self._submit_task(cmd)\n\n while self._task_is_running(proc):\n self._task_loop(proc)\n time.sleep(self._analysis_desc.poll_interval)\n\n os.set_blocking(proc.stdout.fileno(), True)\n os.set_blocking(proc.stderr.fileno(), True)\n\n self._finalize_task(proc)\n proc.stdout.close()\n proc.stderr.close()\n proc.wait()\n if ret := proc.returncode:\n logger.info(f\"Task failed with return code: {ret}\")\n self._analysis_desc.task_result.task_status = TaskStatus.FAILED\n self.Hooks.task_failed(self, msg=Message())\n elif self._analysis_desc.task_result.task_status == TaskStatus.RUNNING:\n # Ret code is 0, no exception was thrown, task forgot to set status\n self._analysis_desc.task_result.task_status = TaskStatus.COMPLETED\n logger.debug(f\"Task did not change from RUNNING status. 
Assume COMPLETED.\")\n self.Hooks.task_done(self, msg=Message())\n self._store_configuration()\n for comm in self._communicators:\n comm.clear_communicator()\n\n if self._analysis_desc.task_result.task_status == TaskStatus.FAILED:\n logger.info(\"Exiting after Task failure. Result recorded.\")\n sys.exit(-1)\n\n self.process_results()\n\n def _store_configuration(self) -> None:\n \"\"\"Store configuration and results in the LUTE database.\"\"\"\n record_analysis_db(copy.deepcopy(self._analysis_desc))\n\n def _task_is_running(self, proc: subprocess.Popen) -> bool:\n \"\"\"Whether a subprocess is running.\n\n Args:\n proc (subprocess.Popen): The subprocess to determine the run status\n of.\n\n Returns:\n bool: Is the subprocess task running.\n \"\"\"\n # Add additional conditions - don't want to exit main loop\n # if only stopped\n task_status: TaskStatus = self._analysis_desc.task_result.task_status\n is_running: bool = task_status != TaskStatus.COMPLETED\n is_running &= task_status != TaskStatus.CANCELLED\n is_running &= task_status != TaskStatus.TIMEDOUT\n return proc.poll() is None and is_running\n\n def _stop(self, proc: subprocess.Popen) -> None:\n \"\"\"Stop the Task subprocess.\"\"\"\n os.kill(proc.pid, signal.SIGTSTP)\n self._analysis_desc.task_result.task_status = TaskStatus.STOPPED\n\n def _continue(self, proc: subprocess.Popen) -> None:\n \"\"\"Resume a stopped Task subprocess.\"\"\"\n os.kill(proc.pid, signal.SIGCONT)\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n\n def _set_result_from_parameters(self) -> None:\n \"\"\"Use TaskParameters object to set TaskResult fields.\n\n A result may be defined in terms of specific parameters. This is most\n useful for ThirdPartyTasks which would not otherwise have an easy way of\n reporting what the TaskResult is. There are two options for specifying\n results from parameters:\n 1. A single parameter (Field) of the model has an attribute\n `is_result`. This is a bool indicating that this parameter points\n to a result. E.g. a parameter `output` may set `is_result=True`.\n 2. The `TaskParameters.Config` has a `result_from_params` attribute.\n This is an appropriate option if the result is determinable for\n the Task, but it is not easily defined by a single parameter. The\n TaskParameters.Config.result_from_param can be set by a custom\n validator, e.g. to combine the values of multiple parameters into\n a single result. E.g. an `out_dir` and `out_file` parameter used\n together specify the result. Currently only string specifiers are\n supported.\n\n A TaskParameters object specifies that it contains information about the\n result by setting a single config option:\n TaskParameters.Config.set_result=True\n In general, this method should only be called when the above condition is\n met, however, there are minimal checks in it as well.\n \"\"\"\n # This method shouldn't be called unless appropriate\n # But we will add extra guards here\n if self._analysis_desc.task_parameters is None:\n logger.debug(\n \"Cannot set result from TaskParameters. TaskParameters is None!\"\n )\n return\n if (\n not hasattr(self._analysis_desc.task_parameters.Config, \"set_result\")\n or not self._analysis_desc.task_parameters.Config.set_result\n ):\n logger.debug(\n \"Cannot set result from TaskParameters. 
`set_result` not specified!\"\n )\n return\n\n # First try to set from result_from_params (faster)\n if self._analysis_desc.task_parameters.Config.result_from_params is not None:\n result_from_params: str = (\n self._analysis_desc.task_parameters.Config.result_from_params\n )\n logger.info(f\"TaskResult specified as {result_from_params}.\")\n self._analysis_desc.task_result.payload = result_from_params\n else:\n # Iterate parameters to find the one that is the result\n schema: Dict[str, Any] = self._analysis_desc.task_parameters.schema()\n for param, value in self._analysis_desc.task_parameters.dict().items():\n param_attrs: Dict[str, Any] = schema[\"properties\"][param]\n if \"is_result\" in param_attrs:\n is_result: bool = param_attrs[\"is_result\"]\n if isinstance(is_result, bool) and is_result:\n logger.info(f\"TaskResult specified as {value}.\")\n self._analysis_desc.task_result.payload = value\n else:\n logger.debug(\n (\n f\"{param} specified as result! But specifier is of \"\n f\"wrong type: {type(is_result)}!\"\n )\n )\n break # We should only have 1 result-like parameter!\n\n # If we get this far and haven't changed the payload we should complain\n if self._analysis_desc.task_result.payload == \"\":\n task_name: str = self._analysis_desc.task_result.task_name\n logger.debug(\n (\n f\"{task_name} specified result be set from {task_name}Parameters,\"\n \" but no result provided! Check model definition!\"\n )\n )\n # Now check for impl_schemas and pass to result.impl_schemas\n # Currently unused\n impl_schemas: Optional[str] = (\n self._analysis_desc.task_parameters.Config.impl_schemas\n )\n self._analysis_desc.task_result.impl_schemas = impl_schemas\n # If we set_result but didn't get schema information we should complain\n if self._analysis_desc.task_result.impl_schemas is None:\n task_name: str = self._analysis_desc.task_result.task_name\n logger.debug(\n (\n f\"{task_name} specified result be set from {task_name}Parameters,\"\n \" but no schema provided! Check model definition!\"\n )\n )\n\n def process_results(self) -> None:\n \"\"\"Perform any necessary steps to process TaskResults object.\n\n Processing will depend on subclass. Examples of steps include, moving\n files, converting file formats, compiling plots/figures into an HTML\n file, etc.\n \"\"\"\n self._process_results()\n\n @abstractmethod\n def _process_results(self) -> None: ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.Hooks","title":"Hooks
","text":"A container class for the Executor's event hooks.
There is a corresponding function (hook) for each event/signal. Each function takes two parameters - a reference to the Executor (self) and a reference to the Message (msg) which includes the corresponding signal.
Source code inlute/execution/executor.py
class Hooks:\n \"\"\"A container class for the Executor's event hooks.\n\n There is a corresponding function (hook) for each event/signal. Each\n function takes two parameters - a reference to the Executor (self) and\n a reference to the Message (msg) which includes the corresponding\n signal.\n \"\"\"\n\n def no_pickle_mode(self: Self, msg: Message): ...\n\n def task_started(self: Self, msg: Message): ...\n\n def task_failed(self: Self, msg: Message): ...\n\n def task_stopped(self: Self, msg: Message): ...\n\n def task_done(self: Self, msg: Message): ...\n\n def task_cancelled(self: Self, msg: Message): ...\n\n def task_result(self: Self, msg: Message): ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.__init__","title":"__init__(task_name, communicators, poll_interval=0.05)
","text":"The Executor will manage the subprocess in which task_name
is run.
Parameters:
Name Type Description Defaulttask_name
str
The name of the Task to be submitted. Must match the Task's class name exactly. The parameter specification must also be in a properly named model to be identified.
requiredcommunicators
List[Communicator]
A list of one or more communicators which manage information flow to/from the Task. Subclasses may have different defaults, and new functionality can be introduced by composing Executors with communicators.
requiredpoll_interval
float
Time to wait between reading/writing to the managed subprocess. In seconds.
0.05
Source code in lute/execution/executor.py
def __init__(\n self,\n task_name: str,\n communicators: List[Communicator],\n poll_interval: float = 0.05,\n) -> None:\n \"\"\"The Executor will manage the subprocess in which `task_name` is run.\n\n Args:\n task_name (str): The name of the Task to be submitted. Must match\n the Task's class name exactly. The parameter specification must\n also be in a properly named model to be identified.\n\n communicators (List[Communicator]): A list of one or more\n communicators which manage information flow to/from the Task.\n Subclasses may have different defaults, and new functionality\n can be introduced by composing Executors with communicators.\n\n poll_interval (float): Time to wait between reading/writing to the\n managed subprocess. In seconds.\n \"\"\"\n result: TaskResult = TaskResult(\n task_name=task_name, task_status=TaskStatus.PENDING, summary=\"\", payload=\"\"\n )\n task_parameters: Optional[TaskParameters] = None\n task_env: Dict[str, str] = os.environ.copy()\n self._communicators: List[Communicator] = communicators\n communicator_desc: List[str] = []\n for comm in self._communicators:\n comm.stage_communicator()\n communicator_desc.append(str(comm))\n\n self._analysis_desc: DescribedAnalysis = DescribedAnalysis(\n task_result=result,\n task_parameters=task_parameters,\n task_env=task_env,\n poll_interval=poll_interval,\n communicator_desc=communicator_desc,\n )\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.add_default_hooks","title":"add_default_hooks()
abstractmethod
","text":"Populate the set of default event hooks.
Source code inlute/execution/executor.py
@abstractmethod\ndef add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.add_hook","title":"add_hook(event, hook)
","text":"Add a new hook.
Each hook is a function called any time the Executor receives a signal for a particular event, e.g. Task starts, Task ends, etc. Calling this method will remove any hook that currently exists for the event. I.e. only one hook can be called per event at a time. Creating hooks for events which do not exist is not allowed.
Parameters:
Name Type Description Defaultevent
str
The event for which the hook will be called.
required Source code inlute/execution/executor.py
def add_hook(self, event: str, hook: Callable[[Self, Message], None]) -> None:\n \"\"\"Add a new hook.\n\n Each hook is a function called any time the Executor receives a signal\n for a particular event, e.g. Task starts, Task ends, etc. Calling this\n method will remove any hook that currently exists for the event. I.e.\n only one hook can be called per event at a time. Creating hooks for\n events which do not exist is not allowed.\n\n Args:\n event (str): The event for which the hook will be called.\n\n hook (Callable[[None], None]) The function to be called during each\n occurrence of the event.\n \"\"\"\n if event.upper() in LUTE_SIGNALS:\n setattr(self.Hooks, event.lower(), hook)\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.execute_task","title":"execute_task()
","text":"Run the requested Task as a subprocess.
Source code inlute/execution/executor.py
def execute_task(self) -> None:\n \"\"\"Run the requested Task as a subprocess.\"\"\"\n self._pre_task()\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n if lute_path is None:\n logger.debug(\"Absolute path to subprocess_task.py not found.\")\n lute_path = os.path.abspath(f\"{os.path.dirname(__file__)}/../..\")\n self.update_environment({\"LUTE_PATH\": lute_path})\n executable_path: str = f\"{lute_path}/subprocess_task.py\"\n config_path: str = self._analysis_desc.task_env[\"LUTE_CONFIGPATH\"]\n params: str = f\"-c {config_path} -t {self._analysis_desc.task_result.task_name}\"\n\n cmd: str = self._submit_cmd(executable_path, params)\n proc: subprocess.Popen = self._submit_task(cmd)\n\n while self._task_is_running(proc):\n self._task_loop(proc)\n time.sleep(self._analysis_desc.poll_interval)\n\n os.set_blocking(proc.stdout.fileno(), True)\n os.set_blocking(proc.stderr.fileno(), True)\n\n self._finalize_task(proc)\n proc.stdout.close()\n proc.stderr.close()\n proc.wait()\n if ret := proc.returncode:\n logger.info(f\"Task failed with return code: {ret}\")\n self._analysis_desc.task_result.task_status = TaskStatus.FAILED\n self.Hooks.task_failed(self, msg=Message())\n elif self._analysis_desc.task_result.task_status == TaskStatus.RUNNING:\n # Ret code is 0, no exception was thrown, task forgot to set status\n self._analysis_desc.task_result.task_status = TaskStatus.COMPLETED\n logger.debug(f\"Task did not change from RUNNING status. Assume COMPLETED.\")\n self.Hooks.task_done(self, msg=Message())\n self._store_configuration()\n for comm in self._communicators:\n comm.clear_communicator()\n\n if self._analysis_desc.task_result.task_status == TaskStatus.FAILED:\n logger.info(\"Exiting after Task failure. Result recorded.\")\n sys.exit(-1)\n\n self.process_results()\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.process_results","title":"process_results()
","text":"Perform any necessary steps to process TaskResults object.
Processing will depend on subclass. Examples of steps include, moving files, converting file formats, compiling plots/figures into an HTML file, etc.
Source code inlute/execution/executor.py
def process_results(self) -> None:\n \"\"\"Perform any necessary steps to process TaskResults object.\n\n Processing will depend on subclass. Examples of steps include, moving\n files, converting file formats, compiling plots/figures into an HTML\n file, etc.\n \"\"\"\n self._process_results()\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.shell_source","title":"shell_source(env)
","text":"Source a script.
Unlike update_environment
this method sources a new file.
Parameters:
Name Type Description Defaultenv
str
Path to the script to source.
required Source code inlute/execution/executor.py
def shell_source(self, env: str) -> None:\n \"\"\"Source a script.\n\n Unlike `update_environment` this method sources a new file.\n\n Args:\n env (str): Path to the script to source.\n \"\"\"\n import sys\n\n if not os.path.exists(env):\n logger.info(f\"Cannot source environment from {env}!\")\n return\n\n script: str = (\n f\"set -a\\n\"\n f'source \"{env}\" >/dev/null\\n'\n f'{sys.executable} -c \"import os; print(dict(os.environ))\"\\n'\n )\n logger.info(f\"Sourcing file {env}\")\n o, e = subprocess.Popen(\n [\"bash\", \"-c\", script], stdout=subprocess.PIPE\n ).communicate()\n new_environment: Dict[str, str] = eval(o)\n self._analysis_desc.task_env = new_environment\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.update_environment","title":"update_environment(env, update_path='prepend')
","text":"Update the stored set of environment variables.
These are passed to the subprocess to setup its environment.
Parameters:
Name Type Description Defaultenv
Dict[str, str]
A dictionary of \"VAR\":\"VALUE\" pairs of environment variables to be added to the subprocess environment. If any variables already exist, the new variables will overwrite them (except PATH, see below).
requiredupdate_path
str
If PATH is present in the new set of variables, this argument determines how the old PATH is dealt with. There are three options: * \"prepend\" : The new PATH values are prepended to the old ones. * \"append\" : The new PATH values are appended to the old ones. * \"overwrite\" : The old PATH is overwritten by the new one. \"prepend\" is the default option. If PATH is not present in the current environment, the new PATH is used without modification.
'prepend'
Source code in lute/execution/executor.py
def update_environment(\n self, env: Dict[str, str], update_path: str = \"prepend\"\n) -> None:\n \"\"\"Update the stored set of environment variables.\n\n These are passed to the subprocess to setup its environment.\n\n Args:\n env (Dict[str, str]): A dictionary of \"VAR\":\"VALUE\" pairs of\n environment variables to be added to the subprocess environment.\n If any variables already exist, the new variables will\n overwrite them (except PATH, see below).\n\n update_path (str): If PATH is present in the new set of variables,\n this argument determines how the old PATH is dealt with. There\n are three options:\n * \"prepend\" : The new PATH values are prepended to the old ones.\n * \"append\" : The new PATH values are appended to the old ones.\n * \"overwrite\" : The old PATH is overwritten by the new one.\n \"prepend\" is the default option. If PATH is not present in the\n current environment, the new PATH is used without modification.\n \"\"\"\n if \"PATH\" in env:\n sep: str = os.pathsep\n if update_path == \"prepend\":\n env[\"PATH\"] = (\n f\"{env['PATH']}{sep}{self._analysis_desc.task_env['PATH']}\"\n )\n elif update_path == \"append\":\n env[\"PATH\"] = (\n f\"{self._analysis_desc.task_env['PATH']}{sep}{env['PATH']}\"\n )\n elif update_path == \"overwrite\":\n pass\n else:\n raise ValueError(\n (\n f\"{update_path} is not a valid option for `update_path`!\"\n \" Options are: prepend, append, overwrite.\"\n )\n )\n os.environ.update(env)\n self._analysis_desc.task_env.update(env)\n
"},{"location":"source/execution/executor/#execution.executor.Communicator","title":"Communicator
","text":" Bases: ABC
lute/execution/ipc.py
class Communicator(ABC):\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n\n @abstractmethod\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n\n @abstractmethod\n def write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n\n def __str__(self):\n name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n return f\"{name}: {self.desc}\"\n\n def __repr__(self):\n return self.__str__()\n\n def __enter__(self) -> Self:\n return self\n\n def __exit__(self) -> None: ...\n\n @property\n def has_messages(self) -> bool:\n \"\"\"Whether the Communicator has remaining messages.\n\n The precise method for determining whether there are remaining messages\n will depend on the specific Communicator sub-class.\n \"\"\"\n return False\n\n def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n\n def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n\n def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.has_messages","title":"has_messages: bool
property
","text":"Whether the Communicator has remaining messages.
The precise method for determining whether there are remaining messages will depend on the specific Communicator sub-class.
"},{"location":"source/execution/executor/#execution.executor.Communicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"Abstract Base Class for IPC Communicator objects.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using pickle prior to sending it.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.clear_communicator","title":"clear_communicator()
","text":"Alternative exit method outside of context manager.
Source code inlute/execution/ipc.py
def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.delayed_setup","title":"delayed_setup()
","text":"Any setup that should be done later than init.
Source code inlute/execution/ipc.py
def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.read","title":"read(proc)
abstractmethod
","text":"Method for reading data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.stage_communicator","title":"stage_communicator()
","text":"Alternative method for staging outside of context manager.
Source code inlute/execution/ipc.py
def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.write","title":"write(msg)
abstractmethod
","text":"Method for sending data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Executor","title":"Executor
","text":" Bases: BaseExecutor
Basic implementation of an Executor which manages simple IPC with Task.
Attributes:
Methods:
Name Descriptionadd_hook
str, hook: Callable[[None], None]) -> None: Create a new hook to be called each time a specific event occurs.
add_default_hooks
Populate the event hooks with the default functions.
update_environment
Dict[str, str], update_path: str): Update the environment that is passed to the Task subprocess.
execute_task
Run the task as a subprocess.
Source code inlute/execution/executor.py
class Executor(BaseExecutor):\n \"\"\"Basic implementation of an Executor which manages simple IPC with Task.\n\n Attributes:\n\n Methods:\n add_hook(event: str, hook: Callable[[None], None]) -> None: Create a\n new hook to be called each time a specific event occurs.\n\n add_default_hooks() -> None: Populate the event hooks with the default\n functions.\n\n update_environment(env: Dict[str, str], update_path: str): Update the\n environment that is passed to the Task subprocess.\n\n execute_task(): Run the task as a subprocess.\n \"\"\"\n\n def __init__(\n self,\n task_name: str,\n communicators: List[Communicator] = [\n PipeCommunicator(Party.EXECUTOR),\n SocketCommunicator(Party.EXECUTOR),\n ],\n poll_interval: float = 0.05,\n ) -> None:\n super().__init__(\n task_name=task_name,\n communicators=communicators,\n poll_interval=poll_interval,\n )\n self.add_default_hooks()\n\n def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n def no_pickle_mode(self: Executor, msg: Message):\n for idx, communicator in enumerate(self._communicators):\n if isinstance(communicator, PipeCommunicator):\n self._communicators[idx] = PipeCommunicator(\n Party.EXECUTOR, use_pickle=False\n )\n\n self.add_hook(\"no_pickle_mode\", no_pickle_mode)\n\n def task_started(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskParameters):\n self._analysis_desc.task_parameters = msg.contents\n # Maybe just run this no matter what? Rely on the other guards?\n # Perhaps just check if ThirdPartyParameters?\n # if isinstance(self._analysis_desc.task_parameters, ThirdPartyParameters):\n if hasattr(self._analysis_desc.task_parameters.Config, \"set_result\"):\n # Third party Tasks may mark a parameter as the result\n # If so, setup the result now.\n self._set_result_from_parameters()\n logger.info(\n f\"Executor: {self._analysis_desc.task_result.task_name} started\"\n )\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"RUNNING\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_started\", task_started)\n\n def task_failed(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"FAILED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_failed\", task_failed)\n\n def task_stopped(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"STOPPED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_stopped\", task_stopped)\n\n def task_done(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_done\", task_done)\n\n def task_cancelled(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"CANCELLED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_cancelled\", task_cancelled)\n\n def task_result(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskResult):\n self._analysis_desc.task_result = msg.contents\n logger.info(self._analysis_desc.task_result.summary)\n logger.info(self._analysis_desc.task_result.task_status)\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n 
self.add_hook(\"task_result\", task_result)\n\n def _task_loop(self, proc: subprocess.Popen) -> None:\n \"\"\"Actions to perform while the Task is running.\n\n This function is run in the body of a loop until the Task signals\n that its finished.\n \"\"\"\n for communicator in self._communicators:\n while True:\n msg: Message = communicator.read(proc)\n if msg.signal is not None and msg.signal.upper() in LUTE_SIGNALS:\n hook: Callable[[Executor, Message], None] = getattr(\n self.Hooks, msg.signal.lower()\n )\n hook(self, msg)\n if msg.contents is not None:\n if isinstance(msg.contents, str) and msg.contents != \"\":\n logger.info(msg.contents)\n elif not isinstance(msg.contents, str):\n logger.info(msg.contents)\n if not communicator.has_messages:\n break\n\n def _finalize_task(self, proc: subprocess.Popen) -> None:\n \"\"\"Any actions to be performed after the Task has ended.\n\n Examples include a final clearing of the pipes, retrieving results,\n reporting to third party services, etc.\n \"\"\"\n self._task_loop(proc) # Perform a final read.\n\n def _process_results(self) -> None:\n \"\"\"Performs result processing.\n\n Actions include:\n - For `ElogSummaryPlots`, will save the summary plot to the appropriate\n directory for display in the eLog.\n \"\"\"\n task_result: TaskResult = self._analysis_desc.task_result\n self._process_result_payload(task_result.payload)\n self._process_result_summary(task_result.summary)\n\n def _process_result_payload(self, payload: Any) -> None:\n if self._analysis_desc.task_parameters is None:\n logger.debug(\"Please run Task before using this method!\")\n return\n if isinstance(payload, ElogSummaryPlots):\n # ElogSummaryPlots has figures and a display name\n # display name also serves as a path.\n expmt: str = self._analysis_desc.task_parameters.lute_config.experiment\n base_path: str = f\"/sdf/data/lcls/ds/{expmt[:3]}/{expmt}/stats/summary\"\n full_path: str = f\"{base_path}/{payload.display_name}\"\n if not os.path.isdir(full_path):\n os.makedirs(full_path)\n\n # Preferred plots are pn.Tabs objects which save directly as html\n # Only supported plot type that has \"save\" method - do not want to\n # import plot modules here to do type checks.\n if hasattr(payload.figures, \"save\"):\n payload.figures.save(f\"{full_path}/report.html\")\n else:\n ...\n elif isinstance(payload, str):\n # May be a path to a file...\n schemas: Optional[str] = self._analysis_desc.task_result.impl_schemas\n # Should also check `impl_schemas` to determine what to do with path\n\n def _process_result_summary(self, summary: str) -> None: ...\n
"},{"location":"source/execution/executor/#execution.executor.Executor.add_default_hooks","title":"add_default_hooks()
","text":"Populate the set of default event hooks.
Source code inlute/execution/executor.py
def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n def no_pickle_mode(self: Executor, msg: Message):\n for idx, communicator in enumerate(self._communicators):\n if isinstance(communicator, PipeCommunicator):\n self._communicators[idx] = PipeCommunicator(\n Party.EXECUTOR, use_pickle=False\n )\n\n self.add_hook(\"no_pickle_mode\", no_pickle_mode)\n\n def task_started(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskParameters):\n self._analysis_desc.task_parameters = msg.contents\n # Maybe just run this no matter what? Rely on the other guards?\n # Perhaps just check if ThirdPartyParameters?\n # if isinstance(self._analysis_desc.task_parameters, ThirdPartyParameters):\n if hasattr(self._analysis_desc.task_parameters.Config, \"set_result\"):\n # Third party Tasks may mark a parameter as the result\n # If so, setup the result now.\n self._set_result_from_parameters()\n logger.info(\n f\"Executor: {self._analysis_desc.task_result.task_name} started\"\n )\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"RUNNING\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_started\", task_started)\n\n def task_failed(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"FAILED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_failed\", task_failed)\n\n def task_stopped(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"STOPPED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_stopped\", task_stopped)\n\n def task_done(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_done\", task_done)\n\n def task_cancelled(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"CANCELLED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_cancelled\", task_cancelled)\n\n def task_result(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskResult):\n self._analysis_desc.task_result = msg.contents\n logger.info(self._analysis_desc.task_result.summary)\n logger.info(self._analysis_desc.task_result.task_status)\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_result\", task_result)\n
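Beyond these defaults, additional hooks can be registered with add_hook. The following is a minimal sketch of hypothetical usage, not part of the LUTE source: the event name is assumed to correspond to a signal in LUTE_SIGNALS (lowercased), and the callable receives the Executor and the triggering Message.
from lute.execution.executor import Executor
from lute.execution.ipc import Message

def announce_result(executor: Executor, msg: Message) -> None:
    # Called whenever the Task reports its result back to the Executor.
    print(f"Custom hook saw signal: {msg.signal}")

exe: Executor = Executor(task_name="Tester")  # "Tester" is an assumed managed Task name
exe.add_hook("task_result", announce_result)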
"},{"location":"source/execution/executor/#execution.executor.MPIExecutor","title":"MPIExecutor
","text":" Bases: Executor
Runs first-party Tasks that require MPI.
This Executor is otherwise identical to the standard Executor, except it uses mpirun
for Task
submission. Currently this Executor assumes a job has been submitted using SLURM as a first step. It will determine the number of MPI ranks based on the resources requested. As a fallback, it will try to determine the number of local cores available for cases where a job has not been submitted via SLURM. On S3DF, this fallback determination should match the core count reported by the environment variable that SLURM provides for the allocated resources.
This Executor will submit the Task to run with a number of processes equal to the total number of cores available minus 1. A single core is reserved for the Executor itself. Note that currently this means that you must submit on 3 cores or more, since MPI requires a minimum of 2 ranks, and the number of ranks is determined from the cores dedicated to Task execution.
Methods:
Name Description_submit_cmd
Run the task as a subprocess using mpirun
.
lute/execution/executor.py
class MPIExecutor(Executor):\n \"\"\"Runs first-party Tasks that require MPI.\n\n This Executor is otherwise identical to the standard Executor, except it\n uses `mpirun` for `Task` submission. Currently this Executor assumes a job\n has been submitted using SLURM as a first step. It will determine the number\n of MPI ranks based on the resources requested. As a fallback, it will try\n to determine the number of local cores available for cases where a job has\n not been submitted via SLURM. On S3DF, the second determination mechanism\n should accurately match the environment variable provided by SLURM indicating\n resources allocated.\n\n This Executor will submit the Task to run with a number of processes equal\n to the total number of cores available minus 1. A single core is reserved\n for the Executor itself. Note that currently this means that you must submit\n on 3 cores or more, since MPI requires a minimum of 2 ranks, and the number\n of ranks is determined from the cores dedicated to Task execution.\n\n Methods:\n _submit_cmd: Run the task as a subprocess using `mpirun`.\n \"\"\"\n\n def _submit_cmd(self, executable_path: str, params: str) -> str:\n \"\"\"Override submission command to use `mpirun`\n\n Args:\n executable_path (str): Path to the LUTE subprocess script.\n\n params (str): String of formatted command-line arguments.\n\n Returns:\n cmd (str): Appropriately formatted command for this Executor.\n \"\"\"\n py_cmd: str = \"\"\n nprocs: int = max(\n int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1\n )\n mpi_cmd: str = f\"mpirun -np {nprocs}\"\n if __debug__:\n py_cmd = f\"python -B -u -m mpi4py.run {executable_path} {params}\"\n else:\n py_cmd = f\"python -OB -u -m mpi4py.run {executable_path} {params}\"\n\n cmd: str = f\"{mpi_cmd} {py_cmd}\"\n return cmd\n
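As a worked illustration of the rank calculation above (values assumed, not taken from a real allocation): with SLURM_NPROCS=4, one core is reserved for the Executor and the Task runs on 3 MPI ranks.
import os

os.environ["SLURM_NPROCS"] = "4"  # Assumed SLURM allocation for this example
nprocs: int = max(int(os.environ["SLURM_NPROCS"]) - 1, 1)
print(f"mpirun -np {nprocs} python -B -u -m mpi4py.run <executable_path> <params>")
# -> mpirun -np 3 python -B -u -m mpi4py.run <executable_path> <params>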
"},{"location":"source/execution/executor/#execution.executor.Party","title":"Party
","text":" Bases: Enum
Identifier for which party (side/end) is using a communicator.
For some types of communication streams there may be different interfaces depending on which side of the communicator you are on. This enum is used by the communicator to determine which interface to use.
Source code inlute/execution/ipc.py
class Party(Enum):\n \"\"\"Identifier for which party (side/end) is using a communicator.\n\n For some types of communication streams there may be different interfaces\n depending on which side of the communicator you are on. This enum is used\n by the communicator to determine which interface to use.\n \"\"\"\n\n TASK = 0\n \"\"\"\n The Task (client) side.\n \"\"\"\n EXECUTOR = 1\n \"\"\"\n The Executor (server) side.\n \"\"\"\n
"},{"location":"source/execution/executor/#execution.executor.Party.EXECUTOR","title":"EXECUTOR = 1
class-attribute
instance-attribute
","text":"The Executor (server) side.
"},{"location":"source/execution/executor/#execution.executor.Party.TASK","title":"TASK = 0
class-attribute
instance-attribute
","text":"The Task (client) side.
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator","title":"PipeCommunicator
","text":" Bases: Communicator
Provides communication through pipes over stderr/stdout.
The implementation of this communicator has reading and writing occurring on stderr and stdout. In general the Task
will be writing while the Executor
will be reading. stderr
is used for sending signals.
lute/execution/ipc.py
class PipeCommunicator(Communicator):\n \"\"\"Provides communication through pipes over stderr/stdout.\n\n The implementation of this communicator has reading and writing ocurring\n on stderr and stdout. In general the `Task` will be writing while the\n `Executor` will be reading. `stderr` is used for sending signals.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n\n def _safe_unpickle_decode(self, maybe_mixed: bytes) -> Optional[str]:\n \"\"\"This method is used to unpickle and/or decode a bytes object.\n\n It attempts to handle cases where contents can be mixed, i.e., part of\n the message must be decoded and the other part unpickled. It handles\n only two-way splits. If there are more complex arrangements such as:\n <pickled>:<unpickled>:<pickled> etc, it will give up.\n\n The simpler two way splits are unlikely to occur in normal usage. They\n may arise when debugging if, e.g., `print` statements are mixed with the\n usage of the `_report_to_executor` method.\n\n Note that this method works because ONLY text data is assumed to be\n sent via the pipes. The method needs to be revised to handle non-text\n data if the `Task` is modified to also send that via PipeCommunicator.\n The use of pickle is supported to provide for this option if it is\n necessary. It may be deprecated in the future.\n\n Be careful when making changes. This method has seemingly redundant\n checks because unpickling will not throw an error if a full object can\n be retrieved. That is, the library will ignore extraneous bytes. 
This\n method attempts to retrieve that information if the pickled data comes\n first in the stream.\n\n Args:\n maybe_mixed (bytes): A bytes object which could require unpickling,\n decoding, or both.\n\n Returns:\n contents (Optional[str]): The unpickled/decoded contents if possible.\n Otherwise, None.\n \"\"\"\n contents: Optional[str]\n try:\n contents = pickle.loads(maybe_mixed)\n repickled: bytes = pickle.dumps(contents)\n if len(repickled) < len(maybe_mixed):\n # Successful unpickling, but pickle stops even if there are more bytes\n try:\n additional_data: str = maybe_mixed[len(repickled) :].decode()\n contents = f\"{contents}{additional_data}\"\n except UnicodeDecodeError:\n # Can't decode the bytes left by pickle, so they are lost\n missing_bytes: int = len(maybe_mixed) - len(repickled)\n logger.debug(\n f\"PipeCommunicator has truncated message. Unable to retrieve {missing_bytes} bytes.\"\n )\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n # Pickle may also throw a ValueError, e.g. this bytes: b\"Found! \\n\"\n # Pickle may also throw an EOFError, eg. this bytes: b\"F0\\n\"\n try:\n contents = maybe_mixed.decode()\n except UnicodeDecodeError as err2:\n try:\n contents = maybe_mixed[: err2.start].decode()\n contents = f\"{contents}{pickle.loads(maybe_mixed[err2.start:])}\"\n except Exception as err3:\n logger.debug(\n f\"PipeCommunicator unable to decode/parse data! {err3}\"\n )\n contents = None\n return contents\n\n def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC through pipes.
Arbitrary objects may be transmitted using pickle to serialize the data. If pickle is not used, only plain text (strings) can be sent.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using Pickle prior to sending it. If False, data is assumed to be text which is decoded directly.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.read","title":"read(proc)
","text":"Read from stdout and stderr.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.write","title":"write(msg)
","text":"Write to stdout and stderr.
The signal component is sent to stderr
while the contents of the Message are sent to stdout
.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator","title":"SocketCommunicator
","text":" Bases: Communicator
Provides communication over Unix or TCP sockets.
Communication is provided either using sockets with the Python socket library or using ZMQ. The choice of implementation is controlled by the global bool USE_ZMQ
.
LUTE_USE_TCP=1
If this environment variable is defined, TCP sockets will be used; otherwise Unix sockets will be used.
Regardless of socket type, the environment variable LUTE_EXECUTOR_HOST=<hostname>
will be defined by the Executor-side Communicator.
For TCP sockets: The Executor-side Communicator should be run first and will bind to all interfaces on the port determined by the environment variable: LUTE_PORT=###
If no port is defined, a port scan will be performed and the Executor-side Communicator will bind the first one available from a random selection. It will then define the environment variable so the Task-side can pick it up.
For Unix sockets: The path to the Unix socket is defined by the environment variable: LUTE_SOCKET=/path/to/socket
This class assumes proper permissions and that the above environment variable has been defined. The Task
is configured as what would commonly be referred to as the client
, while the Executor
is configured as the server.
If the Task process is run on a different machine than the Executor, the Task-side Communicator will open an SSH tunnel to forward traffic from a local Unix socket to the Executor Unix socket. Opening of the tunnel relies on the environment variable: LUTE_EXECUTOR_HOST=<hostname>
to determine the Executor's host. This variable should be defined by the Executor and passed to the Task process automatically, but it can also be defined manually if launching the Task process separately. The Task will use the local socket <LUTE_SOCKET>.task{##}
. Multiple local sockets may be created. Currently, it is assumed that the user is identical on both the Task machine and Executor machine.
lute/execution/ipc.py
class SocketCommunicator(Communicator):\n \"\"\"Provides communication over Unix or TCP sockets.\n\n Communication is provided either using sockets with the Python socket library\n or using ZMQ. The choice of implementation is controlled by the global bool\n `USE_ZMQ`.\n\n Whether to use TCP or Unix sockets is controlled by the environment:\n `LUTE_USE_TCP=1`\n If defined, TCP sockets will be used, otherwise Unix sockets will be used.\n\n Regardless of socket type, the environment variable\n `LUTE_EXECUTOR_HOST=<hostname>`\n will be defined by the Executor-side Communicator.\n\n\n For TCP sockets:\n The Executor-side Communicator should be run first and will bind to all\n interfaces on the port determined by the environment variable:\n `LUTE_PORT=###`\n If no port is defined, a port scan will be performed and the Executor-side\n Communicator will bind the first one available from a random selection. It\n will then define the environment variable so the Task-side can pick it up.\n\n For Unix sockets:\n The path to the Unix socket is defined by the environment variable:\n `LUTE_SOCKET=/path/to/socket`\n This class assumes proper permissions and that this above environment\n variable has been defined. The `Task` is configured as what would commonly\n be referred to as the `client`, while the `Executor` is configured as the\n server.\n\n If the Task process is run on a different machine than the Executor, the\n Task-side Communicator will open a ssh-tunnel to forward traffic from a local\n Unix socket to the Executor Unix socket. Opening of the tunnel relies on the\n environment variable:\n `LUTE_EXECUTOR_HOST=<hostname>`\n to determine the Executor's host. This variable should be defined by the\n Executor and passed to the Task process automatically, but it can also be\n defined manually if launching the Task process separately. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. Currently, it is assumed that the user is identical on both the Task\n machine and Executor machine.\n \"\"\"\n\n ACCEPT_TIMEOUT: float = 0.01\n \"\"\"\n Maximum time to wait to accept connections. Used by Executor-side.\n \"\"\"\n MSG_HEAD: bytes = b\"MSG\"\n \"\"\"\n Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].\n \"\"\"\n MSG_SEP: bytes = b\";;;\"\n \"\"\"\n Separator for parts of a message. Messages have a start, length, message and end.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n\n def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. 
Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n\n # Read\n ############################################################################\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n\n def _read_socket(self) -> None:\n \"\"\"Read data from a socket.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Calls an underlying method for either raw sockets or ZMQ.\n \"\"\"\n\n while True:\n if self._stop_thread:\n logger.debug(\"Stopping socket reader thread.\")\n break\n if USE_ZMQ:\n self._read_socket_zmq()\n else:\n self._read_socket_raw()\n\n def _read_socket_raw(self) -> None:\n \"\"\"Read data from a socket.\n\n Raw socket implementation for the reader thread.\n \"\"\"\n connection: socket.socket\n addr: Union[str, Tuple[str, int]]\n try:\n connection, addr = self._data_socket.accept()\n full_data: bytes = b\"\"\n while True:\n data: bytes = connection.recv(8192)\n if data:\n full_data += data\n else:\n break\n connection.close()\n self._unpack_messages(full_data)\n except socket.timeout:\n pass\n\n def _read_socket_zmq(self) -> None:\n \"\"\"Read data from a socket.\n\n ZMQ implementation for the reader thread.\n \"\"\"\n try:\n full_data: bytes = self._data_socket.recv(0)\n self._unpack_messages(full_data)\n except zmq.ZMQError:\n pass\n\n def _unpack_messages(self, data: bytes) -> None:\n \"\"\"Unpacks a byte stream into individual messages.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n Partial messages (a series of bytes which cannot be 
converted to a full\n message) are stored for later. An attempt is made to reconstruct the\n message with the next call to this method.\n\n Args:\n data (bytes): A raw byte stream containing anywhere from a partial\n message to multiple full messages.\n \"\"\"\n msg: Message\n working_data: bytes\n if self._partial_msg:\n # Concatenate the previous partial message to the beginning\n working_data = self._partial_msg + data\n self._partial_msg = None\n else:\n working_data = data\n while working_data:\n try:\n # Message encoding: <HEAD><SEP><len><SEP><msg><SEP><HEAD[::-1]>\n end = working_data.find(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD[::-1]\n )\n msg_parts: List[bytes] = working_data[:end].split(\n SocketCommunicator.MSG_SEP\n )\n if len(msg_parts) != 3:\n self._partial_msg = working_data\n break\n\n cmd: bytes\n nbytes: bytes\n raw_msg: bytes\n cmd, nbytes, raw_msg = msg_parts\n if len(raw_msg) != int(nbytes):\n self._partial_msg = working_data\n break\n msg = pickle.loads(raw_msg)\n self._msg_queue.put(msg)\n except pickle.UnpicklingError:\n self._partial_msg = working_data\n break\n if end < len(working_data):\n # Add len(SEP+HEAD) since end marks the start of <SEP><HEAD[::-1]\n offset: int = len(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD\n )\n working_data = working_data[end + offset :]\n else:\n working_data = b\"\"\n\n # Write\n ############################################################################\n\n def _write_socket(self, msg: Message) -> None:\n \"\"\"Sends data over a socket from the 'client' (Task) side.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n This structure is used for decoding the message on the other end.\n \"\"\"\n data: bytes = pickle.dumps(msg)\n cmd: bytes = SocketCommunicator.MSG_HEAD\n size: bytes = b\"%d\" % len(data)\n end: bytes = SocketCommunicator.MSG_HEAD[::-1]\n sep: bytes = SocketCommunicator.MSG_SEP\n packed_msg: bytes = cmd + sep + size + sep + data + sep + end\n if USE_ZMQ:\n self._data_socket.send(packed_msg)\n else:\n self._data_socket.sendall(packed_msg)\n\n def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n\n # Generic create\n ############################################################################\n\n def _create_socket_raw(self) -> socket.socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): TCP or Unix socket.\n \"\"\"\n import struct\n\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n sock: socket.socket\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw TCP sockets.\")\n sock = self._init_tcp_socket_raw()\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw Unix 
sockets.\")\n sock = self._init_unix_socket_raw()\n sock.setsockopt(\n socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(\"ii\", 1, 10000)\n )\n return sock\n\n def _create_socket_zmq(self) -> zmq.sugar.socket.Socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_type: Literal[zmq.PULL, zmq.PUSH]\n if self._party == Party.EXECUTOR:\n socket_type = zmq.PULL\n else:\n socket_type = zmq.PUSH\n\n data_socket: zmq.sugar.socket.Socket = self._context.socket(socket_type)\n data_socket.set_hwm(160000)\n # Need to multiply by 1000 since ZMQ uses ms\n data_socket.setsockopt(\n zmq.RCVTIMEO, int(SocketCommunicator.ACCEPT_TIMEOUT * 1000)\n )\n # Try TCP first\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use TCP (ZMQ).\")\n self._init_tcp_socket_zmq(data_socket)\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use Unix sockets (ZMQ).\")\n self._init_unix_socket_zmq(data_socket)\n\n return data_socket\n\n # TCP Init\n ############################################################################\n\n def _find_random_port(\n self, min_port: int = 41923, max_port: int = 64324, max_tries: int = 100\n ) -> Optional[int]:\n \"\"\"Find a random open port to bind to if using TCP.\"\"\"\n from random import choices\n\n sock: socket.socket\n ports: List[int] = choices(range(min_port, max_port), k=max_tries)\n for port in ports:\n sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n try:\n sock.bind((\"\", port))\n sock.close()\n del sock\n return port\n except:\n continue\n return None\n\n def _init_tcp_socket_raw(self) -> socket.socket:\n \"\"\"Initialize a TCP socket.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_PORT=###`\n is defined, if so binds it, otherwise find a free port from a selection\n of random ports. If a port search is performed, the `LUTE_PORT` variable\n will be defined so it can be picked up by the the Task-side Communicator.\n\n In the event that no port can be bound on the Executor-side, or the port\n and hostname information is unavailable to the Task-side, the program\n will exit.\n\n Returns:\n data_socket (socket.socket): TCP socket object.\n \"\"\"\n data_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n # If port is None find one\n # Executor code executes first\n port = self._find_random_port()\n if port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n # Provide port env var for Task-side\n os.environ[\"LUTE_PORT\"] = str(port)\n data_socket.bind((\"\", int(port)))\n data_socket.listen()\n else:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! 
Exiting!\"\n )\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect((\"localhost\", int(port)))\n else:\n data_socket.connect((executor_hostname, int(port)))\n return data_socket\n\n def _init_tcp_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a TCP socket using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (zmq.socket.Socket): Socket object.\n \"\"\"\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n new_port: int = data_socket.bind_to_random_port(\"tcp://*\")\n if new_port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n port = new_port\n os.environ[\"LUTE_PORT\"] = str(port)\n else:\n data_socket.bind(f\"tcp://*:{port}\")\n logger.debug(f\"Executor bound port {port}\")\n else:\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! Exiting!\"\n )\n sys.exit(-1)\n data_socket.connect(f\"tcp://{executor_hostname}:{port}\")\n\n # Unix Init\n ############################################################################\n\n def _get_socket_path(self) -> str:\n \"\"\"Return the socket path, defining one if it is not available.\n\n Returns:\n socket_path (str): Path to the Unix socket.\n \"\"\"\n socket_path: str\n try:\n socket_path = os.environ[\"LUTE_SOCKET\"]\n except KeyError as err:\n import uuid\n import tempfile\n\n # Define a path, and add to environment\n # Executor-side always created first, Task will use the same one\n socket_path = f\"{tempfile.gettempdir()}/lute_{uuid.uuid4().hex}.sock\"\n os.environ[\"LUTE_SOCKET\"] = socket_path\n logger.debug(f\"SocketCommunicator defines socket_path: {socket_path}\")\n if USE_ZMQ:\n return f\"ipc://{socket_path}\"\n else:\n return socket_path\n\n def _init_unix_socket_raw(self) -> socket.socket:\n \"\"\"Returns a Unix socket object.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_SOCKET=XYZ`\n is defined, if so binds it, otherwise it will create a new path and\n define the environment variable for the Task-side to find.\n\n On the Task (client-side), this method will also open a SSH tunnel to\n forward a local Unix socket to an Executor Unix socket if the Task and\n Executor processes are on different machines.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_path: str = self._get_socket_path()\n data_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n data_socket.listen()\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! 
Exiting!\")\n data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path, hostname, executor_hostname\n )\n while 1:\n # Keep trying reconnect until ssh tunnel works.\n try:\n data_socket.connect(self._local_socket_path)\n break\n except FileNotFoundError:\n continue\n\n return data_socket\n\n def _init_unix_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a Unix socket object, using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (socket.socket): ZMQ object.\n \"\"\"\n socket_path = self._get_socket_path()\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! Exiting!\")\n self._data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n # Need to remove ipc:// from socket_path for forwarding\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path[6:], hostname, executor_hostname\n )\n # Need to add it back\n path: str = f\"ipc://{self._local_socket_path}\"\n data_socket.connect(path)\n\n def _setup_unix_ssh_tunnel(\n self, socket_path: str, hostname: str, executor_hostname: str\n ) -> str:\n \"\"\"Prepares an SSH tunnel for forwarding between Unix sockets on two hosts.\n\n An SSH tunnel is opened with `ssh -L <local>:<remote> sleep 2`.\n This method of communication is slightly slower and incurs additional\n overhead - it should only be used as a backup. If communication across\n multiple hosts is required consider using TCP. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. It is assumed that the user is identical on both the\n Task machine and Executor machine.\n\n Returns:\n local_socket_path (str): The local Unix socket to connect to.\n \"\"\"\n if \"uuid\" not in globals():\n import uuid\n local_socket_path = f\"{socket_path}.task{uuid.uuid4().hex[:4]}\"\n self._use_ssh_tunnel = True\n ssh_cmd: List[str] = [\n \"ssh\",\n \"-o\",\n \"LogLevel=quiet\",\n \"-L\",\n f\"{local_socket_path}:{socket_path}\",\n executor_hostname,\n \"sleep\",\n \"2\",\n ]\n logger.debug(f\"Opening tunnel from {hostname} to {executor_hostname}\")\n self._ssh_proc = subprocess.Popen(ssh_cmd)\n time.sleep(0.4) # Need to wait... 
-> Use single Task comm at beginning?\n return local_socket_path\n\n # Clean up and properties\n ############################################################################\n\n def _clean_up(self) -> None:\n \"\"\"Clean up connections.\"\"\"\n if self._party == Party.EXECUTOR:\n self._stop_thread = True\n self._reader_thread.join()\n logger.debug(\"Closed reading thread.\")\n\n self._data_socket.close()\n if USE_ZMQ:\n self._context.term()\n else:\n ...\n\n if os.getenv(\"LUTE_USE_TCP\"):\n return\n else:\n if self._party == Party.EXECUTOR:\n os.unlink(os.getenv(\"LUTE_SOCKET\")) # Should be defined\n return\n elif self._use_ssh_tunnel:\n if self._ssh_proc is not None:\n self._ssh_proc.terminate()\n\n @property\n def has_messages(self) -> bool:\n if self._party == Party.TASK:\n # Shouldn't be called on Task-side\n return False\n\n if self._msg_queue.qsize() > 0:\n return True\n return False\n\n def __exit__(self):\n self._clean_up()\n
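For example, a minimal sketch (assumed values) of selecting TCP communication and a fixed port before the Executor-side Communicator is created; both variables are read from the environment as described above.
import os

os.environ["LUTE_USE_TCP"] = "1"    # Use TCP sockets instead of Unix sockets
os.environ["LUTE_PORT"] = "51234"   # Assumed port; if unset, a random open port is chosen and exported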
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.ACCEPT_TIMEOUT","title":"ACCEPT_TIMEOUT: float = 0.01
class-attribute
instance-attribute
","text":"Maximum time to wait to accept connections. Used by Executor-side.
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.MSG_HEAD","title":"MSG_HEAD: bytes = b'MSG'
class-attribute
instance-attribute
","text":"Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.MSG_SEP","title":"MSG_SEP: bytes = b';;;'
class-attribute
instance-attribute
","text":"Separator for parts of a message. Messages have a start, length, message and end.
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC over a TCP or Unix socket.
Unlike with the PipeCommunicator, pickle is always used to send data through the socket.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to use pickle. Always True currently, passing False does not change behaviour.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.delayed_setup","title":"delayed_setup()
","text":"Delays the creation of socket objects.
The Executor initializes the Communicator when it is created. Since all Executors are created and available at once, we want to delay acquisition of socket resources until a single Executor is ready to use them.
Source code inlute/execution/ipc.py
def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.read","title":"read(proc)
","text":"Return a message from the queue if available.
Socket(s) are continuously monitored, and read from when new data is available.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from. Provided for compatibility with other Communicator subtypes. Is ignored.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.write","title":"write(msg)
","text":"Send a single Message.
The entire Message (signal and contents) is serialized and sent through a connection over a Unix or TCP socket.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n
"},{"location":"source/execution/ipc/","title":"ipc","text":"Classes and utilities for communication between Executors and subprocesses.
Communicators manage message passing and parsing between subprocesses. They maintain a limited public interface of \"read\" and \"write\" operations. Behind this interface the methods of communication vary from serialization across pipes to Unix sockets, etc. All communicators pass a single object called a \"Message\" which contains an arbitrary \"contents\" field as well as an optional \"signal\" field.
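A minimal sketch of this shared interface (hypothetical helper, not part of the LUTE source; proc stands for the Task subprocess handle the Executor already holds):
import subprocess
from lute.execution.ipc import Message, Party, PipeCommunicator

def poll_once(comm: PipeCommunicator, proc: subprocess.Popen) -> None:
    # Executor-side: read whatever the Task has written so far.
    msg: Message = comm.read(proc)
    if msg.contents is not None:
        print(msg.contents)
    if msg.signal is not None:
        print(f"Signal received: {msg.signal}")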
Classes:
Name DescriptionParty
Enum describing whether Communicator is on Task-side or Executor-side.
Message
A dataclass used for passing information from Task to Executor.
Communicator
Abstract base class for Communicator types.
PipeCommunicator
Manages communication between Task and Executor via pipes (stderr and stdout).
SocketCommunicator
Manages communication using sockets, either raw or using zmq. Supports both TCP and Unix sockets.
"},{"location":"source/execution/ipc/#execution.ipc.Communicator","title":"Communicator
","text":" Bases: ABC
lute/execution/ipc.py
class Communicator(ABC):\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n\n @abstractmethod\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n\n @abstractmethod\n def write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n\n def __str__(self):\n name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n return f\"{name}: {self.desc}\"\n\n def __repr__(self):\n return self.__str__()\n\n def __enter__(self) -> Self:\n return self\n\n def __exit__(self) -> None: ...\n\n @property\n def has_messages(self) -> bool:\n \"\"\"Whether the Communicator has remaining messages.\n\n The precise method for determining whether there are remaining messages\n will depend on the specific Communicator sub-class.\n \"\"\"\n return False\n\n def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n\n def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n\n def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.has_messages","title":"has_messages: bool
property
","text":"Whether the Communicator has remaining messages.
The precise method for determining whether there are remaining messages will depend on the specific Communicator sub-class.
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"Abstract Base Class for IPC Communicator objects.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using pickle prior to sending it.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.clear_communicator","title":"clear_communicator()
","text":"Alternative exit method outside of context manager.
Source code inlute/execution/ipc.py
def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.delayed_setup","title":"delayed_setup()
","text":"Any setup that should be done later than init.
Source code inlute/execution/ipc.py
def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.read","title":"read(proc)
abstractmethod
","text":"Method for reading data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.stage_communicator","title":"stage_communicator()
","text":"Alternative method for staging outside of context manager.
Source code inlute/execution/ipc.py
def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.write","title":"write(msg)
abstractmethod
","text":"Method for sending data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Party","title":"Party
","text":" Bases: Enum
Identifier for which party (side/end) is using a communicator.
For some types of communication streams there may be different interfaces depending on which side of the communicator you are on. This enum is used by the communicator to determine which interface to use.
Source code inlute/execution/ipc.py
class Party(Enum):\n \"\"\"Identifier for which party (side/end) is using a communicator.\n\n For some types of communication streams there may be different interfaces\n depending on which side of the communicator you are on. This enum is used\n by the communicator to determine which interface to use.\n \"\"\"\n\n TASK = 0\n \"\"\"\n The Task (client) side.\n \"\"\"\n EXECUTOR = 1\n \"\"\"\n The Executor (server) side.\n \"\"\"\n
"},{"location":"source/execution/ipc/#execution.ipc.Party.EXECUTOR","title":"EXECUTOR = 1
class-attribute
instance-attribute
","text":"The Executor (server) side.
"},{"location":"source/execution/ipc/#execution.ipc.Party.TASK","title":"TASK = 0
class-attribute
instance-attribute
","text":"The Task (client) side.
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator","title":"PipeCommunicator
","text":" Bases: Communicator
Provides communication through pipes over stderr/stdout.
The implementation of this communicator has reading and writing occurring on stderr and stdout. In general the Task
will be writing while the Executor
will be reading. stderr
is used for sending signals.
lute/execution/ipc.py
class PipeCommunicator(Communicator):\n \"\"\"Provides communication through pipes over stderr/stdout.\n\n The implementation of this communicator has reading and writing ocurring\n on stderr and stdout. In general the `Task` will be writing while the\n `Executor` will be reading. `stderr` is used for sending signals.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n\n def _safe_unpickle_decode(self, maybe_mixed: bytes) -> Optional[str]:\n \"\"\"This method is used to unpickle and/or decode a bytes object.\n\n It attempts to handle cases where contents can be mixed, i.e., part of\n the message must be decoded and the other part unpickled. It handles\n only two-way splits. If there are more complex arrangements such as:\n <pickled>:<unpickled>:<pickled> etc, it will give up.\n\n The simpler two way splits are unlikely to occur in normal usage. They\n may arise when debugging if, e.g., `print` statements are mixed with the\n usage of the `_report_to_executor` method.\n\n Note that this method works because ONLY text data is assumed to be\n sent via the pipes. The method needs to be revised to handle non-text\n data if the `Task` is modified to also send that via PipeCommunicator.\n The use of pickle is supported to provide for this option if it is\n necessary. It may be deprecated in the future.\n\n Be careful when making changes. This method has seemingly redundant\n checks because unpickling will not throw an error if a full object can\n be retrieved. That is, the library will ignore extraneous bytes. 
This\n method attempts to retrieve that information if the pickled data comes\n first in the stream.\n\n Args:\n maybe_mixed (bytes): A bytes object which could require unpickling,\n decoding, or both.\n\n Returns:\n contents (Optional[str]): The unpickled/decoded contents if possible.\n Otherwise, None.\n \"\"\"\n contents: Optional[str]\n try:\n contents = pickle.loads(maybe_mixed)\n repickled: bytes = pickle.dumps(contents)\n if len(repickled) < len(maybe_mixed):\n # Successful unpickling, but pickle stops even if there are more bytes\n try:\n additional_data: str = maybe_mixed[len(repickled) :].decode()\n contents = f\"{contents}{additional_data}\"\n except UnicodeDecodeError:\n # Can't decode the bytes left by pickle, so they are lost\n missing_bytes: int = len(maybe_mixed) - len(repickled)\n logger.debug(\n f\"PipeCommunicator has truncated message. Unable to retrieve {missing_bytes} bytes.\"\n )\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n # Pickle may also throw a ValueError, e.g. this bytes: b\"Found! \\n\"\n # Pickle may also throw an EOFError, eg. this bytes: b\"F0\\n\"\n try:\n contents = maybe_mixed.decode()\n except UnicodeDecodeError as err2:\n try:\n contents = maybe_mixed[: err2.start].decode()\n contents = f\"{contents}{pickle.loads(maybe_mixed[err2.start:])}\"\n except Exception as err3:\n logger.debug(\n f\"PipeCommunicator unable to decode/parse data! {err3}\"\n )\n contents = None\n return contents\n\n def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
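As a rough usage sketch (not part of the module source): assuming PipeCommunicator, Party, and Message are importable from lute.execution.ipc, and that a hypothetical task_script.py plays the Task role, the Task side writes a Message to its own stdout/stderr while the Executor side launches the Task as a subprocess and reads from its pipes.

# Task side (e.g. inside the hypothetical task_script.py)
from lute.execution.ipc import Message, Party, PipeCommunicator

task_comm = PipeCommunicator(party=Party.TASK)
task_comm.write(Message(contents={"status": "done"}))  # contents -> stdout (pickled), signal -> stderr

# Executor side
import subprocess

from lute.execution.ipc import Party, PipeCommunicator

exec_comm = PipeCommunicator(party=Party.EXECUTOR)
proc = subprocess.Popen(
    ["python", "task_script.py"],  # hypothetical Task entry point
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
proc.wait()  # for this simple sketch, read everything once the Task has exited
msg = exec_comm.read(proc)  # Message(contents=..., signal=...)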
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC through pipes.
Arbitrary objects may be transmitted using pickle to serialize the data. If pickle is not used, data is assumed to be text.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using Pickle prior to sending it. If False, data is assumed to be text.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.read","title":"read(proc)
","text":"Read from stdout and stderr.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.write","title":"write(msg)
","text":"Write to stdout and stderr.
The signal component is sent to stderr
while the contents of the Message are sent to stdout
.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator","title":"SocketCommunicator
","text":" Bases: Communicator
Provides communication over Unix or TCP sockets.
Communication is provided either using sockets with the Python socket library or using ZMQ. The choice of implementation is controlled by the global bool USE_ZMQ
. Whether to use TCP or Unix sockets is controlled by the environment:
LUTE_USE_TCP=1
If defined, TCP sockets will be used, otherwise Unix sockets will be used.
Regardless of socket type, the environment variable LUTE_EXECUTOR_HOST=<hostname>
will be defined by the Executor-side Communicator.
For TCP sockets: The Executor-side Communicator should be run first and will bind to all interfaces on the port determined by the environment variable: LUTE_PORT=###
If no port is defined, a port scan will be performed and the Executor-side Communicator will bind to the first available port from a random selection. It will then define the environment variable so the Task-side can pick it up.
For Unix sockets: The path to the Unix socket is defined by the environment variable: LUTE_SOCKET=/path/to/socket
This class assumes proper permissions and that the above environment variable has been defined. The Task
is configured as what would commonly be referred to as the client
, while the Executor
is configured as the server.
If the Task process is run on a different machine than the Executor, the Task-side Communicator will open an SSH tunnel to forward traffic from a local Unix socket to the Executor's Unix socket. Opening the tunnel relies on the environment variable: LUTE_EXECUTOR_HOST=<hostname>
to determine the Executor's host. This variable should be defined by the Executor and passed to the Task process automatically, but it can also be defined manually if launching the Task process separately. The Task will use the local socket <LUTE_SOCKET>.task{##}
. Multiple local sockets may be created. Currently, it is assumed that the user is identical on both the Task machine and Executor machine.
lute/execution/ipc.py
class SocketCommunicator(Communicator):\n \"\"\"Provides communication over Unix or TCP sockets.\n\n Communication is provided either using sockets with the Python socket library\n or using ZMQ. The choice of implementation is controlled by the global bool\n `USE_ZMQ`.\n\n Whether to use TCP or Unix sockets is controlled by the environment:\n `LUTE_USE_TCP=1`\n If defined, TCP sockets will be used, otherwise Unix sockets will be used.\n\n Regardless of socket type, the environment variable\n `LUTE_EXECUTOR_HOST=<hostname>`\n will be defined by the Executor-side Communicator.\n\n\n For TCP sockets:\n The Executor-side Communicator should be run first and will bind to all\n interfaces on the port determined by the environment variable:\n `LUTE_PORT=###`\n If no port is defined, a port scan will be performed and the Executor-side\n Communicator will bind the first one available from a random selection. It\n will then define the environment variable so the Task-side can pick it up.\n\n For Unix sockets:\n The path to the Unix socket is defined by the environment variable:\n `LUTE_SOCKET=/path/to/socket`\n This class assumes proper permissions and that this above environment\n variable has been defined. The `Task` is configured as what would commonly\n be referred to as the `client`, while the `Executor` is configured as the\n server.\n\n If the Task process is run on a different machine than the Executor, the\n Task-side Communicator will open a ssh-tunnel to forward traffic from a local\n Unix socket to the Executor Unix socket. Opening of the tunnel relies on the\n environment variable:\n `LUTE_EXECUTOR_HOST=<hostname>`\n to determine the Executor's host. This variable should be defined by the\n Executor and passed to the Task process automatically, but it can also be\n defined manually if launching the Task process separately. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. Currently, it is assumed that the user is identical on both the Task\n machine and Executor machine.\n \"\"\"\n\n ACCEPT_TIMEOUT: float = 0.01\n \"\"\"\n Maximum time to wait to accept connections. Used by Executor-side.\n \"\"\"\n MSG_HEAD: bytes = b\"MSG\"\n \"\"\"\n Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].\n \"\"\"\n MSG_SEP: bytes = b\";;;\"\n \"\"\"\n Separator for parts of a message. Messages have a start, length, message and end.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n\n def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. 
Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n\n # Read\n ############################################################################\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n\n def _read_socket(self) -> None:\n \"\"\"Read data from a socket.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Calls an underlying method for either raw sockets or ZMQ.\n \"\"\"\n\n while True:\n if self._stop_thread:\n logger.debug(\"Stopping socket reader thread.\")\n break\n if USE_ZMQ:\n self._read_socket_zmq()\n else:\n self._read_socket_raw()\n\n def _read_socket_raw(self) -> None:\n \"\"\"Read data from a socket.\n\n Raw socket implementation for the reader thread.\n \"\"\"\n connection: socket.socket\n addr: Union[str, Tuple[str, int]]\n try:\n connection, addr = self._data_socket.accept()\n full_data: bytes = b\"\"\n while True:\n data: bytes = connection.recv(8192)\n if data:\n full_data += data\n else:\n break\n connection.close()\n self._unpack_messages(full_data)\n except socket.timeout:\n pass\n\n def _read_socket_zmq(self) -> None:\n \"\"\"Read data from a socket.\n\n ZMQ implementation for the reader thread.\n \"\"\"\n try:\n full_data: bytes = self._data_socket.recv(0)\n self._unpack_messages(full_data)\n except zmq.ZMQError:\n pass\n\n def _unpack_messages(self, data: bytes) -> None:\n \"\"\"Unpacks a byte stream into individual messages.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n Partial messages (a series of bytes which cannot be 
converted to a full\n message) are stored for later. An attempt is made to reconstruct the\n message with the next call to this method.\n\n Args:\n data (bytes): A raw byte stream containing anywhere from a partial\n message to multiple full messages.\n \"\"\"\n msg: Message\n working_data: bytes\n if self._partial_msg:\n # Concatenate the previous partial message to the beginning\n working_data = self._partial_msg + data\n self._partial_msg = None\n else:\n working_data = data\n while working_data:\n try:\n # Message encoding: <HEAD><SEP><len><SEP><msg><SEP><HEAD[::-1]>\n end = working_data.find(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD[::-1]\n )\n msg_parts: List[bytes] = working_data[:end].split(\n SocketCommunicator.MSG_SEP\n )\n if len(msg_parts) != 3:\n self._partial_msg = working_data\n break\n\n cmd: bytes\n nbytes: bytes\n raw_msg: bytes\n cmd, nbytes, raw_msg = msg_parts\n if len(raw_msg) != int(nbytes):\n self._partial_msg = working_data\n break\n msg = pickle.loads(raw_msg)\n self._msg_queue.put(msg)\n except pickle.UnpicklingError:\n self._partial_msg = working_data\n break\n if end < len(working_data):\n # Add len(SEP+HEAD) since end marks the start of <SEP><HEAD[::-1]\n offset: int = len(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD\n )\n working_data = working_data[end + offset :]\n else:\n working_data = b\"\"\n\n # Write\n ############################################################################\n\n def _write_socket(self, msg: Message) -> None:\n \"\"\"Sends data over a socket from the 'client' (Task) side.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n This structure is used for decoding the message on the other end.\n \"\"\"\n data: bytes = pickle.dumps(msg)\n cmd: bytes = SocketCommunicator.MSG_HEAD\n size: bytes = b\"%d\" % len(data)\n end: bytes = SocketCommunicator.MSG_HEAD[::-1]\n sep: bytes = SocketCommunicator.MSG_SEP\n packed_msg: bytes = cmd + sep + size + sep + data + sep + end\n if USE_ZMQ:\n self._data_socket.send(packed_msg)\n else:\n self._data_socket.sendall(packed_msg)\n\n def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n\n # Generic create\n ############################################################################\n\n def _create_socket_raw(self) -> socket.socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): TCP or Unix socket.\n \"\"\"\n import struct\n\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n sock: socket.socket\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw TCP sockets.\")\n sock = self._init_tcp_socket_raw()\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw Unix 
sockets.\")\n sock = self._init_unix_socket_raw()\n sock.setsockopt(\n socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(\"ii\", 1, 10000)\n )\n return sock\n\n def _create_socket_zmq(self) -> zmq.sugar.socket.Socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_type: Literal[zmq.PULL, zmq.PUSH]\n if self._party == Party.EXECUTOR:\n socket_type = zmq.PULL\n else:\n socket_type = zmq.PUSH\n\n data_socket: zmq.sugar.socket.Socket = self._context.socket(socket_type)\n data_socket.set_hwm(160000)\n # Need to multiply by 1000 since ZMQ uses ms\n data_socket.setsockopt(\n zmq.RCVTIMEO, int(SocketCommunicator.ACCEPT_TIMEOUT * 1000)\n )\n # Try TCP first\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use TCP (ZMQ).\")\n self._init_tcp_socket_zmq(data_socket)\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use Unix sockets (ZMQ).\")\n self._init_unix_socket_zmq(data_socket)\n\n return data_socket\n\n # TCP Init\n ############################################################################\n\n def _find_random_port(\n self, min_port: int = 41923, max_port: int = 64324, max_tries: int = 100\n ) -> Optional[int]:\n \"\"\"Find a random open port to bind to if using TCP.\"\"\"\n from random import choices\n\n sock: socket.socket\n ports: List[int] = choices(range(min_port, max_port), k=max_tries)\n for port in ports:\n sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n try:\n sock.bind((\"\", port))\n sock.close()\n del sock\n return port\n except:\n continue\n return None\n\n def _init_tcp_socket_raw(self) -> socket.socket:\n \"\"\"Initialize a TCP socket.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_PORT=###`\n is defined, if so binds it, otherwise find a free port from a selection\n of random ports. If a port search is performed, the `LUTE_PORT` variable\n will be defined so it can be picked up by the the Task-side Communicator.\n\n In the event that no port can be bound on the Executor-side, or the port\n and hostname information is unavailable to the Task-side, the program\n will exit.\n\n Returns:\n data_socket (socket.socket): TCP socket object.\n \"\"\"\n data_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n # If port is None find one\n # Executor code executes first\n port = self._find_random_port()\n if port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n # Provide port env var for Task-side\n os.environ[\"LUTE_PORT\"] = str(port)\n data_socket.bind((\"\", int(port)))\n data_socket.listen()\n else:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! 
Exiting!\"\n )\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect((\"localhost\", int(port)))\n else:\n data_socket.connect((executor_hostname, int(port)))\n return data_socket\n\n def _init_tcp_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a TCP socket using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (zmq.socket.Socket): Socket object.\n \"\"\"\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n new_port: int = data_socket.bind_to_random_port(\"tcp://*\")\n if new_port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n port = new_port\n os.environ[\"LUTE_PORT\"] = str(port)\n else:\n data_socket.bind(f\"tcp://*:{port}\")\n logger.debug(f\"Executor bound port {port}\")\n else:\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! Exiting!\"\n )\n sys.exit(-1)\n data_socket.connect(f\"tcp://{executor_hostname}:{port}\")\n\n # Unix Init\n ############################################################################\n\n def _get_socket_path(self) -> str:\n \"\"\"Return the socket path, defining one if it is not available.\n\n Returns:\n socket_path (str): Path to the Unix socket.\n \"\"\"\n socket_path: str\n try:\n socket_path = os.environ[\"LUTE_SOCKET\"]\n except KeyError as err:\n import uuid\n import tempfile\n\n # Define a path, and add to environment\n # Executor-side always created first, Task will use the same one\n socket_path = f\"{tempfile.gettempdir()}/lute_{uuid.uuid4().hex}.sock\"\n os.environ[\"LUTE_SOCKET\"] = socket_path\n logger.debug(f\"SocketCommunicator defines socket_path: {socket_path}\")\n if USE_ZMQ:\n return f\"ipc://{socket_path}\"\n else:\n return socket_path\n\n def _init_unix_socket_raw(self) -> socket.socket:\n \"\"\"Returns a Unix socket object.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_SOCKET=XYZ`\n is defined, if so binds it, otherwise it will create a new path and\n define the environment variable for the Task-side to find.\n\n On the Task (client-side), this method will also open a SSH tunnel to\n forward a local Unix socket to an Executor Unix socket if the Task and\n Executor processes are on different machines.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_path: str = self._get_socket_path()\n data_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n data_socket.listen()\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! 
Exiting!\")\n data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path, hostname, executor_hostname\n )\n while 1:\n # Keep trying reconnect until ssh tunnel works.\n try:\n data_socket.connect(self._local_socket_path)\n break\n except FileNotFoundError:\n continue\n\n return data_socket\n\n def _init_unix_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a Unix socket object, using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (socket.socket): ZMQ object.\n \"\"\"\n socket_path = self._get_socket_path()\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! Exiting!\")\n self._data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n # Need to remove ipc:// from socket_path for forwarding\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path[6:], hostname, executor_hostname\n )\n # Need to add it back\n path: str = f\"ipc://{self._local_socket_path}\"\n data_socket.connect(path)\n\n def _setup_unix_ssh_tunnel(\n self, socket_path: str, hostname: str, executor_hostname: str\n ) -> str:\n \"\"\"Prepares an SSH tunnel for forwarding between Unix sockets on two hosts.\n\n An SSH tunnel is opened with `ssh -L <local>:<remote> sleep 2`.\n This method of communication is slightly slower and incurs additional\n overhead - it should only be used as a backup. If communication across\n multiple hosts is required consider using TCP. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. It is assumed that the user is identical on both the\n Task machine and Executor machine.\n\n Returns:\n local_socket_path (str): The local Unix socket to connect to.\n \"\"\"\n if \"uuid\" not in globals():\n import uuid\n local_socket_path = f\"{socket_path}.task{uuid.uuid4().hex[:4]}\"\n self._use_ssh_tunnel = True\n ssh_cmd: List[str] = [\n \"ssh\",\n \"-o\",\n \"LogLevel=quiet\",\n \"-L\",\n f\"{local_socket_path}:{socket_path}\",\n executor_hostname,\n \"sleep\",\n \"2\",\n ]\n logger.debug(f\"Opening tunnel from {hostname} to {executor_hostname}\")\n self._ssh_proc = subprocess.Popen(ssh_cmd)\n time.sleep(0.4) # Need to wait... 
-> Use single Task comm at beginning?\n return local_socket_path\n\n # Clean up and properties\n ############################################################################\n\n def _clean_up(self) -> None:\n \"\"\"Clean up connections.\"\"\"\n if self._party == Party.EXECUTOR:\n self._stop_thread = True\n self._reader_thread.join()\n logger.debug(\"Closed reading thread.\")\n\n self._data_socket.close()\n if USE_ZMQ:\n self._context.term()\n else:\n ...\n\n if os.getenv(\"LUTE_USE_TCP\"):\n return\n else:\n if self._party == Party.EXECUTOR:\n os.unlink(os.getenv(\"LUTE_SOCKET\")) # Should be defined\n return\n elif self._use_ssh_tunnel:\n if self._ssh_proc is not None:\n self._ssh_proc.terminate()\n\n @property\n def has_messages(self) -> bool:\n if self._party == Party.TASK:\n # Shouldn't be called on Task-side\n return False\n\n if self._msg_queue.qsize() > 0:\n return True\n return False\n\n def __exit__(self):\n self._clean_up()\n
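To make the environment-variable control described above concrete, here is a minimal sketch (values are placeholders; only the variable names are taken from the docstring) of how the socket flavour could be selected before the Executor and Task processes are started:

import os

# TCP sockets: the Executor binds LUTE_PORT, or scans for a free port and exports it.
os.environ["LUTE_USE_TCP"] = "1"
os.environ["LUTE_PORT"] = "51234"  # placeholder port; omit to trigger the port scan

# Unix sockets (the default when LUTE_USE_TCP is unset): both sides share LUTE_SOCKET.
# os.environ["LUTE_SOCKET"] = "/tmp/lute_example.sock"  # placeholder path

# LUTE_EXECUTOR_HOST is normally exported by the Executor-side Communicator, but may be
# set manually if the Task process is launched separately on another machine.
# os.environ["LUTE_EXECUTOR_HOST"] = "example-host"  # placeholder hostname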
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.ACCEPT_TIMEOUT","title":"ACCEPT_TIMEOUT: float = 0.01
class-attribute
instance-attribute
","text":"Maximum time to wait to accept connections. Used by Executor-side.
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.MSG_HEAD","title":"MSG_HEAD: bytes = b'MSG'
class-attribute
instance-attribute
","text":"Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.MSG_SEP","title":"MSG_SEP: bytes = b';;;'
class-attribute
instance-attribute
","text":"Separator for parts of a message. Messages have a start, length, message and end.
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC over a TCP or Unix socket.
Unlike with the PipeCommunicator, pickle is always used to send data through the socket.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to use pickle. Always True currently; passing False does not change behaviour.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.delayed_setup","title":"delayed_setup()
","text":"Delays the creation of socket objects.
The Executor initializes the Communicator when it is created. Since all Executors are created and available at once, we want to delay acquisition of socket resources until a single Executor is ready to use them.
Source code inlute/execution/ipc.py
def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.read","title":"read(proc)
","text":"Return a message from the queue if available.
Socket(s) are continuously monitored, and read from when new data is available.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from. Provided for compatibility with other Communicator subtypes. Is ignored.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.write","title":"write(msg)
","text":"Send a single Message.
The entire Message (signal and contents) is serialized and sent through a connection over a Unix or TCP socket.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n
"},{"location":"source/io/_sqlite/","title":"_sqlite","text":"Backend SQLite database utilites.
Functions should be used only by the higher-level database module.
"},{"location":"source/io/config/","title":"config","text":"Machinary for the IO of configuration YAML files and their validation.
Functions:
Name Descriptionparse_config
str, config_path: str) -> TaskParameters: Parse a configuration file and return a TaskParameters object of validated parameters for a specific Task. Raises an exception if the provided configuration does not match the expected model.
Raises:
Type DescriptionValidationError
Error raised by pydantic during data validation. (From Pydantic)
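A minimal usage sketch (the Task name and YAML path below are placeholders; the first parameter name is abbreviated in the docstring above, so it is passed positionally here):

from pydantic import ValidationError

from lute.io.config import parse_config

try:
    params = parse_config("SomeTask", config_path="/path/to/config.yaml")  # placeholders
except ValidationError as err:
    # Raised when the provided configuration does not match the expected model.
    print(f"Invalid configuration: {err}")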
"},{"location":"source/io/config/#io.config.AnalysisHeader","title":"AnalysisHeader
","text":" Bases: BaseModel
Header information for LUTE analysis runs.
Source code inlute/io/models/base.py
class AnalysisHeader(BaseModel):\n \"\"\"Header information for LUTE analysis runs.\"\"\"\n\n title: str = Field(\n \"LUTE Task Configuration\",\n description=\"Description of the configuration or experiment.\",\n )\n experiment: str = Field(\"\", description=\"Experiment.\")\n run: Union[str, int] = Field(\"\", description=\"Data acquisition run.\")\n date: str = Field(\"1970/01/01\", description=\"Start date of analysis.\")\n lute_version: Union[float, str] = Field(\n 0.1, description=\"Version of LUTE used for analysis.\"\n )\n task_timeout: PositiveInt = Field(\n 600,\n description=(\n \"Time in seconds until a task times out. Should be slightly shorter\"\n \" than job timeout if using a job manager (e.g. SLURM).\"\n ),\n )\n work_dir: str = Field(\"\", description=\"Main working directory for LUTE.\")\n\n @validator(\"work_dir\", always=True)\n def validate_work_dir(cls, directory: str, values: Dict[str, Any]) -> str:\n work_dir: str\n if directory == \"\":\n std_work_dir = (\n f\"/sdf/data/lcls/ds/{values['experiment'][:3]}/\"\n f\"{values['experiment']}/scratch\"\n )\n work_dir = std_work_dir\n else:\n work_dir = directory\n # Check existence and permissions\n if not os.path.exists(work_dir):\n raise ValueError(f\"Working Directory: {work_dir} does not exist!\")\n if not os.access(work_dir, os.W_OK):\n # Need write access for database, files etc.\n raise ValueError(f\"Not write access for working directory: {work_dir}!\")\n return work_dir\n\n @validator(\"run\", always=True)\n def validate_run(\n cls, run: Union[str, int], values: Dict[str, Any]\n ) -> Union[str, int]:\n if run == \"\":\n # From Airflow RUN_NUM should have Format \"RUN_DATETIME\" - Num is first part\n run_time: str = os.environ.get(\"RUN_NUM\", \"\")\n if run_time != \"\":\n return int(run_time.split(\"_\")[0])\n return run\n\n @validator(\"experiment\", always=True)\n def validate_experiment(cls, experiment: str, values: Dict[str, Any]) -> str:\n if experiment == \"\":\n arp_exp: str = os.environ.get(\"EXPERIMENT\", \"EXPX00000\")\n return arp_exp\n return experiment\n
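For illustration, a header could be instantiated directly as below (the experiment name, run, and path are placeholders; fields left out fall back to the defaults and validators shown above):

from lute.io.models.base import AnalysisHeader

header = AnalysisHeader(
    title="LUTE Task Configuration",
    experiment="mfxl1234567",  # placeholder experiment name
    run=10,
    task_timeout=600,
    work_dir="/sdf/data/lcls/ds/mfx/mfxl1234567/scratch",  # must exist and be writable
)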
"},{"location":"source/io/config/#io.config.CompareHKLParameters","title":"CompareHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's compare_hkl
for calculating figures of merit.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code inlute/io/models/sfx_merge.py
class CompareHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `compare_hkl` for calculating figures of merit.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/compare_hkl\",\n description=\"CrystFEL's reflection comparison binary.\",\n flag_type=\"\",\n )\n in_files: Optional[str] = Field(\n \"\",\n description=\"Path to input HKLs. Space-separated list of 2. Use output of partialator e.g.\",\n flag_type=\"\",\n )\n ## Need mechanism to set is_result=True ...\n symmetry: str = Field(\"\", description=\"Point group symmetry.\", flag_type=\"--\")\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n fom: str = Field(\n \"Rsplit\", description=\"Specify figure of merit to calculate.\", flag_type=\"--\"\n )\n nshells: int = Field(10, description=\"Use n resolution shells.\", flag_type=\"--\")\n # NEED A NEW CASE FOR THIS -> Boolean flag, no arg, one hyphen...\n # fix_unity: bool = Field(\n # False,\n # description=\"Fix scale factors to unity.\",\n # flag_type=\"-\",\n # rename_param=\"u\",\n # )\n shell_file: str = Field(\n \"\",\n description=\"Write the statistics in resolution shells to a file.\",\n flag_type=\"--\",\n rename_param=\"shell-file\",\n is_result=True,\n )\n ignore_negs: bool = Field(\n False,\n description=\"Ignore reflections with negative reflections.\",\n flag_type=\"--\",\n rename_param=\"ignore-negs\",\n )\n zero_negs: bool = Field(\n False,\n description=\"Set negative intensities to 0.\",\n flag_type=\"--\",\n rename_param=\"zero-negs\",\n )\n sigma_cutoff: Optional[Union[float, int, str]] = Field(\n # \"-infinity\",\n description=\"Discard reflections with I/sigma(I) < n. -infinity means no cutoff.\",\n flag_type=\"--\",\n rename_param=\"sigma-cutoff\",\n )\n rmin: Optional[float] = Field(\n description=\"Low resolution cutoff of 1/d (m-1). Use this or --lowres NOT both.\",\n flag_type=\"--\",\n )\n lowres: Optional[float] = Field(\n descirption=\"Low resolution cutoff in Angstroms. Use this or --rmin NOT both.\",\n flag_type=\"--\",\n )\n rmax: Optional[float] = Field(\n description=\"High resolution cutoff in 1/d (m-1). Use this or --highres NOT both.\",\n flag_type=\"--\",\n )\n highres: Optional[float] = Field(\n description=\"High resolution cutoff in Angstroms. 
Use this or --rmax NOT both.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_files\", always=True)\n def validate_in_files(cls, in_files: str, values: Dict[str, Any]) -> str:\n if in_files == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n hkls: str = f\"{partialator_file}1 {partialator_file}2\"\n return hkls\n return in_files\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n\n @validator(\"symmetry\", always=True)\n def validate_symmetry(cls, symmetry: str, values: Dict[str, Any]) -> str:\n if symmetry == \"\":\n partialator_sym: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"symmetry\"\n )\n if partialator_sym:\n return partialator_sym\n return symmetry\n\n @validator(\"shell_file\", always=True)\n def validate_shell_file(cls, shell_file: str, values: Dict[str, Any]) -> str:\n if shell_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n shells_out: str = partialator_file.split(\".\")[0]\n shells_out = f\"{shells_out}_{values['fom']}_n{values['nshells']}.dat\"\n return shells_out\n return shell_file\n
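For orientation, a corresponding parameter set might look like the sketch below (paths and the point group are placeholders; fields that are omitted are filled in by the validators above from earlier database entries):

compare_hkl_params = {
    "in_files": "/path/to/partialator.hkl1 /path/to/partialator.hkl2",  # placeholder paths
    "symmetry": "4/mmm",  # placeholder point group
    "cell_file": "/path/to/cell.pdb",  # placeholder unit cell file
    "fom": "Rsplit",
    "nshells": 10,
}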
"},{"location":"source/io/config/#io.config.CompareHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.CompareHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters","title":"ConcatenateStreamFilesParameters
","text":" Bases: TaskParameters
Parameters for stream concatenation.
Concatenates the stream file output from CrystFEL indexing for multiple experimental runs.
Source code inlute/io/models/sfx_index.py
class ConcatenateStreamFilesParameters(TaskParameters):\n \"\"\"Parameters for stream concatenation.\n\n Concatenates the stream file output from CrystFEL indexing for multiple\n experimental runs.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n in_file: str = Field(\n \"\",\n description=\"Root of directory tree storing stream files to merge.\",\n )\n\n tag: Optional[str] = Field(\n \"\",\n description=\"Tag identifying the stream files to merge.\",\n )\n\n out_file: str = Field(\n \"\", description=\"Path to merged output stream file.\", is_result=True\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_dir: str = str(Path(stream_file).parent)\n return stream_dir\n return in_file\n\n @validator(\"tag\", always=True)\n def validate_tag(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_tag: str = Path(stream_file).name.split(\"_\")[0]\n return stream_tag\n return tag\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_out_file: str = str(\n Path(values[\"in_file\"]).parent / f\"{values['tag'].stream}\"\n )\n return stream_out_file\n return tag\n
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.DimpleSolveParameters","title":"DimpleSolveParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's dimple program.
There are many parameters. For more information on usage, please refer to the CCP4 documentation, here: https://ccp4.github.io/dimple/
Source code inlute/io/models/sfx_solve.py
class DimpleSolveParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's dimple program.\n\n There are many parameters. For more information on\n usage, please refer to the CCP4 documentation, here:\n https://ccp4.github.io/dimple/\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/dimple\",\n description=\"CCP4 Dimple for solving structures with MR.\",\n flag_type=\"\",\n )\n # Positional requirements - all required.\n in_file: str = Field(\n \"\",\n description=\"Path to input mtz.\",\n flag_type=\"\",\n )\n pdb: str = Field(\"\", description=\"Path to a PDB.\", flag_type=\"\")\n out_dir: str = Field(\"\", description=\"Output DIRECTORY.\", flag_type=\"\")\n # Most used options\n mr_thresh: PositiveFloat = Field(\n 0.4,\n description=\"Threshold for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-when-r\",\n )\n slow: Optional[bool] = Field(\n False, description=\"Perform more refinement.\", flag_type=\"--\"\n )\n # Other options (IO)\n hklout: str = Field(\n \"final.mtz\", description=\"Output mtz file name.\", flag_type=\"--\"\n )\n xyzout: str = Field(\n \"final.pdb\", description=\"Output PDB file name.\", flag_type=\"--\"\n )\n icolumn: Optional[str] = Field(\n # \"IMEAN\",\n description=\"Name for the I column.\",\n flag_type=\"--\",\n )\n sigicolumn: Optional[str] = Field(\n # \"SIG<ICOL>\",\n description=\"Name for the Sig<I> column.\",\n flag_type=\"--\",\n )\n fcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the F column.\",\n flag_type=\"--\",\n )\n sigfcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the Sig<F> column.\",\n flag_type=\"--\",\n )\n libin: Optional[str] = Field(\n description=\"Ligand descriptions for refmac (LIBIN).\", flag_type=\"--\"\n )\n refmac_key: Optional[str] = Field(\n description=\"Extra Refmac keywords to use in refinement.\",\n flag_type=\"--\",\n rename_param=\"refmac-key\",\n )\n free_r_flags: Optional[str] = Field(\n description=\"Path to a mtz file with freeR flags.\",\n flag_type=\"--\",\n rename_param=\"free-r-flags\",\n )\n freecolumn: Optional[Union[int, float]] = Field(\n # 0,\n description=\"Refree column with an optional value.\",\n flag_type=\"--\",\n )\n img_format: Optional[str] = Field(\n description=\"Format of generated images. 
(png, jpeg, none).\",\n flag_type=\"-\",\n rename_param=\"f\",\n )\n white_bg: bool = Field(\n False,\n description=\"Use a white background in Coot and in images.\",\n flag_type=\"--\",\n rename_param=\"white-bg\",\n )\n no_cleanup: bool = Field(\n False,\n description=\"Retain intermediate files.\",\n flag_type=\"--\",\n rename_param=\"no-cleanup\",\n )\n # Calculations\n no_blob_search: bool = Field(\n False,\n description=\"Do not search for unmodelled blobs.\",\n flag_type=\"--\",\n rename_param=\"no-blob-search\",\n )\n anode: bool = Field(\n False, description=\"Use SHELX/AnoDe to find peaks in the anomalous map.\"\n )\n # Run customization\n no_hetatm: bool = Field(\n False,\n description=\"Remove heteroatoms from the given model.\",\n flag_type=\"--\",\n rename_param=\"no-hetatm\",\n )\n rigid_cycles: Optional[PositiveInt] = Field(\n # 10,\n description=\"Number of cycles of rigid-body refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"rigid-cycles\",\n )\n jelly: Optional[PositiveInt] = Field(\n # 4,\n description=\"Number of cycles of jelly-body refinement to perform.\",\n flag_type=\"--\",\n )\n restr_cycles: Optional[PositiveInt] = Field(\n # 8,\n description=\"Number of cycles of refmac final refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"restr-cycles\",\n )\n lim_resolution: Optional[PositiveFloat] = Field(\n description=\"Limit the final resolution.\", flag_type=\"--\", rename_param=\"reso\"\n )\n weight: Optional[str] = Field(\n # \"auto-weight\",\n description=\"The refmac matrix weight.\",\n flag_type=\"--\",\n )\n mr_prog: Optional[str] = Field(\n # \"phaser\",\n description=\"Molecular replacement program. phaser or molrep.\",\n flag_type=\"--\",\n rename_param=\"mr-prog\",\n )\n mr_num: Optional[Union[str, int]] = Field(\n # \"auto\",\n description=\"Number of molecules to use for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-num\",\n )\n mr_reso: Optional[PositiveFloat] = Field(\n # 3.25,\n description=\"High resolution for molecular replacement. If >10 interpreted as eLLG.\",\n flag_type=\"--\",\n rename_param=\"mr-reso\",\n )\n itof_prog: Optional[str] = Field(\n description=\"Program to calculate amplitudes. truncate, or ctruncate.\",\n flag_type=\"--\",\n rename_param=\"ItoF-prog\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return get_hkl_file\n return in_file\n\n @validator(\"out_dir\", always=True)\n def validate_out_dir(cls, out_dir: str, values: Dict[str, Any]) -> str:\n if out_dir == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return os.path.dirname(get_hkl_file)\n return out_dir\n
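A minimal parameter sketch (paths are placeholders; if in_file or out_dir are left empty they are filled from the latest ManipulateHKL database entry by the validators above):

dimple_params = {
    "in_file": "/path/to/merged.mtz",  # placeholder input mtz
    "pdb": "/path/to/search_model.pdb",  # placeholder search model
    "out_dir": "/path/to/dimple_output",  # placeholder output directory
}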
"},{"location":"source/io/config/#io.config.FindOverlapXSSParameters","title":"FindOverlapXSSParameters
","text":" Bases: TaskParameters
TaskParameter model for FindOverlapXSS Task.
This Task determines spatial or temporal overlap between an optical pulse and the FEL pulse based on difference scattering (XSS) signal. This Task uses SmallData HDF5 files as a source.
Source code inlute/io/models/smd.py
class FindOverlapXSSParameters(TaskParameters):\n \"\"\"TaskParameter model for FindOverlapXSS Task.\n\n This Task determines spatial or temporal overlap between an optical pulse\n and the FEL pulse based on difference scattering (XSS) signal. This Task\n uses SmallData HDF5 files as a source.\n \"\"\"\n\n class ExpConfig(BaseModel):\n det_name: str\n ipm_var: str\n scan_var: Union[str, List[str]]\n\n class Thresholds(BaseModel):\n min_Iscat: Union[int, float]\n min_ipm: Union[int, float]\n\n class AnalysisFlags(BaseModel):\n use_pyfai: bool = True\n use_asymls: bool = False\n\n exp_config: ExpConfig\n thresholds: Thresholds\n analysis_flags: AnalysisFlags\n
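Because this model nests sub-models rather than flat fields, the configuration is supplied as nested mappings; a sketch with placeholder detector and variable names:

overlap_xss_config = {
    "exp_config": {"det_name": "epix_1", "ipm_var": "ipm4/sum", "scan_var": "lxt"},  # placeholders
    "thresholds": {"min_Iscat": 10, "min_ipm": 500},  # placeholders
    "analysis_flags": {"use_pyfai": True, "use_asymls": False},
}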
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters","title":"FindPeaksPsocakeParameters
","text":" Bases: ThirdPartyParameters
Parameters for crystallographic (Bragg) peak finding using Psocake.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation. NOTE: This Task is deprecated and provided for compatibility only.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPsocakeParameters(ThirdPartyParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using Psocake.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n NOTE: This Task is deprecated and provided for compatibility only.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n class SZParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description=\"SZ compression algorithm (qoz, sz3)\"\n )\n binSize: int = Field(2, description=\"SZ compression's bin size paramater\")\n roiWindowSize: int = Field(\n 2, description=\"SZ compression's ROI window size paramater\"\n )\n absError: float = Field(10, descriptionp=\"Maximum absolute error value\")\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n mca: str = Field(\n \"btl ^openib\", description=\"Mca option for the MPI executable\", flag_type=\"--\"\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n p_arg2: str = Field(\n \"findPeaksSZ.py\",\n description=\"Executable to run with mpi (i.e. python).\",\n flag_type=\"\",\n )\n d: str = Field(description=\"Detector name\", flag_type=\"-\")\n e: str = Field(\"\", description=\"Experiment name\", flag_type=\"-\")\n r: int = Field(-1, description=\"Run number\", flag_type=\"-\")\n outDir: str = Field(\n description=\"Output directory where .cxi will be saved\", flag_type=\"--\"\n )\n algorithm: int = Field(1, description=\"PyAlgos algorithm to use\", flag_type=\"--\")\n alg_npix_min: float = Field(\n 1.0, description=\"PyAlgos algorithm's npix_min parameter\", flag_type=\"--\"\n )\n alg_npix_max: float = Field(\n 45.0, description=\"PyAlgos algorithm's npix_max parameter\", flag_type=\"--\"\n )\n alg_amax_thr: float = Field(\n 250.0, description=\"PyAlgos algorithm's amax_thr parameter\", flag_type=\"--\"\n )\n alg_atot_thr: float = Field(\n 330.0, description=\"PyAlgos algorithm's atot_thr parameter\", flag_type=\"--\"\n )\n alg_son_min: float = Field(\n 10.0, description=\"PyAlgos algorithm's son_min parameter\", flag_type=\"--\"\n )\n alg1_thr_low: float = Field(\n 80.0, description=\"PyAlgos algorithm's thr_low parameter\", flag_type=\"--\"\n )\n alg1_thr_high: float = Field(\n 270.0, description=\"PyAlgos algorithm's thr_high parameter\", flag_type=\"--\"\n )\n alg1_rank: int = Field(\n 3, description=\"PyAlgos algorithm's rank parameter\", flag_type=\"--\"\n )\n alg1_radius: int = Field(\n 3, description=\"PyAlgos algorithm's radius parameter\", flag_type=\"--\"\n )\n alg1_dr: int = Field(\n 1, description=\"PyAlgos algorithm's dr parameter\", flag_type=\"--\"\n )\n psanaMask_on: str = Field(\n \"True\", description=\"Whether psana's mask should be used\", flag_type=\"--\"\n )\n psanaMask_calib: str = Field(\n \"True\", description=\"Psana mask's calib parameter\", flag_type=\"--\"\n )\n psanaMask_status: str = Field(\n \"True\", description=\"Psana mask's status 
parameter\", flag_type=\"--\"\n )\n psanaMask_edges: str = Field(\n \"True\", description=\"Psana mask's edges parameter\", flag_type=\"--\"\n )\n psanaMask_central: str = Field(\n \"True\", description=\"Psana mask's central parameter\", flag_type=\"--\"\n )\n psanaMask_unbond: str = Field(\n \"True\", description=\"Psana mask's unbond parameter\", flag_type=\"--\"\n )\n psanaMask_unbondnrs: str = Field(\n \"True\", description=\"Psana mask's unbondnbrs parameter\", flag_type=\"--\"\n )\n mask: str = Field(\n \"\", description=\"Path to an additional mask to apply\", flag_type=\"--\"\n )\n clen: str = Field(\n description=\"Epics variable storing the camera length\", flag_type=\"--\"\n )\n coffset: float = Field(0, description=\"Camera offset in m\", flag_type=\"--\")\n minPeaks: int = Field(\n 15,\n description=\"Minimum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n maxPeaks: int = Field(\n 15,\n description=\"Maximum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n minRes: int = Field(\n 0,\n description=\"Minimum peak resolution to mark frame for indexing \",\n flag_type=\"--\",\n )\n sample: str = Field(\"\", description=\"Sample name\", flag_type=\"--\")\n instrument: Union[None, str] = Field(\n None, description=\"Instrument name\", flag_type=\"--\"\n )\n pixelSize: float = Field(0.0, description=\"Pixel size\", flag_type=\"--\")\n auto: str = Field(\n \"False\",\n description=(\n \"Whether to automatically determine peak per event peak \"\n \"finding parameters\"\n ),\n flag_type=\"--\",\n )\n detectorDistance: float = Field(\n 0.0, description=\"Detector distance from interaction point in m\", flag_type=\"--\"\n )\n access: Literal[\"ana\", \"ffb\"] = Field(\n \"ana\", description=\"Data node type: {ana,ffb}\", flag_type=\"--\"\n )\n szfile: str = Field(\"qoz.json\", description=\"Path to SZ's JSON configuration file\")\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"sz.json\",\n output_path=\"\", # Will want to change where this goes...\n ),\n description=\"Template information for the sz.json file\",\n )\n sz_parameters: SZParameters = Field(\n description=\"Configuration parameters for SZ Compression\", flag_type=\"\"\n )\n\n @validator(\"e\", always=True)\n def validate_e(cls, e: str, values: Dict[str, Any]) -> str:\n if e == \"\":\n return values[\"lute_config\"].experiment\n return e\n\n @validator(\"r\", always=True)\n def validate_r(cls, r: int, values: Dict[str, Any]) -> int:\n if r == -1:\n return values[\"lute_config\"].run\n return r\n\n @validator(\"lute_template_cfg\", always=True)\n def set_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"szfile\"]\n return lute_template_cfg\n\n @validator(\"sz_parameters\", always=True)\n def set_sz_compression_parameters(\n cls, sz_parameters: SZParameters, values: Dict[str, Any]\n ) -> None:\n values[\"compressor\"] = sz_parameters.compressor\n values[\"binSize\"] = sz_parameters.binSize\n values[\"roiWindowSize\"] = sz_parameters.roiWindowSize\n if sz_parameters.compressor == \"qoz\":\n values[\"pressio_opts\"] = {\n \"pressio:abs\": sz_parameters.absError,\n \"qoz\": {\"qoz:stride\": 8},\n }\n else:\n values[\"pressio_opts\"] = {\"pressio:abs\": sz_parameters.absError}\n return None\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = 
values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n directory: str = values[\"outDir\"]\n fname: str = f\"{exp}_{run:04d}.lst\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters","title":"FindPeaksPyAlgosParameters
","text":" Bases: TaskParameters
Parameters for crystallographic (Bragg) peak finding using PyAlgos.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPyAlgosParameters(TaskParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using PyAlgos.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n class SZCompressorParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description='Compression algorithm (\"qoz\" or \"sz3\")'\n )\n abs_error: float = Field(10.0, description=\"Absolute error bound\")\n bin_size: int = Field(2, description=\"Bin size\")\n roi_window_size: int = Field(\n 9,\n description=\"Default window size\",\n )\n\n outdir: str = Field(\n description=\"Output directory for cxi files\",\n )\n n_events: int = Field(\n 0,\n description=\"Number of events to process (0 to process all events)\",\n )\n det_name: str = Field(\n description=\"Psana name of the detector storing the image data\",\n )\n event_receiver: Literal[\"evr0\", \"evr1\"] = Field(\n description=\"Event Receiver to be used: evr0 or evr1\",\n )\n tag: str = Field(\n \"\",\n description=\"Tag to add to the output file names\",\n )\n pv_camera_length: Union[str, float] = Field(\n \"\",\n description=\"PV associated with camera length \"\n \"(if a number, camera length directly)\",\n )\n event_logic: bool = Field(\n False,\n description=\"True if only events with a specific event code should be \"\n \"processed. False if the event code should be ignored\",\n )\n event_code: int = Field(\n 0,\n description=\"Required events code for events to be processed if event logic \"\n \"is True\",\n )\n psana_mask: bool = Field(\n False,\n description=\"If True, apply mask from psana Detector object\",\n )\n mask_file: Union[str, None] = Field(\n None,\n description=\"File with a custom mask to apply. 
If None, no custom mask is \"\n \"applied\",\n )\n min_peaks: int = Field(2, description=\"Minimum number of peaks per image\")\n max_peaks: int = Field(\n 2048,\n description=\"Maximum number of peaks per image\",\n )\n npix_min: int = Field(\n 2,\n description=\"Minimum number of pixels per peak\",\n )\n npix_max: int = Field(\n 30,\n description=\"Maximum number of pixels per peak\",\n )\n amax_thr: float = Field(\n 80.0,\n description=\"Minimum intensity threshold for starting a peak\",\n )\n atot_thr: float = Field(\n 120.0,\n description=\"Minimum summed intensity threshold for pixel collection\",\n )\n son_min: float = Field(\n 7.0,\n description=\"Minimum signal-to-noise ratio to be considered a peak\",\n )\n peak_rank: int = Field(\n 3,\n description=\"Radius in which central peak pixel is a local maximum\",\n )\n r0: float = Field(\n 3.0,\n description=\"Radius of ring for background evaluation in pixels\",\n )\n dr: float = Field(\n 2.0,\n description=\"Width of ring for background evaluation in pixels\",\n )\n nsigm: float = Field(\n 7.0,\n description=\"Intensity threshold to include pixel in connected group\",\n )\n compression: Optional[SZCompressorParameters] = Field(\n None,\n description=\"Options for the SZ Compression Algorithm\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n fname: Path = (\n Path(values[\"outdir\"])\n / f\"{values['lute_config'].experiment}_{values['lute_config'].run}_\"\n f\"{values['tag']}.list\"\n )\n return str(fname)\n return out_file\n
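If out_file is left empty, validate_out_file above builds a default name from outdir, the experiment, the run and the tag. A standalone sketch of that naming logic (the experiment name and paths below are placeholders):
from pathlib import Path

def default_peak_list(outdir: str, experiment: str, run: int, tag: str = "") -> str:
    # Mirrors validate_out_file: "<outdir>/<experiment>_<run>_<tag>.list"
    return str(Path(outdir) / f"{experiment}_{run}_{tag}.list")

print(default_peak_list("/path/to/work_dir", "exp0000", 12))
# -> /path/to/work_dir/exp0000_12_.list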
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters","title":"IndexCrystFELParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's indexamajig
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-indexamajig.html
Source code inlute/io/models/sfx_index.py
class IndexCrystFELParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `indexamajig`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/indexamajig\",\n description=\"CrystFEL's indexing binary.\",\n flag_type=\"\",\n )\n # Basic options\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input file.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n geometry: str = Field(\n \"\", description=\"Path to geometry file.\", flag_type=\"-\", rename_param=\"g\"\n )\n zmq_input: Optional[str] = Field(\n description=\"ZMQ address to receive data over. `input` and `zmq-input` are mutually exclusive\",\n flag_type=\"--\",\n rename_param=\"zmq-input\",\n )\n zmq_subscribe: Optional[str] = Field( # Can be used multiple times...\n description=\"Subscribe to ZMQ message of type `tag`\",\n flag_type=\"--\",\n rename_param=\"zmq-subscribe\",\n )\n zmq_request: Optional[AnyUrl] = Field(\n description=\"Request new data over ZMQ by sending this value\",\n flag_type=\"--\",\n rename_param=\"zmq-request\",\n )\n asapo_endpoint: Optional[str] = Field(\n description=\"ASAP::O endpoint. zmq-input and this are mutually exclusive.\",\n flag_type=\"--\",\n rename_param=\"asapo-endpoint\",\n )\n asapo_token: Optional[str] = Field(\n description=\"ASAP::O authentication token.\",\n flag_type=\"--\",\n rename_param=\"asapo-token\",\n )\n asapo_beamtime: Optional[str] = Field(\n description=\"ASAP::O beatime.\",\n flag_type=\"--\",\n rename_param=\"asapo-beamtime\",\n )\n asapo_source: Optional[str] = Field(\n description=\"ASAP::O data source.\",\n flag_type=\"--\",\n rename_param=\"asapo-source\",\n )\n asapo_group: Optional[str] = Field(\n description=\"ASAP::O consumer group.\",\n flag_type=\"--\",\n rename_param=\"asapo-group\",\n )\n asapo_stream: Optional[str] = Field(\n description=\"ASAP::O stream.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n asapo_wait_for_stream: Optional[str] = Field(\n description=\"If ASAP::O stream does not exist, wait for it to appear.\",\n flag_type=\"--\",\n rename_param=\"asapo-wait-for-stream\",\n )\n data_format: Optional[str] = Field(\n description=\"Specify format for ZMQ or ASAP::O. `msgpack`, `hdf5` or `seedee`.\",\n flag_type=\"--\",\n rename_param=\"data-format\",\n )\n basename: bool = Field(\n False,\n description=\"Remove directory parts of filenames. Acts before prefix if prefix also given.\",\n flag_type=\"--\",\n )\n prefix: Optional[str] = Field(\n description=\"Add a prefix to the filenames from the infile argument.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n nthreads: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of threads to use. 
See also `max_indexer_threads`.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n no_check_prefix: bool = Field(\n False,\n description=\"Don't attempt to correct the prefix if it seems incorrect.\",\n flag_type=\"--\",\n rename_param=\"no-check-prefix\",\n )\n highres: Optional[float] = Field(\n description=\"Mark all pixels greater than `x` has bad.\", flag_type=\"--\"\n )\n profile: bool = Field(\n False, description=\"Display timing data to monitor performance.\", flag_type=\"--\"\n )\n temp_dir: Optional[str] = Field(\n description=\"Specify a path for the temp files folder.\",\n flag_type=\"--\",\n rename_param=\"temp-dir\",\n )\n wait_for_file: conint(gt=-2) = Field(\n 0,\n description=\"Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\",\n flag_type=\"--\",\n rename_param=\"wait-for-file\",\n )\n no_image_data: bool = Field(\n False,\n description=\"Load only the metadata, no iamges. Can check indexability without high data requirements.\",\n flag_type=\"--\",\n rename_param=\"no-image-data\",\n )\n # Peak-finding options\n # ....\n # Indexing options\n indexing: Optional[str] = Field(\n description=\"Comma-separated list of supported indexing algorithms to use. Default is to automatically detect.\",\n flag_type=\"--\",\n )\n cell_file: Optional[str] = Field(\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n tolerance: str = Field(\n \"5,5,5,1.5\",\n description=(\n \"Tolerances (in percent) for unit cell comparison. \"\n \"Comma-separated list a,b,c,angle. Default=5,5,5,1.5\"\n ),\n flag_type=\"--\",\n )\n no_check_cell: bool = Field(\n False,\n description=\"Do not check cell parameters against unit cell. Replaces '-raw' method.\",\n flag_type=\"--\",\n rename_param=\"no-check-cell\",\n )\n no_check_peaks: bool = Field(\n False,\n description=\"Do not verify peaks are accounted for by solution.\",\n flag_type=\"--\",\n rename_param=\"no-check-peaks\",\n )\n multi: bool = Field(\n False, description=\"Enable multi-lattice indexing.\", flag_type=\"--\"\n )\n wavelength_estimate: Optional[float] = Field(\n description=\"Estimate for X-ray wavelength. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"wavelength-estimate\",\n )\n camera_length_estimate: Optional[float] = Field(\n description=\"Estimate for camera distance. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"camera-length-estimate\",\n )\n max_indexer_threads: Optional[PositiveInt] = Field(\n # 1,\n description=\"Some indexing algos can use multiple threads. 
In addition to image-based.\",\n flag_type=\"--\",\n rename_param=\"max-indexer-threads\",\n )\n no_retry: bool = Field(\n False,\n description=\"Do not remove weak peaks and try again.\",\n flag_type=\"--\",\n rename_param=\"no-retry\",\n )\n no_refine: bool = Field(\n False,\n description=\"Skip refinement step.\",\n flag_type=\"--\",\n rename_param=\"no-refine\",\n )\n no_revalidate: bool = Field(\n False,\n description=\"Skip revalidation step.\",\n flag_type=\"--\",\n rename_param=\"no-revalidate\",\n )\n # TakeTwo specific parameters\n taketwo_member_threshold: Optional[PositiveInt] = Field(\n # 20,\n description=\"Minimum number of vectors to consider.\",\n flag_type=\"--\",\n rename_param=\"taketwo-member-threshold\",\n )\n taketwo_len_tolerance: Optional[PositiveFloat] = Field(\n # 0.001,\n description=\"TakeTwo length tolerance in Angstroms.\",\n flag_type=\"--\",\n rename_param=\"taketwo-len-tolerance\",\n )\n taketwo_angle_tolerance: Optional[PositiveFloat] = Field(\n # 0.6,\n description=\"TakeTwo angle tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-angle-tolerance\",\n )\n taketwo_trace_tolerance: Optional[PositiveFloat] = Field(\n # 3,\n description=\"Matrix trace tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-trace-tolerance\",\n )\n # Felix-specific parameters\n # felix_domega\n # felix-fraction-max-visits\n # felix-max-internal-angle\n # felix-max-uniqueness\n # felix-min-completeness\n # felix-min-visits\n # felix-num-voxels\n # felix-sigma\n # felix-tthrange-max\n # felix-tthrange-min\n # XGANDALF-specific parameters\n xgandalf_sampling_pitch: Optional[NonNegativeInt] = Field(\n # 6,\n description=\"Density of reciprocal space sampling.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-sampling-pitch\",\n )\n xgandalf_grad_desc_iterations: Optional[NonNegativeInt] = Field(\n # 4,\n description=\"Number of gradient descent iterations.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-grad-desc-iterations\",\n )\n xgandalf_tolerance: Optional[PositiveFloat] = Field(\n # 0.02,\n description=\"Relative tolerance of lattice vectors\",\n flag_type=\"--\",\n rename_param=\"xgandalf-tolerance\",\n )\n xgandalf_no_deviation_from_provided_cell: Optional[bool] = Field(\n description=\"Found unit cell must match provided.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-no-deviation-from-provided-cell\",\n )\n xgandalf_min_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 30,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-min-lattice-vector-length\",\n )\n xgandalf_max_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 250,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-lattice-vector-length\",\n )\n xgandalf_max_peaks: Optional[PositiveInt] = Field(\n # 250,\n description=\"Maximum number of peaks to use for indexing.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-peaks\",\n )\n xgandalf_fast_execution: bool = Field(\n False,\n description=\"Shortcut to set sampling-pitch=2, and grad-desc-iterations=3.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-fast-execution\",\n )\n # pinkIndexer parameters\n # ...\n # asdf_fast: bool = Field(False, description=\"Enable fast mode for asdf. 
3x faster for 7% loss in accuracy.\", flag_type=\"--\", rename_param=\"asdf-fast\")\n # Integration parameters\n integration: str = Field(\n \"rings-nocen\", description=\"Method for integrating reflections.\", flag_type=\"--\"\n )\n fix_profile_radius: Optional[float] = Field(\n description=\"Fix the profile radius (m^{-1})\",\n flag_type=\"--\",\n rename_param=\"fix-profile-radius\",\n )\n fix_divergence: Optional[float] = Field(\n 0,\n description=\"Fix the divergence (rad, full angle).\",\n flag_type=\"--\",\n rename_param=\"fix-divergence\",\n )\n int_radius: str = Field(\n \"4,5,7\",\n description=\"Inner, middle, and outer radii for 3-ring integration.\",\n flag_type=\"--\",\n rename_param=\"int-radius\",\n )\n int_diag: str = Field(\n \"none\",\n description=\"Show detailed information on integration when condition is met.\",\n flag_type=\"--\",\n rename_param=\"int-diag\",\n )\n push_res: str = Field(\n \"infinity\",\n description=\"Integrate `x` higher than apparent resolution limit (nm-1).\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n overpredict: bool = Field(\n False,\n description=\"Over-predict reflections. Maybe useful with post-refinement.\",\n flag_type=\"--\",\n )\n cell_parameters_only: bool = Field(\n False, description=\"Do not predict refletions at all\", flag_type=\"--\"\n )\n # Output parameters\n no_non_hits_in_stream: bool = Field(\n False,\n description=\"Exclude non-hits from the stream file.\",\n flag_type=\"--\",\n rename_param=\"no-non-hits-in-stream\",\n )\n copy_hheader: Optional[str] = Field(\n description=\"Copy information from header in the image to output stream.\",\n flag_type=\"--\",\n rename_param=\"copy-hheader\",\n )\n no_peaks_in_stream: bool = Field(\n False,\n description=\"Do not record peaks in stream file.\",\n flag_type=\"--\",\n rename_param=\"no-peaks-in-stream\",\n )\n no_refls_in_stream: bool = Field(\n False,\n description=\"Do not record reflections in stream.\",\n flag_type=\"--\",\n rename_param=\"no-refls-in-stream\",\n )\n serial_offset: Optional[PositiveInt] = Field(\n description=\"Start numbering at `x` instead of 1.\",\n flag_type=\"--\",\n rename_param=\"serial-offset\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n filename: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPyAlgos\", \"out_file\"\n )\n if filename is None:\n exp: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n tag: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"tag\"\n )\n out_dir: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"outDir\"\n )\n if out_dir is not None:\n fname: str = f\"{out_dir}/{exp}_{run:04d}\"\n if tag is not None:\n fname = f\"{fname}_{tag}\"\n return f\"{fname}.lst\"\n else:\n return filename\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n expmt: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n work_dir: str = values[\"lute_config\"].work_dir\n fname: str = f\"{expmt}_r{run:04d}.stream\"\n return f\"{work_dir}/{fname}\"\n return out_file\n
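As a rough illustration of how flag_type, rename_param and Config.long_flags_use_eq shape the generated command line (this is not LUTE's actual argument builder, only a sketch of the convention the fields above describe; file names are placeholders):
def render_arg(flag_type: str, name: str, value, use_eq: bool = False) -> str:
    # `name` is the field name after any rename_param substitution.
    flag = f"{flag_type}{name}"
    return f"{flag}={value}" if use_eq else f"{flag} {value}"

# long_flags_use_eq = True for this model, so long options use "=":
print(render_arg("--", "indexing", "xgandalf", use_eq=True))  # --indexing=xgandalf
# in_file is renamed to "i" via rename_param and passed as a short flag:
print(render_arg("-", "i", "peaks.lst"))                      # -i peaks.lst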
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters","title":"ManipulateHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's get_hkl
for manipulating lists of reflections.
This Task is predominantly used internally to convert hkl
to mtz
files. Note that performing multiple manipulations is undefined behaviour. Run the Task with multiple configurations in explicit separate steps. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
lute/io/models/sfx_merge.py
class ManipulateHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `get_hkl` for manipulating lists of reflections.\n\n This Task is predominantly used internally to convert `hkl` to `mtz` files.\n Note that performing multiple manipulations is undefined behaviour. Run\n the Task with multiple configurations in explicit separate steps. For more\n information on usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/get_hkl\",\n description=\"CrystFEL's reflection manipulation binary.\",\n flag_type=\"\",\n )\n in_file: str = Field(\n \"\",\n description=\"Path to input HKL file.\",\n flag_type=\"-\",\n rename_param=\"i\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n output_format: str = Field(\n \"mtz\",\n description=\"Output format. One of mtz, mtz-bij, or xds. Otherwise CrystFEL format.\",\n flag_type=\"--\",\n rename_param=\"output-format\",\n )\n expand: Optional[str] = Field(\n description=\"Reflections will be expanded to fill asymmetric unit of specified point group.\",\n flag_type=\"--\",\n )\n # Reducing reflections to higher symmetry\n twin: Optional[str] = Field(\n description=\"Reflections equivalent to specified point group will have intensities summed.\",\n flag_type=\"--\",\n )\n no_need_all_parts: Optional[bool] = Field(\n description=\"Use with --twin to allow reflections missing a 'twin mate' to be written out.\",\n flag_type=\"--\",\n rename_param=\"no-need-all-parts\",\n )\n # Noise - Add to data\n noise: Optional[bool] = Field(\n description=\"Generate 10% uniform noise.\", flag_type=\"--\"\n )\n poisson: Optional[bool] = Field(\n description=\"Generate Poisson noise. Intensities assumed to be A.U.\",\n flag_type=\"--\",\n )\n adu_per_photon: Optional[int] = Field(\n description=\"Use with --poisson to convert A.U. to photons.\",\n flag_type=\"--\",\n rename_param=\"adu-per-photon\",\n )\n # Remove duplicate reflections\n trim_centrics: Optional[bool] = Field(\n description=\"Duplicated reflections (according to symmetry) are removed.\",\n flag_type=\"--\",\n )\n # Restrict to template file\n template: Optional[str] = Field(\n description=\"Only reflections which also appear in specified file are written out.\",\n flag_type=\"--\",\n )\n # Multiplicity\n multiplicity: Optional[bool] = Field(\n description=\"Reflections are multiplied by their symmetric multiplicites.\",\n flag_type=\"--\",\n )\n # Resolution cutoffs\n cutoff_angstroms: Optional[Union[str, int, float]] = Field(\n description=\"Either n, or n1,n2,n3. For n, reflections < n are removed. 
For n1,n2,n3 anisotropic truncation performed at separate resolution limits for a*, b*, c*.\",\n        flag_type=\"--\",\n        rename_param=\"cutoff-angstroms\",\n    )\n    lowres: Optional[float] = Field(\n        description=\"Remove reflections with d > n\", flag_type=\"--\"\n    )\n    highres: Optional[float] = Field(\n        description=\"Synonym for first form of --cutoff-angstroms\"\n    )\n    reindex: Optional[str] = Field(\n        description=\"Reindex according to specified operator. E.g. k,h,-l.\",\n        flag_type=\"--\",\n    )\n    # Override input symmetry\n    symmetry: Optional[str] = Field(\n        description=\"Point group symmetry to use to override. Almost always OMIT this option.\",\n        flag_type=\"--\",\n    )\n\n    @validator(\"in_file\", always=True)\n    def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n        if in_file == \"\":\n            partialator_file: Optional[str] = read_latest_db_entry(\n                f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n            )\n            if partialator_file:\n                return partialator_file\n        return in_file\n\n    @validator(\"out_file\", always=True)\n    def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n        if out_file == \"\":\n            partialator_file: Optional[str] = read_latest_db_entry(\n                f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n            )\n            if partialator_file:\n                mtz_out: str = partialator_file.split(\".\")[0]\n                mtz_out = f\"{mtz_out}.mtz\"\n                return mtz_out\n        return out_file\n\n    @validator(\"cell_file\", always=True)\n    def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n        if cell_file == \"\":\n            idx_cell_file: Optional[str] = read_latest_db_entry(\n                f\"{values['lute_config'].work_dir}\",\n                \"IndexCrystFEL\",\n                \"cell_file\",\n                valid_only=False,\n            )\n            if idx_cell_file:\n                return idx_cell_file\n        return cell_file\n
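When out_file is empty, validate_out_file above reuses the MergePartialator output name and swaps its extension for .mtz. A standalone sketch of that derivation (the path is a placeholder):
def default_mtz_name(partialator_file: str) -> str:
    # Mirrors validate_out_file: keep everything before the first "." and append ".mtz"
    stem = partialator_file.split(".")[0]
    return f"{stem}.mtz"

print(default_mtz_name("/path/to/work_dir/exp0000_r0012.hkl"))
# -> /path/to/work_dir/exp0000_r0012.mtz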
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.MergePartialatorParameters","title":"MergePartialatorParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's partialator
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code inlute/io/models/sfx_merge.py
class MergePartialatorParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `partialator`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/partialator\",\n description=\"CrystFEL's Partialator binary.\",\n flag_type=\"\",\n )\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input stream.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n symmetry: str = Field(description=\"Point group symmetry.\", flag_type=\"--\")\n niter: Optional[int] = Field(\n description=\"Number of cycles of scaling and post-refinement.\",\n flag_type=\"-\",\n rename_param=\"n\",\n )\n no_scale: Optional[bool] = Field(\n description=\"Disable scaling.\", flag_type=\"--\", rename_param=\"no-scale\"\n )\n no_Bscale: Optional[bool] = Field(\n description=\"Disable Debye-Waller part of scaling.\",\n flag_type=\"--\",\n rename_param=\"no-Bscale\",\n )\n no_pr: Optional[bool] = Field(\n description=\"Disable orientation model.\", flag_type=\"--\", rename_param=\"no-pr\"\n )\n no_deltacchalf: Optional[bool] = Field(\n description=\"Disable rejection based on deltaCC1/2.\",\n flag_type=\"--\",\n rename_param=\"no-deltacchalf\",\n )\n model: str = Field(\n \"unity\",\n description=\"Partiality model. Options: xsphere, unity, offset, ggpm.\",\n flag_type=\"--\",\n )\n nthreads: int = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of parallel analyses.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n polarisation: Optional[str] = Field(\n description=\"Specification of incident polarisation. Refer to CrystFEL docs for more info.\",\n flag_type=\"--\",\n )\n no_polarisation: Optional[bool] = Field(\n description=\"Synonym for --polarisation=none\",\n flag_type=\"--\",\n rename_param=\"no-polarisation\",\n )\n max_adu: Optional[float] = Field(\n description=\"Maximum intensity of reflection to include.\",\n flag_type=\"--\",\n rename_param=\"max-adu\",\n )\n min_res: Optional[float] = Field(\n description=\"Only include crystals diffracting to a minimum resolution.\",\n flag_type=\"--\",\n rename_param=\"min-res\",\n )\n min_measurements: int = Field(\n 2,\n description=\"Include a reflection only if it appears a minimum number of times.\",\n flag_type=\"--\",\n rename_param=\"min-measurements\",\n )\n push_res: Optional[float] = Field(\n description=\"Merge reflections up to higher than the apparent resolution limit.\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n start_after: int = Field(\n 0,\n description=\"Ignore the first n crystals.\",\n flag_type=\"--\",\n rename_param=\"start-after\",\n )\n stop_after: int = Field(\n 0,\n description=\"Stop after processing n crystals. 0 means process all.\",\n flag_type=\"--\",\n rename_param=\"stop-after\",\n )\n no_free: Optional[bool] = Field(\n description=\"Disable cross-validation. 
Testing ONLY.\",\n flag_type=\"--\",\n rename_param=\"no-free\",\n )\n custom_split: Optional[str] = Field(\n description=\"Read a set of filenames, event and dataset IDs from a filename.\",\n flag_type=\"--\",\n rename_param=\"custom-split\",\n )\n max_rel_B: float = Field(\n 100,\n description=\"Reject crystals if |relB| > n sq Angstroms.\",\n flag_type=\"--\",\n rename_param=\"max-rel-B\",\n )\n output_every_cycle: bool = Field(\n False,\n description=\"Write per-crystal params after every refinement cycle.\",\n flag_type=\"--\",\n rename_param=\"output-every-cycle\",\n )\n no_logs: bool = Field(\n False,\n description=\"Do not write logs needed for plots, maps and graphs.\",\n flag_type=\"--\",\n rename_param=\"no-logs\",\n )\n set_symmetry: Optional[str] = Field(\n description=\"Set the apparent symmetry of the crystals to a point group.\",\n flag_type=\"-\",\n rename_param=\"w\",\n )\n operator: Optional[str] = Field(\n description=\"Specify an ambiguity operator. E.g. k,h,-l.\", flag_type=\"--\"\n )\n force_bandwidth: Optional[float] = Field(\n description=\"Set X-ray bandwidth. As percent, e.g. 0.0013 (0.13%).\",\n flag_type=\"--\",\n rename_param=\"force-bandwidth\",\n )\n force_radius: Optional[float] = Field(\n description=\"Set the initial profile radius (nm-1).\",\n flag_type=\"--\",\n rename_param=\"force-radius\",\n )\n force_lambda: Optional[float] = Field(\n description=\"Set the wavelength. In Angstroms.\",\n flag_type=\"--\",\n rename_param=\"force-lambda\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"ConcatenateStreamFiles\",\n \"out_file\",\n )\n if stream_file:\n return stream_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n in_file: str = values[\"in_file\"]\n if in_file:\n tag: str = in_file.split(\".\")[0]\n return f\"{tag}.hkl\"\n else:\n return \"partialator.hkl\"\n return out_file\n
"},{"location":"source/io/config/#io.config.MergePartialatorParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.MergePartialatorParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.RunSHELXCParameters","title":"RunSHELXCParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's SHELXC program.
SHELXC prepares files for SHELXD and SHELXE.
For more information please refer to the official documentation: https://www.ccp4.ac.uk/html/crank.html
Source code inlute/io/models/sfx_solve.py
class RunSHELXCParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's SHELXC program.\n\n SHELXC prepares files for SHELXD and SHELXE.\n\n For more information please refer to the official documentation:\n https://www.ccp4.ac.uk/html/crank.html\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/shelxc\",\n description=\"CCP4 SHELXC. Generates input files for SHELXD/SHELXE.\",\n flag_type=\"\",\n )\n placeholder: str = Field(\n \"xx\", description=\"Placeholder filename stem.\", flag_type=\"\"\n )\n in_file: str = Field(\n \"\",\n description=\"Input file for SHELXC with reflections AND proper records.\",\n flag_type=\"\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n # get_hkl needed to be run to produce an XDS format file...\n xds_format_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if xds_format_file:\n in_file = xds_format_file\n if in_file[0] != \"<\":\n # Need to add a redirection for this program\n # Runs like `shelxc xx <input_file.xds`\n in_file = f\"<{in_file}\"\n return in_file\n
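SHELXC reads its reflection file from standard input, which is why validate_in_file above prefixes the resolved path with "<". A sketch of the command form this produces (the input path is a placeholder):
def shelxc_command(executable: str, placeholder: str, in_file: str) -> str:
    # Mirrors validate_in_file: run as `shelxc xx <input_file.xds`
    if not in_file.startswith("<"):
        in_file = f"<{in_file}"
    return f"{executable} {placeholder} {in_file}"

print(shelxc_command("shelxc", "xx", "/path/to/work_dir/exp0000_r0012.xds"))
# -> shelxc xx </path/to/work_dir/exp0000_r0012.xds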
"},{"location":"source/io/config/#io.config.SubmitSMDParameters","title":"SubmitSMDParameters
","text":" Bases: ThirdPartyParameters
Parameters for running smalldata to produce reduced HDF5 files.
Source code inlute/io/models/smd.py
class SubmitSMDParameters(ThirdPartyParameters):\n \"\"\"Parameters for running smalldata to produce reduced HDF5 files.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n m: str = Field(\n \"mpi4py.run\",\n description=\"Python option to execute a module's contents as __main__ module.\",\n flag_type=\"-\",\n )\n producer: str = Field(\n \"\", description=\"Path to the SmallData producer Python script.\", flag_type=\"\"\n )\n run: str = Field(\n os.environ.get(\"RUN_NUM\", \"\"), description=\"DAQ Run Number.\", flag_type=\"--\"\n )\n experiment: str = Field(\n os.environ.get(\"EXPERIMENT\", \"\"),\n description=\"LCLS Experiment Number.\",\n flag_type=\"--\",\n )\n stn: NonNegativeInt = Field(0, description=\"Hutch endstation.\", flag_type=\"--\")\n nevents: int = Field(\n int(1e9), description=\"Number of events to process.\", flag_type=\"--\"\n )\n directory: Optional[str] = Field(\n None,\n description=\"Optional output directory. If None, will be in ${EXP_FOLDER}/hdf5/smalldata.\",\n flag_type=\"--\",\n )\n ## Need mechanism to set result_from_param=True ...\n gather_interval: PositiveInt = Field(\n 25, description=\"Number of events to collect at a time.\", flag_type=\"--\"\n )\n norecorder: bool = Field(\n False, description=\"Whether to ignore recorder streams.\", flag_type=\"--\"\n )\n url: HttpUrl = Field(\n \"https://pswww.slac.stanford.edu/ws-auth/lgbk\",\n description=\"Base URL for eLog posting.\",\n flag_type=\"--\",\n )\n epicsAll: bool = Field(\n False,\n description=\"Whether to store all EPICS PVs. Use with care.\",\n flag_type=\"--\",\n )\n full: bool = Field(\n False,\n description=\"Whether to store all data. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n fullSum: bool = Field(\n False,\n description=\"Whether to store sums for all area detector images.\",\n flag_type=\"--\",\n )\n default: bool = Field(\n False,\n description=\"Whether to store only the default minimal set of data.\",\n flag_type=\"--\",\n )\n image: bool = Field(\n False,\n description=\"Whether to save everything as images. Use with care.\",\n flag_type=\"--\",\n )\n tiff: bool = Field(\n False,\n description=\"Whether to save all images as a single TIFF. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n centerpix: bool = Field(\n False,\n description=\"Whether to mask center pixels for Epix10k2M detectors.\",\n flag_type=\"--\",\n )\n postRuntable: bool = Field(\n False,\n description=\"Whether to post run tables. 
Also used as a trigger for summary jobs.\",\n flag_type=\"--\",\n )\n wait: bool = Field(\n False, description=\"Whether to wait for a file to appear.\", flag_type=\"--\"\n )\n xtcav: bool = Field(\n False,\n description=\"Whether to add XTCAV processing to the HDF5 generation.\",\n flag_type=\"--\",\n )\n noarch: bool = Field(\n False, description=\"Whether to not use archiver data.\", flag_type=\"--\"\n )\n\n lute_template_cfg: TemplateConfig = TemplateConfig(template_name=\"\", output_path=\"\")\n\n @validator(\"producer\", always=True)\n def validate_producer_path(cls, producer: str) -> str:\n return producer\n\n @validator(\"lute_template_cfg\", always=True)\n def use_producer(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if not lute_template_cfg.output_path:\n lute_template_cfg.output_path = values[\"producer\"]\n return lute_template_cfg\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = values[\"lute_config\"].experiment\n hutch: str = exp[:3]\n run: int = int(values[\"lute_config\"].run)\n directory: Optional[str] = values[\"directory\"]\n if directory is None:\n directory = f\"/sdf/data/lcls/ds/{hutch}/{exp}/hdf5/smalldata\"\n fname: str = f\"{exp}_Run{run:04d}.h5\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
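define_result above derives the result HDF5 path from the experiment name (whose first three characters give the hutch), the run number, and the optional directory override. A standalone sketch of the same logic (the experiment and run are placeholders):
from typing import Optional

def smd_result_path(experiment: str, run: int, directory: Optional[str] = None) -> str:
    # Mirrors define_result: default to the experiment's hdf5/smalldata folder.
    hutch: str = experiment[:3]
    if directory is None:
        directory = f"/sdf/data/lcls/ds/{hutch}/{experiment}/hdf5/smalldata"
    return f"{directory}/{experiment}_Run{run:04d}.h5"

print(smd_result_path("mfxl1234567", 12))
# -> /sdf/data/lcls/ds/mfx/mfxl1234567/hdf5/smalldata/mfxl1234567_Run0012.h5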
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config","title":"Config
","text":" Bases: Config
Identical to super-class Config but includes a result.
Source code inlute/io/models/smd.py
class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.TaskParameters","title":"TaskParameters
","text":" Bases: BaseSettings
Base class for models of task parameters to be validated.
Parameters are read from a configuration YAML file and validated against subclasses of this type in order to ensure both that all parameters are present and that they are of the correct type.
NotePydantic is used for data validation. Pydantic does not perform \"strict\" validation by default. Parameter values may be cast to conform with the model specified by the subclass definition if it is possible to do so. Consider whether this may cause issues (e.g. if a float is cast to an int).
Source code inlute/io/models/base.py
class TaskParameters(BaseSettings):\n \"\"\"Base class for models of task parameters to be validated.\n\n Parameters are read from a configuration YAML file and validated against\n subclasses of this type in order to ensure that both all parameters are\n present, and that the parameters are of the correct type.\n\n Note:\n Pydantic is used for data validation. Pydantic does not perform \"strict\"\n validation by default. Parameter values may be cast to conform with the\n model specified by the subclass definition if it is possible to do so.\n Consider whether this may cause issues (e.g. if a float is cast to an\n int).\n \"\"\"\n\n class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n\n lute_config: AnalysisHeader\n
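As a minimal, hypothetical sketch of how a new first-party Task's parameters might be declared (the class and field names are made up; only the set_result / is_result mechanism comes from the documentation above):
from pydantic import Field  # pydantic v1 API, as used throughout these models

from lute.io.models.base import TaskParameters


class MyAnalysisParameters(TaskParameters):
    """Parameters for a hypothetical first-party Task."""

    class Config(TaskParameters.Config):
        set_result: bool = True  # Tell the Executor that a result is defined below.

    in_file: str = Field("", description="Path to an input file.")
    out_file: str = Field("", description="Path to the output file.", is_result=True)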
"},{"location":"source/io/config/#io.config.TaskParameters.Config","title":"Config
","text":"Configuration for parameters model.
The Config class holds Pydantic configuration. A number of LUTE-specific configuration has also been placed here.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc. Only used if set_result==True
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however. Only used if set_result==True
lute/io/models/base.py
class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n
"},{"location":"source/io/config/#io.config.TaskParameters.Config.impl_schemas","title":"impl_schemas: Optional[str] = None
class-attribute
instance-attribute
","text":"Schema specification for output result. Will be passed to TaskResult.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.result_from_params","title":"result_from_params: Optional[str] = None
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.result_summary","title":"result_summary: Optional[str] = None
class-attribute
instance-attribute
","text":"Format a TaskResult.summary from output.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.run_directory","title":"run_directory: Optional[str] = None
class-attribute
instance-attribute
","text":"Set the directory that the Task is run from.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.set_result","title":"set_result: bool = False
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.TemplateConfig","title":"TemplateConfig
","text":" Bases: BaseModel
Parameters used for templating of third party configuration files.
Attributes:
Name Type Descriptiontemplate_name
str
The name of the template to use. This template must live in config/templates
.
output_path
str
The FULL path, including filename to write the rendered template to.
Source code inlute/io/models/base.py
class TemplateConfig(BaseModel):\n \"\"\"Parameters used for templating of third party configuration files.\n\n Attributes:\n template_name (str): The name of the template to use. This template must\n live in `config/templates`.\n\n output_path (str): The FULL path, including filename to write the\n rendered template to.\n \"\"\"\n\n template_name: str\n output_path: str\n
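TemplateConfig only records which template to render and where the rendered file should be written; the substitution itself is done with Jinja2. A rough, self-contained sketch of the idea (not LUTE's internal rendering code; the template text, values and output path are invented):
from jinja2 import Template

template_text: str = '{ "compressor": "{{ compressor }}", "pressio:abs": {{ abs_error }} }'
rendered: str = Template(template_text).render(compressor="qoz", abs_error=10.0)

# output_path in TemplateConfig is the FULL path for the rendered file.
with open("/tmp/sz.json", "w") as f:
    f.write(rendered)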
"},{"location":"source/io/config/#io.config.TemplateParameters","title":"TemplateParameters
","text":"Class for representing parameters for third party configuration files.
These parameters can represent arbitrary data types and are used in conjunction with templates for modifying third party configuration files from the single LUTE YAML. Due to the storage of arbitrary data types, and the use of a template file, a single instance of this class can hold from a single template variable to an entire configuration file. The data parsing is done by jinja using the complementary template. All data is stored in the single model variable params.
The pydantic \"dataclass\" is used over the BaseModel/Settings to allow positional argument instantiation of the params
Field.
lute/io/models/base.py
@dataclass\nclass TemplateParameters:\n \"\"\"Class for representing parameters for third party configuration files.\n\n These parameters can represent arbitrary data types and are used in\n conjunction with templates for modifying third party configuration files\n from the single LUTE YAML. Due to the storage of arbitrary data types, and\n the use of a template file, a single instance of this class can hold from a\n single template variable to an entire configuration file. The data parsing\n is done by jinja using the complementary template.\n All data is stored in the single model variable `params.`\n\n The pydantic \"dataclass\" is used over the BaseModel/Settings to allow\n positional argument instantiation of the `params` Field.\n \"\"\"\n\n params: Any\n
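Because TemplateParameters is a pydantic dataclass, it can be constructed positionally with whatever a template expects, from a single value to a nested structure (the values below are illustrative):
from lute.io.models.base import TemplateParameters

single = TemplateParameters(0.05)                        # one template variable
nested = TemplateParameters({"qoz": {"qoz:stride": 8}})  # a whole config fragment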
"},{"location":"source/io/config/#io.config.TestBinaryErrParameters","title":"TestBinaryErrParameters
","text":" Bases: ThirdPartyParameters
Same as TestBinary, but exits with non-zero code.
Source code inlute/io/models/tests.py
class TestBinaryErrParameters(ThirdPartyParameters):\n    \"\"\"Same as TestBinary, but exits with non-zero code.\"\"\"\n\n    executable: str = Field(\n        \"/sdf/home/d/dorlhiac/test_tasks/test_threads_err\",\n        description=\"Multi-threaded test binary with non-zero exit code.\",\n    )\n    p_arg1: int = Field(1, description=\"Number of threads.\")\n
"},{"location":"source/io/config/#io.config.TestMultiNodeCommunicationParameters","title":"TestMultiNodeCommunicationParameters
","text":" Bases: TaskParameters
Parameters for the test Task TestMultiNodeCommunication
.
Test verifies communication across multiple machines.
Source code inlute/io/models/mpi_tests.py
class TestMultiNodeCommunicationParameters(TaskParameters):\n \"\"\"Parameters for the test Task `TestMultiNodeCommunication`.\n\n Test verifies communication across multiple machines.\n \"\"\"\n\n send_obj: Literal[\"plot\", \"array\"] = Field(\n \"array\", description=\"Object to send to Executor. `plot` or `array`\"\n )\n arr_size: Optional[int] = Field(\n None, description=\"Size of array to send back to Executor.\"\n )\n
"},{"location":"source/io/config/#io.config.TestParameters","title":"TestParameters
","text":" Bases: TaskParameters
Parameters for the test Task Test
.
lute/io/models/tests.py
class TestParameters(TaskParameters):\n \"\"\"Parameters for the test Task `Test`.\"\"\"\n\n float_var: float = Field(0.01, description=\"A floating point number.\")\n str_var: str = Field(\"test\", description=\"A string.\")\n\n class CompoundVar(BaseModel):\n int_var: int = 1\n dict_var: Dict[str, str] = {\"a\": \"b\"}\n\n compound_var: CompoundVar = Field(\n description=(\n \"A compound parameter - consists of a `int_var` (int) and `dict_var`\"\n \" (Dict[str, str]).\"\n )\n )\n throw_error: bool = Field(\n False, description=\"If `True`, raise an exception to test error handling.\"\n )\n
"},{"location":"source/io/config/#io.config.ThirdPartyParameters","title":"ThirdPartyParameters
","text":" Bases: TaskParameters
Base class for third party task parameters.
Contains special validators for extra arguments and handling of parameters used for filling in third party configuration files.
Source code inlute/io/models/base.py
class ThirdPartyParameters(TaskParameters):\n \"\"\"Base class for third party task parameters.\n\n Contains special validators for extra arguments and handling of parameters\n used for filling in third party configuration files.\n \"\"\"\n\n class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n # lute_template_cfg: TemplateConfig\n\n @root_validator(pre=False)\n def extra_fields_to_thirdparty(cls, values: Dict[str, Any]):\n for key in values:\n if key not in cls.__fields__:\n values[key] = TemplateParameters(values[key])\n\n return values\n
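A minimal, hypothetical sketch of wrapping a third-party binary (the tool, its path and the fields are made up). Any additional keys supplied in the YAML that are not declared as fields are caught by extra_fields_to_thirdparty above and wrapped in TemplateParameters for use in a configuration-file template:
from pydantic import Field

from lute.io.models.base import ThirdPartyParameters


class MyToolParameters(ThirdPartyParameters):
    """Parameters for a hypothetical third-party `mytool` binary."""

    executable: str = Field("/path/to/mytool", description="Third-party binary.", flag_type="")
    in_file: str = Field("", description="Input file.", flag_type="-", rename_param="i")
    out_file: str = Field(
        "", description="Output file.", flag_type="-", rename_param="o", is_result=True
    )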
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config","title":"Config
","text":" Bases: Config
Configuration for parameters model.
The Config class holds Pydantic configuration and inherited configuration from the base TaskParameters.Config
class. A number of values are also overridden, and there are some configuration options specific to ThirdPartyParameters. A full list of options (with the TaskParameters options repeated) is described below.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc.
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however.
set_result
bool
True. If True, the model has information about setting the TaskResult object from the parameters it contains. E.g. it has an output parameter which is marked as the result. The result can be set with a field value of is_result=True on a specific parameter, or using result_from_params and a validator.
impl_schemas
Optional[str]
None. Specifies the schemas the output/results conform to. Only used if set_result is True.
extra
str
\"allow\". Pydantic configuration. Allow (or ignore) extra arguments.
short_flags_use_eq
bool
False. If True, \"short\" command-line args are passed as -x=arg
. ThirdPartyTask-specific.
long_flags_use_eq
bool
False. If True, \"long\" command-line args are passed as --long=arg
. ThirdPartyTask-specific.
lute/io/models/base.py
class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config.short_flags_use_eq","title":"short_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether short command-line arguments are passed like -x=arg
.
parse_config(task_name='test', config_path='')
","text":"Parse a configuration file and validate the contents.
Parameters:
Name Type Description Defaulttask_name
str
Name of the specific task that will be run.
'test'
config_path
str
Path to the configuration file.
''
Returns:
Name Type Descriptionparams
TaskParameters
A TaskParameters object of validated task-specific parameters. Parameters are accessed with \"dot\" notation. E.g. params.param1
.
Raises:
Type DescriptionValidationError
Raised if there are problems with the configuration file. Passed through from Pydantic.
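A minimal usage sketch (the Task name and path are illustrative):

from lute.io.config import parse_config

# Validate the parameters for a managed Task named "Tester" against its model.
params = parse_config(task_name="Tester", config_path="/path/to/config.yaml")
print(params.lute_config.experiment)  # shared header values live on lute_config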
Source code in lute/io/config.py
def parse_config(task_name: str = \"test\", config_path: str = \"\") -> TaskParameters:\n \"\"\"Parse a configuration file and validate the contents.\n\n Args:\n task_name (str): Name of the specific task that will be run.\n\n config_path (str): Path to the configuration file.\n\n Returns:\n params (TaskParameters): A TaskParameters object of validated\n task-specific parameters. Parameters are accessed with \"dot\"\n notation. E.g. `params.param1`.\n\n Raises:\n ValidationError: Raised if there are problems with the configuration\n file. Passed through from Pydantic.\n \"\"\"\n task_config_name: str = f\"{task_name}Parameters\"\n\n with open(config_path, \"r\") as f:\n docs: Iterator[Dict[str, Any]] = yaml.load_all(stream=f, Loader=yaml.FullLoader)\n header: Dict[str, Any] = next(docs)\n config: Dict[str, Any] = next(docs)\n substitute_variables(header, header)\n substitute_variables(header, config)\n LUTE_DEBUG_EXIT(\"LUTE_DEBUG_EXIT_AT_YAML\", pprint.pformat(config))\n lute_config: Dict[str, AnalysisHeader] = {\"lute_config\": AnalysisHeader(**header)}\n try:\n task_config: Dict[str, Any] = dict(config[task_name])\n lute_config.update(task_config)\n except KeyError as err:\n warnings.warn(\n (\n f\"{task_name} has no parameter definitions in YAML file.\"\n \" Attempting default parameter initialization.\"\n )\n )\n parsed_parameters: TaskParameters = globals()[task_config_name](**lute_config)\n return parsed_parameters\n
"},{"location":"source/io/config/#io.config.substitute_variables","title":"substitute_variables(header, config, curr_key=None)
","text":"Performs variable substitutions on a dictionary read from config YAML file.
Can be used to define input parameters in terms of other input parameters. This is similar to functionality employed by validators for parameters in the specific Task models, but is intended to be more accessible to users. Variable substitutions are defined using a minimal syntax from Jinja: {{ experiment }} defines a substitution of the variable experiment
. The characters {{ }}
can be escaped if the literal symbols are needed in place.
For example, a path to a file can be defined in terms of experiment and run values in the config file: MyTask: experiment: myexp run: 2 special_file: /path/to/{{ experiment }}/{{ run }}/file.inp
Acceptable variables for substitutions are values defined elsewhere in the YAML file. Environment variables can also be used if prefaced with a $
character. E.g. to get the experiment from an environment variable: MyTask: run: 2 special_file: /path/to/{{ $EXPERIMENT }}/{{ run }}/file.inp
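As an illustrative sketch (assuming the function is importable from lute.io.config; the Task and parameter names are hypothetical), the substitution can be exercised directly on dictionaries mirroring the two YAML documents:

from lute.io.config import substitute_variables

header = {"experiment": "myexp", "run": 2}  # mirrors the YAML header document
config = {"MyTask": {"special_file": "/path/to/{{ experiment }}/{{ run }}/file.inp"}}

substitute_variables(header, config)  # substitutions are made in-place
print(config["MyTask"]["special_file"])  # -> /path/to/myexp/2/file.inp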
Parameters:
Name Type Description Defaultconfig
Dict[str, Any]
A dictionary of parsed configuration.
requiredcurr_key
Optional[str]
Used to keep track of recursion level when scanning through iterable items in the config dictionary.
None
Returns:
Name Type Descriptionsubbed_config
Dict[str, Any]
The config dictionary after substitutions have been made. May be identical to the input if no substitutions are needed.
Source code in lute/io/config.py
def substitute_variables(\n header: Dict[str, Any], config: Dict[str, Any], curr_key: Optional[str] = None\n) -> None:\n \"\"\"Performs variable substitutions on a dictionary read from config YAML file.\n\n Can be used to define input parameters in terms of other input parameters.\n This is similar to functionality employed by validators for parameters in\n the specific Task models, but is intended to be more accessible to users.\n Variable substitutions are defined using a minimal syntax from Jinja:\n {{ experiment }}\n defines a substitution of the variable `experiment`. The characters `{{ }}`\n can be escaped if the literal symbols are needed in place.\n\n For example, a path to a file can be defined in terms of experiment and run\n values in the config file:\n MyTask:\n experiment: myexp\n run: 2\n special_file: /path/to/{{ experiment }}/{{ run }}/file.inp\n\n Acceptable variables for substitutions are values defined elsewhere in the\n YAML file. Environment variables can also be used if prefaced with a `$`\n character. E.g. to get the experiment from an environment variable:\n MyTask:\n run: 2\n special_file: /path/to/{{ $EXPERIMENT }}/{{ run }}/file.inp\n\n Args:\n config (Dict[str, Any]): A dictionary of parsed configuration.\n\n curr_key (Optional[str]): Used to keep track of recursion level when scanning\n through iterable items in the config dictionary.\n\n Returns:\n subbed_config (Dict[str, Any]): The config dictionary after substitutions\n have been made. May be identical to the input if no substitutions are\n needed.\n \"\"\"\n _sub_pattern = r\"\\{\\{[^}{]*\\}\\}\"\n iterable: Dict[str, Any] = config\n if curr_key is not None:\n # Need to handle nested levels by interpreting curr_key\n keys_by_level: List[str] = curr_key.split(\".\")\n for key in keys_by_level:\n iterable = iterable[key]\n else:\n ...\n # iterable = config\n for param, value in iterable.items():\n if isinstance(value, dict):\n new_key: str\n if curr_key is None:\n new_key = param\n else:\n new_key = f\"{curr_key}.{param}\"\n substitute_variables(header, config, curr_key=new_key)\n elif isinstance(value, list):\n ...\n # Scalars str - we skip numeric types\n elif isinstance(value, str):\n matches: List[str] = re.findall(_sub_pattern, value)\n for m in matches:\n key_to_sub_maybe_with_fmt: List[str] = m[2:-2].strip().split(\":\")\n key_to_sub: str = key_to_sub_maybe_with_fmt[0]\n fmt: Optional[str] = None\n if len(key_to_sub_maybe_with_fmt) == 2:\n fmt = key_to_sub_maybe_with_fmt[1]\n sub: Any\n if key_to_sub[0] == \"$\":\n sub = os.getenv(key_to_sub[1:], None)\n if sub is None:\n print(\n f\"Environment variable {key_to_sub[1:]} not found! Cannot substitute in YAML config!\",\n flush=True,\n )\n continue\n # substitutions from env vars will be strings, so convert back\n # to numeric in order to perform formatting later on (e.g. {var:04d})\n sub = _check_str_numeric(sub)\n else:\n try:\n sub = config\n for key in key_to_sub.split(\".\"):\n sub = sub[key]\n except KeyError:\n sub = header[key_to_sub]\n pattern: str = (\n m.replace(\"{{\", r\"\\{\\{\").replace(\"}}\", r\"\\}\\}\").replace(\"$\", r\"\\$\")\n )\n if fmt is not None:\n sub = f\"{sub:{fmt}}\"\n else:\n sub = f\"{sub}\"\n iterable[param] = re.sub(pattern, sub, iterable[param])\n # Reconvert back to numeric values if needed...\n iterable[param] = _check_str_numeric(iterable[param])\n
"},{"location":"source/io/db/","title":"db","text":"Tools for working with the LUTE parameter and configuration database.
The current implementation relies on an SQLite backend database. This may change in the future; therefore, relatively few high-level API functions are intended to be public. These abstract away the details of the database interface and work exclusively on LUTE objects.
Functions:
Name Descriptionrecord_analysis_db
(cfg: DescribedAnalysis) -> None: Writes the configuration to the backend database.
read_latest_db_entry
(db_dir: str, task_name: str, param: str) -> Any: Retrieve the most recent entry from a database for a specific Task.
Raises:
Type DescriptionDatabaseError
Generic exception raised for LUTE database errors.
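A sketch of typical use of the high-level read API (the directory, Task and parameter names are illustrative):

from lute.io.db import read_latest_db_entry

# Most recent value recorded for parameter "outfile" of a Task named "Tester",
# read from the lute.db under the given working directory; None if not found.
last_output = read_latest_db_entry(
    db_dir="/path/to/work_dir", task_name="Tester", param="outfile", valid_only=True
)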
"},{"location":"source/io/db/#io.db.DatabaseError","title":"DatabaseError
","text":" Bases: Exception
General LUTE database error.
Source code in lute/io/db.py
class DatabaseError(Exception):\n \"\"\"General LUTE database error.\"\"\"\n\n ...\n
"},{"location":"source/io/db/#io.db.read_latest_db_entry","title":"read_latest_db_entry(db_dir, task_name, param, valid_only=True)
","text":"Read most recent value entered into the database for a Task parameter.
(Will be updated for schema compliance as well as Task name.)
Parameters:
Name Type Description Defaultdb_dir
str
Database location.
requiredtask_name
str
The name of the Task to check the database for.
requiredparam
str
The parameter name for the Task that we want to retrieve.
requiredvalid_only
bool
Whether to consider only valid results or not. E.g. An input file may be useful even if the Task result is invalid (Failed). Default = True.
True
Returns:
Name Type Descriptionval
Any
The most recently entered value for param
of task_name
that can be found in the database. Returns None if nothing found.
lute/io/db.py
def read_latest_db_entry(\n db_dir: str, task_name: str, param: str, valid_only: bool = True\n) -> Optional[Any]:\n \"\"\"Read most recent value entered into the database for a Task parameter.\n\n (Will be updated for schema compliance as well as Task name.)\n\n Args:\n db_dir (str): Database location.\n\n task_name (str): The name of the Task to check the database for.\n\n param (str): The parameter name for the Task that we want to retrieve.\n\n valid_only (bool): Whether to consider only valid results or not. E.g.\n An input file may be useful even if the Task result is invalid\n (Failed). Default = True.\n\n Returns:\n val (Any): The most recently entered value for `param` of `task_name`\n that can be found in the database. Returns None if nothing found.\n \"\"\"\n import sqlite3\n from ._sqlite import _select_from_db\n\n con: sqlite3.Connection = sqlite3.Connection(f\"{db_dir}/lute.db\")\n with con:\n try:\n cond: Dict[str, str] = {}\n if valid_only:\n cond = {\"valid_flag\": \"1\"}\n entry: Any = _select_from_db(con, task_name, param, cond)\n except sqlite3.OperationalError as err:\n logger.debug(f\"Cannot retrieve value {param} due to: {err}\")\n entry = None\n return entry\n
"},{"location":"source/io/db/#io.db.record_analysis_db","title":"record_analysis_db(cfg)
","text":"Write an DescribedAnalysis object to the database.
The DescribedAnalysis object is maintained by the Executor and contains all information necessary to fully describe a single Task
execution. The contained fields are split across multiple tables within the database as some of the information can be shared across multiple Tasks. Refer to docs/design/database.md
for more information on the database specification.
lute/io/db.py
def record_analysis_db(cfg: DescribedAnalysis) -> None:\n \"\"\"Write an DescribedAnalysis object to the database.\n\n The DescribedAnalysis object is maintained by the Executor and contains all\n information necessary to fully describe a single `Task` execution. The\n contained fields are split across multiple tables within the database as\n some of the information can be shared across multiple Tasks. Refer to\n `docs/design/database.md` for more information on the database specification.\n \"\"\"\n import sqlite3\n from ._sqlite import (\n _make_shared_table,\n _make_task_table,\n _add_row_no_duplicate,\n _add_task_entry,\n )\n\n try:\n work_dir: str = cfg.task_parameters.lute_config.work_dir\n except AttributeError:\n logger.info(\n (\n \"Unable to access TaskParameters object. Likely wasn't created. \"\n \"Cannot store result.\"\n )\n )\n return\n del cfg.task_parameters.lute_config.work_dir\n\n exec_entry, exec_columns = _cfg_to_exec_entry_cols(cfg)\n task_name: str = cfg.task_result.task_name\n # All `Task`s have an AnalysisHeader, but this info can be shared so is\n # split into a different table\n (\n task_entry, # Dict[str, Any]\n task_columns, # Dict[str, str]\n gen_entry, # Dict[str, Any]\n gen_columns, # Dict[str, str]\n ) = _params_to_entry_cols(cfg.task_parameters)\n x, y = _result_to_entry_cols(cfg.task_result)\n task_entry.update(x)\n task_columns.update(y)\n\n con: sqlite3.Connection = sqlite3.Connection(f\"{work_dir}/lute.db\")\n with con:\n # --- Table Creation ---#\n if not _make_shared_table(con, \"gen_cfg\", gen_columns):\n raise DatabaseError(\"Could not make general configuration table!\")\n if not _make_shared_table(con, \"exec_cfg\", exec_columns):\n raise DatabaseError(\"Could not make Executor configuration table!\")\n if not _make_task_table(con, task_name, task_columns):\n raise DatabaseError(f\"Could not make Task table for: {task_name}!\")\n\n # --- Row Addition ---#\n gen_id: int = _add_row_no_duplicate(con, \"gen_cfg\", gen_entry)\n exec_id: int = _add_row_no_duplicate(con, \"exec_cfg\", exec_entry)\n\n full_task_entry: Dict[str, Any] = {\n \"gen_cfg_id\": gen_id,\n \"exec_cfg_id\": exec_id,\n }\n full_task_entry.update(task_entry)\n # Prepare flag to indicate whether the task entry is valid or not\n # By default we say it is assuming proper completion\n valid_flag: int = (\n 1 if cfg.task_result.task_status == TaskStatus.COMPLETED else 0\n )\n full_task_entry.update({\"valid_flag\": valid_flag})\n\n _add_task_entry(con, task_name, full_task_entry)\n
"},{"location":"source/io/elog/","title":"elog","text":"Provides utilities for communicating with the LCLS eLog.
Makes use of various eLog API endpoints to retrieve information or post results.
Functions:
Name Descriptionget_elog_opr_auth
(exp: str): Return an authorization object to interact with eLog API as an opr account for the hutch where exp
was conducted.
get_elog_kerberos_auth
Return the authorization headers for the user account submitting the job.
elog_http_request
(exp: str, endpoint: str, request_type: str, **params): Make an HTTP request to the API endpoint at url
.
format_file_for_post
(in_file: Union[str, tuple, list]): Prepare files according to the specification needed to add them as attachments to eLog posts.
post_elog_message
(exp: str, msg: str, tag: Optional[str], title: Optional[str], in_files: List[Union[str, tuple, list]], auth: Optional[Union[HTTPBasicAuth, Dict]] = None): Post a message to the eLog.
post_elog_run_status
(data: Dict[str, Union[str, int, float]], update_url: Optional[str] = None): Post a run status to the summary section on the Workflows>Control tab.
post_elog_run_table
(exp: str, run: int, data: Dict[str, Any], auth: Optional[Union[HTTPBasicAuth, Dict]] = None): Update run table in the eLog.
get_elog_runs_by_tag
(exp: str, tag: str, auth: Optional[Union[HTTPBasicAuth, Dict]] = None): Return a list of runs with a specific tag.
get_elog_params_by_run
(exp: str, params: List[str], runs: Optional[List[int]]): Retrieve the requested parameters by run. If no run is provided, retrieve the requested parameters for all runs.
"},{"location":"source/io/elog/#io.elog.elog_http_request","title":"elog_http_request(exp, endpoint, request_type, **params)
","text":"Make an HTTP request to the eLog.
This method will determine the proper authorization method and update the passed parameters appropriately. Functions implementing specific endpoint functionality and calling this function should only pass the necessary endpoint-specific parameters and not include the authorization objects.
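For instance, an endpoint-specific helper is expected to call it roughly as sketched below (the experiment name and tag are illustrative; the endpoint string follows the pattern used by get_elog_runs_by_tag):

from lute.io.elog import elog_http_request

exp = "mfxl1001021"  # illustrative experiment name
status_code, msg, tagged_runs = elog_http_request(
    exp=exp,
    endpoint=f"{exp}/ws/get_runs_with_tag?tag=calib",  # endpoint-specific parameters only
    request_type="GET",
)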
Parameters:
Name Type Description Defaultexp
str
Experiment.
requiredendpoint
str
eLog API endpoint.
requiredrequest_type
str
Type of request to make. Recognized options: POST or GET.
required**params
Dict
Endpoint parameters to pass with the HTTP request! Differs depending on the API endpoint. Do not include auth objects.
{}
Returns:
Name Type Descriptionstatus_code
int
Response status code. Can be checked for errors.
msg
str
An error message, or a message saying SUCCESS.
value
Optional[Any]
For GET requests ONLY, return the requested information.
Source code in lute/io/elog.py
def elog_http_request(\n exp: str, endpoint: str, request_type: str, **params\n) -> Tuple[int, str, Optional[Any]]:\n \"\"\"Make an HTTP request to the eLog.\n\n This method will determine the proper authorization method and update the\n passed parameters appropriately. Functions implementing specific endpoint\n functionality and calling this function should only pass the necessary\n endpoint-specific parameters and not include the authorization objects.\n\n Args:\n exp (str): Experiment.\n\n endpoint (str): eLog API endpoint.\n\n request_type (str): Type of request to make. Recognized options: POST or\n GET.\n\n **params (Dict): Endpoint parameters to pass with the HTTP request!\n Differs depending on the API endpoint. Do not include auth objects.\n\n Returns:\n status_code (int): Response status code. Can be checked for errors.\n\n msg (str): An error message, or a message saying SUCCESS.\n\n value (Optional[Any]): For GET requests ONLY, return the requested\n information.\n \"\"\"\n auth: Union[HTTPBasicAuth, Dict[str, str]] = get_elog_auth(exp)\n base_url: str\n if isinstance(auth, HTTPBasicAuth):\n params.update({\"auth\": auth})\n base_url = \"https://pswww.slac.stanford.edu/ws-auth/lgbk/lgbk\"\n elif isinstance(auth, dict):\n params.update({\"headers\": auth})\n base_url = \"https://pswww.slac.stanford.edu/ws-kerb/lgbk/lgbk\"\n\n url: str = f\"{base_url}/{endpoint}\"\n\n resp: requests.models.Response\n if request_type.upper() == \"POST\":\n resp = requests.post(url, **params)\n elif request_type.upper() == \"GET\":\n resp = requests.get(url, **params)\n else:\n return (-1, \"Invalid request type!\", None)\n\n status_code: int = resp.status_code\n msg: str = \"SUCCESS\"\n\n if resp.json()[\"success\"] and request_type.upper() == \"GET\":\n return (status_code, msg, resp.json()[\"value\"])\n\n if status_code >= 300:\n msg = f\"Error when posting to eLog: Response {status_code}\"\n\n if not resp.json()[\"success\"]:\n err_msg = resp.json()[\"error_msg\"]\n msg += f\"\\nInclude message: {err_msg}\"\n return (resp.status_code, msg, None)\n
"},{"location":"source/io/elog/#io.elog.format_file_for_post","title":"format_file_for_post(in_file)
","text":"Format a file for attachment to an eLog post.
The eLog API expects a specifically formatted tuple when adding file attachments. This function prepares the tuple to specification given a number of different input types.
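The two accepted input forms are sketched below (paths and descriptions are illustrative); a bare path uses the file name as the description, while a (path, description) pair sets it explicitly:

from lute.io.elog import format_file_for_post

attachment = format_file_for_post("/path/to/plot.png")
attachment_with_desc = format_file_for_post(("/path/to/plot.png", "Hit rate plot"))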
Parameters:
Name Type Description Defaultin_file
str | tuple | list
File to include as an attachment in an eLog post.
required Source code in lute/io/elog.py
def format_file_for_post(\n in_file: Union[str, tuple, list]\n) -> Tuple[str, Tuple[str, BufferedReader], Any]:\n \"\"\"Format a file for attachment to an eLog post.\n\n The eLog API expects a specifically formatted tuple when adding file\n attachments. This function prepares the tuple to specification given a\n number of different input types.\n\n Args:\n in_file (str | tuple | list): File to include as an attachment in an\n eLog post.\n \"\"\"\n description: str\n fptr: BufferedReader\n ftype: Optional[str]\n if isinstance(in_file, str):\n description = os.path.basename(in_file)\n fptr = open(in_file, \"rb\")\n ftype = mimetypes.guess_type(in_file)[0]\n elif isinstance(in_file, tuple) or isinstance(in_file, list):\n description = in_file[1]\n fptr = open(in_file[0], \"rb\")\n ftype = mimetypes.guess_type(in_file[0])[0]\n else:\n raise ElogFileFormatError(f\"Unrecognized format: {in_file}\")\n\n out_file: Tuple[str, Tuple[str, BufferedReader], Any] = (\n \"files\",\n (description, fptr),\n ftype,\n )\n return out_file\n
"},{"location":"source/io/elog/#io.elog.get_elog_active_expmt","title":"get_elog_active_expmt(hutch, *, endstation=0)
","text":"Get the current active experiment for a hutch.
This function is one of two functions to manage the HTTP request independently. This is because it does not require an authorization object, and its result is needed for the generic function elog_http_request
to work properly.
Parameters:
Name Type Description Defaulthutch
str
The hutch to get the active experiment for.
requiredendstation
int
The hutch endstation to get the experiment for. This should generally be 0.
0
Source code in lute/io/elog.py
def get_elog_active_expmt(hutch: str, *, endstation: int = 0) -> str:\n \"\"\"Get the current active experiment for a hutch.\n\n This function is one of two functions to manage the HTTP request independently.\n This is because it does not require an authorization object, and its result\n is needed for the generic function `elog_http_request` to work properly.\n\n Args:\n hutch (str): The hutch to get the active experiment for.\n\n endstation (int): The hutch endstation to get the experiment for. This\n should generally be 0.\n \"\"\"\n\n base_url: str = \"https://pswww.slac.stanford.edu/ws/lgbk/lgbk\"\n endpoint: str = \"ws/activeexperiment_for_instrument_station\"\n url: str = f\"{base_url}/{endpoint}\"\n params: Dict[str, str] = {\"instrument_name\": hutch, \"station\": f\"{endstation}\"}\n resp: requests.models.Response = requests.get(url, params)\n if resp.status_code > 300:\n raise RuntimeError(\n f\"Error getting current experiment!\\n\\t\\tIncorrect hutch: '{hutch}'?\"\n )\n if resp.json()[\"success\"]:\n return resp.json()[\"value\"][\"name\"]\n else:\n msg: str = resp.json()[\"error_msg\"]\n raise RuntimeError(f\"Error getting current experiment! Err: {msg}\")\n
"},{"location":"source/io/elog/#io.elog.get_elog_auth","title":"get_elog_auth(exp)
","text":"Determine the appropriate auth method depending on experiment state.
Returns:
Name Type Descriptionauth
HTTPBasicAuth | Dict[str, str]
Depending on whether an experiment is active/live, returns authorization for the hutch operator account or the current user submitting a job.
Source code inlute/io/elog.py
def get_elog_auth(exp: str) -> Union[HTTPBasicAuth, Dict[str, str]]:\n \"\"\"Determine the appropriate auth method depending on experiment state.\n\n Returns:\n auth (HTTPBasicAuth | Dict[str, str]): Depending on whether an experiment\n is active/live, returns authorization for the hutch operator account\n or the current user submitting a job.\n \"\"\"\n hutch: str = exp[:3]\n if exp.lower() == get_elog_active_expmt(hutch=hutch).lower():\n return get_elog_opr_auth(exp)\n else:\n return get_elog_kerberos_auth()\n
"},{"location":"source/io/elog/#io.elog.get_elog_kerberos_auth","title":"get_elog_kerberos_auth()
","text":"Returns Kerberos authorization key.
This functions returns authorization for the USER account submitting jobs. It assumes that kinit
has been run.
Returns:
Name Type Descriptionauth
Dict[str, str]
Dictionary containing Kerberos authorization key.
Source code inlute/io/elog.py
def get_elog_kerberos_auth() -> Dict[str, str]:\n \"\"\"Returns Kerberos authorization key.\n\n This functions returns authorization for the USER account submitting jobs.\n It assumes that `kinit` has been run.\n\n Returns:\n auth (Dict[str, str]): Dictionary containing Kerberos authorization key.\n \"\"\"\n from krtc import KerberosTicket\n\n return KerberosTicket(\"HTTP@pswww.slac.stanford.edu\").getAuthHeaders()\n
"},{"location":"source/io/elog/#io.elog.get_elog_opr_auth","title":"get_elog_opr_auth(exp)
","text":"Produce authentication for the \"opr\" user associated to an experiment.
This method uses basic authentication using username and password.
Parameters:
Name Type Description Defaultexp
str
Name of the experiment to produce authentication for.
requiredReturns:
Name Type Descriptionauth
HTTPBasicAuth
HTTPBasicAuth for an active experiment based on username and password for the associated operator account.
Source code inlute/io/elog.py
def get_elog_opr_auth(exp: str) -> HTTPBasicAuth:\n \"\"\"Produce authentication for the \"opr\" user associated to an experiment.\n\n This method uses basic authentication using username and password.\n\n Args:\n exp (str): Name of the experiment to produce authentication for.\n\n Returns:\n auth (HTTPBasicAuth): HTTPBasicAuth for an active experiment based on\n username and password for the associated operator account.\n \"\"\"\n opr: str = f\"{exp[:3]}opr\"\n with open(\"/sdf/group/lcls/ds/tools/forElogPost.txt\", \"r\") as f:\n pw: str = f.readline()[:-1]\n return HTTPBasicAuth(opr, pw)\n
"},{"location":"source/io/elog/#io.elog.get_elog_params_by_run","title":"get_elog_params_by_run(exp, params, runs=None)
","text":"Retrieve requested parameters by run or for all runs.
Parameters:
Name Type Description Defaultexp
str
Experiment to retrieve parameters for.
requiredparams
List[str]
A list of parameters to retrieve. These can be any parameter recorded in the eLog (PVs, parameters posted by other Tasks, etc.)
required Source code inlute/io/elog.py
def get_elog_params_by_run(\n exp: str, params: List[str], runs: Optional[List[int]] = None\n) -> Dict[str, str]:\n \"\"\"Retrieve requested parameters by run or for all runs.\n\n Args:\n exp (str): Experiment to retrieve parameters for.\n\n params (List[str]): A list of parameters to retrieve. These can be any\n parameter recorded in the eLog (PVs, parameters posted by other\n Tasks, etc.)\n \"\"\"\n ...\n
"},{"location":"source/io/elog/#io.elog.get_elog_runs_by_tag","title":"get_elog_runs_by_tag(exp, tag, auth=None)
","text":"Retrieve run numbers with a specified tag.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredtag
str
The tag to retrieve runs for.
required Source code inlute/io/elog.py
def get_elog_runs_by_tag(\n exp: str, tag: str, auth: Optional[Union[HTTPBasicAuth, Dict]] = None\n) -> List[int]:\n \"\"\"Retrieve run numbers with a specified tag.\n\n Args:\n exp (str): Experiment name.\n\n tag (str): The tag to retrieve runs for.\n \"\"\"\n endpoint: str = f\"{exp}/ws/get_runs_with_tag?tag={tag}\"\n params: Dict[str, Any] = {}\n\n status_code, resp_msg, tagged_runs = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"GET\", **params\n )\n\n if not tagged_runs:\n tagged_runs = []\n\n return tagged_runs\n
"},{"location":"source/io/elog/#io.elog.get_elog_workflows","title":"get_elog_workflows(exp)
","text":"Get the current workflow definitions for an experiment.
Returns:
Name Type Descriptiondefns
Dict[str, str]
A dictionary of workflow definitions.
Source code inlute/io/elog.py
def get_elog_workflows(exp: str) -> Dict[str, str]:\n \"\"\"Get the current workflow definitions for an experiment.\n\n Returns:\n defns (Dict[str, str]): A dictionary of workflow definitions.\n \"\"\"\n raise NotImplementedError\n
"},{"location":"source/io/elog/#io.elog.post_elog_message","title":"post_elog_message(exp, msg, *, tag, title, in_files=[])
","text":"Post a new message to the eLog. Inspired by the elog
package.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredmsg
str
BODY of the eLog post.
requiredtag
str | None
Optional \"tag\" to associate with the eLog post.
requiredtitle
str | None
Optional title to include in the eLog post.
requiredin_files
List[str | tuple | list]
Files to include as attachments in the eLog post.
[]
Returns:
Name Type Descriptionerr_msg
str | None
If successful, nothing is returned, otherwise, return an error message.
Source code inlute/io/elog.py
def post_elog_message(\n exp: str,\n msg: str,\n *,\n tag: Optional[str],\n title: Optional[str],\n in_files: List[Union[str, tuple, list]] = [],\n) -> Optional[str]:\n \"\"\"Post a new message to the eLog. Inspired by the `elog` package.\n\n Args:\n exp (str): Experiment name.\n\n msg (str): BODY of the eLog post.\n\n tag (str | None): Optional \"tag\" to associate with the eLog post.\n\n title (str | None): Optional title to include in the eLog post.\n\n in_files (List[str | tuple | list]): Files to include as attachments in\n the eLog post.\n\n Returns:\n err_msg (str | None): If successful, nothing is returned, otherwise,\n return an error message.\n \"\"\"\n # MOSTLY CORRECT\n out_files: list = []\n for f in in_files:\n try:\n out_files.append(format_file_for_post(in_file=f))\n except ElogFileFormatError as err:\n logger.debug(f\"ElogFileFormatError: {err}\")\n post: Dict[str, str] = {}\n post[\"log_text\"] = msg\n if tag:\n post[\"log_tags\"] = tag\n if title:\n post[\"log_title\"] = title\n\n endpoint: str = f\"{exp}/ws/new_elog_entry\"\n\n params: Dict[str, Any] = {\"data\": post}\n\n if out_files:\n params.update({\"files\": out_files})\n\n status_code, resp_msg, _ = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"POST\", **params\n )\n\n if resp_msg != \"SUCCESS\":\n return resp_msg\n
"},{"location":"source/io/elog/#io.elog.post_elog_run_status","title":"post_elog_run_status(data, update_url=None)
","text":"Post a summary to the status/report section of a specific run.
In contrast to most eLog update/post mechanisms, this function searches for a specific environment variable which contains a specific URL for posting. This is updated every job/run as jobs are submitted by the JID. The URL can optionally be passed to this function if it is known.
Parameters:
Name Type Description Defaultdata
Dict[str, Union[str, int, float]]
The data to post to the eLog report section. Formatted in key:value pairs.
requiredupdate_url
Optional[str]
Optional update URL. If not provided, the function searches for the corresponding environment variable. If neither is found, the function aborts
None
Source code in lute/io/elog.py
def post_elog_run_status(\n data: Dict[str, Union[str, int, float]], update_url: Optional[str] = None\n) -> None:\n \"\"\"Post a summary to the status/report section of a specific run.\n\n In contrast to most eLog update/post mechanisms, this function searches\n for a specific environment variable which contains a specific URL for\n posting. This is updated every job/run as jobs are submitted by the JID.\n The URL can optionally be passed to this function if it is known.\n\n Args:\n data (Dict[str, Union[str, int, float]]): The data to post to the eLog\n report section. Formatted in key:value pairs.\n\n update_url (Optional[str]): Optional update URL. If not provided, the\n function searches for the corresponding environment variable. If\n neither is found, the function aborts\n \"\"\"\n if update_url is None:\n update_url = os.environ.get(\"JID_UPDATE_COUNTERS\")\n if update_url is None:\n logger.info(\"eLog Update Failed! JID_UPDATE_COUNTERS is not defined!\")\n return\n current_status: Dict[str, Union[str, int, float]] = _get_current_run_status(\n update_url\n )\n current_status.update(data)\n post_list: List[Dict[str, str]] = [\n {\"key\": f\"{key}\", \"value\": f\"{value}\"} for key, value in current_status.items()\n ]\n params: Dict[str, List[Dict[str, str]]] = {\"json\": post_list}\n resp: requests.models.Response = requests.post(update_url, **params)\n
"},{"location":"source/io/elog/#io.elog.post_elog_run_table","title":"post_elog_run_table(exp, run, data)
","text":"Post data for eLog run tables.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredrun
int
Run number corresponding to the data being posted.
requireddata
Dict[str, Any]
Data to be posted in format data[\"column_header\"] = value.
requiredReturns:
Name Type Descriptionerr_msg
None | str
If successful, nothing is returned, otherwise, return an error message.
Source code inlute/io/elog.py
def post_elog_run_table(\n exp: str,\n run: int,\n data: Dict[str, Any],\n) -> Optional[str]:\n \"\"\"Post data for eLog run tables.\n\n Args:\n exp (str): Experiment name.\n\n run (int): Run number corresponding to the data being posted.\n\n data (Dict[str, Any]): Data to be posted in format\n data[\"column_header\"] = value.\n\n Returns:\n err_msg (None | str): If successful, nothing is returned, otherwise,\n return an error message.\n \"\"\"\n endpoint: str = f\"run_control/{exp}/ws/add_run_params\"\n\n params: Dict[str, Any] = {\"params\": {\"run_num\": run}, \"json\": data}\n\n status_code, resp_msg, _ = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"POST\", **params\n )\n\n if resp_msg != \"SUCCESS\":\n return resp_msg\n
"},{"location":"source/io/elog/#io.elog.post_elog_workflow","title":"post_elog_workflow(exp, name, executable, wf_params, *, trigger='run_end', location='S3DF', **trig_args)
","text":"Create a new eLog workflow, or update an existing one.
The workflow will run a specific executable as a batch job when the specified trigger occurs. The precise arguments may vary depending on the selected trigger type.
Parameters:
Name Type Description Defaultname
str
An identifying name for the workflow. E.g. \"process data\"
requiredexecutable
str
Full path to the executable to be run.
requiredwf_params
str
All command-line parameters for the executable as a string.
requiredtrigger
str
When to trigger execution of the specified executable. One of: - 'manual': Must be manually triggered. No automatic processing. - 'run_start': Execute immediately if a new run begins. - 'run_end': As soon as a run ends. - 'param_is': As soon as a parameter has a specific value for a run.
'run_end'
location
str
Where to submit the job. S3DF or NERSC.
'S3DF'
**trig_args
str
Arguments required for a specific trigger type. trigger='param_is' - 2 Arguments trig_param (str): Name of the parameter to watch for. trig_param_val (str): Value the parameter should have to trigger.
{}
Source code in lute/io/elog.py
def post_elog_workflow(\n exp: str,\n name: str,\n executable: str,\n wf_params: str,\n *,\n trigger: str = \"run_end\",\n location: str = \"S3DF\",\n **trig_args: str,\n) -> None:\n \"\"\"Create a new eLog workflow, or update an existing one.\n\n The workflow will run a specific executable as a batch job when the\n specified trigger occurs. The precise arguments may vary depending on the\n selected trigger type.\n\n Args:\n name (str): An identifying name for the workflow. E.g. \"process data\"\n\n executable (str): Full path to the executable to be run.\n\n wf_params (str): All command-line parameters for the executable as a string.\n\n trigger (str): When to trigger execution of the specified executable.\n One of:\n - 'manual': Must be manually triggered. No automatic processing.\n - 'run_start': Execute immediately if a new run begins.\n - 'run_end': As soon as a run ends.\n - 'param_is': As soon as a parameter has a specific value for a run.\n\n location (str): Where to submit the job. S3DF or NERSC.\n\n **trig_args (str): Arguments required for a specific trigger type.\n trigger='param_is' - 2 Arguments\n trig_param (str): Name of the parameter to watch for.\n trig_param_val (str): Value the parameter should have to trigger.\n \"\"\"\n endpoint: str = f\"{exp}/ws/create_update_workflow_def\"\n trig_map: Dict[str, str] = {\n \"manual\": \"MANUAL\",\n \"run_start\": \"START_OF_RUN\",\n \"run_end\": \"END_OF_RUN\",\n \"param_is\": \"RUN_PARAM_IS_VALUE\",\n }\n if trigger not in trig_map.keys():\n raise NotImplementedError(\n f\"Cannot create workflow with trigger type: {trigger}\"\n )\n wf_defn: Dict[str, str] = {\n \"name\": name,\n \"executable\": executable,\n \"parameters\": wf_params,\n \"trigger\": trig_map[trigger],\n \"location\": location,\n }\n if trigger == \"param_is\":\n if \"trig_param\" not in trig_args or \"trig_param_val\" not in trig_args:\n raise RuntimeError(\n \"Trigger type 'param_is' requires: 'trig_param' and 'trig_param_val' arguments\"\n )\n wf_defn.update(\n {\n \"run_param_name\": trig_args[\"trig_param\"],\n \"run_param_val\": trig_args[\"trig_param_val\"],\n }\n )\n post_params: Dict[str, Dict[str, str]] = {\"json\": wf_defn}\n status_code, resp_msg, _ = elog_http_request(\n exp, endpoint=endpoint, request_type=\"POST\", **post_params\n )\n
"},{"location":"source/io/exceptions/","title":"exceptions","text":"Specifies custom exceptions defined for IO problems.
Raises:
Type DescriptionElogFileFormatError
Raised if an attachment is specified in an incorrect format.
"},{"location":"source/io/exceptions/#io.exceptions.ElogFileFormatError","title":"ElogFileFormatError
","text":" Bases: Exception
Raised when an eLog attachment is specified in an invalid format.
Source code inlute/io/exceptions.py
class ElogFileFormatError(Exception):\n \"\"\"Raised when an eLog attachment is specified in an invalid format.\"\"\"\n\n ...\n
"},{"location":"source/io/models/base/","title":"base","text":"Base classes for describing Task parameters.
Classes:
Name DescriptionAnalysisHeader
Model holding shared configuration across Tasks. E.g. experiment name, run number and working directory.
TaskParameters
Base class for Task parameters. Subclasses specify a model of parameters and their types for validation.
ThirdPartyParameters
Base class for Third-party, binary executable Tasks.
TemplateParameters
Dataclass to represent parameters of binary (third-party) Tasks which are used for additional config files.
TemplateConfig
Class for holding information on where templates are stored in order to properly handle ThirdPartyParameter objects.
"},{"location":"source/io/models/base/#io.models.base.AnalysisHeader","title":"AnalysisHeader
","text":" Bases: BaseModel
Header information for LUTE analysis runs.
Source code inlute/io/models/base.py
class AnalysisHeader(BaseModel):\n \"\"\"Header information for LUTE analysis runs.\"\"\"\n\n title: str = Field(\n \"LUTE Task Configuration\",\n description=\"Description of the configuration or experiment.\",\n )\n experiment: str = Field(\"\", description=\"Experiment.\")\n run: Union[str, int] = Field(\"\", description=\"Data acquisition run.\")\n date: str = Field(\"1970/01/01\", description=\"Start date of analysis.\")\n lute_version: Union[float, str] = Field(\n 0.1, description=\"Version of LUTE used for analysis.\"\n )\n task_timeout: PositiveInt = Field(\n 600,\n description=(\n \"Time in seconds until a task times out. Should be slightly shorter\"\n \" than job timeout if using a job manager (e.g. SLURM).\"\n ),\n )\n work_dir: str = Field(\"\", description=\"Main working directory for LUTE.\")\n\n @validator(\"work_dir\", always=True)\n def validate_work_dir(cls, directory: str, values: Dict[str, Any]) -> str:\n work_dir: str\n if directory == \"\":\n std_work_dir = (\n f\"/sdf/data/lcls/ds/{values['experiment'][:3]}/\"\n f\"{values['experiment']}/scratch\"\n )\n work_dir = std_work_dir\n else:\n work_dir = directory\n # Check existence and permissions\n if not os.path.exists(work_dir):\n raise ValueError(f\"Working Directory: {work_dir} does not exist!\")\n if not os.access(work_dir, os.W_OK):\n # Need write access for database, files etc.\n raise ValueError(f\"Not write access for working directory: {work_dir}!\")\n return work_dir\n\n @validator(\"run\", always=True)\n def validate_run(\n cls, run: Union[str, int], values: Dict[str, Any]\n ) -> Union[str, int]:\n if run == \"\":\n # From Airflow RUN_NUM should have Format \"RUN_DATETIME\" - Num is first part\n run_time: str = os.environ.get(\"RUN_NUM\", \"\")\n if run_time != \"\":\n return int(run_time.split(\"_\")[0])\n return run\n\n @validator(\"experiment\", always=True)\n def validate_experiment(cls, experiment: str, values: Dict[str, Any]) -> str:\n if experiment == \"\":\n arp_exp: str = os.environ.get(\"EXPERIMENT\", \"EXPX00000\")\n return arp_exp\n return experiment\n
"},{"location":"source/io/models/base/#io.models.base.TaskParameters","title":"TaskParameters
","text":" Bases: BaseSettings
Base class for models of task parameters to be validated.
Parameters are read from a configuration YAML file and validated against subclasses of this type in order to ensure that both all parameters are present, and that the parameters are of the correct type.
NotePydantic is used for data validation. Pydantic does not perform \"strict\" validation by default. Parameter values may be cast to conform with the model specified by the subclass definition if it is possible to do so. Consider whether this may cause issues (e.g. if a float is cast to an int).
Source code inlute/io/models/base.py
class TaskParameters(BaseSettings):\n \"\"\"Base class for models of task parameters to be validated.\n\n Parameters are read from a configuration YAML file and validated against\n subclasses of this type in order to ensure that both all parameters are\n present, and that the parameters are of the correct type.\n\n Note:\n Pydantic is used for data validation. Pydantic does not perform \"strict\"\n validation by default. Parameter values may be cast to conform with the\n model specified by the subclass definition if it is possible to do so.\n Consider whether this may cause issues (e.g. if a float is cast to an\n int).\n \"\"\"\n\n class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n\n lute_config: AnalysisHeader\n
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config","title":"Config
","text":"Configuration for parameters model.
The Config class holds Pydantic configuration. A number of LUTE-specific configuration has also been placed here.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc. Only used if set_result==True
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however. Only used if set_result==True
lute/io/models/base.py
class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.impl_schemas","title":"impl_schemas: Optional[str] = None
class-attribute
instance-attribute
","text":"Schema specification for output result. Will be passed to TaskResult.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.result_from_params","title":"result_from_params: Optional[str] = None
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.result_summary","title":"result_summary: Optional[str] = None
class-attribute
instance-attribute
","text":"Format a TaskResult.summary from output.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.run_directory","title":"run_directory: Optional[str] = None
class-attribute
instance-attribute
","text":"Set the directory that the Task is run from.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.set_result","title":"set_result: bool = False
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/base/#io.models.base.TemplateConfig","title":"TemplateConfig
","text":" Bases: BaseModel
Parameters used for templating of third party configuration files.
Attributes:
Name Type Descriptiontemplate_name
str
The name of the template to use. This template must live in config/templates
.
output_path
str
The FULL path, including filename to write the rendered template to.
Source code inlute/io/models/base.py
class TemplateConfig(BaseModel):\n \"\"\"Parameters used for templating of third party configuration files.\n\n Attributes:\n template_name (str): The name of the template to use. This template must\n live in `config/templates`.\n\n output_path (str): The FULL path, including filename to write the\n rendered template to.\n \"\"\"\n\n template_name: str\n output_path: str\n
"},{"location":"source/io/models/base/#io.models.base.TemplateParameters","title":"TemplateParameters
","text":"Class for representing parameters for third party configuration files.
These parameters can represent arbitrary data types and are used in conjunction with templates for modifying third party configuration files from the single LUTE YAML. Due to the storage of arbitrary data types, and the use of a template file, a single instance of this class can hold from a single template variable to an entire configuration file. The data parsing is done by jinja using the complementary template. All data is stored in the single model variable params.
The pydantic \"dataclass\" is used over the BaseModel/Settings to allow positional argument instantiation of the params
Field.
lute/io/models/base.py
@dataclass\nclass TemplateParameters:\n \"\"\"Class for representing parameters for third party configuration files.\n\n These parameters can represent arbitrary data types and are used in\n conjunction with templates for modifying third party configuration files\n from the single LUTE YAML. Due to the storage of arbitrary data types, and\n the use of a template file, a single instance of this class can hold from a\n single template variable to an entire configuration file. The data parsing\n is done by jinja using the complementary template.\n All data is stored in the single model variable `params.`\n\n The pydantic \"dataclass\" is used over the BaseModel/Settings to allow\n positional argument instantiation of the `params` Field.\n \"\"\"\n\n params: Any\n
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters","title":"ThirdPartyParameters
","text":" Bases: TaskParameters
Base class for third party task parameters.
Contains special validators for extra arguments and handling of parameters used for filling in third party configuration files.
Source code in lute/io/models/base.py
class ThirdPartyParameters(TaskParameters):\n \"\"\"Base class for third party task parameters.\n\n Contains special validators for extra arguments and handling of parameters\n used for filling in third party configuration files.\n \"\"\"\n\n class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n # lute_template_cfg: TemplateConfig\n\n @root_validator(pre=False)\n def extra_fields_to_thirdparty(cls, values: Dict[str, Any]):\n for key in values:\n if key not in cls.__fields__:\n values[key] = TemplateParameters(values[key])\n\n return values\n
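The extra_fields_to_thirdparty root validator above is what turns unrecognized fields into TemplateParameters objects so they can later be fed to a Jinja template. A standalone sketch of that idea (not LUTE's actual code path; wrap_extra_fields and the example field names are hypothetical):

from typing import Any, Dict, Set

from lute.io.models.base import TemplateParameters


def wrap_extra_fields(values: Dict[str, Any], known_fields: Set[str]) -> Dict[str, Any]:
    """Wrap any value whose key is not a declared model field in TemplateParameters."""
    return {
        key: value if key in known_fields else TemplateParameters(value)
        for key, value in values.items()
    }


wrapped = wrap_extra_fields(
    {"out_file": "peaks.lst", "custom_template_var": 3},  # hypothetical values
    known_fields={"out_file"},
)
# wrapped["custom_template_var"] is now a TemplateParameters instance.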
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config","title":"Config
","text":" Bases: Config
Configuration for parameters model.
The Config class holds Pydantic configuration and inherited configuration from the base TaskParameters.Config
class. A number of values are overridden, and there are some configuration options specific to ThirdPartyParameters. A full list of options (with the TaskParameters options repeated) is described below.
Attributes:
Name Type Description
env_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc.
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however.
set_result
bool
True. If True, the model has information about setting the TaskResult object from the parameters it contains. E.g. it has an output parameter which is marked as the result. The result can be set with a field value of is_result=True on a specific parameter, or using result_from_params and a validator.
impl_schemas
Optional[str]
Specifies the schemas the output/results conform to. Only used if set_result is True.
extra
str
\"allow\". Pydantic configuration. Allow (or ignore) extra arguments.
short_flags_use_eq
bool
False. If True, \"short\" command-line args are passed as -x=arg
. ThirdPartyTask-specific.
long_flags_use_eq
bool
False. If True, \"long\" command-line args are passed as --long=arg
. ThirdPartyTask-specific.
lute/io/models/base.py
class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
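To make the short_flags_use_eq and long_flags_use_eq options concrete, here is an illustrative helper (not LUTE's actual argument builder) showing how a field would be rendered on a third party command line under each setting:

def render_flag(name: str, value: str, flag_type: str, use_eq: bool) -> str:
    # flag_type is "-" for short options and "--" for long options, matching
    # the flag_type metadata used on the parameter models in this module.
    flag = f"{flag_type}{name}"
    return f"{flag}={value}" if use_eq else f"{flag} {value}"


render_flag("o", "out.stream", "-", use_eq=False)       # '-o out.stream'
render_flag("indexing", "xgandalf", "--", use_eq=True)  # '--indexing=xgandalf'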
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config.short_flags_use_eq","title":"short_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether short command-line arguments are passed like -x=arg
.
FindPeaksPsocakeParameters
","text":" Bases: ThirdPartyParameters
Parameters for crystallographic (Bragg) peak finding using Psocake.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation. NOTE: This Task is deprecated and provided for compatibility only.
Source code in lute/io/models/sfx_find_peaks.py
class FindPeaksPsocakeParameters(ThirdPartyParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using Psocake.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n NOTE: This Task is deprecated and provided for compatibility only.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n class SZParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description=\"SZ compression algorithm (qoz, sz3)\"\n )\n binSize: int = Field(2, description=\"SZ compression's bin size paramater\")\n roiWindowSize: int = Field(\n 2, description=\"SZ compression's ROI window size paramater\"\n )\n absError: float = Field(10, descriptionp=\"Maximum absolute error value\")\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n mca: str = Field(\n \"btl ^openib\", description=\"Mca option for the MPI executable\", flag_type=\"--\"\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n p_arg2: str = Field(\n \"findPeaksSZ.py\",\n description=\"Executable to run with mpi (i.e. python).\",\n flag_type=\"\",\n )\n d: str = Field(description=\"Detector name\", flag_type=\"-\")\n e: str = Field(\"\", description=\"Experiment name\", flag_type=\"-\")\n r: int = Field(-1, description=\"Run number\", flag_type=\"-\")\n outDir: str = Field(\n description=\"Output directory where .cxi will be saved\", flag_type=\"--\"\n )\n algorithm: int = Field(1, description=\"PyAlgos algorithm to use\", flag_type=\"--\")\n alg_npix_min: float = Field(\n 1.0, description=\"PyAlgos algorithm's npix_min parameter\", flag_type=\"--\"\n )\n alg_npix_max: float = Field(\n 45.0, description=\"PyAlgos algorithm's npix_max parameter\", flag_type=\"--\"\n )\n alg_amax_thr: float = Field(\n 250.0, description=\"PyAlgos algorithm's amax_thr parameter\", flag_type=\"--\"\n )\n alg_atot_thr: float = Field(\n 330.0, description=\"PyAlgos algorithm's atot_thr parameter\", flag_type=\"--\"\n )\n alg_son_min: float = Field(\n 10.0, description=\"PyAlgos algorithm's son_min parameter\", flag_type=\"--\"\n )\n alg1_thr_low: float = Field(\n 80.0, description=\"PyAlgos algorithm's thr_low parameter\", flag_type=\"--\"\n )\n alg1_thr_high: float = Field(\n 270.0, description=\"PyAlgos algorithm's thr_high parameter\", flag_type=\"--\"\n )\n alg1_rank: int = Field(\n 3, description=\"PyAlgos algorithm's rank parameter\", flag_type=\"--\"\n )\n alg1_radius: int = Field(\n 3, description=\"PyAlgos algorithm's radius parameter\", flag_type=\"--\"\n )\n alg1_dr: int = Field(\n 1, description=\"PyAlgos algorithm's dr parameter\", flag_type=\"--\"\n )\n psanaMask_on: str = Field(\n \"True\", description=\"Whether psana's mask should be used\", flag_type=\"--\"\n )\n psanaMask_calib: str = Field(\n \"True\", description=\"Psana mask's calib parameter\", flag_type=\"--\"\n )\n psanaMask_status: str = Field(\n \"True\", description=\"Psana mask's status 
parameter\", flag_type=\"--\"\n )\n psanaMask_edges: str = Field(\n \"True\", description=\"Psana mask's edges parameter\", flag_type=\"--\"\n )\n psanaMask_central: str = Field(\n \"True\", description=\"Psana mask's central parameter\", flag_type=\"--\"\n )\n psanaMask_unbond: str = Field(\n \"True\", description=\"Psana mask's unbond parameter\", flag_type=\"--\"\n )\n psanaMask_unbondnrs: str = Field(\n \"True\", description=\"Psana mask's unbondnbrs parameter\", flag_type=\"--\"\n )\n mask: str = Field(\n \"\", description=\"Path to an additional mask to apply\", flag_type=\"--\"\n )\n clen: str = Field(\n description=\"Epics variable storing the camera length\", flag_type=\"--\"\n )\n coffset: float = Field(0, description=\"Camera offset in m\", flag_type=\"--\")\n minPeaks: int = Field(\n 15,\n description=\"Minimum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n maxPeaks: int = Field(\n 15,\n description=\"Maximum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n minRes: int = Field(\n 0,\n description=\"Minimum peak resolution to mark frame for indexing \",\n flag_type=\"--\",\n )\n sample: str = Field(\"\", description=\"Sample name\", flag_type=\"--\")\n instrument: Union[None, str] = Field(\n None, description=\"Instrument name\", flag_type=\"--\"\n )\n pixelSize: float = Field(0.0, description=\"Pixel size\", flag_type=\"--\")\n auto: str = Field(\n \"False\",\n description=(\n \"Whether to automatically determine peak per event peak \"\n \"finding parameters\"\n ),\n flag_type=\"--\",\n )\n detectorDistance: float = Field(\n 0.0, description=\"Detector distance from interaction point in m\", flag_type=\"--\"\n )\n access: Literal[\"ana\", \"ffb\"] = Field(\n \"ana\", description=\"Data node type: {ana,ffb}\", flag_type=\"--\"\n )\n szfile: str = Field(\"qoz.json\", description=\"Path to SZ's JSON configuration file\")\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"sz.json\",\n output_path=\"\", # Will want to change where this goes...\n ),\n description=\"Template information for the sz.json file\",\n )\n sz_parameters: SZParameters = Field(\n description=\"Configuration parameters for SZ Compression\", flag_type=\"\"\n )\n\n @validator(\"e\", always=True)\n def validate_e(cls, e: str, values: Dict[str, Any]) -> str:\n if e == \"\":\n return values[\"lute_config\"].experiment\n return e\n\n @validator(\"r\", always=True)\n def validate_r(cls, r: int, values: Dict[str, Any]) -> int:\n if r == -1:\n return values[\"lute_config\"].run\n return r\n\n @validator(\"lute_template_cfg\", always=True)\n def set_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"szfile\"]\n return lute_template_cfg\n\n @validator(\"sz_parameters\", always=True)\n def set_sz_compression_parameters(\n cls, sz_parameters: SZParameters, values: Dict[str, Any]\n ) -> None:\n values[\"compressor\"] = sz_parameters.compressor\n values[\"binSize\"] = sz_parameters.binSize\n values[\"roiWindowSize\"] = sz_parameters.roiWindowSize\n if sz_parameters.compressor == \"qoz\":\n values[\"pressio_opts\"] = {\n \"pressio:abs\": sz_parameters.absError,\n \"qoz\": {\"qoz:stride\": 8},\n }\n else:\n values[\"pressio_opts\"] = {\"pressio:abs\": sz_parameters.absError}\n return None\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = 
values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n directory: str = values[\"outDir\"]\n fname: str = f\"{exp}_{run:04d}.lst\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters","title":"FindPeaksPyAlgosParameters
","text":" Bases: TaskParameters
Parameters for crystallographic (Bragg) peak finding using PyAlgos.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation.
Source code in lute/io/models/sfx_find_peaks.py
class FindPeaksPyAlgosParameters(TaskParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using PyAlgos.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n class SZCompressorParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description='Compression algorithm (\"qoz\" or \"sz3\")'\n )\n abs_error: float = Field(10.0, description=\"Absolute error bound\")\n bin_size: int = Field(2, description=\"Bin size\")\n roi_window_size: int = Field(\n 9,\n description=\"Default window size\",\n )\n\n outdir: str = Field(\n description=\"Output directory for cxi files\",\n )\n n_events: int = Field(\n 0,\n description=\"Number of events to process (0 to process all events)\",\n )\n det_name: str = Field(\n description=\"Psana name of the detector storing the image data\",\n )\n event_receiver: Literal[\"evr0\", \"evr1\"] = Field(\n description=\"Event Receiver to be used: evr0 or evr1\",\n )\n tag: str = Field(\n \"\",\n description=\"Tag to add to the output file names\",\n )\n pv_camera_length: Union[str, float] = Field(\n \"\",\n description=\"PV associated with camera length \"\n \"(if a number, camera length directly)\",\n )\n event_logic: bool = Field(\n False,\n description=\"True if only events with a specific event code should be \"\n \"processed. False if the event code should be ignored\",\n )\n event_code: int = Field(\n 0,\n description=\"Required events code for events to be processed if event logic \"\n \"is True\",\n )\n psana_mask: bool = Field(\n False,\n description=\"If True, apply mask from psana Detector object\",\n )\n mask_file: Union[str, None] = Field(\n None,\n description=\"File with a custom mask to apply. 
If None, no custom mask is \"\n \"applied\",\n )\n min_peaks: int = Field(2, description=\"Minimum number of peaks per image\")\n max_peaks: int = Field(\n 2048,\n description=\"Maximum number of peaks per image\",\n )\n npix_min: int = Field(\n 2,\n description=\"Minimum number of pixels per peak\",\n )\n npix_max: int = Field(\n 30,\n description=\"Maximum number of pixels per peak\",\n )\n amax_thr: float = Field(\n 80.0,\n description=\"Minimum intensity threshold for starting a peak\",\n )\n atot_thr: float = Field(\n 120.0,\n description=\"Minimum summed intensity threshold for pixel collection\",\n )\n son_min: float = Field(\n 7.0,\n description=\"Minimum signal-to-noise ratio to be considered a peak\",\n )\n peak_rank: int = Field(\n 3,\n description=\"Radius in which central peak pixel is a local maximum\",\n )\n r0: float = Field(\n 3.0,\n description=\"Radius of ring for background evaluation in pixels\",\n )\n dr: float = Field(\n 2.0,\n description=\"Width of ring for background evaluation in pixels\",\n )\n nsigm: float = Field(\n 7.0,\n description=\"Intensity threshold to include pixel in connected group\",\n )\n compression: Optional[SZCompressorParameters] = Field(\n None,\n description=\"Options for the SZ Compression Algorithm\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n fname: Path = (\n Path(values[\"outdir\"])\n / f\"{values['lute_config'].experiment}_{values['lute_config'].run}_\"\n f\"{values['tag']}.list\"\n )\n return str(fname)\n return out_file\n
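When out_file is left empty, the validate_out_file validator above derives a default list file name from outdir, the experiment, the run and the tag. A standalone sketch of that naming convention (the experiment name and paths are hypothetical):

from pathlib import Path


def default_pyalgos_out_file(outdir: str, experiment: str, run: int, tag: str) -> str:
    # Mirrors the default applied by FindPeaksPyAlgosParameters when out_file == "".
    return str(Path(outdir) / f"{experiment}_{run}_{tag}.list")


default_pyalgos_out_file("/hypothetical/outdir", "mfxp00123", 7, "sample1")
# -> '/hypothetical/outdir/mfxp00123_7_sample1.list'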
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_index/","title":"sfx_index","text":"Models for serial femtosecond crystallography indexing.
Classes:
Name Description
IndexCrystFELParameters
Perform indexing of hits/peaks using CrystFEL's indexamajig
.
ConcatenateStreamFilesParameters
","text":" Bases: TaskParameters
Parameters for stream concatenation.
Concatenates the stream file output from CrystFEL indexing for multiple experimental runs.
Source code in lute/io/models/sfx_index.py
class ConcatenateStreamFilesParameters(TaskParameters):\n \"\"\"Parameters for stream concatenation.\n\n Concatenates the stream file output from CrystFEL indexing for multiple\n experimental runs.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n in_file: str = Field(\n \"\",\n description=\"Root of directory tree storing stream files to merge.\",\n )\n\n tag: Optional[str] = Field(\n \"\",\n description=\"Tag identifying the stream files to merge.\",\n )\n\n out_file: str = Field(\n \"\", description=\"Path to merged output stream file.\", is_result=True\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_dir: str = str(Path(stream_file).parent)\n return stream_dir\n return in_file\n\n @validator(\"tag\", always=True)\n def validate_tag(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_tag: str = Path(stream_file).name.split(\"_\")[0]\n return stream_tag\n return tag\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_out_file: str = str(\n Path(values[\"in_file\"]).parent / f\"{values['tag'].stream}\"\n )\n return stream_out_file\n return tag\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.ConcatenateStreamFilesParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.ConcatenateStreamFilesParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters","title":"IndexCrystFELParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's indexamajig
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-indexamajig.html
Source code in lute/io/models/sfx_index.py
class IndexCrystFELParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `indexamajig`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/indexamajig\",\n description=\"CrystFEL's indexing binary.\",\n flag_type=\"\",\n )\n # Basic options\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input file.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n geometry: str = Field(\n \"\", description=\"Path to geometry file.\", flag_type=\"-\", rename_param=\"g\"\n )\n zmq_input: Optional[str] = Field(\n description=\"ZMQ address to receive data over. `input` and `zmq-input` are mutually exclusive\",\n flag_type=\"--\",\n rename_param=\"zmq-input\",\n )\n zmq_subscribe: Optional[str] = Field( # Can be used multiple times...\n description=\"Subscribe to ZMQ message of type `tag`\",\n flag_type=\"--\",\n rename_param=\"zmq-subscribe\",\n )\n zmq_request: Optional[AnyUrl] = Field(\n description=\"Request new data over ZMQ by sending this value\",\n flag_type=\"--\",\n rename_param=\"zmq-request\",\n )\n asapo_endpoint: Optional[str] = Field(\n description=\"ASAP::O endpoint. zmq-input and this are mutually exclusive.\",\n flag_type=\"--\",\n rename_param=\"asapo-endpoint\",\n )\n asapo_token: Optional[str] = Field(\n description=\"ASAP::O authentication token.\",\n flag_type=\"--\",\n rename_param=\"asapo-token\",\n )\n asapo_beamtime: Optional[str] = Field(\n description=\"ASAP::O beatime.\",\n flag_type=\"--\",\n rename_param=\"asapo-beamtime\",\n )\n asapo_source: Optional[str] = Field(\n description=\"ASAP::O data source.\",\n flag_type=\"--\",\n rename_param=\"asapo-source\",\n )\n asapo_group: Optional[str] = Field(\n description=\"ASAP::O consumer group.\",\n flag_type=\"--\",\n rename_param=\"asapo-group\",\n )\n asapo_stream: Optional[str] = Field(\n description=\"ASAP::O stream.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n asapo_wait_for_stream: Optional[str] = Field(\n description=\"If ASAP::O stream does not exist, wait for it to appear.\",\n flag_type=\"--\",\n rename_param=\"asapo-wait-for-stream\",\n )\n data_format: Optional[str] = Field(\n description=\"Specify format for ZMQ or ASAP::O. `msgpack`, `hdf5` or `seedee`.\",\n flag_type=\"--\",\n rename_param=\"data-format\",\n )\n basename: bool = Field(\n False,\n description=\"Remove directory parts of filenames. Acts before prefix if prefix also given.\",\n flag_type=\"--\",\n )\n prefix: Optional[str] = Field(\n description=\"Add a prefix to the filenames from the infile argument.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n nthreads: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of threads to use. 
See also `max_indexer_threads`.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n no_check_prefix: bool = Field(\n False,\n description=\"Don't attempt to correct the prefix if it seems incorrect.\",\n flag_type=\"--\",\n rename_param=\"no-check-prefix\",\n )\n highres: Optional[float] = Field(\n description=\"Mark all pixels greater than `x` has bad.\", flag_type=\"--\"\n )\n profile: bool = Field(\n False, description=\"Display timing data to monitor performance.\", flag_type=\"--\"\n )\n temp_dir: Optional[str] = Field(\n description=\"Specify a path for the temp files folder.\",\n flag_type=\"--\",\n rename_param=\"temp-dir\",\n )\n wait_for_file: conint(gt=-2) = Field(\n 0,\n description=\"Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\",\n flag_type=\"--\",\n rename_param=\"wait-for-file\",\n )\n no_image_data: bool = Field(\n False,\n description=\"Load only the metadata, no iamges. Can check indexability without high data requirements.\",\n flag_type=\"--\",\n rename_param=\"no-image-data\",\n )\n # Peak-finding options\n # ....\n # Indexing options\n indexing: Optional[str] = Field(\n description=\"Comma-separated list of supported indexing algorithms to use. Default is to automatically detect.\",\n flag_type=\"--\",\n )\n cell_file: Optional[str] = Field(\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n tolerance: str = Field(\n \"5,5,5,1.5\",\n description=(\n \"Tolerances (in percent) for unit cell comparison. \"\n \"Comma-separated list a,b,c,angle. Default=5,5,5,1.5\"\n ),\n flag_type=\"--\",\n )\n no_check_cell: bool = Field(\n False,\n description=\"Do not check cell parameters against unit cell. Replaces '-raw' method.\",\n flag_type=\"--\",\n rename_param=\"no-check-cell\",\n )\n no_check_peaks: bool = Field(\n False,\n description=\"Do not verify peaks are accounted for by solution.\",\n flag_type=\"--\",\n rename_param=\"no-check-peaks\",\n )\n multi: bool = Field(\n False, description=\"Enable multi-lattice indexing.\", flag_type=\"--\"\n )\n wavelength_estimate: Optional[float] = Field(\n description=\"Estimate for X-ray wavelength. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"wavelength-estimate\",\n )\n camera_length_estimate: Optional[float] = Field(\n description=\"Estimate for camera distance. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"camera-length-estimate\",\n )\n max_indexer_threads: Optional[PositiveInt] = Field(\n # 1,\n description=\"Some indexing algos can use multiple threads. 
In addition to image-based.\",\n flag_type=\"--\",\n rename_param=\"max-indexer-threads\",\n )\n no_retry: bool = Field(\n False,\n description=\"Do not remove weak peaks and try again.\",\n flag_type=\"--\",\n rename_param=\"no-retry\",\n )\n no_refine: bool = Field(\n False,\n description=\"Skip refinement step.\",\n flag_type=\"--\",\n rename_param=\"no-refine\",\n )\n no_revalidate: bool = Field(\n False,\n description=\"Skip revalidation step.\",\n flag_type=\"--\",\n rename_param=\"no-revalidate\",\n )\n # TakeTwo specific parameters\n taketwo_member_threshold: Optional[PositiveInt] = Field(\n # 20,\n description=\"Minimum number of vectors to consider.\",\n flag_type=\"--\",\n rename_param=\"taketwo-member-threshold\",\n )\n taketwo_len_tolerance: Optional[PositiveFloat] = Field(\n # 0.001,\n description=\"TakeTwo length tolerance in Angstroms.\",\n flag_type=\"--\",\n rename_param=\"taketwo-len-tolerance\",\n )\n taketwo_angle_tolerance: Optional[PositiveFloat] = Field(\n # 0.6,\n description=\"TakeTwo angle tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-angle-tolerance\",\n )\n taketwo_trace_tolerance: Optional[PositiveFloat] = Field(\n # 3,\n description=\"Matrix trace tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-trace-tolerance\",\n )\n # Felix-specific parameters\n # felix_domega\n # felix-fraction-max-visits\n # felix-max-internal-angle\n # felix-max-uniqueness\n # felix-min-completeness\n # felix-min-visits\n # felix-num-voxels\n # felix-sigma\n # felix-tthrange-max\n # felix-tthrange-min\n # XGANDALF-specific parameters\n xgandalf_sampling_pitch: Optional[NonNegativeInt] = Field(\n # 6,\n description=\"Density of reciprocal space sampling.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-sampling-pitch\",\n )\n xgandalf_grad_desc_iterations: Optional[NonNegativeInt] = Field(\n # 4,\n description=\"Number of gradient descent iterations.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-grad-desc-iterations\",\n )\n xgandalf_tolerance: Optional[PositiveFloat] = Field(\n # 0.02,\n description=\"Relative tolerance of lattice vectors\",\n flag_type=\"--\",\n rename_param=\"xgandalf-tolerance\",\n )\n xgandalf_no_deviation_from_provided_cell: Optional[bool] = Field(\n description=\"Found unit cell must match provided.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-no-deviation-from-provided-cell\",\n )\n xgandalf_min_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 30,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-min-lattice-vector-length\",\n )\n xgandalf_max_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 250,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-lattice-vector-length\",\n )\n xgandalf_max_peaks: Optional[PositiveInt] = Field(\n # 250,\n description=\"Maximum number of peaks to use for indexing.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-peaks\",\n )\n xgandalf_fast_execution: bool = Field(\n False,\n description=\"Shortcut to set sampling-pitch=2, and grad-desc-iterations=3.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-fast-execution\",\n )\n # pinkIndexer parameters\n # ...\n # asdf_fast: bool = Field(False, description=\"Enable fast mode for asdf. 
3x faster for 7% loss in accuracy.\", flag_type=\"--\", rename_param=\"asdf-fast\")\n # Integration parameters\n integration: str = Field(\n \"rings-nocen\", description=\"Method for integrating reflections.\", flag_type=\"--\"\n )\n fix_profile_radius: Optional[float] = Field(\n description=\"Fix the profile radius (m^{-1})\",\n flag_type=\"--\",\n rename_param=\"fix-profile-radius\",\n )\n fix_divergence: Optional[float] = Field(\n 0,\n description=\"Fix the divergence (rad, full angle).\",\n flag_type=\"--\",\n rename_param=\"fix-divergence\",\n )\n int_radius: str = Field(\n \"4,5,7\",\n description=\"Inner, middle, and outer radii for 3-ring integration.\",\n flag_type=\"--\",\n rename_param=\"int-radius\",\n )\n int_diag: str = Field(\n \"none\",\n description=\"Show detailed information on integration when condition is met.\",\n flag_type=\"--\",\n rename_param=\"int-diag\",\n )\n push_res: str = Field(\n \"infinity\",\n description=\"Integrate `x` higher than apparent resolution limit (nm-1).\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n overpredict: bool = Field(\n False,\n description=\"Over-predict reflections. Maybe useful with post-refinement.\",\n flag_type=\"--\",\n )\n cell_parameters_only: bool = Field(\n False, description=\"Do not predict refletions at all\", flag_type=\"--\"\n )\n # Output parameters\n no_non_hits_in_stream: bool = Field(\n False,\n description=\"Exclude non-hits from the stream file.\",\n flag_type=\"--\",\n rename_param=\"no-non-hits-in-stream\",\n )\n copy_hheader: Optional[str] = Field(\n description=\"Copy information from header in the image to output stream.\",\n flag_type=\"--\",\n rename_param=\"copy-hheader\",\n )\n no_peaks_in_stream: bool = Field(\n False,\n description=\"Do not record peaks in stream file.\",\n flag_type=\"--\",\n rename_param=\"no-peaks-in-stream\",\n )\n no_refls_in_stream: bool = Field(\n False,\n description=\"Do not record reflections in stream.\",\n flag_type=\"--\",\n rename_param=\"no-refls-in-stream\",\n )\n serial_offset: Optional[PositiveInt] = Field(\n description=\"Start numbering at `x` instead of 1.\",\n flag_type=\"--\",\n rename_param=\"serial-offset\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n filename: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPyAlgos\", \"out_file\"\n )\n if filename is None:\n exp: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n tag: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"tag\"\n )\n out_dir: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"outDir\"\n )\n if out_dir is not None:\n fname: str = f\"{out_dir}/{exp}_{run:04d}\"\n if tag is not None:\n fname = f\"{fname}_{tag}\"\n return f\"{fname}.lst\"\n else:\n return filename\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n expmt: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n work_dir: str = values[\"lute_config\"].work_dir\n fname: str = f\"{expmt}_r{run:04d}.stream\"\n return f\"{work_dir}/{fname}\"\n return out_file\n
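The validate_out_file validator above gives the indexing stream a default name of the form <work_dir>/<experiment>_r<run>.stream when out_file is empty. A small sketch of that convention (hypothetical values):

def default_stream_name(work_dir: str, experiment: str, run: int) -> str:
    # Mirrors IndexCrystFELParameters.validate_out_file for an empty out_file.
    return f"{work_dir}/{experiment}_r{run:04d}.stream"


default_stream_name("/hypothetical/work_dir", "mfxp00123", 7)
# -> '/hypothetical/work_dir/mfxp00123_r0007.stream'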
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/","title":"sfx_merge","text":"Models for merging reflections in serial femtosecond crystallography.
Classes:
Name Description
MergePartialatorParameters
Perform merging using CrystFEL's partialator
.
CompareHKLParameters
Calculate figures of merit using CrystFEL's compare_hkl
.
ManipulateHKLParameters
Perform transformations on lists of reflections using CrystFEL's get_hkl
.
CompareHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's compare_hkl
for calculating figures of merit.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code in lute/io/models/sfx_merge.py
class CompareHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `compare_hkl` for calculating figures of merit.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/compare_hkl\",\n description=\"CrystFEL's reflection comparison binary.\",\n flag_type=\"\",\n )\n in_files: Optional[str] = Field(\n \"\",\n description=\"Path to input HKLs. Space-separated list of 2. Use output of partialator e.g.\",\n flag_type=\"\",\n )\n ## Need mechanism to set is_result=True ...\n symmetry: str = Field(\"\", description=\"Point group symmetry.\", flag_type=\"--\")\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n fom: str = Field(\n \"Rsplit\", description=\"Specify figure of merit to calculate.\", flag_type=\"--\"\n )\n nshells: int = Field(10, description=\"Use n resolution shells.\", flag_type=\"--\")\n # NEED A NEW CASE FOR THIS -> Boolean flag, no arg, one hyphen...\n # fix_unity: bool = Field(\n # False,\n # description=\"Fix scale factors to unity.\",\n # flag_type=\"-\",\n # rename_param=\"u\",\n # )\n shell_file: str = Field(\n \"\",\n description=\"Write the statistics in resolution shells to a file.\",\n flag_type=\"--\",\n rename_param=\"shell-file\",\n is_result=True,\n )\n ignore_negs: bool = Field(\n False,\n description=\"Ignore reflections with negative reflections.\",\n flag_type=\"--\",\n rename_param=\"ignore-negs\",\n )\n zero_negs: bool = Field(\n False,\n description=\"Set negative intensities to 0.\",\n flag_type=\"--\",\n rename_param=\"zero-negs\",\n )\n sigma_cutoff: Optional[Union[float, int, str]] = Field(\n # \"-infinity\",\n description=\"Discard reflections with I/sigma(I) < n. -infinity means no cutoff.\",\n flag_type=\"--\",\n rename_param=\"sigma-cutoff\",\n )\n rmin: Optional[float] = Field(\n description=\"Low resolution cutoff of 1/d (m-1). Use this or --lowres NOT both.\",\n flag_type=\"--\",\n )\n lowres: Optional[float] = Field(\n descirption=\"Low resolution cutoff in Angstroms. Use this or --rmin NOT both.\",\n flag_type=\"--\",\n )\n rmax: Optional[float] = Field(\n description=\"High resolution cutoff in 1/d (m-1). Use this or --highres NOT both.\",\n flag_type=\"--\",\n )\n highres: Optional[float] = Field(\n description=\"High resolution cutoff in Angstroms. 
Use this or --rmax NOT both.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_files\", always=True)\n def validate_in_files(cls, in_files: str, values: Dict[str, Any]) -> str:\n if in_files == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n hkls: str = f\"{partialator_file}1 {partialator_file}2\"\n return hkls\n return in_files\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n\n @validator(\"symmetry\", always=True)\n def validate_symmetry(cls, symmetry: str, values: Dict[str, Any]) -> str:\n if symmetry == \"\":\n partialator_sym: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"symmetry\"\n )\n if partialator_sym:\n return partialator_sym\n return symmetry\n\n @validator(\"shell_file\", always=True)\n def validate_shell_file(cls, shell_file: str, values: Dict[str, Any]) -> str:\n if shell_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n shells_out: str = partialator_file.split(\".\")[0]\n shells_out = f\"{shells_out}_{values['fom']}_n{values['nshells']}.dat\"\n return shells_out\n return shell_file\n
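The validators above chain this Task to MergePartialator output: the two half-dataset HKL files are formed by appending 1 and 2 to the partialator output name, and the default shell file name encodes the figure of merit and the number of shells. A standalone sketch of those defaults (hypothetical path):

def compare_hkl_defaults(partialator_out: str, fom: str = "Rsplit", nshells: int = 10):
    # Mirrors validate_in_files and validate_shell_file when the fields are empty.
    in_files = f"{partialator_out}1 {partialator_out}2"
    shell_file = f"{partialator_out.split('.')[0]}_{fom}_n{nshells}.dat"
    return in_files, shell_file


compare_hkl_defaults("/hypothetical/work_dir/mfxp00123_r0007.hkl")
# -> ('/hypothetical/work_dir/mfxp00123_r0007.hkl1 /hypothetical/work_dir/mfxp00123_r0007.hkl2',
#     '/hypothetical/work_dir/mfxp00123_r0007_Rsplit_n10.dat')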
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.CompareHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.CompareHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters","title":"ManipulateHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's get_hkl
for manipulating lists of reflections.
This Task is predominantly used internally to convert hkl
to mtz
files. Note that performing multiple manipulations is undefined behaviour. Run the Task with multiple configurations in explicit separate steps. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
lute/io/models/sfx_merge.py
class ManipulateHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `get_hkl` for manipulating lists of reflections.\n\n This Task is predominantly used internally to convert `hkl` to `mtz` files.\n Note that performing multiple manipulations is undefined behaviour. Run\n the Task with multiple configurations in explicit separate steps. For more\n information on usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/get_hkl\",\n description=\"CrystFEL's reflection manipulation binary.\",\n flag_type=\"\",\n )\n in_file: str = Field(\n \"\",\n description=\"Path to input HKL file.\",\n flag_type=\"-\",\n rename_param=\"i\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n output_format: str = Field(\n \"mtz\",\n description=\"Output format. One of mtz, mtz-bij, or xds. Otherwise CrystFEL format.\",\n flag_type=\"--\",\n rename_param=\"output-format\",\n )\n expand: Optional[str] = Field(\n description=\"Reflections will be expanded to fill asymmetric unit of specified point group.\",\n flag_type=\"--\",\n )\n # Reducing reflections to higher symmetry\n twin: Optional[str] = Field(\n description=\"Reflections equivalent to specified point group will have intensities summed.\",\n flag_type=\"--\",\n )\n no_need_all_parts: Optional[bool] = Field(\n description=\"Use with --twin to allow reflections missing a 'twin mate' to be written out.\",\n flag_type=\"--\",\n rename_param=\"no-need-all-parts\",\n )\n # Noise - Add to data\n noise: Optional[bool] = Field(\n description=\"Generate 10% uniform noise.\", flag_type=\"--\"\n )\n poisson: Optional[bool] = Field(\n description=\"Generate Poisson noise. Intensities assumed to be A.U.\",\n flag_type=\"--\",\n )\n adu_per_photon: Optional[int] = Field(\n description=\"Use with --poisson to convert A.U. to photons.\",\n flag_type=\"--\",\n rename_param=\"adu-per-photon\",\n )\n # Remove duplicate reflections\n trim_centrics: Optional[bool] = Field(\n description=\"Duplicated reflections (according to symmetry) are removed.\",\n flag_type=\"--\",\n )\n # Restrict to template file\n template: Optional[str] = Field(\n description=\"Only reflections which also appear in specified file are written out.\",\n flag_type=\"--\",\n )\n # Multiplicity\n multiplicity: Optional[bool] = Field(\n description=\"Reflections are multiplied by their symmetric multiplicites.\",\n flag_type=\"--\",\n )\n # Resolution cutoffs\n cutoff_angstroms: Optional[Union[str, int, float]] = Field(\n description=\"Either n, or n1,n2,n3. For n, reflections < n are removed. 
For n1,n2,n3 anisotropic trunction performed at separate resolution limits for a*, b*, c*.\",\n flag_type=\"--\",\n rename_param=\"cutoff-angstroms\",\n )\n lowres: Optional[float] = Field(\n description=\"Remove reflections with d > n\", flag_type=\"--\"\n )\n highres: Optional[float] = Field(\n description=\"Synonym for first form of --cutoff-angstroms\"\n )\n reindex: Optional[str] = Field(\n description=\"Reindex according to specified operator. E.g. k,h,-l.\",\n flag_type=\"--\",\n )\n # Override input symmetry\n symmetry: Optional[str] = Field(\n description=\"Point group symmetry to use to override. Almost always OMIT this option.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n return partialator_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n mtz_out: str = partialator_file.split(\".\")[0]\n mtz_out = f\"{mtz_out}.mtz\"\n return mtz_out\n return out_file\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters","title":"MergePartialatorParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's partialator
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code in lute/io/models/sfx_merge.py
class MergePartialatorParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `partialator`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/partialator\",\n description=\"CrystFEL's Partialator binary.\",\n flag_type=\"\",\n )\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input stream.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n symmetry: str = Field(description=\"Point group symmetry.\", flag_type=\"--\")\n niter: Optional[int] = Field(\n description=\"Number of cycles of scaling and post-refinement.\",\n flag_type=\"-\",\n rename_param=\"n\",\n )\n no_scale: Optional[bool] = Field(\n description=\"Disable scaling.\", flag_type=\"--\", rename_param=\"no-scale\"\n )\n no_Bscale: Optional[bool] = Field(\n description=\"Disable Debye-Waller part of scaling.\",\n flag_type=\"--\",\n rename_param=\"no-Bscale\",\n )\n no_pr: Optional[bool] = Field(\n description=\"Disable orientation model.\", flag_type=\"--\", rename_param=\"no-pr\"\n )\n no_deltacchalf: Optional[bool] = Field(\n description=\"Disable rejection based on deltaCC1/2.\",\n flag_type=\"--\",\n rename_param=\"no-deltacchalf\",\n )\n model: str = Field(\n \"unity\",\n description=\"Partiality model. Options: xsphere, unity, offset, ggpm.\",\n flag_type=\"--\",\n )\n nthreads: int = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of parallel analyses.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n polarisation: Optional[str] = Field(\n description=\"Specification of incident polarisation. Refer to CrystFEL docs for more info.\",\n flag_type=\"--\",\n )\n no_polarisation: Optional[bool] = Field(\n description=\"Synonym for --polarisation=none\",\n flag_type=\"--\",\n rename_param=\"no-polarisation\",\n )\n max_adu: Optional[float] = Field(\n description=\"Maximum intensity of reflection to include.\",\n flag_type=\"--\",\n rename_param=\"max-adu\",\n )\n min_res: Optional[float] = Field(\n description=\"Only include crystals diffracting to a minimum resolution.\",\n flag_type=\"--\",\n rename_param=\"min-res\",\n )\n min_measurements: int = Field(\n 2,\n description=\"Include a reflection only if it appears a minimum number of times.\",\n flag_type=\"--\",\n rename_param=\"min-measurements\",\n )\n push_res: Optional[float] = Field(\n description=\"Merge reflections up to higher than the apparent resolution limit.\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n start_after: int = Field(\n 0,\n description=\"Ignore the first n crystals.\",\n flag_type=\"--\",\n rename_param=\"start-after\",\n )\n stop_after: int = Field(\n 0,\n description=\"Stop after processing n crystals. 0 means process all.\",\n flag_type=\"--\",\n rename_param=\"stop-after\",\n )\n no_free: Optional[bool] = Field(\n description=\"Disable cross-validation. 
Testing ONLY.\",\n flag_type=\"--\",\n rename_param=\"no-free\",\n )\n custom_split: Optional[str] = Field(\n description=\"Read a set of filenames, event and dataset IDs from a filename.\",\n flag_type=\"--\",\n rename_param=\"custom-split\",\n )\n max_rel_B: float = Field(\n 100,\n description=\"Reject crystals if |relB| > n sq Angstroms.\",\n flag_type=\"--\",\n rename_param=\"max-rel-B\",\n )\n output_every_cycle: bool = Field(\n False,\n description=\"Write per-crystal params after every refinement cycle.\",\n flag_type=\"--\",\n rename_param=\"output-every-cycle\",\n )\n no_logs: bool = Field(\n False,\n description=\"Do not write logs needed for plots, maps and graphs.\",\n flag_type=\"--\",\n rename_param=\"no-logs\",\n )\n set_symmetry: Optional[str] = Field(\n description=\"Set the apparent symmetry of the crystals to a point group.\",\n flag_type=\"-\",\n rename_param=\"w\",\n )\n operator: Optional[str] = Field(\n description=\"Specify an ambiguity operator. E.g. k,h,-l.\", flag_type=\"--\"\n )\n force_bandwidth: Optional[float] = Field(\n description=\"Set X-ray bandwidth. As percent, e.g. 0.0013 (0.13%).\",\n flag_type=\"--\",\n rename_param=\"force-bandwidth\",\n )\n force_radius: Optional[float] = Field(\n description=\"Set the initial profile radius (nm-1).\",\n flag_type=\"--\",\n rename_param=\"force-radius\",\n )\n force_lambda: Optional[float] = Field(\n description=\"Set the wavelength. In Angstroms.\",\n flag_type=\"--\",\n rename_param=\"force-lambda\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"ConcatenateStreamFiles\",\n \"out_file\",\n )\n if stream_file:\n return stream_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n in_file: str = values[\"in_file\"]\n if in_file:\n tag: str = in_file.split(\".\")[0]\n return f\"{tag}.hkl\"\n else:\n return \"partialator.hkl\"\n return out_file\n
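When out_file is empty, the validator above derives the merged HKL name from the input stream, falling back to partialator.hkl if no input is known. A standalone sketch (hypothetical path):

def default_partialator_out(in_file: str) -> str:
    # Mirrors MergePartialatorParameters.validate_out_file for an empty out_file.
    return f"{in_file.split('.')[0]}.hkl" if in_file else "partialator.hkl"


default_partialator_out("/hypothetical/work_dir/mfxp00123_r0007.stream")
# -> '/hypothetical/work_dir/mfxp00123_r0007.hkl'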
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_solve/","title":"sfx_solve","text":"Models for structure solution in serial femtosecond crystallography.
Classes:
Name DescriptionDimpleSolveParameters
Perform structure solution using CCP4's dimple (molecular replacement).
"},{"location":"source/io/models/sfx_solve/#io.models.sfx_solve.DimpleSolveParameters","title":"DimpleSolveParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's dimple program.
There are many parameters. For more information on usage, please refer to the CCP4 documentation, here: https://ccp4.github.io/dimple/
Source code inlute/io/models/sfx_solve.py
class DimpleSolveParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's dimple program.\n\n There are many parameters. For more information on\n usage, please refer to the CCP4 documentation, here:\n https://ccp4.github.io/dimple/\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/dimple\",\n description=\"CCP4 Dimple for solving structures with MR.\",\n flag_type=\"\",\n )\n # Positional requirements - all required.\n in_file: str = Field(\n \"\",\n description=\"Path to input mtz.\",\n flag_type=\"\",\n )\n pdb: str = Field(\"\", description=\"Path to a PDB.\", flag_type=\"\")\n out_dir: str = Field(\"\", description=\"Output DIRECTORY.\", flag_type=\"\")\n # Most used options\n mr_thresh: PositiveFloat = Field(\n 0.4,\n description=\"Threshold for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-when-r\",\n )\n slow: Optional[bool] = Field(\n False, description=\"Perform more refinement.\", flag_type=\"--\"\n )\n # Other options (IO)\n hklout: str = Field(\n \"final.mtz\", description=\"Output mtz file name.\", flag_type=\"--\"\n )\n xyzout: str = Field(\n \"final.pdb\", description=\"Output PDB file name.\", flag_type=\"--\"\n )\n icolumn: Optional[str] = Field(\n # \"IMEAN\",\n description=\"Name for the I column.\",\n flag_type=\"--\",\n )\n sigicolumn: Optional[str] = Field(\n # \"SIG<ICOL>\",\n description=\"Name for the Sig<I> column.\",\n flag_type=\"--\",\n )\n fcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the F column.\",\n flag_type=\"--\",\n )\n sigfcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the Sig<F> column.\",\n flag_type=\"--\",\n )\n libin: Optional[str] = Field(\n description=\"Ligand descriptions for refmac (LIBIN).\", flag_type=\"--\"\n )\n refmac_key: Optional[str] = Field(\n description=\"Extra Refmac keywords to use in refinement.\",\n flag_type=\"--\",\n rename_param=\"refmac-key\",\n )\n free_r_flags: Optional[str] = Field(\n description=\"Path to a mtz file with freeR flags.\",\n flag_type=\"--\",\n rename_param=\"free-r-flags\",\n )\n freecolumn: Optional[Union[int, float]] = Field(\n # 0,\n description=\"Refree column with an optional value.\",\n flag_type=\"--\",\n )\n img_format: Optional[str] = Field(\n description=\"Format of generated images. 
(png, jpeg, none).\",\n flag_type=\"-\",\n rename_param=\"f\",\n )\n white_bg: bool = Field(\n False,\n description=\"Use a white background in Coot and in images.\",\n flag_type=\"--\",\n rename_param=\"white-bg\",\n )\n no_cleanup: bool = Field(\n False,\n description=\"Retain intermediate files.\",\n flag_type=\"--\",\n rename_param=\"no-cleanup\",\n )\n # Calculations\n no_blob_search: bool = Field(\n False,\n description=\"Do not search for unmodelled blobs.\",\n flag_type=\"--\",\n rename_param=\"no-blob-search\",\n )\n anode: bool = Field(\n False, description=\"Use SHELX/AnoDe to find peaks in the anomalous map.\"\n )\n # Run customization\n no_hetatm: bool = Field(\n False,\n description=\"Remove heteroatoms from the given model.\",\n flag_type=\"--\",\n rename_param=\"no-hetatm\",\n )\n rigid_cycles: Optional[PositiveInt] = Field(\n # 10,\n description=\"Number of cycles of rigid-body refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"rigid-cycles\",\n )\n jelly: Optional[PositiveInt] = Field(\n # 4,\n description=\"Number of cycles of jelly-body refinement to perform.\",\n flag_type=\"--\",\n )\n restr_cycles: Optional[PositiveInt] = Field(\n # 8,\n description=\"Number of cycles of refmac final refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"restr-cycles\",\n )\n lim_resolution: Optional[PositiveFloat] = Field(\n description=\"Limit the final resolution.\", flag_type=\"--\", rename_param=\"reso\"\n )\n weight: Optional[str] = Field(\n # \"auto-weight\",\n description=\"The refmac matrix weight.\",\n flag_type=\"--\",\n )\n mr_prog: Optional[str] = Field(\n # \"phaser\",\n description=\"Molecular replacement program. phaser or molrep.\",\n flag_type=\"--\",\n rename_param=\"mr-prog\",\n )\n mr_num: Optional[Union[str, int]] = Field(\n # \"auto\",\n description=\"Number of molecules to use for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-num\",\n )\n mr_reso: Optional[PositiveFloat] = Field(\n # 3.25,\n description=\"High resolution for molecular replacement. If >10 interpreted as eLLG.\",\n flag_type=\"--\",\n rename_param=\"mr-reso\",\n )\n itof_prog: Optional[str] = Field(\n description=\"Program to calculate amplitudes. truncate, or ctruncate.\",\n flag_type=\"--\",\n rename_param=\"ItoF-prog\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return get_hkl_file\n return in_file\n\n @validator(\"out_dir\", always=True)\n def validate_out_dir(cls, out_dir: str, values: Dict[str, Any]) -> str:\n if out_dir == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return os.path.dirname(get_hkl_file)\n return out_dir\n
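As a minimal sketch (not LUTE code) of the two validators above: when `in_file` or `out_dir` is empty, the model falls back to the most recent `ManipulateHKL` output recorded in the LUTE database, and the output directory is simply that file's parent directory. The path below is hypothetical.

```python
import os

# Hypothetical value standing in for
# read_latest_db_entry(work_dir, "ManipulateHKL", "out_file")
latest_hkl = "/sdf/data/lcls/ds/mfx/mfxl1234567/results/run0001.mtz"

in_file = latest_hkl                   # validate_in_file fallback
out_dir = os.path.dirname(latest_hkl)  # validate_out_dir fallback
print(in_file)
print(out_dir)  # /sdf/data/lcls/ds/mfx/mfxl1234567/results
```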
"},{"location":"source/io/models/sfx_solve/#io.models.sfx_solve.RunSHELXCParameters","title":"RunSHELXCParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's SHELXC program.
SHELXC prepares files for SHELXD and SHELXE.
For more information please refer to the official documentation: https://www.ccp4.ac.uk/html/crank.html
Source code inlute/io/models/sfx_solve.py
class RunSHELXCParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's SHELXC program.\n\n SHELXC prepares files for SHELXD and SHELXE.\n\n For more information please refer to the official documentation:\n https://www.ccp4.ac.uk/html/crank.html\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/shelxc\",\n description=\"CCP4 SHELXC. Generates input files for SHELXD/SHELXE.\",\n flag_type=\"\",\n )\n placeholder: str = Field(\n \"xx\", description=\"Placeholder filename stem.\", flag_type=\"\"\n )\n in_file: str = Field(\n \"\",\n description=\"Input file for SHELXC with reflections AND proper records.\",\n flag_type=\"\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n # get_hkl needed to be run to produce an XDS format file...\n xds_format_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if xds_format_file:\n in_file = xds_format_file\n if in_file[0] != \"<\":\n # Need to add a redirection for this program\n # Runs like `shelxc xx <input_file.xds`\n in_file = f\"<{in_file}\"\n return in_file\n
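The `validate_in_file` validator above also prepends a `<` so that the positional argument becomes a shell redirection (the program runs as `shelxc xx <input_file.xds`). A minimal sketch of that string handling (not LUTE code), with a hypothetical file name:

```python
# Minimal sketch of the redirection handling in validate_in_file; not LUTE code.
def normalize_shelxc_input(in_file: str) -> str:
    if in_file and in_file[0] != "<":
        in_file = f"<{in_file}"  # shelxc reads the reflection file from stdin
    return in_file

print(normalize_shelxc_input("peaks.xds"))   # <peaks.xds
print(normalize_shelxc_input("<peaks.xds"))  # already redirected; unchanged
```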
"},{"location":"source/io/models/smd/","title":"smd","text":"Models for smalldata_tools Tasks.
Classes:
Name DescriptionSubmitSMDParameters
Parameters to run smalldata_tools to produce a smalldata HDF5 file.
FindOverlapXSSParameters
Parameter model for the FindOverlapXSS Task. Used to determine spatial/temporal overlap based on XSS difference signal.
"},{"location":"source/io/models/smd/#io.models.smd.FindOverlapXSSParameters","title":"FindOverlapXSSParameters
","text":" Bases: TaskParameters
TaskParameter model for FindOverlapXSS Task.
This Task determines spatial or temporal overlap between an optical pulse and the FEL pulse based on difference scattering (XSS) signal. This Task uses SmallData HDF5 files as a source.
Source code inlute/io/models/smd.py
class FindOverlapXSSParameters(TaskParameters):\n \"\"\"TaskParameter model for FindOverlapXSS Task.\n\n This Task determines spatial or temporal overlap between an optical pulse\n and the FEL pulse based on difference scattering (XSS) signal. This Task\n uses SmallData HDF5 files as a source.\n \"\"\"\n\n class ExpConfig(BaseModel):\n det_name: str\n ipm_var: str\n scan_var: Union[str, List[str]]\n\n class Thresholds(BaseModel):\n min_Iscat: Union[int, float]\n min_ipm: Union[int, float]\n\n class AnalysisFlags(BaseModel):\n use_pyfai: bool = True\n use_asymls: bool = False\n\n exp_config: ExpConfig\n thresholds: Thresholds\n analysis_flags: AnalysisFlags\n
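Since this model nests three sub-models, the expected parameter structure may be easier to see written out as plain data. A minimal sketch (not LUTE code); the detector, IPM and scan variable names are hypothetical:

```python
# Hypothetical parameter values matching the nested models above.
find_overlap_xss_params = {
    "exp_config": {
        "det_name": "epix10k2M",  # ExpConfig.det_name
        "ipm_var": "ipm4/sum",    # ExpConfig.ipm_var
        "scan_var": "lxt",        # ExpConfig.scan_var (str or list of str)
    },
    "thresholds": {"min_Iscat": 10.0, "min_ipm": 500.0},
    "analysis_flags": {"use_pyfai": True, "use_asymls": False},
}
```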
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters","title":"SubmitSMDParameters
","text":" Bases: ThirdPartyParameters
Parameters for running smalldata to produce reduced HDF5 files.
Source code inlute/io/models/smd.py
class SubmitSMDParameters(ThirdPartyParameters):\n \"\"\"Parameters for running smalldata to produce reduced HDF5 files.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n m: str = Field(\n \"mpi4py.run\",\n description=\"Python option to execute a module's contents as __main__ module.\",\n flag_type=\"-\",\n )\n producer: str = Field(\n \"\", description=\"Path to the SmallData producer Python script.\", flag_type=\"\"\n )\n run: str = Field(\n os.environ.get(\"RUN_NUM\", \"\"), description=\"DAQ Run Number.\", flag_type=\"--\"\n )\n experiment: str = Field(\n os.environ.get(\"EXPERIMENT\", \"\"),\n description=\"LCLS Experiment Number.\",\n flag_type=\"--\",\n )\n stn: NonNegativeInt = Field(0, description=\"Hutch endstation.\", flag_type=\"--\")\n nevents: int = Field(\n int(1e9), description=\"Number of events to process.\", flag_type=\"--\"\n )\n directory: Optional[str] = Field(\n None,\n description=\"Optional output directory. If None, will be in ${EXP_FOLDER}/hdf5/smalldata.\",\n flag_type=\"--\",\n )\n ## Need mechanism to set result_from_param=True ...\n gather_interval: PositiveInt = Field(\n 25, description=\"Number of events to collect at a time.\", flag_type=\"--\"\n )\n norecorder: bool = Field(\n False, description=\"Whether to ignore recorder streams.\", flag_type=\"--\"\n )\n url: HttpUrl = Field(\n \"https://pswww.slac.stanford.edu/ws-auth/lgbk\",\n description=\"Base URL for eLog posting.\",\n flag_type=\"--\",\n )\n epicsAll: bool = Field(\n False,\n description=\"Whether to store all EPICS PVs. Use with care.\",\n flag_type=\"--\",\n )\n full: bool = Field(\n False,\n description=\"Whether to store all data. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n fullSum: bool = Field(\n False,\n description=\"Whether to store sums for all area detector images.\",\n flag_type=\"--\",\n )\n default: bool = Field(\n False,\n description=\"Whether to store only the default minimal set of data.\",\n flag_type=\"--\",\n )\n image: bool = Field(\n False,\n description=\"Whether to save everything as images. Use with care.\",\n flag_type=\"--\",\n )\n tiff: bool = Field(\n False,\n description=\"Whether to save all images as a single TIFF. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n centerpix: bool = Field(\n False,\n description=\"Whether to mask center pixels for Epix10k2M detectors.\",\n flag_type=\"--\",\n )\n postRuntable: bool = Field(\n False,\n description=\"Whether to post run tables. 
Also used as a trigger for summary jobs.\",\n flag_type=\"--\",\n )\n wait: bool = Field(\n False, description=\"Whether to wait for a file to appear.\", flag_type=\"--\"\n )\n xtcav: bool = Field(\n False,\n description=\"Whether to add XTCAV processing to the HDF5 generation.\",\n flag_type=\"--\",\n )\n noarch: bool = Field(\n False, description=\"Whether to not use archiver data.\", flag_type=\"--\"\n )\n\n lute_template_cfg: TemplateConfig = TemplateConfig(template_name=\"\", output_path=\"\")\n\n @validator(\"producer\", always=True)\n def validate_producer_path(cls, producer: str) -> str:\n return producer\n\n @validator(\"lute_template_cfg\", always=True)\n def use_producer(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if not lute_template_cfg.output_path:\n lute_template_cfg.output_path = values[\"producer\"]\n return lute_template_cfg\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = values[\"lute_config\"].experiment\n hutch: str = exp[:3]\n run: int = int(values[\"lute_config\"].run)\n directory: Optional[str] = values[\"directory\"]\n if directory is None:\n directory = f\"/sdf/data/lcls/ds/{hutch}/{exp}/hdf5/smalldata\"\n fname: str = f\"{exp}_Run{run:04d}.h5\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
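A minimal sketch (not LUTE code) of how the `define_result` validator above composes the default smalldata HDF5 path when no `directory` is given; the experiment name and run number are hypothetical:

```python
# Hypothetical experiment/run; mirrors the path construction in define_result.
exp, run = "mfxl1234567", 12
hutch = exp[:3]  # first three characters give the hutch, e.g. "mfx"
directory = f"/sdf/data/lcls/ds/{hutch}/{exp}/hdf5/smalldata"
fname = f"{exp}_Run{run:04d}.h5"
print(f"{directory}/{fname}")
# /sdf/data/lcls/ds/mfx/mfxl1234567/hdf5/smalldata/mfxl1234567_Run0012.h5
```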
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config","title":"Config
","text":" Bases: Config
Identical to super-class Config but includes a result.
Source code inlute/io/models/smd.py
class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/tests/","title":"tests","text":"Models for all test Tasks.
Classes:
Name DescriptionTestParameters
Model for most basic test case. Single core first-party Task. Uses only communication via pipes.
TestBinaryParameters
Parameters for a simple multi- threaded binary executable.
TestSocketParameters
Model for first-party test requiring communication via socket.
TestWriteOutputParameters
Model for test Task which writes an output file. Location of file is recorded in database.
TestReadOutputParameters
Model for test Task which locates an output file based on an entry in the database, if no path is provided.
"},{"location":"source/io/models/tests/#io.models.tests.TestBinaryErrParameters","title":"TestBinaryErrParameters
","text":" Bases: ThirdPartyParameters
Same as TestBinary, but exits with non-zero code.
Source code inlute/io/models/tests.py
class TestBinaryErrParameters(ThirdPartyParameters):\n    \"\"\"Same as TestBinary, but exits with non-zero code.\"\"\"\n\n    executable: str = Field(\n        \"/sdf/home/d/dorlhiac/test_tasks/test_threads_err\",\n        description=\"Multi-threaded test binary with non-zero exit code.\",\n    )\n    p_arg1: int = Field(1, description=\"Number of threads.\")\n
"},{"location":"source/io/models/tests/#io.models.tests.TestParameters","title":"TestParameters
","text":" Bases: TaskParameters
Parameters for the test Task Test
.
lute/io/models/tests.py
class TestParameters(TaskParameters):\n \"\"\"Parameters for the test Task `Test`.\"\"\"\n\n float_var: float = Field(0.01, description=\"A floating point number.\")\n str_var: str = Field(\"test\", description=\"A string.\")\n\n class CompoundVar(BaseModel):\n int_var: int = 1\n dict_var: Dict[str, str] = {\"a\": \"b\"}\n\n compound_var: CompoundVar = Field(\n description=(\n \"A compound parameter - consists of a `int_var` (int) and `dict_var`\"\n \" (Dict[str, str]).\"\n )\n )\n throw_error: bool = Field(\n False, description=\"If `True`, raise an exception to test error handling.\"\n )\n
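For reference, a minimal sketch (not LUTE code) of the shape of these test parameters, including the nested `compound_var`, written out as plain data with the default values shown above:

```python
# Default-like values for the Test Task parameters shown above.
test_params = {
    "float_var": 0.01,
    "str_var": "test",
    "compound_var": {"int_var": 1, "dict_var": {"a": "b"}},
    "throw_error": False,
}
```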
"},{"location":"source/tasks/dataclasses/","title":"dataclasses","text":"Classes for describing Task state and results.
Classes:
Name DescriptionTaskResult
Output of a specific analysis task.
TaskStatus
Enumeration of possible Task statuses (running, pending, failed, etc.).
DescribedAnalysis
Executor's description of a Task
run (results, parameters, env).
DescribedAnalysis
dataclass
","text":"Complete analysis description. Held by an Executor.
Source code inlute/tasks/dataclasses.py
@dataclass\nclass DescribedAnalysis:\n \"\"\"Complete analysis description. Held by an Executor.\"\"\"\n\n task_result: TaskResult\n task_parameters: Optional[TaskParameters]\n task_env: Dict[str, str]\n poll_interval: float\n communicator_desc: List[str]\n
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.ElogSummaryPlots","title":"ElogSummaryPlots
dataclass
","text":"Holds a graphical summary intended for display in the eLog.
Attributes:
Name Type Descriptiondisplay_name
str
This represents both a path and how the result will be displayed in the eLog. Can include \"/\" characters. E.g. display_name = \"scans/my_motor_scan\"
will have plots shown on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors how the file is stored on disk as well.
lute/tasks/dataclasses.py
@dataclass\nclass ElogSummaryPlots:\n \"\"\"Holds a graphical summary intended for display in the eLog.\n\n Attributes:\n display_name (str): This represents both a path and how the result will be\n displayed in the eLog. Can include \"/\" characters. E.g.\n `display_name = \"scans/my_motor_scan\"` will have plots shown\n on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors\n how the file is stored on disk as well.\n \"\"\"\n\n display_name: str\n figures: Union[pn.Tabs, hv.Image, plt.Figure]\n
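A minimal sketch (not LUTE code) of how a `display_name` such as `"scans/my_motor_scan"` splits into an eLog tab and a page name; the same components mirror the on-disk layout:

```python
display_name = "scans/my_motor_scan"
*tabs, page = display_name.split("/")
print(tabs, page)  # ['scans'] my_motor_scan
```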
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskResult","title":"TaskResult
dataclass
","text":"Class for storing the result of a Task's execution with metadata.
Attributes:
Name Type Descriptiontask_name
str
Name of the associated task which produced it.
task_status
TaskStatus
Status of associated task.
summary
str
Short message/summary associated with the result.
payload
Any
Actual result. May be data in any format.
impl_schemas
Optional[str]
A string listing Task
schemas implemented by the associated Task
. Schemas define the category and expected output of the Task
. An individual task may implement/conform to multiple schemas. Multiple schemas are separated by ';', e.g. * impl_schemas = \"schema1;schema2\"
lute/tasks/dataclasses.py
@dataclass\nclass TaskResult:\n \"\"\"Class for storing the result of a Task's execution with metadata.\n\n Attributes:\n task_name (str): Name of the associated task which produced it.\n\n task_status (TaskStatus): Status of associated task.\n\n summary (str): Short message/summary associated with the result.\n\n payload (Any): Actual result. May be data in any format.\n\n impl_schemas (Optional[str]): A string listing `Task` schemas implemented\n by the associated `Task`. Schemas define the category and expected\n output of the `Task`. An individual task may implement/conform to\n multiple schemas. Multiple schemas are separated by ';', e.g.\n * impl_schemas = \"schema1;schema2\"\n \"\"\"\n\n task_name: str\n task_status: TaskStatus\n summary: str\n payload: Any\n impl_schemas: Optional[str] = None\n
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus","title":"TaskStatus
","text":" Bases: Enum
Possible Task statuses.
Source code inlute/tasks/dataclasses.py
class TaskStatus(Enum):\n \"\"\"Possible Task statuses.\"\"\"\n\n PENDING = 0\n \"\"\"\n Task has yet to run. Is Queued, or waiting for prior tasks.\n \"\"\"\n RUNNING = 1\n \"\"\"\n Task is in the process of execution.\n \"\"\"\n COMPLETED = 2\n \"\"\"\n Task has completed without fatal errors.\n \"\"\"\n FAILED = 3\n \"\"\"\n Task encountered a fatal error.\n \"\"\"\n STOPPED = 4\n \"\"\"\n Task was, potentially temporarily, stopped/suspended.\n \"\"\"\n CANCELLED = 5\n \"\"\"\n Task was cancelled prior to completion or failure.\n \"\"\"\n TIMEDOUT = 6\n \"\"\"\n Task did not reach completion due to timeout.\n \"\"\"\n
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.CANCELLED","title":"CANCELLED = 5
class-attribute
instance-attribute
","text":"Task was cancelled prior to completion or failure.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.COMPLETED","title":"COMPLETED = 2
class-attribute
instance-attribute
","text":"Task has completed without fatal errors.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.FAILED","title":"FAILED = 3
class-attribute
instance-attribute
","text":"Task encountered a fatal error.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.PENDING","title":"PENDING = 0
class-attribute
instance-attribute
","text":"Task has yet to run. Is Queued, or waiting for prior tasks.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.RUNNING","title":"RUNNING = 1
class-attribute
instance-attribute
","text":"Task is in the process of execution.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.STOPPED","title":"STOPPED = 4
class-attribute
instance-attribute
","text":"Task was, potentially temporarily, stopped/suspended.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.TIMEDOUT","title":"TIMEDOUT = 6
class-attribute
instance-attribute
","text":"Task did not reach completion due to timeout.
"},{"location":"source/tasks/sfx_find_peaks/","title":"sfx_find_peaks","text":"Classes for peak finding tasks in SFX.
Classes:
Name DescriptionCxiWriter
Utility class for writing peak finding results to CXI files.
FindPeaksPyAlgos
Peak finding using psana's PyAlgos algorithm. Optional data compression and decompression with libpressio for data reduction tests.
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter","title":"CxiWriter
","text":"Source code in lute/tasks/sfx_find_peaks.py
class CxiWriter:\n\n def __init__(\n self,\n outdir: str,\n rank: int,\n exp: str,\n run: int,\n n_events: int,\n det_shape: Tuple[int, ...],\n min_peaks: int,\n max_peaks: int,\n i_x: Any, # Not typed becomes it comes from psana\n i_y: Any, # Not typed becomes it comes from psana\n ipx: Any, # Not typed becomes it comes from psana\n ipy: Any, # Not typed becomes it comes from psana\n tag: str,\n ):\n \"\"\"\n Set up the CXI files to which peak finding results will be saved.\n\n Parameters:\n\n outdir (str): Output directory for cxi file.\n\n rank (int): MPI rank of the caller.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n n_events (int): Number of events to process.\n\n det_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n data. This must be aCheetah-stile 2D array.\n\n min_peaks (int): Minimum number of peaks per image.\n\n max_peaks (int): Maximum number of peaks per image.\n\n i_x (Any): Array of pixel indexes along x\n\n i_y (Any): Array of pixel indexes along y\n\n ipx (Any): Pixel indexes with respect to detector origin (x component)\n\n ipy (Any): Pixel indexes with respect to detector origin (y component)\n\n tag (str): Tag to append to cxi file names.\n \"\"\"\n self._det_shape: Tuple[int, ...] = det_shape\n self._i_x: Any = i_x\n self._i_y: Any = i_y\n self._ipx: Any = ipx\n self._ipy: Any = ipy\n self._index: int = 0\n\n # Create and open the HDF5 file\n fname: str = f\"{exp}_r{run:0>4}_{rank}{tag}.cxi\"\n Path(outdir).mkdir(exist_ok=True)\n self._outh5: Any = h5py.File(Path(outdir) / fname, \"w\")\n\n # Entry_1 entry for processing with CrystFEL\n entry_1: Any = self._outh5.create_group(\"entry_1\")\n keys: List[str] = [\n \"nPeaks\",\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]\n ds_expId: Any = entry_1.create_dataset(\n \"experimental_identifier\", (n_events,), maxshape=(None,), dtype=int\n )\n ds_expId.attrs[\"axes\"] = \"experiment_identifier\"\n data_1: Any = entry_1.create_dataset(\n \"/entry_1/data_1/data\",\n (n_events, det_shape[0], det_shape[1]),\n chunks=(1, det_shape[0], det_shape[1]),\n maxshape=(None, det_shape[0], det_shape[1]),\n dtype=numpy.float32,\n )\n data_1.attrs[\"axes\"] = \"experiment_identifier\"\n key: str\n for key in [\"powderHits\", \"powderMisses\", \"mask\"]:\n entry_1.create_dataset(\n f\"/entry_1/data_1/{key}\",\n (det_shape[0], det_shape[1]),\n chunks=(det_shape[0], det_shape[1]),\n maxshape=(det_shape[0], det_shape[1]),\n dtype=float,\n )\n\n # Peak-related entries\n for key in keys:\n if key == \"nPeaks\":\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events,),\n maxshape=(None,),\n dtype=int,\n )\n ds_x.attrs[\"minPeaks\"] = min_peaks\n ds_x.attrs[\"maxPeaks\"] = max_peaks\n else:\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events, max_peaks),\n maxshape=(None, max_peaks),\n chunks=(1, max_peaks),\n dtype=float,\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier:peaks\"\n\n # Timestamp entries\n lcls_1: Any = self._outh5.create_group(\"LCLS\")\n keys: List[str] = [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ]\n key: str\n for key in keys:\n if key == \"photon_energy_eV\":\n ds_x: Any = lcls_1.create_dataset(\n f\"{key}\", (n_events,), maxshape=(None,), dtype=float\n )\n else:\n ds_x = lcls_1.create_dataset(\n f\"{key}\", 
(n_events,), maxshape=(None,), dtype=int\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n ds_x = self._outh5.create_dataset(\n \"/LCLS/detector_1/EncoderValue\", (n_events,), maxshape=(None,), dtype=float\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n def write_event(\n self,\n img: NDArray[numpy.float_],\n peaks: Any, # Not typed becomes it comes from psana\n timestamp_seconds: int,\n timestamp_nanoseconds: int,\n timestamp_fiducials: int,\n photon_energy: float,\n ):\n \"\"\"\n Write peak finding results for an event into the HDF5 file.\n\n Parameters:\n\n img (NDArray[numpy.float_]): Detector data for the event\n\n peaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\n timestamp_seconds (int): Second part of the event's timestamp information\n\n timestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\n timestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\n photon_energy (float): Photon energy for the event\n \"\"\"\n ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]\n ch_cols: NDArray[numpy.float_] = peaks[:, 2]\n\n # Entry_1 entry for processing with CrystFEL\n self._outh5[\"/entry_1/data_1/data\"][self._index, :, :] = img.reshape(\n -1, img.shape[-1]\n )\n self._outh5[\"/entry_1/result_1/nPeaks\"][self._index] = peaks.shape[0]\n self._outh5[\"/entry_1/result_1/peakXPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_cols.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/peakYPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_rows.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/rcent\"][self._index, : peaks.shape[0]] = peaks[\n :, 6\n ]\n self._outh5[\"/entry_1/result_1/ccent\"][self._index, : peaks.shape[0]] = peaks[\n :, 7\n ]\n self._outh5[\"/entry_1/result_1/rmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 10\n ]\n self._outh5[\"/entry_1/result_1/rmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 11\n ]\n self._outh5[\"/entry_1/result_1/cmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 12\n ]\n self._outh5[\"/entry_1/result_1/cmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 13\n ]\n self._outh5[\"/entry_1/result_1/peakTotalIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 5]\n self._outh5[\"/entry_1/result_1/peakMaxIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 4]\n\n # Calculate and write pixel radius\n peaks_cenx: NDArray[numpy.float_] = (\n self._i_x[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipx\n )\n peaks_ceny: NDArray[numpy.float_] = (\n self._i_y[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipy\n )\n peak_radius: NDArray[numpy.float_] = numpy.sqrt(\n (peaks_cenx**2) + (peaks_ceny**2)\n )\n self._outh5[\"/entry_1/result_1/peakRadius\"][\n self._index, : peaks.shape[0]\n ] = peak_radius\n\n # LCLS entry dataset\n self._outh5[\"/LCLS/machineTime\"][self._index] = timestamp_seconds\n self._outh5[\"/LCLS/machineTimeNanoSeconds\"][self._index] = timestamp_nanoseconds\n self._outh5[\"/LCLS/fiducial\"][self._index] = timestamp_fiducials\n self._outh5[\"/LCLS/photon_energy_eV\"][self._index] = photon_energy\n\n self._index += 1\n\n def write_non_event_data(\n self,\n powder_hits: NDArray[numpy.float_],\n powder_misses: NDArray[numpy.float_],\n mask: 
NDArray[numpy.uint16],\n clen: float,\n ):\n \"\"\"\n Write to the file data that is not related to a specific event (masks, powders)\n\n Parameters:\n\n powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n powder_misses (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n mask: (NDArray[numpy.uint16]): Pixel ask to write into the file\n\n \"\"\"\n # Add powders and mask to files, reshaping them to match the crystfel\n # convention\n self._outh5[\"/entry_1/data_1/powderHits\"][:] = powder_hits.reshape(\n -1, powder_hits.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/powderMisses\"][:] = powder_misses.reshape(\n -1, powder_misses.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/mask\"][:] = (1 - mask).reshape(\n -1, mask.shape[-1]\n ) # Crystfel expects inverted values\n\n # Add clen distance\n self._outh5[\"/LCLS/detector_1/EncoderValue\"][:] = clen\n\n def optimize_and_close_file(\n self,\n num_hits: int,\n max_peaks: int,\n ):\n \"\"\"\n Resize data blocks and write additional information to the file\n\n Parameters:\n\n num_hits (int): Number of hits for which information has been saved to the\n file\n\n max_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n \"\"\"\n\n # Resize the entry_1 entry\n data_shape: Tuple[int, ...] = self._outh5[\"/entry_1/data_1/data\"].shape\n self._outh5[\"/entry_1/data_1/data\"].resize(\n (num_hits, data_shape[1], data_shape[2])\n )\n self._outh5[f\"/entry_1/result_1/nPeaks\"].resize((num_hits,))\n key: str\n for key in [\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]:\n self._outh5[f\"/entry_1/result_1/{key}\"].resize((num_hits, max_peaks))\n\n # Resize LCLS entry\n for key in [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"detector_1/EncoderValue\",\n \"photon_energy_eV\",\n ]:\n self._outh5[f\"/LCLS/{key}\"].resize((num_hits,))\n self._outh5.close()\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.__init__","title":"__init__(outdir, rank, exp, run, n_events, det_shape, min_peaks, max_peaks, i_x, i_y, ipx, ipy, tag)
","text":"Set up the CXI files to which peak finding results will be saved.
Parameters:
outdir (str): Output directory for cxi file.\n\nrank (int): MPI rank of the caller.\n\nexp (str): Experiment string.\n\nrun (int): Experimental run.\n\nn_events (int): Number of events to process.\n\ndet_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n    data. This must be a Cheetah-style 2D array.\n\nmin_peaks (int): Minimum number of peaks per image.\n\nmax_peaks (int): Maximum number of peaks per image.\n\ni_x (Any): Array of pixel indexes along x\n\ni_y (Any): Array of pixel indexes along y\n\nipx (Any): Pixel indexes with respect to detector origin (x component)\n\nipy (Any): Pixel indexes with respect to detector origin (y component)\n\ntag (str): Tag to append to cxi file names.\n
Source code in lute/tasks/sfx_find_peaks.py
def __init__(\n self,\n outdir: str,\n rank: int,\n exp: str,\n run: int,\n n_events: int,\n det_shape: Tuple[int, ...],\n min_peaks: int,\n max_peaks: int,\n i_x: Any, # Not typed becomes it comes from psana\n i_y: Any, # Not typed becomes it comes from psana\n ipx: Any, # Not typed becomes it comes from psana\n ipy: Any, # Not typed becomes it comes from psana\n tag: str,\n):\n \"\"\"\n Set up the CXI files to which peak finding results will be saved.\n\n Parameters:\n\n outdir (str): Output directory for cxi file.\n\n rank (int): MPI rank of the caller.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n n_events (int): Number of events to process.\n\n det_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n data. This must be aCheetah-stile 2D array.\n\n min_peaks (int): Minimum number of peaks per image.\n\n max_peaks (int): Maximum number of peaks per image.\n\n i_x (Any): Array of pixel indexes along x\n\n i_y (Any): Array of pixel indexes along y\n\n ipx (Any): Pixel indexes with respect to detector origin (x component)\n\n ipy (Any): Pixel indexes with respect to detector origin (y component)\n\n tag (str): Tag to append to cxi file names.\n \"\"\"\n self._det_shape: Tuple[int, ...] = det_shape\n self._i_x: Any = i_x\n self._i_y: Any = i_y\n self._ipx: Any = ipx\n self._ipy: Any = ipy\n self._index: int = 0\n\n # Create and open the HDF5 file\n fname: str = f\"{exp}_r{run:0>4}_{rank}{tag}.cxi\"\n Path(outdir).mkdir(exist_ok=True)\n self._outh5: Any = h5py.File(Path(outdir) / fname, \"w\")\n\n # Entry_1 entry for processing with CrystFEL\n entry_1: Any = self._outh5.create_group(\"entry_1\")\n keys: List[str] = [\n \"nPeaks\",\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]\n ds_expId: Any = entry_1.create_dataset(\n \"experimental_identifier\", (n_events,), maxshape=(None,), dtype=int\n )\n ds_expId.attrs[\"axes\"] = \"experiment_identifier\"\n data_1: Any = entry_1.create_dataset(\n \"/entry_1/data_1/data\",\n (n_events, det_shape[0], det_shape[1]),\n chunks=(1, det_shape[0], det_shape[1]),\n maxshape=(None, det_shape[0], det_shape[1]),\n dtype=numpy.float32,\n )\n data_1.attrs[\"axes\"] = \"experiment_identifier\"\n key: str\n for key in [\"powderHits\", \"powderMisses\", \"mask\"]:\n entry_1.create_dataset(\n f\"/entry_1/data_1/{key}\",\n (det_shape[0], det_shape[1]),\n chunks=(det_shape[0], det_shape[1]),\n maxshape=(det_shape[0], det_shape[1]),\n dtype=float,\n )\n\n # Peak-related entries\n for key in keys:\n if key == \"nPeaks\":\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events,),\n maxshape=(None,),\n dtype=int,\n )\n ds_x.attrs[\"minPeaks\"] = min_peaks\n ds_x.attrs[\"maxPeaks\"] = max_peaks\n else:\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events, max_peaks),\n maxshape=(None, max_peaks),\n chunks=(1, max_peaks),\n dtype=float,\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier:peaks\"\n\n # Timestamp entries\n lcls_1: Any = self._outh5.create_group(\"LCLS\")\n keys: List[str] = [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ]\n key: str\n for key in keys:\n if key == \"photon_energy_eV\":\n ds_x: Any = lcls_1.create_dataset(\n f\"{key}\", (n_events,), maxshape=(None,), dtype=float\n )\n else:\n ds_x = lcls_1.create_dataset(\n f\"{key}\", (n_events,), 
maxshape=(None,), dtype=int\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n ds_x = self._outh5.create_dataset(\n \"/LCLS/detector_1/EncoderValue\", (n_events,), maxshape=(None,), dtype=float\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n
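A minimal sketch (not LUTE code) of the Cheetah-style layout mentioned in the `det_shape` description: a 3D psana stack (panels, rows, cols) is flattened so that panels are stacked along the slow axis, which is also how `write_event` stores each image. The panel dimensions below are hypothetical (epix10k2M-like):

```python
import numpy

img3d = numpy.zeros((16, 352, 384), dtype=numpy.float32)  # (panels, rows, cols)
img2d = img3d.reshape(-1, img3d.shape[-1])                 # Cheetah-style 2D array
print(img2d.shape)  # (5632, 384)
```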
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.optimize_and_close_file","title":"optimize_and_close_file(num_hits, max_peaks)
","text":"Resize data blocks and write additional information to the file
Parameters:
num_hits (int): Number of hits for which information has been saved to the\n file\n\nmax_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n
Source code in lute/tasks/sfx_find_peaks.py
def optimize_and_close_file(\n self,\n num_hits: int,\n max_peaks: int,\n):\n \"\"\"\n Resize data blocks and write additional information to the file\n\n Parameters:\n\n num_hits (int): Number of hits for which information has been saved to the\n file\n\n max_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n \"\"\"\n\n # Resize the entry_1 entry\n data_shape: Tuple[int, ...] = self._outh5[\"/entry_1/data_1/data\"].shape\n self._outh5[\"/entry_1/data_1/data\"].resize(\n (num_hits, data_shape[1], data_shape[2])\n )\n self._outh5[f\"/entry_1/result_1/nPeaks\"].resize((num_hits,))\n key: str\n for key in [\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]:\n self._outh5[f\"/entry_1/result_1/{key}\"].resize((num_hits, max_peaks))\n\n # Resize LCLS entry\n for key in [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"detector_1/EncoderValue\",\n \"photon_energy_eV\",\n ]:\n self._outh5[f\"/LCLS/{key}\"].resize((num_hits,))\n self._outh5.close()\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.write_event","title":"write_event(img, peaks, timestamp_seconds, timestamp_nanoseconds, timestamp_fiducials, photon_energy)
","text":"Write peak finding results for an event into the HDF5 file.
Parameters:
img (NDArray[numpy.float_]): Detector data for the event\n\npeaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\ntimestamp_seconds (int): Second part of the event's timestamp information\n\ntimestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\ntimestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\nphoton_energy (float): Photon energy for the event\n
Source code in lute/tasks/sfx_find_peaks.py
def write_event(\n self,\n img: NDArray[numpy.float_],\n peaks: Any, # Not typed becomes it comes from psana\n timestamp_seconds: int,\n timestamp_nanoseconds: int,\n timestamp_fiducials: int,\n photon_energy: float,\n):\n \"\"\"\n Write peak finding results for an event into the HDF5 file.\n\n Parameters:\n\n img (NDArray[numpy.float_]): Detector data for the event\n\n peaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\n timestamp_seconds (int): Second part of the event's timestamp information\n\n timestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\n timestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\n photon_energy (float): Photon energy for the event\n \"\"\"\n ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]\n ch_cols: NDArray[numpy.float_] = peaks[:, 2]\n\n # Entry_1 entry for processing with CrystFEL\n self._outh5[\"/entry_1/data_1/data\"][self._index, :, :] = img.reshape(\n -1, img.shape[-1]\n )\n self._outh5[\"/entry_1/result_1/nPeaks\"][self._index] = peaks.shape[0]\n self._outh5[\"/entry_1/result_1/peakXPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_cols.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/peakYPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_rows.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/rcent\"][self._index, : peaks.shape[0]] = peaks[\n :, 6\n ]\n self._outh5[\"/entry_1/result_1/ccent\"][self._index, : peaks.shape[0]] = peaks[\n :, 7\n ]\n self._outh5[\"/entry_1/result_1/rmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 10\n ]\n self._outh5[\"/entry_1/result_1/rmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 11\n ]\n self._outh5[\"/entry_1/result_1/cmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 12\n ]\n self._outh5[\"/entry_1/result_1/cmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 13\n ]\n self._outh5[\"/entry_1/result_1/peakTotalIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 5]\n self._outh5[\"/entry_1/result_1/peakMaxIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 4]\n\n # Calculate and write pixel radius\n peaks_cenx: NDArray[numpy.float_] = (\n self._i_x[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipx\n )\n peaks_ceny: NDArray[numpy.float_] = (\n self._i_y[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipy\n )\n peak_radius: NDArray[numpy.float_] = numpy.sqrt(\n (peaks_cenx**2) + (peaks_ceny**2)\n )\n self._outh5[\"/entry_1/result_1/peakRadius\"][\n self._index, : peaks.shape[0]\n ] = peak_radius\n\n # LCLS entry dataset\n self._outh5[\"/LCLS/machineTime\"][self._index] = timestamp_seconds\n self._outh5[\"/LCLS/machineTimeNanoSeconds\"][self._index] = timestamp_nanoseconds\n self._outh5[\"/LCLS/fiducial\"][self._index] = timestamp_fiducials\n self._outh5[\"/LCLS/photon_energy_eV\"][self._index] = photon_energy\n\n self._index += 1\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.write_non_event_data","title":"write_non_event_data(powder_hits, powder_misses, mask, clen)
","text":"Write to the file data that is not related to a specific event (masks, powders)
Parameters:
powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\npowder_misses (NDArray[numpy.float_]): Virtual powder pattern from hits\n\nmask: (NDArray[numpy.uint16]): Pixel ask to write into the file\n
Source code in lute/tasks/sfx_find_peaks.py
def write_non_event_data(\n    self,\n    powder_hits: NDArray[numpy.float_],\n    powder_misses: NDArray[numpy.float_],\n    mask: NDArray[numpy.uint16],\n    clen: float,\n):\n    \"\"\"\n    Write to the file data that is not related to a specific event (masks, powders)\n\n    Parameters:\n\n    powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n    powder_misses (NDArray[numpy.float_]): Virtual powder pattern from misses\n\n    mask (NDArray[numpy.uint16]): Pixel mask to write into the file\n\n    \"\"\"\n    # Add powders and mask to files, reshaping them to match the crystfel\n    # convention\n    self._outh5[\"/entry_1/data_1/powderHits\"][:] = powder_hits.reshape(\n        -1, powder_hits.shape[-1]\n    )\n    self._outh5[\"/entry_1/data_1/powderMisses\"][:] = powder_misses.reshape(\n        -1, powder_misses.shape[-1]\n    )\n    self._outh5[\"/entry_1/data_1/mask\"][:] = (1 - mask).reshape(\n        -1, mask.shape[-1]\n    )  # Crystfel expects inverted values\n\n    # Add clen distance\n    self._outh5[\"/LCLS/detector_1/EncoderValue\"][:] = clen\n
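A minimal sketch (not LUTE code) of the mask inversion applied above: the psana-style mask (1 = good pixel) is written as `1 - mask` because, as the code notes, CrystFEL expects the inverted convention:

```python
import numpy

mask = numpy.array([[1, 1, 0], [1, 0, 1]], dtype=numpy.uint16)  # 1 = good pixel
print(1 - mask)  # previously-bad pixels are now marked with 1
```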
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.FindPeaksPyAlgos","title":"FindPeaksPyAlgos
","text":" Bases: Task
Task that performs peak finding using the PyAlgos peak finding algorithms and writes the peak information to CXI files.
Source code inlute/tasks/sfx_find_peaks.py
class FindPeaksPyAlgos(Task):\n \"\"\"\n Task that performs peak finding using the PyAlgos peak finding algorithms and\n writes the peak information to CXI files.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n ds: Any = MPIDataSource(\n f\"exp={self._task_parameters.lute_config.experiment}:\"\n f\"run={self._task_parameters.lute_config.run}:smd\"\n )\n if self._task_parameters.n_events != 0:\n ds.break_after(self._task_parameters.n_events)\n\n det: Any = Detector(self._task_parameters.det_name)\n det.do_reshape_2d_to_3d(flag=True)\n\n evr: Any = Detector(self._task_parameters.event_receiver)\n\n i_x: Any = det.indexes_x(self._task_parameters.lute_config.run).astype(\n numpy.int64\n )\n i_y: Any = det.indexes_y(self._task_parameters.lute_config.run).astype(\n numpy.int64\n )\n ipx: Any\n ipy: Any\n ipx, ipy = det.point_indexes(\n self._task_parameters.lute_config.run, pxy_um=(0, 0)\n )\n\n alg: Any = None\n num_hits: int = 0\n num_events: int = 0\n num_empty_images: int = 0\n tag: str = self._task_parameters.tag\n if (tag != \"\") and (tag[0] != \"_\"):\n tag = \"_\" + tag\n\n evt: Any\n for evt in ds.events():\n\n evt_id: Any = evt.get(EventId)\n timestamp_seconds: int = evt_id.time()[0]\n timestamp_nanoseconds: int = evt_id.time()[1]\n timestamp_fiducials: int = evt_id.fiducials()\n event_codes: Any = evr.eventCodes(evt)\n\n if isinstance(self._task_parameters.pv_camera_length, float):\n clen: float = self._task_parameters.pv_camera_length\n else:\n clen = (\n ds.env().epicsStore().value(self._task_parameters.pv_camera_length)\n )\n\n if self._task_parameters.event_logic:\n if not self._task_parameters.event_code in event_codes:\n continue\n\n img: Any = det.calib(evt)\n\n if img is None:\n num_empty_images += 1\n continue\n\n if alg is None:\n det_shape: Tuple[int, ...] 
= img.shape\n if len(det_shape) == 3:\n det_shape = (det_shape[0] * det_shape[1], det_shape[2])\n else:\n det_shape = img.shape\n\n mask: NDArray[numpy.uint16] = numpy.ones(det_shape).astype(numpy.uint16)\n\n if self._task_parameters.psana_mask:\n mask = det.mask(\n self.task_parameters.run,\n calib=False,\n status=True,\n edges=False,\n centra=False,\n unbond=False,\n unbondnbrs=False,\n ).astype(numpy.uint16)\n\n hdffh: Any\n if self._task_parameters.mask_file is not None:\n with h5py.File(self._task_parameters.mask_file, \"r\") as hdffh:\n loaded_mask: NDArray[numpy.int] = hdffh[\"entry_1/data_1/mask\"][\n :\n ]\n mask *= loaded_mask.astype(numpy.uint16)\n\n file_writer: CxiWriter = CxiWriter(\n outdir=self._task_parameters.outdir,\n rank=ds.rank,\n exp=self._task_parameters.lute_config.experiment,\n run=self._task_parameters.lute_config.run,\n n_events=self._task_parameters.n_events,\n det_shape=det_shape,\n i_x=i_x,\n i_y=i_y,\n ipx=ipx,\n ipy=ipy,\n min_peaks=self._task_parameters.min_peaks,\n max_peaks=self._task_parameters.max_peaks,\n tag=tag,\n )\n alg: Any = PyAlgos(mask=mask, pbits=0) # pbits controls verbosity\n alg.set_peak_selection_pars(\n npix_min=self._task_parameters.npix_min,\n npix_max=self._task_parameters.npix_max,\n amax_thr=self._task_parameters.amax_thr,\n atot_thr=self._task_parameters.atot_thr,\n son_min=self._task_parameters.son_min,\n )\n\n if self._task_parameters.compression is not None:\n\n libpressio_config = generate_libpressio_configuration(\n compressor=self._task_parameters.compression.compressor,\n roi_window_size=self._task_parameters.compression.roi_window_size,\n bin_size=self._task_parameters.compression.bin_size,\n abs_error=self._task_parameters.compression.abs_error,\n libpressio_mask=mask,\n )\n\n powder_hits: NDArray[numpy.float_] = numpy.zeros(det_shape)\n powder_misses: NDArray[numpy.float_] = numpy.zeros(det_shape)\n\n peaks: Any = alg.peak_finder_v3r3(\n img,\n rank=self._task_parameters.peak_rank,\n r0=self._task_parameters.r0,\n dr=self._task_parameters.dr,\n # nsigm=self._task_parameters.nsigm,\n )\n\n num_events += 1\n\n if (peaks.shape[0] >= self._task_parameters.min_peaks) and (\n peaks.shape[0] <= self._task_parameters.max_peaks\n ):\n\n if self._task_parameters.compression is not None:\n\n libpressio_config_with_peaks = (\n add_peaks_to_libpressio_configuration(libpressio_config, peaks)\n )\n compressor = PressioCompressor.from_config(\n libpressio_config_with_peaks\n )\n compressed_img = compressor.encode(img)\n decompressed_img = numpy.zeros_like(img)\n decompressed = compressor.decode(compressed_img, decompressed_img)\n img = decompressed_img\n\n try:\n photon_energy: float = (\n Detector(\"EBeam\").get(evt).ebeamPhotonEnergy()\n )\n except AttributeError:\n photon_energy = (\n 1.23984197386209e-06\n / ds.env().epicsStore().value(\"SIOC:SYS0:ML00:AO192\")\n / 1.0e9\n )\n\n file_writer.write_event(\n img=img,\n peaks=peaks,\n timestamp_seconds=timestamp_seconds,\n timestamp_nanoseconds=timestamp_nanoseconds,\n timestamp_fiducials=timestamp_fiducials,\n photon_energy=photon_energy,\n )\n num_hits += 1\n\n # TODO: Fix bug here\n # generate / update powders\n if peaks.shape[0] >= self._task_parameters.min_peaks:\n powder_hits = numpy.maximum(powder_hits, img)\n else:\n powder_misses = numpy.maximum(powder_misses, img)\n\n if num_empty_images != 0:\n msg: Message = Message(\n contents=f\"Rank {ds.rank} encountered {num_empty_images} empty images.\"\n )\n self._report_to_executor(msg)\n\n file_writer.write_non_event_data(\n 
powder_hits=powder_hits,\n powder_misses=powder_misses,\n mask=mask,\n clen=clen,\n )\n\n file_writer.optimize_and_close_file(\n num_hits=num_hits, max_peaks=self._task_parameters.max_peaks\n )\n\n COMM_WORLD.Barrier()\n\n num_hits_per_rank: List[int] = COMM_WORLD.gather(num_hits, root=0)\n num_hits_total: int = COMM_WORLD.reduce(num_hits, SUM)\n num_events_per_rank: List[int] = COMM_WORLD.gather(num_events, root=0)\n\n if ds.rank == 0:\n master_fname: Path = write_master_file(\n mpi_size=ds.size,\n outdir=self._task_parameters.outdir,\n exp=self._task_parameters.lute_config.experiment,\n run=self._task_parameters.lute_config.run,\n tag=tag,\n n_hits_per_rank=num_hits_per_rank,\n n_hits_total=num_hits_total,\n )\n\n # Write final summary file\n f: TextIO\n with open(\n Path(self._task_parameters.outdir) / f\"peakfinding{tag}.summary\", \"w\"\n ) as f:\n print(f\"Number of events processed: {num_events_per_rank[-1]}\", file=f)\n print(f\"Number of hits found: {num_hits_total}\", file=f)\n print(\n \"Fractional hit rate: \"\n f\"{(num_hits_total/num_events_per_rank[-1]):.2f}\",\n file=f,\n )\n print(f\"No. hits per rank: {num_hits_per_rank}\", file=f)\n\n with open(Path(self._task_parameters.out_file), \"w\") as f:\n print(f\"{master_fname}\", file=f)\n\n # Write out_file\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.add_peaks_to_libpressio_configuration","title":"add_peaks_to_libpressio_configuration(lp_json, peaks)
","text":"Add peak infromation to libpressio configuration
Parameters:
lp_json: Dictionary storing the configuration JSON structure for the libpressio\n library.\n\npeaks (Any): Peak information as returned by psana.\n
Returns:
lp_json: Updated configuration JSON structure for the libpressio library.\n
Source code in lute/tasks/sfx_find_peaks.py
def add_peaks_to_libpressio_configuration(lp_json, peaks) -> Dict[str, Any]:\n    \"\"\"\n    Add peak information to libpressio configuration\n\n    Parameters:\n\n    lp_json: Dictionary storing the configuration JSON structure for the libpressio\n        library.\n\n    peaks (Any): Peak information as returned by psana.\n\n    Returns:\n\n    lp_json: Updated configuration JSON structure for the libpressio library.\n    \"\"\"\n    lp_json[\"compressor_config\"][\"pressio\"][\"roibin\"][\"roibin:centers\"] = (\n        numpy.ascontiguousarray(numpy.uint64(peaks[:, [2, 1, 0]]))\n    )\n    return lp_json\n
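A minimal sketch (not LUTE code) of the array manipulation above: the first three peak columns are reversed in order, cast to `uint64`, and made contiguous before being handed to libpressio as ROI centers. The peak values are hypothetical:

```python
import numpy

peaks = numpy.array([[3.0, 10.0, 20.0], [1.0, 5.0, 7.0]])  # hypothetical peak rows
centers = numpy.ascontiguousarray(numpy.uint64(peaks[:, [2, 1, 0]]))
print(centers)
# [[20 10  3]
#  [ 7  5  1]]
```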
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.generate_libpressio_configuration","title":"generate_libpressio_configuration(compressor, roi_window_size, bin_size, abs_error, libpressio_mask)
","text":"Create the configuration JSON for the libpressio library
Parameters:
compressor (Literal[\"sz3\", \"qoz\"]): Compression algorithm to use\n    (\"qoz\" or \"sz3\").\n\nabs_error (float): Bound value for the absolute error.\n\nbin_size (int): Binning size.\n\nroi_window_size (int): Default size of the ROI window.\n\nlibpressio_mask (NDArray): Mask to be applied to the data.\n
Returns:
lp_json (Dict[str, Any]): Dictionary storing the JSON configuration structure\nfor the libpressio library\n
Source code in lute/tasks/sfx_find_peaks.py
def generate_libpressio_configuration(\n compressor: Literal[\"sz3\", \"qoz\"],\n roi_window_size: int,\n bin_size: int,\n abs_error: float,\n libpressio_mask,\n) -> Dict[str, Any]:\n \"\"\"\n Create the configuration JSON for the libpressio library\n\n Parameters:\n\n compressor (Literal[\"sz3\", \"qoz\"]): Compression algorithm to use\n (\"qoz\" or \"sz3\").\n\n abs_error (float): Bound value for the absolute error.\n\n bin_size (int): Bining Size.\n\n roi_window_size (int): Default size of the ROI window.\n\n libpressio_mask (NDArray): mask to be applied to the data.\n\n Returns:\n\n lp_json (Dict[str, Any]): Dictionary storing the JSON configuration structure\n for the libpressio library\n \"\"\"\n\n if compressor == \"qoz\":\n pressio_opts: Dict[str, Any] = {\n \"pressio:abs\": abs_error,\n \"qoz\": {\"qoz:stride\": 8},\n }\n elif compressor == \"sz3\":\n pressio_opts = {\"pressio:abs\": abs_error}\n\n lp_json = {\n \"compressor_id\": \"pressio\",\n \"early_config\": {\n \"pressio\": {\n \"pressio:compressor\": \"roibin\",\n \"roibin\": {\n \"roibin:metric\": \"composite\",\n \"roibin:background\": \"mask_binning\",\n \"roibin:roi\": \"fpzip\",\n \"background\": {\n \"binning:compressor\": \"pressio\",\n \"mask_binning:compressor\": \"pressio\",\n \"pressio\": {\"pressio:compressor\": compressor},\n },\n \"composite\": {\n \"composite:plugins\": [\n \"size\",\n \"time\",\n \"input_stats\",\n \"error_stat\",\n ]\n },\n },\n }\n },\n \"compressor_config\": {\n \"pressio\": {\n \"roibin\": {\n \"roibin:roi_size\": [roi_window_size, roi_window_size, 0],\n \"roibin:centers\": None, # \"roibin:roi_strategy\": \"coordinates\",\n \"roibin:nthreads\": 4,\n \"roi\": {\"fpzip:prec\": 0},\n \"background\": {\n \"mask_binning:mask\": None,\n \"mask_binning:shape\": [bin_size, bin_size, 1],\n \"mask_binning:nthreads\": 4,\n \"pressio\": pressio_opts,\n },\n }\n }\n },\n \"name\": \"pressio\",\n }\n\n lp_json[\"compressor_config\"][\"pressio\"][\"roibin\"][\"background\"][\n \"mask_binning:mask\"\n ] = (1 - libpressio_mask)\n\n return lp_json\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.write_master_file","title":"write_master_file(mpi_size, outdir, exp, run, tag, n_hits_per_rank, n_hits_total)
","text":"Generate a virtual dataset to map all individual files for this run.
Parameters:
mpi_size (int): Number of ranks in the MPI pool.\n\noutdir (str): Output directory for cxi file.\n\nexp (str): Experiment string.\n\nrun (int): Experimental run.\n\ntag (str): Tag to append to cxi file names.\n\nn_hits_per_rank (List[int]): Array containing the number of hits found on each\n node processing data.\n\nn_hits_total (int): Total number of hits found across all nodes.\n
Returns:
The path to the written master file\n
Source code in lute/tasks/sfx_find_peaks.py
def write_master_file(\n mpi_size: int,\n outdir: str,\n exp: str,\n run: int,\n tag: str,\n n_hits_per_rank: List[int],\n n_hits_total: int,\n) -> Path:\n \"\"\"\n Generate a virtual dataset to map all individual files for this run.\n\n Parameters:\n\n mpi_size (int): Number of ranks in the MPI pool.\n\n outdir (str): Output directory for cxi file.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n tag (str): Tag to append to cxi file names.\n\n n_hits_per_rank (List[int]): Array containing the number of hits found on each\n node processing data.\n\n n_hits_total (int): Total number of hits found across all nodes.\n\n Returns:\n\n The path to the the written master file\n \"\"\"\n # Retrieve paths to the files containing data\n fnames: List[Path] = []\n fi: int\n for fi in range(mpi_size):\n if n_hits_per_rank[fi] > 0:\n fnames.append(Path(outdir) / f\"{exp}_r{run:0>4}_{fi}{tag}.cxi\")\n if len(fnames) == 0:\n sys.exit(\"No hits found\")\n\n # Retrieve list of entries to populate in the virtual hdf5 file\n dname_list, key_list, shape_list, dtype_list = [], [], [], []\n datasets = [\"/entry_1/result_1\", \"/LCLS/detector_1\", \"/LCLS\", \"/entry_1/data_1\"]\n f = h5py.File(fnames[0], \"r\")\n for dname in datasets:\n dset = f[dname]\n for key in dset.keys():\n if f\"{dname}/{key}\" not in datasets:\n dname_list.append(dname)\n key_list.append(key)\n shape_list.append(dset[key].shape)\n dtype_list.append(dset[key].dtype)\n f.close()\n\n # Compute cumulative powder hits and misses for all files\n powder_hits, powder_misses = None, None\n for fn in fnames:\n f = h5py.File(fn, \"r\")\n if powder_hits is None:\n powder_hits = f[\"entry_1/data_1/powderHits\"][:].copy()\n powder_misses = f[\"entry_1/data_1/powderMisses\"][:].copy()\n else:\n powder_hits = numpy.maximum(\n powder_hits, f[\"entry_1/data_1/powderHits\"][:].copy()\n )\n powder_misses = numpy.maximum(\n powder_misses, f[\"entry_1/data_1/powderMisses\"][:].copy()\n )\n f.close()\n\n vfname: Path = Path(outdir) / f\"{exp}_r{run:0>4}{tag}.cxi\"\n with h5py.File(vfname, \"w\") as vdf:\n\n # Write the virtual hdf5 file\n for dnum in range(len(dname_list)):\n dname = f\"{dname_list[dnum]}/{key_list[dnum]}\"\n if key_list[dnum] not in [\"mask\", \"powderHits\", \"powderMisses\"]:\n layout = h5py.VirtualLayout(\n shape=(n_hits_total,) + shape_list[dnum][1:], dtype=dtype_list[dnum]\n )\n cursor = 0\n for i, fn in enumerate(fnames):\n vsrc = h5py.VirtualSource(\n fn, dname, shape=(n_hits_per_rank[i],) + shape_list[dnum][1:]\n )\n if len(shape_list[dnum]) == 1:\n layout[cursor : cursor + n_hits_per_rank[i]] = vsrc\n else:\n layout[cursor : cursor + n_hits_per_rank[i], :] = vsrc\n cursor += n_hits_per_rank[i]\n vdf.create_virtual_dataset(dname, layout, fillvalue=-1)\n\n vdf[\"entry_1/data_1/powderHits\"] = powder_hits\n vdf[\"entry_1/data_1/powderMisses\"] = powder_misses\n\n return vfname\n
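The master file above relies on HDF5 virtual datasets to expose all per-rank files as one dataset without copying data. A minimal, self-contained sketch (not LUTE code) of that h5py mechanism, with hypothetical file names and hit counts:

```python
import h5py
import numpy

per_rank = [("rank0.cxi", 3), ("rank1.cxi", 2)]  # (file name, number of hits)
for name, n in per_rank:
    with h5py.File(name, "w") as f:
        f["/entry_1/result_1/nPeaks"] = numpy.arange(n, dtype="int64")

# Stitch the per-rank datasets into a single virtual dataset.
layout = h5py.VirtualLayout(shape=(sum(n for _, n in per_rank),), dtype="int64")
cursor = 0
for name, n in per_rank:
    source = h5py.VirtualSource(name, "/entry_1/result_1/nPeaks", shape=(n,))
    layout[cursor : cursor + n] = source
    cursor += n

with h5py.File("master.cxi", "w") as vdf:
    vdf.create_virtual_dataset("/entry_1/result_1/nPeaks", layout, fillvalue=-1)
```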
"},{"location":"source/tasks/sfx_index/","title":"sfx_index","text":"Classes for indexing tasks in SFX.
Classes:
Name DescriptionConcatenateStreamFiles
Task that merges multiple stream files into a single file.
"},{"location":"source/tasks/sfx_index/#tasks.sfx_index.ConcatenateStreamFiles","title":"ConcatenateStreamFiles
","text":" Bases: Task
Task that merges stream files located within a directory tree.
Source code inlute/tasks/sfx_index.py
class ConcatenateStreamFiles(Task):\n \"\"\"\n Task that merges stream files located within a directory tree.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n\n stream_file_path: Path = Path(self._task_parameters.in_file)\n stream_file_list: List[Path] = list(\n stream_file_path.rglob(f\"{self._task_parameters.tag}_*.stream\")\n )\n\n processed_file_list = [str(stream_file) for stream_file in stream_file_list]\n\n msg: Message = Message(\n contents=f\"Merging following stream files: {processed_file_list} into \"\n f\"{self._task_parameters.out_file}\",\n )\n self._report_to_executor(msg)\n\n wfd: BinaryIO\n with open(self._task_parameters.out_file, \"wb\") as wfd:\n infile: Path\n for infile in stream_file_list:\n fd: BinaryIO\n with open(infile, \"rb\") as fd:\n shutil.copyfileobj(fd, wfd)\n
"},{"location":"source/tasks/task/","title":"task","text":"Base classes for implementing analysis tasks.
Classes:
Name DescriptionTask
Abstract base class from which all analysis tasks are derived.
ThirdPartyTask
Class to run a third-party executable binary as a Task
.
DescribedAnalysis
dataclass
","text":"Complete analysis description. Held by an Executor.
Source code inlute/tasks/dataclasses.py
@dataclass\nclass DescribedAnalysis:\n \"\"\"Complete analysis description. Held by an Executor.\"\"\"\n\n task_result: TaskResult\n task_parameters: Optional[TaskParameters]\n task_env: Dict[str, str]\n poll_interval: float\n communicator_desc: List[str]\n
"},{"location":"source/tasks/task/#tasks.task.ElogSummaryPlots","title":"ElogSummaryPlots
dataclass
","text":"Holds a graphical summary intended for display in the eLog.
Attributes:
Name Type Descriptiondisplay_name
str
This represents both a path and how the result will be displayed in the eLog. Can include \"/\" characters. E.g. display_name = \"scans/my_motor_scan\"
will have plots shown on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors how the file is stored on disk as well.
lute/tasks/dataclasses.py
@dataclass\nclass ElogSummaryPlots:\n \"\"\"Holds a graphical summary intended for display in the eLog.\n\n Attributes:\n display_name (str): This represents both a path and how the result will be\n displayed in the eLog. Can include \"/\" characters. E.g.\n `display_name = \"scans/my_motor_scan\"` will have plots shown\n on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors\n how the file is stored on disk as well.\n \"\"\"\n\n display_name: str\n figures: Union[pn.Tabs, hv.Image, plt.Figure]\n
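As a usage sketch (not taken from the LUTE codebase; the data below is random and purely illustrative), a Task could wrap a holoviews image for the eLog like this:
import holoviews as hv\nimport numpy as np\n\nfrom lute.tasks.dataclasses import ElogSummaryPlots\n\n# Shown on a \"my_motor_scan\" page under a \"scans\" tab in the eLog.\nimage: hv.Image = hv.Image(np.random.rand(64, 64))\nplots: ElogSummaryPlots = ElogSummaryPlots(\n    display_name=\"scans/my_motor_scan\", figures=image\n)\n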
"},{"location":"source/tasks/task/#tasks.task.Task","title":"Task
","text":" Bases: ABC
Abstract base class for analysis tasks.
Attributes:
Name Type Descriptionname
str
The name of the Task.
Source code inlute/tasks/task.py
class Task(ABC):\n \"\"\"Abstract base class for analysis tasks.\n\n Attributes:\n name (str): The name of the Task.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters, use_mpi: bool = False) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. These are NOT related to execution parameters\n (number of cores, etc), except, potentially, in case of binary\n executable sub-classes.\n\n use_mpi (bool): Whether this Task requires the use of MPI.\n This determines the behaviour and timing of certain signals\n and ensures appropriate barriers are placed to not end\n processing until all ranks have finished.\n \"\"\"\n self.name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n self._result: TaskResult = TaskResult(\n task_name=self.name,\n task_status=TaskStatus.PENDING,\n summary=\"PENDING\",\n payload=\"\",\n )\n self._task_parameters: TaskParameters = params\n timeout: int = self._task_parameters.lute_config.task_timeout\n signal.setitimer(signal.ITIMER_REAL, timeout)\n\n run_directory: Optional[str] = self._task_parameters.Config.run_directory\n if run_directory is not None:\n try:\n os.chdir(run_directory)\n except FileNotFoundError:\n warnings.warn(\n (\n f\"Attempt to change to {run_directory}, but it is not found!\\n\"\n f\"Will attempt to run from {os.getcwd()}. It may fail!\"\n ),\n category=UserWarning,\n )\n self._use_mpi: bool = use_mpi\n\n def run(self) -> None:\n \"\"\"Calls the analysis routines and any pre/post task functions.\n\n This method is part of the public API and should not need to be modified\n in any subclasses.\n \"\"\"\n self._signal_start()\n self._pre_run()\n self._run()\n self._post_run()\n self._signal_result()\n\n @abstractmethod\n def _run(self) -> None:\n \"\"\"Actual analysis to run. 
Overridden by subclasses.\n\n Separating the calling API from the implementation allows `run` to\n have pre and post task functionality embedded easily into a single\n function call.\n \"\"\"\n ...\n\n def _pre_run(self) -> None:\n \"\"\"Code to run BEFORE the main analysis takes place.\n\n This function may, or may not, be employed by subclasses.\n \"\"\"\n ...\n\n def _post_run(self) -> None:\n \"\"\"Code to run AFTER the main analysis takes place.\n\n This function may, or may not, be employed by subclasses.\n \"\"\"\n ...\n\n @property\n def result(self) -> TaskResult:\n \"\"\"TaskResult: Read-only Task Result information.\"\"\"\n return self._result\n\n def __call__(self) -> None:\n self.run()\n\n def _signal_start(self) -> None:\n \"\"\"Send the signal that the Task will begin shortly.\"\"\"\n start_msg: Message = Message(\n contents=self._task_parameters, signal=\"TASK_STARTED\"\n )\n self._result.task_status = TaskStatus.RUNNING\n if self._use_mpi:\n from mpi4py import MPI\n\n comm: MPI.Intracomm = MPI.COMM_WORLD\n rank: int = comm.Get_rank()\n comm.Barrier()\n if rank == 0:\n self._report_to_executor(start_msg)\n else:\n self._report_to_executor(start_msg)\n\n def _signal_result(self) -> None:\n \"\"\"Send the signal that results are ready along with the results.\"\"\"\n signal: str = \"TASK_RESULT\"\n results_msg: Message = Message(contents=self.result, signal=signal)\n if self._use_mpi:\n from mpi4py import MPI\n\n comm: MPI.Intracomm = MPI.COMM_WORLD\n rank: int = comm.Get_rank()\n comm.Barrier()\n if rank == 0:\n self._report_to_executor(results_msg)\n else:\n self._report_to_executor(results_msg)\n time.sleep(0.1)\n\n def _report_to_executor(self, msg: Message) -> None:\n \"\"\"Send a message to the Executor.\n\n Details of `Communicator` choice are hidden from the caller. This\n method may be overriden by subclasses with specialized functionality.\n\n Args:\n msg (Message): The message object to send.\n \"\"\"\n communicator: Communicator\n if isinstance(msg.contents, str) or msg.contents is None:\n communicator = PipeCommunicator()\n else:\n communicator = SocketCommunicator()\n\n communicator.delayed_setup()\n communicator.write(msg)\n communicator.clear_communicator()\n\n def clean_up_timeout(self) -> None:\n \"\"\"Perform any necessary cleanup actions before exit if timing out.\"\"\"\n ...\n
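A minimal sketch of a first-party subclass is shown below (the class name is hypothetical, and the import location of TaskParameters is an assumption based on the module layout described in this documentation):
from lute.io.models.base import TaskParameters  # Assumed import location\nfrom lute.tasks.dataclasses import TaskStatus\nfrom lute.tasks.task import Task\n\n\nclass MyAnalysis(Task):  # Hypothetical Task name\n    \"\"\"Toy Task illustrating the _run/_post_run split.\"\"\"\n\n    def __init__(self, *, params: TaskParameters) -> None:\n        super().__init__(params=params)\n\n    def _run(self) -> None:\n        # The actual analysis goes here.\n        self._result.payload = sum(range(10))\n\n    def _post_run(self) -> None:\n        self._result.summary = \"Summed the integers 0-9.\"\n        self._result.task_status = TaskStatus.COMPLETED\n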
"},{"location":"source/tasks/task/#tasks.task.Task.result","title":"result: TaskResult
property
","text":"TaskResult: Read-only Task Result information.
"},{"location":"source/tasks/task/#tasks.task.Task.__init__","title":"__init__(*, params, use_mpi=False)
","text":"Initialize a Task.
Parameters:
Name Type Description Defaultparams
TaskParameters
Parameters needed to properly configure the analysis task. These are NOT related to execution parameters (number of cores, etc), except, potentially, in case of binary executable sub-classes.
requireduse_mpi
bool
Whether this Task requires the use of MPI. This determines the behaviour and timing of certain signals and ensures appropriate barriers are placed to not end processing until all ranks have finished.
False
Source code in lute/tasks/task.py
def __init__(self, *, params: TaskParameters, use_mpi: bool = False) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. These are NOT related to execution parameters\n (number of cores, etc), except, potentially, in case of binary\n executable sub-classes.\n\n use_mpi (bool): Whether this Task requires the use of MPI.\n This determines the behaviour and timing of certain signals\n and ensures appropriate barriers are placed to not end\n processing until all ranks have finished.\n \"\"\"\n self.name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n self._result: TaskResult = TaskResult(\n task_name=self.name,\n task_status=TaskStatus.PENDING,\n summary=\"PENDING\",\n payload=\"\",\n )\n self._task_parameters: TaskParameters = params\n timeout: int = self._task_parameters.lute_config.task_timeout\n signal.setitimer(signal.ITIMER_REAL, timeout)\n\n run_directory: Optional[str] = self._task_parameters.Config.run_directory\n if run_directory is not None:\n try:\n os.chdir(run_directory)\n except FileNotFoundError:\n warnings.warn(\n (\n f\"Attempt to change to {run_directory}, but it is not found!\\n\"\n f\"Will attempt to run from {os.getcwd()}. It may fail!\"\n ),\n category=UserWarning,\n )\n self._use_mpi: bool = use_mpi\n
"},{"location":"source/tasks/task/#tasks.task.Task.clean_up_timeout","title":"clean_up_timeout()
","text":"Perform any necessary cleanup actions before exit if timing out.
Source code inlute/tasks/task.py
def clean_up_timeout(self) -> None:\n \"\"\"Perform any necessary cleanup actions before exit if timing out.\"\"\"\n ...\n
"},{"location":"source/tasks/task/#tasks.task.Task.run","title":"run()
","text":"Calls the analysis routines and any pre/post task functions.
This method is part of the public API and should not need to be modified in any subclasses.
Source code inlute/tasks/task.py
def run(self) -> None:\n \"\"\"Calls the analysis routines and any pre/post task functions.\n\n This method is part of the public API and should not need to be modified\n in any subclasses.\n \"\"\"\n self._signal_start()\n self._pre_run()\n self._run()\n self._post_run()\n self._signal_result()\n
"},{"location":"source/tasks/task/#tasks.task.TaskResult","title":"TaskResult
dataclass
","text":"Class for storing the result of a Task's execution with metadata.
Attributes:
Name Type Descriptiontask_name
str
Name of the associated task which produced it.
task_status
TaskStatus
Status of associated task.
summary
str
Short message/summary associated with the result.
payload
Any
Actual result. May be data in any format.
impl_schemas
Optional[str]
A string listing Task
schemas implemented by the associated Task
. Schemas define the category and expected output of the Task
. An individual task may implement/conform to multiple schemas. Multiple schemas are separated by ';', e.g. * impl_schemas = \"schema1;schema2\"
lute/tasks/dataclasses.py
@dataclass\nclass TaskResult:\n \"\"\"Class for storing the result of a Task's execution with metadata.\n\n Attributes:\n task_name (str): Name of the associated task which produced it.\n\n task_status (TaskStatus): Status of associated task.\n\n summary (str): Short message/summary associated with the result.\n\n payload (Any): Actual result. May be data in any format.\n\n impl_schemas (Optional[str]): A string listing `Task` schemas implemented\n by the associated `Task`. Schemas define the category and expected\n output of the `Task`. An individual task may implement/conform to\n multiple schemas. Multiple schemas are separated by ';', e.g.\n * impl_schemas = \"schema1;schema2\"\n \"\"\"\n\n task_name: str\n task_status: TaskStatus\n summary: str\n payload: Any\n impl_schemas: Optional[str] = None\n
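For illustration, a TaskResult could be constructed by hand as follows (all values are purely illustrative):
from lute.tasks.dataclasses import TaskResult, TaskStatus\n\nresult: TaskResult = TaskResult(\n    task_name=\"RunTask\",\n    task_status=TaskStatus.COMPLETED,\n    summary=\"Wrote one output file.\",\n    payload=\"/path/to/output.out\",\n    impl_schemas=\"schema1;schema2\",  # Multiple schemas separated by ';'\n)\n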
"},{"location":"source/tasks/task/#tasks.task.TaskStatus","title":"TaskStatus
","text":" Bases: Enum
Possible Task statuses.
Source code inlute/tasks/dataclasses.py
class TaskStatus(Enum):\n \"\"\"Possible Task statuses.\"\"\"\n\n PENDING = 0\n \"\"\"\n Task has yet to run. Is Queued, or waiting for prior tasks.\n \"\"\"\n RUNNING = 1\n \"\"\"\n Task is in the process of execution.\n \"\"\"\n COMPLETED = 2\n \"\"\"\n Task has completed without fatal errors.\n \"\"\"\n FAILED = 3\n \"\"\"\n Task encountered a fatal error.\n \"\"\"\n STOPPED = 4\n \"\"\"\n Task was, potentially temporarily, stopped/suspended.\n \"\"\"\n CANCELLED = 5\n \"\"\"\n Task was cancelled prior to completion or failure.\n \"\"\"\n TIMEDOUT = 6\n \"\"\"\n Task did not reach completion due to timeout.\n \"\"\"\n
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.CANCELLED","title":"CANCELLED = 5
class-attribute
instance-attribute
","text":"Task was cancelled prior to completion or failure.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.COMPLETED","title":"COMPLETED = 2
class-attribute
instance-attribute
","text":"Task has completed without fatal errors.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.FAILED","title":"FAILED = 3
class-attribute
instance-attribute
","text":"Task encountered a fatal error.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.PENDING","title":"PENDING = 0
class-attribute
instance-attribute
","text":"Task has yet to run. Is Queued, or waiting for prior tasks.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.RUNNING","title":"RUNNING = 1
class-attribute
instance-attribute
","text":"Task is in the process of execution.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.STOPPED","title":"STOPPED = 4
class-attribute
instance-attribute
","text":"Task was, potentially temporarily, stopped/suspended.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.TIMEDOUT","title":"TIMEDOUT = 6
class-attribute
instance-attribute
","text":"Task did not reach completion due to timeout.
"},{"location":"source/tasks/task/#tasks.task.ThirdPartyTask","title":"ThirdPartyTask
","text":" Bases: Task
A Task
interface to analysis with binary executables.
lute/tasks/task.py
class ThirdPartyTask(Task):\n \"\"\"A `Task` interface to analysis with binary executables.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. `Task`s of this type MUST include the name\n of a binary to run and any arguments which should be passed to\n it (as would be done via command line). The binary is included\n with the parameter `executable`. All other parameter names are\n assumed to be the long/extended names of the flag passed on the\n command line by default:\n * `arg_name = 3` is converted to `--arg_name 3`\n Positional arguments can be included with `p_argN` where `N` is\n any integer:\n * `p_arg1 = 3` is converted to `3`\n\n Note that it is NOT recommended to rely on this default behaviour\n as command-line arguments can be passed in many ways. Refer to\n the dcoumentation at\n https://slac-lcls.github.io/lute/tutorial/new_task/\n under \"Speciyfing a TaskParameters Model for your Task\" for more\n information on how to control parameter parsing from within your\n TaskParameters model definition.\n \"\"\"\n super().__init__(params=params)\n self._cmd = self._task_parameters.executable\n self._args_list: List[str] = [self._cmd]\n self._template_context: Dict[str, Any] = {}\n\n def _add_to_jinja_context(self, param_name: str, value: Any) -> None:\n \"\"\"Store a parameter as a Jinja template variable.\n\n Variables are stored in a dictionary which is used to fill in a\n premade Jinja template for a third party configuration file.\n\n Args:\n param_name (str): Name to store the variable as. This should be\n the name defined in the corresponding pydantic model. This name\n MUST match the name used in the Jinja Template!\n value (Any): The value to store. If possible, large chunks of the\n template should be represented as a single dictionary for\n simplicity; however, any type can be stored as needed.\n \"\"\"\n context_update: Dict[str, Any] = {param_name: value}\n if __debug__:\n msg: Message = Message(contents=f\"TemplateParameters: {context_update}\")\n self._report_to_executor(msg)\n self._template_context.update(context_update)\n\n def _template_to_config_file(self) -> None:\n \"\"\"Convert a template file into a valid configuration file.\n\n Uses Jinja to fill in a provided template file with variables supplied\n through the LUTE config file. This facilitates parameter modification\n for third party tasks which use a separate configuration, in addition\n to, or instead of, command-line arguments.\n \"\"\"\n from jinja2 import Environment, FileSystemLoader, Template\n\n out_file: str = self._task_parameters.lute_template_cfg.output_path\n template_name: str = self._task_parameters.lute_template_cfg.template_name\n\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n template_dir: str\n if lute_path is None:\n warnings.warn(\n \"LUTE_PATH is None in Task process! 
Using relative path for templates!\",\n category=UserWarning,\n )\n template_dir: str = \"../../config/templates\"\n else:\n template_dir = f\"{lute_path}/config/templates\"\n environment: Environment = Environment(loader=FileSystemLoader(template_dir))\n template: Template = environment.get_template(template_name)\n\n with open(out_file, \"w\", encoding=\"utf-8\") as cfg_out:\n cfg_out.write(template.render(self._template_context))\n\n def _pre_run(self) -> None:\n \"\"\"Parse the parameters into an appropriate argument list.\n\n Arguments are identified by a `flag_type` attribute, defined in the\n pydantic model, which indicates how to pass the parameter and its\n argument on the command-line. This method parses flag:value pairs\n into an appropriate list to be used to call the executable.\n\n Note:\n ThirdPartyParameter objects are returned by custom model validators.\n Objects of this type are assumed to be used for a templated config\n file used by the third party executable for configuration. The parsing\n of these parameters is performed separately by a template file used as\n an input to Jinja. This method solely identifies the necessary objects\n and passes them all along. Refer to the template files and pydantic\n models for more information on how these parameters are defined and\n identified.\n \"\"\"\n super()._pre_run()\n full_schema: Dict[str, Union[str, Dict[str, Any]]] = (\n self._task_parameters.schema()\n )\n short_flags_use_eq: bool\n long_flags_use_eq: bool\n if hasattr(self._task_parameters.Config, \"short_flags_use_eq\"):\n short_flags_use_eq: bool = self._task_parameters.Config.short_flags_use_eq\n long_flags_use_eq: bool = self._task_parameters.Config.long_flags_use_eq\n else:\n short_flags_use_eq = False\n long_flags_use_eq = False\n for param, value in self._task_parameters.dict().items():\n # Clunky test with __dict__[param] because compound model-types are\n # converted to `dict`. E.g. type(value) = dict not AnalysisHeader\n if (\n param == \"executable\"\n or value is None # Cannot have empty values in argument list for execvp\n or value == \"\" # But do want to include, e.g. 0\n or isinstance(self._task_parameters.__dict__[param], TemplateConfig)\n or isinstance(self._task_parameters.__dict__[param], AnalysisHeader)\n ):\n continue\n if isinstance(self._task_parameters.__dict__[param], TemplateParameters):\n # TemplateParameters objects have a single parameter `params`\n self._add_to_jinja_context(param_name=param, value=value.params)\n continue\n\n param_attributes: Dict[str, Any] = full_schema[\"properties\"][param]\n # Some model params do not match the commnad-line parameter names\n param_repr: str\n if \"rename_param\" in param_attributes:\n param_repr = param_attributes[\"rename_param\"]\n else:\n param_repr = param\n if \"flag_type\" in param_attributes:\n flag: str = param_attributes[\"flag_type\"]\n if flag:\n # \"-\" or \"--\" flags\n if flag == \"--\" and isinstance(value, bool) and not value:\n continue\n constructed_flag: str = f\"{flag}{param_repr}\"\n if flag == \"--\" and isinstance(value, bool) and value:\n # On/off flag, e.g. something like --verbose: No Arg\n self._args_list.append(f\"{constructed_flag}\")\n continue\n if (flag == \"-\" and short_flags_use_eq) or (\n flag == \"--\" and long_flags_use_eq\n ): # Must come after above check! 
Otherwise you get --param=True\n # Flags following --param=value or -param=value\n constructed_flag = f\"{constructed_flag}={value}\"\n self._args_list.append(f\"{constructed_flag}\")\n continue\n self._args_list.append(f\"{constructed_flag}\")\n else:\n warnings.warn(\n (\n f\"Model parameters should be defined using Field(...,flag_type='')\"\n f\" in the future. Parameter: {param}\"\n ),\n category=PendingDeprecationWarning,\n )\n if len(param) == 1: # Single-dash flags\n if short_flags_use_eq:\n self._args_list.append(f\"-{param_repr}={value}\")\n continue\n self._args_list.append(f\"-{param_repr}\")\n elif \"p_arg\" in param: # Positional arguments\n pass\n else: # Double-dash flags\n if isinstance(value, bool) and not value:\n continue\n if long_flags_use_eq:\n self._args_list.append(f\"--{param_repr}={value}\")\n continue\n self._args_list.append(f\"--{param_repr}\")\n if isinstance(value, bool) and value:\n continue\n if isinstance(value, str) and \" \" in value:\n for val in value.split():\n self._args_list.append(f\"{val}\")\n else:\n self._args_list.append(f\"{value}\")\n if (\n hasattr(self._task_parameters, \"lute_template_cfg\")\n and self._template_context\n ):\n self._template_to_config_file()\n\n def _run(self) -> None:\n \"\"\"Execute the new program by replacing the current process.\"\"\"\n if __debug__:\n time.sleep(0.1)\n msg: Message = Message(contents=self._formatted_command())\n self._report_to_executor(msg)\n LUTE_DEBUG_EXIT(\"LUTE_DEBUG_BEFORE_TPP_EXEC\")\n os.execvp(file=self._cmd, args=self._args_list)\n\n def _formatted_command(self) -> str:\n \"\"\"Returns the command as it would passed on the command-line.\"\"\"\n formatted_cmd: str = \"\".join(f\"{arg} \" for arg in self._args_list)\n return formatted_cmd\n\n def _signal_start(self) -> None:\n \"\"\"Override start signal method to switch communication methods.\"\"\"\n super()._signal_start()\n time.sleep(0.05)\n signal: str = \"NO_PICKLE_MODE\"\n msg: Message = Message(signal=signal)\n self._report_to_executor(msg)\n
"},{"location":"source/tasks/task/#tasks.task.ThirdPartyTask.__init__","title":"__init__(*, params)
","text":"Initialize a Task.
Parameters:
Name Type Description Defaultparams
TaskParameters
Parameters needed to properly configure the analysis task. Task
s of this type MUST include the name of a binary to run and any arguments which should be passed to it (as would be done via command line). The binary is included with the parameter executable
. All other parameter names are assumed to be the long/extended names of the flag passed on the command line by default: * arg_name = 3
is converted to --arg_name 3
Positional arguments can be included with p_argN
where N
is any integer: * p_arg1 = 3
is converted to 3
Note that it is NOT recommended to rely on this default behaviour as command-line arguments can be passed in many ways. Refer to the documentation at https://slac-lcls.github.io/lute/tutorial/new_task/ under \"Specifying a TaskParameters Model for your Task\" for more information on how to control parameter parsing from within your TaskParameters model definition.
required Source code inlute/tasks/task.py
def __init__(self, *, params: TaskParameters) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. `Task`s of this type MUST include the name\n of a binary to run and any arguments which should be passed to\n it (as would be done via command line). The binary is included\n with the parameter `executable`. All other parameter names are\n assumed to be the long/extended names of the flag passed on the\n command line by default:\n * `arg_name = 3` is converted to `--arg_name 3`\n Positional arguments can be included with `p_argN` where `N` is\n any integer:\n * `p_arg1 = 3` is converted to `3`\n\n Note that it is NOT recommended to rely on this default behaviour\n as command-line arguments can be passed in many ways. Refer to\n the dcoumentation at\n https://slac-lcls.github.io/lute/tutorial/new_task/\n under \"Speciyfing a TaskParameters Model for your Task\" for more\n information on how to control parameter parsing from within your\n TaskParameters model definition.\n \"\"\"\n super().__init__(params=params)\n self._cmd = self._task_parameters.executable\n self._args_list: List[str] = [self._cmd]\n self._template_context: Dict[str, Any] = {}\n
"},{"location":"source/tasks/test/","title":"test","text":"Basic test Tasks for testing functionality.
Classes:
Name DescriptionTest
Simplest test Task - runs a 10 iteration loop and returns a result.
TestSocket
Test Task which sends larger data to test socket IPC.
TestWriteOutput
Test Task which writes an output file.
TestReadOutput
Test Task which reads in a file. Can be used to test database access.
"},{"location":"source/tasks/test/#tasks.test.Test","title":"Test
","text":" Bases: Task
Simple test Task to ensure subprocess and pipe-based IPC work.
Source code inlute/tasks/test.py
class Test(Task):\n \"\"\"Simple test Task to ensure subprocess and pipe-based IPC work.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(10):\n time.sleep(1)\n msg: Message = Message(contents=f\"Test message {i}\")\n self._report_to_executor(msg)\n if self._task_parameters.throw_error:\n raise RuntimeError(\"Testing Error!\")\n\n def _post_run(self) -> None:\n self._result.summary = \"Test Finished.\"\n self._result.task_status = TaskStatus.COMPLETED\n time.sleep(0.1)\n
"},{"location":"source/tasks/test/#tasks.test.TestReadOutput","title":"TestReadOutput
","text":" Bases: Task
Simple test Task to read in output from the test Task above.
Its pydantic model relies on a database access to retrieve the output file.
Source code inlute/tasks/test.py
class TestReadOutput(Task):\n \"\"\"Simple test Task to read in output from the test Task above.\n\n Its pydantic model relies on a database access to retrieve the output file.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n array: np.ndarray = np.loadtxt(self._task_parameters.in_file, delimiter=\",\")\n self._report_to_executor(msg=Message(contents=\"Successfully loaded data!\"))\n for i in range(5):\n time.sleep(1)\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.summary = \"Was able to load data.\"\n self._result.payload = \"This Task produces no output.\"\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/test/#tasks.test.TestSocket","title":"TestSocket
","text":" Bases: Task
Simple test Task to ensure basic IPC over Unix sockets works.
Source code inlute/tasks/test.py
class TestSocket(Task):\n \"\"\"Simple test Task to ensure basic IPC over Unix sockets works.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(self._task_parameters.num_arrays):\n msg: Message = Message(contents=f\"Sending array {i}\")\n self._report_to_executor(msg)\n time.sleep(0.05)\n msg: Message = Message(\n contents=np.random.rand(self._task_parameters.array_size)\n )\n self._report_to_executor(msg)\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.summary = f\"Sent {self._task_parameters.num_arrays} arrays\"\n self._result.payload = np.random.rand(self._task_parameters.array_size)\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/test/#tasks.test.TestWriteOutput","title":"TestWriteOutput
","text":" Bases: Task
Simple test Task to write output other Tasks depend on.
Source code inlute/tasks/test.py
class TestWriteOutput(Task):\n \"\"\"Simple test Task to write output other Tasks depend on.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(self._task_parameters.num_vals):\n # Doing some calculations...\n time.sleep(0.05)\n if i % 10 == 0:\n msg: Message = Message(contents=f\"Processed {i+1} values!\")\n self._report_to_executor(msg)\n\n def _post_run(self) -> None:\n super()._post_run()\n work_dir: str = self._task_parameters.lute_config.work_dir\n out_file: str = f\"{work_dir}/{self._task_parameters.outfile_name}\"\n array: np.ndarray = np.random.rand(self._task_parameters.num_vals)\n np.savetxt(out_file, array, delimiter=\",\")\n self._result.summary = \"Completed task successfully.\"\n self._result.payload = out_file\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"tutorial/creating_workflows/","title":"Workflows with Airflow","text":"Note: Airflow uses the term DAG, or directed acyclic graph, to describe workflows of tasks with defined (and acyclic) connectivities. This page will use the terms workflow and DAG interchangeably.
"},{"location":"tutorial/creating_workflows/#relevant-components","title":"Relevant Components","text":"In addition to the core LUTE package, a number of components are generally involved to run a workflow. The current set of scripts and objects are used to interface with Airflow, and the SLURM job scheduler. The core LUTE library can also be used to run workflows using different backends, and in the future these may be supported.
For building and running workflows using SLURM and Airflow, the following components are necessary, and will be described in more detail below: - Airflow launch script: launch_airflow.py
- This has a wrapper batch submission script: submit_launch_airflow.sh
. When running using the ARP (from the eLog), you MUST use this wrapper script instead of the Python script directly. - SLURM submission script: submit_slurm.sh
- Airflow operators: - JIDSlurmOperator
launch_airflow.py
","text":"Sends a request to an Airflow instance to submit a specific DAG (workflow). This script prepares an HTTP request with the appropriate parameters in a specific format.
A request involves the following information, most of which is retrieved automatically:
dag_run_data: Dict[str, Union[str, Dict[str, Union[str, int, List[str]]]]] = {\n \"dag_run_id\": str(uuid.uuid4()),\n \"conf\": {\n \"experiment\": os.environ.get(\"EXPERIMENT\"),\n \"run_id\": f\"{os.environ.get('RUN_NUM')}{datetime.datetime.utcnow().isoformat()}\",\n \"JID_UPDATE_COUNTERS\": os.environ.get(\"JID_UPDATE_COUNTERS\"),\n \"ARP_ROOT_JOB_ID\": os.environ.get(\"ARP_JOB_ID\"),\n \"ARP_LOCATION\": os.environ.get(\"ARP_LOCATION\", \"S3DF\"),\n \"Authorization\": os.environ.get(\"Authorization\"),\n \"user\": getpass.getuser(),\n \"lute_params\": params,\n \"slurm_params\": extra_args,\n \"workflow\": wf_defn, # Used only for custom DAGs. See below under advanced usage.\n },\n}\n
Note that the environment variables are used to fill in the appropriate information because this script is intended to be launched primarily from the ARP (which passes these variables). The ARP allows for the launch job to be defined in the experiment eLog and submitted automatically for each new DAQ run. The environment variables EXPERIMENT
and RUN
can alternatively be defined prior to submitting the script on the command-line.
The script takes a number of parameters:
launch_airflow.py -c <path_to_config_yaml> -w <workflow_name> [--debug] [--test] [-e <exp>] [-r <run>] [SLURM_ARGS]\n
-c
refers to the path of the configuration YAML that contains the parameters for each managed Task
in the requested workflow.-w
is the name of the DAG (workflow) to run. By convention each DAG is named by the Python file it is defined in. (See below).-W
(capital W) can be passed instead, followed by the path to a workflow definition, rather than -w
. See below for further discussion on this use case.--debug
is an optional flag to run all steps of the workflow in debug mode for verbose logging and output.--test
is an optional flag which will use the test Airflow instance. By default the script will make requests of the standard production Airflow instance.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.SLURM_ARGS
are SLURM arguments to be passed to the submit_slurm.sh
script which are used for each individual managed Task
. These arguments do NOT affect the submission parameters for the job running launch_airflow.py
(if using submit_launch_airflow.sh
below).Lifetime This script will run for the entire duration of the workflow (DAG). After making the initial request of Airflow to launch the DAG, it will enter a status update loop which will keep track of each individual job (each job runs one managed Task
) submitted by Airflow. At the end of each job it will collect the log file, in addition to providing a few other status updates/debugging messages, and append it to its own log. This allows all logging for the entire workflow (DAG) to be inspected from an individual file. This is particularly useful when running via the eLog, because only a single log file is displayed.
submit_launch_airflow.sh
","text":"This script is only necessary when running from the eLog using the ARP. The initial job submitted by the ARP can not have a duration of longer than 30 seconds, as it will then time out. As the launch_airflow.py
job will live for the entire duration of the workflow, which is often much longer than 30 seconds, the solution was to have a wrapper which submits the launch_airflow.py
script to run on the S3DF batch nodes. Usage of this script is mostly identical to launch_airflow.py
. All the arguments are passed transparently to the underlying Python script with the exception of the first argument which must be the location of the underlying launch_airflow.py
script. The wrapper will simply launch a batch job using minimal resources (1 core). While the primary purpose of the script is to allow running from the eLog, it is also a useful wrapper in general, since it allows the previous script to be submitted as a SLURM job.
Usage:
submit_launch_airflow.sh /path/to/launch_airflow.py -c <path_to_config_yaml> -w <workflow_name> [--debug] [--test] [-e <exp>] [-r <run>] [SLURM_ARGS]\n
"},{"location":"tutorial/creating_workflows/#submit_slurmsh","title":"submit_slurm.sh
","text":"Launches a job on the S3DF batch nodes using the SLURM job scheduler. This script launches a single managed Task
at a time. The usage is as follows:
submit_slurm.sh -c <path_to_config_yaml> -t <MANAGED_task_name> [--debug] [SLURM_ARGS ...]\n
As a reminder the managed Task
refers to the Executor
-Task
combination. The script does not parse any SLURM specific parameters, and instead passes them transparently to SLURM. At least the following two SLURM arguments must be provided:
--partition=<...> # Usually partition=milano\n--account=<...> # Usually account=lcls:$EXPERIMENT\n
Generally, resource requests will also be included, such as the number of cores to use. A complete call may look like the following:
submit_slurm.sh -c /sdf/data/lcls/ds/hutch/experiment/scratch/config.yaml -t Tester --partition=milano --account=lcls:experiment --ntasks=100 [...]\n
When running a workflow using the launch_airflow.py
script, each step of the workflow will be submitted using this script.
Operator
s are the objects submitted as individual steps of a DAG by Airflow. They are conceptually linked to the idea of a task in that each task of a workflow is generally an operator. Care should be taken not to confuse them with LUTE Task
s or managed Task
s though. There is, however, usually a one-to-one correspondence between a Task
and an Operator
.
Airflow runs on a K8S cluster which has no access to the experiment data. When we ask Airflow to run a DAG, it will launch an Operator
for each step of the DAG. However, the Operator
itself cannot perform productive analysis without access to the data. The solution employed by LUTE
is to have a limited set of Operator
s which do not perform analysis, but instead request that a LUTE
managed Task
s be submitted on the batch nodes where it can access the data. There may be small differences between how the various provided Operator
s do this, but in general they will all make a request to the job interface daemon (JID) that a new SLURM job be scheduled using the submit_slurm.sh
script described above.
Therefore, running a typical Airflow DAG involves the following steps:
launch_airflow.py
script is submitted, usually from a definition in the eLog.launch_airflow
script requests that Airflow run a specific DAG.Operator
s that make up the DAG definition.Operator
sends a request to the JID
to submit a job.JID
submits the elog_submit.sh
script with the appropriate managed Task
.Task
runs on the batch nodes, while the Operator
, requesting updates from the JID on job status, waits for it to complete.Task
completes, the Operator
will receieve this information and tell the Airflow server whether the job completed successfully or resulted in failure.Currently, the following Operator
s are maintained: - JIDSlurmOperator
: The standard Operator
. Each instance has a one-to-one correspondence with a LUTE managed Task
.
JIDSlurmOperator
arguments","text":"task_id
: This is nominally the name of the task on the Airflow side. However, for simplicity this is used 1-1 to match the name of a managed Task defined in LUTE's managed_tasks.py
module. I.e., it should be the name of an Executor(\"Task\")
object which will run the specific Task of interest. This must match the name of a defined managed Task.max_cores
: Used to cap the maximum number of cores which should be requested of SLURM. By default all jobs will run with the same number of cores, which should be specified when running the launch_airflow.py
script (either from the ARP, or by hand). This behaviour was chosen because in general we want to increase or decrease the core-count for all Task
s uniformly, and we don't want to have to specify core number arguments for each job individually. Nonetheless, on occasion it may be necessary to cap the number of cores a specific job will use. E.g. if the default value specified when launching the Airflow DAG is multiple cores, and one job is single-threaded, the core count can be capped for that single job to 1, while the rest run with multiple cores.max_nodes
: Similar to the above. This will make sure the Task
is distributed across no more than a maximum number of nodes. This feature is useful for, e.g., multi-threaded software which does not make use of tools like MPI
. So, the Task
can run on multiple cores, but only within a single node.require_partition
: This option is a string that forces the use of a specific S3DF partition for the managed Task
submitted by the Operator. E.g. typically an LCLS user will use --partition=milano
for CPU-based workflows; however, if a specific Task
requires a GPU you may use JIDSlurmOperator(\"MyTaskRunner\", require_partition=\"ampere\")
to override the partition for that single Task
.custom_slurm_params
: You can provide a string of parameters which will be used in its entirety to replace any and all default arguments passed by the launch script. This method is not recommended for general use and is mostly used for dynamic DAGs described at the end of the document.Defining a new workflow involves creating a new module (Python file) in the directory workflows/airflow
, creating a number of Operator
instances within the module, and then drawing the connectivity between them. At the top of the file an Airflow DAG is created and given a name. By convention all LUTE
workflows use the name of the file as the name of the DAG. The following code can be copied exactly into the file:
from datetime import datetime\nimport os\nfrom airflow import DAG\nfrom lute.operators.jidoperators import JIDSlurmOperator # Import other operators if needed\n\ndag_id: str = f\"lute_{os.path.splitext(os.path.basename(__file__))[0]}\"\ndescription: str = (\n \"Run SFX processing using PyAlgos peak finding and experimental phasing\"\n)\n\ndag: DAG = DAG(\n dag_id=dag_id,\n start_date=datetime(2024, 3, 18),\n schedule_interval=None,\n description=description,\n)\n
Once the DAG has been created, a number of Operator
s must be created to run the various LUTE analysis operations. As an example consider a partial SFX processing workflow which includes steps for peak finding, indexing, merging, and calculating figures of merit. Each of the 4 steps will have an Operator
instance which will launch a corresponding LUTE
managed Task
, for example:
# Using only the JIDSlurmOperator\n# syntax: JIDSlurmOperator(task_id=\"LuteManagedTaskName\", dag=dag) # optionally, max_cores=123)\npeak_finder: JIDSlurmOperator = JIDSlurmOperator(task_id=\"PeakFinderPyAlgos\", dag=dag)\n\n# We specify a maximum number of cores for the rest of the jobs.\nindexer: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=120, task_id=\"CrystFELIndexer\", dag=dag\n)\n# We can alternatively specify this task be only ever run with the following args.\n# indexer: JIDSlurmOperator = JIDSlurmOperator(\n# custom_slurm_params=\"--partition=milano --ntasks=120 --account=lcls:myaccount\",\n# task_id=\"CrystFELIndexer\",\n# dag=dag,\n# )\n\n# Merge\nmerger: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=120, task_id=\"PartialatorMerger\", dag=dag\n)\n\n# Figures of merit\nhkl_comparer: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=8, task_id=\"HKLComparer\", dag=dag\n)\n
Finally, the dependencies between the Operator
s are \"drawn\", defining the execution order of the various steps. The >>
operator has been overloaded for the Operator
class, allowing it to be used to specify the next step in the DAG. In this case, a completely linear DAG is drawn as:
peak_finder >> indexer >> merger >> hkl_comparer\n
Parallel execution can be added by using the >>
operator multiple times. Consider a task1
which upon successful completion starts a task2
and task3
in parallel. This dependency can be added to the DAG using:
#task1: JIDSlurmOperator = JIDSlurmOperator(...)\n#task2 ...\n\ntask1 >> task2\ntask1 >> task3\n
As each DAG is defined in pure Python, standard control structures (loops, if statements, etc.) can be used to create more complex workflow arrangements.
Note: Your DAG will not be available to Airflow until your PR including the file you have defined is merged! Once merged the file will be synced with the Airflow instance and can be run using the scripts described earlier in this document. For testing it is generally preferred that you run each step of your DAG individually using the submit_slurm.sh
script and the independent managed Task
names. If, however, you want to test the behaviour of Airflow itself (in a modified form) you can use the advanced run-time DAGs defined below as well.
In most cases, standard DAGs should be defined as described above and called by name. However, Airflow also supports the creation of DAGs dynamically, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Considering the first example DAG defined above (for serial femtosecond crystallography), the standard DAG looked like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n  slurm_params: ''\n  next:\n  - task_name: PartialatorMerger\n    slurm_params: ''\n    next:\n    - task_name: HKLComparer\n      slurm_params: ''\n      next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
as before, in the same way that would be passed to the JIDSlurmOperator
.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the list are run in parallel. As a second example, to run task1
followed by task2
and task3
in parellel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined this way we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
Task
","text":"Task
s can be broadly categorized into two types: - \"First-party\" - where the analysis or executed code is maintained within this library. - \"Third-party\" - where the analysis, code, or program is maintained elsewhere and is simply called by a wrapping Task
.
Creating a new Task
of either type generally involves the same steps, although for first-party Task
s, the analysis code must of course also be written. Due to this difference, as well as additional considerations for parameter handling when dealing with \"third-party\" Task
s, the \"first-party\" and \"third-party\" Task
integration cases will be considered separately.
Task
","text":"There are two required steps for third-party Task
integration, and one additional step which is optional, and may not be applicable to all possible third-party Task
s. Generally, Task
integration requires: 1. Defining a TaskParameters
(pydantic) model which fully parameterizes the Task
. This involves specifying a path to a binary, and all the required command-line arguments to run the binary. 2. Creating a managed Task
by specifying an Executor
for the new third-party Task
. At this stage, any additional environment variables can be added which are required for the execution environment. 3. (Optional/Maybe applicable) Create a template for a third-party configuration file. If the new Task
has its own configuration file, specifying a template will allow that file to be parameterized from the singular LUTE yaml configuration file. A couple of minor additions to the pydantic
model specified in 1. are required to support template usage.
Each of these stages will be discussed in detail below. The vast majority of the work is completed in step 1.
"},{"location":"tutorial/new_task/#specifying-a-taskparameters-model-for-your-task","title":"Specifying aTaskParameters
Model for your Task
","text":"A brief overview of parameters objects will be provided below. The following information goes into detail only about specifics related to LUTE configuration. An in depth description of pydantic is beyond the scope of this tutorial; please refer to the official documentation for more information. Please note that due to environment constraints pydantic is currently pinned to version 1.10! Make sure to read the appropriate documentation for this version as many things are different compared to the newer releases. At the end this document there will be an example highlighting some supported behaviour as well as a FAQ to address some common integration considerations.
Task
s and TaskParameter
s
All Task
s have a corresponding TaskParameters
object. These objects are linked exclusively by a named relationship. For a Task
named MyThirdPartyTask
, the parameters object must be named MyThirdPartyTaskParameters
. For third-party Task
s there are a number of additional requirements: - The model must inherit from a base class called ThirdPartyParameters
. - The model must have one field specified called executable
. The presence of this field indicates that the Task
is a third-party Task
and the specified executable must be called. This allows all third-party Task
s to be defined exclusively by their parameters model. A single ThirdPartyTask
class handles execution of all third-party Task
s.
All models are stored in lute/io/models
. For any given Task
, a new model can be added to an existing module contained in this directory or to a new module. If creating a new module, make sure to add an import statement to lute.io.models.__init__
.
Defining TaskParameter
s
When specifying parameters the default behaviour is to provide a one-to-one correspondance between the Python attribute specified in the parameter model, and the parameter specified on the command-line. Single-letter attributes are assumed to be passed using -
, e.g. n
will be passed as -n
when the executable is launched. Longer attributes are passed using --
, e.g. by default a model attribute named my_arg
will be passed on the command-line as --my_arg
. Positional arguments are specified using p_argX
where X
is a number. All parameters are passed in the order that they are specified in the model.
However, because the number of possible command-line combinations is large, relying on the default behaviour above is NOT recommended. It is provided solely as a fallback. Instead, there are a number of configuration knobs which can be tuned to achieve the desired behaviour. The two main mechanisms for controlling behaviour are specification of model-wide configuration under the Config
class within the model's definition, and parameter-by-parameter configuration using field attributes. For the latter, we define all parameters as Field
objects. This allows parameters to have their own attributes, which are parsed by LUTE's task-layer. Given this, the preferred starting template for a TaskParameters
model is the following - we assume we are integrating a new Task
called RunTask
:
\nfrom pydantic import Field, validator\n# Also include any pydantic type specifications - Pydantic has many custom\n# validation types already, e.g. types for constrained numberic values, URL handling, etc.\n\nfrom .base import ThirdPartyParameters\n\n# Change class name as necessary\nclass RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for RunTask...\"\"\"\n\n class Config(ThirdPartyParameters.Config): # MUST be exactly as written here.\n ...\n # Model-wide configuration will go here\n\n executable: str = Field(\"/path/to/executable\", description=\"...\")\n ...\n # Additional params.\n # param1: param1Type = Field(\"default\", description=\"\", ...)\n
Config settings and options Under the class definition for Config
in the model, we can modify global options for all the parameters. In addition, there are a number of configuration options related to specifying what the outputs/results from the associated Task
are, and a number of options to modify runtime behaviour. Currently, the available configuration options are:
run_directory
If provided, can be used to specify the directory from which a Task
is run. None
(not provided) NO set_result
bool
. If True
search the model definition for a parameter that indicates what the result is. False
NO result_from_params
If set_result
is True
can define a result using this option and a validator. See also is_result
below. None
(not provided) NO short_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s long_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s These configuration options modify how the parameter models are parsed and passed along on the command-line, as well as what we consider results and where a Task
can run. The default behaviour is that parameters are assumed to be passed as -p arg
and --param arg
, the Task
will be run in the current working directory (or scratch if submitted with the ARP), and we have no information about Task
results . Setting the above options can modify this behaviour.
short_flags_use_eq
and/or long_flags_use_eq
to True
parameters are instead passed as -p=arg
and --param=arg
.run_directory
to a valid path, we can force a Task
to be run in a specific directory. By default the Task
will be run from the directory you submit the job in, or from your scratch folder (/sdf/scratch/...
) if you submit from the eLog. Some ThirdPartyTask
s rely on searching the correct working directory in order run properly.set_result
to True
we indicate that the TaskParameters
model will provide information on what the TaskResult
is. This setting must be used with one of two options, either the result_from_params
Config
option, described below, or the Field attribute is_result
described in the next sub-section (Field Attributes).result_from_params
is a Config option that can be used when set_result==True
. In conjunction with a validator (described a sections down) we can use this option to specify a result from all the information contained in the model. E.g. if you have a Task
that has parameters for an output_directory
and a output_filename
, you can set result_from_params==f\"{output_directory}/{output_filename}\"
.Field attributes In addition to the global configuration options there are a couple of ways to specify individual parameters. The following Field
attributes are used when parsing the model:
flag_type
Specify the type of flag for passing this argument. One of \"-\"
, \"--\"
, or \"\"
N/A p_arg1 = Field(..., flag_type=\"\")
rename_param
Change the name of the parameter as passed on the command-line. N/A my_arg = Field(..., rename_param=\"my-arg\")
description
Documentation of the parameter's usage or purpose. N/A arg = Field(..., description=\"Argument for...\")
is_result
bool
. If the set_result
Config
option is True
, we can set this to True
to indicate a result. N/A output_result = Field(..., is_result=true)
The flag_type
attribute allows us to specify whether the parameter corresponds to a positional (\"\"
) command line argument, requires a single hyphen (\"-\"
), or a double hyphen (\"--\"
). By default, the parameter name is passed as-is on the command-line. However, command-line arguments can have characters which would not be valid in Python variable names. In particular, hyphens are frequently used. To handle this case, the rename_param
attribute can be used to specify an alternative spelling of the parameter when it is passed on the command-line. This also allows for using more descriptive variable names internally than those used on the command-line. A description
can also be provided for each Field to document the usage and purpose of that particular parameter.
As an example, we can again consider defining a model for a RunTask
Task
. Consider an executable which would normally be called from the command-line as follows:
/sdf/group/lcls/ds/tools/runtask -n <nthreads> --method=<algorithm> -p <algo_param> [--debug]\n
A model specification for this Task
may look like:
class RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True # For the --method parameter\n\n # Prefer using full/absolute paths where possible.\n # No flag_type needed for this field\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/runtask\", description=\"Runtask Binary v1.0\"\n )\n\n # We can provide a more descriptive name for -n\n # Let's assume it's a number of threads, or processes, etc.\n num_threads: int = Field(\n 1, description=\"Number of concurrent threads.\", flag_type=\"-\", rename_param=\"n\"\n )\n\n # In this case we will use the Python variable name directly when passing\n # the parameter on the command-line\n method: str = Field(\"algo1\", description=\"Algorithm to use.\", flag_type=\"--\")\n\n # For an actual parameter we would probably have a better name. Lets assume\n # This parameter (-p) modifies the behaviour of the method above.\n method_param1: int = Field(\n 3, description=\"Modify method performance.\", flag_type=\"-\", rename_param=\"p\"\n )\n\n # Boolean flags are only passed when True! `--debug` is an optional parameter\n # which is not followed by any arguments.\n debug: bool = Field(\n False, description=\"Whether to run in debug mode.\", flag_type=\"--\"\n )\n
The is_result
attribute allows us to specify whether the corresponding Field points to the output/result of the associated Task
. Consider a Task
, RunTask2
which writes its output to a single file which is passed as a parameter.
class RunTask2Parameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask2 binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True # This must be set here!\n # result_from_params: Optional[str] = None # We can use this for more complex result setups (see below). Ignore for now.\n\n # Prefer using full/absolute paths where possible.\n # No flag_type needed for this field\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/runtask2\", description=\"Runtask Binary v2.0\"\n )\n\n # Lets assume we take one input and write one output file\n # We will not provide a default value, so this parameter MUST be provided\n input: str = Field(\n description=\"Path to input file.\", flag_type=\"--\"\n )\n\n # We will also not provide a default for the output\n # BUT, we will specify that whatever is provided is the result\n output: str = Field(\n description=\"Path to write output to.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True, # This means this parameter points to the result!\n )\n
Additional Comments 1. Model parameters of type bool
are not passed with an argument and are only passed when True
. This is a common use-case for boolean flags which enable things like test or debug modes, verbosity or reporting features. E.g. --debug
, --test
, --verbose
, etc. - If you need to pass the literal words \"True\"
or \"False\"
, use a parameter of type str
. 2. You can use pydantic
types to constrain parameters beyond the basic Python types. E.g. conint
can be used to define lower and upper bounds for an integer. There are also types for common categories, positive/negative numbers, paths, URLs, IP addresses, etc. - Even more custom behaviour can be achieved with validator
s (see below). 3. All TaskParameters
objects and their subclasses have access to a lute_config
parameter, which is of type lute.io.models.base.AnalysisHeader
. This special parameter is ignored when constructing the call for a binary task, but it provides access to shared/common parameters between tasks. For example, the following parameters are available through the lute_config
object, and may be of use when constructing validators. All fields can be accessed with .
notation. E.g. lute_config.experiment
. - title
: A user provided title/description of the analysis. - experiment
: The current experiment name - run
: The current acquisition run number - date
: The date of the experiment or the analysis. - lute_version
: The version of the software you are running. - task_timeout
: How long a Task
can run before it is killed. - work_dir
: The main working directory for LUTE. Files and the database are created relative to this directory. This is separate from the run_directory
config option. LUTE will write files to the work directory by default; however, the Task
itself is run from run_directory
if it is specified.
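As a concrete illustration of point 3, the following sketch uses lute_config inside a validator (validators are described in the next section) to build a default output path from the shared header values. The parameter name and the file-naming scheme are hypothetical; only the lute_config fields come from the list above:
from typing import Any, Dict

from pydantic import Field, validator

from .base import ThirdPartyParameters

class RunTaskParameters(ThirdPartyParameters):
    """Sketch: derive a default output file from the shared analysis header."""

    out_file: str = Field("", description="Output file.", flag_type="-", rename_param="o")

    @validator("out_file", always=True)
    def set_default_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:
        if out_file != "":  # Respect an explicitly provided value
            return out_file
        # lute_config is validated before Task-specific fields, so it is available here
        header = values["lute_config"]
        return f"{header.work_dir}/{header.experiment}_r{int(header.run):04d}.out"  # Hypothetical naming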
Validators Pydantic uses validators
to determine whether a value for a specific field is appropriate. There are default validators for all the standard library types and the types specified within the pydantic package; however, it is straightforward to define custom ones as well. In the template code-snippet above we imported the validator
decorator. To create our own validator we define a method (with any name) with the following prototype, and decorate it with the validator
decorator:
@validator(\"name_of_field_to_decorate\")\ndef my_custom_validator(cls, field: Any, values: Dict[str, Any]) -> Any: ...\n
In this snippet, the field
variable corresponds to the value for the specific field we want to validate. values
is a dictionary of fields and their values which have been parsed prior to the current field. This means you can validate the value of a parameter based on the values provided for other parameters. Since pydantic always validates the fields in the order they are defined in the model, fields dependent on other fields should come later in the definition.
For example, consider the method_param1
field defined above for RunTask
. We can provide a custom validator which changes the default value for this field depending on what type of algorithm is specified for the --method
option. We will also constrain the options for method
to two specific strings.
from pydantic import Field, validator, ValidationError, root_validator\nclass RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask binary.\"\"\"\n\n # [...]\n\n # In this case we will use the Python variable name directly when passing\n # the parameter on the command-line\n method: str = Field(\"algo1\", description=\"Algorithm to use.\", flag_type=\"--\")\n\n # For an actual parameter we would probably have a better name. Lets assume\n # This parameter (-p) modifies the behaviour of the method above.\n method_param1: Optional[int] = Field(\n description=\"Modify method performance.\", flag_type=\"-\", rename_param=\"p\"\n )\n\n # We will only allow method to take on one of two values\n @validator(\"method\")\n def validate_method(cls, method: str, values: Dict[str, Any]) -> str:\n \"\"\"Method validator: --method can be algo1 or algo2.\"\"\"\n\n valid_methods: List[str] = [\"algo1\", \"algo2\"]\n if method not in valid_methods:\n raise ValueError(\"method must be algo1 or algo2\")\n return method\n\n # Lets change the default value of `method_param1` depending on `method`\n # NOTE: We didn't provide a default value to the Field above and made it\n # optional. We can use this to test whether someone is purposefully\n # overriding the value of it, and if not, set the default ourselves.\n # We set `always=True` since pydantic will normally not use the validator\n # if the default is not changed\n @validator(\"method_param1\", always=True)\n def validate_method_param1(cls, param1: Optional[int], values: Dict[str, Any]) -> int:\n \"\"\"method param1 validator\"\"\"\n\n # If someone actively defined it, lets just return that value\n # We could instead do some additional validation to make sure that the\n # value they provided is valid...\n if param1 is not None:\n return param1\n\n # method_param1 comes after method, so this will be defined, or an error\n # would have been raised.\n method: str = values['method']\n if method == \"algo1\":\n return 3\n elif method == \"algo2\":\n return 5\n
The special root_validator(pre=False)
can also be used to provide validation of the model as a whole. This is also the recommended method for specifying a result (using result_from_params
) which has a complex dependence on the parameters of the model. This latter use-case is described in FAQ 2 below.
Use a custom validator. The example above shows how to do this. The parameter that depends on another parameter must come LATER in the model definition than the independent parameter.
My TaskResult
is determinable from the parameters model, but it isn't easily specified by one parameter. How can I use result_from_params
to indicate the result?When a result can be identified from the set of parameters defined in a TaskParameters
model, but is not as straightforward as saying it is equivalent to one of the parameters alone, we can set result_from_params
using a custom validator. In the example below, we have two parameters which together determine what the result is, output_dir
and out_name
. Using a validator we will define a result from these two values.
from pydantic import Field, root_validator\n\nclass RunTask3Parameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask3 binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True # This must be set here!\n result_from_params: str = \"\" # We will set this momentarily\n\n # [...] executable, other params, etc.\n\n output_dir: str = Field(\n description=\"Directory to write output to.\",\n flag_type=\"--\",\n rename_param=\"dir\",\n )\n\n out_name: str = Field(\n description=\"The name of the final output file.\",\n flag_type=\"--\",\n rename_param=\"oname\",\n )\n\n # We can still provide other validators as needed\n # But for now, we just set result_from_params\n # Validator name can be anything, we set pre=False so this runs at the end\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n # Extract the values of output_dir and out_name\n output_dir: str = values[\"output_dir\"]\n out_name: str = values[\"out_name\"]\n\n result: str = f\"{output_dir}/{out_name}\"\n # Now we set result_from_params\n cls.Config.result_from_params = result\n\n # We haven't modified any other values, but we MUST return this!\n return values\n
My Task
depends on the output of a previous Task
, how can I specify this dependency? Parameters used to run a Task
are recorded in a database for every Task
. It is also recorded whether or not the execution of that specific parameter set was successful. A utility function is provided to access the most recent values from the database for a specific parameter of a specific Task
. It can also be used to specify whether unsuccessful Task
s should be included in the query. This utility can be used within a validator to specify dependencies. For example, suppose the input of RunTask2
(parameter input
) depends on the output location of RunTask1
(parameter outfile
). A validator of the following type can be used to retrieve the output file and make it the default value of the input parameter:
from typing import Any, Dict, Optional\n\nfrom pydantic import Field, validator\n\nfrom .base import ThirdPartyParameters\nfrom ..db import read_latest_db_entry\n\nclass RunTask2Parameters(ThirdPartyParameters):\n input: str = Field(\"\", description=\"Input file.\", flag_type=\"--\")\n\n @validator(\"input\")\n def validate_input(cls, input: str, values: Dict[str, Any]) -> str:\n if input == \"\":\n task1_out: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", # Working directory. We search for the database here.\n \"RunTask1\", # Name of Task we want to look up\n \"outfile\", # Name of parameter of the Task\n valid_only=True, # We only want valid output files.\n )\n # read_latest_db_entry returns None if nothing is found\n if task1_out is not None:\n return task1_out\n return input\n
There are more examples of this pattern spread throughout the various Task
models.
Executor
: Creating a runnable, \"managed Task
\"","text":"Overview
After a pydantic model has been created, the next required step is to define a managed Task
. In the context of this library, a managed Task
refers to the combination of an Executor
and a Task
to run. The Executor
manages the process of Task
submission and the execution environment, as well as performing any logging, eLog communication, etc. There are currently two types of Executor
to choose from, but only one is applicable to third-party code. The second Executor
is listed below for completeness only. If you need MPI see the note below.
Executor
: This is the standard Executor
. It should be used for third-party use cases.MPIExecutor
: This performs all the same types of operations as the option above; however, it will submit your Task
using MPI.MPIExecutor
will submit the Task
using the number of available cores - 1. The number of cores is determined from the physical core/thread count on your local machine, or the number of cores allocated by SLURM when submitting on the batch nodes.Using MPI with third-party Task
s
As mentioned, you should set up a third-party Task
to use the first type of Executor
. If, however, your third-party Task
uses MPI, this may seem non-intuitive. When using the MPIExecutor,
LUTE code is submitted with MPI. This includes the code that performs signalling to the Executor
and exec
s the third-party code you are interested in running. While it is possible to set this code up to run with MPI, it is more challenging in the case of third-party Task
s because there is no Task
code to modify directly! The MPIExecutor
is provided mostly for first-party code. This is not an issue, however, since the standard Executor
is easily configured to run with MPI in the case of third-party code.
When using the standard Executor
for a Task
requiring MPI, the executable
in the pydantic model must be set to mpirun
. For example, a third-party Task
model that uses MPI but is intended to be run with the Executor
may look like the following. We assume this Task
runs a Python script using MPI.
import os\n\nfrom pydantic import Field, PositiveInt\n\nfrom .base import ThirdPartyParameters\n\nclass RunMPITaskParameters(ThirdPartyParameters):\n class Config(ThirdPartyParameters.Config):\n ...\n\n executable: str = Field(\"mpirun\", description=\"MPI executable\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n pos_arg: str = Field(\"python\", description=\"Python...\", flag_type=\"\")\n script: str = Field(\"\", description=\"Python script to run with MPI\", flag_type=\"\")\n
Selecting the Executor
After deciding on which Executor
to use, a single line must be added to the lute/managed_tasks.py
module:
# Initialization: Executor(\"TaskName\")\nTaskRunner: Executor = Executor(\"SubmitTask\")\n# TaskRunner: MPIExecutor = MPIExecutor(\"SubmitTask\") ## If using the MPIExecutor\n
In an attempt to make it easier to discern whether discussing a Task
or managed Task
, the standard naming convention is that the Task
(class name) will have a verb in the name, e.g. RunTask
, SubmitTask
. The corresponding managed Task
will use a related noun, e.g. TaskRunner
, TaskSubmitter
, etc.
As a reminder, the Task
name is the first part of the class name of the pydantic model, without the Parameters
suffix. This name must match. E.g. if your pydantic model's class name is RunTaskParameters
, the Task
name is RunTask
, and this is the string passed to the Executor
initializer.
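The relationship between the two names can be summarized in one line of Python; this is just a mnemonic for the convention, not code LUTE requires you to write:
class RunTaskParameters:  # Stand-in for the real pydantic model
    ...

task_name = RunTaskParameters.__name__.removesuffix("Parameters")  # Python 3.9+
print(task_name)  # "RunTask" -> the string passed as Executor("RunTask")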
Modifying the environment
If your third-party Task
can run in the standard psana
environment with no further configuration files, the setup process is now complete and your Task
can be run within the LUTE framework. If, on the other hand, your Task
requires some changes to the environment, this is managed through the Executor
. There are a couple of principal methods that the Executor
has to change the environment.
Executor.update_environment
: if you only need to add a few environment variables, or update the PATH
this is the method to use. The method takes a Dict[str, str]
as input. Any variables can be passed/defined using this method. By default, any variables in the dictionary will overwrite those variable definitions in the current environment if they are already present, except for the variable PATH
. By default PATH
entries in the dictionary are prepended to the current PATH
available in the environment the Executor
runs in (the standard psana
environment). This behaviour can be changed to either append to, or entirely overwrite, the PATH via an optional second argument to the method.Executor.shell_source
: This method will source a shell script which can perform numerous modifications of the environment (PATH changes, new environment variables, conda environments, etc.). The method takes a str
which is the path to a shell script to source.As an example, we will update the PATH
of one Task
and source a script for a second.
TaskRunner: Executor = Executor(\"RunTask\")\n# update_environment(env: Dict[str,str], update_path: str = \"prepend\") # \"append\" or \"overwrite\"\nTaskRunner.update_environment(\n { \"PATH\": \"/sdf/group/lcls/ds/tools\" } # This entry will be prepended to the PATH available after sourcing `psconda.sh`\n)\n\nTask2Runner: Executor = Executor(\"RunTask2\")\nTask2Runner.shell_source(\"/sdf/group/lcls/ds/tools/new_task_setup.sh\") # Will source new_task_setup.sh script\n
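The optional second argument shown in the signature comment above controls how PATH entries are combined. A brief sketch of the other two modes follows, in the same lute/managed_tasks.py context as the example above (the directory and variable names are placeholders):
TaskRunner: Executor = Executor("RunTask")

# Append instead of prepend: existing PATH entries take precedence over the new directory.
TaskRunner.update_environment({"PATH": "/sdf/group/lcls/ds/tools"}, "append")

# Overwrite: replace the PATH entirely with the provided value.
# TaskRunner.update_environment({"PATH": "/sdf/group/lcls/ds/tools"}, "overwrite")

# Non-PATH variables are simply set, overwriting any existing definition.
TaskRunner.update_environment({"MY_TOOL_CONFIG": "/sdf/group/lcls/ds/tools/config.toml"})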
"},{"location":"tutorial/new_task/#using-templates-managing-third-party-configuration-files","title":"Using templates: managing third-party configuration files","text":"Some third-party executables will require their own configuration files. These are often separate JSON or YAML files, although they can also be bash or Python scripts which are intended to be edited. Since LUTE requires its own configuration YAML file, it attempts to handle these cases by using Jinja templates. When wrapping a third-party task a template can also be provided - with small modifications to the Task
's pydantic model, LUTE can process special types of parameters to render them in the template. LUTE offloads all the template rendering to Jinja, making the required additions to the pydantic model small. On the other hand, it does require understanding the Jinja syntax, and the provision of a well-formatted template, to properly parse parameters. Some basic examples of this syntax will be shown below; however, it is recommended that the Task
implementer refer to the official Jinja documentation for more information.
LUTE provides two additional base models which are used for template parsing in conjunction with the primary Task
model. These are: - TemplateParameters
objects which hold parameters which will be used to render a portion of a template. - TemplateConfig
objects which hold two strings: the name of the template file to use and the full path (including filename) of where to output the rendered result.
Task
models which inherit from the ThirdPartyParameters
model, as all third-party Task
s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as TemplateParameters
objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such the only requirement on the Python-side when adding template rendering functionality to the Task
is the addition of one parameter - an instance of TemplateConfig
. The instance MUST be called lute_template_cfg
.
from pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters\n\nclass RunTaskParameters(ThirdPartyParameters):\n ...\n # This parameter MUST be called lute_template_cfg!\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"name_of_template.json\",\n output_path=\"/path/to/write/rendered_output_to.json\",\n ),\n description=\"Template rendering configuration\",\n )\n
LUTE looks for the template in config/templates
, so only the name of the template file to use within that directory is required for the template_name
attribute of lute_template_cfg
. LUTE can write the output anywhere the user has write permissions, and with any name, so the full absolute path including the filename should be used for the output_path
of lute_template_cfg
.
The rest of the work is done by the combination of Jinja, LUTE's configuration YAML file, and the template itself. Understanding the interplay between these components is perhaps best illustrated by an example. As such, let us consider a simple third-party Task
whose only input parameter (on the command-line) is the location of a configuration JSON file. We'll call the third-party executable jsonuser
and our Task
model RunJsonUserParameters
. We assume the program is run like:
jsonuser -i <input_file.json>\n
The first step is to set up the pydantic model as before.
from pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters\n\nclass RunJsonUserParameters(ThirdPartyParameters):\n executable: str = Field(\n \"/path/to/jsonuser\", description=\"Executable which requires a JSON configuration file.\"\n )\n # Let's assume the JSON file is passed as \"-i <path_to_json>\"\n input_json: str = Field(\n \"\", description=\"Path to the input JSON file.\", flag_type=\"-\", rename_param=\"i\"\n )\n
The next step is to create a template for the JSON file. Let's assume the JSON file looks like:
{\n \"param1\": \"arg1\",\n \"param2\": 4,\n \"param3\": {\n \"a\": 1,\n \"b\": 2\n },\n \"param4\": [\n 1,\n 2,\n 3\n ]\n}\n
Any or all of these values can be substituted, and we can determine the way in which we will provide them. I.e. a substitution can be provided for each variable individually, or, for example for a nested hierarchy, a dictionary can be provided which will substitute all the items at once. For this simple case, let's provide variables for param1
, param2
, param3.b
and assume that we want the first and second entries for param4
to be identical for our use case (i.e., we can use one variable for them both). In total, this means we will perform 5 substitutions using 4 variables. Jinja will substitute a variable anywhere it sees the following syntax: {{ variable_name }}
. As such a valid template for our use-case may look like:
{\n \"param1\": {{ str_var }},\n \"param2\": {{ int_var }},\n \"param3\": {\n \"a\": 1,\n \"b\": {{ p3_b }}\n },\n \"param4\": [\n {{ val }},\n {{ val }},\n 3\n ]\n}\n
We save this file as jsonuser.json
in config/templates
. Next, we will update the original pydantic model to include our template configuration. We still have an issue, however, in that we need to decide where to write the output of the template to. In this case, we can use the input_json
parameter. We will assume that the user will provide this, although a default value can also be used. A custom validator will be added so that we can take the input_json
value and update the value of lute_template_cfg.output_path
with it.
from typing import Any, Dict\n# from typing import Optional\n\nfrom pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters #, TemplateParameters\n\nclass RunJsonUserParameters(ThirdPartyParameters):\n executable: str = Field(\n \"jsonuser\", description=\"Executable which requires a JSON configuration file.\"\n )\n # Let's assume the JSON file is passed as \"-i <path_to_json>\"\n input_json: str = Field(\n \"\", description=\"Path to the input JSON file.\", flag_type=\"-\", rename_param=\"i\"\n )\n # Add template configuration! *MUST* be called `lute_template_cfg`\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"jsonuser.json\", # Only the name of the file here.\n output_path=\"\",\n ),\n description=\"Template rendering configuration\",\n )\n # We do not need to include these TemplateParameters, they will be added\n # automatically if provided in the YAML\n #str_var: Optional[TemplateParameters]\n #int_var: Optional[TemplateParameters]\n #p3_b: Optional[TemplateParameters]\n #val: Optional[TemplateParameters]\n\n\n # Tell LUTE to write the rendered template to the location provided with\n # `input_json`. I.e. update `lute_template_cfg.output_path`\n @validator(\"lute_template_cfg\", always=True)\n def update_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"input_json\"]\n return lute_template_cfg\n
All that is left to render the template is to provide the variables we want to substitute in the LUTE configuration YAML. In our case we must provide the 4 variable names we included within the substitution syntax ({{ var_name }}
). The names in the YAML must match those in the template.
RunJsonUser:\n input_json: \"/my/chosen/path.json\" # We'll come back to this...\n str_var: \"arg1\" # Will substitute for \"param1\": \"arg1\"\n int_var: 4 # Will substitute for \"param2\": 4\n p3_b: 2 # Will substitute for \"param3: { \"b\": 2 }\n val: 2 # Will substitute for \"param4\": [2, 2, 3] in the JSON\n
If, on the other hand, a user already has a valid JSON file, it is possible to turn off the template rendering: ALL template variables (TemplateParameters
) are simply excluded from the configuration YAML.
RunJsonUser:\n input_json: \"/path/to/existing.json\"\n #str_var: ...\n #...\n
"},{"location":"tutorial/new_task/#additional-jinja-syntax","title":"Additional Jinja Syntax","text":"There are many other syntactical constructions we can use with Jinja. Some of the useful ones are:
If Statements - E.g. only include portions of the template if a value is defined.
{% if VARNAME is defined %}\n// Stuff to include\n{% endif %}\n
Loops - E.g. Unpacking multiple elements from a dictionary.
{% for name, value in VARNAME.items() %}\n// Do stuff with name and value\n{% endfor %}\n
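If you want to sanity-check a template before wiring it into a Task, you can render it by hand with the jinja2 package (the engine LUTE offloads rendering to). This is an optional debugging aid rather than part of the LUTE workflow; the variable values are the ones from the jsonuser example:
from jinja2 import Environment, FileSystemLoader

# Load templates from LUTE's template directory and render jsonuser.json manually.
env = Environment(loader=FileSystemLoader("config/templates"))
template = env.get_template("jsonuser.json")

rendered: str = template.render(str_var="arg1", int_var=4, p3_b=2, val=2)
print(rendered)  # Inspect the output before pointing input_json at a real run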
"},{"location":"tutorial/new_task/#creating-a-first-party-task","title":"Creating a \"First-Party\" Task
","text":"The process for creating a \"First-Party\" Task
is very similar to that for a \"Third-Party\" Task
, with the difference being that you must also write the analysis code. The steps for integration are: 1. Write the TaskParameters
model. 2. Write the Task
class. There are a few rules that need to be adhered to. 3. Make your Task
available by modifying the import function. 4. Specify an Executor
TaskParameters
Model for your Task
","text":"Parameter models have a format that must be followed for \"Third-Party\" Task
s, but \"First-Party\" Task
s have a little more liberty in how parameters are dealt with, since the Task
will do all the parsing itself.
To create a model, the basic steps are: 1. If necessary, create a new module (e.g. new_task_category.py
) under lute.io.models
, or find an appropriate pre-existing module in that directory. - An import
statement must be added to lute.io.models.__init__
if a new module is created, so it can be found. - If defining the model in a pre-existing module, make sure to modify the __all__
statement to include it. 2. Create a new model that inherits from TaskParameters
. You can look at lute.io.models.tests.TestReadOutputParameters
for an example. The model must be named <YourTaskName>Parameters
- You should include all relevant parameters here, including input file, output file, and any potentially adjustable parameters. These parameters must be included even if there are some implicit dependencies between Task
s and it would make sense for the parameter to be auto-populated based on some other output. Creating this dependency is done with validators (see step 3.). All parameters should be overridable, and all Task
s should be fully-independently configurable, based solely on their model and the configuration YAML. - To follow the preferred format, parameters should be defined as: param_name: type = Field([default value], description=\"This parameter does X.\")
3. Use validators to do more complex things for your parameters, including populating default values dynamically: - E.g. create default values that depend on other parameters in the model - see for example: SubmitSMDParameters. - E.g. create default values that depend on other Task
s by reading from the database - see for example: TestReadOutputParameters. 4. The model will have access to some general configuration values by inheriting from TaskParameters
. These parameters are all stored in lute_config
which is an instance of AnalysisHeader
(defined here). - For example, the experiment and run number can be obtained from this object and a validator could use these values to define the default input file for the Task
.
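Putting these steps together, a minimal first-party parameters model might look like the following sketch. The Task name (RunMyAnalysis), module, and parameter names are hypothetical; only the TaskParameters base class, the Field format, and the validator pattern come from the steps above:
"""In a hypothetical lute/io/models/my_analysis.py"""

from typing import Any, Dict

from pydantic import Field, validator

from .base import TaskParameters

class RunMyAnalysisParameters(TaskParameters):
    """Parameters for a hypothetical first-party analysis Task."""

    input_file: str = Field("", description="Path to the input data file.")
    output_file: str = Field("", description="Path to write results to.")
    threshold: float = Field(1.0, description="This parameter sets the detection threshold.")

    @validator("output_file", always=True)
    def set_default_output(cls, output_file: str, values: Dict[str, Any]) -> str:
        # Populate a default dynamically from the shared header (see point 4 above)
        if output_file == "":
            header = values["lute_config"]
            return f"{header.work_dir}/my_analysis_r{int(header.run):04d}.h5"
        return output_file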
A number of configuration options and Field attributes are also available for \"First-Party\" Task
models. These are identical to those used for the ThirdPartyTask
s, although there is a smaller selection. These options are reproduced below for convenience.
Config settings and options Under the class definition for Config
in the model, we can modify global options for all the parameters. In addition, there are a number of configuration options related to specifying what the outputs/results from the associated Task
are, and a number of options to modify runtime behaviour. Currently, the available configuration options are:
run_directory: If provided, can be used to specify the directory from which a Task is run. Default: None (not provided). ThirdPartyTask-only: NO.
set_result: bool. If True, search the model definition for a parameter that indicates what the result is. Default: False. ThirdPartyTask-only: NO.
result_from_params: If set_result is True, can define a result using this option and a validator. See also is_result below. Default: None (not provided). ThirdPartyTask-only: NO.
short_flags_use_eq: Use equals sign instead of space for arguments of - parameters. Default: False. ThirdPartyTask-only: YES - only affects ThirdPartyTasks.
long_flags_use_eq: Use equals sign instead of space for arguments of -- parameters. Default: False. ThirdPartyTask-only: YES - only affects ThirdPartyTasks.
These configuration options modify how the parameter models are parsed and passed along on the command-line, as well as what we consider results and where a Task
can run. The default behaviour is that parameters are assumed to be passed as -p arg
and --param arg
, the Task
will be run in the current working directory (or scratch if submitted with the ARP), and we have no information about Task
results. Setting the above options can modify this behaviour.
By setting short_flags_use_eq and/or long_flags_use_eq to True, parameters are instead passed as -p=arg
and --param=arg
.By setting run_directory
to a valid path, we can force a Task
to be run in a specific directory. By default the Task
will be run from the directory you submit the job in, or from your scratch folder (/sdf/scratch/...
) if you submit from the eLog. Some ThirdPartyTask
s rely on searching the correct working directory in order to run properly.By setting set_result
to True
we indicate that the TaskParameters
model will provide information on what the TaskResult
is. This setting must be used with one of two options, either the result_from_params
Config
option, described below, or the Field attribute is_result
described in the next sub-section (Field Attributes).result_from_params
is a Config option that can be used when set_result==True
. In conjunction with a validator (described a few sections down) we can use this option to specify a result from all the information contained in the model. E.g. if you have a Task
that has parameters for an output_directory
and an output_filename
, you can set result_from_params==f\"{output_directory}/{output_filename}\"
.Field attributes In addition to the global configuration options there are a couple of ways to specify individual parameters. The following Field
attributes are used when parsing the model:
description: Documentation of the parameter's usage or purpose. Default: N/A. Example: arg = Field(..., description=\"Argument for...\")
is_result: bool. If the set_result Config option is True, we can set this to True to indicate a result. Default: N/A. Example: output_result = Field(..., is_result=True)
"},{"location":"tutorial/new_task/#writing-the-task","title":"Writing the Task
","text":"You can write your analysis code (or whatever code to be executed) as long as it adheres to the limited rules below. You can create a new module for your Task
in lute.tasks
or add it to any existing module, if it makes sense for it to belong there. The Task
itself is a single class constructed as:
The Task
is a class named in a way that matches its Pydantic model. E.g. RunTask
is the Task
, and RunTaskParameters
is the Pydantic model. It must inherit from the Task
class (see template below). If you intend to use MPI, see the following section. It must provide a _run
method. This is the method that will be executed when the Task
is run. You can in addition write as many methods as you need. For fine-grained execution control you can also provide _pre_run()
and _post_run()
methods, but this is optional. Information is passed back to the Executor using the _report_to_executor(msg: Message)
method. Since the Task
is run as a subprocess this method will pass information to the controlling Executor
. You can pass any type of object using this method: strings, plots, arrays, etc. If you have used the set_result
configuration option in your parameters model, make sure to provide a result when finished. This is done by setting self._result.payload = ...
. You can set the result to be any object. If you have written the result to a file, for example, please provide a path.A minimal template is provided below.
\"\"\"Standard docstring...\"\"\"\n\n__all__ = [\"RunTask\"]\n__author__ = \"\" # Please include so we know who the SME is\n\n# Include any imports you need here\n\nfrom lute.execution.ipc import Message # Message for communication\nfrom lute.io.models.base import * # For TaskParameters\nfrom lute.tasks.task import * # For Task\n\nclass RunTask(Task): # Inherit from Task\n \"\"\"Task description goes here, or in __init__\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params) # Sets up Task, parameters, etc.\n # Parameters will be available through:\n # self._task_parameters\n # You access with . operator: self._task_parameters.param1, etc.\n # Your result object is availble through:\n # self._result\n # self._result.payload <- Main result\n # self._result.summary <- Short summary\n # self._result.task_status <- Semi-automatic, but can be set manually\n\n def _run(self) -> None:\n # THIS METHOD MUST BE PROVIDED\n self.do_my_analysis()\n\n def do_my_analysis(self) -> None:\n # Send a message, proper way to print:\n msg: Message(contents=\"My message contents\", signal=\"\")\n self._report_to_executor(msg)\n\n # When done, set result - assume we wrote a file, e.g.\n self._result.payload = \"/path/to/output_file.h5\"\n # Optionally also set status - good practice but not obligatory\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"tutorial/new_task/#using-mpi-for-your-task","title":"Using MPI for your Task
","text":"In the case your Task
is written to use MPI,
a slight modification to the template above is needed. Specifically, an additional keyword argument should be passed to the base class initializer: use_mpi=True
. This tells the base class to adjust signalling/communication behaviour appropriately for a multi-rank MPI program. Doing this prevents tricky-to-track-down problems due to ranks starting, completing and sending messages at different times. The rest of your code can, as before, be written as you see fit. The use of this keyword argument will also synchronize the start of all ranks and wait until all ranks have finished to exit.
\"\"\"Task which needs to run with MPI\"\"\"\n\n__all__ = [\"RunTask\"]\n__author__ = \"\" # Please include so we know who the SME is\n\n# Include any imports you need here\n\nfrom lute.execution.ipc import Message # Message for communication\nfrom lute.io.models.base import * # For TaskParameters\nfrom lute.tasks.task import * # For Task\n\n# Only the init is shown\nclass RunMPITask(Task): # Inherit from Task\n \"\"\"Task description goes here, or in __init__\"\"\"\n\n # Signal the use of MPI!\n def __init__(self, *, params: TaskParameters, use_mpi: bool = True) -> None:\n super().__init__(params=params, use_mpi=use_mpi) # Sets up Task, parameters, etc.\n # That's it.\n
"},{"location":"tutorial/new_task/#message-signals","title":"Message signals","text":"Signals in Message
objects are strings and can be one of the following:
LUTE_SIGNALS: Set[str] = {\n \"NO_PICKLE_MODE\",\n \"TASK_STARTED\",\n \"TASK_FAILED\",\n \"TASK_STOPPED\",\n \"TASK_DONE\",\n \"TASK_CANCELLED\",\n \"TASK_RESULT\",\n}\n
Each of these signals is associated with a hook on the Executor
-side. They are for the most part used by base classes; however, you can choose to make use of them manually as well.
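As an illustration of manual use, the sketch below reports a failure from inside a Task's _run method by attaching one of the signals above to a Message. The helper methods and the choice of signal/payload are hypothetical; check the corresponding Executor-side hook before relying on a particular signal:
from lute.execution.ipc import Message  # Same imports as in the Task template above

class RunTask(Task):
    # __init__ as in the minimal template shown earlier

    def _run(self) -> None:
        if not self.check_calibration():  # Hypothetical helper defined on this class
            # Attaching a signal triggers the corresponding Executor-side hook
            self._report_to_executor(
                Message(contents="Calibration file missing.", signal="TASK_FAILED")
            )
            return
        self.do_my_analysis()  # Hypothetical analysis method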
Task
available","text":"Once the Task
has been written, it needs to be made available for import. Since different Task
s can have conflicting dependencies and environments, this is managed through an import function. When the Task
is done, or ready for testing, a condition is added to lute.tasks.__init__.import_task
. For example, assume the Task
is called RunXASAnalysis
and it's defined in a module called xas.py
; we would add the following lines to the import_task
function:
# in lute.tasks.__init__\n\n# ...\n\ndef import_task(task_name: str) -> Type[Task]:\n # ...\n if task_name == \"RunXASAnalysis\":\n from .xas import RunXASAnalysis\n\n return RunXASAnalysis\n
"},{"location":"tutorial/new_task/#defining-an-executor","title":"Defining an Executor
","text":"The process of Executor
definition is identical to the process as described for ThirdPartyTask
s above. The one exception is that if you defined the Task to use MPI, as described in the section above (Using MPI for your Task), you will likely want to use the MPIExecutor
.
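For instance, if the RunXASAnalysis Task from the previous section were written with use_mpi=True, its entry in lute/managed_tasks.py might look like the line below; the managed Task name XASAnalyzer is a hypothetical choice that follows the noun-based naming convention:
# In lute/managed_tasks.py
XASAnalyzer: MPIExecutor = MPIExecutor("RunXASAnalysis")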
LUTE is publically available on GitHub. In order to run it, the first step is to clone the repository:
# Navigate to the directory of your choice.\ngit clone@github.com:slac-lcls/lute\n
The repository directory structure is as follows:
lute\n |--- config # Configuration YAML files (see below) and templates for third party config\n |--- docs # Documentation (including this page)\n |--- launch_scripts # Entry points for using SLURM and communicating with Airflow\n |--- lute # Code\n |--- run_task.py # Script to run an individual managed Task\n |--- ...\n |--- utilities # Help utility programs\n |--- workflows # This directory contains workflow definitions. It is synced elsewhere and not used directly.\n\n
In general, most interactions with the software will be through scripts located in the launch_scripts
directory. Some users (for certain use-cases) may also choose to run the run_task.py
script directly - it's location has been highlighted within hierarchy. To begin with you will need a YAML file, templates for which are available in the config
directory. The structure of the YAML file and how to use the various launch scripts are described in more detail below.
In the utilities
directory there are two useful programs to provide assistance with using the software:
utilities/dbview
: LUTE stores all parameters for every analysis routine it runs (as well as results) in a database. This database is stored in the work_dir
defined in the YAML file (see below). The dbview
utility is a TUI application (Text-based user interface) which runs in the terminal. It allows you to navigate a LUTE database using the arrow keys, etc. Usage is: utilities/dbview -p <path/to/lute.db>
.utilities/lute_help
: This utility provides help and usage information for running LUTE software. E.g., it provides access to parameter descriptions to assist in properly filling out a configuration YAML. It's usage is described in slightly more detail below.LUTE runs code as Task
s that are managed by an Executor
. The Executor
provides modifications to the environment the Task
runs in, as well as controls details of inter-process communication, reporting results to the eLog, etc. Combinations of specific Executor
s and Task
s are already provided, and are referred to as managed Task
s. Managed Task
s are submitted as a single unit. They can be run individually, or a series of independent steps can be submitted all at once in the form of a workflow, or directed acyclic graph (DAG). This latter option makes use of Airflow to manage the individual execution steps.
Running analysis with LUTE is the process of submitting one or more managed Task
s. This is generally a two step process.
Task
s which you may run.Task
submission, or workflow (DAG) submission.These two steps are described below.
"},{"location":"#preparing-a-configuration-yaml","title":"Preparing a Configuration YAML","text":"All Task
s are parameterized through a single configuration YAML file - even third party code which requires its own configuration files is managed through this YAML file. The basic structure is split into two documents, a brief header section which contains information that is applicable across all Task
s, such as the experiment name, run numbers and the working directory, followed by per Task
parameters:
%YAML 1.3\n---\ntitle: \"Some title.\"\nexperiment: \"MYEXP123\"\n# run: 12 # Does not need to be provided\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nTaskOne:\n param_a: 123\n param_b: 456\n param_c:\n sub_var: 3\n sub_var2: 4\n\nTaskTwo:\n new_param1: 3\n new_param2: 4\n\n# ...\n...\n
In the first document, the header, it is important that the work_dir
is properly specified. This is the root directory from which Task
outputs will be written, and the LUTE database will be stored. It may also be desirable to modify the task_timeout
parameter which defines the time limit for individual Task
jobs. By default it is set to 10 minutes, although this may not be sufficient for long running jobs. This value will be applied to all Task
s so should account for the longest running job you expect.
The actual analysis parameters are defined in the second document. As these vary from Task
to Task
, a full description will not be provided here. An actual template with real Task
parameters is available in config/test.yaml
. Your analysis POC can also help you set up and choose the correct Task
s to include as a starting point. The template YAML file has further descriptions of what each parameter does and how to fill it out. You can also refer to the lute_help
program described under the following sub-heading.
Some things to consider and possible points of confusion:
Even though you submit managed Task
s, the parameters are defined at the Task
level. I.e. the managed Task
and Task
itself have different names, and the names in the YAML refer to the latter. This is because a single Task
can be run using different Executor
configurations, but using the same parameters. The list of managed Task
s is in lute/managed_tasks.py
. A table of some routines of interest is also provided below; each entry lists the managed Task, the Task it runs, and a short description.
SmallDataProducer (runs SubmitSMD): Smalldata production
CrystFELIndexer (runs IndexCrystFEL): Crystallographic indexing
PartialatorMerger (runs MergePartialator): Crystallographic merging
HKLComparer (runs CompareHKL): Crystallographic figures of merit
HKLManipulator (runs ManipulateHKL): Crystallographic format conversions
DimpleSolver (runs DimpleSolve): Crystallographic structure solution with molecular replacement
PeakFinderPyAlgos (runs FindPeaksPyAlgos): Peak finding with PyAlgos algorithm.
PeakFinderPsocake (runs FindPeaksPsocake): Peak finding with psocake algorithm.
StreamFileConcatenator (runs ConcatenateStreamFiles):
Stream file concatenation."},{"location":"#how-do-i-know-what-parameters-are-available-and-what-they-do","title":"How do I know what parameters are available, and what they do?","text":"A summary of Task
parameters is available through the lute_help
program.
> utilities/lute_help -t [TaskName]\n
Note, some parameters may say \"Unknown description\" - this either means they are using an old-style defintion that does not include parameter help, or they may have some internal use. In particular you will see this for lute_config
on every Task
, this parameter is filled in automatically and should be ignored. E.g. as an example:
> utilities/lute_help -t IndexCrystFEL\nINFO:__main__:Fetching parameter information for IndexCrystFEL.\nIndexCrystFEL\n-------------\nParameters for CrystFEL's `indexamajig`.\n\nThere are many parameters, and many combinations. For more information on\nusage, please refer to the CrystFEL documentation, here:\nhttps://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n\n\nRequired Parameters:\n--------------------\n[...]\n\nAll Parameters:\n-------------\n[...]\n\nhighres (number)\n Mark all pixels greater than `x` has bad.\n\nprofile (boolean) - Default: False\n Display timing data to monitor performance.\n\ntemp_dir (string)\n Specify a path for the temp files folder.\n\nwait_for_file (integer) - Default: 0\n Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\n\nno_image_data (boolean) - Default: False\n Load only the metadata, no iamges. Can check indexability without high data requirements.\n\n[...]\n
"},{"location":"#running-managed-tasks-and-workflows-dags","title":"Running Managed Task
s and Workflows (DAGs)","text":"After a YAML file has been filled in you can run a Task
. There are multiple ways to submit a Task
, but there are 3 that are most likely:
Run a managed Task interactively by running python ...
Run a managed Task as a batch job (e.g. on S3DF) via a SLURM submission submit_slurm.sh ...
Submit a full workflow (DAG) made up of multiple managed Tasks.
or workflow you want to run. When submitting via SLURM or submitting an entire workflow there are additional parameters to control these processes.
Task
s interactively","text":"The simplest submission method is just to run Python interactively. In most cases this is not practical for long-running analysis, but may be of use for short Task
s or when debugging. From the root directory of the LUTE repository (or after installation) you can use the run_task.py
script:
> python -B [-O] run_task.py -t <ManagedTaskName> -c </path/to/config/yaml>\n
The command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-O
is the flag controlling whether you run in debug or non-debug mode. By default, i.e. if you do NOT provide this flag you will run in debug mode which enables verbose printing. Passing -O
will turn off debug to minimize output.-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.Task
as a batch job","text":"On S3DF you can also submit individual managed Task
s to run as batch jobs. To do so use launch_scripts/submit_slurm.sh
> launch_scripts/submit_slurm.sh -t <ManagedTaskName> -c </path/to/config/yaml> [--debug] $SLURM_ARGS\n
As before command-line arguments in square brackets []
are optional, while those in <>
must be provided
-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.--debug
is the flag to control whether or not to run in debug mode.In addition to the LUTE-specific arguments, SLURM arguments must also be provided ($SLURM_ARGS
above). You can provide as many as you want; however you will need to at least provide:
--partition=<partition/queue>
- The queue to run on, in general for LCLS this is milano
--account=lcls:<experiment>
- The account to use for batch job accounting.You will likely also want to provide at a minimum:
--ntasks=<...>
to control the number of cores allocated.
) in order to avoid potential clashes with present or future LUTE arguments.
Finally, you can submit a full workflow (e.g. SFX analysis, smalldata production and summary results, geometry optimization...). This can be done using a single script, submit_launch_airflow.sh
, similarly to the SLURM submission above:
> launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -c </path/to/yaml.yaml> -w <dag_name> [--debug] [--test] [-e <exp>] [-r <run>] $SLURM_ARGS\n
The submission process is slightly more complicated in this case. A more in-depth explanation is provided under \"Airflow Launch Steps\", in the advanced usage section below if interested. The parameters are as follows - as before command-line arguments in square brackets []
are optional, while those in <>
must be provided:
launch_scripts/launch_airflow.py
script: the first argument must be the full path to this script, located in whatever LUTE installation you are running. All other arguments can come afterwards in any order.
is the path to the configuration YAML to use.-w <dag_name>
is the name of the DAG (workflow) to run. This replaces the task name provided when using the other two methods above. A DAG list is provided below.-W
(capital W) followed by the path to the workflow instead of -w
. See below for further discussion on this use case.--debug
controls whether to use debug mode (verbose printing)--test
controls whether to use the test or production instance of Airflow to manage the DAG. The instances are running identical versions of Airflow, but the test
instance may have \"test\" or more bleeding edge development DAGs.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.The $SLURM_ARGS
must be provided in the same manner as when submitting an individual managed Task
by hand to be run as batch job with the script above. Note that these parameters will be used as the starting point for the SLURM arguments of every managed Task
in the DAG; however, individual steps in the DAG may have overrides built-in where appropriate to make sure that step is not submitted with potentially incompatible arguments. For example, a single threaded analysis Task
may be capped to running on one core, even if in general everything should be running on 100 cores, per the SLURM argument provided. These caps are added during development and cannot be disabled through configuration changes in the YAML.
DAG List
find_peaks_index
psocake_sfx_phasing
pyalgos_sfx
eLog
","text":"You can use the script in the previous section to submit jobs through the eLog. To do so navigate to the Workflow > Definitions
tab using the blue navigation bar at the top of the eLog. On this tab, in the top-right corner (underneath the help and zoom icons) you can click the +
sign to add a new workflow. This will bring up a \"Workflow definition\" UI window. When filling out the eLog workflow definition the following fields are needed (all of them):
Name
: You can name the workflow anything you like. It should probably be something descriptive, e.g. if you are using LUTE to run smalldata_tools, you may call the workflow lute_smd
.Executable
: In this field you will put the full path to the submit_launch_airflow.sh
script: /path/to/lute/launch_scripts/submit_launch_airflow.sh
.Parameters
: You will use the parameters as described above. Remember the first argument will be the full path to the launch_airflow.py
script (this is NOT the same as the bash script used in the executable!): /full/path/to/lute/launch_scripts/launch_airflow.py -c <path/to/yaml> -w <dag_name> [--debug] [--test] $SLURM_ARGS
Location
: Be sure to set to S3DF
.Trigger
: You can have the workflow trigger automatically or manually. Which option to choose will depend on the type of workflow you are running. In general the options Manually triggered
(which displays as MANUAL
on the definitions page) and End of a run
(which displays as END_OF_RUN
on the definitions page) are safe options for ALL workflows. The latter will be automatically submitted for you when data acquisition has finished. If you are running a workflow with managed Task
s that work as data is being acquired (e.g. SmallDataProducer
), you may also select Start of a run
(which displays as START_OF_RUN
on the definitions page).Upon clicking create you will see a new entry in the table on the definitions page. In order to run MANUAL
workflows, or re-run automatic workflows, you must navigate to the Workflows > Control
tab. For each acquisition run you will find a drop down menu under the Job
column. To submit a workflow you select it from this drop down menu by the Name
you provided when creating its definition.
Using validator
s, it is possible to define (generally default) model parameters for a Task
in terms of other parameters. It is also possible to use validated Pydantic model parameters to substitute values into a configuration file required to run a third party Task
(e.g. some Task
s may require their own JSON, TOML files, etc. to run properly). For more information on these types of substitutions, refer to the new_task.md
documentation on Task
creation.
These types of substitutions, however, have a limitation in that they are not easily adapted at run time. They therefore address only a small number of the possible combinations in the dependencies between different input parameters. In order to support more complex relationships between parameters, variable substitutions can also be used in the configuration YAML itself. Using a syntax similar to Jinja
templates, you can define values for YAML parameters in terms of other parameters or environment variables. The values are substituted before Pydantic attempts to validate the configuration.
It is perhaps easiest to illustrate with an example. A test case is provided in config/test_var_subs.yaml
and is reproduced here:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nOtherTask:\n useful_other_var: \"USE ME!\"\n\nNonExistentTask:\n test_sub: \"/path/to/{{ experiment }}/file_r{{ run:04d }}.input\" # Substitute `experiment` and `run` from header above\n test_env_sub: \"/path/to/{{ $EXPERIMENT }}/file.input\" # Substitute from the environment variable $EXPERIMENT\n test_nested:\n a: \"outfile_{{ run }}_one.out\" # Substitute `run` from header above\n b:\n c: \"outfile_{{ run }}_two.out\" # Also substitute `run` from header above\n d: \"{{ OtherTask.useful_other_var }}\" # Substitute `useful_other_var` from `OtherTask`\n test_fmt: \"{{ run:04d }}\" # Subsitute `run` and format as 0012\n test_env_fmt: \"{{ $RUN:04d }}\" # Substitute environment variable $RUN and pad to 4 w/ zeros\n...\n
Input parameters in the config YAML can be substituted with either other input parameters or environment variables, with or without limited string formatting. All substitutions occur between double curly brackets: {{ VARIABLE_TO_SUBSTITUTE }}
. Environment variables are indicated by $
in front of the variable name. Parameters from the header, i.e. the first YAML document (top section) containing the run
, experiment
, version fields, etc. can be substituted without any qualification. If you want to use the run
parameter, you can substitute it using {{ run }}
. All other parameters, i.e. from other Task
s or within Task
s, must use a qualified name. Nested levels are delimited using a .
. E.g. consider a structure like:
Task:\n param_set:\n a: 1\n b: 2\n c: 3\n
In order to use parameter c
, you would use {{ Task.param_set.c }}
as the substitution.
Take care when using substitutions! This process will not try to guess for you. When a substitution is not available, e.g. due to misspelling, one of two things will happen:
param: /my/failed/{{ $SUBSTITUTION }}
as your parameter. This may or may not fail the model validation step, but is likely not what you intended.Defining your own parameters
The configuration file is not validated in its totality, only on a Task
-by-Task
basis, but it is read in its totality. E.g. when running MyTask
only that portion of the configuration is validated even though the entire file has been read, and is available for substitutions. As a result, it is safe to introduce extra entries into the YAML file, as long as they are not entered under a specific Task
's configuration. This may be useful to create your own global substitutions, for example if there is a key variable that may be used across different Task
s. E.g. Consider a case where you want to create a more generic configuration file where a single variable is used by multiple Task
s. This single variable may be changed between experiments, for instance, but is likely static for the duration of a single set of analyses. In order to avoid a mistake when changing the configuration between experiments you can define this special variable (or variables) as a separate entry in the YAML, and make use of substitutions in each Task
's configuration. This way the variable only needs to be changed in one place.
# Define our substitution. This is only for substitutions!\nMY_SPECIAL_SUB: \"EXPMT_DEPENDENT_VALUE\" # Can change here once per experiment!\n\nRunTask1:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_1: 1\n var_2: \"a\"\n # ...\n\nRunTask2:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_3: \"abcd\"\n var_4: 123\n # ...\n\nRunTask3:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n #...\n\n# ... and so on\n
"},{"location":"#gotchas","title":"Gotchas!","text":"Order matters
While in general you can use parameters that appear later in a YAML document to substitute for values of parameters that appear earlier, the substitutions themselves will be performed in order of appearance. It is therefore NOT possible to correctly use a later parameter as a substitution for an earlier one, if the later one itself depends on a substitution. The YAML document, however, can be rearranged without error. The order in the YAML document has no effect on execution order which is determined purely by the workflow definition. As mentioned above, the document is not validated in its entirety so rearrangements are allowed. For example consider the following situation which produces an incorrect substitution:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will incorrectly be \"{{ work_dir }}/additional_path/{{ $RUN }}\"\n # ...\n\nRunTaskTwo:\n # Remember `work_dir` and `run` come from the header document and don't need to\n # be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n...\n
This configuration can be rearranged to achieve the desired result:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskTwo:\n # Remember `work_dir` comes from the header document and doesn't need to be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will now be /sdf/data/lcls/ds/exp/experiment/scratch/additional_path/12\n # ...\n...\n
On the other hand, relationships such as these may point to inconsistencies in the dependencies between Task
s which may warrant a refactor.
Found unhashable key
To avoid YAML parsing issues when using the substitution syntax, be sure to quote your substitutions. Before substitution is performed, a dictionary is first constructed by the pyyaml
package which parses the document - it may fail to parse the document and raise an exception if the substitutions are not quoted. E.g.
# USE THIS\nMyTask:\n var_sub: \"{{ other_var:04d }}\"\n\n# **DO NOT** USE THIS\nMyTask:\n var_sub: {{ other_var:04d }}\n
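If you want to see where this error comes from, the short standalone snippet below (not part of LUTE) reproduces it with pyyaml directly:
import yaml\n\nquoted = '''\nMyTask:\n  var_sub: \"{{ other_var:04d }}\"\n'''\nunquoted = '''\nMyTask:\n  var_sub: {{ other_var:04d }}\n'''\n\nprint(yaml.safe_load(quoted))  # {'MyTask': {'var_sub': '{{ other_var:04d }}'}}\ntry:\n    yaml.safe_load(unquoted)\nexcept yaml.YAMLError as err:\n    # The bare {{ ... }} parses as a flow mapping used as a mapping key, which is\n    # unhashable - this is where the \"found unhashable key\" message comes from.\n    print(f'Unquoted substitution failed to parse: {err}')\n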
During validation, Pydantic will by default cast variables if possible; because of this, it is generally safe to use strings for substitutions. E.g. if your parameter is expecting an integer, and after substitution you pass \"2\"
, Pydantic will cast this to the int
2
, and validation will succeed. As part of the substitution process limited type casting will also be handled if it is necessary for any formatting strings provided. E.g. \"{{ run:04d }}\"
requires that run be an integer, so it will be treated as such in order to apply the formatting.
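Conceptually, the substitution step behaves like the following standalone sketch. This is an illustration only, not LUTE's actual implementation; the function name and regular expression are invented for the example:
import re\n\ndef substitute(value: str, available: dict) -> str:\n    '''Replace {{ name }} or {{ name:fmt }} with values from available.'''\n    pattern = re.compile('{{ *([^:} ]+) *(?::([^} ]+))? *}}')\n\n    def repl(match: re.Match) -> str:\n        name, fmt = match.group(1), match.group(2)\n        sub = available[name]\n        if fmt is None:\n            return str(sub)\n        if fmt.endswith('d'):\n            # A format spec like 04d implies an integer, so cast before formatting.\n            sub = int(sub)\n        return format(sub, fmt)\n\n    return pattern.sub(repl, value)\n\nprint(substitute('file_r{{ run:04d }}.input', {'run': 12}))  # file_r0012.input\n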
In most cases, standard DAGs should be called as described above. However, Airflow also supports the dynamic creation of DAGs, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Consider a simplified serial femtosecond crystallography DAG which runs peak finding through merging and then calculates some statistics. I.e. we want an execution order that looks like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n  slurm_params: ''\n  next:\n  - task_name: PartialatorMerger\n    slurm_params: ''\n    next:\n    - task_name: HKLComparer\n      slurm_params: ''\n      next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
. This name must be identical to a managed Task
defined in the LUTE installation you are using.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the next
list are run in parallel.As a second example, to run task1
followed by task2
and task3
in parallel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined in this way, we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
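Before submitting a user-defined DAG, it can be helpful to load the file and walk the tree yourself to confirm the intended execution order. The following is a minimal, illustrative sketch (my_dag.yaml is a placeholder path; LUTE and Airflow perform their own parsing):
import yaml\n\ndef print_dag(node: dict, depth: int = 0) -> None:\n    '''Recursively print the execution tree of a user-defined DAG.'''\n    print('  ' * depth + node['task_name'])\n    for child in node.get('next') or []:\n        # Children at the same level run in parallel after the current node.\n        print_dag(child, depth + 1)\n\nwith open('my_dag.yaml', 'r') as f:\n    print_dag(yaml.safe_load(f))\n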
Special markers have been inserted at certain points in the execution flow for LUTE. These can be enabled by setting the environment variables detailed below. These are intended to allow developers to exit the program at certain points to investigate behaviour or a bug. For instance, when working on configuration parsing, an environment variable can be set which exits the program after passing this step. This allows you to run LUTE otherwise as normal (described above), without having to modify any additional code or insert your own early exits.
Types of debug markers:
LUTE_DEBUG_EXIT
: Will exit the program at this point if the corresponding environment variable has been set.Developers can insert these markers as needed into their code to add new exit points, although as a rule of thumb they should be used sparingly, and generally only after major steps in the execution flow (e.g. after parsing, after beginning a task, after returning a result, etc.).
In order to include a new marker in your code:
from lute.execution.debug_utils import LUTE_DEBUG_EXIT\n\ndef my_code() -> None:\n # ...\n LUTE_DEBUG_EXIT(\"MYENVVAR\", \"Additional message to print\")\n # If MYENVVAR is not set, the above function does nothing\n
You can enable a marker by setting to 1, e.g. to enable the example marker above while running Tester
:
MYENVVAR=1 python -B run_task.py -t Tester -c config/test.yaml\n
"},{"location":"#currently-used-environment-variables","title":"Currently used environment variables","text":"LUTE_DEBUG_EXIT_AT_YAML
: Exits the program after reading in a YAML configuration file and performing variable substitutions, but BEFORE Pydantic validation.LUTE_DEBUG_BEFORE_TPP_EXEC
: Exits the program after a ThirdPartyTask has prepared its submission command, but before exec
is used to run it.The Airflow launch process actually involves a number of steps, and is rather complicated. There are two wrapper steps prior to getting to the actual Airflow API communication.
launch_scripts/submit_launch_airflow.sh
is run./sdf/group/lcls/ds/tools/lute_launcher
with all the same parameters that it was called with.lute_launcher
runs the launch_scripts/launch_airflow.py
script which was provided as the first argument. This is the true launch scriptlaunch_airflow.py
communicates with the Airflow API, requesting that a specific DAG be launched. It then continues to run, and gathers the individual logs and the exit status of each step of the DAG.launch_scripts/submit_slurm.sh
.There are some specific reasons for this complexity:
submit_launch_airflow.sh
as a thin-wrapper around lute_launcher
is to allow the true Airflow launch script to be a long-lived job. This is for compatibility with the eLog and the ARP. When run from the eLog as a workflow, the job submission process must occur within 30 seconds due to a timeout built-in to the system. This is fine when submitting jobs to run on the batch-nodes, as the submission to the queue takes very little time. So here, submit_launch_airflow.sh
serves as a thin script to have lute_launcher
run as a batch job. It can then run as a long-lived job (for the duration of the entire DAG) collecting log files all in one place. This allows the log for each stage of the Airflow DAG to be inspected in a single file, and through the eLog browser interface.lute_launcher
as a wrapper around launch_airflow.py
is to manage authentication and credentials. The launch_airflow.py
script requires loading credentials in order to authenticate against the Airflow API. For the average user this is not possible, unless the script is run from within the lute_launcher
process. LUTE is publicly available on GitHub. In order to run it, the first step is to clone the repository:
# Navigate to the directory of your choice.\ngit clone git@github.com:slac-lcls/lute\n
The repository directory structure is as follows:
lute\n |--- config # Configuration YAML files (see below) and templates for third party config\n |--- docs # Documentation (including this page)\n |--- launch_scripts # Entry points for using SLURM and communicating with Airflow\n |--- lute # Code\n |--- run_task.py # Script to run an individual managed Task\n |--- ...\n |--- utilities # Help utility programs\n |--- workflows # This directory contains workflow definitions. It is synced elsewhere and not used directly.\n\n
In general, most interactions with the software will be through scripts located in the launch_scripts
directory. Some users (for certain use-cases) may also choose to run the run_task.py
script directly - its location has been highlighted within the hierarchy above. To begin with you will need a YAML file, templates for which are available in the config
directory. The structure of the YAML file and how to use the various launch scripts are described in more detail below.
In the utilities
directory there are two useful programs to provide assistance with using the software:
utilities/dbview
: LUTE stores all parameters for every analysis routine it runs (as well as results) in a database. This database is stored in the work_dir
defined in the YAML file (see below). The dbview
utility is a TUI application (Text-based user interface) which runs in the terminal. It allows you to navigate a LUTE database using the arrow keys, etc. Usage is: utilities/dbview -p <path/to/lute.db>
.utilities/lute_help
: This utility provides help and usage information for running LUTE software. E.g., it provides access to parameter descriptions to assist in properly filling out a configuration YAML. Its usage is described in slightly more detail below.
s that are managed by an Executor
. The Executor
provides modifications to the environment the Task
runs in, as well as controls details of inter-process communication, reporting results to the eLog, etc. Combinations of specific Executor
s and Task
s are already provided, and are referred to as managed Task
s. Managed Task
s are submitted as a single unit. They can be run individually, or a series of independent steps can be submitted all at once in the form of a workflow, or directed acyclic graph (DAG). This latter option makes use of Airflow to manage the individual execution steps.
Running analysis with LUTE is the process of submitting one or more managed Task
s. This is generally a two step process.
Task
s which you may run.Task
submission, or workflow (DAG) submission.These two steps are described below.
"},{"location":"usage/#preparing-a-configuration-yaml","title":"Preparing a Configuration YAML","text":"All Task
s are parameterized through a single configuration YAML file - even third party code which requires its own configuration files is managed through this YAML file. The basic structure is split into two documents, a brief header section which contains information that is applicable across all Task
s, such as the experiment name, run numbers and the working directory, followed by per Task
parameters:
%YAML 1.3\n---\ntitle: \"Some title.\"\nexperiment: \"MYEXP123\"\n# run: 12 # Does not need to be provided\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nTaskOne:\n param_a: 123\n param_b: 456\n param_c:\n sub_var: 3\n sub_var2: 4\n\nTaskTwo:\n new_param1: 3\n new_param2: 4\n\n# ...\n...\n
In the first document, the header, it is important that the work_dir
is properly specified. This is the root directory from which Task
outputs will be written, and the LUTE database will be stored. It may also be desirable to modify the task_timeout
parameter which defines the time limit for individual Task
jobs. By default it is set to 10 minutes, although this may not be sufficient for long running jobs. This value will be applied to all Task
s so should account for the longest running job you expect.
The actual analysis parameters are defined in the second document. As these vary from Task
to Task
, a full description will not be provided here. An actual template with real Task
parameters is available in config/test.yaml
. Your analysis POC can also help you set up and choose the correct Task
s to include as a starting point. The template YAML file has further descriptions of what each parameter does and how to fill it out. You can also refer to the lute_help
program described under the following sub-heading.
Some things to consider and possible points of confusion:
Task
s, the parameters are defined at the Task
level. I.e. the managed Task
and Task
itself have different names, and the names in the YAML refer to the latter. This is because a single Task
can be run using different Executor
configurations, but using the same parameters. The list of managed Task
s is in lute/managed_tasks.py
. A table is also provided below for some routines of interest..Task
The Task
it Runs Task
Description SmallDataProducer
SubmitSMD
Smalldata production CrystFELIndexer
IndexCrystFEL
Crystallographic indexing PartialatorMerger
MergePartialator
Crystallographic merging HKLComparer
CompareHKL
Crystallographic figures of merit HKLManipulator
ManipulateHKL
Crystallographic format conversions DimpleSolver
DimpleSolve
Crystallographic structure solution with molecular replacement PeakFinderPyAlgos
FindPeaksPyAlgos
Peak finding with PyAlgos algorithm. PeakFinderPsocake
FindPeaksPsocake
Peak finding with psocake algorithm. StreamFileConcatenator
ConcatenateStreamFiles
Stream file concatenation."},{"location":"usage/#how-do-i-know-what-parameters-are-available-and-what-they-do","title":"How do I know what parameters are available, and what they do?","text":"A summary of Task
parameters is available through the lute_help
program.
> utilities/lute_help -t [TaskName]\n
Note that some parameters may say \"Unknown description\" - this either means they are using an old-style definition that does not include parameter help, or they may have some internal use. In particular you will see this for lute_config
on every Task
: this parameter is filled in automatically and should be ignored. For example:
> utilities/lute_help -t IndexCrystFEL\nINFO:__main__:Fetching parameter information for IndexCrystFEL.\nIndexCrystFEL\n-------------\nParameters for CrystFEL's `indexamajig`.\n\nThere are many parameters, and many combinations. For more information on\nusage, please refer to the CrystFEL documentation, here:\nhttps://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n\n\nRequired Parameters:\n--------------------\n[...]\n\nAll Parameters:\n-------------\n[...]\n\nhighres (number)\n Mark all pixels greater than `x` has bad.\n\nprofile (boolean) - Default: False\n Display timing data to monitor performance.\n\ntemp_dir (string)\n Specify a path for the temp files folder.\n\nwait_for_file (integer) - Default: 0\n Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\n\nno_image_data (boolean) - Default: False\n Load only the metadata, no iamges. Can check indexability without high data requirements.\n\n[...]\n
"},{"location":"usage/#running-managed-tasks-and-workflows-dags","title":"Running Managed Task
s and Workflows (DAGs)","text":"After a YAML file has been filled in you can run a Task
. There are multiple ways to submit a Task
, but there are 3 that are most likely:
Task
interactively by running python ...
Task
as a batch job (e.g. on S3DF) via a SLURM submission submit_slurm.sh ...
Task
s).These will be covered in turn below; however, in general all methods will require two parameters: the path to a configuration YAML file, and the name of the managed Task
or workflow you want to run. When submitting via SLURM or submitting an entire workflow there are additional parameters to control these processes.
Task
s interactively","text":"The simplest submission method is just to run Python interactively. In most cases this is not practical for long-running analysis, but may be of use for short Task
s or when debugging. From the root directory of the LUTE repository (or after installation) you can use the run_task.py
script:
> python -B [-O] run_task.py -t <ManagedTaskName> -c </path/to/config/yaml>\n
The command-line arguments in square brackets []
are optional, while those in <>
must be provided:
-O
is the flag controlling whether you run in debug or non-debug mode. By default, i.e. if you do NOT provide this flag you will run in debug mode which enables verbose printing. Passing -O
will turn off debug to minimize output.-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.Task
as a batch job","text":"On S3DF you can also submit individual managed Task
s to run as batch jobs. To do so use launch_scripts/submit_slurm.sh
> launch_scripts/submit_slurm.sh -t <ManagedTaskName> -c </path/to/config/yaml> [--debug] $SLURM_ARGS\n
As before command-line arguments in square brackets []
are optional, while those in <>
must be provided
-t <ManagedTaskName>
is the name of the managed Task
you want to run.-c </path/...>
is the path to the configuration YAML.--debug
is the flag to control whether or not to run in debug mode.In addition to the LUTE-specific arguments, SLURM arguments must also be provided ($SLURM_ARGS
above). You can provide as many as you want; however you will need to at least provide:
--partition=<partition/queue>
- The queue to run on, in general for LCLS this is milano
--account=lcls:<experiment>
- The account to use for batch job accounting.You will likely also want to provide at a minimum:
--ntasks=<...>
to control the number of cores in allocated.In general, it is best to prefer the long-form of the SLURM-argument (--arg=<...>
) in order to avoid potential clashes with present or future LUTE arguments.
Finally, you can submit a full workflow (e.g. SFX analysis, smalldata production and summary results, geometry optimization...). This can be done using a single script, submit_launch_airflow.sh
, similarly to the SLURM submission above:
> launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -c </path/to/yaml.yaml> -w <dag_name> [--debug] [--test] [-e <exp>] [-r <run>] $SLURM_ARGS\n
The submission process is slightly more complicated in this case. A more in-depth explanation is provided under \"Airflow Launch Steps\", in the advanced usage section below if interested. The parameters are as follows - as before command-line arguments in square brackets []
are optional, while those in <>
must be provided:
launch_scripts/launch_airflow.py
script located in whatever LUTE installation you are running. All other arguments can come afterwards in any order.-c </path/...>
is the path to the configuration YAML to use.-w <dag_name>
is the name of the DAG (workflow) to run. This replaces the task name provided when using the other two methods above. A DAG list is provided below.-W
(capital W) followed by the path to the workflow instead of -w
. See below for further discussion on this use case.--debug
controls whether to use debug mode (verbose printing)--test
controls whether to use the test or production instance of Airflow to manage the DAG. The instances are running identical versions of Airflow, but the test
instance may have \"test\" or more bleeding edge development DAGs.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.The $SLURM_ARGS
must be provided in the same manner as when submitting an individual managed Task
by hand to be run as batch job with the script above. Note that these parameters will be used as the starting point for the SLURM arguments of every managed Task
in the DAG; however, individual steps in the DAG may have overrides built-in where appropriate to make sure that step is not submitted with potentially incompatible arguments. For example, a single threaded analysis Task
may be capped to running on one core, even if in general everything should be running on 100 cores, per the SLURM argument provided. These caps are added during development and cannot be disabled through configuration changes in the YAML.
DAG List
find_peaks_index
psocake_sfx_phasing
pyalgos_sfx
eLog
","text":"You can use the script in the previous section to submit jobs through the eLog. To do so navigate to the Workflow > Definitions
tab using the blue navigation bar at the top of the eLog. On this tab, in the top-right corner (underneath the help and zoom icons) you can click the +
sign to add a new workflow. This will bring up a \"Workflow definition\" UI window. When filling out the eLog workflow definition the following fields are needed (all of them):
Name
: You can name the workflow anything you like. It should probably be something descriptive, e.g. if you are using LUTE to run smalldata_tools, you may call the workflow lute_smd
.Executable
: In this field you will put the full path to the submit_launch_airflow.sh
script: /path/to/lute/launch_scripts/submit_launch_airflow.sh
.Parameters
: You will use the parameters as described above. Remember the first argument will be the full path to the launch_airflow.py
script (this is NOT the same as the bash script used in the executable!): /full/path/to/lute/launch_scripts/launch_airflow.py -c <path/to/yaml> -w <dag_name> [--debug] [--test] $SLURM_ARGS
Location
: Be sure to set to S3DF
.Trigger
: You can have the workflow trigger automatically or manually. Which option to choose will depend on the type of workflow you are running. In general the options Manually triggered
(which displays as MANUAL
on the definitions page) and End of a run
(which displays as END_OF_RUN
on the definitions page) are safe options for ALL workflows. The latter will be automatically submitted for you when data acquisition has finished. If you are running a workflow with managed Task
s that work as data is being acquired (e.g. SmallDataProducer
), you may also select Start of a run
(which displays as START_OF_RUN
on the definitions page).Upon clicking create you will see a new entry in the table on the definitions page. In order to run MANUAL
workflows, or re-run automatic workflows, you must navigate to the Workflows > Control
tab. For each acquisition run you will find a drop down menu under the Job
column. To submit a workflow you select it from this drop down menu by the Name
you provided when creating its definition.
Using validator
s, it is possible to define (generally, default) model parameters for a Task
in terms of other parameters. It is also possible to use validated Pydantic model parameters to substitute values into a configuration file required to run a third party Task
(e.g. some Task
s may require their own JSON, TOML files, etc. to run properly). For more information on these types of substitutions, refer to the new_task.md
documentation on Task
creation.
These types of substitutions, however, have a limitation in that they are not easily adapted at run time. They therefore address only a small number of the possible combinations in the dependencies between different input parameters. In order to support more complex relationships between parameters, variable substitutions can also be used in the configuration YAML itself. Using a syntax similar to Jinja
templates, you can define values for YAML parameters in terms of other parameters or environment variables. The values are substituted before Pydantic attempts to validate the configuration.
It is perhaps easiest to illustrate with an example. A test case is provided in config/test_var_subs.yaml
and is reproduced here:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/scratch/users/d/dorlhiac\"\n...\n---\nOtherTask:\n useful_other_var: \"USE ME!\"\n\nNonExistentTask:\n test_sub: \"/path/to/{{ experiment }}/file_r{{ run:04d }}.input\" # Substitute `experiment` and `run` from header above\n test_env_sub: \"/path/to/{{ $EXPERIMENT }}/file.input\" # Substitute from the environment variable $EXPERIMENT\n test_nested:\n a: \"outfile_{{ run }}_one.out\" # Substitute `run` from header above\n b:\n c: \"outfile_{{ run }}_two.out\" # Also substitute `run` from header above\n d: \"{{ OtherTask.useful_other_var }}\" # Substitute `useful_other_var` from `OtherTask`\n test_fmt: \"{{ run:04d }}\" # Subsitute `run` and format as 0012\n test_env_fmt: \"{{ $RUN:04d }}\" # Substitute environment variable $RUN and pad to 4 w/ zeros\n...\n
Input parameters in the config YAML can be substituted with either other input parameters or environment variables, with or without limited string formatting. All substitutions occur between double curly brackets: {{ VARIABLE_TO_SUBSTITUTE }}
. Environment variables are indicated by $
in front of the variable name. Parameters from the header, i.e. the first YAML document (top section) containing the run
, experiment
, version fields, etc. can be substituted without any qualification. If you want to use the run
parameter, you can substitute it using {{ run }}
. All other parameters, i.e. from other Task
s or within Task
s, must use a qualified name. Nested levels are delimited using a .
. E.g. consider a structure like:
Task:\n param_set:\n a: 1\n b: 2\n c: 3\n
In order to use parameter c
, you would use {{ Task.param_set.c }}
as the substitution.
Take care when using substitutions! This process will not try to guess for you. When a substitution is not available, e.g. due to misspelling, one of two things will happen:
param: /my/failed/{{ $SUBSTITUTION }}
as your parameter. This may or may not fail the model validation step, but is likely not what you intended.Defining your own parameters
The configuration file is not validated in its totality, only on a Task
-by-Task
basis, but it is read in its totality. E.g. when running MyTask
only that portion of the configuration is validated even though the entire file has been read, and is available for substitutions. As a result, it is safe to introduce extra entries into the YAML file, as long as they are not entered under a specific Task
's configuration. This may be useful to create your own global substitutions, for example if there is a key variable that may be used across different Task
s. E.g. Consider a case where you want to create a more generic configuration file where a single variable is used by multiple Task
s. This single variable may be changed between experiments, for instance, but is likely static for the duration of a single set of analyses. In order to avoid a mistake when changing the configuration between experiments you can define this special variable (or variables) as a separate entry in the YAML, and make use of substitutions in each Task
's configuration. This way the variable only needs to be changed in one place.
# Define our substitution. This is only for substitutions!\nMY_SPECIAL_SUB: \"EXPMT_DEPENDENT_VALUE\" # Can change here once per experiment!\n\nRunTask1:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_1: 1\n var_2: \"a\"\n # ...\n\nRunTask2:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n var_3: \"abcd\"\n var_4: 123\n # ...\n\nRunTask3:\n special_var: \"{{ MY_SPECIAL_SUB }}\"\n #...\n\n# ... and so on\n
"},{"location":"usage/#gotchas","title":"Gotchas!","text":"Order matters
While in general you can use parameters that appear later in a YAML document to substitute for values of parameters that appear earlier, the substitutions themselves will be performed in order of appearance. It is therefore NOT possible to correctly use a later parameter as a substitution for an earlier one, if the later one itself depends on a substitution. The YAML document, however, can be rearranged without error. The order in the YAML document has no effect on execution order which is determined purely by the workflow definition. As mentioned above, the document is not validated in its entirety so rearrangements are allowed. For example consider the following situation which produces an incorrect substitution:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will incorrectly be \"{{ work_dir }}/additional_path/{{ $RUN }}\"\n # ...\n\nRunTaskTwo:\n # Remember `work_dir` and `run` come from the header document and don't need to\n # be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n...\n
This configuration can be rearranged to achieve the desired result:
%YAML 1.3\n---\ntitle: \"Configuration to Test YAML Substitution\"\nexperiment: \"TestYAMLSubs\"\nrun: 12\ndate: \"2024/05/01\"\nlute_version: 0.1\ntask_timeout: 600\nwork_dir: \"/sdf/data/lcls/ds/exp/experiment/scratch\"\n...\n---\nRunTaskTwo:\n # Remember `work_dir` comes from the header document and doesn't need to be qualified\n path: \"{{ work_dir }}/additional_path/{{ run }}\"\n\nRunTaskOne:\n input_dir: \"{{ RunTaskTwo.path }}\" # Will now be /sdf/data/lcls/ds/exp/experiment/scratch/additional_path/12\n # ...\n...\n
On the other hand, relationships such as these may point to inconsistencies in the dependencies between Task
s which may warrant a refactor.
Found unhashable key
To avoid YAML parsing issues when using the substitution syntax, be sure to quote your substitutions. Before substitution is performed, a dictionary is first constructed by the pyyaml
package which parses the document - it may fail to parse the document and raise an exception if the substitutions are not quoted. E.g.
# USE THIS\nMyTask:\n var_sub: \"{{ other_var:04d }}\"\n\n# **DO NOT** USE THIS\nMyTask:\n var_sub: {{ other_var:04d }}\n
During validation, Pydantic will by default cast variables if possible; because of this, it is generally safe to use strings for substitutions. E.g. if your parameter is expecting an integer, and after substitution you pass \"2\"
, Pydantic will cast this to the int
2
, and validation will succeed. As part of the substitution process limited type casting will also be handled if it is necessary for any formatting strings provided. E.g. \"{{ run:04d }}\"
requires that run be an integer, so it will be treated as such in order to apply the formatting.
In most cases, standard DAGs should be called as described above. However, Airflow also supports the dynamic creation of DAGs, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Consider a simplified serial femtosecond crystallography DAG which runs peak finding through merging and then calculates some statistics. I.e. we want an execution order that looks like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n  slurm_params: ''\n  next:\n  - task_name: PartialatorMerger\n    slurm_params: ''\n    next:\n    - task_name: HKLComparer\n      slurm_params: ''\n      next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
. This name must be identical to a managed Task
defined in the LUTE installation you are using.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the next
list are run in parallel.As a second example, to run task1
followed by task2
and task3
in parallel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined in this way, we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
Special markers have been inserted at certain points in the execution flow for LUTE. These can be enabled by setting the environment variables detailed below. These are intended to allow developers to exit the program at certain points to investigate behaviour or a bug. For instance, when working on configuration parsing, an environment variable can be set which exits the program after passing this step. This allows you to run LUTE otherwise as normal (described above), without having to modify any additional code or insert your own early exits.
Types of debug markers:
LUTE_DEBUG_EXIT
: Will exit the program at this point if the corresponding environment variable has been set.Developers can insert these markers as needed into their code to add new exit points, although as a rule of thumb they should be used sparingly, and generally only after major steps in the execution flow (e.g. after parsing, after beginning a task, after returning a result, etc.).
In order to include a new marker in your code:
from lute.execution.debug_utils import LUTE_DEBUG_EXIT\n\ndef my_code() -> None:\n # ...\n LUTE_DEBUG_EXIT(\"MYENVVAR\", \"Additional message to print\")\n # If MYENVVAR is not set, the above function does nothing\n
You can enable a marker by setting to 1, e.g. to enable the example marker above while running Tester
:
MYENVVAR=1 python -B run_task.py -t Tester -c config/test.yaml\n
"},{"location":"usage/#currently-used-environment-variables","title":"Currently used environment variables","text":"LUTE_DEBUG_EXIT_AT_YAML
: Exits the program after reading in a YAML configuration file and performing variable substitutions, but BEFORE Pydantic validation.LUTE_DEBUG_BEFORE_TPP_EXEC
: Exits the program after a ThirdPartyTask has prepared its submission command, but before exec
is used to run it.The Airflow launch process actually involves a number of steps, and is rather complicated. There are two wrapper steps prior to getting to the actual Airflow API communication.
launch_scripts/submit_launch_airflow.sh
is run./sdf/group/lcls/ds/tools/lute_launcher
with all the same parameters that it was called with.lute_launcher
runs the launch_scripts/launch_airflow.py
script which was provided as the first argument. This is the true launch scriptlaunch_airflow.py
communicates with the Airflow API, requesting that a specific DAG be launched. It then continues to run, and gathers the individual logs and the exit status of each step of the DAG.launch_scripts/submit_slurm.sh
.There are some specific reasons for this complexity:
submit_launch_airflow.sh
as a thin-wrapper around lute_launcher
is to allow the true Airflow launch script to be a long-lived job. This is for compatibility with the eLog and the ARP. When run from the eLog as a workflow, the job submission process must occur within 30 seconds due to a timeout built-in to the system. This is fine when submitting jobs to run on the batch-nodes, as the submission to the queue takes very little time. So here, submit_launch_airflow.sh
serves as a thin script to have lute_launcher
run as a batch job. It can then run as a long-lived job (for the duration of the entire DAG) collecting log files all in one place. This allows the log for each stage of the Airflow DAG to be inspected in a single file, and through the eLog browser interface.lute_launcher
as a wrapper around launch_airflow.py
is to manage authentication and credentials. The launch_airflow.py
script requires loading credentials in order to authenticate against the Airflow API. For the average user this is not possible, unless the script is run from within the lute_launcher
process.madr_template.md
for creating new ADRs. This template was adapted from the MADR template (MIT License).Task
s inherit from a base class Accepted 2 2023-11-06 Analysis Task
submission and communication is performed via Executor
s Accepted 3 2023-11-06 Executor
s will run all Task
s via subprocess Proposed 4 2023-11-06 Airflow Operator
s and LUTE Executor
s are separate entities. Proposed 5 2023-12-06 Task-Executor IPC is Managed by Communicator Objects Proposed 6 2024-02-12 Third-party Config Files Managed by Templates Rendered by ThirdPartyTask
s Proposed 7 2024-02-12 Task
Configuration is Stored in a Database Managed by Executor
s Proposed 8 2024-03-18 Airflow credentials/authorization requires special launch program. Proposed 9 2024-04-15 Airflow launch script will run as long lived batch job. Proposed"},{"location":"adrs/MADR_LICENSE/","title":"MADR LICENSE","text":"Copyright 2022 ADR Github Organization
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \u201cSoftware\u201d), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED \u201cAS IS\u201d, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"},{"location":"adrs/adr-1/","title":"[ADR-1] All Analysis Tasks Inherit from a Base Class","text":"Date: 2023-11-06
"},{"location":"adrs/adr-1/#status","title":"Status","text":"Accepted
"},{"location":"adrs/adr-1/#context-and-problem-statement","title":"Context and Problem Statement","text":"btx
tasks had heterogenous interfaces.Task
s simultaneously.Date: 2023-11-06
"},{"location":"adrs/adr-2/#status","title":"Status","text":"Accepted
"},{"location":"adrs/adr-2/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
code itself provides a separation of concerns allowing Task
s to run indepently of execution environment.Executor
can prepare environment, submission requirements, etc.Executor
classes avoids maintaining that code independently for each task (cf. alternatives considered).Executor
level and immediately applied to all Task
s.Task
code.btx
tasks. E.g. task timeout leading to failure of a processing pipeline even if substantial work had been done and subsequent tasks could proceed.Task
submission already exist in the original btx
but the methods were not fully standardized.JobScheduler
submission vs direct submission of the task.Task
class interface as pre/post analysis operations.Task
subclasses for different execution environments.Task
class.Task
code independent of execution environment.Task
failure.Executor
s as the \"Managed Task\"Task
s will not be submitted independently.Executor
s will run all Task
s via subprocess","text":"Date: 2023-11-06
"},{"location":"adrs/adr-3/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-3/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
s from within the Executor
(cf. ADR-2)Task
s, at all locations, but at the very least all Task
s at a single location (e.g. S3DF, NERSC)Task
submission, but have to submit both first-party and third-party code.JobScheduler
for btx
multiprocessing
at the Python level.Operator
s and LUTE Executor
s are Separate Entities","text":"Date: 2023-11-06
"},{"location":"adrs/adr-4/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-4/#context-and-problem-statement","title":"Context and Problem Statement","text":"Executor
which in turn submits the Task
*
"},{"location":"adrs/adr-4/#considered-options","title":"Considered Options","text":"*
"},{"location":"adrs/adr-4/#consequences","title":"Consequences","text":"*
"},{"location":"adrs/adr-4/#compliance","title":"Compliance","text":""},{"location":"adrs/adr-4/#metadata","title":"Metadata","text":""},{"location":"adrs/adr-5/","title":"[ADR-5] Task-Executor IPC is Managed by Communicator Objects","text":"Date: 2023-12-06
"},{"location":"adrs/adr-5/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-5/#context-and-problem-statement","title":"Context and Problem Statement","text":"Communicator
objects which maintain simple read
and write
mechanisms for Message
objects. These latter can contain arbitrary Python objects. Task
s do not interact directly with the communicator, but rather through specific instance methods which hide the communicator interfaces. Multiple Communicators can be used in parallel. The same Communicator
objects are used identically at the Task
and Executor
layers - any changes to communication protocols are not transferred to the calling objects.
Task
output needs to be routed to other layers of the software, but the Task
s themselves should have no knowledge of where the output ends up.subprocess
Task
and Executor
layers.Communicator
: Abstract base class - defines interfacePipeCommunicator
: Manages communication through pipes (stderr
and stdout
)SocketCommunicator
: Manages communication through Unix socketsTask
and Executor
side, IPC is greatly simplifiedCommunicator
Communicator
objects are non-public. Their interfaces (already limited) are handled by simple methods in the base classes of Task
s and Executor
s.Communicator
should have no need to be directly manipulated by callers (even less so by users)ThirdPartyTask
s","text":"Date: 2024-02-12
"},{"location":"adrs/adr-6/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-6/#context-and-problem-statement","title":"Context and Problem Statement","text":"Templates will be used for the third party configuration files. A generic interface to heterogenous templates will be provided through a combination of pydantic models and the ThirdPartyTask
implementation. The pydantic models will label extra arguments to ThirdPartyTask
s as being TemplateParameters
. I.e. any extra parameters are considered to be for a templated configuration file. The ThirdPartyTask
will find the necessary template and render it if any extra parameters are found. This puts the burden of correct parsing on the template definition itself.
Task
interface as possible - but due to the above, need a way of handling multiple output files.Task
to be run before the main ThirdPartyTask
.Task
.ThirdPartyTask
s to be run as instances of a single class.Task
Configuration is Stored in a Database Managed by Executor
s","text":"Date: 2024-02-12
"},{"location":"adrs/adr-7/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-7/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
parameter configurations.Task
's code is designed to be independent of other Task
's aside from code shared by inheritance.Task
s are intended to be defined only at the level of workflows.Task
s may have implicit dependencies on others. E.g. one Task
may use the output files of another, and so could benefit from having knowledge of where they were written.Upon Task
completion the managing Executor
will write the AnalysisConfig
object, including TaskParameters
, results and generic configuration information to a database. Some entries from this database can be retrieved to provide default files for TaskParameter
fields; however, the Task
itself has no knowledge, and does not access to the database.
Task
s while allowing information to be shared between them.Task
-independent IO be managed solely at the Executor
level.Task
s write the database.Task
s pass information through other mechanisms, such as Airflow.sqlite
which should make everything transferrable.Task
s without any explicit code dependencies/linkages between them.Date: 2024-03-18
"},{"location":"adrs/adr-8/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-8/#context-and-problem-statement","title":"Context and Problem Statement","text":"A closed-source lute_launcher
program will be used to run the Airflow launch scripts. This program accesses credentials with the correct permissions. Users should otherwise not have access to the credentials. This will help ensure the credentials can be used by everyone but only to run workflows and not perform restricted admin activities.
Date: 2024-04-15
"},{"location":"adrs/adr-9/#status","title":"Status","text":"Proposed
"},{"location":"adrs/adr-9/#context-and-problem-statement","title":"Context and Problem Statement","text":"Task
will produce its own log file.The Airflow launch script will be a long lived process, running for the duration of the entire DAG. It will provide basic status logging information, e.g. what Task
s are running, if they succeed or failed. Additionally, at the end of each Task
job, the launch job will collect the log file from that job and append it to its own log.
As the Airflow launch script is an entry point used from the eLog, only its log file is available to users using that UI. By converting the launch script into a long-lived monitoring job it allows the log information to be easily accessible.
In order to accomplish this, the launch script must be submitted as a batch job, in order to comply with the 30 second timeout imposed by jobs run by the ARP. This necessitates providing an additional wrapper script.
"},{"location":"adrs/adr-9/#decision-drivers","title":"Decision Drivers","text":"--open-mode=append
for SLURM)submit_launch_airflow.sh
which submits the launch_airflow.py
script (run by lute_launcher
) as a batch job.launch_airflow.py
) and 1 for the Executor
process. {ADR #X : Short description/title of feature/decision}
Date:
"},{"location":"adrs/madr_template/#status","title":"Status","text":"{Accepted | Proposed | Rejected | Deprecated | Superseded} {If this proposal supersedes another, please indicate so, e.g. \"Status: Accepted, supersedes [ADR-3]\"} {Likewise, if this proposal was superseded, e.g. \"Status: Superseded by [ADR-2]\"}
"},{"location":"adrs/madr_template/#context-and-problem-statement","title":"Context and Problem Statement","text":"{Describe the problem context and why this decision has been made/feature implemented.}
"},{"location":"adrs/madr_template/#decision","title":"Decision","text":"{Describe how the solution was arrived at in the manner it was. You may use the sections below to help.}
"},{"location":"adrs/madr_template/#decision-drivers","title":"Decision Drivers","text":"{Short description of anticipated consequences} * {Anticipated consequence 1} * {Anticipated consequence 2}
"},{"location":"adrs/madr_template/#compliance","title":"Compliance","text":"{How will the decision/implementation be enforced. How will compliance be validated?}
"},{"location":"adrs/madr_template/#metadata","title":"Metadata","text":"{Any additional information to include}
"},{"location":"design/database/","title":"LUTE Configuration Database Specification","text":"Date: 2024-02-12 VERSION: v0.1
"},{"location":"design/database/#basic-outline","title":"Basic Outline","text":"Executor
level code.Executor
configurationlute.io.config.AnalysisHeader
)Task
Task
tables by pointing/linking to the entry ids in the above two tables.gen_cfg
table","text":"The general configuration table contains entries which may be shared between multiple Task
s. The format of the table is:
These parameters are extracted from the TaskParameters
object. Each of those contains an AnalysisHeader
object stored in the lute_config
variable. For a given experimental run, this value will be shared across any Task
s that are executed.
id
ID of the entry in this table. title
Arbitrary description/title of the purpose of analysis. E.g. what kind of experiment is being conducted experiment
LCLS Experiment. Can be a placeholder if debugging, etc. run
LCLS Acquisition run. Can be a placeholder if debugging, testing, etc. date
Date the configuration file was first setup. lute_version
Version of the codebase being used to execute Task
s. task_timeout
The maximum amount of time in seconds that a Task
can run before being cancelled."},{"location":"design/database/#exec_cfg-table","title":"exec_cfg
table","text":"The Executor
table contains information on the environment provided to the Executor
for Task
execution, the polling interval used for IPC between the Task
and Executor
and information on the communicator protocols used for IPC. This information can be shared between Task
s or between experimental runs, but not necessarily every Task
of a given run will use exactly the same Executor
configuration and environment.
id
ID of the entry in this table. env
Execution environment used by the Executor and by proxy any Tasks submitted by an Executor matching this entry. Environment is stored as a string with variables delimited by \";\" poll_interval
Polling interval used for Task monitoring. communicator_desc
Description of the Communicators used. NOTE: The env
column currently only stores variables related to SLURM
or LUTE
itself.
Task
tables","text":"For every Task
a table of the following format will be created. The exact number of columns will depend on the specific Task
, as the number of parameters can vary between them, and each parameter gets its own column. Within a table, multiple experiments and runs can coexist. The experiment and run are not recorded directly. Instead, the first two columns point to the id of entries in the general configuration and Executor
tables respectively. The general configuration table entry will contain the experiment and run information.
Parameter sets which can be described as nested dictionaries are flattened and then delimited with a .
to create column names. Parameters which are lists (or Python tuples, etc.) have a column for each entry with names that include an index (counting from 0). E.g. consider the following dictionary of parameters:
param_dict: Dict[str, Any] = {\n \"a\": { # First parameter a\n \"b\": (1, 2),\n \"c\": 1,\n # ...\n },\n \"a2\": 4, # Second parameter a2\n # ...\n}\n
The dictionary a
will produce columns: a.b[0]
, a.b[1]
, a.c
, and so on.
id
ID of the entry in this table. CURRENT_TIMESTAMP
Full timestamp for the entry. gen_cfg_id
ID of the entry in the general config table that applies to this Task
entry. That table has, e.g., experiment and run number. exec_cfg_id
The ID of the entry in the Executor
table which applies to this Task
entry. P1
- Pn
The specific parameters of the Task
. The P{1..n}
are replaced by the actual parameter names. result.task_status
Reported exit status of the Task
. Note that the output may still be labeled invalid by the valid_flag
(see below). result.summary
Short text summary of the Task
result. This is provided by the Task
, or sometimes the Executor
. result.payload
Full description of result from the Task
. If the object is incompatible with the database, will instead be a pointer to where it can be found. result.impl_schemas
A string of semi-colon separated schema(s) implemented by the Task
. Schemas describe conceptually the type output the Task
produces. valid_flag
A boolean flag for whether the result is valid. May be 0
(False) if e.g., data is missing, or corrupt, or reported status is failed. NOTE: The result.payload
may be distinct from the output files. Payloads can be specified in terms of output parameters, specific output files, or are an optional summary of the results provided by the Task
. E.g. this may include graphical descriptions of results (plots, figures, etc.). In many cases, however, the output files will most likely be pointed to by a parameter in one of the columns P{1...n}
- if properly specified in the TaskParameters
model the value of this output parameter will be replicated in the result.payload
column as well..
This API is intended to be used at the Executor
level, with some calls intended to provide default values for Pydantic models. Utilities for reading and inspecting the database outside of normal Task
execution are addressed in the following subheader.
record_analysis_db(cfg: DescribedAnalysis) -> None
: Writes the configuration to the backend database.read_latest_db_entry(db_dir: str, task_name: str, param: str) -> Any
: Retrieve the most recent entry from a database for a specific Task.invalidate_entry
: Marks a database entry as invalid. Common reason to use this is if data has been deleted, or found to be corrupted.dbview
: TUI for database inspection. Read only.LUTE Managed Tasks.
Executor-managed Tasks with specific environment specifications are defined here.
"},{"location":"source/managed_tasks/#managed_tasks.BinaryErrTester","title":"BinaryErrTester = Executor('TestBinaryErr')
module-attribute
","text":"Runs a test of a third-party task that fails.
"},{"location":"source/managed_tasks/#managed_tasks.BinaryTester","title":"BinaryTester: Executor = Executor('TestBinary')
module-attribute
","text":"Runs a basic test of a multi-threaded third-party Task.
"},{"location":"source/managed_tasks/#managed_tasks.CrystFELIndexer","title":"CrystFELIndexer: Executor = Executor('IndexCrystFEL')
module-attribute
","text":"Runs crystallographic indexing using CrystFEL.
"},{"location":"source/managed_tasks/#managed_tasks.DimpleSolver","title":"DimpleSolver: Executor = Executor('DimpleSolve')
module-attribute
","text":"Solves a crystallographic structure using molecular replacement.
"},{"location":"source/managed_tasks/#managed_tasks.HKLComparer","title":"HKLComparer: Executor = Executor('CompareHKL')
module-attribute
","text":"Runs analysis on merge results for statistics/figures of merit..
"},{"location":"source/managed_tasks/#managed_tasks.HKLManipulator","title":"HKLManipulator: Executor = Executor('ManipulateHKL')
module-attribute
","text":"Performs format conversions (among other things) of merge results.
"},{"location":"source/managed_tasks/#managed_tasks.MultiNodeCommunicationTester","title":"MultiNodeCommunicationTester: MPIExecutor = MPIExecutor('TestMultiNodeCommunication')
module-attribute
","text":"Runs a test to confirm communication works between multiple nodes.
"},{"location":"source/managed_tasks/#managed_tasks.PartialatorMerger","title":"PartialatorMerger: Executor = Executor('MergePartialator')
module-attribute
","text":"Runs crystallographic merging using CrystFEL's partialator.
"},{"location":"source/managed_tasks/#managed_tasks.PeakFinderPsocake","title":"PeakFinderPsocake: Executor = Executor('FindPeaksPsocake')
module-attribute
","text":"Performs Bragg peak finding using psocake - DEPRECATED.
"},{"location":"source/managed_tasks/#managed_tasks.PeakFinderPyAlgos","title":"PeakFinderPyAlgos: MPIExecutor = MPIExecutor('FindPeaksPyAlgos')
module-attribute
","text":"Performs Bragg peak finding using the PyAlgos algorithm.
"},{"location":"source/managed_tasks/#managed_tasks.ReadTester","title":"ReadTester: Executor = Executor('TestReadOutput')
module-attribute
","text":"Runs a test to confirm database reading.
"},{"location":"source/managed_tasks/#managed_tasks.SHELXCRunner","title":"SHELXCRunner: Executor = Executor('RunSHELXC')
module-attribute
","text":"Runs CCP4 SHELXC - needed for crystallographic phasing.
"},{"location":"source/managed_tasks/#managed_tasks.SmallDataProducer","title":"SmallDataProducer: Executor = Executor('SubmitSMD')
module-attribute
","text":"Runs the production of a smalldata HDF5 file.
"},{"location":"source/managed_tasks/#managed_tasks.SocketTester","title":"SocketTester: Executor = Executor('TestSocket')
module-attribute
","text":"Runs a test of socket-based communication.
"},{"location":"source/managed_tasks/#managed_tasks.StreamFileConcatenator","title":"StreamFileConcatenator: Executor = Executor('ConcatenateStreamFiles')
module-attribute
","text":"Concatenates results from crystallographic indexing of multiple runs.
"},{"location":"source/managed_tasks/#managed_tasks.Tester","title":"Tester: Executor = Executor('Test')
module-attribute
","text":"Runs a basic test of a first-party Task.
"},{"location":"source/managed_tasks/#managed_tasks.WriteTester","title":"WriteTester: Executor = Executor('TestWriteOutput')
module-attribute
","text":"Runs a test to confirm database writing.
"},{"location":"source/execution/debug_utils/","title":"debug_utils","text":"Functions to assist in debugging execution of LUTE.
Functions:
Name DescriptionLUTE_DEBUG_EXIT
(env_var: str, str_dump: Optional[str]): Exits the program if the provided env_var
is set. Optionally, also prints a message if provided.
Raises:
Type DescriptionValidationError
Error raised by pydantic during data validation. (From Pydantic)
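A minimal usage sketch of LUTE_DEBUG_EXIT; the import path and environment variable name are assumptions:
from lute.execution.debug_utils import LUTE_DEBUG_EXIT  # assumed import path

# Exits here only if the named environment variable is set; otherwise execution continues.
LUTE_DEBUG_EXIT("LUTE_DEBUG_BEFORE_SUBMIT", "Stopping before Task submission for inspection")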
"},{"location":"source/execution/executor/","title":"executor","text":"Base classes and functions for handling Task
execution.
Executors run a Task
as a subprocess and handle all communication with other services, e.g., the eLog. They accept specific handlers to override default stream parsing.
Event handlers/hooks are implemented as standalone functions which can be added to an Executor.
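As a sketch of the hook mechanism, ahead of the class list below; the import paths are assumptions and the hook body is illustrative:
from lute.execution.executor import Executor  # assumed import paths
from lute.execution.ipc import Message

MyTaskRunner: Executor = Executor("Test")

def log_failure(executor: Executor, msg: Message) -> None:
    # Hooks receive the Executor and the Message carrying the signal.
    print(f"Task failed: {msg.contents}")

# Replaces the default task_failed hook; only one hook is registered per event at a time.
MyTaskRunner.add_hook("task_failed", log_failure)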
Classes:
Name DescriptionAnalysisConfig
Data class for holding a managed Task's configuration.
BaseExecutor
Abstract base class from which all Executors are derived.
Executor
Default Executor implementing all basic functionality and IPC.
BinaryExecutor
Can execute any arbitrary binary/command as a managed task within the framework provided by LUTE.
"},{"location":"source/execution/executor/#execution.executor--exceptions","title":"Exceptions","text":""},{"location":"source/execution/executor/#execution.executor.BaseExecutor","title":"BaseExecutor
","text":" Bases: ABC
ABC to manage Task execution and communication with user services.
When running in a workflow, \"tasks\" (not the class instances) are submitted as Executors
. The Executor manages environment setup, the actual Task submission, and communication regarding Task results and status with third party services like the eLog.
Attributes:
Methods:
Name Descriptionadd_hook
(event: str, hook: Callable[[None], None]) -> None: Create a new hook to be called each time a specific event occurs.
add_default_hooks
Populate the event hooks with the default functions.
update_environment
(env: Dict[str, str], update_path: str): Update the environment that is passed to the Task subprocess.
execute_task
Run the task as a subprocess.
Source code inlute/execution/executor.py
class BaseExecutor(ABC):\n \"\"\"ABC to manage Task execution and communication with user services.\n\n When running in a workflow, \"tasks\" (not the class instances) are submitted\n as `Executors`. The Executor manages environment setup, the actual Task\n submission, and communication regarding Task results and status with third\n party services like the eLog.\n\n Attributes:\n\n Methods:\n add_hook(event: str, hook: Callable[[None], None]) -> None: Create a\n new hook to be called each time a specific event occurs.\n\n add_default_hooks() -> None: Populate the event hooks with the default\n functions.\n\n update_environment(env: Dict[str, str], update_path: str): Update the\n environment that is passed to the Task subprocess.\n\n execute_task(): Run the task as a subprocess.\n \"\"\"\n\n class Hooks:\n \"\"\"A container class for the Executor's event hooks.\n\n There is a corresponding function (hook) for each event/signal. Each\n function takes two parameters - a reference to the Executor (self) and\n a reference to the Message (msg) which includes the corresponding\n signal.\n \"\"\"\n\n def no_pickle_mode(self: Self, msg: Message): ...\n\n def task_started(self: Self, msg: Message): ...\n\n def task_failed(self: Self, msg: Message): ...\n\n def task_stopped(self: Self, msg: Message): ...\n\n def task_done(self: Self, msg: Message): ...\n\n def task_cancelled(self: Self, msg: Message): ...\n\n def task_result(self: Self, msg: Message): ...\n\n def __init__(\n self,\n task_name: str,\n communicators: List[Communicator],\n poll_interval: float = 0.05,\n ) -> None:\n \"\"\"The Executor will manage the subprocess in which `task_name` is run.\n\n Args:\n task_name (str): The name of the Task to be submitted. Must match\n the Task's class name exactly. The parameter specification must\n also be in a properly named model to be identified.\n\n communicators (List[Communicator]): A list of one or more\n communicators which manage information flow to/from the Task.\n Subclasses may have different defaults, and new functionality\n can be introduced by composing Executors with communicators.\n\n poll_interval (float): Time to wait between reading/writing to the\n managed subprocess. In seconds.\n \"\"\"\n result: TaskResult = TaskResult(\n task_name=task_name, task_status=TaskStatus.PENDING, summary=\"\", payload=\"\"\n )\n task_parameters: Optional[TaskParameters] = None\n task_env: Dict[str, str] = os.environ.copy()\n self._communicators: List[Communicator] = communicators\n communicator_desc: List[str] = []\n for comm in self._communicators:\n comm.stage_communicator()\n communicator_desc.append(str(comm))\n\n self._analysis_desc: DescribedAnalysis = DescribedAnalysis(\n task_result=result,\n task_parameters=task_parameters,\n task_env=task_env,\n poll_interval=poll_interval,\n communicator_desc=communicator_desc,\n )\n\n def add_hook(self, event: str, hook: Callable[[Self, Message], None]) -> None:\n \"\"\"Add a new hook.\n\n Each hook is a function called any time the Executor receives a signal\n for a particular event, e.g. Task starts, Task ends, etc. Calling this\n method will remove any hook that currently exists for the event. I.e.\n only one hook can be called per event at a time. 
Creating hooks for\n events which do not exist is not allowed.\n\n Args:\n event (str): The event for which the hook will be called.\n\n hook (Callable[[None], None]) The function to be called during each\n occurrence of the event.\n \"\"\"\n if event.upper() in LUTE_SIGNALS:\n setattr(self.Hooks, event.lower(), hook)\n\n @abstractmethod\n def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n ...\n\n def update_environment(\n self, env: Dict[str, str], update_path: str = \"prepend\"\n ) -> None:\n \"\"\"Update the stored set of environment variables.\n\n These are passed to the subprocess to setup its environment.\n\n Args:\n env (Dict[str, str]): A dictionary of \"VAR\":\"VALUE\" pairs of\n environment variables to be added to the subprocess environment.\n If any variables already exist, the new variables will\n overwrite them (except PATH, see below).\n\n update_path (str): If PATH is present in the new set of variables,\n this argument determines how the old PATH is dealt with. There\n are three options:\n * \"prepend\" : The new PATH values are prepended to the old ones.\n * \"append\" : The new PATH values are appended to the old ones.\n * \"overwrite\" : The old PATH is overwritten by the new one.\n \"prepend\" is the default option. If PATH is not present in the\n current environment, the new PATH is used without modification.\n \"\"\"\n if \"PATH\" in env:\n sep: str = os.pathsep\n if update_path == \"prepend\":\n env[\"PATH\"] = (\n f\"{env['PATH']}{sep}{self._analysis_desc.task_env['PATH']}\"\n )\n elif update_path == \"append\":\n env[\"PATH\"] = (\n f\"{self._analysis_desc.task_env['PATH']}{sep}{env['PATH']}\"\n )\n elif update_path == \"overwrite\":\n pass\n else:\n raise ValueError(\n (\n f\"{update_path} is not a valid option for `update_path`!\"\n \" Options are: prepend, append, overwrite.\"\n )\n )\n os.environ.update(env)\n self._analysis_desc.task_env.update(env)\n\n def shell_source(self, env: str) -> None:\n \"\"\"Source a script.\n\n Unlike `update_environment` this method sources a new file.\n\n Args:\n env (str): Path to the script to source.\n \"\"\"\n import sys\n\n if not os.path.exists(env):\n logger.info(f\"Cannot source environment from {env}!\")\n return\n\n script: str = (\n f\"set -a\\n\"\n f'source \"{env}\" >/dev/null\\n'\n f'{sys.executable} -c \"import os; print(dict(os.environ))\"\\n'\n )\n logger.info(f\"Sourcing file {env}\")\n o, e = subprocess.Popen(\n [\"bash\", \"-c\", script], stdout=subprocess.PIPE\n ).communicate()\n new_environment: Dict[str, str] = eval(o)\n self._analysis_desc.task_env = new_environment\n\n def _pre_task(self) -> None:\n \"\"\"Any actions to be performed before task submission.\n\n This method may or may not be used by subclasses. 
It may be useful\n for logging etc.\n \"\"\"\n # This prevents the Executors in managed_tasks.py from all acquiring\n # resources like sockets.\n for communicator in self._communicators:\n communicator.delayed_setup()\n # Not great, but experience shows we need a bit of time to setup\n # network.\n time.sleep(0.1)\n # Propagate any env vars setup by Communicators - only update LUTE_ vars\n tmp: Dict[str, str] = {\n key: os.environ[key] for key in os.environ if \"LUTE_\" in key\n }\n self._analysis_desc.task_env.update(tmp)\n\n def _submit_task(self, cmd: str) -> subprocess.Popen:\n proc: subprocess.Popen = subprocess.Popen(\n cmd.split(),\n stdout=subprocess.PIPE,\n stderr=subprocess.PIPE,\n env=self._analysis_desc.task_env,\n )\n os.set_blocking(proc.stdout.fileno(), False)\n os.set_blocking(proc.stderr.fileno(), False)\n return proc\n\n @abstractmethod\n def _task_loop(self, proc: subprocess.Popen) -> None:\n \"\"\"Actions to perform while the Task is running.\n\n This function is run in the body of a loop until the Task signals\n that its finished.\n \"\"\"\n ...\n\n @abstractmethod\n def _finalize_task(self, proc: subprocess.Popen) -> None:\n \"\"\"Any actions to be performed after the Task has ended.\n\n Examples include a final clearing of the pipes, retrieving results,\n reporting to third party services, etc.\n \"\"\"\n ...\n\n def _submit_cmd(self, executable_path: str, params: str) -> str:\n \"\"\"Return a formatted command for launching Task subprocess.\n\n May be overridden by subclasses.\n\n Args:\n executable_path (str): Path to the LUTE subprocess script.\n\n params (str): String of formatted command-line arguments.\n\n Returns:\n cmd (str): Appropriately formatted command for this Executor.\n \"\"\"\n cmd: str = \"\"\n if __debug__:\n cmd = f\"python -B {executable_path} {params}\"\n else:\n cmd = f\"python -OB {executable_path} {params}\"\n\n return cmd\n\n def execute_task(self) -> None:\n \"\"\"Run the requested Task as a subprocess.\"\"\"\n self._pre_task()\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n if lute_path is None:\n logger.debug(\"Absolute path to subprocess_task.py not found.\")\n lute_path = os.path.abspath(f\"{os.path.dirname(__file__)}/../..\")\n self.update_environment({\"LUTE_PATH\": lute_path})\n executable_path: str = f\"{lute_path}/subprocess_task.py\"\n config_path: str = self._analysis_desc.task_env[\"LUTE_CONFIGPATH\"]\n params: str = f\"-c {config_path} -t {self._analysis_desc.task_result.task_name}\"\n\n cmd: str = self._submit_cmd(executable_path, params)\n proc: subprocess.Popen = self._submit_task(cmd)\n\n while self._task_is_running(proc):\n self._task_loop(proc)\n time.sleep(self._analysis_desc.poll_interval)\n\n os.set_blocking(proc.stdout.fileno(), True)\n os.set_blocking(proc.stderr.fileno(), True)\n\n self._finalize_task(proc)\n proc.stdout.close()\n proc.stderr.close()\n proc.wait()\n if ret := proc.returncode:\n logger.info(f\"Task failed with return code: {ret}\")\n self._analysis_desc.task_result.task_status = TaskStatus.FAILED\n self.Hooks.task_failed(self, msg=Message())\n elif self._analysis_desc.task_result.task_status == TaskStatus.RUNNING:\n # Ret code is 0, no exception was thrown, task forgot to set status\n self._analysis_desc.task_result.task_status = TaskStatus.COMPLETED\n logger.debug(f\"Task did not change from RUNNING status. 
Assume COMPLETED.\")\n self.Hooks.task_done(self, msg=Message())\n self._store_configuration()\n for comm in self._communicators:\n comm.clear_communicator()\n\n if self._analysis_desc.task_result.task_status == TaskStatus.FAILED:\n logger.info(\"Exiting after Task failure. Result recorded.\")\n sys.exit(-1)\n\n self.process_results()\n\n def _store_configuration(self) -> None:\n \"\"\"Store configuration and results in the LUTE database.\"\"\"\n record_analysis_db(copy.deepcopy(self._analysis_desc))\n\n def _task_is_running(self, proc: subprocess.Popen) -> bool:\n \"\"\"Whether a subprocess is running.\n\n Args:\n proc (subprocess.Popen): The subprocess to determine the run status\n of.\n\n Returns:\n bool: Is the subprocess task running.\n \"\"\"\n # Add additional conditions - don't want to exit main loop\n # if only stopped\n task_status: TaskStatus = self._analysis_desc.task_result.task_status\n is_running: bool = task_status != TaskStatus.COMPLETED\n is_running &= task_status != TaskStatus.CANCELLED\n is_running &= task_status != TaskStatus.TIMEDOUT\n return proc.poll() is None and is_running\n\n def _stop(self, proc: subprocess.Popen) -> None:\n \"\"\"Stop the Task subprocess.\"\"\"\n os.kill(proc.pid, signal.SIGTSTP)\n self._analysis_desc.task_result.task_status = TaskStatus.STOPPED\n\n def _continue(self, proc: subprocess.Popen) -> None:\n \"\"\"Resume a stopped Task subprocess.\"\"\"\n os.kill(proc.pid, signal.SIGCONT)\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n\n def _set_result_from_parameters(self) -> None:\n \"\"\"Use TaskParameters object to set TaskResult fields.\n\n A result may be defined in terms of specific parameters. This is most\n useful for ThirdPartyTasks which would not otherwise have an easy way of\n reporting what the TaskResult is. There are two options for specifying\n results from parameters:\n 1. A single parameter (Field) of the model has an attribute\n `is_result`. This is a bool indicating that this parameter points\n to a result. E.g. a parameter `output` may set `is_result=True`.\n 2. The `TaskParameters.Config` has a `result_from_params` attribute.\n This is an appropriate option if the result is determinable for\n the Task, but it is not easily defined by a single parameter. The\n TaskParameters.Config.result_from_param can be set by a custom\n validator, e.g. to combine the values of multiple parameters into\n a single result. E.g. an `out_dir` and `out_file` parameter used\n together specify the result. Currently only string specifiers are\n supported.\n\n A TaskParameters object specifies that it contains information about the\n result by setting a single config option:\n TaskParameters.Config.set_result=True\n In general, this method should only be called when the above condition is\n met, however, there are minimal checks in it as well.\n \"\"\"\n # This method shouldn't be called unless appropriate\n # But we will add extra guards here\n if self._analysis_desc.task_parameters is None:\n logger.debug(\n \"Cannot set result from TaskParameters. TaskParameters is None!\"\n )\n return\n if (\n not hasattr(self._analysis_desc.task_parameters.Config, \"set_result\")\n or not self._analysis_desc.task_parameters.Config.set_result\n ):\n logger.debug(\n \"Cannot set result from TaskParameters. 
`set_result` not specified!\"\n )\n return\n\n # First try to set from result_from_params (faster)\n if self._analysis_desc.task_parameters.Config.result_from_params is not None:\n result_from_params: str = (\n self._analysis_desc.task_parameters.Config.result_from_params\n )\n logger.info(f\"TaskResult specified as {result_from_params}.\")\n self._analysis_desc.task_result.payload = result_from_params\n else:\n # Iterate parameters to find the one that is the result\n schema: Dict[str, Any] = self._analysis_desc.task_parameters.schema()\n for param, value in self._analysis_desc.task_parameters.dict().items():\n param_attrs: Dict[str, Any] = schema[\"properties\"][param]\n if \"is_result\" in param_attrs:\n is_result: bool = param_attrs[\"is_result\"]\n if isinstance(is_result, bool) and is_result:\n logger.info(f\"TaskResult specified as {value}.\")\n self._analysis_desc.task_result.payload = value\n else:\n logger.debug(\n (\n f\"{param} specified as result! But specifier is of \"\n f\"wrong type: {type(is_result)}!\"\n )\n )\n break # We should only have 1 result-like parameter!\n\n # If we get this far and haven't changed the payload we should complain\n if self._analysis_desc.task_result.payload == \"\":\n task_name: str = self._analysis_desc.task_result.task_name\n logger.debug(\n (\n f\"{task_name} specified result be set from {task_name}Parameters,\"\n \" but no result provided! Check model definition!\"\n )\n )\n # Now check for impl_schemas and pass to result.impl_schemas\n # Currently unused\n impl_schemas: Optional[str] = (\n self._analysis_desc.task_parameters.Config.impl_schemas\n )\n self._analysis_desc.task_result.impl_schemas = impl_schemas\n # If we set_result but didn't get schema information we should complain\n if self._analysis_desc.task_result.impl_schemas is None:\n task_name: str = self._analysis_desc.task_result.task_name\n logger.debug(\n (\n f\"{task_name} specified result be set from {task_name}Parameters,\"\n \" but no schema provided! Check model definition!\"\n )\n )\n\n def process_results(self) -> None:\n \"\"\"Perform any necessary steps to process TaskResults object.\n\n Processing will depend on subclass. Examples of steps include, moving\n files, converting file formats, compiling plots/figures into an HTML\n file, etc.\n \"\"\"\n self._process_results()\n\n @abstractmethod\n def _process_results(self) -> None: ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.Hooks","title":"Hooks
","text":"A container class for the Executor's event hooks.
There is a corresponding function (hook) for each event/signal. Each function takes two parameters - a reference to the Executor (self) and a reference to the Message (msg) which includes the corresponding signal.
Source code inlute/execution/executor.py
class Hooks:\n \"\"\"A container class for the Executor's event hooks.\n\n There is a corresponding function (hook) for each event/signal. Each\n function takes two parameters - a reference to the Executor (self) and\n a reference to the Message (msg) which includes the corresponding\n signal.\n \"\"\"\n\n def no_pickle_mode(self: Self, msg: Message): ...\n\n def task_started(self: Self, msg: Message): ...\n\n def task_failed(self: Self, msg: Message): ...\n\n def task_stopped(self: Self, msg: Message): ...\n\n def task_done(self: Self, msg: Message): ...\n\n def task_cancelled(self: Self, msg: Message): ...\n\n def task_result(self: Self, msg: Message): ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.__init__","title":"__init__(task_name, communicators, poll_interval=0.05)
","text":"The Executor will manage the subprocess in which task_name
is run.
Parameters:
Name Type Description Defaulttask_name
str
The name of the Task to be submitted. Must match the Task's class name exactly. The parameter specification must also be in a properly named model to be identified.
requiredcommunicators
List[Communicator]
A list of one or more communicators which manage information flow to/from the Task. Subclasses may have different defaults, and new functionality can be introduced by composing Executors with communicators.
requiredpoll_interval
float
Time to wait between reading/writing to the managed subprocess. In seconds.
0.05
Source code in lute/execution/executor.py
def __init__(\n self,\n task_name: str,\n communicators: List[Communicator],\n poll_interval: float = 0.05,\n) -> None:\n \"\"\"The Executor will manage the subprocess in which `task_name` is run.\n\n Args:\n task_name (str): The name of the Task to be submitted. Must match\n the Task's class name exactly. The parameter specification must\n also be in a properly named model to be identified.\n\n communicators (List[Communicator]): A list of one or more\n communicators which manage information flow to/from the Task.\n Subclasses may have different defaults, and new functionality\n can be introduced by composing Executors with communicators.\n\n poll_interval (float): Time to wait between reading/writing to the\n managed subprocess. In seconds.\n \"\"\"\n result: TaskResult = TaskResult(\n task_name=task_name, task_status=TaskStatus.PENDING, summary=\"\", payload=\"\"\n )\n task_parameters: Optional[TaskParameters] = None\n task_env: Dict[str, str] = os.environ.copy()\n self._communicators: List[Communicator] = communicators\n communicator_desc: List[str] = []\n for comm in self._communicators:\n comm.stage_communicator()\n communicator_desc.append(str(comm))\n\n self._analysis_desc: DescribedAnalysis = DescribedAnalysis(\n task_result=result,\n task_parameters=task_parameters,\n task_env=task_env,\n poll_interval=poll_interval,\n communicator_desc=communicator_desc,\n )\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.add_default_hooks","title":"add_default_hooks()
abstractmethod
","text":"Populate the set of default event hooks.
Source code inlute/execution/executor.py
@abstractmethod\ndef add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n ...\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.add_hook","title":"add_hook(event, hook)
","text":"Add a new hook.
Each hook is a function called any time the Executor receives a signal for a particular event, e.g. Task starts, Task ends, etc. Calling this method will remove any hook that currently exists for the event. I.e. only one hook can be called per event at a time. Creating hooks for events which do not exist is not allowed.
Parameters:
Name Type Description Defaultevent
str
The event for which the hook will be called.
required Source code inlute/execution/executor.py
def add_hook(self, event: str, hook: Callable[[Self, Message], None]) -> None:\n \"\"\"Add a new hook.\n\n Each hook is a function called any time the Executor receives a signal\n for a particular event, e.g. Task starts, Task ends, etc. Calling this\n method will remove any hook that currently exists for the event. I.e.\n only one hook can be called per event at a time. Creating hooks for\n events which do not exist is not allowed.\n\n Args:\n event (str): The event for which the hook will be called.\n\n hook (Callable[[None], None]) The function to be called during each\n occurrence of the event.\n \"\"\"\n if event.upper() in LUTE_SIGNALS:\n setattr(self.Hooks, event.lower(), hook)\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.execute_task","title":"execute_task()
","text":"Run the requested Task as a subprocess.
Source code inlute/execution/executor.py
def execute_task(self) -> None:\n \"\"\"Run the requested Task as a subprocess.\"\"\"\n self._pre_task()\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n if lute_path is None:\n logger.debug(\"Absolute path to subprocess_task.py not found.\")\n lute_path = os.path.abspath(f\"{os.path.dirname(__file__)}/../..\")\n self.update_environment({\"LUTE_PATH\": lute_path})\n executable_path: str = f\"{lute_path}/subprocess_task.py\"\n config_path: str = self._analysis_desc.task_env[\"LUTE_CONFIGPATH\"]\n params: str = f\"-c {config_path} -t {self._analysis_desc.task_result.task_name}\"\n\n cmd: str = self._submit_cmd(executable_path, params)\n proc: subprocess.Popen = self._submit_task(cmd)\n\n while self._task_is_running(proc):\n self._task_loop(proc)\n time.sleep(self._analysis_desc.poll_interval)\n\n os.set_blocking(proc.stdout.fileno(), True)\n os.set_blocking(proc.stderr.fileno(), True)\n\n self._finalize_task(proc)\n proc.stdout.close()\n proc.stderr.close()\n proc.wait()\n if ret := proc.returncode:\n logger.info(f\"Task failed with return code: {ret}\")\n self._analysis_desc.task_result.task_status = TaskStatus.FAILED\n self.Hooks.task_failed(self, msg=Message())\n elif self._analysis_desc.task_result.task_status == TaskStatus.RUNNING:\n # Ret code is 0, no exception was thrown, task forgot to set status\n self._analysis_desc.task_result.task_status = TaskStatus.COMPLETED\n logger.debug(f\"Task did not change from RUNNING status. Assume COMPLETED.\")\n self.Hooks.task_done(self, msg=Message())\n self._store_configuration()\n for comm in self._communicators:\n comm.clear_communicator()\n\n if self._analysis_desc.task_result.task_status == TaskStatus.FAILED:\n logger.info(\"Exiting after Task failure. Result recorded.\")\n sys.exit(-1)\n\n self.process_results()\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.process_results","title":"process_results()
","text":"Perform any necessary steps to process TaskResults object.
Processing will depend on subclass. Examples of steps include moving files, converting file formats, compiling plots/figures into an HTML file, etc.
Source code inlute/execution/executor.py
def process_results(self) -> None:\n \"\"\"Perform any necessary steps to process TaskResults object.\n\n Processing will depend on subclass. Examples of steps include, moving\n files, converting file formats, compiling plots/figures into an HTML\n file, etc.\n \"\"\"\n self._process_results()\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.shell_source","title":"shell_source(env)
","text":"Source a script.
Unlike update_environment
this method sources a new file.
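A brief usage sketch; the import path, Task name, and script path are placeholders:
from lute.execution.executor import Executor  # assumed import path

MyTaskRunner: Executor = Executor("MyNewTask")  # hypothetical Task name
# Source an environment setup script so its variables reach the Task subprocess.
MyTaskRunner.shell_source("/path/to/env_setup.sh")  # placeholder path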
Parameters:
Name Type Description Defaultenv
str
Path to the script to source.
required Source code inlute/execution/executor.py
def shell_source(self, env: str) -> None:\n \"\"\"Source a script.\n\n Unlike `update_environment` this method sources a new file.\n\n Args:\n env (str): Path to the script to source.\n \"\"\"\n import sys\n\n if not os.path.exists(env):\n logger.info(f\"Cannot source environment from {env}!\")\n return\n\n script: str = (\n f\"set -a\\n\"\n f'source \"{env}\" >/dev/null\\n'\n f'{sys.executable} -c \"import os; print(dict(os.environ))\"\\n'\n )\n logger.info(f\"Sourcing file {env}\")\n o, e = subprocess.Popen(\n [\"bash\", \"-c\", script], stdout=subprocess.PIPE\n ).communicate()\n new_environment: Dict[str, str] = eval(o)\n self._analysis_desc.task_env = new_environment\n
"},{"location":"source/execution/executor/#execution.executor.BaseExecutor.update_environment","title":"update_environment(env, update_path='prepend')
","text":"Update the stored set of environment variables.
These are passed to the subprocess to setup its environment.
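A short sketch of the PATH handling described in the parameter table below; the import path, Task name, and values are placeholders:
from lute.execution.executor import Executor  # assumed import path

MyTaskRunner: Executor = Executor("MyNewTask")  # hypothetical Task name
# New PATH entries are prepended to the existing PATH by default.
MyTaskRunner.update_environment({"PATH": "/path/to/extra/bin", "MY_VAR": "1"})
# Append instead of prepending:
MyTaskRunner.update_environment({"PATH": "/path/to/other/bin"}, update_path="append")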
Parameters:
Name Type Description Defaultenv
Dict[str, str]
A dictionary of \"VAR\":\"VALUE\" pairs of environment variables to be added to the subprocess environment. If any variables already exist, the new variables will overwrite them (except PATH, see below).
requiredupdate_path
str
If PATH is present in the new set of variables, this argument determines how the old PATH is dealt with. There are three options: * \"prepend\" : The new PATH values are prepended to the old ones. * \"append\" : The new PATH values are appended to the old ones. * \"overwrite\" : The old PATH is overwritten by the new one. \"prepend\" is the default option. If PATH is not present in the current environment, the new PATH is used without modification.
'prepend'
Source code in lute/execution/executor.py
def update_environment(\n self, env: Dict[str, str], update_path: str = \"prepend\"\n) -> None:\n \"\"\"Update the stored set of environment variables.\n\n These are passed to the subprocess to setup its environment.\n\n Args:\n env (Dict[str, str]): A dictionary of \"VAR\":\"VALUE\" pairs of\n environment variables to be added to the subprocess environment.\n If any variables already exist, the new variables will\n overwrite them (except PATH, see below).\n\n update_path (str): If PATH is present in the new set of variables,\n this argument determines how the old PATH is dealt with. There\n are three options:\n * \"prepend\" : The new PATH values are prepended to the old ones.\n * \"append\" : The new PATH values are appended to the old ones.\n * \"overwrite\" : The old PATH is overwritten by the new one.\n \"prepend\" is the default option. If PATH is not present in the\n current environment, the new PATH is used without modification.\n \"\"\"\n if \"PATH\" in env:\n sep: str = os.pathsep\n if update_path == \"prepend\":\n env[\"PATH\"] = (\n f\"{env['PATH']}{sep}{self._analysis_desc.task_env['PATH']}\"\n )\n elif update_path == \"append\":\n env[\"PATH\"] = (\n f\"{self._analysis_desc.task_env['PATH']}{sep}{env['PATH']}\"\n )\n elif update_path == \"overwrite\":\n pass\n else:\n raise ValueError(\n (\n f\"{update_path} is not a valid option for `update_path`!\"\n \" Options are: prepend, append, overwrite.\"\n )\n )\n os.environ.update(env)\n self._analysis_desc.task_env.update(env)\n
"},{"location":"source/execution/executor/#execution.executor.Communicator","title":"Communicator
","text":" Bases: ABC
lute/execution/ipc.py
class Communicator(ABC):\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n\n @abstractmethod\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n\n @abstractmethod\n def write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n\n def __str__(self):\n name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n return f\"{name}: {self.desc}\"\n\n def __repr__(self):\n return self.__str__()\n\n def __enter__(self) -> Self:\n return self\n\n def __exit__(self) -> None: ...\n\n @property\n def has_messages(self) -> bool:\n \"\"\"Whether the Communicator has remaining messages.\n\n The precise method for determining whether there are remaining messages\n will depend on the specific Communicator sub-class.\n \"\"\"\n return False\n\n def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n\n def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n\n def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.has_messages","title":"has_messages: bool
property
","text":"Whether the Communicator has remaining messages.
The precise method for determining whether there are remaining messages will depend on the specific Communicator sub-class.
"},{"location":"source/execution/executor/#execution.executor.Communicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"Abstract Base Class for IPC Communicator objects.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using pickle prior to sending it.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.clear_communicator","title":"clear_communicator()
","text":"Alternative exit method outside of context manager.
Source code inlute/execution/ipc.py
def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.delayed_setup","title":"delayed_setup()
","text":"Any setup that should be done later than init.
Source code inlute/execution/ipc.py
def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.read","title":"read(proc)
abstractmethod
","text":"Method for reading data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.stage_communicator","title":"stage_communicator()
","text":"Alternative method for staging outside of context manager.
Source code inlute/execution/ipc.py
def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n
"},{"location":"source/execution/executor/#execution.executor.Communicator.write","title":"write(msg)
abstractmethod
","text":"Method for sending data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/executor/#execution.executor.Executor","title":"Executor
","text":" Bases: BaseExecutor
Basic implementation of an Executor which manages simple IPC with Task.
Attributes:
Methods:
Name Descriptionadd_hook
(event: str, hook: Callable[[None], None]) -> None: Create a new hook to be called each time a specific event occurs.
add_default_hooks
Populate the event hooks with the default functions.
update_environment
(env: Dict[str, str], update_path: str): Update the environment that is passed to the Task subprocess.
execute_task
Run the task as a subprocess.
Source code inlute/execution/executor.py
class Executor(BaseExecutor):\n \"\"\"Basic implementation of an Executor which manages simple IPC with Task.\n\n Attributes:\n\n Methods:\n add_hook(event: str, hook: Callable[[None], None]) -> None: Create a\n new hook to be called each time a specific event occurs.\n\n add_default_hooks() -> None: Populate the event hooks with the default\n functions.\n\n update_environment(env: Dict[str, str], update_path: str): Update the\n environment that is passed to the Task subprocess.\n\n execute_task(): Run the task as a subprocess.\n \"\"\"\n\n def __init__(\n self,\n task_name: str,\n communicators: List[Communicator] = [\n PipeCommunicator(Party.EXECUTOR),\n SocketCommunicator(Party.EXECUTOR),\n ],\n poll_interval: float = 0.05,\n ) -> None:\n super().__init__(\n task_name=task_name,\n communicators=communicators,\n poll_interval=poll_interval,\n )\n self.add_default_hooks()\n\n def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n def no_pickle_mode(self: Executor, msg: Message):\n for idx, communicator in enumerate(self._communicators):\n if isinstance(communicator, PipeCommunicator):\n self._communicators[idx] = PipeCommunicator(\n Party.EXECUTOR, use_pickle=False\n )\n\n self.add_hook(\"no_pickle_mode\", no_pickle_mode)\n\n def task_started(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskParameters):\n self._analysis_desc.task_parameters = msg.contents\n # Maybe just run this no matter what? Rely on the other guards?\n # Perhaps just check if ThirdPartyParameters?\n # if isinstance(self._analysis_desc.task_parameters, ThirdPartyParameters):\n if hasattr(self._analysis_desc.task_parameters.Config, \"set_result\"):\n # Third party Tasks may mark a parameter as the result\n # If so, setup the result now.\n self._set_result_from_parameters()\n logger.info(\n f\"Executor: {self._analysis_desc.task_result.task_name} started\"\n )\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"RUNNING\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_started\", task_started)\n\n def task_failed(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"FAILED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_failed\", task_failed)\n\n def task_stopped(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"STOPPED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_stopped\", task_stopped)\n\n def task_done(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_done\", task_done)\n\n def task_cancelled(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"CANCELLED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_cancelled\", task_cancelled)\n\n def task_result(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskResult):\n self._analysis_desc.task_result = msg.contents\n logger.info(self._analysis_desc.task_result.summary)\n logger.info(self._analysis_desc.task_result.task_status)\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n 
self.add_hook(\"task_result\", task_result)\n\n def _task_loop(self, proc: subprocess.Popen) -> None:\n \"\"\"Actions to perform while the Task is running.\n\n This function is run in the body of a loop until the Task signals\n that its finished.\n \"\"\"\n for communicator in self._communicators:\n while True:\n msg: Message = communicator.read(proc)\n if msg.signal is not None and msg.signal.upper() in LUTE_SIGNALS:\n hook: Callable[[Executor, Message], None] = getattr(\n self.Hooks, msg.signal.lower()\n )\n hook(self, msg)\n if msg.contents is not None:\n if isinstance(msg.contents, str) and msg.contents != \"\":\n logger.info(msg.contents)\n elif not isinstance(msg.contents, str):\n logger.info(msg.contents)\n if not communicator.has_messages:\n break\n\n def _finalize_task(self, proc: subprocess.Popen) -> None:\n \"\"\"Any actions to be performed after the Task has ended.\n\n Examples include a final clearing of the pipes, retrieving results,\n reporting to third party services, etc.\n \"\"\"\n self._task_loop(proc) # Perform a final read.\n\n def _process_results(self) -> None:\n \"\"\"Performs result processing.\n\n Actions include:\n - For `ElogSummaryPlots`, will save the summary plot to the appropriate\n directory for display in the eLog.\n \"\"\"\n task_result: TaskResult = self._analysis_desc.task_result\n self._process_result_payload(task_result.payload)\n self._process_result_summary(task_result.summary)\n\n def _process_result_payload(self, payload: Any) -> None:\n if self._analysis_desc.task_parameters is None:\n logger.debug(\"Please run Task before using this method!\")\n return\n if isinstance(payload, ElogSummaryPlots):\n # ElogSummaryPlots has figures and a display name\n # display name also serves as a path.\n expmt: str = self._analysis_desc.task_parameters.lute_config.experiment\n base_path: str = f\"/sdf/data/lcls/ds/{expmt[:3]}/{expmt}/stats/summary\"\n full_path: str = f\"{base_path}/{payload.display_name}\"\n if not os.path.isdir(full_path):\n os.makedirs(full_path)\n\n # Preferred plots are pn.Tabs objects which save directly as html\n # Only supported plot type that has \"save\" method - do not want to\n # import plot modules here to do type checks.\n if hasattr(payload.figures, \"save\"):\n payload.figures.save(f\"{full_path}/report.html\")\n else:\n ...\n elif isinstance(payload, str):\n # May be a path to a file...\n schemas: Optional[str] = self._analysis_desc.task_result.impl_schemas\n # Should also check `impl_schemas` to determine what to do with path\n\n def _process_result_summary(self, summary: str) -> None: ...\n
"},{"location":"source/execution/executor/#execution.executor.Executor.add_default_hooks","title":"add_default_hooks()
","text":"Populate the set of default event hooks.
Source code inlute/execution/executor.py
def add_default_hooks(self) -> None:\n \"\"\"Populate the set of default event hooks.\"\"\"\n\n def no_pickle_mode(self: Executor, msg: Message):\n for idx, communicator in enumerate(self._communicators):\n if isinstance(communicator, PipeCommunicator):\n self._communicators[idx] = PipeCommunicator(\n Party.EXECUTOR, use_pickle=False\n )\n\n self.add_hook(\"no_pickle_mode\", no_pickle_mode)\n\n def task_started(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskParameters):\n self._analysis_desc.task_parameters = msg.contents\n # Maybe just run this no matter what? Rely on the other guards?\n # Perhaps just check if ThirdPartyParameters?\n # if isinstance(self._analysis_desc.task_parameters, ThirdPartyParameters):\n if hasattr(self._analysis_desc.task_parameters.Config, \"set_result\"):\n # Third party Tasks may mark a parameter as the result\n # If so, setup the result now.\n self._set_result_from_parameters()\n logger.info(\n f\"Executor: {self._analysis_desc.task_result.task_name} started\"\n )\n self._analysis_desc.task_result.task_status = TaskStatus.RUNNING\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"RUNNING\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_started\", task_started)\n\n def task_failed(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"FAILED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_failed\", task_failed)\n\n def task_stopped(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"STOPPED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_stopped\", task_stopped)\n\n def task_done(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_done\", task_done)\n\n def task_cancelled(self: Executor, msg: Message):\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"CANCELLED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_cancelled\", task_cancelled)\n\n def task_result(self: Executor, msg: Message):\n if isinstance(msg.contents, TaskResult):\n self._analysis_desc.task_result = msg.contents\n logger.info(self._analysis_desc.task_result.summary)\n logger.info(self._analysis_desc.task_result.task_status)\n elog_data: Dict[str, str] = {\n f\"{self._analysis_desc.task_result.task_name} status\": \"COMPLETED\",\n }\n post_elog_run_status(elog_data)\n\n self.add_hook(\"task_result\", task_result)\n
"},{"location":"source/execution/executor/#execution.executor.MPIExecutor","title":"MPIExecutor
","text":" Bases: Executor
Runs first-party Tasks that require MPI.
This Executor is otherwise identical to the standard Executor, except it uses mpirun
for Task
submission. Currently this Executor assumes a job has been submitted using SLURM as a first step. It will determine the number of MPI ranks based on the resources requested. As a fallback, it will try to determine the number of local cores available for cases where a job has not been submitted via SLURM. On S3DF, the second determination mechanism should accurately match the environment variable provided by SLURM indicating resources allocated.
This Executor will submit the Task to run with a number of processes equal to the total number of cores available minus 1. A single core is reserved for the Executor itself. Note that currently this means that you must submit on 3 cores or more, since MPI requires a minimum of 2 ranks, and the number of ranks is determined from the cores dedicated to Task execution.
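A sketch of declaring an MPI-based managed Task, mirroring the PeakFinderPyAlgos pattern above; the import path and Task name are assumptions:
from lute.execution.executor import MPIExecutor  # assumed import path

MyMPITaskRunner: MPIExecutor = MPIExecutor("MyMPITask")  # hypothetical first-party MPI Task
# E.g., with a 5-core SLURM allocation (SLURM_NPROCS=5) the Task runs under mpirun -np 4;
# one core is reserved for the Executor itself.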
Methods:
Name Description_submit_cmd
Run the task as a subprocess using mpirun
.
lute/execution/executor.py
class MPIExecutor(Executor):\n \"\"\"Runs first-party Tasks that require MPI.\n\n This Executor is otherwise identical to the standard Executor, except it\n uses `mpirun` for `Task` submission. Currently this Executor assumes a job\n has been submitted using SLURM as a first step. It will determine the number\n of MPI ranks based on the resources requested. As a fallback, it will try\n to determine the number of local cores available for cases where a job has\n not been submitted via SLURM. On S3DF, the second determination mechanism\n should accurately match the environment variable provided by SLURM indicating\n resources allocated.\n\n This Executor will submit the Task to run with a number of processes equal\n to the total number of cores available minus 1. A single core is reserved\n for the Executor itself. Note that currently this means that you must submit\n on 3 cores or more, since MPI requires a minimum of 2 ranks, and the number\n of ranks is determined from the cores dedicated to Task execution.\n\n Methods:\n _submit_cmd: Run the task as a subprocess using `mpirun`.\n \"\"\"\n\n def _submit_cmd(self, executable_path: str, params: str) -> str:\n \"\"\"Override submission command to use `mpirun`\n\n Args:\n executable_path (str): Path to the LUTE subprocess script.\n\n params (str): String of formatted command-line arguments.\n\n Returns:\n cmd (str): Appropriately formatted command for this Executor.\n \"\"\"\n py_cmd: str = \"\"\n nprocs: int = max(\n int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1\n )\n mpi_cmd: str = f\"mpirun -np {nprocs}\"\n if __debug__:\n py_cmd = f\"python -B -u -m mpi4py.run {executable_path} {params}\"\n else:\n py_cmd = f\"python -OB -u -m mpi4py.run {executable_path} {params}\"\n\n cmd: str = f\"{mpi_cmd} {py_cmd}\"\n return cmd\n
"},{"location":"source/execution/executor/#execution.executor.Party","title":"Party
","text":" Bases: Enum
Identifier for which party (side/end) is using a communicator.
For some types of communication streams there may be different interfaces depending on which side of the communicator you are on. This enum is used by the communicator to determine which interface to use.
Source code inlute/execution/ipc.py
class Party(Enum):\n \"\"\"Identifier for which party (side/end) is using a communicator.\n\n For some types of communication streams there may be different interfaces\n depending on which side of the communicator you are on. This enum is used\n by the communicator to determine which interface to use.\n \"\"\"\n\n TASK = 0\n \"\"\"\n The Task (client) side.\n \"\"\"\n EXECUTOR = 1\n \"\"\"\n The Executor (server) side.\n \"\"\"\n
"},{"location":"source/execution/executor/#execution.executor.Party.EXECUTOR","title":"EXECUTOR = 1
class-attribute
instance-attribute
","text":"The Executor (server) side.
"},{"location":"source/execution/executor/#execution.executor.Party.TASK","title":"TASK = 0
class-attribute
instance-attribute
","text":"The Task (client) side.
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator","title":"PipeCommunicator
","text":" Bases: Communicator
Provides communication through pipes over stderr/stdout.
The implementation of this communicator has reading and writing occurring on stderr and stdout. In general the Task
will be writing while the Executor
will be reading. stderr
is used for sending signals.
lute/execution/ipc.py
class PipeCommunicator(Communicator):\n \"\"\"Provides communication through pipes over stderr/stdout.\n\n The implementation of this communicator has reading and writing ocurring\n on stderr and stdout. In general the `Task` will be writing while the\n `Executor` will be reading. `stderr` is used for sending signals.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n\n def _safe_unpickle_decode(self, maybe_mixed: bytes) -> Optional[str]:\n \"\"\"This method is used to unpickle and/or decode a bytes object.\n\n It attempts to handle cases where contents can be mixed, i.e., part of\n the message must be decoded and the other part unpickled. It handles\n only two-way splits. If there are more complex arrangements such as:\n <pickled>:<unpickled>:<pickled> etc, it will give up.\n\n The simpler two way splits are unlikely to occur in normal usage. They\n may arise when debugging if, e.g., `print` statements are mixed with the\n usage of the `_report_to_executor` method.\n\n Note that this method works because ONLY text data is assumed to be\n sent via the pipes. The method needs to be revised to handle non-text\n data if the `Task` is modified to also send that via PipeCommunicator.\n The use of pickle is supported to provide for this option if it is\n necessary. It may be deprecated in the future.\n\n Be careful when making changes. This method has seemingly redundant\n checks because unpickling will not throw an error if a full object can\n be retrieved. That is, the library will ignore extraneous bytes. 
This\n method attempts to retrieve that information if the pickled data comes\n first in the stream.\n\n Args:\n maybe_mixed (bytes): A bytes object which could require unpickling,\n decoding, or both.\n\n Returns:\n contents (Optional[str]): The unpickled/decoded contents if possible.\n Otherwise, None.\n \"\"\"\n contents: Optional[str]\n try:\n contents = pickle.loads(maybe_mixed)\n repickled: bytes = pickle.dumps(contents)\n if len(repickled) < len(maybe_mixed):\n # Successful unpickling, but pickle stops even if there are more bytes\n try:\n additional_data: str = maybe_mixed[len(repickled) :].decode()\n contents = f\"{contents}{additional_data}\"\n except UnicodeDecodeError:\n # Can't decode the bytes left by pickle, so they are lost\n missing_bytes: int = len(maybe_mixed) - len(repickled)\n logger.debug(\n f\"PipeCommunicator has truncated message. Unable to retrieve {missing_bytes} bytes.\"\n )\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n # Pickle may also throw a ValueError, e.g. this bytes: b\"Found! \\n\"\n # Pickle may also throw an EOFError, eg. this bytes: b\"F0\\n\"\n try:\n contents = maybe_mixed.decode()\n except UnicodeDecodeError as err2:\n try:\n contents = maybe_mixed[: err2.start].decode()\n contents = f\"{contents}{pickle.loads(maybe_mixed[err2.start:])}\"\n except Exception as err3:\n logger.debug(\n f\"PipeCommunicator unable to decode/parse data! {err3}\"\n )\n contents = None\n return contents\n\n def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC through pipes.
Arbitrary objects may be transmitted using pickle to serialize the data. If pickle is not used, data is assumed to be plain text.
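A minimal Task-side sketch; the import path is assumed from the source listing:
from lute.execution.ipc import Message, Party, PipeCommunicator  # assumed import path

comm = PipeCommunicator(Party.TASK)
# Contents are written to stdout; an optional signal would go to stderr.
comm.write(Message(contents="Processing event batch 1", signal=None))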
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using Pickle prior to sending it. If False, data is assumed to be text.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.read","title":"read(proc)
","text":"Read from stdout and stderr.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n
"},{"location":"source/execution/executor/#execution.executor.PipeCommunicator.write","title":"write(msg)
","text":"Write to stdout and stderr.
The signal component is sent to stderr
while the contents of the Message are sent to stdout
.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
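As a hedged usage sketch (assuming LUTE is importable; the child command is illustrative), the Executor-side PipeCommunicator can read a Message from any subprocess started with piped stdout and stderr. A child that prints plain text exercises the text fallback described in read() above:
import subprocess\nimport sys\n\nfrom lute.execution.ipc import Party, PipeCommunicator\n\ncomm = PipeCommunicator(party=Party.EXECUTOR)\nproc = subprocess.Popen(\n    [sys.executable, '-c', 'print(42)'],\n    stdout=subprocess.PIPE,\n    stderr=subprocess.PIPE,\n)\nproc.wait()\nmsg = comm.read(proc)   # Message with decoded text contents and no signal\nprint(msg.contents)\n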
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator","title":"SocketCommunicator
","text":" Bases: Communicator
Provides communication over Unix or TCP sockets.
Communication is provided either using sockets with the Python socket library or using ZMQ. The choice of implementation is controlled by the global bool USE_ZMQ
.
Whether to use TCP or Unix sockets is controlled by the environment variable: LUTE_USE_TCP=1
If defined, TCP sockets will be used, otherwise Unix sockets will be used.
Regardless of socket type, the environment variable LUTE_EXECUTOR_HOST=<hostname>
will be defined by the Executor-side Communicator.
For TCP sockets: The Executor-side Communicator should be run first and will bind to all interfaces on the port determined by the environment variable: LUTE_PORT=###
If no port is defined, a port scan will be performed and the Executor-side Communicator will bind to the first available port from a random selection. It will then define the environment variable so the Task-side can pick it up.
For Unix sockets: The path to the Unix socket is defined by the environment variable: LUTE_SOCKET=/path/to/socket
This class assumes proper permissions and that this above environment variable has been defined. The Task
is configured as what would commonly be referred to as the client
, while the Executor
is configured as the server.
If the Task process is run on a different machine than the Executor, the Task-side Communicator will open an SSH tunnel to forward traffic from a local Unix socket to the Executor Unix socket. Opening of the tunnel relies on the environment variable: LUTE_EXECUTOR_HOST=<hostname>
to determine the Executor's host. This variable should be defined by the Executor and passed to the Task process automatically, but it can also be defined manually if launching the Task process separately. The Task will use the local socket <LUTE_SOCKET>.task{##}
. Multiple local sockets may be created. Currently, it is assumed that the user is identical on both the Task machine and Executor machine.
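The configuration described above can be illustrated with a short, hedged sketch (the values are examples only); the transport is selected entirely through the environment before the Executor-side Communicator sets up its socket:
import os\n\n# Example values only. Either use TCP...\nos.environ['LUTE_USE_TCP'] = '1'\nos.environ['LUTE_PORT'] = '45551'     # optional; otherwise a free port is found and exported\n\n# ...or comment the two lines above and use a Unix socket instead:\n# os.environ['LUTE_SOCKET'] = '/tmp/lute_example.sock'\n\n# LUTE_EXECUTOR_HOST is defined automatically by the Executor-side Communicator.\n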
lute/execution/ipc.py
class SocketCommunicator(Communicator):\n \"\"\"Provides communication over Unix or TCP sockets.\n\n Communication is provided either using sockets with the Python socket library\n or using ZMQ. The choice of implementation is controlled by the global bool\n `USE_ZMQ`.\n\n Whether to use TCP or Unix sockets is controlled by the environment:\n `LUTE_USE_TCP=1`\n If defined, TCP sockets will be used, otherwise Unix sockets will be used.\n\n Regardless of socket type, the environment variable\n `LUTE_EXECUTOR_HOST=<hostname>`\n will be defined by the Executor-side Communicator.\n\n\n For TCP sockets:\n The Executor-side Communicator should be run first and will bind to all\n interfaces on the port determined by the environment variable:\n `LUTE_PORT=###`\n If no port is defined, a port scan will be performed and the Executor-side\n Communicator will bind the first one available from a random selection. It\n will then define the environment variable so the Task-side can pick it up.\n\n For Unix sockets:\n The path to the Unix socket is defined by the environment variable:\n `LUTE_SOCKET=/path/to/socket`\n This class assumes proper permissions and that this above environment\n variable has been defined. The `Task` is configured as what would commonly\n be referred to as the `client`, while the `Executor` is configured as the\n server.\n\n If the Task process is run on a different machine than the Executor, the\n Task-side Communicator will open a ssh-tunnel to forward traffic from a local\n Unix socket to the Executor Unix socket. Opening of the tunnel relies on the\n environment variable:\n `LUTE_EXECUTOR_HOST=<hostname>`\n to determine the Executor's host. This variable should be defined by the\n Executor and passed to the Task process automatically, but it can also be\n defined manually if launching the Task process separately. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. Currently, it is assumed that the user is identical on both the Task\n machine and Executor machine.\n \"\"\"\n\n ACCEPT_TIMEOUT: float = 0.01\n \"\"\"\n Maximum time to wait to accept connections. Used by Executor-side.\n \"\"\"\n MSG_HEAD: bytes = b\"MSG\"\n \"\"\"\n Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].\n \"\"\"\n MSG_SEP: bytes = b\";;;\"\n \"\"\"\n Separator for parts of a message. Messages have a start, length, message and end.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n\n def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. 
Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n\n # Read\n ############################################################################\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n\n def _read_socket(self) -> None:\n \"\"\"Read data from a socket.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Calls an underlying method for either raw sockets or ZMQ.\n \"\"\"\n\n while True:\n if self._stop_thread:\n logger.debug(\"Stopping socket reader thread.\")\n break\n if USE_ZMQ:\n self._read_socket_zmq()\n else:\n self._read_socket_raw()\n\n def _read_socket_raw(self) -> None:\n \"\"\"Read data from a socket.\n\n Raw socket implementation for the reader thread.\n \"\"\"\n connection: socket.socket\n addr: Union[str, Tuple[str, int]]\n try:\n connection, addr = self._data_socket.accept()\n full_data: bytes = b\"\"\n while True:\n data: bytes = connection.recv(8192)\n if data:\n full_data += data\n else:\n break\n connection.close()\n self._unpack_messages(full_data)\n except socket.timeout:\n pass\n\n def _read_socket_zmq(self) -> None:\n \"\"\"Read data from a socket.\n\n ZMQ implementation for the reader thread.\n \"\"\"\n try:\n full_data: bytes = self._data_socket.recv(0)\n self._unpack_messages(full_data)\n except zmq.ZMQError:\n pass\n\n def _unpack_messages(self, data: bytes) -> None:\n \"\"\"Unpacks a byte stream into individual messages.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n Partial messages (a series of bytes which cannot be 
converted to a full\n message) are stored for later. An attempt is made to reconstruct the\n message with the next call to this method.\n\n Args:\n data (bytes): A raw byte stream containing anywhere from a partial\n message to multiple full messages.\n \"\"\"\n msg: Message\n working_data: bytes\n if self._partial_msg:\n # Concatenate the previous partial message to the beginning\n working_data = self._partial_msg + data\n self._partial_msg = None\n else:\n working_data = data\n while working_data:\n try:\n # Message encoding: <HEAD><SEP><len><SEP><msg><SEP><HEAD[::-1]>\n end = working_data.find(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD[::-1]\n )\n msg_parts: List[bytes] = working_data[:end].split(\n SocketCommunicator.MSG_SEP\n )\n if len(msg_parts) != 3:\n self._partial_msg = working_data\n break\n\n cmd: bytes\n nbytes: bytes\n raw_msg: bytes\n cmd, nbytes, raw_msg = msg_parts\n if len(raw_msg) != int(nbytes):\n self._partial_msg = working_data\n break\n msg = pickle.loads(raw_msg)\n self._msg_queue.put(msg)\n except pickle.UnpicklingError:\n self._partial_msg = working_data\n break\n if end < len(working_data):\n # Add len(SEP+HEAD) since end marks the start of <SEP><HEAD[::-1]\n offset: int = len(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD\n )\n working_data = working_data[end + offset :]\n else:\n working_data = b\"\"\n\n # Write\n ############################################################################\n\n def _write_socket(self, msg: Message) -> None:\n \"\"\"Sends data over a socket from the 'client' (Task) side.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n This structure is used for decoding the message on the other end.\n \"\"\"\n data: bytes = pickle.dumps(msg)\n cmd: bytes = SocketCommunicator.MSG_HEAD\n size: bytes = b\"%d\" % len(data)\n end: bytes = SocketCommunicator.MSG_HEAD[::-1]\n sep: bytes = SocketCommunicator.MSG_SEP\n packed_msg: bytes = cmd + sep + size + sep + data + sep + end\n if USE_ZMQ:\n self._data_socket.send(packed_msg)\n else:\n self._data_socket.sendall(packed_msg)\n\n def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n\n # Generic create\n ############################################################################\n\n def _create_socket_raw(self) -> socket.socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): TCP or Unix socket.\n \"\"\"\n import struct\n\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n sock: socket.socket\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw TCP sockets.\")\n sock = self._init_tcp_socket_raw()\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw Unix 
sockets.\")\n sock = self._init_unix_socket_raw()\n sock.setsockopt(\n socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(\"ii\", 1, 10000)\n )\n return sock\n\n def _create_socket_zmq(self) -> zmq.sugar.socket.Socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_type: Literal[zmq.PULL, zmq.PUSH]\n if self._party == Party.EXECUTOR:\n socket_type = zmq.PULL\n else:\n socket_type = zmq.PUSH\n\n data_socket: zmq.sugar.socket.Socket = self._context.socket(socket_type)\n data_socket.set_hwm(160000)\n # Need to multiply by 1000 since ZMQ uses ms\n data_socket.setsockopt(\n zmq.RCVTIMEO, int(SocketCommunicator.ACCEPT_TIMEOUT * 1000)\n )\n # Try TCP first\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use TCP (ZMQ).\")\n self._init_tcp_socket_zmq(data_socket)\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use Unix sockets (ZMQ).\")\n self._init_unix_socket_zmq(data_socket)\n\n return data_socket\n\n # TCP Init\n ############################################################################\n\n def _find_random_port(\n self, min_port: int = 41923, max_port: int = 64324, max_tries: int = 100\n ) -> Optional[int]:\n \"\"\"Find a random open port to bind to if using TCP.\"\"\"\n from random import choices\n\n sock: socket.socket\n ports: List[int] = choices(range(min_port, max_port), k=max_tries)\n for port in ports:\n sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n try:\n sock.bind((\"\", port))\n sock.close()\n del sock\n return port\n except:\n continue\n return None\n\n def _init_tcp_socket_raw(self) -> socket.socket:\n \"\"\"Initialize a TCP socket.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_PORT=###`\n is defined, if so binds it, otherwise find a free port from a selection\n of random ports. If a port search is performed, the `LUTE_PORT` variable\n will be defined so it can be picked up by the the Task-side Communicator.\n\n In the event that no port can be bound on the Executor-side, or the port\n and hostname information is unavailable to the Task-side, the program\n will exit.\n\n Returns:\n data_socket (socket.socket): TCP socket object.\n \"\"\"\n data_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n # If port is None find one\n # Executor code executes first\n port = self._find_random_port()\n if port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n # Provide port env var for Task-side\n os.environ[\"LUTE_PORT\"] = str(port)\n data_socket.bind((\"\", int(port)))\n data_socket.listen()\n else:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! 
Exiting!\"\n )\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect((\"localhost\", int(port)))\n else:\n data_socket.connect((executor_hostname, int(port)))\n return data_socket\n\n def _init_tcp_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a TCP socket using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (zmq.socket.Socket): Socket object.\n \"\"\"\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n new_port: int = data_socket.bind_to_random_port(\"tcp://*\")\n if new_port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n port = new_port\n os.environ[\"LUTE_PORT\"] = str(port)\n else:\n data_socket.bind(f\"tcp://*:{port}\")\n logger.debug(f\"Executor bound port {port}\")\n else:\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! Exiting!\"\n )\n sys.exit(-1)\n data_socket.connect(f\"tcp://{executor_hostname}:{port}\")\n\n # Unix Init\n ############################################################################\n\n def _get_socket_path(self) -> str:\n \"\"\"Return the socket path, defining one if it is not available.\n\n Returns:\n socket_path (str): Path to the Unix socket.\n \"\"\"\n socket_path: str\n try:\n socket_path = os.environ[\"LUTE_SOCKET\"]\n except KeyError as err:\n import uuid\n import tempfile\n\n # Define a path, and add to environment\n # Executor-side always created first, Task will use the same one\n socket_path = f\"{tempfile.gettempdir()}/lute_{uuid.uuid4().hex}.sock\"\n os.environ[\"LUTE_SOCKET\"] = socket_path\n logger.debug(f\"SocketCommunicator defines socket_path: {socket_path}\")\n if USE_ZMQ:\n return f\"ipc://{socket_path}\"\n else:\n return socket_path\n\n def _init_unix_socket_raw(self) -> socket.socket:\n \"\"\"Returns a Unix socket object.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_SOCKET=XYZ`\n is defined, if so binds it, otherwise it will create a new path and\n define the environment variable for the Task-side to find.\n\n On the Task (client-side), this method will also open a SSH tunnel to\n forward a local Unix socket to an Executor Unix socket if the Task and\n Executor processes are on different machines.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_path: str = self._get_socket_path()\n data_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n data_socket.listen()\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! 
Exiting!\")\n data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path, hostname, executor_hostname\n )\n while 1:\n # Keep trying reconnect until ssh tunnel works.\n try:\n data_socket.connect(self._local_socket_path)\n break\n except FileNotFoundError:\n continue\n\n return data_socket\n\n def _init_unix_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a Unix socket object, using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (socket.socket): ZMQ object.\n \"\"\"\n socket_path = self._get_socket_path()\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! Exiting!\")\n self._data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n # Need to remove ipc:// from socket_path for forwarding\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path[6:], hostname, executor_hostname\n )\n # Need to add it back\n path: str = f\"ipc://{self._local_socket_path}\"\n data_socket.connect(path)\n\n def _setup_unix_ssh_tunnel(\n self, socket_path: str, hostname: str, executor_hostname: str\n ) -> str:\n \"\"\"Prepares an SSH tunnel for forwarding between Unix sockets on two hosts.\n\n An SSH tunnel is opened with `ssh -L <local>:<remote> sleep 2`.\n This method of communication is slightly slower and incurs additional\n overhead - it should only be used as a backup. If communication across\n multiple hosts is required consider using TCP. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. It is assumed that the user is identical on both the\n Task machine and Executor machine.\n\n Returns:\n local_socket_path (str): The local Unix socket to connect to.\n \"\"\"\n if \"uuid\" not in globals():\n import uuid\n local_socket_path = f\"{socket_path}.task{uuid.uuid4().hex[:4]}\"\n self._use_ssh_tunnel = True\n ssh_cmd: List[str] = [\n \"ssh\",\n \"-o\",\n \"LogLevel=quiet\",\n \"-L\",\n f\"{local_socket_path}:{socket_path}\",\n executor_hostname,\n \"sleep\",\n \"2\",\n ]\n logger.debug(f\"Opening tunnel from {hostname} to {executor_hostname}\")\n self._ssh_proc = subprocess.Popen(ssh_cmd)\n time.sleep(0.4) # Need to wait... 
-> Use single Task comm at beginning?\n return local_socket_path\n\n # Clean up and properties\n ############################################################################\n\n def _clean_up(self) -> None:\n \"\"\"Clean up connections.\"\"\"\n if self._party == Party.EXECUTOR:\n self._stop_thread = True\n self._reader_thread.join()\n logger.debug(\"Closed reading thread.\")\n\n self._data_socket.close()\n if USE_ZMQ:\n self._context.term()\n else:\n ...\n\n if os.getenv(\"LUTE_USE_TCP\"):\n return\n else:\n if self._party == Party.EXECUTOR:\n os.unlink(os.getenv(\"LUTE_SOCKET\")) # Should be defined\n return\n elif self._use_ssh_tunnel:\n if self._ssh_proc is not None:\n self._ssh_proc.terminate()\n\n @property\n def has_messages(self) -> bool:\n if self._party == Party.TASK:\n # Shouldn't be called on Task-side\n return False\n\n if self._msg_queue.qsize() > 0:\n return True\n return False\n\n def __exit__(self):\n self._clean_up()\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.ACCEPT_TIMEOUT","title":"ACCEPT_TIMEOUT: float = 0.01
class-attribute
instance-attribute
","text":"Maximum time to wait to accept connections. Used by Executor-side.
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.MSG_HEAD","title":"MSG_HEAD: bytes = b'MSG'
class-attribute
instance-attribute
","text":"Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.MSG_SEP","title":"MSG_SEP: bytes = b';;;'
class-attribute
instance-attribute
","text":"Separator for parts of a message. Messages have a start, length, message and end.
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC over a TCP or Unix socket.
Unlike with the PipeCommunicator, pickle is always used to send data through the socket.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to use pickle. Always True currently; passing False does not change behaviour.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.delayed_setup","title":"delayed_setup()
","text":"Delays the creation of socket objects.
The Executor initializes the Communicator when it is created. Since all Executors are created and available at once we want to delay acquisition of socket resources until a single Executor is ready to use them.
Source code inlute/execution/ipc.py
def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.read","title":"read(proc)
","text":"Return a message from the queue if available.
Socket(s) are continuously monitored, and read from when new data is available.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from. Provided for compatibility with other Communicator subtypes. Is ignored.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n
"},{"location":"source/execution/executor/#execution.executor.SocketCommunicator.write","title":"write(msg)
","text":"Send a single Message.
The entire Message (signal and contents) is serialized and sent through a connection over a Unix socket.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n
"},{"location":"source/execution/ipc/","title":"ipc","text":"Classes and utilities for communication between Executors and subprocesses.
Communicators manage message passing and parsing between subprocesses. They maintain a limited public interface of \"read\" and \"write\" operations. Behind this interface, the method of communication varies, from serialization across pipes to Unix or TCP sockets. All communicators pass a single object called a \"Message\", which contains an arbitrary \"contents\" field as well as an optional \"signal\" field. A minimal usage sketch follows the class list below.
Classes:
Name DescriptionParty
Enum describing whether Communicator is on Task-side or Executor-side.
Message
A dataclass used for passing information from Task to Executor.
Communicator
Abstract base class for Communicator types.
PipeCommunicator
Manages communication between Task and Executor via pipes (stderr and stdout).
SocketCommunicator
Manages communication using sockets, either raw or using zmq. Supports both TCP and Unix sockets.
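A minimal usage sketch of this shared interface (assuming LUTE is importable; the contents string is illustrative):
from lute.execution.ipc import Message, Party, PipeCommunicator\n\n# Every Communicator exchanges Message objects; the signal field is optional.\nmsg = Message(contents='Processing complete')\ncommunicator = PipeCommunicator(party=Party.TASK)\ncommunicator.write(msg)   # contents are written to stdout, any signal to stderr\n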
"},{"location":"source/execution/ipc/#execution.ipc.Communicator","title":"Communicator
","text":" Bases: ABC
lute/execution/ipc.py
class Communicator(ABC):\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n\n @abstractmethod\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n\n @abstractmethod\n def write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n\n def __str__(self):\n name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n return f\"{name}: {self.desc}\"\n\n def __repr__(self):\n return self.__str__()\n\n def __enter__(self) -> Self:\n return self\n\n def __exit__(self) -> None: ...\n\n @property\n def has_messages(self) -> bool:\n \"\"\"Whether the Communicator has remaining messages.\n\n The precise method for determining whether there are remaining messages\n will depend on the specific Communicator sub-class.\n \"\"\"\n return False\n\n def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n\n def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n\n def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
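Concrete subtypes only need to supply the two abstract methods. The following hypothetical subclass (not part of LUTE) keeps Messages in memory purely to illustrate the required interface:
import subprocess\nfrom typing import List\n\nfrom lute.execution.ipc import Communicator, Message, Party\n\n\nclass InMemoryCommunicator(Communicator):\n    def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n        super().__init__(party=party, use_pickle=use_pickle)\n        self.desc = 'Stores Messages in a list instead of using real IPC (illustration only).'\n        self._buffer: List[Message] = []\n\n    def write(self, msg: Message) -> None:\n        self._buffer.append(msg)\n\n    def read(self, proc: subprocess.Popen) -> Message:\n        return self._buffer.pop(0) if self._buffer else Message()\n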
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.has_messages","title":"has_messages: bool
property
","text":"Whether the Communicator has remaining messages.
The precise method for determining whether there are remaining messages will depend on the specific Communicator sub-class.
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"Abstract Base Class for IPC Communicator objects.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using pickle prior to sending it.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"Abstract Base Class for IPC Communicator objects.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using pickle prior to\n sending it.\n \"\"\"\n self._party = party\n self._use_pickle = use_pickle\n self.desc = \"Communicator abstract base class.\"\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.clear_communicator","title":"clear_communicator()
","text":"Alternative exit method outside of context manager.
Source code inlute/execution/ipc.py
def clear_communicator(self):\n \"\"\"Alternative exit method outside of context manager.\"\"\"\n self.__exit__()\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.delayed_setup","title":"delayed_setup()
","text":"Any setup that should be done later than init.
Source code inlute/execution/ipc.py
def delayed_setup(self):\n \"\"\"Any setup that should be done later than init.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.read","title":"read(proc)
abstractmethod
","text":"Method for reading data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Method for reading data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.stage_communicator","title":"stage_communicator()
","text":"Alternative method for staging outside of context manager.
Source code inlute/execution/ipc.py
def stage_communicator(self):\n \"\"\"Alternative method for staging outside of context manager.\"\"\"\n self.__enter__()\n
"},{"location":"source/execution/ipc/#execution.ipc.Communicator.write","title":"write(msg)
abstractmethod
","text":"Method for sending data through the communication mechanism.
Source code inlute/execution/ipc.py
@abstractmethod\ndef write(self, msg: Message) -> None:\n \"\"\"Method for sending data through the communication mechanism.\"\"\"\n ...\n
"},{"location":"source/execution/ipc/#execution.ipc.Party","title":"Party
","text":" Bases: Enum
Identifier for which party (side/end) is using a communicator.
For some types of communication streams there may be different interfaces depending on which side of the communicator you are on. This enum is used by the communicator to determine which interface to use.
Source code inlute/execution/ipc.py
class Party(Enum):\n \"\"\"Identifier for which party (side/end) is using a communicator.\n\n For some types of communication streams there may be different interfaces\n depending on which side of the communicator you are on. This enum is used\n by the communicator to determine which interface to use.\n \"\"\"\n\n TASK = 0\n \"\"\"\n The Task (client) side.\n \"\"\"\n EXECUTOR = 1\n \"\"\"\n The Executor (server) side.\n \"\"\"\n
"},{"location":"source/execution/ipc/#execution.ipc.Party.EXECUTOR","title":"EXECUTOR = 1
class-attribute
instance-attribute
","text":"The Executor (server) side.
"},{"location":"source/execution/ipc/#execution.ipc.Party.TASK","title":"TASK = 0
class-attribute
instance-attribute
","text":"The Task (client) side.
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator","title":"PipeCommunicator
","text":" Bases: Communicator
Provides communication through pipes over stderr/stdout.
The implementation of this communicator has reading and writing occurring on stderr and stdout. In general, the Task
will be writing while the Executor
will be reading. stderr
is used for sending signals.
lute/execution/ipc.py
class PipeCommunicator(Communicator):\n \"\"\"Provides communication through pipes over stderr/stdout.\n\n The implementation of this communicator has reading and writing ocurring\n on stderr and stdout. In general the `Task` will be writing while the\n `Executor` will be reading. `stderr` is used for sending signals.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n\n def _safe_unpickle_decode(self, maybe_mixed: bytes) -> Optional[str]:\n \"\"\"This method is used to unpickle and/or decode a bytes object.\n\n It attempts to handle cases where contents can be mixed, i.e., part of\n the message must be decoded and the other part unpickled. It handles\n only two-way splits. If there are more complex arrangements such as:\n <pickled>:<unpickled>:<pickled> etc, it will give up.\n\n The simpler two way splits are unlikely to occur in normal usage. They\n may arise when debugging if, e.g., `print` statements are mixed with the\n usage of the `_report_to_executor` method.\n\n Note that this method works because ONLY text data is assumed to be\n sent via the pipes. The method needs to be revised to handle non-text\n data if the `Task` is modified to also send that via PipeCommunicator.\n The use of pickle is supported to provide for this option if it is\n necessary. It may be deprecated in the future.\n\n Be careful when making changes. This method has seemingly redundant\n checks because unpickling will not throw an error if a full object can\n be retrieved. That is, the library will ignore extraneous bytes. 
This\n method attempts to retrieve that information if the pickled data comes\n first in the stream.\n\n Args:\n maybe_mixed (bytes): A bytes object which could require unpickling,\n decoding, or both.\n\n Returns:\n contents (Optional[str]): The unpickled/decoded contents if possible.\n Otherwise, None.\n \"\"\"\n contents: Optional[str]\n try:\n contents = pickle.loads(maybe_mixed)\n repickled: bytes = pickle.dumps(contents)\n if len(repickled) < len(maybe_mixed):\n # Successful unpickling, but pickle stops even if there are more bytes\n try:\n additional_data: str = maybe_mixed[len(repickled) :].decode()\n contents = f\"{contents}{additional_data}\"\n except UnicodeDecodeError:\n # Can't decode the bytes left by pickle, so they are lost\n missing_bytes: int = len(maybe_mixed) - len(repickled)\n logger.debug(\n f\"PipeCommunicator has truncated message. Unable to retrieve {missing_bytes} bytes.\"\n )\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n # Pickle may also throw a ValueError, e.g. this bytes: b\"Found! \\n\"\n # Pickle may also throw an EOFError, eg. this bytes: b\"F0\\n\"\n try:\n contents = maybe_mixed.decode()\n except UnicodeDecodeError as err2:\n try:\n contents = maybe_mixed[: err2.start].decode()\n contents = f\"{contents}{pickle.loads(maybe_mixed[err2.start:])}\"\n except Exception as err3:\n logger.debug(\n f\"PipeCommunicator unable to decode/parse data! {err3}\"\n )\n contents = None\n return contents\n\n def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC through pipes.
Arbitrary objects may be transmitted using pickle to serialize the data. If pickle is not used, only text (str) contents may be sent; other content types will raise a ValueError.
Parameters:
Name Type Description Defaultparty
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to serialize data using Pickle prior to sending it. If False, data is assumed to be text which is written directly without serialization.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC through pipes.\n\n Arbitrary objects may be transmitted using pickle to serialize the data.\n If pickle is not used\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n use_pickle (bool): Whether to serialize data using Pickle prior to\n sending it. If False, data is assumed to be text whi\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n self.desc = \"Communicates through stderr and stdout using pickle.\"\n
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.read","title":"read(proc)
","text":"Read from stdout and stderr.
Parameters:
Name Type Description Defaultproc
Popen
The process to read from.
requiredReturns:
Name Type Descriptionmsg
Message
The message read, containing contents and signal.
Source code inlute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Read from stdout and stderr.\n\n Args:\n proc (subprocess.Popen): The process to read from.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n signal: Optional[str]\n contents: Optional[str]\n raw_signal: bytes = proc.stderr.read()\n raw_contents: bytes = proc.stdout.read()\n if raw_signal is not None:\n signal = raw_signal.decode()\n else:\n signal = raw_signal\n if raw_contents:\n if self._use_pickle:\n try:\n contents = pickle.loads(raw_contents)\n except (pickle.UnpicklingError, ValueError, EOFError) as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=False\")\n self._use_pickle = False\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n try:\n contents = raw_contents.decode()\n except UnicodeDecodeError as err:\n logger.debug(\"PipeCommunicator (Executor) - Set _use_pickle=True\")\n self._use_pickle = True\n contents = self._safe_unpickle_decode(raw_contents)\n else:\n contents = None\n\n if signal and signal not in LUTE_SIGNALS:\n # Some tasks write on stderr\n # If the signal channel has \"non-signal\" info, add it to\n # contents\n if not contents:\n contents = f\"({signal})\"\n else:\n contents = f\"{contents} ({signal})\"\n signal = None\n\n return Message(contents=contents, signal=signal)\n
"},{"location":"source/execution/ipc/#execution.ipc.PipeCommunicator.write","title":"write(msg)
","text":"Write to stdout and stderr.
The signal component is sent to stderr
while the contents of the Message are sent to stdout
.
Parameters:
Name Type Description Defaultmsg
Message
The Message to send.
required Source code inlute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Write to stdout and stderr.\n\n The signal component is sent to `stderr` while the contents of the\n Message are sent to `stdout`.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n if self._use_pickle:\n signal: bytes\n if msg.signal:\n signal = msg.signal.encode()\n else:\n signal = b\"\"\n\n contents: bytes = pickle.dumps(msg.contents)\n\n sys.stderr.buffer.write(signal)\n sys.stdout.buffer.write(contents)\n\n sys.stderr.buffer.flush()\n sys.stdout.buffer.flush()\n else:\n raw_signal: str\n if msg.signal:\n raw_signal = msg.signal\n else:\n raw_signal = \"\"\n\n raw_contents: str\n if isinstance(msg.contents, str):\n raw_contents = msg.contents\n elif msg.contents is None:\n raw_contents = \"\"\n else:\n raise ValueError(\n f\"Cannot send msg contents of type: {type(msg.contents)} when not using pickle!\"\n )\n sys.stderr.write(raw_signal)\n sys.stdout.write(raw_contents)\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator","title":"SocketCommunicator
","text":" Bases: Communicator
Provides communication over Unix or TCP sockets.
Communication is provided either using sockets with the Python socket library or using ZMQ. The choice of implementation is controlled by the global bool USE_ZMQ
.
Whether to use TCP or Unix sockets is controlled by the environment variable: LUTE_USE_TCP=1
If defined, TCP sockets will be used, otherwise Unix sockets will be used.
Regardless of socket type, the environment variable LUTE_EXECUTOR_HOST=<hostname>
will be defined by the Executor-side Communicator.
For TCP sockets: The Executor-side Communicator should be run first and will bind to all interfaces on the port determined by the environment variable: LUTE_PORT=###
If no port is defined, a port scan will be performed and the Executor-side Communicator will bind to the first available port from a random selection. It will then define the environment variable so the Task-side can pick it up.
For Unix sockets: The path to the Unix socket is defined by the environment variable: LUTE_SOCKET=/path/to/socket
This class assumes proper permissions and that this above environment variable has been defined. The Task
is configured as what would commonly be referred to as the client
, while the Executor
is configured as the server.
If the Task process is run on a different machine than the Executor, the Task-side Communicator will open an SSH tunnel to forward traffic from a local Unix socket to the Executor Unix socket. Opening of the tunnel relies on the environment variable: LUTE_EXECUTOR_HOST=<hostname>
to determine the Executor's host. This variable should be defined by the Executor and passed to the Task process automatically, but it can also be defined manually if launching the Task process separately. The Task will use the local socket <LUTE_SOCKET>.task{##}
. Multiple local sockets may be created. Currently, it is assumed that the user is identical on both the Task machine and Executor machine.
lute/execution/ipc.py
class SocketCommunicator(Communicator):\n \"\"\"Provides communication over Unix or TCP sockets.\n\n Communication is provided either using sockets with the Python socket library\n or using ZMQ. The choice of implementation is controlled by the global bool\n `USE_ZMQ`.\n\n Whether to use TCP or Unix sockets is controlled by the environment:\n `LUTE_USE_TCP=1`\n If defined, TCP sockets will be used, otherwise Unix sockets will be used.\n\n Regardless of socket type, the environment variable\n `LUTE_EXECUTOR_HOST=<hostname>`\n will be defined by the Executor-side Communicator.\n\n\n For TCP sockets:\n The Executor-side Communicator should be run first and will bind to all\n interfaces on the port determined by the environment variable:\n `LUTE_PORT=###`\n If no port is defined, a port scan will be performed and the Executor-side\n Communicator will bind the first one available from a random selection. It\n will then define the environment variable so the Task-side can pick it up.\n\n For Unix sockets:\n The path to the Unix socket is defined by the environment variable:\n `LUTE_SOCKET=/path/to/socket`\n This class assumes proper permissions and that this above environment\n variable has been defined. The `Task` is configured as what would commonly\n be referred to as the `client`, while the `Executor` is configured as the\n server.\n\n If the Task process is run on a different machine than the Executor, the\n Task-side Communicator will open a ssh-tunnel to forward traffic from a local\n Unix socket to the Executor Unix socket. Opening of the tunnel relies on the\n environment variable:\n `LUTE_EXECUTOR_HOST=<hostname>`\n to determine the Executor's host. This variable should be defined by the\n Executor and passed to the Task process automatically, but it can also be\n defined manually if launching the Task process separately. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. Currently, it is assumed that the user is identical on both the Task\n machine and Executor machine.\n \"\"\"\n\n ACCEPT_TIMEOUT: float = 0.01\n \"\"\"\n Maximum time to wait to accept connections. Used by Executor-side.\n \"\"\"\n MSG_HEAD: bytes = b\"MSG\"\n \"\"\"\n Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].\n \"\"\"\n MSG_SEP: bytes = b\";;;\"\n \"\"\"\n Separator for parts of a message. Messages have a start, length, message and end.\n \"\"\"\n\n def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n\n def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. 
Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n\n # Read\n ############################################################################\n\n def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n\n def _read_socket(self) -> None:\n \"\"\"Read data from a socket.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Calls an underlying method for either raw sockets or ZMQ.\n \"\"\"\n\n while True:\n if self._stop_thread:\n logger.debug(\"Stopping socket reader thread.\")\n break\n if USE_ZMQ:\n self._read_socket_zmq()\n else:\n self._read_socket_raw()\n\n def _read_socket_raw(self) -> None:\n \"\"\"Read data from a socket.\n\n Raw socket implementation for the reader thread.\n \"\"\"\n connection: socket.socket\n addr: Union[str, Tuple[str, int]]\n try:\n connection, addr = self._data_socket.accept()\n full_data: bytes = b\"\"\n while True:\n data: bytes = connection.recv(8192)\n if data:\n full_data += data\n else:\n break\n connection.close()\n self._unpack_messages(full_data)\n except socket.timeout:\n pass\n\n def _read_socket_zmq(self) -> None:\n \"\"\"Read data from a socket.\n\n ZMQ implementation for the reader thread.\n \"\"\"\n try:\n full_data: bytes = self._data_socket.recv(0)\n self._unpack_messages(full_data)\n except zmq.ZMQError:\n pass\n\n def _unpack_messages(self, data: bytes) -> None:\n \"\"\"Unpacks a byte stream into individual messages.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n Partial messages (a series of bytes which cannot be 
converted to a full\n message) are stored for later. An attempt is made to reconstruct the\n message with the next call to this method.\n\n Args:\n data (bytes): A raw byte stream containing anywhere from a partial\n message to multiple full messages.\n \"\"\"\n msg: Message\n working_data: bytes\n if self._partial_msg:\n # Concatenate the previous partial message to the beginning\n working_data = self._partial_msg + data\n self._partial_msg = None\n else:\n working_data = data\n while working_data:\n try:\n # Message encoding: <HEAD><SEP><len><SEP><msg><SEP><HEAD[::-1]>\n end = working_data.find(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD[::-1]\n )\n msg_parts: List[bytes] = working_data[:end].split(\n SocketCommunicator.MSG_SEP\n )\n if len(msg_parts) != 3:\n self._partial_msg = working_data\n break\n\n cmd: bytes\n nbytes: bytes\n raw_msg: bytes\n cmd, nbytes, raw_msg = msg_parts\n if len(raw_msg) != int(nbytes):\n self._partial_msg = working_data\n break\n msg = pickle.loads(raw_msg)\n self._msg_queue.put(msg)\n except pickle.UnpicklingError:\n self._partial_msg = working_data\n break\n if end < len(working_data):\n # Add len(SEP+HEAD) since end marks the start of <SEP><HEAD[::-1]\n offset: int = len(\n SocketCommunicator.MSG_SEP + SocketCommunicator.MSG_HEAD\n )\n working_data = working_data[end + offset :]\n else:\n working_data = b\"\"\n\n # Write\n ############################################################################\n\n def _write_socket(self, msg: Message) -> None:\n \"\"\"Sends data over a socket from the 'client' (Task) side.\n\n Messages are encoded in the following format:\n <HEAD><SEP><len(msg)><SEP><msg><SEP><HEAD[::-1]>\n The items between <> are replaced as follows:\n - <HEAD>: A start marker\n - <SEP>: A separator for components of the message\n - <len(msg)>: The length of the message payload in bytes.\n - <msg>: The message payload in bytes\n - <HEAD[::-1]>: The start marker in reverse to indicate the end.\n\n This structure is used for decoding the message on the other end.\n \"\"\"\n data: bytes = pickle.dumps(msg)\n cmd: bytes = SocketCommunicator.MSG_HEAD\n size: bytes = b\"%d\" % len(data)\n end: bytes = SocketCommunicator.MSG_HEAD[::-1]\n sep: bytes = SocketCommunicator.MSG_SEP\n packed_msg: bytes = cmd + sep + size + sep + data + sep + end\n if USE_ZMQ:\n self._data_socket.send(packed_msg)\n else:\n self._data_socket.sendall(packed_msg)\n\n def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n\n # Generic create\n ############################################################################\n\n def _create_socket_raw(self) -> socket.socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): TCP or Unix socket.\n \"\"\"\n import struct\n\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n sock: socket.socket\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw TCP sockets.\")\n sock = self._init_tcp_socket_raw()\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use raw Unix 
sockets.\")\n sock = self._init_unix_socket_raw()\n sock.setsockopt(\n socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(\"ii\", 1, 10000)\n )\n return sock\n\n def _create_socket_zmq(self) -> zmq.sugar.socket.Socket:\n \"\"\"Create either a Unix or TCP socket.\n\n If the environment variable:\n `LUTE_USE_TCP=1`\n is defined, a TCP socket is returned, otherwise a Unix socket.\n\n Refer to the individual initialization methods for additional environment\n variables controlling the behaviour of these two communication types.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_type: Literal[zmq.PULL, zmq.PUSH]\n if self._party == Party.EXECUTOR:\n socket_type = zmq.PULL\n else:\n socket_type = zmq.PUSH\n\n data_socket: zmq.sugar.socket.Socket = self._context.socket(socket_type)\n data_socket.set_hwm(160000)\n # Need to multiply by 1000 since ZMQ uses ms\n data_socket.setsockopt(\n zmq.RCVTIMEO, int(SocketCommunicator.ACCEPT_TIMEOUT * 1000)\n )\n # Try TCP first\n use_tcp: Optional[str] = os.getenv(\"LUTE_USE_TCP\")\n if use_tcp is not None:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use TCP (ZMQ).\")\n self._init_tcp_socket_zmq(data_socket)\n else:\n if self._party == Party.EXECUTOR:\n logger.info(\"Will use Unix sockets (ZMQ).\")\n self._init_unix_socket_zmq(data_socket)\n\n return data_socket\n\n # TCP Init\n ############################################################################\n\n def _find_random_port(\n self, min_port: int = 41923, max_port: int = 64324, max_tries: int = 100\n ) -> Optional[int]:\n \"\"\"Find a random open port to bind to if using TCP.\"\"\"\n from random import choices\n\n sock: socket.socket\n ports: List[int] = choices(range(min_port, max_port), k=max_tries)\n for port in ports:\n sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n try:\n sock.bind((\"\", port))\n sock.close()\n del sock\n return port\n except:\n continue\n return None\n\n def _init_tcp_socket_raw(self) -> socket.socket:\n \"\"\"Initialize a TCP socket.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_PORT=###`\n is defined, if so binds it, otherwise find a free port from a selection\n of random ports. If a port search is performed, the `LUTE_PORT` variable\n will be defined so it can be picked up by the the Task-side Communicator.\n\n In the event that no port can be bound on the Executor-side, or the port\n and hostname information is unavailable to the Task-side, the program\n will exit.\n\n Returns:\n data_socket (socket.socket): TCP socket object.\n \"\"\"\n data_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n # If port is None find one\n # Executor code executes first\n port = self._find_random_port()\n if port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n # Provide port env var for Task-side\n os.environ[\"LUTE_PORT\"] = str(port)\n data_socket.bind((\"\", int(port)))\n data_socket.listen()\n else:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! 
Exiting!\"\n )\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect((\"localhost\", int(port)))\n else:\n data_socket.connect((executor_hostname, int(port)))\n return data_socket\n\n def _init_tcp_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a TCP socket using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (zmq.socket.Socket): Socket object.\n \"\"\"\n port: Optional[Union[str, int]] = os.getenv(\"LUTE_PORT\")\n if self._party == Party.EXECUTOR:\n if port is None:\n new_port: int = data_socket.bind_to_random_port(\"tcp://*\")\n if new_port is None:\n # Failed to find a port to bind\n logger.info(\n \"Executor failed to bind a port. \"\n \"Try providing a LUTE_PORT directly! Exiting!\"\n )\n sys.exit(-1)\n port = new_port\n os.environ[\"LUTE_PORT\"] = str(port)\n else:\n data_socket.bind(f\"tcp://*:{port}\")\n logger.debug(f\"Executor bound port {port}\")\n else:\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None or port is None:\n logger.info(\n \"Task-side does not have host/port information!\"\n \" Check environment variables! Exiting!\"\n )\n sys.exit(-1)\n data_socket.connect(f\"tcp://{executor_hostname}:{port}\")\n\n # Unix Init\n ############################################################################\n\n def _get_socket_path(self) -> str:\n \"\"\"Return the socket path, defining one if it is not available.\n\n Returns:\n socket_path (str): Path to the Unix socket.\n \"\"\"\n socket_path: str\n try:\n socket_path = os.environ[\"LUTE_SOCKET\"]\n except KeyError as err:\n import uuid\n import tempfile\n\n # Define a path, and add to environment\n # Executor-side always created first, Task will use the same one\n socket_path = f\"{tempfile.gettempdir()}/lute_{uuid.uuid4().hex}.sock\"\n os.environ[\"LUTE_SOCKET\"] = socket_path\n logger.debug(f\"SocketCommunicator defines socket_path: {socket_path}\")\n if USE_ZMQ:\n return f\"ipc://{socket_path}\"\n else:\n return socket_path\n\n def _init_unix_socket_raw(self) -> socket.socket:\n \"\"\"Returns a Unix socket object.\n\n Executor-side code should always be run first. It checks to see if\n the environment variable\n `LUTE_SOCKET=XYZ`\n is defined, if so binds it, otherwise it will create a new path and\n define the environment variable for the Task-side to find.\n\n On the Task (client-side), this method will also open a SSH tunnel to\n forward a local Unix socket to an Executor Unix socket if the Task and\n Executor processes are on different machines.\n\n Returns:\n data_socket (socket.socket): Unix socket object.\n \"\"\"\n socket_path: str = self._get_socket_path()\n data_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n data_socket.listen()\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! 
Exiting!\")\n data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path, hostname, executor_hostname\n )\n while 1:\n # Keep trying reconnect until ssh tunnel works.\n try:\n data_socket.connect(self._local_socket_path)\n break\n except FileNotFoundError:\n continue\n\n return data_socket\n\n def _init_unix_socket_zmq(self, data_socket: zmq.sugar.socket.Socket) -> None:\n \"\"\"Initialize a Unix socket object, using ZMQ.\n\n Equivalent as the method above but requires passing in a ZMQ socket\n object instead of returning one.\n\n Args:\n data_socket (socket.socket): ZMQ object.\n \"\"\"\n socket_path = self._get_socket_path()\n if self._party == Party.EXECUTOR:\n if os.path.exists(socket_path):\n os.unlink(socket_path)\n data_socket.bind(socket_path)\n elif self._party == Party.TASK:\n hostname: str = socket.gethostname()\n executor_hostname: Optional[str] = os.getenv(\"LUTE_EXECUTOR_HOST\")\n if executor_hostname is None:\n logger.info(\"Hostname for Executor process not found! Exiting!\")\n self._data_socket.close()\n sys.exit(-1)\n if hostname == executor_hostname:\n data_socket.connect(socket_path)\n else:\n # Need to remove ipc:// from socket_path for forwarding\n self._local_socket_path = self._setup_unix_ssh_tunnel(\n socket_path[6:], hostname, executor_hostname\n )\n # Need to add it back\n path: str = f\"ipc://{self._local_socket_path}\"\n data_socket.connect(path)\n\n def _setup_unix_ssh_tunnel(\n self, socket_path: str, hostname: str, executor_hostname: str\n ) -> str:\n \"\"\"Prepares an SSH tunnel for forwarding between Unix sockets on two hosts.\n\n An SSH tunnel is opened with `ssh -L <local>:<remote> sleep 2`.\n This method of communication is slightly slower and incurs additional\n overhead - it should only be used as a backup. If communication across\n multiple hosts is required consider using TCP. The Task will use\n the local socket `<LUTE_SOCKET>.task{##}`. Multiple local sockets may be\n created. It is assumed that the user is identical on both the\n Task machine and Executor machine.\n\n Returns:\n local_socket_path (str): The local Unix socket to connect to.\n \"\"\"\n if \"uuid\" not in globals():\n import uuid\n local_socket_path = f\"{socket_path}.task{uuid.uuid4().hex[:4]}\"\n self._use_ssh_tunnel = True\n ssh_cmd: List[str] = [\n \"ssh\",\n \"-o\",\n \"LogLevel=quiet\",\n \"-L\",\n f\"{local_socket_path}:{socket_path}\",\n executor_hostname,\n \"sleep\",\n \"2\",\n ]\n logger.debug(f\"Opening tunnel from {hostname} to {executor_hostname}\")\n self._ssh_proc = subprocess.Popen(ssh_cmd)\n time.sleep(0.4) # Need to wait... 
-> Use single Task comm at beginning?\n return local_socket_path\n\n # Clean up and properties\n ############################################################################\n\n def _clean_up(self) -> None:\n \"\"\"Clean up connections.\"\"\"\n if self._party == Party.EXECUTOR:\n self._stop_thread = True\n self._reader_thread.join()\n logger.debug(\"Closed reading thread.\")\n\n self._data_socket.close()\n if USE_ZMQ:\n self._context.term()\n else:\n ...\n\n if os.getenv(\"LUTE_USE_TCP\"):\n return\n else:\n if self._party == Party.EXECUTOR:\n os.unlink(os.getenv(\"LUTE_SOCKET\")) # Should be defined\n return\n elif self._use_ssh_tunnel:\n if self._ssh_proc is not None:\n self._ssh_proc.terminate()\n\n @property\n def has_messages(self) -> bool:\n if self._party == Party.TASK:\n # Shouldn't be called on Task-side\n return False\n\n if self._msg_queue.qsize() > 0:\n return True\n return False\n\n def __exit__(self):\n self._clean_up()\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.ACCEPT_TIMEOUT","title":"ACCEPT_TIMEOUT: float = 0.01
class-attribute
instance-attribute
","text":"Maximum time to wait to accept connections. Used by Executor-side.
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.MSG_HEAD","title":"MSG_HEAD: bytes = b'MSG'
class-attribute
instance-attribute
","text":"Start signal of a message. The end of a message is indicated by MSG_HEAD[::-1].
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.MSG_SEP","title":"MSG_SEP: bytes = b';;;'
class-attribute
instance-attribute
","text":"Separator for parts of a message. Messages have a start, length, message and end.
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.__init__","title":"__init__(party=Party.TASK, use_pickle=True)
","text":"IPC over a TCP or Unix socket.
Unlike the PipeCommunicator, pickle is always used to send data through the socket.

Parameters:
Name Type Description Default party
Party
Which object (side/process) the Communicator is managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.
TASK
use_pickle
bool
Whether to use pickle. Always True currently, passing False does not change behaviour.
True
Source code in lute/execution/ipc.py
def __init__(self, party: Party = Party.TASK, use_pickle: bool = True) -> None:\n \"\"\"IPC over a TCP or Unix socket.\n\n Unlike with the PipeCommunicator, pickle is always used to send data\n through the socket.\n\n Args:\n party (Party): Which object (side/process) the Communicator is\n managing IPC for. I.e., is this the \"Task\" or \"Executor\" side.\n\n use_pickle (bool): Whether to use pickle. Always True currently,\n passing False does not change behaviour.\n \"\"\"\n super().__init__(party=party, use_pickle=use_pickle)\n
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.delayed_setup","title":"delayed_setup()
","text":"Delays the creation of socket objects.
The Executor initializes the Communicator when it is created. Since all Executors are created and available at once, we want to delay acquisition of socket resources until a single Executor is ready to use them.
Source code in lute/execution/ipc.py
def delayed_setup(self) -> None:\n \"\"\"Delays the creation of socket objects.\n\n The Executor initializes the Communicator when it is created. Since\n all Executors are created and available at once we want to delay\n acquisition of socket resources until a single Executor is ready\n to use them.\n \"\"\"\n self._data_socket: Union[socket.socket, zmq.sugar.socket.Socket]\n if USE_ZMQ:\n self.desc: str = \"Communicates using ZMQ through TCP or Unix sockets.\"\n self._context: zmq.context.Context = zmq.Context()\n self._data_socket = self._create_socket_zmq()\n else:\n self.desc: str = \"Communicates through a TCP or Unix socket.\"\n self._data_socket = self._create_socket_raw()\n self._data_socket.settimeout(SocketCommunicator.ACCEPT_TIMEOUT)\n\n if self._party == Party.EXECUTOR:\n # Executor created first so we can define the hostname env variable\n os.environ[\"LUTE_EXECUTOR_HOST\"] = socket.gethostname()\n # Setup reader thread\n self._reader_thread: threading.Thread = threading.Thread(\n target=self._read_socket\n )\n self._msg_queue: queue.Queue = queue.Queue()\n self._partial_msg: Optional[bytes] = None\n self._stop_thread: bool = False\n self._reader_thread.start()\n else:\n # Only used by Party.TASK\n self._use_ssh_tunnel: bool = False\n self._ssh_proc: Optional[subprocess.Popen] = None\n self._local_socket_path: Optional[str] = None\n
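A minimal sketch of the intended call order on the Executor side (module path assumed to be lute.execution.ipc):

from lute.execution.ipc import Party, SocketCommunicator

comm = SocketCommunicator(party=Party.EXECUTOR)  # constructed when the Executor is created
# ... later, once this Executor is selected to run a managed Task ...
comm.delayed_setup()  # acquires the socket and starts the reader thread (see source above)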
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.read","title":"read(proc)
","text":"Return a message from the queue if available.
Socket(s) are continuously monitored, and read from when new data is available.
Parameters:
Name Type Description Default proc
Popen
The process to read from. Provided for compatibility with other Communicator subtypes. Is ignored.
required Returns:
Name Type Description msg
Message
The message read, containing contents and signal.
Source code in lute/execution/ipc.py
def read(self, proc: subprocess.Popen) -> Message:\n \"\"\"Return a message from the queue if available.\n\n Socket(s) are continuously monitored, and read from when new data is\n available.\n\n Args:\n proc (subprocess.Popen): The process to read from. Provided for\n compatibility with other Communicator subtypes. Is ignored.\n\n Returns:\n msg (Message): The message read, containing contents and signal.\n \"\"\"\n msg: Message\n try:\n msg = self._msg_queue.get(timeout=SocketCommunicator.ACCEPT_TIMEOUT)\n except queue.Empty:\n msg = Message()\n\n return msg\n
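A hedged example of the Executor-side polling pattern this method supports; proc is the Task subprocess handle, accepted only for interface compatibility, and the attribute names follow the "contents and signal" description above:

msg = comm.read(proc)  # returns an empty Message() if the queue was empty
if msg.signal is not None or msg.contents is not None:
    ...  # handle the signal and/or contents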
"},{"location":"source/execution/ipc/#execution.ipc.SocketCommunicator.write","title":"write(msg)
","text":"Send a single Message.
The entire Message (signal and contents) is serialized and sent through a connection over a Unix or TCP socket.
Parameters:
Name Type Description Default msg
Message
The Message to send.
required Source code in lute/execution/ipc.py
def write(self, msg: Message) -> None:\n \"\"\"Send a single Message.\n\n The entire Message (signal and contents) is serialized and sent through\n a connection over Unix socket.\n\n Args:\n msg (Message): The Message to send.\n \"\"\"\n self._write_socket(msg)\n
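For example, on the Task side (comm is a SocketCommunicator instance; the Message keyword names are an assumption based on the "contents and signal" description above):

comm.write(Message(contents={"result": "..."}, signal=None))  # serialized with pickle and framed as described above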
"},{"location":"source/io/_sqlite/","title":"_sqlite","text":"Backend SQLite database utilites.
Functions should be used only by the higher-level database module.
"},{"location":"source/io/config/","title":"config","text":"Machinary for the IO of configuration YAML files and their validation.
Functions:
Name Description parse_config
str, config_path: str) -> TaskParameters: Parse a configuration file and return a TaskParameters object of validated parameters for a specific Task. Raises an exception if the provided configuration does not match the expected model.
Raises:
Type Description ValidationError
Error raised by pydantic during data validation. (From Pydantic)
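A hedged usage sketch (the module path is inferred from the location above, the first keyword name is an assumption since only config_path appears in the signature fragment, and the YAML path is hypothetical):

from lute.io.config import parse_config

params = parse_config(task_name="Test", config_path="config/test.yaml")
# Raises pydantic's ValidationError if the YAML does not satisfy the Task's parameter model.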
"},{"location":"source/io/config/#io.config.AnalysisHeader","title":"AnalysisHeader
","text":" Bases: BaseModel
Header information for LUTE analysis runs.
Source code in lute/io/models/base.py
class AnalysisHeader(BaseModel):\n \"\"\"Header information for LUTE analysis runs.\"\"\"\n\n title: str = Field(\n \"LUTE Task Configuration\",\n description=\"Description of the configuration or experiment.\",\n )\n experiment: str = Field(\"\", description=\"Experiment.\")\n run: Union[str, int] = Field(\"\", description=\"Data acquisition run.\")\n date: str = Field(\"1970/01/01\", description=\"Start date of analysis.\")\n lute_version: Union[float, str] = Field(\n 0.1, description=\"Version of LUTE used for analysis.\"\n )\n task_timeout: PositiveInt = Field(\n 600,\n description=(\n \"Time in seconds until a task times out. Should be slightly shorter\"\n \" than job timeout if using a job manager (e.g. SLURM).\"\n ),\n )\n work_dir: str = Field(\"\", description=\"Main working directory for LUTE.\")\n\n @validator(\"work_dir\", always=True)\n def validate_work_dir(cls, directory: str, values: Dict[str, Any]) -> str:\n work_dir: str\n if directory == \"\":\n std_work_dir = (\n f\"/sdf/data/lcls/ds/{values['experiment'][:3]}/\"\n f\"{values['experiment']}/scratch\"\n )\n work_dir = std_work_dir\n else:\n work_dir = directory\n # Check existence and permissions\n if not os.path.exists(work_dir):\n raise ValueError(f\"Working Directory: {work_dir} does not exist!\")\n if not os.access(work_dir, os.W_OK):\n # Need write access for database, files etc.\n raise ValueError(f\"Not write access for working directory: {work_dir}!\")\n return work_dir\n\n @validator(\"run\", always=True)\n def validate_run(\n cls, run: Union[str, int], values: Dict[str, Any]\n ) -> Union[str, int]:\n if run == \"\":\n # From Airflow RUN_NUM should have Format \"RUN_DATETIME\" - Num is first part\n run_time: str = os.environ.get(\"RUN_NUM\", \"\")\n if run_time != \"\":\n return int(run_time.split(\"_\")[0])\n return run\n\n @validator(\"experiment\", always=True)\n def validate_experiment(cls, experiment: str, values: Dict[str, Any]) -> str:\n if experiment == \"\":\n arp_exp: str = os.environ.get(\"EXPERIMENT\", \"EXPX00000\")\n return arp_exp\n return experiment\n
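For illustration, the model can also be instantiated directly; the experiment name and directory below are hypothetical, and per the validator above work_dir must exist and be writable or validation fails:

header = AnalysisHeader(
    title="LUTE Task Configuration",
    experiment="mfxp00121",               # hypothetical experiment name
    run=10,
    work_dir="/path/to/existing/scratch",  # hypothetical; must exist with write access
)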
"},{"location":"source/io/config/#io.config.CompareHKLParameters","title":"CompareHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's compare_hkl
for calculating figures of merit.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code in lute/io/models/sfx_merge.py
class CompareHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `compare_hkl` for calculating figures of merit.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/compare_hkl\",\n description=\"CrystFEL's reflection comparison binary.\",\n flag_type=\"\",\n )\n in_files: Optional[str] = Field(\n \"\",\n description=\"Path to input HKLs. Space-separated list of 2. Use output of partialator e.g.\",\n flag_type=\"\",\n )\n ## Need mechanism to set is_result=True ...\n symmetry: str = Field(\"\", description=\"Point group symmetry.\", flag_type=\"--\")\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n fom: str = Field(\n \"Rsplit\", description=\"Specify figure of merit to calculate.\", flag_type=\"--\"\n )\n nshells: int = Field(10, description=\"Use n resolution shells.\", flag_type=\"--\")\n # NEED A NEW CASE FOR THIS -> Boolean flag, no arg, one hyphen...\n # fix_unity: bool = Field(\n # False,\n # description=\"Fix scale factors to unity.\",\n # flag_type=\"-\",\n # rename_param=\"u\",\n # )\n shell_file: str = Field(\n \"\",\n description=\"Write the statistics in resolution shells to a file.\",\n flag_type=\"--\",\n rename_param=\"shell-file\",\n is_result=True,\n )\n ignore_negs: bool = Field(\n False,\n description=\"Ignore reflections with negative reflections.\",\n flag_type=\"--\",\n rename_param=\"ignore-negs\",\n )\n zero_negs: bool = Field(\n False,\n description=\"Set negative intensities to 0.\",\n flag_type=\"--\",\n rename_param=\"zero-negs\",\n )\n sigma_cutoff: Optional[Union[float, int, str]] = Field(\n # \"-infinity\",\n description=\"Discard reflections with I/sigma(I) < n. -infinity means no cutoff.\",\n flag_type=\"--\",\n rename_param=\"sigma-cutoff\",\n )\n rmin: Optional[float] = Field(\n description=\"Low resolution cutoff of 1/d (m-1). Use this or --lowres NOT both.\",\n flag_type=\"--\",\n )\n lowres: Optional[float] = Field(\n descirption=\"Low resolution cutoff in Angstroms. Use this or --rmin NOT both.\",\n flag_type=\"--\",\n )\n rmax: Optional[float] = Field(\n description=\"High resolution cutoff in 1/d (m-1). Use this or --highres NOT both.\",\n flag_type=\"--\",\n )\n highres: Optional[float] = Field(\n description=\"High resolution cutoff in Angstroms. 
Use this or --rmax NOT both.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_files\", always=True)\n def validate_in_files(cls, in_files: str, values: Dict[str, Any]) -> str:\n if in_files == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n hkls: str = f\"{partialator_file}1 {partialator_file}2\"\n return hkls\n return in_files\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n\n @validator(\"symmetry\", always=True)\n def validate_symmetry(cls, symmetry: str, values: Dict[str, Any]) -> str:\n if symmetry == \"\":\n partialator_sym: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"symmetry\"\n )\n if partialator_sym:\n return partialator_sym\n return symmetry\n\n @validator(\"shell_file\", always=True)\n def validate_shell_file(cls, shell_file: str, values: Dict[str, Any]) -> str:\n if shell_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n shells_out: str = partialator_file.split(\".\")[0]\n shells_out = f\"{shells_out}_{values['fom']}_n{values['nshells']}.dat\"\n return shells_out\n return shell_file\n
"},{"location":"source/io/config/#io.config.CompareHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.CompareHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters","title":"ConcatenateStreamFilesParameters
","text":" Bases: TaskParameters
Parameters for stream concatenation.
Concatenates the stream file output from CrystFEL indexing for multiple experimental runs.
Source code inlute/io/models/sfx_index.py
class ConcatenateStreamFilesParameters(TaskParameters):\n \"\"\"Parameters for stream concatenation.\n\n Concatenates the stream file output from CrystFEL indexing for multiple\n experimental runs.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n in_file: str = Field(\n \"\",\n description=\"Root of directory tree storing stream files to merge.\",\n )\n\n tag: Optional[str] = Field(\n \"\",\n description=\"Tag identifying the stream files to merge.\",\n )\n\n out_file: str = Field(\n \"\", description=\"Path to merged output stream file.\", is_result=True\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_dir: str = str(Path(stream_file).parent)\n return stream_dir\n return in_file\n\n @validator(\"tag\", always=True)\n def validate_tag(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n )\n if stream_file:\n stream_tag: str = Path(stream_file).name.split(\"_\")[0]\n return stream_tag\n return tag\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, tag: str, values: Dict[str, Any]) -> str:\n if tag == \"\":\n stream_out_file: str = str(\n Path(values[\"in_file\"]).parent / f\"{values['tag'].stream}\"\n )\n return stream_out_file\n return tag\n
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ConcatenateStreamFilesParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.DimpleSolveParameters","title":"DimpleSolveParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's dimple program.
There are many parameters. For more information on usage, please refer to the CCP4 documentation, here: https://ccp4.github.io/dimple/
Source code inlute/io/models/sfx_solve.py
class DimpleSolveParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's dimple program.\n\n There are many parameters. For more information on\n usage, please refer to the CCP4 documentation, here:\n https://ccp4.github.io/dimple/\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/dimple\",\n description=\"CCP4 Dimple for solving structures with MR.\",\n flag_type=\"\",\n )\n # Positional requirements - all required.\n in_file: str = Field(\n \"\",\n description=\"Path to input mtz.\",\n flag_type=\"\",\n )\n pdb: str = Field(\"\", description=\"Path to a PDB.\", flag_type=\"\")\n out_dir: str = Field(\"\", description=\"Output DIRECTORY.\", flag_type=\"\")\n # Most used options\n mr_thresh: PositiveFloat = Field(\n 0.4,\n description=\"Threshold for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-when-r\",\n )\n slow: Optional[bool] = Field(\n False, description=\"Perform more refinement.\", flag_type=\"--\"\n )\n # Other options (IO)\n hklout: str = Field(\n \"final.mtz\", description=\"Output mtz file name.\", flag_type=\"--\"\n )\n xyzout: str = Field(\n \"final.pdb\", description=\"Output PDB file name.\", flag_type=\"--\"\n )\n icolumn: Optional[str] = Field(\n # \"IMEAN\",\n description=\"Name for the I column.\",\n flag_type=\"--\",\n )\n sigicolumn: Optional[str] = Field(\n # \"SIG<ICOL>\",\n description=\"Name for the Sig<I> column.\",\n flag_type=\"--\",\n )\n fcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the F column.\",\n flag_type=\"--\",\n )\n sigfcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the Sig<F> column.\",\n flag_type=\"--\",\n )\n libin: Optional[str] = Field(\n description=\"Ligand descriptions for refmac (LIBIN).\", flag_type=\"--\"\n )\n refmac_key: Optional[str] = Field(\n description=\"Extra Refmac keywords to use in refinement.\",\n flag_type=\"--\",\n rename_param=\"refmac-key\",\n )\n free_r_flags: Optional[str] = Field(\n description=\"Path to a mtz file with freeR flags.\",\n flag_type=\"--\",\n rename_param=\"free-r-flags\",\n )\n freecolumn: Optional[Union[int, float]] = Field(\n # 0,\n description=\"Refree column with an optional value.\",\n flag_type=\"--\",\n )\n img_format: Optional[str] = Field(\n description=\"Format of generated images. 
(png, jpeg, none).\",\n flag_type=\"-\",\n rename_param=\"f\",\n )\n white_bg: bool = Field(\n False,\n description=\"Use a white background in Coot and in images.\",\n flag_type=\"--\",\n rename_param=\"white-bg\",\n )\n no_cleanup: bool = Field(\n False,\n description=\"Retain intermediate files.\",\n flag_type=\"--\",\n rename_param=\"no-cleanup\",\n )\n # Calculations\n no_blob_search: bool = Field(\n False,\n description=\"Do not search for unmodelled blobs.\",\n flag_type=\"--\",\n rename_param=\"no-blob-search\",\n )\n anode: bool = Field(\n False, description=\"Use SHELX/AnoDe to find peaks in the anomalous map.\"\n )\n # Run customization\n no_hetatm: bool = Field(\n False,\n description=\"Remove heteroatoms from the given model.\",\n flag_type=\"--\",\n rename_param=\"no-hetatm\",\n )\n rigid_cycles: Optional[PositiveInt] = Field(\n # 10,\n description=\"Number of cycles of rigid-body refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"rigid-cycles\",\n )\n jelly: Optional[PositiveInt] = Field(\n # 4,\n description=\"Number of cycles of jelly-body refinement to perform.\",\n flag_type=\"--\",\n )\n restr_cycles: Optional[PositiveInt] = Field(\n # 8,\n description=\"Number of cycles of refmac final refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"restr-cycles\",\n )\n lim_resolution: Optional[PositiveFloat] = Field(\n description=\"Limit the final resolution.\", flag_type=\"--\", rename_param=\"reso\"\n )\n weight: Optional[str] = Field(\n # \"auto-weight\",\n description=\"The refmac matrix weight.\",\n flag_type=\"--\",\n )\n mr_prog: Optional[str] = Field(\n # \"phaser\",\n description=\"Molecular replacement program. phaser or molrep.\",\n flag_type=\"--\",\n rename_param=\"mr-prog\",\n )\n mr_num: Optional[Union[str, int]] = Field(\n # \"auto\",\n description=\"Number of molecules to use for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-num\",\n )\n mr_reso: Optional[PositiveFloat] = Field(\n # 3.25,\n description=\"High resolution for molecular replacement. If >10 interpreted as eLLG.\",\n flag_type=\"--\",\n rename_param=\"mr-reso\",\n )\n itof_prog: Optional[str] = Field(\n description=\"Program to calculate amplitudes. truncate, or ctruncate.\",\n flag_type=\"--\",\n rename_param=\"ItoF-prog\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return get_hkl_file\n return in_file\n\n @validator(\"out_dir\", always=True)\n def validate_out_dir(cls, out_dir: str, values: Dict[str, Any]) -> str:\n if out_dir == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return os.path.dirname(get_hkl_file)\n return out_dir\n
"},{"location":"source/io/config/#io.config.FindOverlapXSSParameters","title":"FindOverlapXSSParameters
","text":" Bases: TaskParameters
TaskParameter model for FindOverlapXSS Task.
This Task determines spatial or temporal overlap between an optical pulse and the FEL pulse based on difference scattering (XSS) signal. This Task uses SmallData HDF5 files as a source.
Source code inlute/io/models/smd.py
class FindOverlapXSSParameters(TaskParameters):\n \"\"\"TaskParameter model for FindOverlapXSS Task.\n\n This Task determines spatial or temporal overlap between an optical pulse\n and the FEL pulse based on difference scattering (XSS) signal. This Task\n uses SmallData HDF5 files as a source.\n \"\"\"\n\n class ExpConfig(BaseModel):\n det_name: str\n ipm_var: str\n scan_var: Union[str, List[str]]\n\n class Thresholds(BaseModel):\n min_Iscat: Union[int, float]\n min_ipm: Union[int, float]\n\n class AnalysisFlags(BaseModel):\n use_pyfai: bool = True\n use_asymls: bool = False\n\n exp_config: ExpConfig\n thresholds: Thresholds\n analysis_flags: AnalysisFlags\n
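In a configuration YAML these fields appear as nested mappings; as a Python sketch of the nested models defined above (the detector and variable names are hypothetical):

exp_config = FindOverlapXSSParameters.ExpConfig(
    det_name="epix_1", ipm_var="ipm2/sum", scan_var="lxt"
)
thresholds = FindOverlapXSSParameters.Thresholds(min_Iscat=10, min_ipm=500.0)
flags = FindOverlapXSSParameters.AnalysisFlags()  # defaults: use_pyfai=True, use_asymls=False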
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters","title":"FindPeaksPsocakeParameters
","text":" Bases: ThirdPartyParameters
Parameters for crystallographic (Bragg) peak finding using Psocake.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation. NOTE: This Task is deprecated and provided for compatibility only.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPsocakeParameters(ThirdPartyParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using Psocake.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n NOTE: This Task is deprecated and provided for compatibility only.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n class SZParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description=\"SZ compression algorithm (qoz, sz3)\"\n )\n binSize: int = Field(2, description=\"SZ compression's bin size paramater\")\n roiWindowSize: int = Field(\n 2, description=\"SZ compression's ROI window size paramater\"\n )\n absError: float = Field(10, descriptionp=\"Maximum absolute error value\")\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n mca: str = Field(\n \"btl ^openib\", description=\"Mca option for the MPI executable\", flag_type=\"--\"\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n p_arg2: str = Field(\n \"findPeaksSZ.py\",\n description=\"Executable to run with mpi (i.e. python).\",\n flag_type=\"\",\n )\n d: str = Field(description=\"Detector name\", flag_type=\"-\")\n e: str = Field(\"\", description=\"Experiment name\", flag_type=\"-\")\n r: int = Field(-1, description=\"Run number\", flag_type=\"-\")\n outDir: str = Field(\n description=\"Output directory where .cxi will be saved\", flag_type=\"--\"\n )\n algorithm: int = Field(1, description=\"PyAlgos algorithm to use\", flag_type=\"--\")\n alg_npix_min: float = Field(\n 1.0, description=\"PyAlgos algorithm's npix_min parameter\", flag_type=\"--\"\n )\n alg_npix_max: float = Field(\n 45.0, description=\"PyAlgos algorithm's npix_max parameter\", flag_type=\"--\"\n )\n alg_amax_thr: float = Field(\n 250.0, description=\"PyAlgos algorithm's amax_thr parameter\", flag_type=\"--\"\n )\n alg_atot_thr: float = Field(\n 330.0, description=\"PyAlgos algorithm's atot_thr parameter\", flag_type=\"--\"\n )\n alg_son_min: float = Field(\n 10.0, description=\"PyAlgos algorithm's son_min parameter\", flag_type=\"--\"\n )\n alg1_thr_low: float = Field(\n 80.0, description=\"PyAlgos algorithm's thr_low parameter\", flag_type=\"--\"\n )\n alg1_thr_high: float = Field(\n 270.0, description=\"PyAlgos algorithm's thr_high parameter\", flag_type=\"--\"\n )\n alg1_rank: int = Field(\n 3, description=\"PyAlgos algorithm's rank parameter\", flag_type=\"--\"\n )\n alg1_radius: int = Field(\n 3, description=\"PyAlgos algorithm's radius parameter\", flag_type=\"--\"\n )\n alg1_dr: int = Field(\n 1, description=\"PyAlgos algorithm's dr parameter\", flag_type=\"--\"\n )\n psanaMask_on: str = Field(\n \"True\", description=\"Whether psana's mask should be used\", flag_type=\"--\"\n )\n psanaMask_calib: str = Field(\n \"True\", description=\"Psana mask's calib parameter\", flag_type=\"--\"\n )\n psanaMask_status: str = Field(\n \"True\", description=\"Psana mask's status 
parameter\", flag_type=\"--\"\n )\n psanaMask_edges: str = Field(\n \"True\", description=\"Psana mask's edges parameter\", flag_type=\"--\"\n )\n psanaMask_central: str = Field(\n \"True\", description=\"Psana mask's central parameter\", flag_type=\"--\"\n )\n psanaMask_unbond: str = Field(\n \"True\", description=\"Psana mask's unbond parameter\", flag_type=\"--\"\n )\n psanaMask_unbondnrs: str = Field(\n \"True\", description=\"Psana mask's unbondnbrs parameter\", flag_type=\"--\"\n )\n mask: str = Field(\n \"\", description=\"Path to an additional mask to apply\", flag_type=\"--\"\n )\n clen: str = Field(\n description=\"Epics variable storing the camera length\", flag_type=\"--\"\n )\n coffset: float = Field(0, description=\"Camera offset in m\", flag_type=\"--\")\n minPeaks: int = Field(\n 15,\n description=\"Minimum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n maxPeaks: int = Field(\n 15,\n description=\"Maximum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n minRes: int = Field(\n 0,\n description=\"Minimum peak resolution to mark frame for indexing \",\n flag_type=\"--\",\n )\n sample: str = Field(\"\", description=\"Sample name\", flag_type=\"--\")\n instrument: Union[None, str] = Field(\n None, description=\"Instrument name\", flag_type=\"--\"\n )\n pixelSize: float = Field(0.0, description=\"Pixel size\", flag_type=\"--\")\n auto: str = Field(\n \"False\",\n description=(\n \"Whether to automatically determine peak per event peak \"\n \"finding parameters\"\n ),\n flag_type=\"--\",\n )\n detectorDistance: float = Field(\n 0.0, description=\"Detector distance from interaction point in m\", flag_type=\"--\"\n )\n access: Literal[\"ana\", \"ffb\"] = Field(\n \"ana\", description=\"Data node type: {ana,ffb}\", flag_type=\"--\"\n )\n szfile: str = Field(\"qoz.json\", description=\"Path to SZ's JSON configuration file\")\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"sz.json\",\n output_path=\"\", # Will want to change where this goes...\n ),\n description=\"Template information for the sz.json file\",\n )\n sz_parameters: SZParameters = Field(\n description=\"Configuration parameters for SZ Compression\", flag_type=\"\"\n )\n\n @validator(\"e\", always=True)\n def validate_e(cls, e: str, values: Dict[str, Any]) -> str:\n if e == \"\":\n return values[\"lute_config\"].experiment\n return e\n\n @validator(\"r\", always=True)\n def validate_r(cls, r: int, values: Dict[str, Any]) -> int:\n if r == -1:\n return values[\"lute_config\"].run\n return r\n\n @validator(\"lute_template_cfg\", always=True)\n def set_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"szfile\"]\n return lute_template_cfg\n\n @validator(\"sz_parameters\", always=True)\n def set_sz_compression_parameters(\n cls, sz_parameters: SZParameters, values: Dict[str, Any]\n ) -> None:\n values[\"compressor\"] = sz_parameters.compressor\n values[\"binSize\"] = sz_parameters.binSize\n values[\"roiWindowSize\"] = sz_parameters.roiWindowSize\n if sz_parameters.compressor == \"qoz\":\n values[\"pressio_opts\"] = {\n \"pressio:abs\": sz_parameters.absError,\n \"qoz\": {\"qoz:stride\": 8},\n }\n else:\n values[\"pressio_opts\"] = {\"pressio:abs\": sz_parameters.absError}\n return None\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = 
values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n directory: str = values[\"outDir\"]\n fname: str = f\"{exp}_{run:04d}.lst\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.FindPeaksPsocakeParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters","title":"FindPeaksPyAlgosParameters
","text":" Bases: TaskParameters
Parameters for crystallographic (Bragg) peak finding using PyAlgos.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPyAlgosParameters(TaskParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using PyAlgos.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n class SZCompressorParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description='Compression algorithm (\"qoz\" or \"sz3\")'\n )\n abs_error: float = Field(10.0, description=\"Absolute error bound\")\n bin_size: int = Field(2, description=\"Bin size\")\n roi_window_size: int = Field(\n 9,\n description=\"Default window size\",\n )\n\n outdir: str = Field(\n description=\"Output directory for cxi files\",\n )\n n_events: int = Field(\n 0,\n description=\"Number of events to process (0 to process all events)\",\n )\n det_name: str = Field(\n description=\"Psana name of the detector storing the image data\",\n )\n event_receiver: Literal[\"evr0\", \"evr1\"] = Field(\n description=\"Event Receiver to be used: evr0 or evr1\",\n )\n tag: str = Field(\n \"\",\n description=\"Tag to add to the output file names\",\n )\n pv_camera_length: Union[str, float] = Field(\n \"\",\n description=\"PV associated with camera length \"\n \"(if a number, camera length directly)\",\n )\n event_logic: bool = Field(\n False,\n description=\"True if only events with a specific event code should be \"\n \"processed. False if the event code should be ignored\",\n )\n event_code: int = Field(\n 0,\n description=\"Required events code for events to be processed if event logic \"\n \"is True\",\n )\n psana_mask: bool = Field(\n False,\n description=\"If True, apply mask from psana Detector object\",\n )\n mask_file: Union[str, None] = Field(\n None,\n description=\"File with a custom mask to apply. 
If None, no custom mask is \"\n \"applied\",\n )\n min_peaks: int = Field(2, description=\"Minimum number of peaks per image\")\n max_peaks: int = Field(\n 2048,\n description=\"Maximum number of peaks per image\",\n )\n npix_min: int = Field(\n 2,\n description=\"Minimum number of pixels per peak\",\n )\n npix_max: int = Field(\n 30,\n description=\"Maximum number of pixels per peak\",\n )\n amax_thr: float = Field(\n 80.0,\n description=\"Minimum intensity threshold for starting a peak\",\n )\n atot_thr: float = Field(\n 120.0,\n description=\"Minimum summed intensity threshold for pixel collection\",\n )\n son_min: float = Field(\n 7.0,\n description=\"Minimum signal-to-noise ratio to be considered a peak\",\n )\n peak_rank: int = Field(\n 3,\n description=\"Radius in which central peak pixel is a local maximum\",\n )\n r0: float = Field(\n 3.0,\n description=\"Radius of ring for background evaluation in pixels\",\n )\n dr: float = Field(\n 2.0,\n description=\"Width of ring for background evaluation in pixels\",\n )\n nsigm: float = Field(\n 7.0,\n description=\"Intensity threshold to include pixel in connected group\",\n )\n compression: Optional[SZCompressorParameters] = Field(\n None,\n description=\"Options for the SZ Compression Algorithm\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n fname: Path = (\n Path(values[\"outdir\"])\n / f\"{values['lute_config'].experiment}_{values['lute_config'].run}_\"\n f\"{values['tag']}.list\"\n )\n return str(fname)\n return out_file\n
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.FindPeaksPyAlgosParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters","title":"IndexCrystFELParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's indexamajig
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-indexamajig.html
Source code inlute/io/models/sfx_index.py
class IndexCrystFELParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `indexamajig`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/indexamajig\",\n description=\"CrystFEL's indexing binary.\",\n flag_type=\"\",\n )\n # Basic options\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input file.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n geometry: str = Field(\n \"\", description=\"Path to geometry file.\", flag_type=\"-\", rename_param=\"g\"\n )\n zmq_input: Optional[str] = Field(\n description=\"ZMQ address to receive data over. `input` and `zmq-input` are mutually exclusive\",\n flag_type=\"--\",\n rename_param=\"zmq-input\",\n )\n zmq_subscribe: Optional[str] = Field( # Can be used multiple times...\n description=\"Subscribe to ZMQ message of type `tag`\",\n flag_type=\"--\",\n rename_param=\"zmq-subscribe\",\n )\n zmq_request: Optional[AnyUrl] = Field(\n description=\"Request new data over ZMQ by sending this value\",\n flag_type=\"--\",\n rename_param=\"zmq-request\",\n )\n asapo_endpoint: Optional[str] = Field(\n description=\"ASAP::O endpoint. zmq-input and this are mutually exclusive.\",\n flag_type=\"--\",\n rename_param=\"asapo-endpoint\",\n )\n asapo_token: Optional[str] = Field(\n description=\"ASAP::O authentication token.\",\n flag_type=\"--\",\n rename_param=\"asapo-token\",\n )\n asapo_beamtime: Optional[str] = Field(\n description=\"ASAP::O beatime.\",\n flag_type=\"--\",\n rename_param=\"asapo-beamtime\",\n )\n asapo_source: Optional[str] = Field(\n description=\"ASAP::O data source.\",\n flag_type=\"--\",\n rename_param=\"asapo-source\",\n )\n asapo_group: Optional[str] = Field(\n description=\"ASAP::O consumer group.\",\n flag_type=\"--\",\n rename_param=\"asapo-group\",\n )\n asapo_stream: Optional[str] = Field(\n description=\"ASAP::O stream.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n asapo_wait_for_stream: Optional[str] = Field(\n description=\"If ASAP::O stream does not exist, wait for it to appear.\",\n flag_type=\"--\",\n rename_param=\"asapo-wait-for-stream\",\n )\n data_format: Optional[str] = Field(\n description=\"Specify format for ZMQ or ASAP::O. `msgpack`, `hdf5` or `seedee`.\",\n flag_type=\"--\",\n rename_param=\"data-format\",\n )\n basename: bool = Field(\n False,\n description=\"Remove directory parts of filenames. Acts before prefix if prefix also given.\",\n flag_type=\"--\",\n )\n prefix: Optional[str] = Field(\n description=\"Add a prefix to the filenames from the infile argument.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n nthreads: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of threads to use. 
See also `max_indexer_threads`.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n no_check_prefix: bool = Field(\n False,\n description=\"Don't attempt to correct the prefix if it seems incorrect.\",\n flag_type=\"--\",\n rename_param=\"no-check-prefix\",\n )\n highres: Optional[float] = Field(\n description=\"Mark all pixels greater than `x` has bad.\", flag_type=\"--\"\n )\n profile: bool = Field(\n False, description=\"Display timing data to monitor performance.\", flag_type=\"--\"\n )\n temp_dir: Optional[str] = Field(\n description=\"Specify a path for the temp files folder.\",\n flag_type=\"--\",\n rename_param=\"temp-dir\",\n )\n wait_for_file: conint(gt=-2) = Field(\n 0,\n description=\"Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\",\n flag_type=\"--\",\n rename_param=\"wait-for-file\",\n )\n no_image_data: bool = Field(\n False,\n description=\"Load only the metadata, no iamges. Can check indexability without high data requirements.\",\n flag_type=\"--\",\n rename_param=\"no-image-data\",\n )\n # Peak-finding options\n # ....\n # Indexing options\n indexing: Optional[str] = Field(\n description=\"Comma-separated list of supported indexing algorithms to use. Default is to automatically detect.\",\n flag_type=\"--\",\n )\n cell_file: Optional[str] = Field(\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n tolerance: str = Field(\n \"5,5,5,1.5\",\n description=(\n \"Tolerances (in percent) for unit cell comparison. \"\n \"Comma-separated list a,b,c,angle. Default=5,5,5,1.5\"\n ),\n flag_type=\"--\",\n )\n no_check_cell: bool = Field(\n False,\n description=\"Do not check cell parameters against unit cell. Replaces '-raw' method.\",\n flag_type=\"--\",\n rename_param=\"no-check-cell\",\n )\n no_check_peaks: bool = Field(\n False,\n description=\"Do not verify peaks are accounted for by solution.\",\n flag_type=\"--\",\n rename_param=\"no-check-peaks\",\n )\n multi: bool = Field(\n False, description=\"Enable multi-lattice indexing.\", flag_type=\"--\"\n )\n wavelength_estimate: Optional[float] = Field(\n description=\"Estimate for X-ray wavelength. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"wavelength-estimate\",\n )\n camera_length_estimate: Optional[float] = Field(\n description=\"Estimate for camera distance. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"camera-length-estimate\",\n )\n max_indexer_threads: Optional[PositiveInt] = Field(\n # 1,\n description=\"Some indexing algos can use multiple threads. 
In addition to image-based.\",\n flag_type=\"--\",\n rename_param=\"max-indexer-threads\",\n )\n no_retry: bool = Field(\n False,\n description=\"Do not remove weak peaks and try again.\",\n flag_type=\"--\",\n rename_param=\"no-retry\",\n )\n no_refine: bool = Field(\n False,\n description=\"Skip refinement step.\",\n flag_type=\"--\",\n rename_param=\"no-refine\",\n )\n no_revalidate: bool = Field(\n False,\n description=\"Skip revalidation step.\",\n flag_type=\"--\",\n rename_param=\"no-revalidate\",\n )\n # TakeTwo specific parameters\n taketwo_member_threshold: Optional[PositiveInt] = Field(\n # 20,\n description=\"Minimum number of vectors to consider.\",\n flag_type=\"--\",\n rename_param=\"taketwo-member-threshold\",\n )\n taketwo_len_tolerance: Optional[PositiveFloat] = Field(\n # 0.001,\n description=\"TakeTwo length tolerance in Angstroms.\",\n flag_type=\"--\",\n rename_param=\"taketwo-len-tolerance\",\n )\n taketwo_angle_tolerance: Optional[PositiveFloat] = Field(\n # 0.6,\n description=\"TakeTwo angle tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-angle-tolerance\",\n )\n taketwo_trace_tolerance: Optional[PositiveFloat] = Field(\n # 3,\n description=\"Matrix trace tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-trace-tolerance\",\n )\n # Felix-specific parameters\n # felix_domega\n # felix-fraction-max-visits\n # felix-max-internal-angle\n # felix-max-uniqueness\n # felix-min-completeness\n # felix-min-visits\n # felix-num-voxels\n # felix-sigma\n # felix-tthrange-max\n # felix-tthrange-min\n # XGANDALF-specific parameters\n xgandalf_sampling_pitch: Optional[NonNegativeInt] = Field(\n # 6,\n description=\"Density of reciprocal space sampling.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-sampling-pitch\",\n )\n xgandalf_grad_desc_iterations: Optional[NonNegativeInt] = Field(\n # 4,\n description=\"Number of gradient descent iterations.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-grad-desc-iterations\",\n )\n xgandalf_tolerance: Optional[PositiveFloat] = Field(\n # 0.02,\n description=\"Relative tolerance of lattice vectors\",\n flag_type=\"--\",\n rename_param=\"xgandalf-tolerance\",\n )\n xgandalf_no_deviation_from_provided_cell: Optional[bool] = Field(\n description=\"Found unit cell must match provided.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-no-deviation-from-provided-cell\",\n )\n xgandalf_min_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 30,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-min-lattice-vector-length\",\n )\n xgandalf_max_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 250,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-lattice-vector-length\",\n )\n xgandalf_max_peaks: Optional[PositiveInt] = Field(\n # 250,\n description=\"Maximum number of peaks to use for indexing.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-peaks\",\n )\n xgandalf_fast_execution: bool = Field(\n False,\n description=\"Shortcut to set sampling-pitch=2, and grad-desc-iterations=3.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-fast-execution\",\n )\n # pinkIndexer parameters\n # ...\n # asdf_fast: bool = Field(False, description=\"Enable fast mode for asdf. 
3x faster for 7% loss in accuracy.\", flag_type=\"--\", rename_param=\"asdf-fast\")\n # Integration parameters\n integration: str = Field(\n \"rings-nocen\", description=\"Method for integrating reflections.\", flag_type=\"--\"\n )\n fix_profile_radius: Optional[float] = Field(\n description=\"Fix the profile radius (m^{-1})\",\n flag_type=\"--\",\n rename_param=\"fix-profile-radius\",\n )\n fix_divergence: Optional[float] = Field(\n 0,\n description=\"Fix the divergence (rad, full angle).\",\n flag_type=\"--\",\n rename_param=\"fix-divergence\",\n )\n int_radius: str = Field(\n \"4,5,7\",\n description=\"Inner, middle, and outer radii for 3-ring integration.\",\n flag_type=\"--\",\n rename_param=\"int-radius\",\n )\n int_diag: str = Field(\n \"none\",\n description=\"Show detailed information on integration when condition is met.\",\n flag_type=\"--\",\n rename_param=\"int-diag\",\n )\n push_res: str = Field(\n \"infinity\",\n description=\"Integrate `x` higher than apparent resolution limit (nm-1).\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n overpredict: bool = Field(\n False,\n description=\"Over-predict reflections. Maybe useful with post-refinement.\",\n flag_type=\"--\",\n )\n cell_parameters_only: bool = Field(\n False, description=\"Do not predict refletions at all\", flag_type=\"--\"\n )\n # Output parameters\n no_non_hits_in_stream: bool = Field(\n False,\n description=\"Exclude non-hits from the stream file.\",\n flag_type=\"--\",\n rename_param=\"no-non-hits-in-stream\",\n )\n copy_hheader: Optional[str] = Field(\n description=\"Copy information from header in the image to output stream.\",\n flag_type=\"--\",\n rename_param=\"copy-hheader\",\n )\n no_peaks_in_stream: bool = Field(\n False,\n description=\"Do not record peaks in stream file.\",\n flag_type=\"--\",\n rename_param=\"no-peaks-in-stream\",\n )\n no_refls_in_stream: bool = Field(\n False,\n description=\"Do not record reflections in stream.\",\n flag_type=\"--\",\n rename_param=\"no-refls-in-stream\",\n )\n serial_offset: Optional[PositiveInt] = Field(\n description=\"Start numbering at `x` instead of 1.\",\n flag_type=\"--\",\n rename_param=\"serial-offset\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n filename: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPyAlgos\", \"out_file\"\n )\n if filename is None:\n exp: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n tag: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"tag\"\n )\n out_dir: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"outDir\"\n )\n if out_dir is not None:\n fname: str = f\"{out_dir}/{exp}_{run:04d}\"\n if tag is not None:\n fname = f\"{fname}_{tag}\"\n return f\"{fname}.lst\"\n else:\n return filename\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n expmt: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n work_dir: str = values[\"lute_config\"].work_dir\n fname: str = f\"{expmt}_r{run:04d}.stream\"\n return f\"{work_dir}/{fname}\"\n return out_file\n
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n
"},{"location":"source/io/config/#io.config.IndexCrystFELParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters","title":"ManipulateHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's get_hkl
for manipulating lists of reflections.
This Task is predominantly used internally to convert hkl
to mtz
files. Note that performing multiple manipulations is undefined behaviour. Run the Task with multiple configurations in explicit separate steps. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
lute/io/models/sfx_merge.py
class ManipulateHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `get_hkl` for manipulating lists of reflections.\n\n This Task is predominantly used internally to convert `hkl` to `mtz` files.\n Note that performing multiple manipulations is undefined behaviour. Run\n the Task with multiple configurations in explicit separate steps. For more\n information on usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/get_hkl\",\n description=\"CrystFEL's reflection manipulation binary.\",\n flag_type=\"\",\n )\n in_file: str = Field(\n \"\",\n description=\"Path to input HKL file.\",\n flag_type=\"-\",\n rename_param=\"i\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n output_format: str = Field(\n \"mtz\",\n description=\"Output format. One of mtz, mtz-bij, or xds. Otherwise CrystFEL format.\",\n flag_type=\"--\",\n rename_param=\"output-format\",\n )\n expand: Optional[str] = Field(\n description=\"Reflections will be expanded to fill asymmetric unit of specified point group.\",\n flag_type=\"--\",\n )\n # Reducing reflections to higher symmetry\n twin: Optional[str] = Field(\n description=\"Reflections equivalent to specified point group will have intensities summed.\",\n flag_type=\"--\",\n )\n no_need_all_parts: Optional[bool] = Field(\n description=\"Use with --twin to allow reflections missing a 'twin mate' to be written out.\",\n flag_type=\"--\",\n rename_param=\"no-need-all-parts\",\n )\n # Noise - Add to data\n noise: Optional[bool] = Field(\n description=\"Generate 10% uniform noise.\", flag_type=\"--\"\n )\n poisson: Optional[bool] = Field(\n description=\"Generate Poisson noise. Intensities assumed to be A.U.\",\n flag_type=\"--\",\n )\n adu_per_photon: Optional[int] = Field(\n description=\"Use with --poisson to convert A.U. to photons.\",\n flag_type=\"--\",\n rename_param=\"adu-per-photon\",\n )\n # Remove duplicate reflections\n trim_centrics: Optional[bool] = Field(\n description=\"Duplicated reflections (according to symmetry) are removed.\",\n flag_type=\"--\",\n )\n # Restrict to template file\n template: Optional[str] = Field(\n description=\"Only reflections which also appear in specified file are written out.\",\n flag_type=\"--\",\n )\n # Multiplicity\n multiplicity: Optional[bool] = Field(\n description=\"Reflections are multiplied by their symmetric multiplicites.\",\n flag_type=\"--\",\n )\n # Resolution cutoffs\n cutoff_angstroms: Optional[Union[str, int, float]] = Field(\n description=\"Either n, or n1,n2,n3. For n, reflections < n are removed. 
For n1,n2,n3 anisotropic trunction performed at separate resolution limits for a*, b*, c*.\",\n flag_type=\"--\",\n rename_param=\"cutoff-angstroms\",\n )\n lowres: Optional[float] = Field(\n description=\"Remove reflections with d > n\", flag_type=\"--\"\n )\n highres: Optional[float] = Field(\n description=\"Synonym for first form of --cutoff-angstroms\"\n )\n reindex: Optional[str] = Field(\n description=\"Reindex according to specified operator. E.g. k,h,-l.\",\n flag_type=\"--\",\n )\n # Override input symmetry\n symmetry: Optional[str] = Field(\n description=\"Point group symmetry to use to override. Almost always OMIT this option.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n return partialator_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n mtz_out: str = partialator_file.split(\".\")[0]\n mtz_out = f\"{mtz_out}.mtz\"\n return mtz_out\n return out_file\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ManipulateHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.MergePartialatorParameters","title":"MergePartialatorParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's partialator
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code inlute/io/models/sfx_merge.py
class MergePartialatorParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `partialator`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/partialator\",\n description=\"CrystFEL's Partialator binary.\",\n flag_type=\"\",\n )\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input stream.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n symmetry: str = Field(description=\"Point group symmetry.\", flag_type=\"--\")\n niter: Optional[int] = Field(\n description=\"Number of cycles of scaling and post-refinement.\",\n flag_type=\"-\",\n rename_param=\"n\",\n )\n no_scale: Optional[bool] = Field(\n description=\"Disable scaling.\", flag_type=\"--\", rename_param=\"no-scale\"\n )\n no_Bscale: Optional[bool] = Field(\n description=\"Disable Debye-Waller part of scaling.\",\n flag_type=\"--\",\n rename_param=\"no-Bscale\",\n )\n no_pr: Optional[bool] = Field(\n description=\"Disable orientation model.\", flag_type=\"--\", rename_param=\"no-pr\"\n )\n no_deltacchalf: Optional[bool] = Field(\n description=\"Disable rejection based on deltaCC1/2.\",\n flag_type=\"--\",\n rename_param=\"no-deltacchalf\",\n )\n model: str = Field(\n \"unity\",\n description=\"Partiality model. Options: xsphere, unity, offset, ggpm.\",\n flag_type=\"--\",\n )\n nthreads: int = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of parallel analyses.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n polarisation: Optional[str] = Field(\n description=\"Specification of incident polarisation. Refer to CrystFEL docs for more info.\",\n flag_type=\"--\",\n )\n no_polarisation: Optional[bool] = Field(\n description=\"Synonym for --polarisation=none\",\n flag_type=\"--\",\n rename_param=\"no-polarisation\",\n )\n max_adu: Optional[float] = Field(\n description=\"Maximum intensity of reflection to include.\",\n flag_type=\"--\",\n rename_param=\"max-adu\",\n )\n min_res: Optional[float] = Field(\n description=\"Only include crystals diffracting to a minimum resolution.\",\n flag_type=\"--\",\n rename_param=\"min-res\",\n )\n min_measurements: int = Field(\n 2,\n description=\"Include a reflection only if it appears a minimum number of times.\",\n flag_type=\"--\",\n rename_param=\"min-measurements\",\n )\n push_res: Optional[float] = Field(\n description=\"Merge reflections up to higher than the apparent resolution limit.\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n start_after: int = Field(\n 0,\n description=\"Ignore the first n crystals.\",\n flag_type=\"--\",\n rename_param=\"start-after\",\n )\n stop_after: int = Field(\n 0,\n description=\"Stop after processing n crystals. 0 means process all.\",\n flag_type=\"--\",\n rename_param=\"stop-after\",\n )\n no_free: Optional[bool] = Field(\n description=\"Disable cross-validation. 
Testing ONLY.\",\n flag_type=\"--\",\n rename_param=\"no-free\",\n )\n custom_split: Optional[str] = Field(\n description=\"Read a set of filenames, event and dataset IDs from a filename.\",\n flag_type=\"--\",\n rename_param=\"custom-split\",\n )\n max_rel_B: float = Field(\n 100,\n description=\"Reject crystals if |relB| > n sq Angstroms.\",\n flag_type=\"--\",\n rename_param=\"max-rel-B\",\n )\n output_every_cycle: bool = Field(\n False,\n description=\"Write per-crystal params after every refinement cycle.\",\n flag_type=\"--\",\n rename_param=\"output-every-cycle\",\n )\n no_logs: bool = Field(\n False,\n description=\"Do not write logs needed for plots, maps and graphs.\",\n flag_type=\"--\",\n rename_param=\"no-logs\",\n )\n set_symmetry: Optional[str] = Field(\n description=\"Set the apparent symmetry of the crystals to a point group.\",\n flag_type=\"-\",\n rename_param=\"w\",\n )\n operator: Optional[str] = Field(\n description=\"Specify an ambiguity operator. E.g. k,h,-l.\", flag_type=\"--\"\n )\n force_bandwidth: Optional[float] = Field(\n description=\"Set X-ray bandwidth. As percent, e.g. 0.0013 (0.13%).\",\n flag_type=\"--\",\n rename_param=\"force-bandwidth\",\n )\n force_radius: Optional[float] = Field(\n description=\"Set the initial profile radius (nm-1).\",\n flag_type=\"--\",\n rename_param=\"force-radius\",\n )\n force_lambda: Optional[float] = Field(\n description=\"Set the wavelength. In Angstroms.\",\n flag_type=\"--\",\n rename_param=\"force-lambda\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"ConcatenateStreamFiles\",\n \"out_file\",\n )\n if stream_file:\n return stream_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n in_file: str = values[\"in_file\"]\n if in_file:\n tag: str = in_file.split(\".\")[0]\n return f\"{tag}.hkl\"\n else:\n return \"partialator.hkl\"\n return out_file\n
"},{"location":"source/io/config/#io.config.MergePartialatorParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.MergePartialatorParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.RunSHELXCParameters","title":"RunSHELXCParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's SHELXC program.
SHELXC prepares files for SHELXD and SHELXE.
For more information please refer to the official documentation: https://www.ccp4.ac.uk/html/crank.html
Source code inlute/io/models/sfx_solve.py
class RunSHELXCParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's SHELXC program.\n\n SHELXC prepares files for SHELXD and SHELXE.\n\n For more information please refer to the official documentation:\n https://www.ccp4.ac.uk/html/crank.html\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/shelxc\",\n description=\"CCP4 SHELXC. Generates input files for SHELXD/SHELXE.\",\n flag_type=\"\",\n )\n placeholder: str = Field(\n \"xx\", description=\"Placeholder filename stem.\", flag_type=\"\"\n )\n in_file: str = Field(\n \"\",\n description=\"Input file for SHELXC with reflections AND proper records.\",\n flag_type=\"\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n # get_hkl needed to be run to produce an XDS format file...\n xds_format_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if xds_format_file:\n in_file = xds_format_file\n if in_file[0] != \"<\":\n # Need to add a redirection for this program\n # Runs like `shelxc xx <input_file.xds`\n in_file = f\"<{in_file}\"\n return in_file\n
"},{"location":"source/io/config/#io.config.SubmitSMDParameters","title":"SubmitSMDParameters
","text":" Bases: ThirdPartyParameters
Parameters for running smalldata to produce reduced HDF5 files.
Source code inlute/io/models/smd.py
class SubmitSMDParameters(ThirdPartyParameters):\n \"\"\"Parameters for running smalldata to produce reduced HDF5 files.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n m: str = Field(\n \"mpi4py.run\",\n description=\"Python option to execute a module's contents as __main__ module.\",\n flag_type=\"-\",\n )\n producer: str = Field(\n \"\", description=\"Path to the SmallData producer Python script.\", flag_type=\"\"\n )\n run: str = Field(\n os.environ.get(\"RUN_NUM\", \"\"), description=\"DAQ Run Number.\", flag_type=\"--\"\n )\n experiment: str = Field(\n os.environ.get(\"EXPERIMENT\", \"\"),\n description=\"LCLS Experiment Number.\",\n flag_type=\"--\",\n )\n stn: NonNegativeInt = Field(0, description=\"Hutch endstation.\", flag_type=\"--\")\n nevents: int = Field(\n int(1e9), description=\"Number of events to process.\", flag_type=\"--\"\n )\n directory: Optional[str] = Field(\n None,\n description=\"Optional output directory. If None, will be in ${EXP_FOLDER}/hdf5/smalldata.\",\n flag_type=\"--\",\n )\n ## Need mechanism to set result_from_param=True ...\n gather_interval: PositiveInt = Field(\n 25, description=\"Number of events to collect at a time.\", flag_type=\"--\"\n )\n norecorder: bool = Field(\n False, description=\"Whether to ignore recorder streams.\", flag_type=\"--\"\n )\n url: HttpUrl = Field(\n \"https://pswww.slac.stanford.edu/ws-auth/lgbk\",\n description=\"Base URL for eLog posting.\",\n flag_type=\"--\",\n )\n epicsAll: bool = Field(\n False,\n description=\"Whether to store all EPICS PVs. Use with care.\",\n flag_type=\"--\",\n )\n full: bool = Field(\n False,\n description=\"Whether to store all data. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n fullSum: bool = Field(\n False,\n description=\"Whether to store sums for all area detector images.\",\n flag_type=\"--\",\n )\n default: bool = Field(\n False,\n description=\"Whether to store only the default minimal set of data.\",\n flag_type=\"--\",\n )\n image: bool = Field(\n False,\n description=\"Whether to save everything as images. Use with care.\",\n flag_type=\"--\",\n )\n tiff: bool = Field(\n False,\n description=\"Whether to save all images as a single TIFF. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n centerpix: bool = Field(\n False,\n description=\"Whether to mask center pixels for Epix10k2M detectors.\",\n flag_type=\"--\",\n )\n postRuntable: bool = Field(\n False,\n description=\"Whether to post run tables. 
Also used as a trigger for summary jobs.\",\n flag_type=\"--\",\n )\n wait: bool = Field(\n False, description=\"Whether to wait for a file to appear.\", flag_type=\"--\"\n )\n xtcav: bool = Field(\n False,\n description=\"Whether to add XTCAV processing to the HDF5 generation.\",\n flag_type=\"--\",\n )\n noarch: bool = Field(\n False, description=\"Whether to not use archiver data.\", flag_type=\"--\"\n )\n\n lute_template_cfg: TemplateConfig = TemplateConfig(template_name=\"\", output_path=\"\")\n\n @validator(\"producer\", always=True)\n def validate_producer_path(cls, producer: str) -> str:\n return producer\n\n @validator(\"lute_template_cfg\", always=True)\n def use_producer(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if not lute_template_cfg.output_path:\n lute_template_cfg.output_path = values[\"producer\"]\n return lute_template_cfg\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = values[\"lute_config\"].experiment\n hutch: str = exp[:3]\n run: int = int(values[\"lute_config\"].run)\n directory: Optional[str] = values[\"directory\"]\n if directory is None:\n directory = f\"/sdf/data/lcls/ds/{hutch}/{exp}/hdf5/smalldata\"\n fname: str = f\"{exp}_Run{run:04d}.h5\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config","title":"Config
","text":" Bases: Config
Identical to super-class Config but includes a result.
Source code inlute/io/models/smd.py
class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.SubmitSMDParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.TaskParameters","title":"TaskParameters
","text":" Bases: BaseSettings
Base class for models of task parameters to be validated.
Parameters are read from a configuration YAML file and validated against subclasses of this type in order to ensure both that all parameters are present and that they are of the correct type.
Note: Pydantic is used for data validation. Pydantic does not perform \"strict\" validation by default. Parameter values may be cast to conform with the model specified by the subclass definition if it is possible to do so. Consider whether this may cause issues (e.g. if a float is cast to an int).
Source code inlute/io/models/base.py
class TaskParameters(BaseSettings):\n \"\"\"Base class for models of task parameters to be validated.\n\n Parameters are read from a configuration YAML file and validated against\n subclasses of this type in order to ensure that both all parameters are\n present, and that the parameters are of the correct type.\n\n Note:\n Pydantic is used for data validation. Pydantic does not perform \"strict\"\n validation by default. Parameter values may be cast to conform with the\n model specified by the subclass definition if it is possible to do so.\n Consider whether this may cause issues (e.g. if a float is cast to an\n int).\n \"\"\"\n\n class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n\n lute_config: AnalysisHeader\n
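To make the note above concrete, here is a minimal, self-contained illustration of the non-strict casting behaviour (plain pydantic v1-style code, not part of LUTE; the model and field names are purely illustrative):
from pydantic import BaseModel

class ToyParameters(BaseModel):
    # Hypothetical model used only for illustration.
    n_events: int = 100
    scale: float = 1.0

# Pydantic v1 coerces compatible values rather than rejecting them:
params = ToyParameters(n_events="250", scale=3)   # str -> int, int -> float
print(params.n_events, params.scale)              # 250 3.0

# A float passed for an int field is truncated, which may be surprising:
params = ToyParameters(n_events=9.7)
print(params.n_events)                            # 9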
"},{"location":"source/io/config/#io.config.TaskParameters.Config","title":"Config
","text":"Configuration for parameters model.
The Config class holds Pydantic configuration. A number of LUTE-specific configuration options have also been placed here.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use outdir
and filename
fields to set result_from_params=f\"{outdir}/{filename}\"
, etc. Only used if set_result==True
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however. Only used if set_result==True
lute/io/models/base.py
class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n
"},{"location":"source/io/config/#io.config.TaskParameters.Config.impl_schemas","title":"impl_schemas: Optional[str] = None
class-attribute
instance-attribute
","text":"Schema specification for output result. Will be passed to TaskResult.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.result_from_params","title":"result_from_params: Optional[str] = None
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.result_summary","title":"result_summary: Optional[str] = None
class-attribute
instance-attribute
","text":"Format a TaskResult.summary from output.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.run_directory","title":"run_directory: Optional[str] = None
class-attribute
instance-attribute
","text":"Set the directory that the Task is run from.
"},{"location":"source/io/config/#io.config.TaskParameters.Config.set_result","title":"set_result: bool = False
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.TemplateConfig","title":"TemplateConfig
","text":" Bases: BaseModel
Parameters used for templating of third party configuration files.
Attributes:
Name Type Descriptiontemplate_name
str
The name of the template to use. This template must live in config/templates
.
output_path
str
The FULL path, including filename to write the rendered template to.
Source code inlute/io/models/base.py
class TemplateConfig(BaseModel):\n \"\"\"Parameters used for templating of third party configuration files.\n\n Attributes:\n template_name (str): The name of the template to use. This template must\n live in `config/templates`.\n\n output_path (str): The FULL path, including filename to write the\n rendered template to.\n \"\"\"\n\n template_name: str\n output_path: str\n
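A short usage sketch (illustrative only; the template name and output path below are placeholders, not files shipped with LUTE):
from lute.io.models.base import TemplateConfig

# The template must live in config/templates; output_path is the full path,
# including the filename, that the rendered template is written to.
tpl_cfg = TemplateConfig(
    template_name="my_tool.cfg",                  # placeholder name
    output_path="/path/to/work_dir/my_tool.cfg",  # placeholder path
)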
"},{"location":"source/io/config/#io.config.TemplateParameters","title":"TemplateParameters
","text":"Class for representing parameters for third party configuration files.
These parameters can represent arbitrary data types and are used in conjunction with templates for modifying third party configuration files from the single LUTE YAML. Due to the storage of arbitrary data types, and the use of a template file, a single instance of this class can hold anything from a single template variable to an entire configuration file. The data parsing is done by Jinja using the complementary template. All data is stored in the single model variable params.
The pydantic \"dataclass\" is used over the BaseModel/Settings to allow positional argument instantiation of the params
Field.
lute/io/models/base.py
@dataclass\nclass TemplateParameters:\n \"\"\"Class for representing parameters for third party configuration files.\n\n These parameters can represent arbitrary data types and are used in\n conjunction with templates for modifying third party configuration files\n from the single LUTE YAML. Due to the storage of arbitrary data types, and\n the use of a template file, a single instance of this class can hold from a\n single template variable to an entire configuration file. The data parsing\n is done by jinja using the complementary template.\n All data is stored in the single model variable `params.`\n\n The pydantic \"dataclass\" is used over the BaseModel/Settings to allow\n positional argument instantiation of the `params` Field.\n \"\"\"\n\n params: Any\n
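Because the pydantic dataclass is used, the single params field can be filled positionally with any type. A brief sketch (the example values are placeholders):
from lute.io.models.base import TemplateParameters

# Positional instantiation: the wrapped value can be a scalar, a list, or a
# whole nested dictionary destined for a Jinja template.
single_value = TemplateParameters(16)
whole_section = TemplateParameters({"n_cores": 16, "detector": "epix10k2M"})

print(single_value.params)               # 16
print(whole_section.params["n_cores"])   # 16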
"},{"location":"source/io/config/#io.config.TestBinaryErrParameters","title":"TestBinaryErrParameters
","text":" Bases: ThirdPartyParameters
Same as TestBinary, but exits with non-zero code.
Source code inlute/io/models/tests.py
class TestBinaryErrParameters(ThirdPartyParameters):\n \"\"\"Same as TestBinary, but exits with non-zero code.\"\"\"\n\n executable: str = Field(\n \"/sdf/home/d/dorlhiac/test_tasks/test_threads_err\",\n description=\"Multi-threaded test binary with non-zero exit code.\",\n )\n p_arg1: int = Field(1, description=\"Number of threads.\")\n
"},{"location":"source/io/config/#io.config.TestMultiNodeCommunicationParameters","title":"TestMultiNodeCommunicationParameters
","text":" Bases: TaskParameters
Parameters for the test Task TestMultiNodeCommunication
.
Test verifies communication across multiple machines.
Source code inlute/io/models/mpi_tests.py
class TestMultiNodeCommunicationParameters(TaskParameters):\n \"\"\"Parameters for the test Task `TestMultiNodeCommunication`.\n\n Test verifies communication across multiple machines.\n \"\"\"\n\n send_obj: Literal[\"plot\", \"array\"] = Field(\n \"array\", description=\"Object to send to Executor. `plot` or `array`\"\n )\n arr_size: Optional[int] = Field(\n None, description=\"Size of array to send back to Executor.\"\n )\n
"},{"location":"source/io/config/#io.config.TestParameters","title":"TestParameters
","text":" Bases: TaskParameters
Parameters for the test Task Test
.
lute/io/models/tests.py
class TestParameters(TaskParameters):\n \"\"\"Parameters for the test Task `Test`.\"\"\"\n\n float_var: float = Field(0.01, description=\"A floating point number.\")\n str_var: str = Field(\"test\", description=\"A string.\")\n\n class CompoundVar(BaseModel):\n int_var: int = 1\n dict_var: Dict[str, str] = {\"a\": \"b\"}\n\n compound_var: CompoundVar = Field(\n description=(\n \"A compound parameter - consists of a `int_var` (int) and `dict_var`\"\n \" (Dict[str, str]).\"\n )\n )\n throw_error: bool = Field(\n False, description=\"If `True`, raise an exception to test error handling.\"\n )\n
"},{"location":"source/io/config/#io.config.ThirdPartyParameters","title":"ThirdPartyParameters
","text":" Bases: TaskParameters
Base class for third party task parameters.
Contains special validators for extra arguments and handling of parameters used for filling in third party configuration files.
Source code inlute/io/models/base.py
class ThirdPartyParameters(TaskParameters):\n \"\"\"Base class for third party task parameters.\n\n Contains special validators for extra arguments and handling of parameters\n used for filling in third party configuration files.\n \"\"\"\n\n class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n # lute_template_cfg: TemplateConfig\n\n @root_validator(pre=False)\n def extra_fields_to_thirdparty(cls, values: Dict[str, Any]):\n for key in values:\n if key not in cls.__fields__:\n values[key] = TemplateParameters(values[key])\n\n return values\n
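The root validator above is what lets a configuration YAML pass arbitrary extra keys through to a ThirdPartyTask. A minimal, self-contained reproduction of the pattern (plain pydantic v1; the class and field names below are illustrative only, not LUTE's):
from typing import Any, Dict
from pydantic import BaseModel, root_validator
from pydantic.dataclasses import dataclass

@dataclass
class ToyTemplateParameters:
    params: Any

class ToyThirdParty(BaseModel):
    class Config:
        extra = "allow"   # keep unknown fields instead of rejecting them

    executable: str = "/usr/bin/true"

    @root_validator(pre=False)
    def extra_fields_to_template(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        # Any field not declared on the model is wrapped so it can later be
        # forwarded to a template for a third party configuration file.
        for key in values:
            if key not in cls.__fields__:
                values[key] = ToyTemplateParameters(values[key])
        return values

model = ToyThirdParty(executable="/usr/bin/echo", n_cores=16)
print(model.n_cores.params)  # 16 -- unknown key preserved for templating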
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config","title":"Config
","text":" Bases: Config
Configuration for parameters model.
The Config class holds Pydantic configuration and inherited configuration from the base TaskParameters.Config
class. A number of values are also overridden, and there are some configuration options specific to ThirdPartyParameters. A full list of options (with TaskParameters options repeated) is described below.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use outdir
and filename
fields to set result_from_params=f\"{outdir}/{filename}\"
, etc.
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however.
ThirdPartyTask-specific:
extra
str
\"allow\". Pydantic configuration. Allow (or ignore) extra arguments.
short_flags_use_eq
bool
False. If True, \"short\" command-line args are passed as -x=arg
. ThirdPartyTask-specific.
long_flags_use_eq
bool
False. If True, \"long\" command-line args are passed as --long=arg
. ThirdPartyTask-specific.
lute/io/models/base.py
class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/config/#io.config.ThirdPartyParameters.Config.short_flags_use_eq","title":"short_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether short command-line arguments are passed like -x=arg
.
parse_config(task_name='test', config_path='')
","text":"Parse a configuration file and validate the contents.
Parameters:
Name Type Description Defaulttask_name
str
Name of the specific task that will be run.
'test'
config_path
str
Path to the configuration file.
''
Returns:
Name Type Descriptionparams
TaskParameters
A TaskParameters object of validated task-specific parameters. Parameters are accessed with \"dot\" notation. E.g. params.param1
.
Raises:
Type DescriptionValidationError
Raised if there are problems with the configuration file. Passed through from Pydantic.
Source code inlute/io/config.py
def parse_config(task_name: str = \"test\", config_path: str = \"\") -> TaskParameters:\n \"\"\"Parse a configuration file and validate the contents.\n\n Args:\n task_name (str): Name of the specific task that will be run.\n\n config_path (str): Path to the configuration file.\n\n Returns:\n params (TaskParameters): A TaskParameters object of validated\n task-specific parameters. Parameters are accessed with \"dot\"\n notation. E.g. `params.param1`.\n\n Raises:\n ValidationError: Raised if there are problems with the configuration\n file. Passed through from Pydantic.\n \"\"\"\n task_config_name: str = f\"{task_name}Parameters\"\n\n with open(config_path, \"r\") as f:\n docs: Iterator[Dict[str, Any]] = yaml.load_all(stream=f, Loader=yaml.FullLoader)\n header: Dict[str, Any] = next(docs)\n config: Dict[str, Any] = next(docs)\n substitute_variables(header, header)\n substitute_variables(header, config)\n LUTE_DEBUG_EXIT(\"LUTE_DEBUG_EXIT_AT_YAML\", pprint.pformat(config))\n lute_config: Dict[str, AnalysisHeader] = {\"lute_config\": AnalysisHeader(**header)}\n try:\n task_config: Dict[str, Any] = dict(config[task_name])\n lute_config.update(task_config)\n except KeyError as err:\n warnings.warn(\n (\n f\"{task_name} has no parameter definitions in YAML file.\"\n \" Attempting default parameter initialization.\"\n )\n )\n parsed_parameters: TaskParameters = globals()[task_config_name](**lute_config)\n return parsed_parameters\n
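A usage sketch (the YAML path below is a placeholder; the file must contain the usual header document followed by a Test parameter block):
from lute.io.config import parse_config

# Hypothetical path to a LUTE configuration YAML.
params = parse_config(task_name="Test", config_path="/path/to/lute/config/test.yaml")

print(params.float_var)               # 0.01 unless overridden in the YAML
print(params.lute_config.experiment)  # taken from the header document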
"},{"location":"source/io/config/#io.config.substitute_variables","title":"substitute_variables(header, config, curr_key=None)
","text":"Performs variable substitutions on a dictionary read from config YAML file.
Can be used to define input parameters in terms of other input parameters. This is similar to functionality employed by validators for parameters in the specific Task models, but is intended to be more accessible to users. Variable substitutions are defined using a minimal syntax from Jinja: {{ experiment }} defines a substitution of the variable experiment
. The characters {{ }}
can be escaped if the literal symbols are needed in place.
For example, a path to a file can be defined in terms of experiment and run values in the config file:
MyTask:
  experiment: myexp
  run: 2
  special_file: /path/to/{{ experiment }}/{{ run }}/file.inp
Acceptable variables for substitutions are values defined elsewhere in the YAML file. Environment variables can also be used if prefaced with a $
character. E.g. to get the experiment from an environment variable:
MyTask:
  run: 2
  special_file: /path/to/{{ $EXPERIMENT }}/{{ run }}/file.inp
Parameters:
Name Type Description Defaultconfig
Dict[str, Any]
A dictionary of parsed configuration.
requiredcurr_key
Optional[str]
Used to keep track of recursion level when scanning through iterable items in the config dictionary.
None
Returns:
Name Type Descriptionsubbed_config
Dict[str, Any]
The config dictionary after substitutions have been made. May be identical to the input if no substitutions are needed.
Source code inlute/io/config.py
def substitute_variables(\n header: Dict[str, Any], config: Dict[str, Any], curr_key: Optional[str] = None\n) -> None:\n \"\"\"Performs variable substitutions on a dictionary read from config YAML file.\n\n Can be used to define input parameters in terms of other input parameters.\n This is similar to functionality employed by validators for parameters in\n the specific Task models, but is intended to be more accessible to users.\n Variable substitutions are defined using a minimal syntax from Jinja:\n {{ experiment }}\n defines a substitution of the variable `experiment`. The characters `{{ }}`\n can be escaped if the literal symbols are needed in place.\n\n For example, a path to a file can be defined in terms of experiment and run\n values in the config file:\n MyTask:\n experiment: myexp\n run: 2\n special_file: /path/to/{{ experiment }}/{{ run }}/file.inp\n\n Acceptable variables for substitutions are values defined elsewhere in the\n YAML file. Environment variables can also be used if prefaced with a `$`\n character. E.g. to get the experiment from an environment variable:\n MyTask:\n run: 2\n special_file: /path/to/{{ $EXPERIMENT }}/{{ run }}/file.inp\n\n Args:\n config (Dict[str, Any]): A dictionary of parsed configuration.\n\n curr_key (Optional[str]): Used to keep track of recursion level when scanning\n through iterable items in the config dictionary.\n\n Returns:\n subbed_config (Dict[str, Any]): The config dictionary after substitutions\n have been made. May be identical to the input if no substitutions are\n needed.\n \"\"\"\n _sub_pattern = r\"\\{\\{[^}{]*\\}\\}\"\n iterable: Dict[str, Any] = config\n if curr_key is not None:\n # Need to handle nested levels by interpreting curr_key\n keys_by_level: List[str] = curr_key.split(\".\")\n for key in keys_by_level:\n iterable = iterable[key]\n else:\n ...\n # iterable = config\n for param, value in iterable.items():\n if isinstance(value, dict):\n new_key: str\n if curr_key is None:\n new_key = param\n else:\n new_key = f\"{curr_key}.{param}\"\n substitute_variables(header, config, curr_key=new_key)\n elif isinstance(value, list):\n ...\n # Scalars str - we skip numeric types\n elif isinstance(value, str):\n matches: List[str] = re.findall(_sub_pattern, value)\n for m in matches:\n key_to_sub_maybe_with_fmt: List[str] = m[2:-2].strip().split(\":\")\n key_to_sub: str = key_to_sub_maybe_with_fmt[0]\n fmt: Optional[str] = None\n if len(key_to_sub_maybe_with_fmt) == 2:\n fmt = key_to_sub_maybe_with_fmt[1]\n sub: Any\n if key_to_sub[0] == \"$\":\n sub = os.getenv(key_to_sub[1:], None)\n if sub is None:\n print(\n f\"Environment variable {key_to_sub[1:]} not found! Cannot substitute in YAML config!\",\n flush=True,\n )\n continue\n # substitutions from env vars will be strings, so convert back\n # to numeric in order to perform formatting later on (e.g. {var:04d})\n sub = _check_str_numeric(sub)\n else:\n try:\n sub = config\n for key in key_to_sub.split(\".\"):\n sub = sub[key]\n except KeyError:\n sub = header[key_to_sub]\n pattern: str = (\n m.replace(\"{{\", r\"\\{\\{\").replace(\"}}\", r\"\\}\\}\").replace(\"$\", r\"\\$\")\n )\n if fmt is not None:\n sub = f\"{sub:{fmt}}\"\n else:\n sub = f\"{sub}\"\n iterable[param] = re.sub(pattern, sub, iterable[param])\n # Reconvert back to numeric values if needed...\n iterable[param] = _check_str_numeric(iterable[param])\n
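A minimal sketch of the in-place substitution, using small dictionaries in place of the two parsed YAML documents:
from lute.io.config import substitute_variables

header = {"experiment": "myexp", "run": 2}
config = {
    "MyTask": {
        "special_file": "/path/to/{{ experiment }}/{{ run }}/file.inp",
    }
}

# The config dictionary is modified in place; values not found in the config
# itself are looked up in the header document.
substitute_variables(header, config)
print(config["MyTask"]["special_file"])  # /path/to/myexp/2/file.inp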
"},{"location":"source/io/db/","title":"db","text":"Tools for working with the LUTE parameter and configuration database.
The current implementation relies on a sqlite backend database. In the future this may change - therefore relatively few high-level API function calls are intended to be public. These abstract away the details of the database interface and work exclusively on LUTE objects.
Functions:
Name Descriptionrecord_analysis_db
(cfg: DescribedAnalysis) -> None: Writes the configuration to the backend database.
read_latest_db_entry
(db_dir: str, task_name: str, param: str) -> Any: Retrieve the most recent entry from a database for a specific Task.
Raises:
Type DescriptionDatabaseError
Generic exception raised for LUTE database errors.
"},{"location":"source/io/db/#io.db.DatabaseError","title":"DatabaseError
","text":" Bases: Exception
General LUTE database error.
Source code inlute/io/db.py
class DatabaseError(Exception):\n \"\"\"General LUTE database error.\"\"\"\n\n ...\n
"},{"location":"source/io/db/#io.db.read_latest_db_entry","title":"read_latest_db_entry(db_dir, task_name, param, valid_only=True)
","text":"Read most recent value entered into the database for a Task parameter.
(Will be updated for schema compliance as well as Task name.)
Parameters:
Name Type Description Defaultdb_dir
str
Database location.
requiredtask_name
str
The name of the Task to check the database for.
requiredparam
str
The parameter name for the Task that we want to retrieve.
requiredvalid_only
bool
Whether to consider only valid results or not. E.g. An input file may be useful even if the Task result is invalid (Failed). Default = True.
True
Returns:
Name Type Descriptionval
Any
The most recently entered value for param
of task_name
that can be found in the database. Returns None if nothing found.
lute/io/db.py
def read_latest_db_entry(\n db_dir: str, task_name: str, param: str, valid_only: bool = True\n) -> Optional[Any]:\n \"\"\"Read most recent value entered into the database for a Task parameter.\n\n (Will be updated for schema compliance as well as Task name.)\n\n Args:\n db_dir (str): Database location.\n\n task_name (str): The name of the Task to check the database for.\n\n param (str): The parameter name for the Task that we want to retrieve.\n\n valid_only (bool): Whether to consider only valid results or not. E.g.\n An input file may be useful even if the Task result is invalid\n (Failed). Default = True.\n\n Returns:\n val (Any): The most recently entered value for `param` of `task_name`\n that can be found in the database. Returns None if nothing found.\n \"\"\"\n import sqlite3\n from ._sqlite import _select_from_db\n\n con: sqlite3.Connection = sqlite3.Connection(f\"{db_dir}/lute.db\")\n with con:\n try:\n cond: Dict[str, str] = {}\n if valid_only:\n cond = {\"valid_flag\": \"1\"}\n entry: Any = _select_from_db(con, task_name, param, cond)\n except sqlite3.OperationalError as err:\n logger.debug(f\"Cannot retrieve value {param} due to: {err}\")\n entry = None\n return entry\n
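For example (the working directory below is a placeholder), retrieving the stream file most recently recorded for the IndexCrystFEL managed Task:
from lute.io.db import read_latest_db_entry

stream_file = read_latest_db_entry(
    db_dir="/path/to/lute/work_dir",   # directory containing lute.db
    task_name="IndexCrystFEL",
    param="out_file",
)
if stream_file is None:
    print("No valid IndexCrystFEL result recorded yet.")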
"},{"location":"source/io/db/#io.db.record_analysis_db","title":"record_analysis_db(cfg)
","text":"Write an DescribedAnalysis object to the database.
The DescribedAnalysis object is maintained by the Executor and contains all information necessary to fully describe a single Task
execution. The contained fields are split across multiple tables within the database as some of the information can be shared across multiple Tasks. Refer to docs/design/database.md
for more information on the database specification.
lute/io/db.py
def record_analysis_db(cfg: DescribedAnalysis) -> None:\n \"\"\"Write an DescribedAnalysis object to the database.\n\n The DescribedAnalysis object is maintained by the Executor and contains all\n information necessary to fully describe a single `Task` execution. The\n contained fields are split across multiple tables within the database as\n some of the information can be shared across multiple Tasks. Refer to\n `docs/design/database.md` for more information on the database specification.\n \"\"\"\n import sqlite3\n from ._sqlite import (\n _make_shared_table,\n _make_task_table,\n _add_row_no_duplicate,\n _add_task_entry,\n )\n\n try:\n work_dir: str = cfg.task_parameters.lute_config.work_dir\n except AttributeError:\n logger.info(\n (\n \"Unable to access TaskParameters object. Likely wasn't created. \"\n \"Cannot store result.\"\n )\n )\n return\n del cfg.task_parameters.lute_config.work_dir\n\n exec_entry, exec_columns = _cfg_to_exec_entry_cols(cfg)\n task_name: str = cfg.task_result.task_name\n # All `Task`s have an AnalysisHeader, but this info can be shared so is\n # split into a different table\n (\n task_entry, # Dict[str, Any]\n task_columns, # Dict[str, str]\n gen_entry, # Dict[str, Any]\n gen_columns, # Dict[str, str]\n ) = _params_to_entry_cols(cfg.task_parameters)\n x, y = _result_to_entry_cols(cfg.task_result)\n task_entry.update(x)\n task_columns.update(y)\n\n con: sqlite3.Connection = sqlite3.Connection(f\"{work_dir}/lute.db\")\n with con:\n # --- Table Creation ---#\n if not _make_shared_table(con, \"gen_cfg\", gen_columns):\n raise DatabaseError(\"Could not make general configuration table!\")\n if not _make_shared_table(con, \"exec_cfg\", exec_columns):\n raise DatabaseError(\"Could not make Executor configuration table!\")\n if not _make_task_table(con, task_name, task_columns):\n raise DatabaseError(f\"Could not make Task table for: {task_name}!\")\n\n # --- Row Addition ---#\n gen_id: int = _add_row_no_duplicate(con, \"gen_cfg\", gen_entry)\n exec_id: int = _add_row_no_duplicate(con, \"exec_cfg\", exec_entry)\n\n full_task_entry: Dict[str, Any] = {\n \"gen_cfg_id\": gen_id,\n \"exec_cfg_id\": exec_id,\n }\n full_task_entry.update(task_entry)\n # Prepare flag to indicate whether the task entry is valid or not\n # By default we say it is assuming proper completion\n valid_flag: int = (\n 1 if cfg.task_result.task_status == TaskStatus.COMPLETED else 0\n )\n full_task_entry.update({\"valid_flag\": valid_flag})\n\n _add_task_entry(con, task_name, full_task_entry)\n
"},{"location":"source/io/elog/","title":"elog","text":"Provides utilities for communicating with the LCLS eLog.
Makes use of various eLog API endpoints to retrieve information or post results.
Functions:
Name Description get_elog_opr_auth(exp: str)
Return an authorization object to interact with the eLog API as an opr account for the hutch where exp was conducted.
get_elog_kerberos_auth()
Return the authorization headers for the user account submitting the job.
elog_http_request(exp: str, endpoint: str, request_type: str, **params)
Make an HTTP request to the API endpoint at url.
format_file_for_post(in_file: Union[str, tuple, list])
Prepare files according to the specification needed to add them as attachments to eLog posts.
post_elog_message(exp: str, msg: str, tag: Optional[str], title: Optional[str], in_files: List[Union[str, tuple, list]], auth: Optional[Union[HTTPBasicAuth, Dict]] = None)
Post a message to the eLog.
post_elog_run_status(data: Dict[str, Union[str, int, float]], update_url: Optional[str] = None)
Post a run status to the summary section on the Workflows>Control tab.
post_elog_run_table(exp: str, run: int, data: Dict[str, Any], auth: Optional[Union[HTTPBasicAuth, Dict]] = None)
Update the run table in the eLog.
get_elog_runs_by_tag(exp: str, tag: str, auth: Optional[Union[HTTPBasicAuth, Dict]] = None)
Return a list of runs with a specific tag.
get_elog_params_by_run(exp: str, params: List[str], runs: Optional[List[int]])
Retrieve the requested parameters by run. If no run is provided, retrieve the requested parameters for all runs.
"},{"location":"source/io/elog/#io.elog.elog_http_request","title":"elog_http_request(exp, endpoint, request_type, **params)
","text":"Make an HTTP request to the eLog.
This method will determine the proper authorization method and update the passed parameters appropriately. Functions implementing specific endpoint functionality and calling this function should only pass the necessary endpoint-specific parameters and not include the authorization objects.
Parameters:
Name Type Description Defaultexp
str
Experiment.
requiredendpoint
str
eLog API endpoint.
requiredrequest_type
str
Type of request to make. Recognized options: POST or GET.
required**params
Dict
Endpoint parameters to pass with the HTTP request! Differs depending on the API endpoint. Do not include auth objects.
{}
Returns:
Name Type Descriptionstatus_code
int
Response status code. Can be checked for errors.
msg
str
An error message, or a message saying SUCCESS.
value
Optional[Any]
For GET requests ONLY, return the requested information.
Source code inlute/io/elog.py
def elog_http_request(\n exp: str, endpoint: str, request_type: str, **params\n) -> Tuple[int, str, Optional[Any]]:\n \"\"\"Make an HTTP request to the eLog.\n\n This method will determine the proper authorization method and update the\n passed parameters appropriately. Functions implementing specific endpoint\n functionality and calling this function should only pass the necessary\n endpoint-specific parameters and not include the authorization objects.\n\n Args:\n exp (str): Experiment.\n\n endpoint (str): eLog API endpoint.\n\n request_type (str): Type of request to make. Recognized options: POST or\n GET.\n\n **params (Dict): Endpoint parameters to pass with the HTTP request!\n Differs depending on the API endpoint. Do not include auth objects.\n\n Returns:\n status_code (int): Response status code. Can be checked for errors.\n\n msg (str): An error message, or a message saying SUCCESS.\n\n value (Optional[Any]): For GET requests ONLY, return the requested\n information.\n \"\"\"\n auth: Union[HTTPBasicAuth, Dict[str, str]] = get_elog_auth(exp)\n base_url: str\n if isinstance(auth, HTTPBasicAuth):\n params.update({\"auth\": auth})\n base_url = \"https://pswww.slac.stanford.edu/ws-auth/lgbk/lgbk\"\n elif isinstance(auth, dict):\n params.update({\"headers\": auth})\n base_url = \"https://pswww.slac.stanford.edu/ws-kerb/lgbk/lgbk\"\n\n url: str = f\"{base_url}/{endpoint}\"\n\n resp: requests.models.Response\n if request_type.upper() == \"POST\":\n resp = requests.post(url, **params)\n elif request_type.upper() == \"GET\":\n resp = requests.get(url, **params)\n else:\n return (-1, \"Invalid request type!\", None)\n\n status_code: int = resp.status_code\n msg: str = \"SUCCESS\"\n\n if resp.json()[\"success\"] and request_type.upper() == \"GET\":\n return (status_code, msg, resp.json()[\"value\"])\n\n if status_code >= 300:\n msg = f\"Error when posting to eLog: Response {status_code}\"\n\n if not resp.json()[\"success\"]:\n err_msg = resp.json()[\"error_msg\"]\n msg += f\"\\nInclude message: {err_msg}\"\n return (resp.status_code, msg, None)\n
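Example (an illustrative sketch only; the experiment name and tag are placeholders, and the endpoint mirrors the one used by get_elog_runs_by_tag below):
from lute.io.elog import elog_http_request\n\nexp: str = \"mfxx00000\"  # Placeholder experiment name\n# Endpoint mirroring get_elog_runs_by_tag; \"sample1\" is a placeholder tag\nendpoint: str = f\"{exp}/ws/get_runs_with_tag?tag=sample1\"\nstatus_code, msg, value = elog_http_request(exp=exp, endpoint=endpoint, request_type=\"GET\")\nif status_code >= 300 or msg != \"SUCCESS\":\n    print(f\"Request failed ({status_code}): {msg}\")\nelse:\n    print(f\"Runs with tag: {value}\")\n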
"},{"location":"source/io/elog/#io.elog.format_file_for_post","title":"format_file_for_post(in_file)
","text":"Format a file for attachment to an eLog post.
The eLog API expects a specifically formatted tuple when adding file attachments. This function prepares the tuple to specification given a number of different input types.
Parameters:
Name Type Description Defaultin_file
str | tuple | list
File to include as an attachment in an eLog post.
required Source code inlute/io/elog.py
def format_file_for_post(\n in_file: Union[str, tuple, list]\n) -> Tuple[str, Tuple[str, BufferedReader], Any]:\n \"\"\"Format a file for attachment to an eLog post.\n\n The eLog API expects a specifically formatted tuple when adding file\n attachments. This function prepares the tuple to specification given a\n number of different input types.\n\n Args:\n in_file (str | tuple | list): File to include as an attachment in an\n eLog post.\n \"\"\"\n description: str\n fptr: BufferedReader\n ftype: Optional[str]\n if isinstance(in_file, str):\n description = os.path.basename(in_file)\n fptr = open(in_file, \"rb\")\n ftype = mimetypes.guess_type(in_file)[0]\n elif isinstance(in_file, tuple) or isinstance(in_file, list):\n description = in_file[1]\n fptr = open(in_file[0], \"rb\")\n ftype = mimetypes.guess_type(in_file[0])[0]\n else:\n raise ElogFileFormatError(f\"Unrecognized format: {in_file}\")\n\n out_file: Tuple[str, Tuple[str, BufferedReader], Any] = (\n \"files\",\n (description, fptr),\n ftype,\n )\n return out_file\n
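Example (illustrative only; the file paths are placeholders). A bare string uses the file's basename as the description, while a tuple or list lets you supply the description explicitly:
from lute.io.elog import format_file_for_post\n\n# Placeholder paths - the files must exist since they are opened for reading\nattachment1 = format_file_for_post(\"/path/to/powder.png\")\nattachment2 = format_file_for_post((\"/path/to/peaks.png\", \"Peak finding summary\"))\n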
"},{"location":"source/io/elog/#io.elog.get_elog_active_expmt","title":"get_elog_active_expmt(hutch, *, endstation=0)
","text":"Get the current active experiment for a hutch.
This function is one of two functions to manage the HTTP request independently. This is because it does not require an authorization object, and its result is needed for the generic function elog_http_request
to work properly.
Parameters:
Name Type Description Defaulthutch
str
The hutch to get the active experiment for.
requiredendstation
int
The hutch endstation to get the experiment for. This should generally be 0.
0
Source code in lute/io/elog.py
def get_elog_active_expmt(hutch: str, *, endstation: int = 0) -> str:\n \"\"\"Get the current active experiment for a hutch.\n\n This function is one of two functions to manage the HTTP request independently.\n This is because it does not require an authorization object, and its result\n is needed for the generic function `elog_http_request` to work properly.\n\n Args:\n hutch (str): The hutch to get the active experiment for.\n\n endstation (int): The hutch endstation to get the experiment for. This\n should generally be 0.\n \"\"\"\n\n base_url: str = \"https://pswww.slac.stanford.edu/ws/lgbk/lgbk\"\n endpoint: str = \"ws/activeexperiment_for_instrument_station\"\n url: str = f\"{base_url}/{endpoint}\"\n params: Dict[str, str] = {\"instrument_name\": hutch, \"station\": f\"{endstation}\"}\n resp: requests.models.Response = requests.get(url, params)\n if resp.status_code > 300:\n raise RuntimeError(\n f\"Error getting current experiment!\\n\\t\\tIncorrect hutch: '{hutch}'?\"\n )\n if resp.json()[\"success\"]:\n return resp.json()[\"value\"][\"name\"]\n else:\n msg: str = resp.json()[\"error_msg\"]\n raise RuntimeError(f\"Error getting current experiment! Err: {msg}\")\n
"},{"location":"source/io/elog/#io.elog.get_elog_auth","title":"get_elog_auth(exp)
","text":"Determine the appropriate auth method depending on experiment state.
Returns:
Name Type Descriptionauth
HTTPBasicAuth | Dict[str, str]
Depending on whether an experiment is active/live, returns authorization for the hutch operator account or the current user submitting a job.
Source code inlute/io/elog.py
def get_elog_auth(exp: str) -> Union[HTTPBasicAuth, Dict[str, str]]:\n \"\"\"Determine the appropriate auth method depending on experiment state.\n\n Returns:\n auth (HTTPBasicAuth | Dict[str, str]): Depending on whether an experiment\n is active/live, returns authorization for the hutch operator account\n or the current user submitting a job.\n \"\"\"\n hutch: str = exp[:3]\n if exp.lower() == get_elog_active_expmt(hutch=hutch).lower():\n return get_elog_opr_auth(exp)\n else:\n return get_elog_kerberos_auth()\n
"},{"location":"source/io/elog/#io.elog.get_elog_kerberos_auth","title":"get_elog_kerberos_auth()
","text":"Returns Kerberos authorization key.
This function returns authorization for the USER account submitting jobs. It assumes that kinit
has been run.
Returns:
Name Type Descriptionauth
Dict[str, str]
Dictionary containing Kerberos authorization key.
Source code inlute/io/elog.py
def get_elog_kerberos_auth() -> Dict[str, str]:\n \"\"\"Returns Kerberos authorization key.\n\n This functions returns authorization for the USER account submitting jobs.\n It assumes that `kinit` has been run.\n\n Returns:\n auth (Dict[str, str]): Dictionary containing Kerberos authorization key.\n \"\"\"\n from krtc import KerberosTicket\n\n return KerberosTicket(\"HTTP@pswww.slac.stanford.edu\").getAuthHeaders()\n
"},{"location":"source/io/elog/#io.elog.get_elog_opr_auth","title":"get_elog_opr_auth(exp)
","text":"Produce authentication for the \"opr\" user associated to an experiment.
This method uses basic authentication using username and password.
Parameters:
Name Type Description Defaultexp
str
Name of the experiment to produce authentication for.
requiredReturns:
Name Type Descriptionauth
HTTPBasicAuth
HTTPBasicAuth for an active experiment based on username and password for the associated operator account.
Source code inlute/io/elog.py
def get_elog_opr_auth(exp: str) -> HTTPBasicAuth:\n \"\"\"Produce authentication for the \"opr\" user associated to an experiment.\n\n This method uses basic authentication using username and password.\n\n Args:\n exp (str): Name of the experiment to produce authentication for.\n\n Returns:\n auth (HTTPBasicAuth): HTTPBasicAuth for an active experiment based on\n username and password for the associated operator account.\n \"\"\"\n opr: str = f\"{exp[:3]}opr\"\n with open(\"/sdf/group/lcls/ds/tools/forElogPost.txt\", \"r\") as f:\n pw: str = f.readline()[:-1]\n return HTTPBasicAuth(opr, pw)\n
"},{"location":"source/io/elog/#io.elog.get_elog_params_by_run","title":"get_elog_params_by_run(exp, params, runs=None)
","text":"Retrieve requested parameters by run or for all runs.
Parameters:
Name Type Description Defaultexp
str
Experiment to retrieve parameters for.
requiredparams
List[str]
A list of parameters to retrieve. These can be any parameter recorded in the eLog (PVs, parameters posted by other Tasks, etc.)
required Source code inlute/io/elog.py
def get_elog_params_by_run(\n exp: str, params: List[str], runs: Optional[List[int]] = None\n) -> Dict[str, str]:\n \"\"\"Retrieve requested parameters by run or for all runs.\n\n Args:\n exp (str): Experiment to retrieve parameters for.\n\n params (List[str]): A list of parameters to retrieve. These can be any\n parameter recorded in the eLog (PVs, parameters posted by other\n Tasks, etc.)\n \"\"\"\n ...\n
"},{"location":"source/io/elog/#io.elog.get_elog_runs_by_tag","title":"get_elog_runs_by_tag(exp, tag, auth=None)
","text":"Retrieve run numbers with a specified tag.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredtag
str
The tag to retrieve runs for.
required Source code inlute/io/elog.py
def get_elog_runs_by_tag(\n exp: str, tag: str, auth: Optional[Union[HTTPBasicAuth, Dict]] = None\n) -> List[int]:\n \"\"\"Retrieve run numbers with a specified tag.\n\n Args:\n exp (str): Experiment name.\n\n tag (str): The tag to retrieve runs for.\n \"\"\"\n endpoint: str = f\"{exp}/ws/get_runs_with_tag?tag={tag}\"\n params: Dict[str, Any] = {}\n\n status_code, resp_msg, tagged_runs = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"GET\", **params\n )\n\n if not tagged_runs:\n tagged_runs = []\n\n return tagged_runs\n
"},{"location":"source/io/elog/#io.elog.get_elog_workflows","title":"get_elog_workflows(exp)
","text":"Get the current workflow definitions for an experiment.
Returns:
Name Type Descriptiondefns
Dict[str, str]
A dictionary of workflow definitions.
Source code inlute/io/elog.py
def get_elog_workflows(exp: str) -> Dict[str, str]:\n \"\"\"Get the current workflow definitions for an experiment.\n\n Returns:\n defns (Dict[str, str]): A dictionary of workflow definitions.\n \"\"\"\n raise NotImplementedError\n
"},{"location":"source/io/elog/#io.elog.post_elog_message","title":"post_elog_message(exp, msg, *, tag, title, in_files=[])
","text":"Post a new message to the eLog. Inspired by the elog
package.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredmsg
str
BODY of the eLog post.
requiredtag
str | None
Optional \"tag\" to associate with the eLog post.
requiredtitle
str | None
Optional title to include in the eLog post.
requiredin_files
List[str | tuple | list]
Files to include as attachments in the eLog post.
[]
Returns:
Name Type Descriptionerr_msg
str | None
If successful, nothing is returned, otherwise, return an error message.
Source code inlute/io/elog.py
def post_elog_message(\n exp: str,\n msg: str,\n *,\n tag: Optional[str],\n title: Optional[str],\n in_files: List[Union[str, tuple, list]] = [],\n) -> Optional[str]:\n \"\"\"Post a new message to the eLog. Inspired by the `elog` package.\n\n Args:\n exp (str): Experiment name.\n\n msg (str): BODY of the eLog post.\n\n tag (str | None): Optional \"tag\" to associate with the eLog post.\n\n title (str | None): Optional title to include in the eLog post.\n\n in_files (List[str | tuple | list]): Files to include as attachments in\n the eLog post.\n\n Returns:\n err_msg (str | None): If successful, nothing is returned, otherwise,\n return an error message.\n \"\"\"\n # MOSTLY CORRECT\n out_files: list = []\n for f in in_files:\n try:\n out_files.append(format_file_for_post(in_file=f))\n except ElogFileFormatError as err:\n logger.debug(f\"ElogFileFormatError: {err}\")\n post: Dict[str, str] = {}\n post[\"log_text\"] = msg\n if tag:\n post[\"log_tags\"] = tag\n if title:\n post[\"log_title\"] = title\n\n endpoint: str = f\"{exp}/ws/new_elog_entry\"\n\n params: Dict[str, Any] = {\"data\": post}\n\n if out_files:\n params.update({\"files\": out_files})\n\n status_code, resp_msg, _ = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"POST\", **params\n )\n\n if resp_msg != \"SUCCESS\":\n return resp_msg\n
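Example (an illustrative sketch; the experiment name, message, title and attachment path are placeholders):
from lute.io.elog import post_elog_message\n\nerr = post_elog_message(\n    exp=\"mfxx00000\",  # Placeholder experiment name\n    msg=\"Peak finding completed for run 12.\",\n    tag=\"LUTE\",\n    title=\"Peak finding status\",\n    in_files=[(\"/path/to/peaks.png\", \"Peak finding summary\")],  # Placeholder path\n)\nif err is not None:\n    print(f\"eLog post failed: {err}\")\n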
"},{"location":"source/io/elog/#io.elog.post_elog_run_status","title":"post_elog_run_status(data, update_url=None)
","text":"Post a summary to the status/report section of a specific run.
In contrast to most eLog update/post mechanisms, this function searches for a specific environment variable which contains a specific URL for posting. This is updated every job/run as jobs are submitted by the JID. The URL can optionally be passed to this function if it is known.
Parameters:
Name Type Description Defaultdata
Dict[str, Union[str, int, float]]
The data to post to the eLog report section. Formatted in key:value pairs.
requiredupdate_url
Optional[str]
Optional update URL. If not provided, the function searches for the corresponding environment variable. If neither is found, the function aborts
None
Source code in lute/io/elog.py
def post_elog_run_status(\n data: Dict[str, Union[str, int, float]], update_url: Optional[str] = None\n) -> None:\n \"\"\"Post a summary to the status/report section of a specific run.\n\n In contrast to most eLog update/post mechanisms, this function searches\n for a specific environment variable which contains a specific URL for\n posting. This is updated every job/run as jobs are submitted by the JID.\n The URL can optionally be passed to this function if it is known.\n\n Args:\n data (Dict[str, Union[str, int, float]]): The data to post to the eLog\n report section. Formatted in key:value pairs.\n\n update_url (Optional[str]): Optional update URL. If not provided, the\n function searches for the corresponding environment variable. If\n neither is found, the function aborts\n \"\"\"\n if update_url is None:\n update_url = os.environ.get(\"JID_UPDATE_COUNTERS\")\n if update_url is None:\n logger.info(\"eLog Update Failed! JID_UPDATE_COUNTERS is not defined!\")\n return\n current_status: Dict[str, Union[str, int, float]] = _get_current_run_status(\n update_url\n )\n current_status.update(data)\n post_list: List[Dict[str, str]] = [\n {\"key\": f\"{key}\", \"value\": f\"{value}\"} for key, value in current_status.items()\n ]\n params: Dict[str, List[Dict[str, str]]] = {\"json\": post_list}\n resp: requests.models.Response = requests.post(update_url, **params)\n
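Example (illustrative only; the key/value pairs are placeholders, and JID_UPDATE_COUNTERS must be set in the environment, or update_url passed explicitly, for the post to succeed):
from lute.io.elog import post_elog_run_status\n\n# Placeholder summary values to display on the Workflows>Control tab\npost_elog_run_status({\"Number of hits\": 1024, \"Hit rate (%)\": 3.2})\n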
"},{"location":"source/io/elog/#io.elog.post_elog_run_table","title":"post_elog_run_table(exp, run, data)
","text":"Post data for eLog run tables.
Parameters:
Name Type Description Defaultexp
str
Experiment name.
requiredrun
int
Run number corresponding to the data being posted.
requireddata
Dict[str, Any]
Data to be posted in format data[\"column_header\"] = value.
requiredReturns:
Name Type Descriptionerr_msg
None | str
If successful, nothing is returned, otherwise, return an error message.
Source code inlute/io/elog.py
def post_elog_run_table(\n exp: str,\n run: int,\n data: Dict[str, Any],\n) -> Optional[str]:\n \"\"\"Post data for eLog run tables.\n\n Args:\n exp (str): Experiment name.\n\n run (int): Run number corresponding to the data being posted.\n\n data (Dict[str, Any]): Data to be posted in format\n data[\"column_header\"] = value.\n\n Returns:\n err_msg (None | str): If successful, nothing is returned, otherwise,\n return an error message.\n \"\"\"\n endpoint: str = f\"run_control/{exp}/ws/add_run_params\"\n\n params: Dict[str, Any] = {\"params\": {\"run_num\": run}, \"json\": data}\n\n status_code, resp_msg, _ = elog_http_request(\n exp=exp, endpoint=endpoint, request_type=\"POST\", **params\n )\n\n if resp_msg != \"SUCCESS\":\n return resp_msg\n
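Example (an illustrative sketch; the experiment name, run number and column values are placeholders):
from lute.io.elog import post_elog_run_table\n\nerr = post_elog_run_table(\n    exp=\"mfxx00000\",  # Placeholder experiment name\n    run=12,\n    data={\"Indexed frames\": 3511, \"Indexing rate (%)\": 41.7},  # Placeholder values\n)\nif err is not None:\n    print(f\"Run table update failed: {err}\")\n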
"},{"location":"source/io/elog/#io.elog.post_elog_workflow","title":"post_elog_workflow(exp, name, executable, wf_params, *, trigger='run_end', location='S3DF', **trig_args)
","text":"Create a new eLog workflow, or update an existing one.
The workflow will run a specific executable as a batch job when the specified trigger occurs. The precise arguments may vary depending on the selected trigger type.
Parameters:
Name Type Description Defaultname
str
An identifying name for the workflow. E.g. \"process data\"
requiredexecutable
str
Full path to the executable to be run.
requiredwf_params
str
All command-line parameters for the executable as a string.
requiredtrigger
str
When to trigger execution of the specified executable. One of: - 'manual': Must be manually triggered. No automatic processing. - 'run_start': Execute immediately if a new run begins. - 'run_end': As soon as a run ends. - 'param_is': As soon as a parameter has a specific value for a run.
'run_end'
location
str
Where to submit the job. S3DF or NERSC.
'S3DF'
**trig_args
str
Arguments required for a specific trigger type. trigger='param_is' - 2 Arguments trig_param (str): Name of the parameter to watch for. trig_param_val (str): Value the parameter should have to trigger.
{}
Source code in lute/io/elog.py
def post_elog_workflow(\n exp: str,\n name: str,\n executable: str,\n wf_params: str,\n *,\n trigger: str = \"run_end\",\n location: str = \"S3DF\",\n **trig_args: str,\n) -> None:\n \"\"\"Create a new eLog workflow, or update an existing one.\n\n The workflow will run a specific executable as a batch job when the\n specified trigger occurs. The precise arguments may vary depending on the\n selected trigger type.\n\n Args:\n name (str): An identifying name for the workflow. E.g. \"process data\"\n\n executable (str): Full path to the executable to be run.\n\n wf_params (str): All command-line parameters for the executable as a string.\n\n trigger (str): When to trigger execution of the specified executable.\n One of:\n - 'manual': Must be manually triggered. No automatic processing.\n - 'run_start': Execute immediately if a new run begins.\n - 'run_end': As soon as a run ends.\n - 'param_is': As soon as a parameter has a specific value for a run.\n\n location (str): Where to submit the job. S3DF or NERSC.\n\n **trig_args (str): Arguments required for a specific trigger type.\n trigger='param_is' - 2 Arguments\n trig_param (str): Name of the parameter to watch for.\n trig_param_val (str): Value the parameter should have to trigger.\n \"\"\"\n endpoint: str = f\"{exp}/ws/create_update_workflow_def\"\n trig_map: Dict[str, str] = {\n \"manual\": \"MANUAL\",\n \"run_start\": \"START_OF_RUN\",\n \"run_end\": \"END_OF_RUN\",\n \"param_is\": \"RUN_PARAM_IS_VALUE\",\n }\n if trigger not in trig_map.keys():\n raise NotImplementedError(\n f\"Cannot create workflow with trigger type: {trigger}\"\n )\n wf_defn: Dict[str, str] = {\n \"name\": name,\n \"executable\": executable,\n \"parameters\": wf_params,\n \"trigger\": trig_map[trigger],\n \"location\": location,\n }\n if trigger == \"param_is\":\n if \"trig_param\" not in trig_args or \"trig_param_val\" not in trig_args:\n raise RuntimeError(\n \"Trigger type 'param_is' requires: 'trig_param' and 'trig_param_val' arguments\"\n )\n wf_defn.update(\n {\n \"run_param_name\": trig_args[\"trig_param\"],\n \"run_param_val\": trig_args[\"trig_param_val\"],\n }\n )\n post_params: Dict[str, Dict[str, str]] = {\"json\": wf_defn}\n status_code, resp_msg, _ = elog_http_request(\n exp, endpoint=endpoint, request_type=\"POST\", **post_params\n )\n
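Example (an illustrative sketch; the experiment name, executable path, command-line parameters and trigger parameter are placeholders):
from lute.io.elog import post_elog_workflow\n\npost_elog_workflow(\n    exp=\"mfxx00000\",  # Placeholder experiment name\n    name=\"process data\",\n    executable=\"/path/to/launch_script.sh\",  # Placeholder executable\n    wf_params=\"-c /path/to/config.yaml -w my_workflow\",  # Placeholder parameters\n    trigger=\"param_is\",\n    location=\"S3DF\",\n    trig_param=\"POST_RUN_FLAG\",  # Placeholder parameter name\n    trig_param_val=\"1\",\n)\n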
"},{"location":"source/io/exceptions/","title":"exceptions","text":"Specifies custom exceptions defined for IO problems.
Raises:
Type DescriptionElogFileFormatError
Raised if an attachment is specified in an incorrect format.
"},{"location":"source/io/exceptions/#io.exceptions.ElogFileFormatError","title":"ElogFileFormatError
","text":" Bases: Exception
Raised when an eLog attachment is specified in an invalid format.
Source code inlute/io/exceptions.py
class ElogFileFormatError(Exception):\n \"\"\"Raised when an eLog attachment is specified in an invalid format.\"\"\"\n\n ...\n
"},{"location":"source/io/models/base/","title":"base","text":"Base classes for describing Task parameters.
Classes:
Name DescriptionAnalysisHeader
Model holding shared configuration across Tasks. E.g. experiment name, run number and working directory.
TaskParameters
Base class for Task parameters. Subclasses specify a model of parameters and their types for validation.
ThirdPartyParameters
Base class for Third-party, binary executable Tasks.
TemplateParameters
Dataclass to represent parameters of binary (third-party) Tasks which are used for additional config files.
TemplateConfig
Class for holding information on where templates are stored in order to properly handle ThirdPartyParameter objects.
"},{"location":"source/io/models/base/#io.models.base.AnalysisHeader","title":"AnalysisHeader
","text":" Bases: BaseModel
Header information for LUTE analysis runs.
Source code inlute/io/models/base.py
class AnalysisHeader(BaseModel):\n \"\"\"Header information for LUTE analysis runs.\"\"\"\n\n title: str = Field(\n \"LUTE Task Configuration\",\n description=\"Description of the configuration or experiment.\",\n )\n experiment: str = Field(\"\", description=\"Experiment.\")\n run: Union[str, int] = Field(\"\", description=\"Data acquisition run.\")\n date: str = Field(\"1970/01/01\", description=\"Start date of analysis.\")\n lute_version: Union[float, str] = Field(\n 0.1, description=\"Version of LUTE used for analysis.\"\n )\n task_timeout: PositiveInt = Field(\n 600,\n description=(\n \"Time in seconds until a task times out. Should be slightly shorter\"\n \" than job timeout if using a job manager (e.g. SLURM).\"\n ),\n )\n work_dir: str = Field(\"\", description=\"Main working directory for LUTE.\")\n\n @validator(\"work_dir\", always=True)\n def validate_work_dir(cls, directory: str, values: Dict[str, Any]) -> str:\n work_dir: str\n if directory == \"\":\n std_work_dir = (\n f\"/sdf/data/lcls/ds/{values['experiment'][:3]}/\"\n f\"{values['experiment']}/scratch\"\n )\n work_dir = std_work_dir\n else:\n work_dir = directory\n # Check existence and permissions\n if not os.path.exists(work_dir):\n raise ValueError(f\"Working Directory: {work_dir} does not exist!\")\n if not os.access(work_dir, os.W_OK):\n # Need write access for database, files etc.\n raise ValueError(f\"Not write access for working directory: {work_dir}!\")\n return work_dir\n\n @validator(\"run\", always=True)\n def validate_run(\n cls, run: Union[str, int], values: Dict[str, Any]\n ) -> Union[str, int]:\n if run == \"\":\n # From Airflow RUN_NUM should have Format \"RUN_DATETIME\" - Num is first part\n run_time: str = os.environ.get(\"RUN_NUM\", \"\")\n if run_time != \"\":\n return int(run_time.split(\"_\")[0])\n return run\n\n @validator(\"experiment\", always=True)\n def validate_experiment(cls, experiment: str, values: Dict[str, Any]) -> str:\n if experiment == \"\":\n arp_exp: str = os.environ.get(\"EXPERIMENT\", \"EXPX00000\")\n return arp_exp\n return experiment\n
"},{"location":"source/io/models/base/#io.models.base.TaskParameters","title":"TaskParameters
","text":" Bases: BaseSettings
Base class for models of task parameters to be validated.
Parameters are read from a configuration YAML file and validated against subclasses of this type in order to ensure that both all parameters are present, and that the parameters are of the correct type.
NotePydantic is used for data validation. Pydantic does not perform \"strict\" validation by default. Parameter values may be cast to conform with the model specified by the subclass definition if it is possible to do so. Consider whether this may cause issues (e.g. if a float is cast to an int).
Source code inlute/io/models/base.py
class TaskParameters(BaseSettings):\n \"\"\"Base class for models of task parameters to be validated.\n\n Parameters are read from a configuration YAML file and validated against\n subclasses of this type in order to ensure that both all parameters are\n present, and that the parameters are of the correct type.\n\n Note:\n Pydantic is used for data validation. Pydantic does not perform \"strict\"\n validation by default. Parameter values may be cast to conform with the\n model specified by the subclass definition if it is possible to do so.\n Consider whether this may cause issues (e.g. if a float is cast to an\n int).\n \"\"\"\n\n class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n\n lute_config: AnalysisHeader\n
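As an orientation sketch (not an actual LUTE Task), a parameter model subclasses TaskParameters and declares typed fields which are then validated against the configuration YAML:
from lute.io.models.base import TaskParameters\n\n\nclass MyHypotheticalTaskParameters(TaskParameters):\n    \"\"\"Illustrative parameter model; not an actual LUTE Task.\"\"\"\n\n    input_file: str = \"\"  # Hypothetical parameter\n    n_events: int = 100  # Hypothetical parameter\n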
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config","title":"Config
","text":"Configuration for parameters model.
The Config class holds Pydantic configuration. A number of LUTE-specific configuration has also been placed here.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc. Only used if set_result==True
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however. Only used if set_result==True
lute/io/models/base.py
class Config:\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration. A number of LUTE-specific\n configuration has also been placed here.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). False. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc. Only used if\n `set_result==True`\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however. Only used if `set_result==True`\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if `set_result==True`.\n \"\"\"\n\n env_prefix = \"LUTE_\"\n underscore_attrs_are_private: bool = True\n copy_on_model_validation: str = \"deep\"\n allow_inf_nan: bool = False\n\n run_directory: Optional[str] = None\n \"\"\"Set the directory that the Task is run from.\"\"\"\n set_result: bool = False\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n result_from_params: Optional[str] = None\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n result_summary: Optional[str] = None\n \"\"\"Format a TaskResult.summary from output.\"\"\"\n impl_schemas: Optional[str] = None\n \"\"\"Schema specification for output result. Will be passed to TaskResult.\"\"\"\n
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.impl_schemas","title":"impl_schemas: Optional[str] = None
class-attribute
instance-attribute
","text":"Schema specification for output result. Will be passed to TaskResult.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.result_from_params","title":"result_from_params: Optional[str] = None
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.result_summary","title":"result_summary: Optional[str] = None
class-attribute
instance-attribute
","text":"Format a TaskResult.summary from output.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.run_directory","title":"run_directory: Optional[str] = None
class-attribute
instance-attribute
","text":"Set the directory that the Task is run from.
"},{"location":"source/io/models/base/#io.models.base.TaskParameters.Config.set_result","title":"set_result: bool = False
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/base/#io.models.base.TemplateConfig","title":"TemplateConfig
","text":" Bases: BaseModel
Parameters used for templating of third party configuration files.
Attributes:
Name Type Descriptiontemplate_name
str
The name of the template to use. This template must live in config/templates
.
output_path
str
The FULL path, including filename to write the rendered template to.
Source code inlute/io/models/base.py
class TemplateConfig(BaseModel):\n \"\"\"Parameters used for templating of third party configuration files.\n\n Attributes:\n template_name (str): The name of the template to use. This template must\n live in `config/templates`.\n\n output_path (str): The FULL path, including filename to write the\n rendered template to.\n \"\"\"\n\n template_name: str\n output_path: str\n
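As a usage sketch (mirroring the lute_template_cfg field of FindPeaksPsocakeParameters further below; the template and output names are placeholders):
from lute.io.models.base import TemplateConfig\n\n# Placeholder names; the template must live in config/templates\ntpl_cfg = TemplateConfig(\n    template_name=\"some_template.json\",\n    output_path=\"/path/to/rendered/some_config.json\",\n)\n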
"},{"location":"source/io/models/base/#io.models.base.TemplateParameters","title":"TemplateParameters
","text":"Class for representing parameters for third party configuration files.
These parameters can represent arbitrary data types and are used in conjunction with templates for modifying third party configuration files from the single LUTE YAML. Due to the storage of arbitrary data types, and the use of a template file, a single instance of this class can hold from a single template variable to an entire configuration file. The data parsing is done by jinja using the complementary template. All data is stored in the single model variable params.
The pydantic \"dataclass\" is used over the BaseModel/Settings to allow positional argument instantiation of the params
Field.
lute/io/models/base.py
@dataclass\nclass TemplateParameters:\n \"\"\"Class for representing parameters for third party configuration files.\n\n These parameters can represent arbitrary data types and are used in\n conjunction with templates for modifying third party configuration files\n from the single LUTE YAML. Due to the storage of arbitrary data types, and\n the use of a template file, a single instance of this class can hold from a\n single template variable to an entire configuration file. The data parsing\n is done by jinja using the complementary template.\n All data is stored in the single model variable `params.`\n\n The pydantic \"dataclass\" is used over the BaseModel/Settings to allow\n positional argument instantiation of the `params` Field.\n \"\"\"\n\n params: Any\n
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters","title":"ThirdPartyParameters
","text":" Bases: TaskParameters
Base class for third party task parameters.
Contains special validators for extra arguments and handling of parameters used for filling in third party configuration files.
Source code inlute/io/models/base.py
class ThirdPartyParameters(TaskParameters):\n \"\"\"Base class for third party task parameters.\n\n Contains special validators for extra arguments and handling of parameters\n used for filling in third party configuration files.\n \"\"\"\n\n class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n # lute_template_cfg: TemplateConfig\n\n @root_validator(pre=False)\n def extra_fields_to_thirdparty(cls, values: Dict[str, Any]):\n for key in values:\n if key not in cls.__fields__:\n values[key] = TemplateParameters(values[key])\n\n return values\n
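As an orientation sketch (not an actual LUTE Task), a third-party parameter model subclasses ThirdPartyParameters and uses Field attributes such as flag_type and is_result, mirroring the patterns in the models below; the executable and flags here are hypothetical:
from pydantic import Field\n\nfrom lute.io.models.base import ThirdPartyParameters\n\n\nclass MyHypotheticalThirdPartyParameters(ThirdPartyParameters):\n    \"\"\"Illustrative third-party parameter model; not an actual LUTE Task.\"\"\"\n\n    executable: str = Field(\"/path/to/some_binary\", description=\"Binary to run.\", flag_type=\"\")\n    n: int = Field(4, description=\"Number of threads.\", flag_type=\"-\")\n    out_file: str = Field(\"\", description=\"Output file.\", flag_type=\"--\", is_result=True)\n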
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config","title":"Config
","text":" Bases: Config
Configuration for parameters model.
The Config class holds Pydantic configuration and inherited configuration from the base TaskParameters.Config
class. A number of values are also overridden, and there are some specific configuration options to ThirdPartyParameters. A full list of options (with TaskParameters options repeated) is described below.
Attributes:
Name Type Descriptionenv_prefix
str
Pydantic configuration. Will set parameters from environment variables containing this prefix. E.g. a model parameter input
can be set with an environment variable: {env_prefix}input
, in LUTE's case LUTE_input
.
underscore_attrs_are_private
bool
Pydantic configuration. Whether to hide attributes (parameters) prefixed with an underscore.
copy_on_model_validation
str
Pydantic configuration. How to copy the input object passed to the class instance for model validation. Set to perform a deep copy.
allow_inf_nan
bool
Pydantic configuration. Whether to allow infinity or NAN in float fields.
run_directory
Optional[str]
None. If set, it should be a valid path. The Task
will be run from this directory. This may be useful for some Task
s which rely on searching the working directory.
result_from_params
Optional[str]
None. Optionally used to define results from information available in the model using a custom validator. E.g. use a outdir
and filename
field to set result_from_params=f\"{outdir}/{filename}
, etc.
result_summary
Optional[str]
None. Defines a result summary that can be known after processing the Pydantic model. Use of summary depends on the Executor running the Task. All summaries are stored in the database, however.
ThirdPartyTask-specific
Optional[str]
extra
str
\"allow\". Pydantic configuration. Allow (or ignore) extra arguments.
short_flags_use_eq
bool
False. If True, \"short\" command-line args are passed as -x=arg
. ThirdPartyTask-specific.
long_flags_use_eq
bool
False. If True, \"long\" command-line args are passed as --long=arg
. ThirdPartyTask-specific.
lute/io/models/base.py
class Config(TaskParameters.Config):\n \"\"\"Configuration for parameters model.\n\n The Config class holds Pydantic configuration and inherited configuration\n from the base `TaskParameters.Config` class. A number of values are also\n overridden, and there are some specific configuration options to\n ThirdPartyParameters. A full list of options (with TaskParameters options\n repeated) is described below.\n\n Attributes:\n env_prefix (str): Pydantic configuration. Will set parameters from\n environment variables containing this prefix. E.g. a model\n parameter `input` can be set with an environment variable:\n `{env_prefix}input`, in LUTE's case `LUTE_input`.\n\n underscore_attrs_are_private (bool): Pydantic configuration. Whether\n to hide attributes (parameters) prefixed with an underscore.\n\n copy_on_model_validation (str): Pydantic configuration. How to copy\n the input object passed to the class instance for model\n validation. Set to perform a deep copy.\n\n allow_inf_nan (bool): Pydantic configuration. Whether to allow\n infinity or NAN in float fields.\n\n run_directory (Optional[str]): None. If set, it should be a valid\n path. The `Task` will be run from this directory. This may be\n useful for some `Task`s which rely on searching the working\n directory.\n\n set_result (bool). True. If True, the model has information about\n setting the TaskResult object from the parameters it contains.\n E.g. it has an `output` parameter which is marked as the result.\n The result can be set with a field value of `is_result=True` on\n a specific parameter, or using `result_from_params` and a\n validator.\n\n result_from_params (Optional[str]): None. Optionally used to define\n results from information available in the model using a custom\n validator. E.g. use a `outdir` and `filename` field to set\n `result_from_params=f\"{outdir}/{filename}`, etc.\n\n result_summary (Optional[str]): None. Defines a result summary that\n can be known after processing the Pydantic model. Use of summary\n depends on the Executor running the Task. All summaries are\n stored in the database, however.\n\n impl_schemas (Optional[str]). Specifies a the schemas the\n output/results conform to. Only used if set_result is True.\n\n -----------------------\n ThirdPartyTask-specific:\n\n extra (str): \"allow\". Pydantic configuration. Allow (or ignore) extra\n arguments.\n\n short_flags_use_eq (bool): False. If True, \"short\" command-line args\n are passed as `-x=arg`. ThirdPartyTask-specific.\n\n long_flags_use_eq (bool): False. If True, \"long\" command-line args\n are passed as `--long=arg`. ThirdPartyTask-specific.\n \"\"\"\n\n extra: str = \"allow\"\n short_flags_use_eq: bool = False\n \"\"\"Whether short command-line arguments are passed like `-x=arg`.\"\"\"\n long_flags_use_eq: bool = False\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/base/#io.models.base.ThirdPartyParameters.Config.short_flags_use_eq","title":"short_flags_use_eq: bool = False
class-attribute
instance-attribute
","text":"Whether short command-line arguments are passed like -x=arg
.
FindPeaksPsocakeParameters
","text":" Bases: ThirdPartyParameters
Parameters for crystallographic (Bragg) peak finding using Psocake.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation. NOTE: This Task is deprecated and provided for compatibility only.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPsocakeParameters(ThirdPartyParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using Psocake.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n NOTE: This Task is deprecated and provided for compatibility only.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n class SZParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description=\"SZ compression algorithm (qoz, sz3)\"\n )\n binSize: int = Field(2, description=\"SZ compression's bin size paramater\")\n roiWindowSize: int = Field(\n 2, description=\"SZ compression's ROI window size paramater\"\n )\n absError: float = Field(10, descriptionp=\"Maximum absolute error value\")\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n mca: str = Field(\n \"btl ^openib\", description=\"Mca option for the MPI executable\", flag_type=\"--\"\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n p_arg2: str = Field(\n \"findPeaksSZ.py\",\n description=\"Executable to run with mpi (i.e. python).\",\n flag_type=\"\",\n )\n d: str = Field(description=\"Detector name\", flag_type=\"-\")\n e: str = Field(\"\", description=\"Experiment name\", flag_type=\"-\")\n r: int = Field(-1, description=\"Run number\", flag_type=\"-\")\n outDir: str = Field(\n description=\"Output directory where .cxi will be saved\", flag_type=\"--\"\n )\n algorithm: int = Field(1, description=\"PyAlgos algorithm to use\", flag_type=\"--\")\n alg_npix_min: float = Field(\n 1.0, description=\"PyAlgos algorithm's npix_min parameter\", flag_type=\"--\"\n )\n alg_npix_max: float = Field(\n 45.0, description=\"PyAlgos algorithm's npix_max parameter\", flag_type=\"--\"\n )\n alg_amax_thr: float = Field(\n 250.0, description=\"PyAlgos algorithm's amax_thr parameter\", flag_type=\"--\"\n )\n alg_atot_thr: float = Field(\n 330.0, description=\"PyAlgos algorithm's atot_thr parameter\", flag_type=\"--\"\n )\n alg_son_min: float = Field(\n 10.0, description=\"PyAlgos algorithm's son_min parameter\", flag_type=\"--\"\n )\n alg1_thr_low: float = Field(\n 80.0, description=\"PyAlgos algorithm's thr_low parameter\", flag_type=\"--\"\n )\n alg1_thr_high: float = Field(\n 270.0, description=\"PyAlgos algorithm's thr_high parameter\", flag_type=\"--\"\n )\n alg1_rank: int = Field(\n 3, description=\"PyAlgos algorithm's rank parameter\", flag_type=\"--\"\n )\n alg1_radius: int = Field(\n 3, description=\"PyAlgos algorithm's radius parameter\", flag_type=\"--\"\n )\n alg1_dr: int = Field(\n 1, description=\"PyAlgos algorithm's dr parameter\", flag_type=\"--\"\n )\n psanaMask_on: str = Field(\n \"True\", description=\"Whether psana's mask should be used\", flag_type=\"--\"\n )\n psanaMask_calib: str = Field(\n \"True\", description=\"Psana mask's calib parameter\", flag_type=\"--\"\n )\n psanaMask_status: str = Field(\n \"True\", description=\"Psana mask's status 
parameter\", flag_type=\"--\"\n )\n psanaMask_edges: str = Field(\n \"True\", description=\"Psana mask's edges parameter\", flag_type=\"--\"\n )\n psanaMask_central: str = Field(\n \"True\", description=\"Psana mask's central parameter\", flag_type=\"--\"\n )\n psanaMask_unbond: str = Field(\n \"True\", description=\"Psana mask's unbond parameter\", flag_type=\"--\"\n )\n psanaMask_unbondnrs: str = Field(\n \"True\", description=\"Psana mask's unbondnbrs parameter\", flag_type=\"--\"\n )\n mask: str = Field(\n \"\", description=\"Path to an additional mask to apply\", flag_type=\"--\"\n )\n clen: str = Field(\n description=\"Epics variable storing the camera length\", flag_type=\"--\"\n )\n coffset: float = Field(0, description=\"Camera offset in m\", flag_type=\"--\")\n minPeaks: int = Field(\n 15,\n description=\"Minimum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n maxPeaks: int = Field(\n 15,\n description=\"Maximum number of peaks to mark frame for indexing\",\n flag_type=\"--\",\n )\n minRes: int = Field(\n 0,\n description=\"Minimum peak resolution to mark frame for indexing \",\n flag_type=\"--\",\n )\n sample: str = Field(\"\", description=\"Sample name\", flag_type=\"--\")\n instrument: Union[None, str] = Field(\n None, description=\"Instrument name\", flag_type=\"--\"\n )\n pixelSize: float = Field(0.0, description=\"Pixel size\", flag_type=\"--\")\n auto: str = Field(\n \"False\",\n description=(\n \"Whether to automatically determine peak per event peak \"\n \"finding parameters\"\n ),\n flag_type=\"--\",\n )\n detectorDistance: float = Field(\n 0.0, description=\"Detector distance from interaction point in m\", flag_type=\"--\"\n )\n access: Literal[\"ana\", \"ffb\"] = Field(\n \"ana\", description=\"Data node type: {ana,ffb}\", flag_type=\"--\"\n )\n szfile: str = Field(\"qoz.json\", description=\"Path to SZ's JSON configuration file\")\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"sz.json\",\n output_path=\"\", # Will want to change where this goes...\n ),\n description=\"Template information for the sz.json file\",\n )\n sz_parameters: SZParameters = Field(\n description=\"Configuration parameters for SZ Compression\", flag_type=\"\"\n )\n\n @validator(\"e\", always=True)\n def validate_e(cls, e: str, values: Dict[str, Any]) -> str:\n if e == \"\":\n return values[\"lute_config\"].experiment\n return e\n\n @validator(\"r\", always=True)\n def validate_r(cls, r: int, values: Dict[str, Any]) -> int:\n if r == -1:\n return values[\"lute_config\"].run\n return r\n\n @validator(\"lute_template_cfg\", always=True)\n def set_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"szfile\"]\n return lute_template_cfg\n\n @validator(\"sz_parameters\", always=True)\n def set_sz_compression_parameters(\n cls, sz_parameters: SZParameters, values: Dict[str, Any]\n ) -> None:\n values[\"compressor\"] = sz_parameters.compressor\n values[\"binSize\"] = sz_parameters.binSize\n values[\"roiWindowSize\"] = sz_parameters.roiWindowSize\n if sz_parameters.compressor == \"qoz\":\n values[\"pressio_opts\"] = {\n \"pressio:abs\": sz_parameters.absError,\n \"qoz\": {\"qoz:stride\": 8},\n }\n else:\n values[\"pressio_opts\"] = {\"pressio:abs\": sz_parameters.absError}\n return None\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = 
values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n directory: str = values[\"outDir\"]\n fname: str = f\"{exp}_{run:04d}.lst\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPsocakeParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters","title":"FindPeaksPyAlgosParameters
","text":" Bases: TaskParameters
Parameters for crystallographic (Bragg) peak finding using PyAlgos.
This peak finding Task optionally has the ability to compress/decompress data with SZ for the purpose of compression validation.
Source code inlute/io/models/sfx_find_peaks.py
class FindPeaksPyAlgosParameters(TaskParameters):\n \"\"\"Parameters for crystallographic (Bragg) peak finding using PyAlgos.\n\n This peak finding Task optionally has the ability to compress/decompress\n data with SZ for the purpose of compression validation.\n \"\"\"\n\n class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n class SZCompressorParameters(BaseModel):\n compressor: Literal[\"qoz\", \"sz3\"] = Field(\n \"qoz\", description='Compression algorithm (\"qoz\" or \"sz3\")'\n )\n abs_error: float = Field(10.0, description=\"Absolute error bound\")\n bin_size: int = Field(2, description=\"Bin size\")\n roi_window_size: int = Field(\n 9,\n description=\"Default window size\",\n )\n\n outdir: str = Field(\n description=\"Output directory for cxi files\",\n )\n n_events: int = Field(\n 0,\n description=\"Number of events to process (0 to process all events)\",\n )\n det_name: str = Field(\n description=\"Psana name of the detector storing the image data\",\n )\n event_receiver: Literal[\"evr0\", \"evr1\"] = Field(\n description=\"Event Receiver to be used: evr0 or evr1\",\n )\n tag: str = Field(\n \"\",\n description=\"Tag to add to the output file names\",\n )\n pv_camera_length: Union[str, float] = Field(\n \"\",\n description=\"PV associated with camera length \"\n \"(if a number, camera length directly)\",\n )\n event_logic: bool = Field(\n False,\n description=\"True if only events with a specific event code should be \"\n \"processed. False if the event code should be ignored\",\n )\n event_code: int = Field(\n 0,\n description=\"Required events code for events to be processed if event logic \"\n \"is True\",\n )\n psana_mask: bool = Field(\n False,\n description=\"If True, apply mask from psana Detector object\",\n )\n mask_file: Union[str, None] = Field(\n None,\n description=\"File with a custom mask to apply. 
If None, no custom mask is \"\n \"applied\",\n )\n min_peaks: int = Field(2, description=\"Minimum number of peaks per image\")\n max_peaks: int = Field(\n 2048,\n description=\"Maximum number of peaks per image\",\n )\n npix_min: int = Field(\n 2,\n description=\"Minimum number of pixels per peak\",\n )\n npix_max: int = Field(\n 30,\n description=\"Maximum number of pixels per peak\",\n )\n amax_thr: float = Field(\n 80.0,\n description=\"Minimum intensity threshold for starting a peak\",\n )\n atot_thr: float = Field(\n 120.0,\n description=\"Minimum summed intensity threshold for pixel collection\",\n )\n son_min: float = Field(\n 7.0,\n description=\"Minimum signal-to-noise ratio to be considered a peak\",\n )\n peak_rank: int = Field(\n 3,\n description=\"Radius in which central peak pixel is a local maximum\",\n )\n r0: float = Field(\n 3.0,\n description=\"Radius of ring for background evaluation in pixels\",\n )\n dr: float = Field(\n 2.0,\n description=\"Width of ring for background evaluation in pixels\",\n )\n nsigm: float = Field(\n 7.0,\n description=\"Intensity threshold to include pixel in connected group\",\n )\n compression: Optional[SZCompressorParameters] = Field(\n None,\n description=\"Options for the SZ Compression Algorithm\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n fname: Path = (\n Path(values[\"outdir\"])\n / f\"{values['lute_config'].experiment}_{values['lute_config'].run}_\"\n f\"{values['tag']}.list\"\n )\n return str(fname)\n return out_file\n
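Most of these parameter models follow the same convention: a string field that defaults to an empty string is filled in by a validator, typically from the experiment and run information in lute_config or from the output of an earlier managed Task. The following is a minimal, self-contained sketch of that pattern only (not the actual LUTE class), assuming pydantic v1 as used by the models above; the experiment, run, and directory values are hypothetical.

from pathlib import Path
from typing import Any, Dict

from pydantic import BaseModel, Field, validator


class PeakFindingSketch(BaseModel):
    """Illustration only: mirrors how out_file is derived when left empty."""

    experiment: str
    run: int
    outdir: str
    tag: str = ""
    out_file: str = Field("", description="Derived by a validator if left empty.")

    @validator("out_file", always=True)
    def derive_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:
        # An explicit value always wins; only an empty string triggers the default.
        if out_file == "":
            fname = Path(values["outdir"]) / (
                f"{values['experiment']}_{values['run']}_{values['tag']}.list"
            )
            return str(fname)
        return out_file


# Hypothetical values for illustration:
params = PeakFindingSketch(experiment="mfxp1234", run=7, outdir="/tmp/peaks")
print(params.out_file)  # -> /tmp/peaks/mfxp1234_7_.list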
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_find_peaks.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_find_peaks/#io.models.sfx_find_peaks.FindPeaksPyAlgosParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_index/","title":"sfx_index","text":"Models for serial femtosecond crystallography indexing.
Classes:
Name DescriptionIndexCrystFELParameters
Perform indexing of hits/peaks using CrystFEL's indexamajig
.
ConcatenateStreamFilesParameters
","text":" Bases: TaskParameters
Parameters for stream concatenation.
Concatenates the stream file output from CrystFEL indexing for multiple experimental runs.
Source code inlute/io/models/sfx_index.py
class ConcatenateStreamFilesParameters(TaskParameters):\n    \"\"\"Parameters for stream concatenation.\n\n    Concatenates the stream file output from CrystFEL indexing for multiple\n    experimental runs.\n    \"\"\"\n\n    class Config(TaskParameters.Config):\n        set_result: bool = True\n        \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n    in_file: str = Field(\n        \"\",\n        description=\"Root of directory tree storing stream files to merge.\",\n    )\n\n    tag: Optional[str] = Field(\n        \"\",\n        description=\"Tag identifying the stream files to merge.\",\n    )\n\n    out_file: str = Field(\n        \"\", description=\"Path to merged output stream file.\", is_result=True\n    )\n\n    @validator(\"in_file\", always=True)\n    def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n        if in_file == \"\":\n            stream_file: Optional[str] = read_latest_db_entry(\n                f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n            )\n            if stream_file:\n                stream_dir: str = str(Path(stream_file).parent)\n                return stream_dir\n        return in_file\n\n    @validator(\"tag\", always=True)\n    def validate_tag(cls, tag: str, values: Dict[str, Any]) -> str:\n        if tag == \"\":\n            stream_file: Optional[str] = read_latest_db_entry(\n                f\"{values['lute_config'].work_dir}\", \"IndexCrystFEL\", \"out_file\"\n            )\n            if stream_file:\n                stream_tag: str = Path(stream_file).name.split(\"_\")[0]\n                return stream_tag\n        return tag\n\n    @validator(\"out_file\", always=True)\n    def validate_out_file(cls, tag: str, values: Dict[str, Any]) -> str:\n        if tag == \"\":\n            stream_out_file: str = str(\n                Path(values[\"in_file\"]).parent / f\"{values['tag']}.stream\"\n            )\n            return stream_out_file\n        return tag\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.ConcatenateStreamFilesParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(TaskParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.ConcatenateStreamFilesParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters","title":"IndexCrystFELParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's indexamajig
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-indexamajig.html
Source code inlute/io/models/sfx_index.py
class IndexCrystFELParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `indexamajig`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-indexamajig.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/indexamajig\",\n description=\"CrystFEL's indexing binary.\",\n flag_type=\"\",\n )\n # Basic options\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input file.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n geometry: str = Field(\n \"\", description=\"Path to geometry file.\", flag_type=\"-\", rename_param=\"g\"\n )\n zmq_input: Optional[str] = Field(\n description=\"ZMQ address to receive data over. `input` and `zmq-input` are mutually exclusive\",\n flag_type=\"--\",\n rename_param=\"zmq-input\",\n )\n zmq_subscribe: Optional[str] = Field( # Can be used multiple times...\n description=\"Subscribe to ZMQ message of type `tag`\",\n flag_type=\"--\",\n rename_param=\"zmq-subscribe\",\n )\n zmq_request: Optional[AnyUrl] = Field(\n description=\"Request new data over ZMQ by sending this value\",\n flag_type=\"--\",\n rename_param=\"zmq-request\",\n )\n asapo_endpoint: Optional[str] = Field(\n description=\"ASAP::O endpoint. zmq-input and this are mutually exclusive.\",\n flag_type=\"--\",\n rename_param=\"asapo-endpoint\",\n )\n asapo_token: Optional[str] = Field(\n description=\"ASAP::O authentication token.\",\n flag_type=\"--\",\n rename_param=\"asapo-token\",\n )\n asapo_beamtime: Optional[str] = Field(\n description=\"ASAP::O beatime.\",\n flag_type=\"--\",\n rename_param=\"asapo-beamtime\",\n )\n asapo_source: Optional[str] = Field(\n description=\"ASAP::O data source.\",\n flag_type=\"--\",\n rename_param=\"asapo-source\",\n )\n asapo_group: Optional[str] = Field(\n description=\"ASAP::O consumer group.\",\n flag_type=\"--\",\n rename_param=\"asapo-group\",\n )\n asapo_stream: Optional[str] = Field(\n description=\"ASAP::O stream.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n asapo_wait_for_stream: Optional[str] = Field(\n description=\"If ASAP::O stream does not exist, wait for it to appear.\",\n flag_type=\"--\",\n rename_param=\"asapo-wait-for-stream\",\n )\n data_format: Optional[str] = Field(\n description=\"Specify format for ZMQ or ASAP::O. `msgpack`, `hdf5` or `seedee`.\",\n flag_type=\"--\",\n rename_param=\"data-format\",\n )\n basename: bool = Field(\n False,\n description=\"Remove directory parts of filenames. Acts before prefix if prefix also given.\",\n flag_type=\"--\",\n )\n prefix: Optional[str] = Field(\n description=\"Add a prefix to the filenames from the infile argument.\",\n flag_type=\"--\",\n rename_param=\"asapo-stream\",\n )\n nthreads: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of threads to use. 
See also `max_indexer_threads`.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n no_check_prefix: bool = Field(\n False,\n description=\"Don't attempt to correct the prefix if it seems incorrect.\",\n flag_type=\"--\",\n rename_param=\"no-check-prefix\",\n )\n highres: Optional[float] = Field(\n description=\"Mark all pixels greater than `x` has bad.\", flag_type=\"--\"\n )\n profile: bool = Field(\n False, description=\"Display timing data to monitor performance.\", flag_type=\"--\"\n )\n temp_dir: Optional[str] = Field(\n description=\"Specify a path for the temp files folder.\",\n flag_type=\"--\",\n rename_param=\"temp-dir\",\n )\n wait_for_file: conint(gt=-2) = Field(\n 0,\n description=\"Wait at most `x` seconds for a file to be created. A value of -1 means wait forever.\",\n flag_type=\"--\",\n rename_param=\"wait-for-file\",\n )\n no_image_data: bool = Field(\n False,\n description=\"Load only the metadata, no iamges. Can check indexability without high data requirements.\",\n flag_type=\"--\",\n rename_param=\"no-image-data\",\n )\n # Peak-finding options\n # ....\n # Indexing options\n indexing: Optional[str] = Field(\n description=\"Comma-separated list of supported indexing algorithms to use. Default is to automatically detect.\",\n flag_type=\"--\",\n )\n cell_file: Optional[str] = Field(\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n tolerance: str = Field(\n \"5,5,5,1.5\",\n description=(\n \"Tolerances (in percent) for unit cell comparison. \"\n \"Comma-separated list a,b,c,angle. Default=5,5,5,1.5\"\n ),\n flag_type=\"--\",\n )\n no_check_cell: bool = Field(\n False,\n description=\"Do not check cell parameters against unit cell. Replaces '-raw' method.\",\n flag_type=\"--\",\n rename_param=\"no-check-cell\",\n )\n no_check_peaks: bool = Field(\n False,\n description=\"Do not verify peaks are accounted for by solution.\",\n flag_type=\"--\",\n rename_param=\"no-check-peaks\",\n )\n multi: bool = Field(\n False, description=\"Enable multi-lattice indexing.\", flag_type=\"--\"\n )\n wavelength_estimate: Optional[float] = Field(\n description=\"Estimate for X-ray wavelength. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"wavelength-estimate\",\n )\n camera_length_estimate: Optional[float] = Field(\n description=\"Estimate for camera distance. Required for some methods.\",\n flag_type=\"--\",\n rename_param=\"camera-length-estimate\",\n )\n max_indexer_threads: Optional[PositiveInt] = Field(\n # 1,\n description=\"Some indexing algos can use multiple threads. 
In addition to image-based.\",\n flag_type=\"--\",\n rename_param=\"max-indexer-threads\",\n )\n no_retry: bool = Field(\n False,\n description=\"Do not remove weak peaks and try again.\",\n flag_type=\"--\",\n rename_param=\"no-retry\",\n )\n no_refine: bool = Field(\n False,\n description=\"Skip refinement step.\",\n flag_type=\"--\",\n rename_param=\"no-refine\",\n )\n no_revalidate: bool = Field(\n False,\n description=\"Skip revalidation step.\",\n flag_type=\"--\",\n rename_param=\"no-revalidate\",\n )\n # TakeTwo specific parameters\n taketwo_member_threshold: Optional[PositiveInt] = Field(\n # 20,\n description=\"Minimum number of vectors to consider.\",\n flag_type=\"--\",\n rename_param=\"taketwo-member-threshold\",\n )\n taketwo_len_tolerance: Optional[PositiveFloat] = Field(\n # 0.001,\n description=\"TakeTwo length tolerance in Angstroms.\",\n flag_type=\"--\",\n rename_param=\"taketwo-len-tolerance\",\n )\n taketwo_angle_tolerance: Optional[PositiveFloat] = Field(\n # 0.6,\n description=\"TakeTwo angle tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-angle-tolerance\",\n )\n taketwo_trace_tolerance: Optional[PositiveFloat] = Field(\n # 3,\n description=\"Matrix trace tolerance in degrees.\",\n flag_type=\"--\",\n rename_param=\"taketwo-trace-tolerance\",\n )\n # Felix-specific parameters\n # felix_domega\n # felix-fraction-max-visits\n # felix-max-internal-angle\n # felix-max-uniqueness\n # felix-min-completeness\n # felix-min-visits\n # felix-num-voxels\n # felix-sigma\n # felix-tthrange-max\n # felix-tthrange-min\n # XGANDALF-specific parameters\n xgandalf_sampling_pitch: Optional[NonNegativeInt] = Field(\n # 6,\n description=\"Density of reciprocal space sampling.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-sampling-pitch\",\n )\n xgandalf_grad_desc_iterations: Optional[NonNegativeInt] = Field(\n # 4,\n description=\"Number of gradient descent iterations.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-grad-desc-iterations\",\n )\n xgandalf_tolerance: Optional[PositiveFloat] = Field(\n # 0.02,\n description=\"Relative tolerance of lattice vectors\",\n flag_type=\"--\",\n rename_param=\"xgandalf-tolerance\",\n )\n xgandalf_no_deviation_from_provided_cell: Optional[bool] = Field(\n description=\"Found unit cell must match provided.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-no-deviation-from-provided-cell\",\n )\n xgandalf_min_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 30,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-min-lattice-vector-length\",\n )\n xgandalf_max_lattice_vector_length: Optional[PositiveFloat] = Field(\n # 250,\n description=\"Minimum possible lattice length.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-lattice-vector-length\",\n )\n xgandalf_max_peaks: Optional[PositiveInt] = Field(\n # 250,\n description=\"Maximum number of peaks to use for indexing.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-max-peaks\",\n )\n xgandalf_fast_execution: bool = Field(\n False,\n description=\"Shortcut to set sampling-pitch=2, and grad-desc-iterations=3.\",\n flag_type=\"--\",\n rename_param=\"xgandalf-fast-execution\",\n )\n # pinkIndexer parameters\n # ...\n # asdf_fast: bool = Field(False, description=\"Enable fast mode for asdf. 
3x faster for 7% loss in accuracy.\", flag_type=\"--\", rename_param=\"asdf-fast\")\n # Integration parameters\n integration: str = Field(\n \"rings-nocen\", description=\"Method for integrating reflections.\", flag_type=\"--\"\n )\n fix_profile_radius: Optional[float] = Field(\n description=\"Fix the profile radius (m^{-1})\",\n flag_type=\"--\",\n rename_param=\"fix-profile-radius\",\n )\n fix_divergence: Optional[float] = Field(\n 0,\n description=\"Fix the divergence (rad, full angle).\",\n flag_type=\"--\",\n rename_param=\"fix-divergence\",\n )\n int_radius: str = Field(\n \"4,5,7\",\n description=\"Inner, middle, and outer radii for 3-ring integration.\",\n flag_type=\"--\",\n rename_param=\"int-radius\",\n )\n int_diag: str = Field(\n \"none\",\n description=\"Show detailed information on integration when condition is met.\",\n flag_type=\"--\",\n rename_param=\"int-diag\",\n )\n push_res: str = Field(\n \"infinity\",\n description=\"Integrate `x` higher than apparent resolution limit (nm-1).\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n overpredict: bool = Field(\n False,\n description=\"Over-predict reflections. Maybe useful with post-refinement.\",\n flag_type=\"--\",\n )\n cell_parameters_only: bool = Field(\n False, description=\"Do not predict refletions at all\", flag_type=\"--\"\n )\n # Output parameters\n no_non_hits_in_stream: bool = Field(\n False,\n description=\"Exclude non-hits from the stream file.\",\n flag_type=\"--\",\n rename_param=\"no-non-hits-in-stream\",\n )\n copy_hheader: Optional[str] = Field(\n description=\"Copy information from header in the image to output stream.\",\n flag_type=\"--\",\n rename_param=\"copy-hheader\",\n )\n no_peaks_in_stream: bool = Field(\n False,\n description=\"Do not record peaks in stream file.\",\n flag_type=\"--\",\n rename_param=\"no-peaks-in-stream\",\n )\n no_refls_in_stream: bool = Field(\n False,\n description=\"Do not record reflections in stream.\",\n flag_type=\"--\",\n rename_param=\"no-refls-in-stream\",\n )\n serial_offset: Optional[PositiveInt] = Field(\n description=\"Start numbering at `x` instead of 1.\",\n flag_type=\"--\",\n rename_param=\"serial-offset\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n filename: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPyAlgos\", \"out_file\"\n )\n if filename is None:\n exp: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n tag: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"tag\"\n )\n out_dir: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"FindPeaksPsocake\", \"outDir\"\n )\n if out_dir is not None:\n fname: str = f\"{out_dir}/{exp}_{run:04d}\"\n if tag is not None:\n fname = f\"{fname}_{tag}\"\n return f\"{fname}.lst\"\n else:\n return filename\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n expmt: str = values[\"lute_config\"].experiment\n run: int = int(values[\"lute_config\"].run)\n work_dir: str = values[\"lute_config\"].work_dir\n fname: str = f\"{expmt}_r{run:04d}.stream\"\n return f\"{work_dir}/{fname}\"\n return out_file\n
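Each field above carries metadata (flag_type, rename_param, and the Config option long_flags_use_eq) that the Executor uses to assemble the indexamajig command line. The sketch below is not LUTE's implementation; it only illustrates how such metadata could be translated into arguments, and the parameter values shown are hypothetical.

from typing import Any, Dict, List


def to_cli_args(
    executable: str,
    params: Dict[str, Any],
    meta: Dict[str, Dict[str, str]],
    long_flags_use_eq: bool = True,
) -> List[str]:
    """Illustrative translation of field metadata into command-line arguments."""
    args: List[str] = [executable]
    for name, value in params.items():
        info = meta.get(name, {})
        flag_type = info.get("flag_type", "--")
        flag_name = info.get("rename_param", name)
        if flag_type == "":  # bare/positional argument
            args.append(str(value))
        elif isinstance(value, bool):  # switches are emitted only when True
            if value:
                args.append(f"{flag_type}{flag_name}")
        elif flag_type == "--" and long_flags_use_eq:  # --long=arg style
            args.append(f"--{flag_name}={value}")
        else:  # short flags: -i value
            args.extend([f"{flag_type}{flag_name}", str(value)])
    return args


# Hypothetical parameter values:
cmd = to_cli_args(
    "indexamajig",
    {
        "in_file": "peaks.lst",
        "out_file": "run7.stream",
        "geometry": "det.geom",
        "indexing": "mosflm,xgandalf",
        "multi": True,
    },
    {
        "in_file": {"flag_type": "-", "rename_param": "i"},
        "out_file": {"flag_type": "-", "rename_param": "o"},
        "geometry": {"flag_type": "-", "rename_param": "g"},
    },
)
print(" ".join(cmd))
# indexamajig -i peaks.lst -o run7.stream -g det.geom --indexing=mosflm,xgandalf --multi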
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_index.py
class Config(ThirdPartyParameters.Config):\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n
"},{"location":"source/io/models/sfx_index/#io.models.sfx_index.IndexCrystFELParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/","title":"sfx_merge","text":"Models for merging reflections in serial femtosecond crystallography.
Classes:
Name DescriptionMergePartialatorParameters
Perform merging using CrystFEL's partialator
.
CompareHKLParameters
Calculate figures of merit using CrystFEL's compare_hkl
.
ManipulateHKLParameters
Perform transformations on lists of reflections using CrystFEL's get_hkl
.
CompareHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's compare_hkl
for calculating figures of merit.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code inlute/io/models/sfx_merge.py
class CompareHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `compare_hkl` for calculating figures of merit.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/compare_hkl\",\n description=\"CrystFEL's reflection comparison binary.\",\n flag_type=\"\",\n )\n in_files: Optional[str] = Field(\n \"\",\n description=\"Path to input HKLs. Space-separated list of 2. Use output of partialator e.g.\",\n flag_type=\"\",\n )\n ## Need mechanism to set is_result=True ...\n symmetry: str = Field(\"\", description=\"Point group symmetry.\", flag_type=\"--\")\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n fom: str = Field(\n \"Rsplit\", description=\"Specify figure of merit to calculate.\", flag_type=\"--\"\n )\n nshells: int = Field(10, description=\"Use n resolution shells.\", flag_type=\"--\")\n # NEED A NEW CASE FOR THIS -> Boolean flag, no arg, one hyphen...\n # fix_unity: bool = Field(\n # False,\n # description=\"Fix scale factors to unity.\",\n # flag_type=\"-\",\n # rename_param=\"u\",\n # )\n shell_file: str = Field(\n \"\",\n description=\"Write the statistics in resolution shells to a file.\",\n flag_type=\"--\",\n rename_param=\"shell-file\",\n is_result=True,\n )\n ignore_negs: bool = Field(\n False,\n description=\"Ignore reflections with negative reflections.\",\n flag_type=\"--\",\n rename_param=\"ignore-negs\",\n )\n zero_negs: bool = Field(\n False,\n description=\"Set negative intensities to 0.\",\n flag_type=\"--\",\n rename_param=\"zero-negs\",\n )\n sigma_cutoff: Optional[Union[float, int, str]] = Field(\n # \"-infinity\",\n description=\"Discard reflections with I/sigma(I) < n. -infinity means no cutoff.\",\n flag_type=\"--\",\n rename_param=\"sigma-cutoff\",\n )\n rmin: Optional[float] = Field(\n description=\"Low resolution cutoff of 1/d (m-1). Use this or --lowres NOT both.\",\n flag_type=\"--\",\n )\n lowres: Optional[float] = Field(\n descirption=\"Low resolution cutoff in Angstroms. Use this or --rmin NOT both.\",\n flag_type=\"--\",\n )\n rmax: Optional[float] = Field(\n description=\"High resolution cutoff in 1/d (m-1). Use this or --highres NOT both.\",\n flag_type=\"--\",\n )\n highres: Optional[float] = Field(\n description=\"High resolution cutoff in Angstroms. 
Use this or --rmax NOT both.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_files\", always=True)\n def validate_in_files(cls, in_files: str, values: Dict[str, Any]) -> str:\n if in_files == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n hkls: str = f\"{partialator_file}1 {partialator_file}2\"\n return hkls\n return in_files\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n\n @validator(\"symmetry\", always=True)\n def validate_symmetry(cls, symmetry: str, values: Dict[str, Any]) -> str:\n if symmetry == \"\":\n partialator_sym: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"symmetry\"\n )\n if partialator_sym:\n return partialator_sym\n return symmetry\n\n @validator(\"shell_file\", always=True)\n def validate_shell_file(cls, shell_file: str, values: Dict[str, Any]) -> str:\n if shell_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n shells_out: str = partialator_file.split(\".\")[0]\n shells_out = f\"{shells_out}_{values['fom']}_n{values['nshells']}.dat\"\n return shells_out\n return shell_file\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.CompareHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.CompareHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters","title":"ManipulateHKLParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's get_hkl
for manipulating lists of reflections.
This Task is predominantly used internally to convert hkl
to mtz
files. Note that performing multiple manipulations is undefined behaviour. Run the Task with multiple configurations in explicit separate steps. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
lute/io/models/sfx_merge.py
class ManipulateHKLParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `get_hkl` for manipulating lists of reflections.\n\n This Task is predominantly used internally to convert `hkl` to `mtz` files.\n Note that performing multiple manipulations is undefined behaviour. Run\n the Task with multiple configurations in explicit separate steps. For more\n information on usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/get_hkl\",\n description=\"CrystFEL's reflection manipulation binary.\",\n flag_type=\"\",\n )\n in_file: str = Field(\n \"\",\n description=\"Path to input HKL file.\",\n flag_type=\"-\",\n rename_param=\"i\",\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n cell_file: str = Field(\n \"\",\n description=\"Path to a file containing unit cell information (PDB or CrystFEL format).\",\n flag_type=\"-\",\n rename_param=\"p\",\n )\n output_format: str = Field(\n \"mtz\",\n description=\"Output format. One of mtz, mtz-bij, or xds. Otherwise CrystFEL format.\",\n flag_type=\"--\",\n rename_param=\"output-format\",\n )\n expand: Optional[str] = Field(\n description=\"Reflections will be expanded to fill asymmetric unit of specified point group.\",\n flag_type=\"--\",\n )\n # Reducing reflections to higher symmetry\n twin: Optional[str] = Field(\n description=\"Reflections equivalent to specified point group will have intensities summed.\",\n flag_type=\"--\",\n )\n no_need_all_parts: Optional[bool] = Field(\n description=\"Use with --twin to allow reflections missing a 'twin mate' to be written out.\",\n flag_type=\"--\",\n rename_param=\"no-need-all-parts\",\n )\n # Noise - Add to data\n noise: Optional[bool] = Field(\n description=\"Generate 10% uniform noise.\", flag_type=\"--\"\n )\n poisson: Optional[bool] = Field(\n description=\"Generate Poisson noise. Intensities assumed to be A.U.\",\n flag_type=\"--\",\n )\n adu_per_photon: Optional[int] = Field(\n description=\"Use with --poisson to convert A.U. to photons.\",\n flag_type=\"--\",\n rename_param=\"adu-per-photon\",\n )\n # Remove duplicate reflections\n trim_centrics: Optional[bool] = Field(\n description=\"Duplicated reflections (according to symmetry) are removed.\",\n flag_type=\"--\",\n )\n # Restrict to template file\n template: Optional[str] = Field(\n description=\"Only reflections which also appear in specified file are written out.\",\n flag_type=\"--\",\n )\n # Multiplicity\n multiplicity: Optional[bool] = Field(\n description=\"Reflections are multiplied by their symmetric multiplicites.\",\n flag_type=\"--\",\n )\n # Resolution cutoffs\n cutoff_angstroms: Optional[Union[str, int, float]] = Field(\n description=\"Either n, or n1,n2,n3. For n, reflections < n are removed. 
For n1,n2,n3 anisotropic trunction performed at separate resolution limits for a*, b*, c*.\",\n flag_type=\"--\",\n rename_param=\"cutoff-angstroms\",\n )\n lowres: Optional[float] = Field(\n description=\"Remove reflections with d > n\", flag_type=\"--\"\n )\n highres: Optional[float] = Field(\n description=\"Synonym for first form of --cutoff-angstroms\"\n )\n reindex: Optional[str] = Field(\n description=\"Reindex according to specified operator. E.g. k,h,-l.\",\n flag_type=\"--\",\n )\n # Override input symmetry\n symmetry: Optional[str] = Field(\n description=\"Point group symmetry to use to override. Almost always OMIT this option.\",\n flag_type=\"--\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n return partialator_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n partialator_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"MergePartialator\", \"out_file\"\n )\n if partialator_file:\n mtz_out: str = partialator_file.split(\".\")[0]\n mtz_out = f\"{mtz_out}.mtz\"\n return mtz_out\n return out_file\n\n @validator(\"cell_file\", always=True)\n def validate_cell_file(cls, cell_file: str, values: Dict[str, Any]) -> str:\n if cell_file == \"\":\n idx_cell_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"IndexCrystFEL\",\n \"cell_file\",\n valid_only=False,\n )\n if idx_cell_file:\n return idx_cell_file\n return cell_file\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.ManipulateHKLParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters","title":"MergePartialatorParameters
","text":" Bases: ThirdPartyParameters
Parameters for CrystFEL's partialator
.
There are many parameters, and many combinations. For more information on usage, please refer to the CrystFEL documentation, here: https://www.desy.de/~twhite/crystfel/manual-partialator.html
Source code inlute/io/models/sfx_merge.py
class MergePartialatorParameters(ThirdPartyParameters):\n \"\"\"Parameters for CrystFEL's `partialator`.\n\n There are many parameters, and many combinations. For more information on\n usage, please refer to the CrystFEL documentation, here:\n https://www.desy.de/~twhite/crystfel/manual-partialator.html\n \"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/crystfel/0.10.2/bin/partialator\",\n description=\"CrystFEL's Partialator binary.\",\n flag_type=\"\",\n )\n in_file: Optional[str] = Field(\n \"\", description=\"Path to input stream.\", flag_type=\"-\", rename_param=\"i\"\n )\n out_file: str = Field(\n \"\",\n description=\"Path to output file.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True,\n )\n symmetry: str = Field(description=\"Point group symmetry.\", flag_type=\"--\")\n niter: Optional[int] = Field(\n description=\"Number of cycles of scaling and post-refinement.\",\n flag_type=\"-\",\n rename_param=\"n\",\n )\n no_scale: Optional[bool] = Field(\n description=\"Disable scaling.\", flag_type=\"--\", rename_param=\"no-scale\"\n )\n no_Bscale: Optional[bool] = Field(\n description=\"Disable Debye-Waller part of scaling.\",\n flag_type=\"--\",\n rename_param=\"no-Bscale\",\n )\n no_pr: Optional[bool] = Field(\n description=\"Disable orientation model.\", flag_type=\"--\", rename_param=\"no-pr\"\n )\n no_deltacchalf: Optional[bool] = Field(\n description=\"Disable rejection based on deltaCC1/2.\",\n flag_type=\"--\",\n rename_param=\"no-deltacchalf\",\n )\n model: str = Field(\n \"unity\",\n description=\"Partiality model. Options: xsphere, unity, offset, ggpm.\",\n flag_type=\"--\",\n )\n nthreads: int = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of parallel analyses.\",\n flag_type=\"-\",\n rename_param=\"j\",\n )\n polarisation: Optional[str] = Field(\n description=\"Specification of incident polarisation. Refer to CrystFEL docs for more info.\",\n flag_type=\"--\",\n )\n no_polarisation: Optional[bool] = Field(\n description=\"Synonym for --polarisation=none\",\n flag_type=\"--\",\n rename_param=\"no-polarisation\",\n )\n max_adu: Optional[float] = Field(\n description=\"Maximum intensity of reflection to include.\",\n flag_type=\"--\",\n rename_param=\"max-adu\",\n )\n min_res: Optional[float] = Field(\n description=\"Only include crystals diffracting to a minimum resolution.\",\n flag_type=\"--\",\n rename_param=\"min-res\",\n )\n min_measurements: int = Field(\n 2,\n description=\"Include a reflection only if it appears a minimum number of times.\",\n flag_type=\"--\",\n rename_param=\"min-measurements\",\n )\n push_res: Optional[float] = Field(\n description=\"Merge reflections up to higher than the apparent resolution limit.\",\n flag_type=\"--\",\n rename_param=\"push-res\",\n )\n start_after: int = Field(\n 0,\n description=\"Ignore the first n crystals.\",\n flag_type=\"--\",\n rename_param=\"start-after\",\n )\n stop_after: int = Field(\n 0,\n description=\"Stop after processing n crystals. 0 means process all.\",\n flag_type=\"--\",\n rename_param=\"stop-after\",\n )\n no_free: Optional[bool] = Field(\n description=\"Disable cross-validation. 
Testing ONLY.\",\n flag_type=\"--\",\n rename_param=\"no-free\",\n )\n custom_split: Optional[str] = Field(\n description=\"Read a set of filenames, event and dataset IDs from a filename.\",\n flag_type=\"--\",\n rename_param=\"custom-split\",\n )\n max_rel_B: float = Field(\n 100,\n description=\"Reject crystals if |relB| > n sq Angstroms.\",\n flag_type=\"--\",\n rename_param=\"max-rel-B\",\n )\n output_every_cycle: bool = Field(\n False,\n description=\"Write per-crystal params after every refinement cycle.\",\n flag_type=\"--\",\n rename_param=\"output-every-cycle\",\n )\n no_logs: bool = Field(\n False,\n description=\"Do not write logs needed for plots, maps and graphs.\",\n flag_type=\"--\",\n rename_param=\"no-logs\",\n )\n set_symmetry: Optional[str] = Field(\n description=\"Set the apparent symmetry of the crystals to a point group.\",\n flag_type=\"-\",\n rename_param=\"w\",\n )\n operator: Optional[str] = Field(\n description=\"Specify an ambiguity operator. E.g. k,h,-l.\", flag_type=\"--\"\n )\n force_bandwidth: Optional[float] = Field(\n description=\"Set X-ray bandwidth. As percent, e.g. 0.0013 (0.13%).\",\n flag_type=\"--\",\n rename_param=\"force-bandwidth\",\n )\n force_radius: Optional[float] = Field(\n description=\"Set the initial profile radius (nm-1).\",\n flag_type=\"--\",\n rename_param=\"force-radius\",\n )\n force_lambda: Optional[float] = Field(\n description=\"Set the wavelength. In Angstroms.\",\n flag_type=\"--\",\n rename_param=\"force-lambda\",\n )\n harvest_file: Optional[str] = Field(\n description=\"Write parameters to file in JSON format.\",\n flag_type=\"--\",\n rename_param=\"harvest-file\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n stream_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\",\n \"ConcatenateStreamFiles\",\n \"out_file\",\n )\n if stream_file:\n return stream_file\n return in_file\n\n @validator(\"out_file\", always=True)\n def validate_out_file(cls, out_file: str, values: Dict[str, Any]) -> str:\n if out_file == \"\":\n in_file: str = values[\"in_file\"]\n if in_file:\n tag: str = in_file.split(\".\")[0]\n return f\"{tag}.hkl\"\n else:\n return \"partialator.hkl\"\n return out_file\n
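The validators above are what chain managed Tasks together: when in_file is left empty, the most recent ConcatenateStreamFiles result is pulled from LUTE's parameter database in work_dir via read_latest_db_entry. The sketch below replaces that database lookup with a plain dictionary so the default-resolution logic can be seen in isolation; it is an illustration, not LUTE's code, and the file paths are hypothetical.

from typing import Optional

# A plain dict standing in for LUTE's parameter database (illustration only).
_FAKE_DB = {
    ("ConcatenateStreamFiles", "out_file"): "/work_dir/mfxp1234_concatenated.stream",
}


def fake_read_latest_db_entry(task_name: str, param: str) -> Optional[str]:
    """Stand-in for LUTE's read_latest_db_entry; the real one queries the database in work_dir."""
    return _FAKE_DB.get((task_name, param))


def default_partialator_in_file(in_file: str) -> str:
    """Mirror of the validate_in_file logic above, with the DB lookup mocked out."""
    if in_file == "":
        stream_file = fake_read_latest_db_entry("ConcatenateStreamFiles", "out_file")
        if stream_file:
            return stream_file
    return in_file


print(default_partialator_in_file(""))
# -> /work_dir/mfxp1234_concatenated.stream (taken from the latest upstream result)
print(default_partialator_in_file("/my/custom.stream"))
# -> /my/custom.stream (an explicit value is always respected)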
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters.Config","title":"Config
","text":" Bases: Config
lute/io/models/sfx_merge.py
class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True\n \"\"\"Whether long command-line arguments are passed like `--long=arg`.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n
"},{"location":"source/io/models/sfx_merge/#io.models.sfx_merge.MergePartialatorParameters.Config.long_flags_use_eq","title":"long_flags_use_eq: bool = True
class-attribute
instance-attribute
","text":"Whether long command-line arguments are passed like --long=arg
.
set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/sfx_solve/","title":"sfx_solve","text":"Models for structure solution in serial femtosecond crystallography.
Classes:
Name DescriptionDimpleSolveParameters
Perform structure solution using CCP4's dimple (molecular replacement).
"},{"location":"source/io/models/sfx_solve/#io.models.sfx_solve.DimpleSolveParameters","title":"DimpleSolveParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's dimple program.
There are many parameters. For more information on usage, please refer to the CCP4 documentation, here: https://ccp4.github.io/dimple/
Source code inlute/io/models/sfx_solve.py
class DimpleSolveParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's dimple program.\n\n There are many parameters. For more information on\n usage, please refer to the CCP4 documentation, here:\n https://ccp4.github.io/dimple/\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/dimple\",\n description=\"CCP4 Dimple for solving structures with MR.\",\n flag_type=\"\",\n )\n # Positional requirements - all required.\n in_file: str = Field(\n \"\",\n description=\"Path to input mtz.\",\n flag_type=\"\",\n )\n pdb: str = Field(\"\", description=\"Path to a PDB.\", flag_type=\"\")\n out_dir: str = Field(\"\", description=\"Output DIRECTORY.\", flag_type=\"\")\n # Most used options\n mr_thresh: PositiveFloat = Field(\n 0.4,\n description=\"Threshold for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-when-r\",\n )\n slow: Optional[bool] = Field(\n False, description=\"Perform more refinement.\", flag_type=\"--\"\n )\n # Other options (IO)\n hklout: str = Field(\n \"final.mtz\", description=\"Output mtz file name.\", flag_type=\"--\"\n )\n xyzout: str = Field(\n \"final.pdb\", description=\"Output PDB file name.\", flag_type=\"--\"\n )\n icolumn: Optional[str] = Field(\n # \"IMEAN\",\n description=\"Name for the I column.\",\n flag_type=\"--\",\n )\n sigicolumn: Optional[str] = Field(\n # \"SIG<ICOL>\",\n description=\"Name for the Sig<I> column.\",\n flag_type=\"--\",\n )\n fcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the F column.\",\n flag_type=\"--\",\n )\n sigfcolumn: Optional[str] = Field(\n # \"F\",\n description=\"Name for the Sig<F> column.\",\n flag_type=\"--\",\n )\n libin: Optional[str] = Field(\n description=\"Ligand descriptions for refmac (LIBIN).\", flag_type=\"--\"\n )\n refmac_key: Optional[str] = Field(\n description=\"Extra Refmac keywords to use in refinement.\",\n flag_type=\"--\",\n rename_param=\"refmac-key\",\n )\n free_r_flags: Optional[str] = Field(\n description=\"Path to a mtz file with freeR flags.\",\n flag_type=\"--\",\n rename_param=\"free-r-flags\",\n )\n freecolumn: Optional[Union[int, float]] = Field(\n # 0,\n description=\"Refree column with an optional value.\",\n flag_type=\"--\",\n )\n img_format: Optional[str] = Field(\n description=\"Format of generated images. 
(png, jpeg, none).\",\n flag_type=\"-\",\n rename_param=\"f\",\n )\n white_bg: bool = Field(\n False,\n description=\"Use a white background in Coot and in images.\",\n flag_type=\"--\",\n rename_param=\"white-bg\",\n )\n no_cleanup: bool = Field(\n False,\n description=\"Retain intermediate files.\",\n flag_type=\"--\",\n rename_param=\"no-cleanup\",\n )\n # Calculations\n no_blob_search: bool = Field(\n False,\n description=\"Do not search for unmodelled blobs.\",\n flag_type=\"--\",\n rename_param=\"no-blob-search\",\n )\n anode: bool = Field(\n False, description=\"Use SHELX/AnoDe to find peaks in the anomalous map.\"\n )\n # Run customization\n no_hetatm: bool = Field(\n False,\n description=\"Remove heteroatoms from the given model.\",\n flag_type=\"--\",\n rename_param=\"no-hetatm\",\n )\n rigid_cycles: Optional[PositiveInt] = Field(\n # 10,\n description=\"Number of cycles of rigid-body refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"rigid-cycles\",\n )\n jelly: Optional[PositiveInt] = Field(\n # 4,\n description=\"Number of cycles of jelly-body refinement to perform.\",\n flag_type=\"--\",\n )\n restr_cycles: Optional[PositiveInt] = Field(\n # 8,\n description=\"Number of cycles of refmac final refinement to perform.\",\n flag_type=\"--\",\n rename_param=\"restr-cycles\",\n )\n lim_resolution: Optional[PositiveFloat] = Field(\n description=\"Limit the final resolution.\", flag_type=\"--\", rename_param=\"reso\"\n )\n weight: Optional[str] = Field(\n # \"auto-weight\",\n description=\"The refmac matrix weight.\",\n flag_type=\"--\",\n )\n mr_prog: Optional[str] = Field(\n # \"phaser\",\n description=\"Molecular replacement program. phaser or molrep.\",\n flag_type=\"--\",\n rename_param=\"mr-prog\",\n )\n mr_num: Optional[Union[str, int]] = Field(\n # \"auto\",\n description=\"Number of molecules to use for molecular replacement.\",\n flag_type=\"--\",\n rename_param=\"mr-num\",\n )\n mr_reso: Optional[PositiveFloat] = Field(\n # 3.25,\n description=\"High resolution for molecular replacement. If >10 interpreted as eLLG.\",\n flag_type=\"--\",\n rename_param=\"mr-reso\",\n )\n itof_prog: Optional[str] = Field(\n description=\"Program to calculate amplitudes. truncate, or ctruncate.\",\n flag_type=\"--\",\n rename_param=\"ItoF-prog\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return get_hkl_file\n return in_file\n\n @validator(\"out_dir\", always=True)\n def validate_out_dir(cls, out_dir: str, values: Dict[str, Any]) -> str:\n if out_dir == \"\":\n get_hkl_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if get_hkl_file:\n return os.path.dirname(get_hkl_file)\n return out_dir\n
"},{"location":"source/io/models/sfx_solve/#io.models.sfx_solve.RunSHELXCParameters","title":"RunSHELXCParameters
","text":" Bases: ThirdPartyParameters
Parameters for CCP4's SHELXC program.
SHELXC prepares files for SHELXD and SHELXE.
For more information please refer to the official documentation: https://www.ccp4.ac.uk/html/crank.html
Source code inlute/io/models/sfx_solve.py
class RunSHELXCParameters(ThirdPartyParameters):\n \"\"\"Parameters for CCP4's SHELXC program.\n\n SHELXC prepares files for SHELXD and SHELXE.\n\n For more information please refer to the official documentation:\n https://www.ccp4.ac.uk/html/crank.html\n \"\"\"\n\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/ccp4-8.0/bin/shelxc\",\n description=\"CCP4 SHELXC. Generates input files for SHELXD/SHELXE.\",\n flag_type=\"\",\n )\n placeholder: str = Field(\n \"xx\", description=\"Placeholder filename stem.\", flag_type=\"\"\n )\n in_file: str = Field(\n \"\",\n description=\"Input file for SHELXC with reflections AND proper records.\",\n flag_type=\"\",\n )\n\n @validator(\"in_file\", always=True)\n def validate_in_file(cls, in_file: str, values: Dict[str, Any]) -> str:\n if in_file == \"\":\n # get_hkl needed to be run to produce an XDS format file...\n xds_format_file: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", \"ManipulateHKL\", \"out_file\"\n )\n if xds_format_file:\n in_file = xds_format_file\n if in_file[0] != \"<\":\n # Need to add a redirection for this program\n # Runs like `shelxc xx <input_file.xds`\n in_file = f\"<{in_file}\"\n return in_file\n
"},{"location":"source/io/models/smd/","title":"smd","text":"Models for smalldata_tools Tasks.
Classes:
Name DescriptionSubmitSMDParameters
Parameters to run smalldata_tools to produce a smalldata HDF5 file.
FindOverlapXSSParameters
Parameter model for the FindOverlapXSS Task. Used to determine spatial/temporal overlap based on XSS difference signal.
"},{"location":"source/io/models/smd/#io.models.smd.FindOverlapXSSParameters","title":"FindOverlapXSSParameters
","text":" Bases: TaskParameters
TaskParameter model for FindOverlapXSS Task.
This Task determines spatial or temporal overlap between an optical pulse and the FEL pulse based on difference scattering (XSS) signal. This Task uses SmallData HDF5 files as a source.
Source code inlute/io/models/smd.py
class FindOverlapXSSParameters(TaskParameters):\n \"\"\"TaskParameter model for FindOverlapXSS Task.\n\n This Task determines spatial or temporal overlap between an optical pulse\n and the FEL pulse based on difference scattering (XSS) signal. This Task\n uses SmallData HDF5 files as a source.\n \"\"\"\n\n class ExpConfig(BaseModel):\n det_name: str\n ipm_var: str\n scan_var: Union[str, List[str]]\n\n class Thresholds(BaseModel):\n min_Iscat: Union[int, float]\n min_ipm: Union[int, float]\n\n class AnalysisFlags(BaseModel):\n use_pyfai: bool = True\n use_asymls: bool = False\n\n exp_config: ExpConfig\n thresholds: Thresholds\n analysis_flags: AnalysisFlags\n
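Unlike the third-party models, FindOverlapXSSParameters is configured through nested sub-models. The standalone sketch below shows how that nested structure parses from plain dictionaries (for example, as loaded from the configuration YAML), assuming pydantic v1; the detector and variable names are hypothetical, and the real model additionally inherits the TaskParameters fields.

from typing import List, Union

from pydantic import BaseModel


class ExpConfig(BaseModel):
    det_name: str
    ipm_var: str
    scan_var: Union[str, List[str]]


class Thresholds(BaseModel):
    min_Iscat: Union[int, float]
    min_ipm: Union[int, float]


class AnalysisFlags(BaseModel):
    use_pyfai: bool = True
    use_asymls: bool = False


class OverlapConfigSketch(BaseModel):
    """Illustration only: the nested portion of FindOverlapXSSParameters."""

    exp_config: ExpConfig
    thresholds: Thresholds
    analysis_flags: AnalysisFlags


# Hypothetical detector/variable names:
cfg = OverlapConfigSketch(
    exp_config={"det_name": "epix_1", "ipm_var": "ipm4_sum", "scan_var": "lxt"},
    thresholds={"min_Iscat": 10, "min_ipm": 500.0},
    analysis_flags={"use_pyfai": True},
)
print(cfg.analysis_flags.use_asymls)  # -> False (default)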
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters","title":"SubmitSMDParameters
","text":" Bases: ThirdPartyParameters
Parameters for running smalldata to produce reduced HDF5 files.
Source code inlute/io/models/smd.py
class SubmitSMDParameters(ThirdPartyParameters):\n \"\"\"Parameters for running smalldata to produce reduced HDF5 files.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n\n executable: str = Field(\"mpirun\", description=\"MPI executable.\", flag_type=\"\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n p_arg1: str = Field(\n \"python\", description=\"Executable to run with mpi (i.e. python).\", flag_type=\"\"\n )\n u: str = Field(\n \"\", description=\"Python option for unbuffered output.\", flag_type=\"-\"\n )\n m: str = Field(\n \"mpi4py.run\",\n description=\"Python option to execute a module's contents as __main__ module.\",\n flag_type=\"-\",\n )\n producer: str = Field(\n \"\", description=\"Path to the SmallData producer Python script.\", flag_type=\"\"\n )\n run: str = Field(\n os.environ.get(\"RUN_NUM\", \"\"), description=\"DAQ Run Number.\", flag_type=\"--\"\n )\n experiment: str = Field(\n os.environ.get(\"EXPERIMENT\", \"\"),\n description=\"LCLS Experiment Number.\",\n flag_type=\"--\",\n )\n stn: NonNegativeInt = Field(0, description=\"Hutch endstation.\", flag_type=\"--\")\n nevents: int = Field(\n int(1e9), description=\"Number of events to process.\", flag_type=\"--\"\n )\n directory: Optional[str] = Field(\n None,\n description=\"Optional output directory. If None, will be in ${EXP_FOLDER}/hdf5/smalldata.\",\n flag_type=\"--\",\n )\n ## Need mechanism to set result_from_param=True ...\n gather_interval: PositiveInt = Field(\n 25, description=\"Number of events to collect at a time.\", flag_type=\"--\"\n )\n norecorder: bool = Field(\n False, description=\"Whether to ignore recorder streams.\", flag_type=\"--\"\n )\n url: HttpUrl = Field(\n \"https://pswww.slac.stanford.edu/ws-auth/lgbk\",\n description=\"Base URL for eLog posting.\",\n flag_type=\"--\",\n )\n epicsAll: bool = Field(\n False,\n description=\"Whether to store all EPICS PVs. Use with care.\",\n flag_type=\"--\",\n )\n full: bool = Field(\n False,\n description=\"Whether to store all data. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n fullSum: bool = Field(\n False,\n description=\"Whether to store sums for all area detector images.\",\n flag_type=\"--\",\n )\n default: bool = Field(\n False,\n description=\"Whether to store only the default minimal set of data.\",\n flag_type=\"--\",\n )\n image: bool = Field(\n False,\n description=\"Whether to save everything as images. Use with care.\",\n flag_type=\"--\",\n )\n tiff: bool = Field(\n False,\n description=\"Whether to save all images as a single TIFF. Use with EXTRA care.\",\n flag_type=\"--\",\n )\n centerpix: bool = Field(\n False,\n description=\"Whether to mask center pixels for Epix10k2M detectors.\",\n flag_type=\"--\",\n )\n postRuntable: bool = Field(\n False,\n description=\"Whether to post run tables. 
Also used as a trigger for summary jobs.\",\n flag_type=\"--\",\n )\n wait: bool = Field(\n False, description=\"Whether to wait for a file to appear.\", flag_type=\"--\"\n )\n xtcav: bool = Field(\n False,\n description=\"Whether to add XTCAV processing to the HDF5 generation.\",\n flag_type=\"--\",\n )\n noarch: bool = Field(\n False, description=\"Whether to not use archiver data.\", flag_type=\"--\"\n )\n\n lute_template_cfg: TemplateConfig = TemplateConfig(template_name=\"\", output_path=\"\")\n\n @validator(\"producer\", always=True)\n def validate_producer_path(cls, producer: str) -> str:\n return producer\n\n @validator(\"lute_template_cfg\", always=True)\n def use_producer(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if not lute_template_cfg.output_path:\n lute_template_cfg.output_path = values[\"producer\"]\n return lute_template_cfg\n\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n exp: str = values[\"lute_config\"].experiment\n hutch: str = exp[:3]\n run: int = int(values[\"lute_config\"].run)\n directory: Optional[str] = values[\"directory\"]\n if directory is None:\n directory = f\"/sdf/data/lcls/ds/{hutch}/{exp}/hdf5/smalldata\"\n fname: str = f\"{exp}_Run{run:04d}.h5\"\n\n cls.Config.result_from_params = f\"{directory}/{fname}\"\n return values\n
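The define_result validator above shows the second way a result can be registered: a root_validator assembles the expected HDF5 path from the experiment, hutch, and run, and records it in Config.result_from_params. Below is a simplified, self-contained sketch of that derivation only, assuming pydantic v1; for simplicity the value is stored on a plain field rather than on Config, and the experiment/run values are hypothetical.

from typing import Any, Dict, Optional

from pydantic import BaseModel, root_validator


class SmallDataResultSketch(BaseModel):
    """Illustration only: mirrors the define_result derivation above."""

    experiment: str
    run: int
    directory: Optional[str] = None
    result_path: str = ""  # the real model records this on Config.result_from_params

    @root_validator(pre=False)
    def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        hutch = values["experiment"][:3]
        directory = values["directory"]
        if directory is None:
            directory = f"/sdf/data/lcls/ds/{hutch}/{values['experiment']}/hdf5/smalldata"
        values["result_path"] = f"{directory}/{values['experiment']}_Run{values['run']:04d}.h5"
        return values


# Hypothetical experiment/run values:
print(SmallDataResultSketch(experiment="mfxp1234", run=7).result_path)
# -> /sdf/data/lcls/ds/mfx/mfxp1234/hdf5/smalldata/mfxp1234_Run0007.h5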
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config","title":"Config
","text":" Bases: Config
Identical to super-class Config but includes a result.
Source code inlute/io/models/smd.py
class Config(ThirdPartyParameters.Config):\n \"\"\"Identical to super-class Config but includes a result.\"\"\"\n\n set_result: bool = True\n \"\"\"Whether the Executor should mark a specified parameter as a result.\"\"\"\n\n result_from_params: str = \"\"\n \"\"\"Defines a result from the parameters. Use a validator to do so.\"\"\"\n
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config.result_from_params","title":"result_from_params: str = ''
class-attribute
instance-attribute
","text":"Defines a result from the parameters. Use a validator to do so.
"},{"location":"source/io/models/smd/#io.models.smd.SubmitSMDParameters.Config.set_result","title":"set_result: bool = True
class-attribute
instance-attribute
","text":"Whether the Executor should mark a specified parameter as a result.
"},{"location":"source/io/models/tests/","title":"tests","text":"Models for all test Tasks.
Classes:
Name DescriptionTestParameters
Model for most basic test case. Single core first-party Task. Uses only communication via pipes.
TestBinaryParameters
Parameters for a simple multi-threaded binary executable.
TestSocketParameters
Model for first-party test requiring communication via socket.
TestWriteOutputParameters
Model for test Task which writes an output file. Location of file is recorded in database.
TestReadOutputParameters
Model for test Task which locates an output file based on an entry in the database, if no path is provided.
"},{"location":"source/io/models/tests/#io.models.tests.TestBinaryErrParameters","title":"TestBinaryErrParameters
","text":" Bases: ThirdPartyParameters
Same as TestBinary, but exits with non-zero code.
Source code inlute/io/models/tests.py
class TestBinaryErrParameters(ThirdPartyParameters):\n    \"\"\"Same as TestBinary, but exits with non-zero code.\"\"\"\n\n    executable: str = Field(\n        \"/sdf/home/d/dorlhiac/test_tasks/test_threads_err\",\n        description=\"Multi-threaded test binary with non-zero exit code.\",\n    )\n    p_arg1: int = Field(1, description=\"Number of threads.\")\n
"},{"location":"source/io/models/tests/#io.models.tests.TestParameters","title":"TestParameters
","text":" Bases: TaskParameters
Parameters for the test Task Test
.
lute/io/models/tests.py
class TestParameters(TaskParameters):\n \"\"\"Parameters for the test Task `Test`.\"\"\"\n\n float_var: float = Field(0.01, description=\"A floating point number.\")\n str_var: str = Field(\"test\", description=\"A string.\")\n\n class CompoundVar(BaseModel):\n int_var: int = 1\n dict_var: Dict[str, str] = {\"a\": \"b\"}\n\n compound_var: CompoundVar = Field(\n description=(\n \"A compound parameter - consists of a `int_var` (int) and `dict_var`\"\n \" (Dict[str, str]).\"\n )\n )\n throw_error: bool = Field(\n False, description=\"If `True`, raise an exception to test error handling.\"\n )\n
"},{"location":"source/tasks/dataclasses/","title":"dataclasses","text":"Classes for describing Task state and results.
Classes:
Name DescriptionTaskResult
Output of a specific analysis task.
TaskStatus
Enumeration of possible Task statuses (running, pending, failed, etc.).
DescribedAnalysis
Executor's description of a Task
run (results, parameters, env).
DescribedAnalysis
dataclass
","text":"Complete analysis description. Held by an Executor.
Source code inlute/tasks/dataclasses.py
@dataclass\nclass DescribedAnalysis:\n \"\"\"Complete analysis description. Held by an Executor.\"\"\"\n\n task_result: TaskResult\n task_parameters: Optional[TaskParameters]\n task_env: Dict[str, str]\n poll_interval: float\n communicator_desc: List[str]\n
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.ElogSummaryPlots","title":"ElogSummaryPlots
dataclass
","text":"Holds a graphical summary intended for display in the eLog.
Attributes:
Name Type Descriptiondisplay_name
str
This represents both a path and how the result will be displayed in the eLog. Can include \"/\" characters. E.g. display_name = \"scans/my_motor_scan\"
will have plots shown on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors how the file is stored on disk as well.
lute/tasks/dataclasses.py
@dataclass\nclass ElogSummaryPlots:\n \"\"\"Holds a graphical summary intended for display in the eLog.\n\n Attributes:\n display_name (str): This represents both a path and how the result will be\n displayed in the eLog. Can include \"/\" characters. E.g.\n `display_name = \"scans/my_motor_scan\"` will have plots shown\n on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors\n how the file is stored on disk as well.\n \"\"\"\n\n display_name: str\n figures: Union[pn.Tabs, hv.Image, plt.Figure]\n
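A minimal sketch (not part of the module) of packaging a matplotlib figure for the eLog; the display_name encodes the tab/page hierarchy described above.

import matplotlib.pyplot as plt

from lute.tasks.dataclasses import ElogSummaryPlots

# Sketch: a figure that would appear on a "my_motor_scan" page under a "scans" tab.
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0.0, 0.8, 1.1, 0.9])
ax.set_xlabel("Motor position")
ax.set_ylabel("Signal")
plots = ElogSummaryPlots(display_name="scans/my_motor_scan", figures=fig)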
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskResult","title":"TaskResult
dataclass
","text":"Class for storing the result of a Task's execution with metadata.
Attributes:
Name Type Descriptiontask_name
str
Name of the associated task which produced it.
task_status
TaskStatus
Status of associated task.
summary
str
Short message/summary associated with the result.
payload
Any
Actual result. May be data in any format.
impl_schemas
Optional[str]
A string listing Task
schemas implemented by the associated Task
. Schemas define the category and expected output of the Task
. An individual task may implement/conform to multiple schemas. Multiple schemas are separated by ';', e.g. * impl_schemas = \"schema1;schema2\"
lute/tasks/dataclasses.py
@dataclass\nclass TaskResult:\n \"\"\"Class for storing the result of a Task's execution with metadata.\n\n Attributes:\n task_name (str): Name of the associated task which produced it.\n\n task_status (TaskStatus): Status of associated task.\n\n summary (str): Short message/summary associated with the result.\n\n payload (Any): Actual result. May be data in any format.\n\n impl_schemas (Optional[str]): A string listing `Task` schemas implemented\n by the associated `Task`. Schemas define the category and expected\n output of the `Task`. An individual task may implement/conform to\n multiple schemas. Multiple schemas are separated by ';', e.g.\n * impl_schemas = \"schema1;schema2\"\n \"\"\"\n\n task_name: str\n task_status: TaskStatus\n summary: str\n payload: Any\n impl_schemas: Optional[str] = None\n
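For illustration only, a result conforming to two schemas separates them with ';' as noted above; the schema names and payload path below are hypothetical.

from lute.tasks.dataclasses import TaskResult, TaskStatus

# Sketch: a completed result implementing two (hypothetical) schemas.
result = TaskResult(
    task_name="FindPeaksPyAlgos",
    task_status=TaskStatus.COMPLETED,
    summary="Peak finding finished without errors.",
    payload="/path/to/output.list",  # placeholder path
    impl_schemas="schema1;schema2",
)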
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus","title":"TaskStatus
","text":" Bases: Enum
Possible Task statuses.
Source code in lute/tasks/dataclasses.py
class TaskStatus(Enum):\n \"\"\"Possible Task statuses.\"\"\"\n\n PENDING = 0\n \"\"\"\n Task has yet to run. Is Queued, or waiting for prior tasks.\n \"\"\"\n RUNNING = 1\n \"\"\"\n Task is in the process of execution.\n \"\"\"\n COMPLETED = 2\n \"\"\"\n Task has completed without fatal errors.\n \"\"\"\n FAILED = 3\n \"\"\"\n Task encountered a fatal error.\n \"\"\"\n STOPPED = 4\n \"\"\"\n Task was, potentially temporarily, stopped/suspended.\n \"\"\"\n CANCELLED = 5\n \"\"\"\n Task was cancelled prior to completion or failure.\n \"\"\"\n TIMEDOUT = 6\n \"\"\"\n Task did not reach completion due to timeout.\n \"\"\"\n
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.CANCELLED","title":"CANCELLED = 5
class-attribute
instance-attribute
","text":"Task was cancelled prior to completion or failure.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.COMPLETED","title":"COMPLETED = 2
class-attribute
instance-attribute
","text":"Task has completed without fatal errors.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.FAILED","title":"FAILED = 3
class-attribute
instance-attribute
","text":"Task encountered a fatal error.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.PENDING","title":"PENDING = 0
class-attribute
instance-attribute
","text":"Task has yet to run. Is Queued, or waiting for prior tasks.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.RUNNING","title":"RUNNING = 1
class-attribute
instance-attribute
","text":"Task is in the process of execution.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.STOPPED","title":"STOPPED = 4
class-attribute
instance-attribute
","text":"Task was, potentially temporarily, stopped/suspended.
"},{"location":"source/tasks/dataclasses/#tasks.dataclasses.TaskStatus.TIMEDOUT","title":"TIMEDOUT = 6
class-attribute
instance-attribute
","text":"Task did not reach completion due to timeout.
"},{"location":"source/tasks/sfx_find_peaks/","title":"sfx_find_peaks","text":"Classes for peak finding tasks in SFX.
Classes:
Name DescriptionCxiWriter
Utility class for writing peak finding results to CXI files.
FindPeaksPyAlgos
Peak finding using psana's PyAlgos algorithm. Optional data compression and decompression with libpressio for data reduction tests.
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter","title":"CxiWriter
","text":"Source code in lute/tasks/sfx_find_peaks.py
class CxiWriter:\n\n def __init__(\n self,\n outdir: str,\n rank: int,\n exp: str,\n run: int,\n n_events: int,\n det_shape: Tuple[int, ...],\n min_peaks: int,\n max_peaks: int,\n i_x: Any, # Not typed becomes it comes from psana\n i_y: Any, # Not typed becomes it comes from psana\n ipx: Any, # Not typed becomes it comes from psana\n ipy: Any, # Not typed becomes it comes from psana\n tag: str,\n ):\n \"\"\"\n Set up the CXI files to which peak finding results will be saved.\n\n Parameters:\n\n outdir (str): Output directory for cxi file.\n\n rank (int): MPI rank of the caller.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n n_events (int): Number of events to process.\n\n det_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n data. This must be aCheetah-stile 2D array.\n\n min_peaks (int): Minimum number of peaks per image.\n\n max_peaks (int): Maximum number of peaks per image.\n\n i_x (Any): Array of pixel indexes along x\n\n i_y (Any): Array of pixel indexes along y\n\n ipx (Any): Pixel indexes with respect to detector origin (x component)\n\n ipy (Any): Pixel indexes with respect to detector origin (y component)\n\n tag (str): Tag to append to cxi file names.\n \"\"\"\n self._det_shape: Tuple[int, ...] = det_shape\n self._i_x: Any = i_x\n self._i_y: Any = i_y\n self._ipx: Any = ipx\n self._ipy: Any = ipy\n self._index: int = 0\n\n # Create and open the HDF5 file\n fname: str = f\"{exp}_r{run:0>4}_{rank}{tag}.cxi\"\n Path(outdir).mkdir(exist_ok=True)\n self._outh5: Any = h5py.File(Path(outdir) / fname, \"w\")\n\n # Entry_1 entry for processing with CrystFEL\n entry_1: Any = self._outh5.create_group(\"entry_1\")\n keys: List[str] = [\n \"nPeaks\",\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]\n ds_expId: Any = entry_1.create_dataset(\n \"experimental_identifier\", (n_events,), maxshape=(None,), dtype=int\n )\n ds_expId.attrs[\"axes\"] = \"experiment_identifier\"\n data_1: Any = entry_1.create_dataset(\n \"/entry_1/data_1/data\",\n (n_events, det_shape[0], det_shape[1]),\n chunks=(1, det_shape[0], det_shape[1]),\n maxshape=(None, det_shape[0], det_shape[1]),\n dtype=numpy.float32,\n )\n data_1.attrs[\"axes\"] = \"experiment_identifier\"\n key: str\n for key in [\"powderHits\", \"powderMisses\", \"mask\"]:\n entry_1.create_dataset(\n f\"/entry_1/data_1/{key}\",\n (det_shape[0], det_shape[1]),\n chunks=(det_shape[0], det_shape[1]),\n maxshape=(det_shape[0], det_shape[1]),\n dtype=float,\n )\n\n # Peak-related entries\n for key in keys:\n if key == \"nPeaks\":\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events,),\n maxshape=(None,),\n dtype=int,\n )\n ds_x.attrs[\"minPeaks\"] = min_peaks\n ds_x.attrs[\"maxPeaks\"] = max_peaks\n else:\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events, max_peaks),\n maxshape=(None, max_peaks),\n chunks=(1, max_peaks),\n dtype=float,\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier:peaks\"\n\n # Timestamp entries\n lcls_1: Any = self._outh5.create_group(\"LCLS\")\n keys: List[str] = [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ]\n key: str\n for key in keys:\n if key == \"photon_energy_eV\":\n ds_x: Any = lcls_1.create_dataset(\n f\"{key}\", (n_events,), maxshape=(None,), dtype=float\n )\n else:\n ds_x = lcls_1.create_dataset(\n f\"{key}\", 
(n_events,), maxshape=(None,), dtype=int\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n ds_x = self._outh5.create_dataset(\n \"/LCLS/detector_1/EncoderValue\", (n_events,), maxshape=(None,), dtype=float\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n def write_event(\n self,\n img: NDArray[numpy.float_],\n peaks: Any, # Not typed becomes it comes from psana\n timestamp_seconds: int,\n timestamp_nanoseconds: int,\n timestamp_fiducials: int,\n photon_energy: float,\n ):\n \"\"\"\n Write peak finding results for an event into the HDF5 file.\n\n Parameters:\n\n img (NDArray[numpy.float_]): Detector data for the event\n\n peaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\n timestamp_seconds (int): Second part of the event's timestamp information\n\n timestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\n timestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\n photon_energy (float): Photon energy for the event\n \"\"\"\n ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]\n ch_cols: NDArray[numpy.float_] = peaks[:, 2]\n\n if self._outh5[\"/entry_1/data_1/data\"].shape[0] <= self._index:\n self._outh5[\"entry_1/data_1/data\"].resize(self._index + 1, axis=0)\n ds_key: str\n for ds_key in self._outh5[\"/entry_1/result_1\"].keys():\n self._outh5[f\"/entry_1/result_1/{ds_key}\"].resize(\n self._index + 1, axis=0\n )\n for ds_key in (\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ):\n self._outh5[f\"/LCLS/{ds_key}\"].resize(self._index + 1, axis=0)\n\n # Entry_1 entry for processing with CrystFEL\n self._outh5[\"/entry_1/data_1/data\"][self._index, :, :] = img.reshape(\n -1, img.shape[-1]\n )\n self._outh5[\"/entry_1/result_1/nPeaks\"][self._index] = peaks.shape[0]\n self._outh5[\"/entry_1/result_1/peakXPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_cols.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/peakYPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_rows.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/rcent\"][self._index, : peaks.shape[0]] = peaks[\n :, 6\n ]\n self._outh5[\"/entry_1/result_1/ccent\"][self._index, : peaks.shape[0]] = peaks[\n :, 7\n ]\n self._outh5[\"/entry_1/result_1/rmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 10\n ]\n self._outh5[\"/entry_1/result_1/rmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 11\n ]\n self._outh5[\"/entry_1/result_1/cmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 12\n ]\n self._outh5[\"/entry_1/result_1/cmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 13\n ]\n self._outh5[\"/entry_1/result_1/peakTotalIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 5]\n self._outh5[\"/entry_1/result_1/peakMaxIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 4]\n\n # Calculate and write pixel radius\n peaks_cenx: NDArray[numpy.float_] = (\n self._i_x[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipx\n )\n peaks_ceny: NDArray[numpy.float_] = (\n self._i_y[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipy\n )\n peak_radius: NDArray[numpy.float_] = numpy.sqrt(\n (peaks_cenx**2) + (peaks_ceny**2)\n )\n self._outh5[\"/entry_1/result_1/peakRadius\"][\n self._index, : peaks.shape[0]\n ] = 
peak_radius\n\n # LCLS entry dataset\n self._outh5[\"/LCLS/machineTime\"][self._index] = timestamp_seconds\n self._outh5[\"/LCLS/machineTimeNanoSeconds\"][self._index] = timestamp_nanoseconds\n self._outh5[\"/LCLS/fiducial\"][self._index] = timestamp_fiducials\n self._outh5[\"/LCLS/photon_energy_eV\"][self._index] = photon_energy\n\n self._index += 1\n\n def write_non_event_data(\n self,\n powder_hits: NDArray[numpy.float_],\n powder_misses: NDArray[numpy.float_],\n mask: NDArray[numpy.uint16],\n clen: float,\n ):\n \"\"\"\n Write to the file data that is not related to a specific event (masks, powders)\n\n Parameters:\n\n powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n powder_misses (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n mask: (NDArray[numpy.uint16]): Pixel ask to write into the file\n\n \"\"\"\n # Add powders and mask to files, reshaping them to match the crystfel\n # convention\n self._outh5[\"/entry_1/data_1/powderHits\"][:] = powder_hits.reshape(\n -1, powder_hits.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/powderMisses\"][:] = powder_misses.reshape(\n -1, powder_misses.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/mask\"][:] = (1 - mask).reshape(\n -1, mask.shape[-1]\n ) # Crystfel expects inverted values\n\n # Add clen distance\n self._outh5[\"/LCLS/detector_1/EncoderValue\"][:] = clen\n\n def optimize_and_close_file(\n self,\n num_hits: int,\n max_peaks: int,\n ):\n \"\"\"\n Resize data blocks and write additional information to the file\n\n Parameters:\n\n num_hits (int): Number of hits for which information has been saved to the\n file\n\n max_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n \"\"\"\n\n # Resize the entry_1 entry\n data_shape: Tuple[int, ...] = self._outh5[\"/entry_1/data_1/data\"].shape\n self._outh5[\"/entry_1/data_1/data\"].resize(\n (num_hits, data_shape[1], data_shape[2])\n )\n self._outh5[f\"/entry_1/result_1/nPeaks\"].resize((num_hits,))\n key: str\n for key in [\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]:\n self._outh5[f\"/entry_1/result_1/{key}\"].resize((num_hits, max_peaks))\n\n # Resize LCLS entry\n for key in [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"detector_1/EncoderValue\",\n \"photon_energy_eV\",\n ]:\n self._outh5[f\"/LCLS/{key}\"].resize((num_hits,))\n self._outh5.close()\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.__init__","title":"__init__(outdir, rank, exp, run, n_events, det_shape, min_peaks, max_peaks, i_x, i_y, ipx, ipy, tag)
","text":"Set up the CXI files to which peak finding results will be saved.
Parameters:
outdir (str): Output directory for cxi file.\n\nrank (int): MPI rank of the caller.\n\nexp (str): Experiment string.\n\nrun (int): Experimental run.\n\nn_events (int): Number of events to process.\n\ndet_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n data. This must be a Cheetah-style 2D array.\n\nmin_peaks (int): Minimum number of peaks per image.\n\nmax_peaks (int): Maximum number of peaks per image.\n\ni_x (Any): Array of pixel indexes along x\n\ni_y (Any): Array of pixel indexes along y\n\nipx (Any): Pixel indexes with respect to detector origin (x component)\n\nipy (Any): Pixel indexes with respect to detector origin (y component)\n\ntag (str): Tag to append to cxi file names.\n
Source code in lute/tasks/sfx_find_peaks.py
def __init__(\n self,\n outdir: str,\n rank: int,\n exp: str,\n run: int,\n n_events: int,\n det_shape: Tuple[int, ...],\n min_peaks: int,\n max_peaks: int,\n i_x: Any, # Not typed becomes it comes from psana\n i_y: Any, # Not typed becomes it comes from psana\n ipx: Any, # Not typed becomes it comes from psana\n ipy: Any, # Not typed becomes it comes from psana\n tag: str,\n):\n \"\"\"\n Set up the CXI files to which peak finding results will be saved.\n\n Parameters:\n\n outdir (str): Output directory for cxi file.\n\n rank (int): MPI rank of the caller.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n n_events (int): Number of events to process.\n\n det_shape (Tuple[int, int]): Shape of the numpy array storing the detector\n data. This must be aCheetah-stile 2D array.\n\n min_peaks (int): Minimum number of peaks per image.\n\n max_peaks (int): Maximum number of peaks per image.\n\n i_x (Any): Array of pixel indexes along x\n\n i_y (Any): Array of pixel indexes along y\n\n ipx (Any): Pixel indexes with respect to detector origin (x component)\n\n ipy (Any): Pixel indexes with respect to detector origin (y component)\n\n tag (str): Tag to append to cxi file names.\n \"\"\"\n self._det_shape: Tuple[int, ...] = det_shape\n self._i_x: Any = i_x\n self._i_y: Any = i_y\n self._ipx: Any = ipx\n self._ipy: Any = ipy\n self._index: int = 0\n\n # Create and open the HDF5 file\n fname: str = f\"{exp}_r{run:0>4}_{rank}{tag}.cxi\"\n Path(outdir).mkdir(exist_ok=True)\n self._outh5: Any = h5py.File(Path(outdir) / fname, \"w\")\n\n # Entry_1 entry for processing with CrystFEL\n entry_1: Any = self._outh5.create_group(\"entry_1\")\n keys: List[str] = [\n \"nPeaks\",\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]\n ds_expId: Any = entry_1.create_dataset(\n \"experimental_identifier\", (n_events,), maxshape=(None,), dtype=int\n )\n ds_expId.attrs[\"axes\"] = \"experiment_identifier\"\n data_1: Any = entry_1.create_dataset(\n \"/entry_1/data_1/data\",\n (n_events, det_shape[0], det_shape[1]),\n chunks=(1, det_shape[0], det_shape[1]),\n maxshape=(None, det_shape[0], det_shape[1]),\n dtype=numpy.float32,\n )\n data_1.attrs[\"axes\"] = \"experiment_identifier\"\n key: str\n for key in [\"powderHits\", \"powderMisses\", \"mask\"]:\n entry_1.create_dataset(\n f\"/entry_1/data_1/{key}\",\n (det_shape[0], det_shape[1]),\n chunks=(det_shape[0], det_shape[1]),\n maxshape=(det_shape[0], det_shape[1]),\n dtype=float,\n )\n\n # Peak-related entries\n for key in keys:\n if key == \"nPeaks\":\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events,),\n maxshape=(None,),\n dtype=int,\n )\n ds_x.attrs[\"minPeaks\"] = min_peaks\n ds_x.attrs[\"maxPeaks\"] = max_peaks\n else:\n ds_x: Any = self._outh5.create_dataset(\n f\"/entry_1/result_1/{key}\",\n (n_events, max_peaks),\n maxshape=(None, max_peaks),\n chunks=(1, max_peaks),\n dtype=float,\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier:peaks\"\n\n # Timestamp entries\n lcls_1: Any = self._outh5.create_group(\"LCLS\")\n keys: List[str] = [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ]\n key: str\n for key in keys:\n if key == \"photon_energy_eV\":\n ds_x: Any = lcls_1.create_dataset(\n f\"{key}\", (n_events,), maxshape=(None,), dtype=float\n )\n else:\n ds_x = lcls_1.create_dataset(\n f\"{key}\", (n_events,), 
maxshape=(None,), dtype=int\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n\n ds_x = self._outh5.create_dataset(\n \"/LCLS/detector_1/EncoderValue\", (n_events,), maxshape=(None,), dtype=float\n )\n ds_x.attrs[\"axes\"] = \"experiment_identifier\"\n
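A hypothetical construction sketch (not from the LUTE reference) using dummy geometry; in practice det_shape and the index arrays come from psana's Detector interface, and importing this module requires a psana environment.

import numpy

from lute.tasks.sfx_find_peaks import CxiWriter

# Dummy Cheetah-style geometry: 1 panel of 8x8 pixels flattened to 2D.
n_panels, rows, cols = 1, 8, 8
i_x = numpy.zeros((n_panels, rows, cols), dtype=numpy.int64)
i_y = numpy.zeros((n_panels, rows, cols), dtype=numpy.int64)

writer = CxiWriter(
    outdir="peak_output",  # hypothetical output directory
    rank=0,
    exp="mfxp12345",       # hypothetical experiment string
    run=1,
    n_events=10,
    det_shape=(n_panels * rows, cols),
    min_peaks=2,
    max_peaks=2048,
    i_x=i_x,
    i_y=i_y,
    ipx=0,
    ipy=0,
    tag="_test",
)
# Normally write_event is called per hit; here we simply finalize the file.
writer.optimize_and_close_file(num_hits=0, max_peaks=2048)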
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.optimize_and_close_file","title":"optimize_and_close_file(num_hits, max_peaks)
","text":"Resize data blocks and write additional information to the file
Parameters:
num_hits (int): Number of hits for which information has been saved to the\n file\n\nmax_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n
Source code in lute/tasks/sfx_find_peaks.py
def optimize_and_close_file(\n self,\n num_hits: int,\n max_peaks: int,\n):\n \"\"\"\n Resize data blocks and write additional information to the file\n\n Parameters:\n\n num_hits (int): Number of hits for which information has been saved to the\n file\n\n max_peaks (int): Maximum number of peaks (per event) for which information\n can be written into the file\n \"\"\"\n\n # Resize the entry_1 entry\n data_shape: Tuple[int, ...] = self._outh5[\"/entry_1/data_1/data\"].shape\n self._outh5[\"/entry_1/data_1/data\"].resize(\n (num_hits, data_shape[1], data_shape[2])\n )\n self._outh5[f\"/entry_1/result_1/nPeaks\"].resize((num_hits,))\n key: str\n for key in [\n \"peakXPosRaw\",\n \"peakYPosRaw\",\n \"rcent\",\n \"ccent\",\n \"rmin\",\n \"rmax\",\n \"cmin\",\n \"cmax\",\n \"peakTotalIntensity\",\n \"peakMaxIntensity\",\n \"peakRadius\",\n ]:\n self._outh5[f\"/entry_1/result_1/{key}\"].resize((num_hits, max_peaks))\n\n # Resize LCLS entry\n for key in [\n \"eventNumber\",\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"detector_1/EncoderValue\",\n \"photon_energy_eV\",\n ]:\n self._outh5[f\"/LCLS/{key}\"].resize((num_hits,))\n self._outh5.close()\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.write_event","title":"write_event(img, peaks, timestamp_seconds, timestamp_nanoseconds, timestamp_fiducials, photon_energy)
","text":"Write peak finding results for an event into the HDF5 file.
Parameters:
img (NDArray[numpy.float_]): Detector data for the event\n\npeaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\ntimestamp_seconds (int): Second part of the event's timestamp information\n\ntimestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\ntimestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\nphoton_energy (float): Photon energy for the event\n
Source code in lute/tasks/sfx_find_peaks.py
def write_event(\n self,\n img: NDArray[numpy.float_],\n peaks: Any, # Not typed becomes it comes from psana\n timestamp_seconds: int,\n timestamp_nanoseconds: int,\n timestamp_fiducials: int,\n photon_energy: float,\n):\n \"\"\"\n Write peak finding results for an event into the HDF5 file.\n\n Parameters:\n\n img (NDArray[numpy.float_]): Detector data for the event\n\n peaks: (Any): Peak information for the event, as recovered from the PyAlgos\n algorithm\n\n timestamp_seconds (int): Second part of the event's timestamp information\n\n timestamp_nanoseconds (int): Nanosecond part of the event's timestamp\n information\n\n timestamp_fiducials (int): Fiducials part of the event's timestamp\n information\n\n photon_energy (float): Photon energy for the event\n \"\"\"\n ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]\n ch_cols: NDArray[numpy.float_] = peaks[:, 2]\n\n if self._outh5[\"/entry_1/data_1/data\"].shape[0] <= self._index:\n self._outh5[\"entry_1/data_1/data\"].resize(self._index + 1, axis=0)\n ds_key: str\n for ds_key in self._outh5[\"/entry_1/result_1\"].keys():\n self._outh5[f\"/entry_1/result_1/{ds_key}\"].resize(\n self._index + 1, axis=0\n )\n for ds_key in (\n \"machineTime\",\n \"machineTimeNanoSeconds\",\n \"fiducial\",\n \"photon_energy_eV\",\n ):\n self._outh5[f\"/LCLS/{ds_key}\"].resize(self._index + 1, axis=0)\n\n # Entry_1 entry for processing with CrystFEL\n self._outh5[\"/entry_1/data_1/data\"][self._index, :, :] = img.reshape(\n -1, img.shape[-1]\n )\n self._outh5[\"/entry_1/result_1/nPeaks\"][self._index] = peaks.shape[0]\n self._outh5[\"/entry_1/result_1/peakXPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_cols.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/peakYPosRaw\"][self._index, : peaks.shape[0]] = (\n ch_rows.astype(\"int\")\n )\n self._outh5[\"/entry_1/result_1/rcent\"][self._index, : peaks.shape[0]] = peaks[\n :, 6\n ]\n self._outh5[\"/entry_1/result_1/ccent\"][self._index, : peaks.shape[0]] = peaks[\n :, 7\n ]\n self._outh5[\"/entry_1/result_1/rmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 10\n ]\n self._outh5[\"/entry_1/result_1/rmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 11\n ]\n self._outh5[\"/entry_1/result_1/cmin\"][self._index, : peaks.shape[0]] = peaks[\n :, 12\n ]\n self._outh5[\"/entry_1/result_1/cmax\"][self._index, : peaks.shape[0]] = peaks[\n :, 13\n ]\n self._outh5[\"/entry_1/result_1/peakTotalIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 5]\n self._outh5[\"/entry_1/result_1/peakMaxIntensity\"][\n self._index, : peaks.shape[0]\n ] = peaks[:, 4]\n\n # Calculate and write pixel radius\n peaks_cenx: NDArray[numpy.float_] = (\n self._i_x[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipx\n )\n peaks_ceny: NDArray[numpy.float_] = (\n self._i_y[\n numpy.array(peaks[:, 0], dtype=numpy.int64),\n numpy.array(peaks[:, 1], dtype=numpy.int64),\n numpy.array(peaks[:, 2], dtype=numpy.int64),\n ]\n + 0.5\n - self._ipy\n )\n peak_radius: NDArray[numpy.float_] = numpy.sqrt(\n (peaks_cenx**2) + (peaks_ceny**2)\n )\n self._outh5[\"/entry_1/result_1/peakRadius\"][\n self._index, : peaks.shape[0]\n ] = peak_radius\n\n # LCLS entry dataset\n self._outh5[\"/LCLS/machineTime\"][self._index] = timestamp_seconds\n self._outh5[\"/LCLS/machineTimeNanoSeconds\"][self._index] = timestamp_nanoseconds\n self._outh5[\"/LCLS/fiducial\"][self._index] = timestamp_fiducials\n 
self._outh5[\"/LCLS/photon_energy_eV\"][self._index] = photon_energy\n\n self._index += 1\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.CxiWriter.write_non_event_data","title":"write_non_event_data(powder_hits, powder_misses, mask, clen)
","text":"Write to the file data that is not related to a specific event (masks, powders)
Parameters:
powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\npowder_misses (NDArray[numpy.float_]): Virtual powder pattern from misses\n\nmask (NDArray[numpy.uint16]): Pixel mask to write into the file\n
Source code in lute/tasks/sfx_find_peaks.py
def write_non_event_data(\n self,\n powder_hits: NDArray[numpy.float_],\n powder_misses: NDArray[numpy.float_],\n mask: NDArray[numpy.uint16],\n clen: float,\n):\n \"\"\"\n Write to the file data that is not related to a specific event (masks, powders)\n\n Parameters:\n\n powder_hits (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n powder_misses (NDArray[numpy.float_]): Virtual powder pattern from hits\n\n mask: (NDArray[numpy.uint16]): Pixel ask to write into the file\n\n \"\"\"\n # Add powders and mask to files, reshaping them to match the crystfel\n # convention\n self._outh5[\"/entry_1/data_1/powderHits\"][:] = powder_hits.reshape(\n -1, powder_hits.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/powderMisses\"][:] = powder_misses.reshape(\n -1, powder_misses.shape[-1]\n )\n self._outh5[\"/entry_1/data_1/mask\"][:] = (1 - mask).reshape(\n -1, mask.shape[-1]\n ) # Crystfel expects inverted values\n\n # Add clen distance\n self._outh5[\"/LCLS/detector_1/EncoderValue\"][:] = clen\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.FindPeaksPyAlgos","title":"FindPeaksPyAlgos
","text":" Bases: Task
Task that performs peak finding using the PyAlgos peak finding algorithms and writes the peak information to CXI files.
Source code in lute/tasks/sfx_find_peaks.py
class FindPeaksPyAlgos(Task):\n \"\"\"\n Task that performs peak finding using the PyAlgos peak finding algorithms and\n writes the peak information to CXI files.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters, use_mpi: bool = True) -> None:\n super().__init__(params=params, use_mpi=use_mpi)\n if self._task_parameters.compression is not None:\n from libpressio import PressioCompressor\n\n def _run(self) -> None:\n ds: Any = MPIDataSource(\n f\"exp={self._task_parameters.lute_config.experiment}:\"\n f\"run={self._task_parameters.lute_config.run}:smd\"\n )\n if self._task_parameters.n_events != 0:\n ds.break_after(self._task_parameters.n_events)\n\n det: Any = Detector(self._task_parameters.det_name)\n det.do_reshape_2d_to_3d(flag=True)\n\n evr: Any = Detector(self._task_parameters.event_receiver)\n\n i_x: Any = det.indexes_x(self._task_parameters.lute_config.run).astype(\n numpy.int64\n )\n i_y: Any = det.indexes_y(self._task_parameters.lute_config.run).astype(\n numpy.int64\n )\n ipx: Any\n ipy: Any\n ipx, ipy = det.point_indexes(\n self._task_parameters.lute_config.run, pxy_um=(0, 0)\n )\n\n alg: Any = None\n num_hits: int = 0\n num_events: int = 0\n num_empty_images: int = 0\n tag: str = self._task_parameters.tag\n if (tag != \"\") and (tag[0] != \"_\"):\n tag = \"_\" + tag\n\n evt: Any\n for evt in ds.events():\n\n evt_id: Any = evt.get(EventId)\n timestamp_seconds: int = evt_id.time()[0]\n timestamp_nanoseconds: int = evt_id.time()[1]\n timestamp_fiducials: int = evt_id.fiducials()\n event_codes: Any = evr.eventCodes(evt)\n\n if isinstance(self._task_parameters.pv_camera_length, float):\n clen: float = self._task_parameters.pv_camera_length\n else:\n clen = (\n ds.env().epicsStore().value(self._task_parameters.pv_camera_length)\n )\n\n if self._task_parameters.event_logic:\n if not self._task_parameters.event_code in event_codes:\n continue\n\n img: Any = det.calib(evt)\n\n if img is None:\n num_empty_images += 1\n continue\n\n if alg is None:\n det_shape: Tuple[int, ...] 
= img.shape\n if len(det_shape) == 3:\n det_shape = (det_shape[0] * det_shape[1], det_shape[2])\n else:\n det_shape = img.shape\n\n mask: NDArray[numpy.uint16] = numpy.ones(det_shape).astype(numpy.uint16)\n\n if self._task_parameters.psana_mask:\n mask = det.mask(\n self.task_parameters.run,\n calib=False,\n status=True,\n edges=False,\n centra=False,\n unbond=False,\n unbondnbrs=False,\n ).astype(numpy.uint16)\n\n hdffh: Any\n if self._task_parameters.mask_file is not None:\n with h5py.File(self._task_parameters.mask_file, \"r\") as hdffh:\n loaded_mask: NDArray[numpy.int] = hdffh[\"entry_1/data_1/mask\"][\n :\n ]\n mask *= loaded_mask.astype(numpy.uint16)\n\n file_writer: CxiWriter = CxiWriter(\n outdir=self._task_parameters.outdir,\n rank=ds.rank,\n exp=self._task_parameters.lute_config.experiment,\n run=self._task_parameters.lute_config.run,\n n_events=self._task_parameters.n_events,\n det_shape=det_shape,\n i_x=i_x,\n i_y=i_y,\n ipx=ipx,\n ipy=ipy,\n min_peaks=self._task_parameters.min_peaks,\n max_peaks=self._task_parameters.max_peaks,\n tag=tag,\n )\n alg: Any = PyAlgos(mask=mask, pbits=0) # pbits controls verbosity\n alg.set_peak_selection_pars(\n npix_min=self._task_parameters.npix_min,\n npix_max=self._task_parameters.npix_max,\n amax_thr=self._task_parameters.amax_thr,\n atot_thr=self._task_parameters.atot_thr,\n son_min=self._task_parameters.son_min,\n )\n\n if self._task_parameters.compression is not None:\n\n libpressio_config = generate_libpressio_configuration(\n compressor=self._task_parameters.compression.compressor,\n roi_window_size=self._task_parameters.compression.roi_window_size,\n bin_size=self._task_parameters.compression.bin_size,\n abs_error=self._task_parameters.compression.abs_error,\n libpressio_mask=mask,\n )\n\n powder_hits: NDArray[numpy.float_] = numpy.zeros(det_shape)\n powder_misses: NDArray[numpy.float_] = numpy.zeros(det_shape)\n\n peaks: Any = alg.peak_finder_v3r3(\n img,\n rank=self._task_parameters.peak_rank,\n r0=self._task_parameters.r0,\n dr=self._task_parameters.dr,\n # nsigm=self._task_parameters.nsigm,\n )\n\n num_events += 1\n\n if (peaks.shape[0] >= self._task_parameters.min_peaks) and (\n peaks.shape[0] <= self._task_parameters.max_peaks\n ):\n\n if self._task_parameters.compression is not None:\n\n libpressio_config_with_peaks = (\n add_peaks_to_libpressio_configuration(libpressio_config, peaks)\n )\n compressor = PressioCompressor.from_config(\n libpressio_config_with_peaks\n )\n compressed_img = compressor.encode(img)\n decompressed_img = numpy.zeros_like(img)\n decompressed = compressor.decode(compressed_img, decompressed_img)\n img = decompressed_img\n\n try:\n photon_energy: float = (\n Detector(\"EBeam\").get(evt).ebeamPhotonEnergy()\n )\n except AttributeError:\n photon_energy = (\n 1.23984197386209e-06\n / ds.env().epicsStore().value(\"SIOC:SYS0:ML00:AO192\")\n / 1.0e9\n )\n\n file_writer.write_event(\n img=img,\n peaks=peaks,\n timestamp_seconds=timestamp_seconds,\n timestamp_nanoseconds=timestamp_nanoseconds,\n timestamp_fiducials=timestamp_fiducials,\n photon_energy=photon_energy,\n )\n num_hits += 1\n\n # TODO: Fix bug here\n # generate / update powders\n if peaks.shape[0] >= self._task_parameters.min_peaks:\n powder_hits = numpy.maximum(\n powder_hits,\n img.reshape(-1, img.shape[-1]),\n )\n else:\n powder_misses = numpy.maximum(\n powder_misses,\n img.reshape(-1, img.shape[-1]),\n )\n\n if num_empty_images != 0:\n msg: Message = Message(\n contents=f\"Rank {ds.rank} encountered {num_empty_images} empty images.\"\n )\n 
self._report_to_executor(msg)\n\n file_writer.write_non_event_data(\n powder_hits=powder_hits,\n powder_misses=powder_misses,\n mask=mask,\n clen=clen,\n )\n\n file_writer.optimize_and_close_file(\n num_hits=num_hits, max_peaks=self._task_parameters.max_peaks\n )\n\n COMM_WORLD.Barrier()\n\n num_hits_per_rank: List[int] = COMM_WORLD.gather(num_hits, root=0)\n num_hits_total: int = COMM_WORLD.reduce(num_hits, SUM)\n num_events_per_rank: List[int] = COMM_WORLD.gather(num_events, root=0)\n\n if ds.rank == 0:\n master_fname: Path = write_master_file(\n mpi_size=ds.size,\n outdir=self._task_parameters.outdir,\n exp=self._task_parameters.lute_config.experiment,\n run=self._task_parameters.lute_config.run,\n tag=tag,\n n_hits_per_rank=num_hits_per_rank,\n n_hits_total=num_hits_total,\n )\n\n # Write final summary file\n f: TextIO\n with open(\n Path(self._task_parameters.outdir) / f\"peakfinding{tag}.summary\", \"w\"\n ) as f:\n print(f\"Number of events processed: {num_events_per_rank[-1]}\", file=f)\n print(f\"Number of hits found: {num_hits_total}\", file=f)\n print(\n \"Fractional hit rate: \"\n f\"{(num_hits_total/num_events_per_rank[-1]):.2f}\",\n file=f,\n )\n print(f\"No. hits per rank: {num_hits_per_rank}\", file=f)\n\n with open(Path(self._task_parameters.out_file), \"w\") as f:\n print(f\"{master_fname}\", file=f)\n\n # Write out_file\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.add_peaks_to_libpressio_configuration","title":"add_peaks_to_libpressio_configuration(lp_json, peaks)
","text":"Add peak infromation to libpressio configuration
Parameters:
lp_json: Dictionary storing the configuration JSON structure for the libpressio\n library.\n\npeaks (Any): Peak information as returned by psana.\n
Returns:
lp_json: Updated configuration JSON structure for the libpressio library.\n
Source code in lute/tasks/sfx_find_peaks.py
def add_peaks_to_libpressio_configuration(lp_json, peaks) -> Dict[str, Any]:\n \"\"\"\n Add peak infromation to libpressio configuration\n\n Parameters:\n\n lp_json: Dictionary storing the configuration JSON structure for the libpressio\n library.\n\n peaks (Any): Peak information as returned by psana.\n\n Returns:\n\n lp_json: Updated configuration JSON structure for the libpressio library.\n \"\"\"\n lp_json[\"compressor_config\"][\"pressio\"][\"roibin\"][\"roibin:centers\"] = (\n numpy.ascontiguousarray(numpy.uint64(peaks[:, [2, 1, 0]]))\n )\n return lp_json\n
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.generate_libpressio_configuration","title":"generate_libpressio_configuration(compressor, roi_window_size, bin_size, abs_error, libpressio_mask)
","text":"Create the configuration JSON for the libpressio library
Parameters:
compressor (Literal[\"sz3\", \"qoz\"]): Compression algorithm to use\n (\"qoz\" or \"sz3\").\n\nabs_error (float): Bound value for the absolute error.\n\nbin_size (int): Bining Size.\n\nroi_window_size (int): Default size of the ROI window.\n\nlibpressio_mask (NDArray): mask to be applied to the data.\n
Returns:
lp_json (Dict[str, Any]): Dictionary storing the JSON configuration structure\nfor the libpressio library\n
Source code in lute/tasks/sfx_find_peaks.py
def generate_libpressio_configuration(\n compressor: Literal[\"sz3\", \"qoz\"],\n roi_window_size: int,\n bin_size: int,\n abs_error: float,\n libpressio_mask,\n) -> Dict[str, Any]:\n \"\"\"\n Create the configuration JSON for the libpressio library\n\n Parameters:\n\n compressor (Literal[\"sz3\", \"qoz\"]): Compression algorithm to use\n (\"qoz\" or \"sz3\").\n\n abs_error (float): Bound value for the absolute error.\n\n bin_size (int): Bining Size.\n\n roi_window_size (int): Default size of the ROI window.\n\n libpressio_mask (NDArray): mask to be applied to the data.\n\n Returns:\n\n lp_json (Dict[str, Any]): Dictionary storing the JSON configuration structure\n for the libpressio library\n \"\"\"\n\n if compressor == \"qoz\":\n pressio_opts: Dict[str, Any] = {\n \"pressio:abs\": abs_error,\n \"qoz\": {\"qoz:stride\": 8},\n }\n elif compressor == \"sz3\":\n pressio_opts = {\"pressio:abs\": abs_error}\n\n lp_json = {\n \"compressor_id\": \"pressio\",\n \"early_config\": {\n \"pressio\": {\n \"pressio:compressor\": \"roibin\",\n \"roibin\": {\n \"roibin:metric\": \"composite\",\n \"roibin:background\": \"mask_binning\",\n \"roibin:roi\": \"fpzip\",\n \"background\": {\n \"binning:compressor\": \"pressio\",\n \"mask_binning:compressor\": \"pressio\",\n \"pressio\": {\"pressio:compressor\": compressor},\n },\n \"composite\": {\n \"composite:plugins\": [\n \"size\",\n \"time\",\n \"input_stats\",\n \"error_stat\",\n ]\n },\n },\n }\n },\n \"compressor_config\": {\n \"pressio\": {\n \"roibin\": {\n \"roibin:roi_size\": [roi_window_size, roi_window_size, 0],\n \"roibin:centers\": None, # \"roibin:roi_strategy\": \"coordinates\",\n \"roibin:nthreads\": 4,\n \"roi\": {\"fpzip:prec\": 0},\n \"background\": {\n \"mask_binning:mask\": None,\n \"mask_binning:shape\": [bin_size, bin_size, 1],\n \"mask_binning:nthreads\": 4,\n \"pressio\": pressio_opts,\n },\n }\n }\n },\n \"name\": \"pressio\",\n }\n\n lp_json[\"compressor_config\"][\"pressio\"][\"roibin\"][\"background\"][\n \"mask_binning:mask\"\n ] = (1 - libpressio_mask)\n\n return lp_json\n
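A rough usage sketch (values are illustrative, and calling this requires the psana/libpressio-capable environment the Task assumes).

import numpy

from lute.tasks.sfx_find_peaks import generate_libpressio_configuration

# Sketch: build a compression configuration for a small dummy mask.
mask = numpy.ones((8, 8), dtype=numpy.uint16)
config = generate_libpressio_configuration(
    compressor="sz3",
    roi_window_size=9,
    bin_size=2,
    abs_error=10.0,
    libpressio_mask=mask,
)
# The ROI size and binning shape end up nested under the "roibin" plugin:
print(config["compressor_config"]["pressio"]["roibin"]["roibin:roi_size"])  # [9, 9, 0]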
"},{"location":"source/tasks/sfx_find_peaks/#tasks.sfx_find_peaks.write_master_file","title":"write_master_file(mpi_size, outdir, exp, run, tag, n_hits_per_rank, n_hits_total)
","text":"Generate a virtual dataset to map all individual files for this run.
Parameters:
mpi_size (int): Number of ranks in the MPI pool.\n\noutdir (str): Output directory for cxi file.\n\nexp (str): Experiment string.\n\nrun (int): Experimental run.\n\ntag (str): Tag to append to cxi file names.\n\nn_hits_per_rank (List[int]): Array containing the number of hits found on each\n node processing data.\n\nn_hits_total (int): Total number of hits found across all nodes.\n
Returns:
The path to the written master file
Source code in lute/tasks/sfx_find_peaks.py
def write_master_file(\n mpi_size: int,\n outdir: str,\n exp: str,\n run: int,\n tag: str,\n n_hits_per_rank: List[int],\n n_hits_total: int,\n) -> Path:\n \"\"\"\n Generate a virtual dataset to map all individual files for this run.\n\n Parameters:\n\n mpi_size (int): Number of ranks in the MPI pool.\n\n outdir (str): Output directory for cxi file.\n\n exp (str): Experiment string.\n\n run (int): Experimental run.\n\n tag (str): Tag to append to cxi file names.\n\n n_hits_per_rank (List[int]): Array containing the number of hits found on each\n node processing data.\n\n n_hits_total (int): Total number of hits found across all nodes.\n\n Returns:\n\n The path to the the written master file\n \"\"\"\n # Retrieve paths to the files containing data\n fnames: List[Path] = []\n fi: int\n for fi in range(mpi_size):\n if n_hits_per_rank[fi] > 0:\n fnames.append(Path(outdir) / f\"{exp}_r{run:0>4}_{fi}{tag}.cxi\")\n if len(fnames) == 0:\n sys.exit(\"No hits found\")\n\n # Retrieve list of entries to populate in the virtual hdf5 file\n dname_list, key_list, shape_list, dtype_list = [], [], [], []\n datasets = [\"/entry_1/result_1\", \"/LCLS/detector_1\", \"/LCLS\", \"/entry_1/data_1\"]\n f = h5py.File(fnames[0], \"r\")\n for dname in datasets:\n dset = f[dname]\n for key in dset.keys():\n if f\"{dname}/{key}\" not in datasets:\n dname_list.append(dname)\n key_list.append(key)\n shape_list.append(dset[key].shape)\n dtype_list.append(dset[key].dtype)\n f.close()\n\n # Compute cumulative powder hits and misses for all files\n powder_hits, powder_misses = None, None\n for fn in fnames:\n f = h5py.File(fn, \"r\")\n if powder_hits is None:\n powder_hits = f[\"entry_1/data_1/powderHits\"][:].copy()\n powder_misses = f[\"entry_1/data_1/powderMisses\"][:].copy()\n else:\n powder_hits = numpy.maximum(\n powder_hits, f[\"entry_1/data_1/powderHits\"][:].copy()\n )\n powder_misses = numpy.maximum(\n powder_misses, f[\"entry_1/data_1/powderMisses\"][:].copy()\n )\n f.close()\n\n vfname: Path = Path(outdir) / f\"{exp}_r{run:0>4}{tag}.cxi\"\n with h5py.File(vfname, \"w\") as vdf:\n\n # Write the virtual hdf5 file\n for dnum in range(len(dname_list)):\n dname = f\"{dname_list[dnum]}/{key_list[dnum]}\"\n if key_list[dnum] not in [\"mask\", \"powderHits\", \"powderMisses\"]:\n layout = h5py.VirtualLayout(\n shape=(n_hits_total,) + shape_list[dnum][1:], dtype=dtype_list[dnum]\n )\n cursor = 0\n for i, fn in enumerate(fnames):\n vsrc = h5py.VirtualSource(\n fn, dname, shape=(n_hits_per_rank[i],) + shape_list[dnum][1:]\n )\n if len(shape_list[dnum]) == 1:\n layout[cursor : cursor + n_hits_per_rank[i]] = vsrc\n else:\n layout[cursor : cursor + n_hits_per_rank[i], :] = vsrc\n cursor += n_hits_per_rank[i]\n vdf.create_virtual_dataset(dname, layout, fillvalue=-1)\n\n vdf[\"entry_1/data_1/powderHits\"] = powder_hits\n vdf[\"entry_1/data_1/powderMisses\"] = powder_misses\n\n return vfname\n
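For orientation only, the merged virtual file can be inspected with h5py like any other CXI file; the path below is hypothetical.

import h5py

# Sketch: read peak counts back from the merged virtual dataset.
with h5py.File("mfxp12345_r0001.cxi", "r") as fh:
    n_peaks = fh["/entry_1/result_1/nPeaks"][:]
    print(f"{n_peaks.size} hits; most peaks in one event: {n_peaks.max()}")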
"},{"location":"source/tasks/sfx_index/","title":"sfx_index","text":"Classes for indexing tasks in SFX.
Classes:
Name DescriptionConcatenateStreamFiles
Task that merges multiple stream files into a single file.
"},{"location":"source/tasks/sfx_index/#tasks.sfx_index.ConcatenateStreamFiles","title":"ConcatenateStreamFiles
","text":" Bases: Task
Task that merges stream files located within a directory tree.
Source code in lute/tasks/sfx_index.py
class ConcatenateStreamFiles(Task):\n \"\"\"\n Task that merges stream files located within a directory tree.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n\n stream_file_path: Path = Path(self._task_parameters.in_file)\n stream_file_list: List[Path] = list(\n stream_file_path.rglob(f\"{self._task_parameters.tag}_*.stream\")\n )\n\n processed_file_list = [str(stream_file) for stream_file in stream_file_list]\n\n msg: Message = Message(\n contents=f\"Merging following stream files: {processed_file_list} into \"\n f\"{self._task_parameters.out_file}\",\n )\n self._report_to_executor(msg)\n\n wfd: BinaryIO\n with open(self._task_parameters.out_file, \"wb\") as wfd:\n infile: Path\n for infile in stream_file_list:\n fd: BinaryIO\n with open(infile, \"rb\") as fd:\n shutil.copyfileobj(fd, wfd)\n
"},{"location":"source/tasks/task/","title":"task","text":"Base classes for implementing analysis tasks.
Classes:
Name DescriptionTask
Abstract base class from which all analysis tasks are derived.
ThirdPartyTask
Class to run a third-party executable binary as a Task
.
DescribedAnalysis
dataclass
","text":"Complete analysis description. Held by an Executor.
Source code in lute/tasks/dataclasses.py
@dataclass\nclass DescribedAnalysis:\n \"\"\"Complete analysis description. Held by an Executor.\"\"\"\n\n task_result: TaskResult\n task_parameters: Optional[TaskParameters]\n task_env: Dict[str, str]\n poll_interval: float\n communicator_desc: List[str]\n
"},{"location":"source/tasks/task/#tasks.task.ElogSummaryPlots","title":"ElogSummaryPlots
dataclass
","text":"Holds a graphical summary intended for display in the eLog.
Attributes:
Name Type Descriptiondisplay_name
str
This represents both a path and how the result will be displayed in the eLog. Can include \"/\" characters. E.g. display_name = \"scans/my_motor_scan\"
will have plots shown on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors how the file is stored on disk as well.
lute/tasks/dataclasses.py
@dataclass\nclass ElogSummaryPlots:\n \"\"\"Holds a graphical summary intended for display in the eLog.\n\n Attributes:\n display_name (str): This represents both a path and how the result will be\n displayed in the eLog. Can include \"/\" characters. E.g.\n `display_name = \"scans/my_motor_scan\"` will have plots shown\n on a \"my_motor_scan\" page, under a \"scans\" tab. This format mirrors\n how the file is stored on disk as well.\n \"\"\"\n\n display_name: str\n figures: Union[pn.Tabs, hv.Image, plt.Figure]\n
"},{"location":"source/tasks/task/#tasks.task.Task","title":"Task
","text":" Bases: ABC
Abstract base class for analysis tasks.
Attributes:
Name Type Descriptionname
str
The name of the Task.
Source code in lute/tasks/task.py
class Task(ABC):\n \"\"\"Abstract base class for analysis tasks.\n\n Attributes:\n name (str): The name of the Task.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters, use_mpi: bool = False) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. These are NOT related to execution parameters\n (number of cores, etc), except, potentially, in case of binary\n executable sub-classes.\n\n use_mpi (bool): Whether this Task requires the use of MPI.\n This determines the behaviour and timing of certain signals\n and ensures appropriate barriers are placed to not end\n processing until all ranks have finished.\n \"\"\"\n self.name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n self._result: TaskResult = TaskResult(\n task_name=self.name,\n task_status=TaskStatus.PENDING,\n summary=\"PENDING\",\n payload=\"\",\n )\n self._task_parameters: TaskParameters = params\n timeout: int = self._task_parameters.lute_config.task_timeout\n signal.setitimer(signal.ITIMER_REAL, timeout)\n\n run_directory: Optional[str] = self._task_parameters.Config.run_directory\n if run_directory is not None:\n try:\n os.chdir(run_directory)\n except FileNotFoundError:\n warnings.warn(\n (\n f\"Attempt to change to {run_directory}, but it is not found!\\n\"\n f\"Will attempt to run from {os.getcwd()}. It may fail!\"\n ),\n category=UserWarning,\n )\n self._use_mpi: bool = use_mpi\n\n def run(self) -> None:\n \"\"\"Calls the analysis routines and any pre/post task functions.\n\n This method is part of the public API and should not need to be modified\n in any subclasses.\n \"\"\"\n self._signal_start()\n self._pre_run()\n self._run()\n self._post_run()\n self._signal_result()\n\n @abstractmethod\n def _run(self) -> None:\n \"\"\"Actual analysis to run. 
Overridden by subclasses.\n\n Separating the calling API from the implementation allows `run` to\n have pre and post task functionality embedded easily into a single\n function call.\n \"\"\"\n ...\n\n def _pre_run(self) -> None:\n \"\"\"Code to run BEFORE the main analysis takes place.\n\n This function may, or may not, be employed by subclasses.\n \"\"\"\n ...\n\n def _post_run(self) -> None:\n \"\"\"Code to run AFTER the main analysis takes place.\n\n This function may, or may not, be employed by subclasses.\n \"\"\"\n ...\n\n @property\n def result(self) -> TaskResult:\n \"\"\"TaskResult: Read-only Task Result information.\"\"\"\n return self._result\n\n def __call__(self) -> None:\n self.run()\n\n def _signal_start(self) -> None:\n \"\"\"Send the signal that the Task will begin shortly.\"\"\"\n start_msg: Message = Message(\n contents=self._task_parameters, signal=\"TASK_STARTED\"\n )\n self._result.task_status = TaskStatus.RUNNING\n if self._use_mpi:\n from mpi4py import MPI\n\n comm: MPI.Intracomm = MPI.COMM_WORLD\n rank: int = comm.Get_rank()\n comm.Barrier()\n if rank == 0:\n self._report_to_executor(start_msg)\n else:\n self._report_to_executor(start_msg)\n\n def _signal_result(self) -> None:\n \"\"\"Send the signal that results are ready along with the results.\"\"\"\n signal: str = \"TASK_RESULT\"\n results_msg: Message = Message(contents=self.result, signal=signal)\n if self._use_mpi:\n from mpi4py import MPI\n\n comm: MPI.Intracomm = MPI.COMM_WORLD\n rank: int = comm.Get_rank()\n comm.Barrier()\n if rank == 0:\n self._report_to_executor(results_msg)\n else:\n self._report_to_executor(results_msg)\n time.sleep(0.1)\n\n def _report_to_executor(self, msg: Message) -> None:\n \"\"\"Send a message to the Executor.\n\n Details of `Communicator` choice are hidden from the caller. This\n method may be overriden by subclasses with specialized functionality.\n\n Args:\n msg (Message): The message object to send.\n \"\"\"\n communicator: Communicator\n if isinstance(msg.contents, str) or msg.contents is None:\n communicator = PipeCommunicator()\n else:\n communicator = SocketCommunicator()\n\n communicator.delayed_setup()\n communicator.write(msg)\n communicator.clear_communicator()\n\n def clean_up_timeout(self) -> None:\n \"\"\"Perform any necessary cleanup actions before exit if timing out.\"\"\"\n ...\n
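A minimal subclass sketch (the TaskParameters import path below is an assumption); only _run must be overridden, while the public run() adds the start/result signalling shown above.

from lute.io.models.base import TaskParameters  # assumed import path
from lute.tasks.dataclasses import TaskStatus
from lute.tasks.task import Task


class SquareTask(Task):
    """Hypothetical Task that squares a numeric parameter."""

    def __init__(self, *, params: TaskParameters) -> None:
        super().__init__(params=params)

    def _run(self) -> None:
        # `value` is a hypothetical field on the corresponding parameter model.
        value: float = getattr(self._task_parameters, "value", 2.0)
        self._result.payload = value**2
        self._result.summary = f"Squared {value}."

    def _post_run(self) -> None:
        super()._post_run()
        self._result.task_status = TaskStatus.COMPLETED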
"},{"location":"source/tasks/task/#tasks.task.Task.result","title":"result: TaskResult
property
","text":"TaskResult: Read-only Task Result information.
"},{"location":"source/tasks/task/#tasks.task.Task.__init__","title":"__init__(*, params, use_mpi=False)
","text":"Initialize a Task.
Parameters:
Name Type Description Defaultparams
TaskParameters
Parameters needed to properly configure the analysis task. These are NOT related to execution parameters (number of cores, etc), except, potentially, in case of binary executable sub-classes.
requireduse_mpi
bool
Whether this Task requires the use of MPI. This determines the behaviour and timing of certain signals and ensures appropriate barriers are placed to not end processing until all ranks have finished.
False
Source code in lute/tasks/task.py
def __init__(self, *, params: TaskParameters, use_mpi: bool = False) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. These are NOT related to execution parameters\n (number of cores, etc), except, potentially, in case of binary\n executable sub-classes.\n\n use_mpi (bool): Whether this Task requires the use of MPI.\n This determines the behaviour and timing of certain signals\n and ensures appropriate barriers are placed to not end\n processing until all ranks have finished.\n \"\"\"\n self.name: str = str(type(self)).split(\"'\")[1].split(\".\")[-1]\n self._result: TaskResult = TaskResult(\n task_name=self.name,\n task_status=TaskStatus.PENDING,\n summary=\"PENDING\",\n payload=\"\",\n )\n self._task_parameters: TaskParameters = params\n timeout: int = self._task_parameters.lute_config.task_timeout\n signal.setitimer(signal.ITIMER_REAL, timeout)\n\n run_directory: Optional[str] = self._task_parameters.Config.run_directory\n if run_directory is not None:\n try:\n os.chdir(run_directory)\n except FileNotFoundError:\n warnings.warn(\n (\n f\"Attempt to change to {run_directory}, but it is not found!\\n\"\n f\"Will attempt to run from {os.getcwd()}. It may fail!\"\n ),\n category=UserWarning,\n )\n self._use_mpi: bool = use_mpi\n
"},{"location":"source/tasks/task/#tasks.task.Task.clean_up_timeout","title":"clean_up_timeout()
","text":"Perform any necessary cleanup actions before exit if timing out.
Source code in lute/tasks/task.py
def clean_up_timeout(self) -> None:\n \"\"\"Perform any necessary cleanup actions before exit if timing out.\"\"\"\n ...\n
"},{"location":"source/tasks/task/#tasks.task.Task.run","title":"run()
","text":"Calls the analysis routines and any pre/post task functions.
This method is part of the public API and should not need to be modified in any subclasses.
Source code in lute/tasks/task.py
def run(self) -> None:\n \"\"\"Calls the analysis routines and any pre/post task functions.\n\n This method is part of the public API and should not need to be modified\n in any subclasses.\n \"\"\"\n self._signal_start()\n self._pre_run()\n self._run()\n self._post_run()\n self._signal_result()\n
"},{"location":"source/tasks/task/#tasks.task.TaskResult","title":"TaskResult
dataclass
","text":"Class for storing the result of a Task's execution with metadata.
Attributes:
Name Type Descriptiontask_name
str
Name of the associated task which produced it.
task_status
TaskStatus
Status of associated task.
summary
str
Short message/summary associated with the result.
payload
Any
Actual result. May be data in any format.
impl_schemas
Optional[str]
A string listing Task
schemas implemented by the associated Task
. Schemas define the category and expected output of the Task
. An individual task may implement/conform to multiple schemas. Multiple schemas are separated by ';', e.g. * impl_schemas = \"schema1;schema2\"
lute/tasks/dataclasses.py
@dataclass\nclass TaskResult:\n \"\"\"Class for storing the result of a Task's execution with metadata.\n\n Attributes:\n task_name (str): Name of the associated task which produced it.\n\n task_status (TaskStatus): Status of associated task.\n\n summary (str): Short message/summary associated with the result.\n\n payload (Any): Actual result. May be data in any format.\n\n impl_schemas (Optional[str]): A string listing `Task` schemas implemented\n by the associated `Task`. Schemas define the category and expected\n output of the `Task`. An individual task may implement/conform to\n multiple schemas. Multiple schemas are separated by ';', e.g.\n * impl_schemas = \"schema1;schema2\"\n \"\"\"\n\n task_name: str\n task_status: TaskStatus\n summary: str\n payload: Any\n impl_schemas: Optional[str] = None\n
"},{"location":"source/tasks/task/#tasks.task.TaskStatus","title":"TaskStatus
","text":" Bases: Enum
Possible Task statuses.
Source code in lute/tasks/dataclasses.py
class TaskStatus(Enum):\n \"\"\"Possible Task statuses.\"\"\"\n\n PENDING = 0\n \"\"\"\n Task has yet to run. Is Queued, or waiting for prior tasks.\n \"\"\"\n RUNNING = 1\n \"\"\"\n Task is in the process of execution.\n \"\"\"\n COMPLETED = 2\n \"\"\"\n Task has completed without fatal errors.\n \"\"\"\n FAILED = 3\n \"\"\"\n Task encountered a fatal error.\n \"\"\"\n STOPPED = 4\n \"\"\"\n Task was, potentially temporarily, stopped/suspended.\n \"\"\"\n CANCELLED = 5\n \"\"\"\n Task was cancelled prior to completion or failure.\n \"\"\"\n TIMEDOUT = 6\n \"\"\"\n Task did not reach completion due to timeout.\n \"\"\"\n
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.CANCELLED","title":"CANCELLED = 5
class-attribute
instance-attribute
","text":"Task was cancelled prior to completion or failure.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.COMPLETED","title":"COMPLETED = 2
class-attribute
instance-attribute
","text":"Task has completed without fatal errors.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.FAILED","title":"FAILED = 3
class-attribute
instance-attribute
","text":"Task encountered a fatal error.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.PENDING","title":"PENDING = 0
class-attribute
instance-attribute
","text":"Task has yet to run. Is Queued, or waiting for prior tasks.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.RUNNING","title":"RUNNING = 1
class-attribute
instance-attribute
","text":"Task is in the process of execution.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.STOPPED","title":"STOPPED = 4
class-attribute
instance-attribute
","text":"Task was, potentially temporarily, stopped/suspended.
"},{"location":"source/tasks/task/#tasks.task.TaskStatus.TIMEDOUT","title":"TIMEDOUT = 6
class-attribute
instance-attribute
","text":"Task did not reach completion due to timeout.
"},{"location":"source/tasks/task/#tasks.task.ThirdPartyTask","title":"ThirdPartyTask
","text":" Bases: Task
A Task
interface to analysis with binary executables.
lute/tasks/task.py
class ThirdPartyTask(Task):\n \"\"\"A `Task` interface to analysis with binary executables.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. `Task`s of this type MUST include the name\n of a binary to run and any arguments which should be passed to\n it (as would be done via command line). The binary is included\n with the parameter `executable`. All other parameter names are\n assumed to be the long/extended names of the flag passed on the\n command line by default:\n * `arg_name = 3` is converted to `--arg_name 3`\n Positional arguments can be included with `p_argN` where `N` is\n any integer:\n * `p_arg1 = 3` is converted to `3`\n\n Note that it is NOT recommended to rely on this default behaviour\n as command-line arguments can be passed in many ways. Refer to\n the dcoumentation at\n https://slac-lcls.github.io/lute/tutorial/new_task/\n under \"Speciyfing a TaskParameters Model for your Task\" for more\n information on how to control parameter parsing from within your\n TaskParameters model definition.\n \"\"\"\n super().__init__(params=params)\n self._cmd = self._task_parameters.executable\n self._args_list: List[str] = [self._cmd]\n self._template_context: Dict[str, Any] = {}\n\n def _add_to_jinja_context(self, param_name: str, value: Any) -> None:\n \"\"\"Store a parameter as a Jinja template variable.\n\n Variables are stored in a dictionary which is used to fill in a\n premade Jinja template for a third party configuration file.\n\n Args:\n param_name (str): Name to store the variable as. This should be\n the name defined in the corresponding pydantic model. This name\n MUST match the name used in the Jinja Template!\n value (Any): The value to store. If possible, large chunks of the\n template should be represented as a single dictionary for\n simplicity; however, any type can be stored as needed.\n \"\"\"\n context_update: Dict[str, Any] = {param_name: value}\n if __debug__:\n msg: Message = Message(contents=f\"TemplateParameters: {context_update}\")\n self._report_to_executor(msg)\n self._template_context.update(context_update)\n\n def _template_to_config_file(self) -> None:\n \"\"\"Convert a template file into a valid configuration file.\n\n Uses Jinja to fill in a provided template file with variables supplied\n through the LUTE config file. This facilitates parameter modification\n for third party tasks which use a separate configuration, in addition\n to, or instead of, command-line arguments.\n \"\"\"\n from jinja2 import Environment, FileSystemLoader, Template\n\n out_file: str = self._task_parameters.lute_template_cfg.output_path\n template_name: str = self._task_parameters.lute_template_cfg.template_name\n\n lute_path: Optional[str] = os.getenv(\"LUTE_PATH\")\n template_dir: str\n if lute_path is None:\n warnings.warn(\n \"LUTE_PATH is None in Task process! 
Using relative path for templates!\",\n category=UserWarning,\n )\n template_dir: str = \"../../config/templates\"\n else:\n template_dir = f\"{lute_path}/config/templates\"\n environment: Environment = Environment(loader=FileSystemLoader(template_dir))\n template: Template = environment.get_template(template_name)\n\n with open(out_file, \"w\", encoding=\"utf-8\") as cfg_out:\n cfg_out.write(template.render(self._template_context))\n\n def _pre_run(self) -> None:\n \"\"\"Parse the parameters into an appropriate argument list.\n\n Arguments are identified by a `flag_type` attribute, defined in the\n pydantic model, which indicates how to pass the parameter and its\n argument on the command-line. This method parses flag:value pairs\n into an appropriate list to be used to call the executable.\n\n Note:\n ThirdPartyParameter objects are returned by custom model validators.\n Objects of this type are assumed to be used for a templated config\n file used by the third party executable for configuration. The parsing\n of these parameters is performed separately by a template file used as\n an input to Jinja. This method solely identifies the necessary objects\n and passes them all along. Refer to the template files and pydantic\n models for more information on how these parameters are defined and\n identified.\n \"\"\"\n super()._pre_run()\n full_schema: Dict[str, Union[str, Dict[str, Any]]] = (\n self._task_parameters.schema()\n )\n short_flags_use_eq: bool\n long_flags_use_eq: bool\n if hasattr(self._task_parameters.Config, \"short_flags_use_eq\"):\n short_flags_use_eq: bool = self._task_parameters.Config.short_flags_use_eq\n long_flags_use_eq: bool = self._task_parameters.Config.long_flags_use_eq\n else:\n short_flags_use_eq = False\n long_flags_use_eq = False\n for param, value in self._task_parameters.dict().items():\n # Clunky test with __dict__[param] because compound model-types are\n # converted to `dict`. E.g. type(value) = dict not AnalysisHeader\n if (\n param == \"executable\"\n or value is None # Cannot have empty values in argument list for execvp\n or value == \"\" # But do want to include, e.g. 0\n or isinstance(self._task_parameters.__dict__[param], TemplateConfig)\n or isinstance(self._task_parameters.__dict__[param], AnalysisHeader)\n ):\n continue\n if isinstance(self._task_parameters.__dict__[param], TemplateParameters):\n # TemplateParameters objects have a single parameter `params`\n self._add_to_jinja_context(param_name=param, value=value.params)\n continue\n\n param_attributes: Dict[str, Any] = full_schema[\"properties\"][param]\n # Some model params do not match the commnad-line parameter names\n param_repr: str\n if \"rename_param\" in param_attributes:\n param_repr = param_attributes[\"rename_param\"]\n else:\n param_repr = param\n if \"flag_type\" in param_attributes:\n flag: str = param_attributes[\"flag_type\"]\n if flag:\n # \"-\" or \"--\" flags\n if flag == \"--\" and isinstance(value, bool) and not value:\n continue\n constructed_flag: str = f\"{flag}{param_repr}\"\n if flag == \"--\" and isinstance(value, bool) and value:\n # On/off flag, e.g. something like --verbose: No Arg\n self._args_list.append(f\"{constructed_flag}\")\n continue\n if (flag == \"-\" and short_flags_use_eq) or (\n flag == \"--\" and long_flags_use_eq\n ): # Must come after above check! 
Otherwise you get --param=True\n # Flags following --param=value or -param=value\n constructed_flag = f\"{constructed_flag}={value}\"\n self._args_list.append(f\"{constructed_flag}\")\n continue\n self._args_list.append(f\"{constructed_flag}\")\n else:\n warnings.warn(\n (\n f\"Model parameters should be defined using Field(...,flag_type='')\"\n f\" in the future. Parameter: {param}\"\n ),\n category=PendingDeprecationWarning,\n )\n if len(param) == 1: # Single-dash flags\n if short_flags_use_eq:\n self._args_list.append(f\"-{param_repr}={value}\")\n continue\n self._args_list.append(f\"-{param_repr}\")\n elif \"p_arg\" in param: # Positional arguments\n pass\n else: # Double-dash flags\n if isinstance(value, bool) and not value:\n continue\n if long_flags_use_eq:\n self._args_list.append(f\"--{param_repr}={value}\")\n continue\n self._args_list.append(f\"--{param_repr}\")\n if isinstance(value, bool) and value:\n continue\n if isinstance(value, str) and \" \" in value:\n for val in value.split():\n self._args_list.append(f\"{val}\")\n else:\n self._args_list.append(f\"{value}\")\n if (\n hasattr(self._task_parameters, \"lute_template_cfg\")\n and self._template_context\n ):\n self._template_to_config_file()\n\n def _run(self) -> None:\n \"\"\"Execute the new program by replacing the current process.\"\"\"\n if __debug__:\n time.sleep(0.1)\n msg: Message = Message(contents=self._formatted_command())\n self._report_to_executor(msg)\n LUTE_DEBUG_EXIT(\"LUTE_DEBUG_BEFORE_TPP_EXEC\")\n os.execvp(file=self._cmd, args=self._args_list)\n\n def _formatted_command(self) -> str:\n \"\"\"Returns the command as it would passed on the command-line.\"\"\"\n formatted_cmd: str = \"\".join(f\"{arg} \" for arg in self._args_list)\n return formatted_cmd\n\n def _signal_start(self) -> None:\n \"\"\"Override start signal method to switch communication methods.\"\"\"\n super()._signal_start()\n time.sleep(0.05)\n signal: str = \"NO_PICKLE_MODE\"\n msg: Message = Message(signal=signal)\n self._report_to_executor(msg)\n
"},{"location":"source/tasks/task/#tasks.task.ThirdPartyTask.__init__","title":"__init__(*, params)
","text":"Initialize a Task.
Parameters:
Name Type Description Defaultparams
TaskParameters
Parameters needed to properly configure the analysis task. Task
s of this type MUST include the name of a binary to run and any arguments which should be passed to it (as would be done via command line). The binary is included with the parameter executable
. All other parameter names are assumed to be the long/extended names of the flag passed on the command line by default: * arg_name = 3
is converted to --arg_name 3
Positional arguments can be included with p_argN
where N
is any integer: * p_arg1 = 3
is converted to 3
 Note that it is NOT recommended to rely on this default behaviour as command-line arguments can be passed in many ways. Refer to the documentation at https://slac-lcls.github.io/lute/tutorial/new_task/ under \"Specifying a TaskParameters Model for your Task\" for more information on how to control parameter parsing from within your TaskParameters model definition.
required Source code inlute/tasks/task.py
def __init__(self, *, params: TaskParameters) -> None:\n \"\"\"Initialize a Task.\n\n Args:\n params (TaskParameters): Parameters needed to properly configure\n the analysis task. `Task`s of this type MUST include the name\n of a binary to run and any arguments which should be passed to\n it (as would be done via command line). The binary is included\n with the parameter `executable`. All other parameter names are\n assumed to be the long/extended names of the flag passed on the\n command line by default:\n * `arg_name = 3` is converted to `--arg_name 3`\n Positional arguments can be included with `p_argN` where `N` is\n any integer:\n * `p_arg1 = 3` is converted to `3`\n\n Note that it is NOT recommended to rely on this default behaviour\n as command-line arguments can be passed in many ways. Refer to\n the dcoumentation at\n https://slac-lcls.github.io/lute/tutorial/new_task/\n under \"Speciyfing a TaskParameters Model for your Task\" for more\n information on how to control parameter parsing from within your\n TaskParameters model definition.\n \"\"\"\n super().__init__(params=params)\n self._cmd = self._task_parameters.executable\n self._args_list: List[str] = [self._cmd]\n self._template_context: Dict[str, Any] = {}\n
"},{"location":"source/tasks/test/","title":"test","text":"Basic test Tasks for testing functionality.
Classes:
Name DescriptionTest
Simplest test Task - runs a 10 iteration loop and returns a result.
TestSocket
Test Task which sends larger data to test socket IPC.
TestWriteOutput
Test Task which writes an output file.
TestReadOutput
Test Task which reads in a file. Can be used to test database access.
"},{"location":"source/tasks/test/#tasks.test.Test","title":"Test
","text":" Bases: Task
Simple test Task to ensure subprocess and pipe-based IPC work.
Source code inlute/tasks/test.py
class Test(Task):\n \"\"\"Simple test Task to ensure subprocess and pipe-based IPC work.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(10):\n time.sleep(1)\n msg: Message = Message(contents=f\"Test message {i}\")\n self._report_to_executor(msg)\n if self._task_parameters.throw_error:\n raise RuntimeError(\"Testing Error!\")\n\n def _post_run(self) -> None:\n self._result.summary = \"Test Finished.\"\n self._result.task_status = TaskStatus.COMPLETED\n time.sleep(0.1)\n
"},{"location":"source/tasks/test/#tasks.test.TestReadOutput","title":"TestReadOutput
","text":" Bases: Task
Simple test Task to read in output from the test Task above.
Its pydantic model relies on a database access to retrieve the output file.
Source code inlute/tasks/test.py
class TestReadOutput(Task):\n \"\"\"Simple test Task to read in output from the test Task above.\n\n Its pydantic model relies on a database access to retrieve the output file.\n \"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n array: np.ndarray = np.loadtxt(self._task_parameters.in_file, delimiter=\",\")\n self._report_to_executor(msg=Message(contents=\"Successfully loaded data!\"))\n for i in range(5):\n time.sleep(1)\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.summary = \"Was able to load data.\"\n self._result.payload = \"This Task produces no output.\"\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/test/#tasks.test.TestSocket","title":"TestSocket
","text":" Bases: Task
Simple test Task to ensure basic IPC over Unix sockets works.
Source code inlute/tasks/test.py
class TestSocket(Task):\n \"\"\"Simple test Task to ensure basic IPC over Unix sockets works.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(self._task_parameters.num_arrays):\n msg: Message = Message(contents=f\"Sending array {i}\")\n self._report_to_executor(msg)\n time.sleep(0.05)\n msg: Message = Message(\n contents=np.random.rand(self._task_parameters.array_size)\n )\n self._report_to_executor(msg)\n\n def _post_run(self) -> None:\n super()._post_run()\n self._result.summary = f\"Sent {self._task_parameters.num_arrays} arrays\"\n self._result.payload = np.random.rand(self._task_parameters.array_size)\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"source/tasks/test/#tasks.test.TestWriteOutput","title":"TestWriteOutput
","text":" Bases: Task
Simple test Task to write output other Tasks depend on.
Source code inlute/tasks/test.py
class TestWriteOutput(Task):\n \"\"\"Simple test Task to write output other Tasks depend on.\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params)\n\n def _run(self) -> None:\n for i in range(self._task_parameters.num_vals):\n # Doing some calculations...\n time.sleep(0.05)\n if i % 10 == 0:\n msg: Message = Message(contents=f\"Processed {i+1} values!\")\n self._report_to_executor(msg)\n\n def _post_run(self) -> None:\n super()._post_run()\n work_dir: str = self._task_parameters.lute_config.work_dir\n out_file: str = f\"{work_dir}/{self._task_parameters.outfile_name}\"\n array: np.ndarray = np.random.rand(self._task_parameters.num_vals)\n np.savetxt(out_file, array, delimiter=\",\")\n self._result.summary = \"Completed task successfully.\"\n self._result.payload = out_file\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"tutorial/creating_workflows/","title":"Workflows with Airflow","text":"Note: Airflow uses the term DAG, or directed acyclic graph, to describe workflows of tasks with defined (and acyclic) connectivities. This page will use the terms workflow and DAG interchangeably.
"},{"location":"tutorial/creating_workflows/#relevant-components","title":"Relevant Components","text":"In addition to the core LUTE package, a number of components are generally involved to run a workflow. The current set of scripts and objects are used to interface with Airflow, and the SLURM job scheduler. The core LUTE library can also be used to run workflows using different backends, and in the future these may be supported.
For building and running workflows using SLURM and Airflow, the following components are necessary, and will be described in more detail below: - Airflow launch script: launch_airflow.py
- This has a wrapper batch submission script: submit_launch_airflow.sh
. When running using the ARP (from the eLog), you MUST use this wrapper script instead of the Python script directly. - SLURM submission script: submit_slurm.sh
- Airflow operators: - JIDSlurmOperator
launch_airflow.py
","text":"Sends a request to an Airflow instance to submit a specific DAG (workflow). This script prepares an HTTP request with the appropriate parameters in a specific format.
A request involves the following information, most of which is retrieved automatically:
dag_run_data: Dict[str, Union[str, Dict[str, Union[str, int, List[str]]]]] = {\n \"dag_run_id\": str(uuid.uuid4()),\n \"conf\": {\n \"experiment\": os.environ.get(\"EXPERIMENT\"),\n \"run_id\": f\"{os.environ.get('RUN_NUM')}{datetime.datetime.utcnow().isoformat()}\",\n \"JID_UPDATE_COUNTERS\": os.environ.get(\"JID_UPDATE_COUNTERS\"),\n \"ARP_ROOT_JOB_ID\": os.environ.get(\"ARP_JOB_ID\"),\n \"ARP_LOCATION\": os.environ.get(\"ARP_LOCATION\", \"S3DF\"),\n \"Authorization\": os.environ.get(\"Authorization\"),\n \"user\": getpass.getuser(),\n \"lute_params\": params,\n \"slurm_params\": extra_args,\n \"workflow\": wf_defn, # Used only for custom DAGs. See below under advanced usage.\n },\n}\n
Note that the environment variables are used to fill in the appropriate information because this script is intended to be launched primarily from the ARP (which passes these variables). The ARP allows for the launch job to be defined in the experiment eLog and submitted automatically for each new DAQ run. The environment variables EXPERIMENT
and RUN
can alternatively be defined prior to submitting the script on the command-line.
The script takes a number of parameters:
launch_airflow.py -c <path_to_config_yaml> -w <workflow_name> [--debug] [--test] [-e <exp>] [-r <run>] [SLURM_ARGS]\n
-c
refers to the path of the configuration YAML that contains the parameters for each managed Task
in the requested workflow.-w
is the name of the DAG (workflow) to run. By convention each DAG is named by the Python file it is defined in. (See below).-W
(capital W) followed by the path to the workflow instead of -w
. See below for further discussion on this use case.--debug
is an optional flag to run all steps of the workflow in debug mode for verbose logging and output.--test
is an optional flag which will use the test Airflow instance. By default the script will make requests of the standard production Airflow instance.-e
is used to pass the experiment name. Needed if not using the ARP, i.e. running from the command-line.-r
is used to pass a run number. Needed if not using the ARP, i.e. running from the command-line.SLURM_ARGS
are SLURM arguments to be passed to the submit_slurm.sh
script which are used for each individual managed Task
. These arguments do NOT affect the submission parameters for the job running launch_airflow.py
(if using submit_launch_airflow.sh
below).Lifetime This script will run for the entire duration of the workflow (DAG). After making the initial request of Airflow to launch the DAG, it will enter a status update loop which will keep track of each individual job (each job runs one managed Task
) submitted by Airflow. At the end of each job it will collect the log file, in addition to providing a few other status updates/debugging messages, and append it to its own log. This allows all logging for the entire workflow (DAG) to be inspected from an individual file. This is particularly useful when running via the eLog, because only a single log file is displayed.
submit_launch_airflow.sh
","text":"This script is only necessary when running from the eLog using the ARP. The initial job submitted by the ARP can not have a duration of longer than 30 seconds, as it will then time out. As the launch_airflow.py
job will live for the entire duration of the workflow, which is often much longer than 30 seconds, the solution was to have a wrapper which submits the launch_airflow.py
script to run on the S3DF batch nodes. Usage of this script is mostly identical to launch_airflow.py
. All the arguments are passed transparently to the underlying Python script with the exception of the first argument which must be the location of the underlying launch_airflow.py
 script. The wrapper will simply launch a batch job using minimal resources (1 core). While the primary purpose of the script is to allow running from the eLog, it is also a useful wrapper in general, allowing the previous script to be submitted as a SLURM job.
Usage:
submit_launch_airflow.sh /path/to/launch_airflow.py -c <path_to_config_yaml> -w <workflow_name> [--debug] [--test] [-e <exp>] [-r <run>] [SLURM_ARGS]\n
"},{"location":"tutorial/creating_workflows/#submit_slurmsh","title":"submit_slurm.sh
","text":"Launches a job on the S3DF batch nodes using the SLURM job scheduler. This script launches a single managed Task
at a time. The usage is as follows:
submit_slurm.sh -c <path_to_config_yaml> -t <MANAGED_task_name> [--debug] [SLURM_ARGS ...]\n
As a reminder the managed Task
refers to the Executor
-Task
combination. The script does not parse any SLURM specific parameters, and instead passes them transparently to SLURM. At least the following two SLURM arguments must be provided:
--partition=<...> # Usually partition=milano\n--account=<...> # Usually account=lcls:$EXPERIMENT\n
Generally, resource requests will also be included, such as the number of cores to use. A complete call may look like the following:
submit_slurm.sh -c /sdf/data/lcls/ds/hutch/experiment/scratch/config.yaml -t Tester --partition=milano --account=lcls:experiment --ntasks=100 [...]\n
When running a workflow using the launch_airflow.py
script, each step of the workflow will be submitted using this script.
Operator
s are the objects submitted as individual steps of a DAG by Airflow. They are conceptually linked to the idea of a task in that each task of a workflow is generally an operator. Care should be taken not to confuse them with LUTE Task
s or managed Task
s though. There is, however, usually a one-to-one correspondence between a Task
and an Operator
.
Airflow runs on a K8S cluster which has no access to the experiment data. When we ask Airflow to run a DAG, it will launch an Operator
for each step of the DAG. However, the Operator
itself cannot perform productive analysis without access to the data. The solution employed by LUTE
is to have a limited set of Operator
s which do not perform analysis, but instead request that a LUTE
managed Task
 be submitted on the batch nodes where it can access the data. There may be small differences between how the various provided Operator
s do this, but in general they will all make a request to the job interface daemon (JID) that a new SLURM job be scheduled using the submit_slurm.sh
script described above.
Therefore, running a typical Airflow DAG involves the following steps:
launch_airflow.py
script is submitted, usually from a definition in the eLog.launch_airflow
script requests that Airflow run a specific DAG.Operator
s that make up the DAG definition.Operator
sends a request to the JID
to submit a job.JID
submits the elog_submit.sh
script with the appropriate managed Task
.Task
runs on the batch nodes, while the Operator
, requesting updates from the JID on job status, waits for it to complete.Task
completes, the Operator
 will receive this information and tell the Airflow server whether the job completed successfully or resulted in failure.
s are maintained: - JIDSlurmOperator
: The standard Operator
. Each instance has a one-to-one correspondance with a LUTE managed Task
.
JIDSlurmOperator
arguments","text":"task_id
: This is nominally the name of the task on the Airflow side. However, for simplicity this is used 1-1 to match the name of a managed Task defined in LUTE's managed_tasks.py
 module. I.e., it should be the name of an Executor(\"Task\")
object which will run the specific Task of interest. This must match the name of a defined managed Task.max_cores
: Used to cap the maximum number of cores which should be requested of SLURM. By default all jobs will run with the same number of cores, which should be specified when running the launch_airflow.py
script (either from the ARP, or by hand). This behaviour was chosen because in general we want to increase or decrease the core-count for all Task
s uniformly, and we don't want to have to specify core number arguments for each job individually. Nonetheless, on occasion it may be necessary to cap the number of cores a specific job will use. E.g. if the default value specified when launching the Airflow DAG is multiple cores, and one job is single-threaded, the core count can be capped for that single job to 1, while the rest run with multiple cores.max_nodes
: Similar to the above. This will make sure the Task
is distributed across no more than a maximum number of nodes. This feature is useful for, e.g., multi-threaded software which does not make use of tools like MPI
. So, the Task
can run on multiple cores, but only within a single node.require_partition
: This option is a string that forces the use of a specific S3DF partition for the managed Task
 submitted by the Operator. E.g. typically an LCLS user will use --partition=milano
for CPU-based workflows; however, if a specific Task
requires a GPU you may use JIDSlurmOperator(\"MyTaskRunner\", require_partition=\"ampere\")
to override the partition for that single Task
.custom_slurm_params
: You can provide a string of parameters which will be used in its entirety to replace any and all default arguments passed by the launch script. This method is not recommended for general use and is mostly used for dynamic DAGs described at the end of the document.Defining a new workflow involves creating a new module (Python file) in the directory workflows/airflow
, creating a number of Operator
instances within the module, and then drawing the connectivity between them. At the top of the file an Airflow DAG is created and given a name. By convention all LUTE
workflows use the name of the file as the name of the DAG. The following code can be copied exactly into the file:
from datetime import datetime\nimport os\nfrom airflow import DAG\nfrom lute.operators.jidoperators import JIDSlurmOperator # Import other operators if needed\n\ndag_id: str = f\"lute_{os.path.splitext(os.path.basename(__file__))[0]}\"\ndescription: str = (\n \"Run SFX processing using PyAlgos peak finding and experimental phasing\"\n)\n\ndag: DAG = DAG(\n dag_id=dag_id,\n start_date=datetime(2024, 3, 18),\n schedule_interval=None,\n description=description,\n)\n
Once the DAG has been created, a number of Operator
s must be created to run the various LUTE analysis operations. As an example consider a partial SFX processing workflow which includes steps for peak finding, indexing, merging, and calculating figures of merit. Each of the 4 steps will have an Operator
instance which will launch a corresponding LUTE
managed Task
, for example:
# Using only the JIDSlurmOperator\n# syntax: JIDSlurmOperator(task_id=\"LuteManagedTaskName\", dag=dag) # optionally, max_cores=123)\npeak_finder: JIDSlurmOperator = JIDSlurmOperator(task_id=\"PeakFinderPyAlgos\", dag=dag)\n\n# We specify a maximum number of cores for the rest of the jobs.\nindexer: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=120, task_id=\"CrystFELIndexer\", dag=dag\n)\n# We can alternatively specify this task be only ever run with the following args.\n# indexer: JIDSlurmOperator = JIDSlurmOperator(\n# custom_slurm_params=\"--partition=milano --ntasks=120 --account=lcls:myaccount\",\n# task_id=\"CrystFELIndexer\",\n# dag=dag,\n# )\n\n# Merge\nmerger: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=120, task_id=\"PartialatorMerger\", dag=dag\n)\n\n# Figures of merit\nhkl_comparer: JIDSlurmOperator = JIDSlurmOperator(\n max_cores=8, task_id=\"HKLComparer\", dag=dag\n)\n
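As an illustrative sketch of the max_nodes and require_partition arguments described in the previous section (the managed Task names here are hypothetical placeholders, not real managed Tasks):
# A sketch only: cap a multi-threaded (non-MPI) step to a single node, and force a GPU partition\n# for another step. These managed Task names are hypothetical placeholders.\nthreaded_runner: JIDSlurmOperator = JIDSlurmOperator(\n    max_nodes=1, task_id=\"MyThreadedTaskRunner\", dag=dag\n)\ngpu_runner: JIDSlurmOperator = JIDSlurmOperator(\n    require_partition=\"ampere\", task_id=\"MyGpuTaskRunner\", dag=dag\n)\n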
Finally, the dependencies between the Operator
s are \"drawn\", defining the execution order of the various steps. The >>
operator has been overloaded for the Operator
class, allowing it to be used to specify the next step in the DAG. In this case, a completely linear DAG is drawn as:
peak_finder >> indexer >> merger >> hkl_comparer\n
Parallel execution can be added by using the >>
operator multiple times. Consider a task1
which upon successful completion starts a task2
and task3
in parallel. This dependency can be added to the DAG using:
#task1: JIDSlurmOperator = JIDSlurmOperator(...)\n#task2 ...\n\ntask1 >> task2\ntask1 >> task3\n
As each DAG is defined in pure Python, standard control structures (loops, if statements, etc.) can be used to create more complex workflow arrangements.
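For instance, a linear chain of steps could be built with a loop rather than writing each dependency by hand. A sketch only, reusing the dag object from above with hypothetical managed Task names:
# A sketch only: build a linear chain of hypothetical managed Tasks with a loop.\nsteps = [\n    JIDSlurmOperator(task_id=name, dag=dag)\n    for name in (\"StepOneRunner\", \"StepTwoRunner\", \"StepThreeRunner\")\n]\nfor upstream, downstream in zip(steps, steps[1:]):\n    upstream >> downstream\n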
Note: Your DAG will not be available to Airflow until your PR including the file you have defined is merged! Once merged the file will be synced with the Airflow instance and can be run using the scripts described earlier in this document. For testing it is generally preferred that you run each step of your DAG individually using the submit_slurm.sh
script and the independent managed Task
names. If, however, you want to test the behaviour of Airflow itself (in a modified form) you can use the advanced run-time DAGs defined below as well.
In most cases, standard DAGs should be defined as described above and called by name. However, Airflow also supports the creation of DAGs dynamically, e.g. to vary the input data to various steps, or the number of steps that will occur. Some of this functionality has been used to allow for user-defined DAGs which are passed in the form of a dictionary, allowing Airflow to construct the workflow as it is running.
A basic YAML syntax is used to construct a series of nested dictionaries which define a DAG. Considering the first example DAG defined above (for serial femtosecond crystallography), the standard DAG looked like:
peak_finder >> indexer >> merger >> hkl_comparer\n
We can alternatively define this DAG in YAML:
task_name: PeakFinderPyAlgos\nslurm_params: ''\nnext:\n- task_name: CrystFELIndexer\n slurm_params: ''\n next: []\n - task_name: PartialatorMerger\n slurm_params: ''\n next: []\n - task_name: HKLComparer\n slurm_params: ''\n next: []\n
I.e. we define a tree where each node is constructed using Node(task_name: str, slurm_params: str, next: List[Node])
.
task_name
is the name of a managed Task
as before, in the same way that would be passed to the JIDSlurmOperator
.slurm_params
. This is a complete string of all the arguments to use for the corresponding managed Task
. Use of this field is all or nothing! - if it is left as an empty string, the default parameters (passed on the command-line using the launch script) are used, otherwise this string is used in its stead. Because of this remember to include a partition and account if using it.next
field is composed of either an empty list (meaning no managed Task
s are run after the current node), or additional nodes. All nodes in the list are run in parallel. As a second example, to run task1
followed by task2
and task3
in parellel we would use:
task_name: Task1\nslurm_params: ''\nnext:\n- task_name: Task2\n slurm_params: ''\n next: []\n- task_name: Task3\n slurm_params: ''\n next: []\n
In order to run a DAG defined this way we pass the path to the YAML file we have defined it in to the launch script using -W <path_to_dag>
. This is instead of calling it by name. E.g.
/path/to/lute/launch_scripts/submit_launch_airflow.sh /path/to/lute/launch_scripts/launch_airflow.py -e <exp> -r <run> -c /path/to/config -W <path_to_dag> --test [--debug] [SLURM_ARGS]\n
Note that fewer options are currently supported for configuring the operators for each step of the DAG. The slurm arguments can be replaced in their entirety using a custom slurm_params
string but individual options cannot be modified.
Task
","text":"Task
s can be broadly categorized into two types: - \"First-party\" - where the analysis or executed code is maintained within this library. - \"Third-party\" - where the analysis, code, or program is maintained elsewhere and is simply called by a wrapping Task
.
Creating a new Task
of either type generally involves the same steps, although for first-party Task
s, the analysis code must of course also be written. Due to this difference, as well as additional considerations for parameter handling when dealing with \"third-party\" Task
s, the \"first-party\" and \"third-party\" Task
integration cases will be considered separately.
Task
","text":"There are two required steps for third-party Task
integration, and one additional step which is optional, and may not be applicable to all possible third-party Task
s. Generally, Task
integration requires: 1. Defining a TaskParameters
(pydantic) model which fully parameterizes the Task
. This involves specifying a path to a binary, and all the required command-line arguments to run the binary. 2. Creating a managed Task
by specifying an Executor
for the new third-party Task
. At this stage, any additional environment variables can be added which are required for the execution environment. 3. (Optional/Maybe applicable) Create a template for a third-party configuration file. If the new Task
has its own configuration file, specifying a template will allow that file to be parameterized from the singular LUTE yaml configuration file. A couple of minor additions to the pydantic
model specified in 1. are required to support template usage.
Each of these stages will be discussed in detail below. The vast majority of the work is completed in step 1.
"},{"location":"tutorial/new_task/#specifying-a-taskparameters-model-for-your-task","title":"Specifying aTaskParameters
Model for your Task
","text":"A brief overview of parameters objects will be provided below. The following information goes into detail only about specifics related to LUTE configuration. An in depth description of pydantic is beyond the scope of this tutorial; please refer to the official documentation for more information. Please note that due to environment constraints pydantic is currently pinned to version 1.10! Make sure to read the appropriate documentation for this version as many things are different compared to the newer releases. At the end this document there will be an example highlighting some supported behaviour as well as a FAQ to address some common integration considerations.
Task
s and TaskParameter
s
All Task
s have a corresponding TaskParameters
object. These objects are linked exclusively by a named relationship. For a Task
named MyThirdPartyTask
, the parameters object must be named MyThirdPartyTaskParameters
. For third-party Task
s there are a number of additional requirements: - The model must inherit from a base class called ThirdPartyParameters
. - The model must have one field specified called executable
. The presence of this field indicates that the Task
is a third-party Task
and the specified executable must be called. This allows all third-party Task
s to be defined exclusively by their parameters model. A single ThirdPartyTask
class handles execution of all third-party Task
s.
All models are stored in lute/io/models
. For any given Task
, a new model can be added to an existing module contained in this directory or to a new module. If creating a new module, make sure to add an import statement to lute.io.models.__init__
.
Defining TaskParameter
s
When specifying parameters the default behaviour is to provide a one-to-one correspondence between the Python attribute specified in the parameter model and the parameter specified on the command-line. Single-letter attributes are assumed to be passed using -
, e.g. n
will be passed as -n
when the executable is launched. Longer attributes are passed using --
, e.g. by default a model attribute named my_arg
will be passed on the command-line as --my_arg
. Positional arguments are specified using p_argX
where X
is a number. All parameters are passed in the order that they are specified in the model.
However, because the number of possible command-line combinations is large, relying on the default behaviour above is NOT recommended. It is provided solely as a fallback. Instead, there are a number of configuration knobs which can be tuned to achieve the desired behaviour. The two main mechanisms for controlling behaviour are specification of model-wide configuration under the Config
class within the model's definition, and parameter-by-parameter configuration using field attributes. For the latter, we define all parameters as Field
objects. This allows parameters to have their own attributes, which are parsed by LUTE's task-layer. Given this, the preferred starting template for a TaskParameters
model is the following - we assume we are integrating a new Task
called RunTask
:
\nfrom pydantic import Field, validator\n# Also include any pydantic type specifications - Pydantic has many custom\n# validation types already, e.g. types for constrained numberic values, URL handling, etc.\n\nfrom .base import ThirdPartyParameters\n\n# Change class name as necessary\nclass RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for RunTask...\"\"\"\n\n class Config(ThirdPartyParameters.Config): # MUST be exactly as written here.\n ...\n # Model-wide configuration will go here\n\n executable: str = Field(\"/path/to/executable\", description=\"...\")\n ...\n # Additional params.\n # param1: param1Type = Field(\"default\", description=\"\", ...)\n
Config settings and options Under the class definition for Config
in the model, we can modify global options for all the parameters. In addition, there are a number of configuration options related to specifying what the outputs/results from the associated Task
are, and a number of options to modify runtime behaviour. Currently, the available configuration options are:
run_directory
If provided, can be used to specify the directory from which a Task
is run. None
(not provided) NO set_result
bool
. If True
search the model definition for a parameter that indicates what the result is. False
NO result_from_params
If set_result
is True
can define a result using this option and a validator. See also is_result
below. None
(not provided) NO short_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s long_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s These configuration options modify how the parameter models are parsed and passed along on the command-line, as well as what we consider results and where a Task
can run. The default behaviour is that parameters are assumed to be passed as -p arg
and --param arg
, the Task
will be run in the current working directory (or scratch if submitted with the ARP), and we have no information about Task
results . Setting the above options can modify this behaviour.
short_flags_use_eq
and/or long_flags_use_eq
to True
parameters are instead passed as -p=arg
and --param=arg
.run_directory
to a valid path, we can force a Task
to be run in a specific directory. By default the Task
will be run from the directory you submit the job in, or from your scratch folder (/sdf/scratch/...
) if you submit from the eLog. Some ThirdPartyTask
s rely on searching the correct working directory in order run properly.set_result
to True
we indicate that the TaskParameters
model will provide information on what the TaskResult
is. This setting must be used with one of two options, either the result_from_params
Config
option, described below, or the Field attribute is_result
described in the next sub-section (Field Attributes).result_from_params
is a Config option that can be used when set_result==True
. In conjunction with a validator (described a sections down) we can use this option to specify a result from all the information contained in the model. E.g. if you have a Task
that has parameters for an output_directory
and a output_filename
, you can set result_from_params==f\"{output_directory}/{output_filename}\"
.Field attributes In addition to the global configuration options there are a couple of ways to specify individual parameters. The following Field
attributes are used when parsing the model:
flag_type
Specify the type of flag for passing this argument. One of \"-\"
, \"--\"
, or \"\"
N/A p_arg1 = Field(..., flag_type=\"\")
rename_param
Change the name of the parameter as passed on the command-line. N/A my_arg = Field(..., rename_param=\"my-arg\")
description
Documentation of the parameter's usage or purpose. N/A arg = Field(..., description=\"Argument for...\")
is_result
bool
. If the set_result
Config
option is True
, we can set this to True
to indicate a result. N/A output_result = Field(..., is_result=true)
The flag_type
attribute allows us to specify whether the parameter corresponds to a positional (\"\"
) command line argument, requires a single hyphen (\"-\"
), or a double hyphen (\"--\"
). By default, the parameter name is passed as-is on the command-line. However, command-line arguments can have characters which would not be valid in Python variable names. In particular, hyphens are frequently used. To handle this case, the rename_param
attribute can be used to specify an alternative spelling of the parameter when it is passed on the command-line. This also allows for using more descriptive variable names internally than those used on the command-line. A description
can also be provided for each Field to document the usage and purpose of that particular parameter.
As an example, we can again consider defining a model for a RunTask
Task
. Consider an executable which would normally be called from the command-line as follows:
/sdf/group/lcls/ds/tools/runtask -n <nthreads> --method=<algorithm> -p <algo_param> [--debug]\n
A model specification for this Task
may look like:
class RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n long_flags_use_eq: bool = True # For the --method parameter\n\n # Prefer using full/absolute paths where possible.\n # No flag_type needed for this field\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/runtask\", description=\"Runtask Binary v1.0\"\n )\n\n # We can provide a more descriptive name for -n\n # Let's assume it's a number of threads, or processes, etc.\n num_threads: int = Field(\n 1, description=\"Number of concurrent threads.\", flag_type=\"-\", rename_param=\"n\"\n )\n\n # In this case we will use the Python variable name directly when passing\n # the parameter on the command-line\n method: str = Field(\"algo1\", description=\"Algorithm to use.\", flag_type=\"--\")\n\n # For an actual parameter we would probably have a better name. Lets assume\n # This parameter (-p) modifies the behaviour of the method above.\n method_param1: int = Field(\n 3, description=\"Modify method performance.\", flag_type=\"-\", rename_param=\"p\"\n )\n\n # Boolean flags are only passed when True! `--debug` is an optional parameter\n # which is not followed by any arguments.\n debug: bool = Field(\n False, description=\"Whether to run in debug mode.\", flag_type=\"--\"\n )\n
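With the default values shown above, the constructed command would be roughly /sdf/group/lcls/ds/tools/runtask -n 1 --method=algo1 -p 3: the --debug flag is only appended when it is True, and --method uses an equals sign because long_flags_use_eq is set.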
The is_result
attribute allows us to specify whether the corresponding Field points to the output/result of the associated Task
. Consider a Task
, RunTask2
which writes its output to a single file which is passed as a parameter.
class RunTask2Parameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask2 binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True # This must be set here!\n # result_from_params: Optional[str] = None # We can use this for more complex result setups (see below). Ignore for now.\n\n # Prefer using full/absolute paths where possible.\n # No flag_type needed for this field\n executable: str = Field(\n \"/sdf/group/lcls/ds/tools/runtask2\", description=\"Runtask Binary v2.0\"\n )\n\n # Lets assume we take one input and write one output file\n # We will not provide a default value, so this parameter MUST be provided\n input: str = Field(\n description=\"Path to input file.\", flag_type=\"--\"\n )\n\n # We will also not provide a default for the output\n # BUT, we will specify that whatever is provided is the result\n output: str = Field(\n description=\"Path to write output to.\",\n flag_type=\"-\",\n rename_param=\"o\",\n is_result=True, # This means this parameter points to the result!\n )\n
Additional Comments 1. Model parameters of type bool
are not passed with an argument and are only passed when True
. This is a common use-case for boolean flags which enable things like test or debug modes, verbosity or reporting features. E.g. --debug
, --test
, --verbose
, etc. - If you need to pass the literal words \"True\"
or \"False\"
, use a parameter of type str
. 2. You can use pydantic
types to constrain parameters beyond the basic Python types. E.g. conint
can be used to define lower and upper bounds for an integer. There are also types for common categories, positive/negative numbers, paths, URLs, IP addresses, etc. - Even more custom behaviour can be achieved with validator
s (see below). 3. All TaskParameters
objects and its subclasses have access to a lute_config
parameter, which is of type lute.io.models.base.AnalysisHeader
. This special parameter is ignored when constructing the call for a binary task, but it provides access to shared/common parameters between tasks. For example, the following parameters are available through the lute_config
object, and may be of use when constructing validators. All fields can be accessed with .
notation. E.g. lute_config.experiment
. - title
: A user provided title/description of the analysis. - experiment
: The current experiment name - run
: The current acquisition run number - date
: The date of the experiment or the analysis. - lute_version
: The version of the software you are running. - task_timeout
: How long a Task
can run before it is killed. - work_dir
: The main working directory for LUTE. Files and the database are created relative to this directory. This is separate from the run_directory
config option. LUTE will write files to the work directory by default; however, the Task
itself is run from run_directory
if it is specified.
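As a small sketch of points 2 and 3 above, the following hypothetical model uses a constrained integer type and a validator which builds a default output path from lute_config:
from typing import Any, Dict, Optional\n\nfrom pydantic import Field, conint, validator\n\nfrom .base import ThirdPartyParameters\n\nclass RunTask4Parameters(ThirdPartyParameters):\n    \"\"\"Sketch only: a constrained integer and a lute_config-derived default.\"\"\"\n\n    executable: str = Field(\"/path/to/binary\", description=\"Hypothetical binary.\")\n    num_threads: conint(gt=0, le=128) = Field(\n        4, description=\"Number of threads (1-128).\", flag_type=\"-\", rename_param=\"n\"\n    )\n    out_file: Optional[str] = Field(None, description=\"Output file path.\", flag_type=\"--\")\n\n    @validator(\"out_file\", always=True)\n    def default_out_file(cls, out_file: Optional[str], values: Dict[str, Any]) -> str:\n        \"\"\"If no output path is given, place the output under the LUTE work_dir.\"\"\"\n        if out_file is None:\n            return f\"{values['lute_config'].work_dir}/runtask4_output.txt\"\n        return out_file\n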
Validators Pydantic uses validators
to determine whether a value for a specific field is appropriate. There are default validators for all the standard library types and the types specified within the pydantic package; however, it is straightforward to define custom ones as well. In the template code-snippet above we imported the validator
decorator. To create our own validator we define a method (with any name) with the following prototype, and decorate it with the validator
decorator:
@validator(\"name_of_field_to_decorate\")\ndef my_custom_validator(cls, field: Any, values: Dict[str, Any]) -> Any: ...\n
In this snippet, the field
variable corresponds to the value for the specific field we want to validate. values
is a dictionary of fields and their values which have been parsed prior to the current field. This means you can validate the value of a parameter based on the values provided for other parameters. Since pydantic always validates the fields in the order they are defined in the model, fields dependent on other fields should come later in the definition.
For example, consider the method_param1
field defined above for RunTask
. We can provide a custom validator which changes the default value for this field depending on what type of algorithm is specified for the --method
option. We will also constrain the options for method
to two specific strings.
from pydantic import Field, validator, ValidationError, root_validator\nclass RunTaskParameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask binary.\"\"\"\n\n # [...]\n\n # In this case we will use the Python variable name directly when passing\n # the parameter on the command-line\n method: str = Field(\"algo1\", description=\"Algorithm to use.\", flag_type=\"--\")\n\n # For an actual parameter we would probably have a better name. Lets assume\n # This parameter (-p) modifies the behaviour of the method above.\n method_param1: Optional[int] = Field(\n description=\"Modify method performance.\", flag_type=\"-\", rename_param=\"p\"\n )\n\n # We will only allow method to take on one of two values\n @validator(\"method\")\n def validate_method(cls, method: str, values: Dict[str, Any]) -> str:\n \"\"\"Method validator: --method can be algo1 or algo2.\"\"\"\n\n valid_methods: List[str] = [\"algo1\", \"algo2\"]\n if method not in valid_methods:\n raise ValueError(\"method must be algo1 or algo2\")\n return method\n\n # Lets change the default value of `method_param1` depending on `method`\n # NOTE: We didn't provide a default value to the Field above and made it\n # optional. We can use this to test whether someone is purposefully\n # overriding the value of it, and if not, set the default ourselves.\n # We set `always=True` since pydantic will normally not use the validator\n # if the default is not changed\n @validator(\"method_param1\", always=True)\n def validate_method_param1(cls, param1: Optional[int], values: Dict[str, Any]) -> int:\n \"\"\"method param1 validator\"\"\"\n\n # If someone actively defined it, lets just return that value\n # We could instead do some additional validation to make sure that the\n # value they provided is valid...\n if param1 is not None:\n return param1\n\n # method_param1 comes after method, so this will be defined, or an error\n # would have been raised.\n method: str = values['method']\n if method == \"algo1\":\n return 3\n elif method == \"algo2\":\n return 5\n
The special root_validator(pre=False)
can also be used to provide validation of the model as a whole. This is also the recommended method for specifying a result (using result_from_params
) which has a complex dependence on the parameters of the model. This latter use-case is described in FAQ 2 below.
Use a custom validator. The example above shows how to do this. The parameter that depends on another parameter must come LATER in the model defintion than the independent parameter.
TaskResult
is determinable from the parameters model, but it isn't easily specified by one parameter. How can I use result_from_params
to indicate the result?When a result can be identified from the set of parameters defined in a TaskParameters
model, but is not as straightforward as saying it is equivalent to one of the parameters alone, we can set result_from_params
using a custom validator. In the example below, we have two parameters which together determine what the result is, output_dir
and out_name
. Using a validator we will define a result from these two values.
from pydantic import Field, root_validator\n\nclass RunTask3Parameters(ThirdPartyParameters):\n \"\"\"Parameters for the runtask3 binary.\"\"\"\n\n class Config(ThirdPartyParameters.Config):\n set_result: bool = True # This must be set here!\n result_from_params: str = \"\" # We will set this momentarily\n\n # [...] executable, other params, etc.\n\n output_dir: str = Field(\n description=\"Directory to write output to.\",\n flag_type=\"--\",\n rename_param=\"dir\",\n )\n\n out_name: str = Field(\n description=\"The name of the final output file.\",\n flag_type=\"--\",\n rename_param=\"oname\",\n )\n\n # We can still provide other validators as needed\n # But for now, we just set result_from_params\n # Validator name can be anything, we set pre=False so this runs at the end\n @root_validator(pre=False)\n def define_result(cls, values: Dict[str, Any]) -> Dict[str, Any]:\n # Extract the values of output_dir and out_name\n output_dir: str = values[\"output_dir\"]\n out_name: str = values[\"out_name\"]\n\n result: str = f\"{output_dir}/{out_name}\"\n # Now we set result_from_params\n cls.Config.result_from_params = result\n\n # We haven't modified any other values, but we MUST return this!\n return values\n
Task
depends on the output of a previous Task
, how can I specify this dependency? Parameters used to run a Task
are recorded in a database for every Task
. It is also recorded whether or not the execution of that specific parameter set was successful. A utility function is provided to access the most recent values from the database for a specific parameter of a specific Task
. It can also be used to specify whether unsuccessful Task
s should be included in the query. This utility can be used within a validator to specify dependencies. For example, suppose the input of RunTask2
(parameter input
) depends on the output location of RunTask1
(parameter outfile
). A validator of the following type can be used to retrieve the output file and make it the default value of the input parameter.from pydantic import Field, validator\n\nfrom .base import ThirdPartyParameters\nfrom ..db import read_latest_db_entry\n\nclass RunTask2Parameters(ThirdPartyParameters):\n input: str = Field(\"\", description=\"Input file.\", flag_type=\"--\")\n\n @validator(\"input\")\n def validate_input(cls, input: str, values: Dict[str, Any]) -> str:\n if input == \"\":\n task1_out: Optional[str] = read_latest_db_entry(\n f\"{values['lute_config'].work_dir}\", # Working directory. We search for the database here.\n \"RunTask1\", # Name of Task we want to look up\n \"outfile\", # Name of parameter of the Task\n valid_only=True, # We only want valid output files.\n )\n # read_latest_db_entry returns None if nothing is found\n if task1_out is not None:\n return task1_out\n return input\n
There are more examples of this pattern spread throughout the various Task
models.
Executor
: Creating a runnable, \"managed Task
\"","text":"Overview
After a pydantic model has been created, the next required step is to define a managed Task
. In the context of this library, a managed Task
refers to the combination of an Executor
and a Task
to run. The Executor
manages the process of Task
submission and the execution environment, as well as performing any logging, eLog communication, etc. There are currently two types of Executor
to choose from, but only one is applicable to third-party code. The second Executor
is listed below for completeness only. If you need MPI see the note below.
Executor
: This is the standard Executor
. It should be used for third-party uses cases.MPIExecutor
: This performs all the same types of operations as the option above; however, it will submit your Task
using MPI.MPIExecutor
will submit the Task
using the number of available cores - 1. The number of cores is determined from the physical core/thread count on your local machine, or the number of cores allocated by SLURM when submitting on the batch nodes.Using MPI with third-party Task
s
As mentioned, you should setup a third-party Task
to use the first type of Executor
. If, however, your third-party Task
uses MPI this may seem non-intuitive. When using the MPIExecutor
LUTE code is submitted with MPI. This includes the code that performs signalling to the Executor
and exec
s the third-party code you are interested in running. While it is possible to set this code up to run with MPI, it is more challenging in the case of third-party Task
s because there is no Task
code to modify directly! The MPIExecutor
is provided mostly for first-party code. This is not an issue, however, since the standard Executor
is easily configured to run with MPI in the case of third-party code.
When using the standard Executor
for a Task
requiring MPI, the executable
in the pydantic model must be set to mpirun
. For example, a third-party Task
model, that uses MPI but is intended to be run with the Executor
may look like the following. We assume this Task
runs a Python script using MPI.
class RunMPITaskParameters(ThirdPartyParameters):\n class Config(ThirdPartyParameters.Config):\n ...\n\n executable: str = Field(\"mpirun\", description=\"MPI executable\")\n np: PositiveInt = Field(\n max(int(os.environ.get(\"SLURM_NPROCS\", len(os.sched_getaffinity(0)))) - 1, 1),\n description=\"Number of processes\",\n flag_type=\"-\",\n )\n pos_arg: str = Field(\"python\", description=\"Python...\", flag_type=\"\")\n script: str = Field(\"\", description=\"Python script to run with MPI\", flag_type=\"\")\n
Selecting the Executor
After deciding on which Executor
to use, a single line must be added to the lute/managed_tasks.py
module:
# Initialization: Executor(\"TaskName\")\nTaskRunner: Executor = Executor(\"SubmitTask\")\n# TaskRunner: MPIExecutor = MPIExecutor(\"SubmitTask\") ## If using the MPIExecutor\n
In an attempt to make it easier to discern whether discussing a Task
or managed Task
, the standard naming convention is that the Task
(class name) will have a verb in the name, e.g. RunTask
, SubmitTask
. The corresponding managed Task
will use a related noun, e.g. TaskRunner
, TaskSubmitter
, etc.
As a reminder, the Task
name is the first part of the class name of the pydantic model, without the Parameters
suffix. This name must match. E.g. if your pydantic model's class name is RunTaskParameters
, the Task
name is RunTask
, and this is the string passed to the Executor
initializer.
Modifying the environment
If your third-party Task
can run in the standard psana
environment with no further configuration files, the setup process is now complete and your Task
can be run within the LUTE framework. If on the other hand your Task
requires some changes to the environment, this is managed through the Executor
. There are a couple principle methods that the Executor
has to change the environment.
Executor.update_environment
: if you only need to add a few environment variables, or update the PATH
this is the method to use. The method takes a Dict[str, str]
as input. Any variables can be passed/defined using this method. By default, any variables in the dictionary will overwrite those variable definitions in the current environment if they are already present, except for the variable PATH
. By default PATH
entries in the dictionary are prepended to the current PATH
available in the environment the Executor
runs in (the standard psana
environment). This behaviour can be changed to either append, or overwrite the PATH
entirely by an optional second argument to the method.Executor.shell_source
: This method will source a shell script which can perform numerous modifications of the environment (PATH changes, new environment variables, conda environments, etc.). The method takes a str
which is the path to a shell script to source.As an example, we will update the PATH
of one Task
and source a script for a second.
TaskRunner: Executor = Executor(\"RunTask\")\n# update_environment(env: Dict[str,str], update_path: str = \"prepend\") # \"append\" or \"overwrite\"\nTaskRunner.update_environment(\n { \"PATH\": \"/sdf/group/lcls/ds/tools\" } # This entry will be prepended to the PATH available after sourcing `psconda.sh`\n)\n\nTask2Runner: Executor = Executor(\"RunTask2\")\nTask2Runner.shell_source(\"/sdf/group/lcls/ds/tools/new_task_setup.sh\") # Will source new_task_setup.sh script\n
"},{"location":"tutorial/new_task/#using-templates-managing-third-party-configuration-files","title":"Using templates: managing third-party configuration files","text":"Some third-party executables will require their own configuration files. These are often separate JSON or YAML files, although they can also be bash or Python scripts which are intended to be edited. Since LUTE requires its own configuration YAML file, it attempts to handle these cases by using Jinja templates. When wrapping a third-party task a template can also be provided - with small modifications to the Task
's pydantic model, LUTE can process special types of parameters to render them in the template. LUTE offloads all the template rendering to Jinja, making the required additions to the pydantic model small. On the other hand, it does require understanding the Jinja syntax, and the provision of a well-formatted template, to properly parse parameters. Some basic examples of this syntax will be shown below; however, it is recommended that the Task
implementer refer to the official Jinja documentation for more information.
LUTE provides two additional base models which are used for template parsing in conjunction with the primary Task
model. These are: - TemplateParameters
objects which hold parameters which will be used to render a portion of a template. - TemplateConfig
objects which hold two strings: the name of the template file to use and the full path (including filename) of where to output the rendered result.
Task
models which inherit from the ThirdPartyParameters
model, as all third-party Task
s should, allow for extra arguments. LUTE will parse any extra arguments provided in the configuration YAML as TemplateParameters
 objects automatically, which means that they do not need to be explicitly added to the pydantic model (although they can be). As such, the only requirement on the Python side when adding template rendering functionality to the Task
is the addition of one parameter - an instance of TemplateConfig
. The instance MUST be called lute_template_cfg
.
from pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters\n\nclass RunTaskParameters(ThirdPartyParameters):\n ...\n # This parameter MUST be called lute_template_cfg!\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"name_of_template.json\",\n output_path=\"/path/to/write/rendered_output_to.json\",\n ),\n description=\"Template rendering configuration\",\n )\n
LUTE looks for the template in config/templates
, so only the name of the template file to use within that directory is required for the template_name
attribute of lute_template_cfg
 . LUTE can write the output anywhere the user has write permissions, and with any name, so the full absolute path (including the filename) should be used for the output_path
of lute_template_cfg
.
The rest of the work is done by the combination of Jinja, LUTE's configuration YAML file, and the template itself. Understanding the interplay between these components is perhaps best illustrated by an example. As such, let us consider a simple third-party Task
whose only input parameter (on the command-line) is the location of a configuration JSON file. We'll call the third-party executable jsonuser
and our Task
model, the RunJsonUserParameters
. We assume the program is run like:
jsonuser -i <input_file.json>\n
The first step is to setup the pydantic model as before.
from pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters\n\nclass RunJsonUserParameters(ThirdPartyParameters):\n executable: str = Field(\n \"/path/to/jsonuser\", description=\"Executable which requires a JSON configuration file.\"\n )\n # Let's assume the JSON file is passed as \"-i <path_to_json>\"\n input_json: str = Field(\n \"\", description=\"Path to the input JSON file.\", flag_type=\"-\", rename_param=\"i\"\n )\n
The next step is to create a template for the JSON file. Let's assume the JSON file looks like:
{\n \"param1\": \"arg1\",\n \"param2\": 4,\n \"param3\": {\n \"a\": 1,\n \"b\": 2\n },\n \"param4\": [\n 1,\n 2,\n 3\n ]\n}\n
Any, or all, of these values can be substituted for, and we can choose how the substitutions are provided. I.e. a substitution can be provided for each variable individually, or, for example for a nested hierarchy, a dictionary can be provided which will substitute all of its items at once. For this simple case, let's provide variables for param1
, param2
, param3.b
and assume that we want the first and second entries for param4
 to be identical for our use case (i.e., we can use one variable for both). In total, this means we will perform 5 substitutions using 4 variables. Jinja will substitute a variable anywhere it sees the following syntax: {{ variable_name }}
 . As such, a valid template for our use case may look like:
{\n \"param1\": {{ str_var }},\n \"param2\": {{ int_var }},\n \"param3\": {\n \"a\": 1,\n \"b\": {{ p3_b }}\n },\n \"param4\": [\n {{ val }},\n {{ val }},\n 3\n ]\n}\n
We save this file as jsonuser.json
in config/templates
 . Next, we will update the original pydantic model to include our template configuration. We still need to decide, however, where to write the rendered template. In this case, we can use the input_json
parameter. We will assume that the user will provide this, although a default value can also be used. A custom validator will be added so that we can take the input_json
value and update the value of lute_template_cfg.output_path
with it.
from typing import Any, Dict # , Optional\n\nfrom pydantic import Field, validator\n\nfrom .base import TemplateConfig, ThirdPartyParameters #, TemplateParameters\n\nclass RunJsonUserParameters(ThirdPartyParameters):\n executable: str = Field(\n \"jsonuser\", description=\"Executable which requires a JSON configuration file.\"\n )\n # Let's assume the JSON file is passed as \"-i <path_to_json>\"\n input_json: str = Field(\n \"\", description=\"Path to the input JSON file.\", flag_type=\"-\", rename_param=\"i\"\n )\n # Add template configuration! *MUST* be called `lute_template_cfg`\n lute_template_cfg: TemplateConfig = Field(\n TemplateConfig(\n template_name=\"jsonuser.json\", # Only the name of the file here.\n output_path=\"\",\n ),\n description=\"Template rendering configuration\",\n )\n # We do not need to include these TemplateParameters, they will be added\n # automatically if provided in the YAML\n #str_var: Optional[TemplateParameters]\n #int_var: Optional[TemplateParameters]\n #p3_b: Optional[TemplateParameters]\n #val: Optional[TemplateParameters]\n\n\n # Tell LUTE to write the rendered template to the location provided with\n # `input_json`. I.e. update `lute_template_cfg.output_path`\n @validator(\"lute_template_cfg\", always=True)\n def update_output_path(\n cls, lute_template_cfg: TemplateConfig, values: Dict[str, Any]\n ) -> TemplateConfig:\n if lute_template_cfg.output_path == \"\":\n lute_template_cfg.output_path = values[\"input_json\"]\n return lute_template_cfg\n
All that is left to render the template is to provide, in the LUTE configuration YAML, values for the variables we want to substitute. In our case we must provide the 4 variable names we included within the substitution syntax ({{ var_name }}
). The names in the YAML must match those in the template.
RunJsonUser:\n input_json: \"/my/chosen/path.json\" # We'll come back to this...\n str_var: \"arg1\" # Will substitute for \"param1\": \"arg1\"\n int_var: 4 # Will substitute for \"param2\": 4\n p3_b: 2 # Will substitute for \"param3\": { \"b\": 2 }\n val: 2 # Will substitute for \"param4\": [2, 2, 3] in the JSON\n
If, on the other hand, a user already has a valid JSON file, it is possible to turn off the template rendering entirely: ALL template variables (TemplateParameters
) are simply excluded from the configuration YAML.
RunJsonUser:\n input_json: \"/path/to/existing.json\"\n #str_var: ...\n #...\n
"},{"location":"tutorial/new_task/#additional-jinja-syntax","title":"Additional Jinja Syntax","text":"There are many other syntactical constructions we can use with Jinja. Some of the useful ones are:
If Statements - E.g. only include portions of the template if a value is defined.
{% if VARNAME is defined %}\n// Stuff to include\n{% endif %}\n
Loops - E.g. Unpacking multiple elements from a dictionary.
{% for name, value in VARNAME.items() %}\n// Do stuff with name and value\n{% endfor %}\n
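To see how these constructions behave, the standalone sketch below renders a template string directly with the jinja2 package. This is purely illustrative: within LUTE the rendering is handled for you, and the variable names used here (optional_section, settings) are made up.
from jinja2 import Template\n\n# A template combining an if-statement and a loop, as described above.\ntemplate_str: str = \"\"\"\n{% if optional_section is defined %}\n// Only included because optional_section was provided\n{% endif %}\n{% for name, value in settings.items() %}\n{{ name }} = {{ value }}\n{% endfor %}\n\"\"\"\n\n# Each keyword argument plays the role of a TemplateParameters entry in the YAML.\nrendered: str = Template(template_str).render(\n    optional_section=True,\n    settings={\"threshold\": 10, \"mode\": \"fast\"},\n)\nprint(rendered)\n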
"},{"location":"tutorial/new_task/#creating-a-first-party-task","title":"Creating a \"First-Party\" Task
","text":"The process for creating a \"First-Party\" Task
is very similar to that for a \"Third-Party\" Task
, with the difference being that you must also write the analysis code. The steps for integration are: 1. Write the TaskParameters
model. 2. Write the Task
class. There are a few rules that need to be adhered to. 3. Make your Task
available by modifying the import function. 4. Specify an Executor
TaskParameters
Model for your Task
","text":"Parameter models have a format that must be followed for \"Third-Party\" Task
s, but \"First-Party\" Task
s have a little more liberty in how parameters are dealt with, since the Task
will do all the parsing itself.
To create a model, the basic steps are: 1. If necessary, create a new module (e.g. new_task_category.py
) under lute.io.models
, or find an appropriate pre-existing module in that directory. - An import
statement must be added to lute.io.models._init_
if a new module is created, so it can be found. - If defining the model in a pre-existing module, make sure to modify the __all__
statement to include it. 2. Create a new model that inherits from TaskParameters
. You can look at lute.models.io.tests.TestReadOutputParameters
for an example. The model must be named <YourTaskName>Parameters
- You should include all relevant parameters here, including input file, output file, and any potentially adjustable parameters. These parameters must be included even if there are some implicit dependencies between Task
s and it would make sense for the parameter to be auto-populated based on some other output. Creating this dependency is done with validators (see step 3.). All parameters should be overridable, and all Task
s should be fully-independently configurable, based solely on their model and the configuration YAML. - To follow the preferred format, parameters should be defined as: param_name: type = Field([default value], description=\"This parameter does X.\")
3. Use validators to do more complex things for your parameters, including populating default values dynamically: - E.g. create default values that depend on other parameters in the model - see for example: SubmitSMDParameters. - E.g. create default values that depend on other Task
s by reading from the database - see for example: TestReadOutputParameters. 4. The model will have access to some general configuration values by inheriting from TaskParameters
. These parameters are all stored in lute_config
which is an instance of AnalysisHeader
(defined here). - For example, the experiment and run number can be obtained from this object and a validator could use these values to define the default input file for the Task
.
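Putting steps 1-4 together, a minimal sketch of a first-party parameter model is shown below. The class and parameter names are hypothetical, the default path layout is invented for illustration, and the exact attribute names on lute_config (e.g. experiment, run) are assumptions based on the description above - check AnalysisHeader for the authoritative definitions.
\"\"\"Models for a hypothetical RunMyAnalysis Task.\"\"\"\n\nfrom typing import Any, Dict\n\nfrom pydantic import Field, validator\n\nfrom .base import TaskParameters\n\nclass RunMyAnalysisParameters(TaskParameters):\n    \"\"\"Parameters for the hypothetical RunMyAnalysis Task.\"\"\"\n\n    input_file: str = Field(\n        \"\", description=\"Path to the input HDF5 file.\"\n    )\n    threshold: float = Field(\n        10.0, description=\"This parameter sets the detection threshold.\"\n    )\n    output_file: str = Field(\n        \"\", description=\"Path to write the analysis results to.\"\n    )\n\n    @validator(\"input_file\", always=True)\n    def set_default_input(cls, input_file: str, values: Dict[str, Any]) -> str:\n        # Populate a default dynamically from the general configuration if the\n        # user did not provide a value. The attribute names and the path layout\n        # below are assumptions for illustration only.\n        if input_file == \"\":\n            exp: str = values[\"lute_config\"].experiment\n            run: int = int(values[\"lute_config\"].run)\n            return f\"/sdf/data/lcls/ds/exp/{exp}/hdf5/run{run:04d}.h5\"\n        return input_file\n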
A number of configuration options and Field attributes are also available for \"First-Party\" Task
models. These are identical to those used for the ThirdPartyTask
s, although there is a smaller selection. These options are reproduced below for convenience.
Config settings and options Under the class definition for Config
in the model, we can modify global options for all the parameters. In addition, there are a number of configuration options related to specifying what the outputs/results from the associated Task
are, and a number of options to modify runtime behaviour. Currently, the available configuration options are:
run_directory
If provided, can be used to specify the directory from which a Task
is run. None
(not provided) NO set_result
bool
. If True
search the model definition for a parameter that indicates what the result is. False
NO result_from_params
If set_result
is True
can define a result using this option and a validator. See also is_result
below. None
(not provided) NO short_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s long_flags_use_eq
Use equals sign instead of space for arguments of -
parameters. False
YES - Only affects ThirdPartyTask
s These configuration options modify how the parameter models are parsed and passed along on the command-line, as well as what we consider results and where a Task
can run. The default behaviour is that parameters are assumed to be passed as -p arg
and --param arg
, the Task
will be run in the current working directory (or scratch if submitted with the ARP), and we have no information about Task
results . Setting the above options can modify this behaviour.
short_flags_use_eq
and/or long_flags_use_eq
to True
parameters are instead passed as -p=arg
and --param=arg
.run_directory
to a valid path, we can force a Task
to be run in a specific directory. By default the Task
will be run from the directory you submit the job in, or from your scratch folder (/sdf/scratch/...
) if you submit from the eLog. Some ThirdPartyTask
 s rely on being run from the correct working directory in order to run properly.set_result
to True
we indicate that the TaskParameters
model will provide information on what the TaskResult
is. This setting must be used with one of two options, either the result_from_params
Config
option, described below, or the Field attribute is_result
described in the next sub-section (Field Attributes).result_from_params
is a Config option that can be used when set_result==True
. In conjunction with a validator (described a sections down) we can use this option to specify a result from all the information contained in the model. E.g. if you have a Task
that has parameters for an output_directory
and a output_filename
, you can set result_from_params==f\"{output_directory}/{output_filename}\"
.Field attributes In addition to the global configuration options there are a couple of ways to specify individual parameters. The following Field
attributes are used when parsing the model:
description
Documentation of the parameter's usage or purpose. N/A arg = Field(..., description=\"Argument for...\")
is_result
bool
. If the set_result
Config
option is True
, we can set this to True
to indicate a result. N/A output_result = Field(..., is_result=true)
"},{"location":"tutorial/new_task/#writing-the-task","title":"Writing the Task
","text":"You can write your analysis code (or whatever code to be executed) as long as it adheres to the limited rules below. You can create a new module for your Task
in lute.tasks
or add it to any existing module, if it makes sense for it to belong there. The Task
itself is a single class constructed as:
Task
is a class named in a way that matches its Pydantic model. E.g. RunTask
is the Task
, and RunTaskParameters
is the Pydantic model.Task
class (see template below). If you intend to use MPI see the following section._run
method. This is the method that will be executed when the Task
is run. You can in addition write as many methods as you need. For fine-grained execution control you can also provide _pre_run()
and _post_run()
methods, but this is optional._report_to_executor(msg: Message)
method. Since the Task
is run as a subprocess this method will pass information to the controlling Executor
. You can pass any type of object using this method, strings, plots, arrays, etc.set_result
configuration option in your parameters model, make sure to provide a result when finished. This is done by setting self._result.payload = ...
. You can set the result to be any object. If you have written the result to a file, for example, please provide a path.A minimal template is provided below.
\"\"\"Standard docstring...\"\"\"\n\n__all__ = [\"RunTask\"]\n__author__ = \"\" # Please include so we know who the SME is\n\n# Include any imports you need here\n\nfrom lute.execution.ipc import Message # Message for communication\nfrom lute.io.models.base import * # For TaskParameters\nfrom lute.tasks.task import * # For Task\n\nclass RunTask(Task): # Inherit from Task\n \"\"\"Task description goes here, or in __init__\"\"\"\n\n def __init__(self, *, params: TaskParameters) -> None:\n super().__init__(params=params) # Sets up Task, parameters, etc.\n # Parameters will be available through:\n # self._task_parameters\n # You access with . operator: self._task_parameters.param1, etc.\n # Your result object is availble through:\n # self._result\n # self._result.payload <- Main result\n # self._result.summary <- Short summary\n # self._result.task_status <- Semi-automatic, but can be set manually\n\n def _run(self) -> None:\n # THIS METHOD MUST BE PROVIDED\n self.do_my_analysis()\n\n def do_my_analysis(self) -> None:\n # Send a message, proper way to print:\n msg: Message(contents=\"My message contents\", signal=\"\")\n self._report_to_executor(msg)\n\n # When done, set result - assume we wrote a file, e.g.\n self._result.payload = \"/path/to/output_file.h5\"\n # Optionally also set status - good practice but not obligatory\n self._result.task_status = TaskStatus.COMPLETED\n
"},{"location":"tutorial/new_task/#using-mpi-for-your-task","title":"Using MPI for your Task
","text":"In the case your Task
is written to use MPI
a slight modification to the template above is needed. Specifically, an additional keyword argument should be passed to the base class initializer: use_mpi=True
. This tells the base class to adjust signalling/communication behaviour appropriately for a multi-rank MPI program. Doing this prevents tricky-to-track-down problems due to ranks starting, completing and sending messages at different times. The rest of your code can, as before, be written as you see fit. The use of this keyword argument will also synchronize the start of all ranks and wait until all ranks have finished to exit.
\"\"\"Task which needs to run with MPI\"\"\"\n\n__all__ = [\"RunTask\"]\n__author__ = \"\" # Please include so we know who the SME is\n\n# Include any imports you need here\n\nfrom lute.execution.ipc import Message # Message for communication\nfrom lute.io.models.base import * # For TaskParameters\nfrom lute.tasks.task import * # For Task\n\n# Only the init is shown\nclass RunMPITask(Task): # Inherit from Task\n \"\"\"Task description goes here, or in __init__\"\"\"\n\n # Signal the use of MPI!\n def __init__(self, *, params: TaskParameters, use_mpi: bool = True) -> None:\n super().__init__(params=params, use_mpi=use_mpi) # Sets up Task, parameters, etc.\n # That's it.\n
"},{"location":"tutorial/new_task/#message-signals","title":"Message signals","text":"Signals in Message
objects are strings and can be one of the following:
LUTE_SIGNALS: Set[str] = {\n \"NO_PICKLE_MODE\",\n \"TASK_STARTED\",\n \"TASK_FAILED\",\n \"TASK_STOPPED\",\n \"TASK_DONE\",\n \"TASK_CANCELLED\",\n \"TASK_RESULT\",\n}\n
Each of these signals is associated with a hook on the Executor
-side. They are for the most part used by base classes; however, you can choose to make use of them manually as well.
Task
available","text":"Once the Task
has been written, it needs to be made available for import. Since different Task
s can have conflicting dependencies and environments, this is managed through an import function. When the Task
is done, or ready for testing, a condition is added to lute.tasks.__init__.import_task
. For example, assume the Task
is called RunXASAnalysis
and it's defined in a module called xas.py
, we would add the following lines to the import_task
function:
# in lute.tasks.__init__\n\n# ...\n\ndef import_task(task_name: str) -> Type[Task]:\n # ...\n if task_name == \"RunXASAnalysis\":\n from .xas import RunXASAnalysis\n\n return RunXASAnalysis\n
"},{"location":"tutorial/new_task/#defining-an-executor","title":"Defining an Executor
","text":"The process of Executor
definition is identical to the process as described for ThirdPartyTask
s above. The one exception is if you defined the Task
to use MPI as described in the section above (Using MPI for your Task
 ), you will likely want to use the MPIExecutor
.
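For instance, a managed Task for the RunXASAnalysis example above might be declared as follows (the module shown is an assumption - place the definition wherever the other managed Task declarations live, e.g. lute/managed_tasks.py):
# Declared alongside the other managed Task definitions (assumed: lute/managed_tasks.py)\nfrom lute.execution.executor import Executor, MPIExecutor\n\nXASAnalyzer: Executor = Executor(\"RunXASAnalysis\")\n# Or, if RunXASAnalysis had been written with use_mpi=True:\n# XASAnalyzer: MPIExecutor = MPIExecutor(\"RunXASAnalysis\")\n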
class BaseExecutor(ABC):
+488
+489
+490
class BaseExecutor(ABC):
"""ABC to manage Task execution and communication with user services.
When running in a workflow, "tasks" (not the class instances) are submitted
@@ -2633,7 +2635,9 @@
# network.
time.sleep(0.1)
# Propagate any env vars setup by Communicators - only update LUTE_ vars
- tmp: Dict[str, str] = {key: os.environ[key] for key in os.environ if "LUTE_" in key}
+ tmp: Dict[str, str] = {
+ key: os.environ[key] for key in os.environ if "LUTE_" in key
+ }
self._analysis_desc.task_env.update(tmp)
def _submit_task(self, cmd: str) -> subprocess.Popen:
@@ -3267,9 +3271,7 @@
Source code in lute/execution/executor.py
- 309
-310
-311
+ 311
312
313
314
@@ -3310,7 +3312,9 @@
349
350
351
-352
def execute_task(self) -> None:
+352
+353
+354
def execute_task(self) -> None:
"""Run the requested Task as a subprocess."""
self._pre_task()
lute_path: Optional[str] = os.getenv("LUTE_PATH")
@@ -3378,14 +3382,14 @@
Source code in lute/execution/executor.py
- 478
-479
-480
+ 480
481
482
483
484
-485
def process_results(self) -> None:
+485
+486
+487
def process_results(self) -> None:
"""Perform any necessary steps to process TaskResults object.
Processing will depend on subclass. Examples of steps include, moving
@@ -4142,9 +4146,7 @@
Source code in lute/execution/executor.py
- 491
-492
-493
+ 493
494
495
496
@@ -4319,7 +4321,9 @@
665
666
667
-668
class Executor(BaseExecutor):
+668
+669
+670
class Executor(BaseExecutor):
"""Basic implementation of an Executor which manages simple IPC with Task.
Attributes:
@@ -4527,9 +4531,7 @@
Source code in lute/execution/executor.py
- 525
-526
-527
+ 527
528
529
530
@@ -4602,7 +4604,9 @@
597
598
599
-600
def add_default_hooks(self) -> None:
+600
+601
+602
def add_default_hooks(self) -> None:
"""Populate the set of default event hooks."""
def no_pickle_mode(self: Executor, msg: Message):
@@ -4746,9 +4750,7 @@
Source code in lute/execution/executor.py
- 671
-672
-673
+ 673
674
675
676
@@ -4790,7 +4792,9 @@
712
713
714
-715
class MPIExecutor(Executor):
+715
+716
+717
class MPIExecutor(Executor):
"""Runs first-party Tasks that require MPI.
This Executor is otherwise identical to the standard Executor, except it
diff --git a/dev/source/tasks/sfx_find_peaks/index.html b/dev/source/tasks/sfx_find_peaks/index.html
index 2523df7d..dfe3621b 100644
--- a/dev/source/tasks/sfx_find_peaks/index.html
+++ b/dev/source/tasks/sfx_find_peaks/index.html
@@ -1376,7 +1376,8 @@
Source code in lute/tasks/sfx_find_peaks.py
- 31
+ 30
+ 31
32
33
34
@@ -1708,7 +1709,21 @@
360
361
362
-363
class CxiWriter:
+363
+364
+365
+366
+367
+368
+369
+370
+371
+372
+373
+374
+375
+376
+377
class CxiWriter:
def __init__(
self,
@@ -1887,6 +1902,21 @@
ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]
ch_cols: NDArray[numpy.float_] = peaks[:, 2]
+ if self._outh5["/entry_1/data_1/data"].shape[0] <= self._index:
+ self._outh5["entry_1/data_1/data"].resize(self._index + 1, axis=0)
+ ds_key: str
+ for ds_key in self._outh5["/entry_1/result_1"].keys():
+ self._outh5[f"/entry_1/result_1/{ds_key}"].resize(
+ self._index + 1, axis=0
+ )
+ for ds_key in (
+ "machineTime",
+ "machineTimeNanoSeconds",
+ "fiducial",
+ "photon_energy_eV",
+ ):
+ self._outh5[f"/LCLS/{ds_key}"].resize(self._index + 1, axis=0)
+
# Entry_1 entry for processing with CrystFEL
self._outh5["/entry_1/data_1/data"][self._index, :, :] = img.reshape(
-1, img.shape[-1]
@@ -2099,7 +2129,8 @@
Source code in lute/tasks/sfx_find_peaks.py
- 33
+ 32
+ 33
34
35
36
@@ -2241,8 +2272,7 @@
172
173
174
-175
-176
def __init__(
+175
def __init__(
self,
outdir: str,
rank: int,
@@ -2414,21 +2444,7 @@
Source code in lute/tasks/sfx_find_peaks.py
- 314
-315
-316
-317
-318
-319
-320
-321
-322
-323
-324
-325
-326
-327
-328
+ 328
329
330
331
@@ -2463,7 +2479,21 @@ 360
361
362
-363
def optimize_and_close_file(
+363
+364
+365
+366
+367
+368
+369
+370
+371
+372
+373
+374
+375
+376
+377
def optimize_and_close_file(
self,
num_hits: int,
max_peaks: int,
@@ -2550,7 +2580,8 @@
Source code in lute/tasks/sfx_find_peaks.py
- 178
+ 177
+178
179
180
181
@@ -2650,7 +2681,21 @@
275
276
277
-278
def write_event(
+278
+279
+280
+281
+282
+283
+284
+285
+286
+287
+288
+289
+290
+291
+292
def write_event(
self,
img: NDArray[numpy.float_],
peaks: Any, # Not typed becomes it comes from psana
@@ -2682,6 +2727,21 @@
ch_rows: NDArray[numpy.float_] = peaks[:, 0] * self._det_shape[1] + peaks[:, 1]
ch_cols: NDArray[numpy.float_] = peaks[:, 2]
+ if self._outh5["/entry_1/data_1/data"].shape[0] <= self._index:
+ self._outh5["entry_1/data_1/data"].resize(self._index + 1, axis=0)
+ ds_key: str
+ for ds_key in self._outh5["/entry_1/result_1"].keys():
+ self._outh5[f"/entry_1/result_1/{ds_key}"].resize(
+ self._index + 1, axis=0
+ )
+ for ds_key in (
+ "machineTime",
+ "machineTimeNanoSeconds",
+ "fiducial",
+ "photon_energy_eV",
+ ):
+ self._outh5[f"/LCLS/{ds_key}"].resize(self._index + 1, axis=0)
+
# Entry_1 entry for processing with CrystFEL
self._outh5["/entry_1/data_1/data"][self._index, :, :] = img.reshape(
-1, img.shape[-1]
@@ -2779,21 +2839,7 @@
Source code in lute/tasks/sfx_find_peaks.py
- 280
-281
-282
-283
-284
-285
-286
-287
-288
-289
-290
-291
-292
-293
-294
+ 294
295
296
297
@@ -2811,7 +2857,21 @@ 309
310
311
-312
def write_non_event_data(
+312
+313
+314
+315
+316
+317
+318
+319
+320
+321
+322
+323
+324
+325
+326
def write_non_event_data(
self,
powder_hits: NDArray[numpy.float_],
powder_misses: NDArray[numpy.float_],
@@ -2879,21 +2939,7 @@
Source code in lute/tasks/sfx_find_peaks.py
- 575
-576
-577
-578
-579
-580
-581
-582
-583
-584
-585
-586
-587
-588
-589
+ 589
590
591
592
@@ -3122,14 +3168,38 @@
815
816
817
-818
class FindPeaksPyAlgos(Task):
+818
+819
+820
+821
+822
+823
+824
+825
+826
+827
+828
+829
+830
+831
+832
+833
+834
+835
+836
+837
+838
+839
+840
class FindPeaksPyAlgos(Task):
"""
Task that performs peak finding using the PyAlgos peak finding algorithms and
writes the peak information to CXI files.
"""
- def __init__(self, *, params: TaskParameters) -> None:
- super().__init__(params=params)
+ def __init__(self, *, params: TaskParameters, use_mpi: bool = True) -> None:
+ super().__init__(params=params, use_mpi=use_mpi)
+ if self._task_parameters.compression is not None:
+ from libpressio import PressioCompressor
def _run(self) -> None:
ds: Any = MPIDataSource(
@@ -3306,9 +3376,15 @@
# TODO: Fix bug here
# generate / update powders
if peaks.shape[0] >= self._task_parameters.min_peaks:
- powder_hits = numpy.maximum(powder_hits, img)
+ powder_hits = numpy.maximum(
+ powder_hits,
+ img.reshape(-1, img.shape[-1]),
+ )
else:
- powder_misses = numpy.maximum(powder_misses, img)
+ powder_misses = numpy.maximum(
+ powder_misses,
+ img.reshape(-1, img.shape[-1]),
+ )
if num_empty_images != 0:
msg: Message = Message(
@@ -3414,25 +3490,25 @@
Source code in lute/tasks/sfx_find_peaks.py
- 554
-555
-556
-557
-558
-559
-560
-561
-562
-563
-564
-565
-566
-567
-568
+ 568
569
570
571
-572
def add_peaks_to_libpressio_configuration(lp_json, peaks) -> Dict[str, Any]:
+572
+573
+574
+575
+576
+577
+578
+579
+580
+581
+582
+583
+584
+585
+586
def add_peaks_to_libpressio_configuration(lp_json, peaks) -> Dict[str, Any]:
"""
Add peak infromation to libpressio configuration
@@ -3488,21 +3564,7 @@
Source code in lute/tasks/sfx_find_peaks.py
- 466
-467
-468
-469
-470
-471
-472
-473
-474
-475
-476
-477
-478
-479
-480
+ 480
481
482
483
@@ -3573,7 +3635,21 @@ 548
549
550
-551
def generate_libpressio_configuration(
+551
+552
+553
+554
+555
+556
+557
+558
+559
+560
+561
+562
+563
+564
+565
def generate_libpressio_configuration(
compressor: Literal["sz3", "qoz"],
roi_window_size: int,
bin_size: int,
@@ -3699,21 +3775,7 @@
Source code in lute/tasks/sfx_find_peaks.py
- 366
-367
-368
-369
-370
-371
-372
-373
-374
-375
-376
-377
-378
-379
-380
+ 380
381
382
383
@@ -3796,7 +3858,21 @@
460
461
462
-463
def write_master_file(
+463
+464
+465
+466
+467
+468
+469
+470
+471
+472
+473
+474
+475
+476
+477
def write_master_file(
mpi_size: int,
outdir: str,
exp: str,