From 1299bd208eb3caf081346c3a098012a4abf1fb13 Mon Sep 17 00:00:00 2001
From: gadorlhiac
Date: Mon, 6 May 2024 09:30:06 -0700
Subject: [PATCH 1/2] DOC Extra ADRs on launch process for Airflow integration

---
 docs/adrs/README.md |  2 ++
 docs/adrs/adr-8.md  | 37 +++++++++++++++++++++++++++++++++++++
 docs/adrs/adr-9.md  | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 78 insertions(+)
 create mode 100644 docs/adrs/adr-8.md
 create mode 100644 docs/adrs/adr-9.md

diff --git a/docs/adrs/README.md b/docs/adrs/README.md
index 1e9b760b..7da72575 100644
--- a/docs/adrs/README.md
+++ b/docs/adrs/README.md
@@ -12,4 +12,6 @@
 | 5 | 2023-12-06 | Task-Executor IPC is Managed by Communicator Objects | **Proposed** |
 | 6 | 2024-02-12 | Third-party Config Files Managed by Templates Rendered by `ThirdPartyTask`s | **Proposed** |
 | 7 | 2024-02-12 | `Task` Configuration is Stored in a Database Managed by `Executor`s | **Proposed** |
+| 8 | 2024-03-18 | Airflow credentials/authorization requires special launch program. | **Proposed** |
+| 9 | 2024-04-15 | Airflow launch script will run as long lived batch job. | **Proposed** |
 | | | | |
diff --git a/docs/adrs/adr-8.md b/docs/adrs/adr-8.md
new file mode 100644
index 00000000..c4f7de27
--- /dev/null
+++ b/docs/adrs/adr-8.md
@@ -0,0 +1,37 @@
+# [ADR-8] Airflow credentials/authorization requires special launch program
+
+**Date:** 2024-03-18
+
+## Status
+**Proposed**
+
+## Context and Problem Statement
+- Airflow is used as the workflow manager.
+- Airflow does not currently support multi-tenancy, and LDAP is not currently supported for authentication.
+- Multiple users will be expected to run the software and thus need to authenticate against the Airflow API.
+  - We require a mechanism to control shared credentials for multiple users.
+  - The credentials are admin credentials, so we do not want unconstrained access to them.
+  - We want users to run workflows, for instance, but not to have free access to add and remove workflows.
+
+## Decision
+A closed-source `lute_launcher` program will be used to run the Airflow launch scripts. This program accesses credentials with the correct permissions. Users should otherwise not have access to the credentials. This will help ensure the credentials can be used by everyone but only to run workflows and not perform restricted admin activities.
+
+### Decision Drivers
+* Need shared access to credentials for the purpose of launching jobs.
+* Restricted access to credentials for administrative activities.
+* Ease of use for users
+  * Authentication should be automatic - users cannot be asked for passwords, etc., for jobs that need to run automatically upon data acquisition
+
+### Considered Options
+* LDAP - this may be used in the future, but requires backend work outside of our control. We will revisit the implementation arising from this ADR in the future if LDAP is supported.
+*
+
+## Consequences
+* Complexity
+
+## Compliance
+
+
+## Metadata
+- This ADR WILL be revisited during the post-mortem of the first prototype.
+- Compliance section will be updated as prototype evolves.
diff --git a/docs/adrs/adr-9.md b/docs/adrs/adr-9.md
new file mode 100644
index 00000000..89b7dd3d
--- /dev/null
+++ b/docs/adrs/adr-9.md
@@ -0,0 +1,39 @@
+# [ADR-9] Airflow launch script will run as long lived batch job.
+
+**Date:** 2024-03-18
+
+## Status
+**Proposed**
+
+## Context and Problem Statement
+- Each `Task` will produce its own log file.
+- Log files from jobs (i.e. DAGs/workflows) run by different users will be in different locations/directories.
+- None of these log files will be accessible from the Web UI of the eLog unless they are available to the initial launch script which starts the workflow.
+
+## Decision
+The Airflow launch script will be a long-lived process, running for the duration of the entire DAG. It will provide basic status logging information, e.g. which `Task`s are running and whether they succeeded or failed. Additionally, at the end of each `Task` job, the launch job will collect the log file from that job and append it to its own log.
+
+As the Airflow launch script is an entry point used from the eLog, only its log file is available to users using that UI. Converting the launch script into a long-lived monitoring job makes the log information easily accessible.
+
+To accomplish this, the launch script must be submitted as a batch job to comply with the 30-second timeout imposed on jobs run by the ARP. This necessitates providing an additional wrapper script.
+
+### Decision Drivers
+* Log availability from the eLog.
+* All logs available from a single location.
+
+### Considered Options
+* All jobs append to the same initial file, by specifying a log file. (`--open-mode=append` for SLURM)
+  * Having a monitoring job provides the opportunity to include additional information.
+
+## Consequences
+* There needs to be an additional wrapper script: `submit_launch_airflow.sh` which submits the `launch_airflow.py` script (run by `lute_launcher`) as a batch job.
+  * Jobs run by the ARP cannot be long-lived - there is a 30-second timeout.
+  * The ARP was intended to submit batch jobs - it captures the log file from batch jobs, so running the job directly or submitting as a batch job is equivalent in terms of presenting information to the eLog UI.
+* Another core is used to run the job. Overhead is now two cores - 1 for the monitoring job (`launch_airflow.py`) and 1 for the `Executor` process.
+
+## Compliance
+
+
+## Metadata
+- This ADR WILL be revisited during the post-mortem of the first prototype.
+- Compliance section will be updated as prototype evolves.
From 8a6e45ac2e1f3245174b95144afe0b911adcc6a5 Mon Sep 17 00:00:00 2001
From: gadorlhiac
Date: Mon, 6 May 2024 09:33:10 -0700
Subject: [PATCH 2/2] DOC Update date

---
 docs/adrs/adr-9.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adrs/adr-9.md b/docs/adrs/adr-9.md
index 89b7dd3d..a3d8a60c 100644
--- a/docs/adrs/adr-9.md
+++ b/docs/adrs/adr-9.md
@@ -1,6 +1,6 @@
 # [ADR-9] Airflow launch script will run as long lived batch job.
 
-**Date:** 2024-04-15
+**Date:** 2024-04-15
 
 ## Status
 **Proposed**
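
Reviewer note on ADR-9: the log-collection step described in its Decision section (at the end of each `Task` job, append that job's log file to the launch job's own log) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the contents of `launch_airflow.py`. The function names `append_task_log` and `collect_finished`, the delimiter lines, and the idea that the caller supplies a mapping of finished `Task` names to log paths (for example, built while polling Airflow for task state) are all hypothetical and do not come from the patch.

```python
import io
import tempfile
from pathlib import Path
from typing import Dict, Set, TextIO


def append_task_log(task_name: str, task_log: Path, launch_log: TextIO) -> bool:
    """Copy one finished Task's log into the launch job's own log stream.

    Hypothetical helper: returns False (and writes a note) if the Task log
    cannot be read, so that a missing file does not stop the monitoring job.
    """
    launch_log.write(f"===== Begin log: {task_name} ({task_log}) =====\n")
    try:
        # errors="replace" keeps the monitor alive if a log has stray bytes.
        text = task_log.read_text(errors="replace")
    except OSError as err:
        launch_log.write(f"Could not read log: {err}\n")
        ok = False
    else:
        launch_log.write(text)
        if text and not text.endswith("\n"):
            launch_log.write("\n")
        ok = True
    launch_log.write(f"===== End log: {task_name} =====\n")
    return ok


def collect_finished(
    finished: Dict[str, Path], already_collected: Set[str], launch_log: TextIO
) -> None:
    """Append logs for Tasks that finished since the last poll, once each.

    `finished` maps Task name -> log path; how that mapping is obtained
    is outside this sketch (an assumption, not part of the ADR).
    """
    for task_name, task_log in finished.items():
        if task_name in already_collected:
            continue
        append_task_log(task_name, task_log, launch_log)
        already_collected.add(task_name)
```

In a monitoring loop, `collect_finished` would be called after each poll with `sys.stdout` (or the batch job's log file) as `launch_log`, and the same `already_collected` set reused across polls so that each `Task` log is appended only once. Compared with the `--open-mode=append` option listed under Considered Options, this keeps each `Task` log in its own file and leaves room for the extra status lines the ADR mentions.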