Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Env per pod used to exist but has been removed - why? #629

Open
2 of 4 tasks
mb-m opened this issue Aug 1, 2022 · 8 comments · May be fixed by #632
Open
2 of 4 tasks

Env per pod used to exist but has been removed - why? #629

mb-m opened this issue Aug 1, 2022 · 8 comments · May be fixed by #632
Labels
kind/enhancement kind - new features or changes

Comments

@mb-m
Copy link
Contributor

mb-m commented Aug 1, 2022

Checks

Motivation

I'm trying to (finally) move from a patched older version of the community chart to the latest version (and working on other patches - issue and PR to hopefully follow), but I noticed that there's no longer a way to specify an environment variable for a single pod. In my case, I set the AIRFLOW__CORE__LOGGING (I'm still running 1.10.14) to INFO everywhere except the workers where I set it to DEBUG.

I also have my entity syncer (which has a little bit extra from the ones now in the 8.x.x chart), which has a couple of extra useful environment variables that I don't want to specify everywhere (obviously copying in the custom template is easy enough for the latter, but annoying for the workers). There are overrides for many of the other things, and there's presumably a good reason (that I'm missing?) why these used to exist in the 7.x.x versions of the chart and were removed. But is there a better way to solve my logging problem above, for example that I've missed?

Implementation

diff --git a/charts/airflow/templates/_helpers/pods.tpl b/charts/airflow/templates/_helpers/pods.tpl
index 0e5bdd8..47a1bae 100644
--- a/charts/airflow/templates/_helpers/pods.tpl
+++ b/charts/airflow/templates/_helpers/pods.tpl
@@ -583,4 +583,7 @@ EXAMPLE USAGE: {{ include "airflow.env" (dict "Release" .Release "Values" .Value
 {{- if .Values.airflow.extraEnv }}
 {{ toYaml .Values.airflow.extraEnv }}
 {{- end }}
+{{- if .extraEnv }}
+{{ toYaml .extraEnv }}
+{{- end }}
 {{- end }}
diff --git a/charts/airflow/templates/flower/flower-deployment.yaml b/charts/airflow/templates/flower/flower-deployment.yaml
index 436f7de..4a88dcb 100644
--- a/charts/airflow/templates/flower/flower-deployment.yaml
+++ b/charts/airflow/templates/flower/flower-deployment.yaml
@@ -3,6 +3,7 @@
 {{- $podAffinity := include "airflow.podAffinity" (dict "Release" .Release "Values" .Values "affinity" .Values.flower.affinity) }}
 {{- $podTolerations := include "airflow.podTolerations" (dict "Release" .Release "Values" .Values "tolerations" .Values.flower.tolerations) }}
 {{- $podSecurityContext := include "airflow.podSecurityContext" (dict "Release" .Release "Values" .Values "securityContext" .Values.flower.securityContext) }}
+{{- $env := include "airflow.env" (dict "Release" .Release "Values" .Values "extraEnv" .Values.flower.extraEnv) }}
 {{- $extraPipPackages := concat .Values.airflow.extraPipPackages .Values.flower.extraPipPackages }}
 {{- $extraVolumeMounts := .Values.flower.extraVolumeMounts }}
 {{- $volumeMounts := include "airflow.volumeMounts" (dict "Release" .Release "Values" .Values "extraPipPackages" $extraPipPackages "extraVolumeMounts" $extraVolumeMounts) }}
@@ -100,7 +101,7 @@ spec:
           envFrom:
             {{- include "airflow.envFrom" . | indent 12 }}
           env:
-            {{- include "airflow.env" . | indent 12 }}
+            {{- $env | indent 12 }}
           ports:
             - name: flower
               containerPort: 5555
diff --git a/charts/airflow/templates/scheduler/scheduler-deployment.yaml b/charts/airflow/templates/scheduler/scheduler-deployment.yaml
index 529aa2c..7b5b85f 100644
--- a/charts/airflow/templates/scheduler/scheduler-deployment.yaml
+++ b/charts/airflow/templates/scheduler/scheduler-deployment.yaml
@@ -2,6 +2,7 @@
 {{- $podAffinity := include "airflow.podAffinity" (dict "Release" .Release "Values" .Values "affinity" .Values.scheduler.affinity) }}
 {{- $podTolerations := include "airflow.podTolerations" (dict "Release" .Release "Values" .Values "tolerations" .Values.scheduler.tolerations) }}
 {{- $podSecurityContext := include "airflow.podSecurityContext" (dict "Release" .Release "Values" .Values "securityContext" .Values.scheduler.securityContext) }}
+{{- $env := include "airflow.env" (dict "Release" .Release "Values" .Values "extraEnv" .Values.scheduler.extraEnv) }}
 {{- $extraPipPackages := concat .Values.airflow.extraPipPackages .Values.scheduler.extraPipPackages }}
 {{- $extraVolumeMounts := .Values.scheduler.extraVolumeMounts }}
 {{- $volumeMounts := include "airflow.volumeMounts" (dict "Release" .Release "Values" .Values "extraPipPackages" $extraPipPackages "extraVolumeMounts" $extraVolumeMounts) }}
@@ -110,7 +111,7 @@ spec:
           envFrom:
             {{- include "airflow.envFrom" . | indent 12 }}
           env:
-            {{- include "airflow.env" . | indent 12 }}
+            {{- $env | indent 12 }}
           command:
             {{- include "airflow.command" . | indent 12 }}
           args:
@@ -227,4 +228,4 @@ spec:
           configMap:
             name: {{ include "airflow.fullname" . }}-pod-template
         {{- end }}
-      {{- end }}
\ No newline at end of file
+      {{- end }}
diff --git a/charts/airflow/templates/triggerer/triggerer-deployment.yaml b/charts/airflow/templates/triggerer/triggerer-deployment.yaml
index 2aea21d..739580b 100644
--- a/charts/airflow/templates/triggerer/triggerer-deployment.yaml
+++ b/charts/airflow/templates/triggerer/triggerer-deployment.yaml
@@ -3,6 +3,7 @@
 {{- $podAffinity := include "airflow.podAffinity" (dict "Release" .Release "Values" .Values "affinity" .Values.triggerer.affinity) }}
 {{- $podTolerations := include "airflow.podTolerations" (dict "Release" .Release "Values" .Values "tolerations" .Values.triggerer.tolerations) }}
 {{- $podSecurityContext := include "airflow.podSecurityContext" (dict "Release" .Release "Values" .Values "securityContext" .Values.triggerer.securityContext) }}
+{{- $env := include "airflow.env" (dict "Release" .Release "Values" .Values "extraEnv" .Values.triggerer.extraEnv) }}
 {{- $extraPipPackages := concat .Values.airflow.extraPipPackages .Values.triggerer.extraPipPackages }}
 {{- $extraVolumeMounts := .Values.triggerer.extraVolumeMounts }}
 {{- $volumeMounts := include "airflow.volumeMounts" (dict "Release" .Release "Values" .Values "extraPipPackages" $extraPipPackages "extraVolumeMounts" $extraVolumeMounts) }}
@@ -99,7 +100,7 @@ spec:
           envFrom:
             {{- include "airflow.envFrom" . | indent 12 }}
           env:
-            {{- include "airflow.env" . | indent 12 }}
+            {{- $env | indent 12 }}
           command:
             {{- include "airflow.command" . | indent 12 }}
           args:
diff --git a/charts/airflow/templates/webserver/webserver-deployment.yaml b/charts/airflow/templates/webserver/webserver-deployment.yaml
index 4e536da..92139de 100644
--- a/charts/airflow/templates/webserver/webserver-deployment.yaml
+++ b/charts/airflow/templates/webserver/webserver-deployment.yaml
@@ -2,6 +2,7 @@
 {{- $podAffinity := include "airflow.podAffinity" (dict "Release" .Release "Values" .Values "affinity" .Values.web.affinity) }}
 {{- $podTolerations := include "airflow.podTolerations" (dict "Release" .Release "Values" .Values "tolerations" .Values.web.tolerations) }}
 {{- $podSecurityContext := include "airflow.podSecurityContext" (dict "Release" .Release "Values" .Values "securityContext" .Values.web.securityContext) }}
+{{- $env := include "airflow.env" (dict "Release" .Release "Values" .Values "extraEnv" .Values.web.extraEnv) }}
 {{- $extraPipPackages := concat .Values.airflow.extraPipPackages .Values.web.extraPipPackages }}
 {{- $extraVolumeMounts := .Values.web.extraVolumeMounts }}
 {{- $volumeMounts := include "airflow.volumeMounts" (dict "Release" .Release "Values" .Values "extraPipPackages" $extraPipPackages "extraVolumeMounts" $extraVolumeMounts) }}
@@ -103,7 +104,7 @@ spec:
           envFrom:
             {{- include "airflow.envFrom" . | indent 12 }}
           env:
-            {{- include "airflow.env" . | indent 12 }}
+            {{- $env | indent 12 }}
           command:
             {{- include "airflow.command" . | indent 12 }}
           args:
@@ -155,4 +156,4 @@ spec:
             {{- else }}
             secretName: {{ include "airflow.fullname" . }}-webserver-config
             {{- end }}
-            defaultMode: 0644
\ No newline at end of file
+            defaultMode: 0644
diff --git a/charts/airflow/templates/worker/worker-statefulset.yaml b/charts/airflow/templates/worker/worker-statefulset.yaml
index d2e50bc..2266314 100644
--- a/charts/airflow/templates/worker/worker-statefulset.yaml
+++ b/charts/airflow/templates/worker/worker-statefulset.yaml
@@ -3,6 +3,7 @@
 {{- $podAffinity := include "airflow.podAffinity" (dict "Release" .Release "Values" .Values "affinity" .Values.workers.affinity) }}
 {{- $podTolerations := include "airflow.podTolerations" (dict "Release" .Release "Values" .Values "tolerations" .Values.workers.tolerations) }}
 {{- $podSecurityContext := include "airflow.podSecurityContext" (dict "Release" .Release "Values" .Values "securityContext" .Values.workers.securityContext) }}
+{{- $env := include "airflow.env" (dict "Release" .Release "Values" .Values "extraEnv" .Values.workers.extraEnv) }}
 {{- $extraPipPackages := concat .Values.airflow.extraPipPackages .Values.workers.extraPipPackages }}
 {{- $extraVolumeMounts := .Values.workers.extraVolumeMounts }}
 {{- $volumeMounts := include "airflow.volumeMounts" (dict "Release" .Release "Values" .Values "extraPipPackages" $extraPipPackages "extraVolumeMounts" $extraVolumeMounts) }}
@@ -104,7 +105,7 @@ spec:
           envFrom:
             {{- include "airflow.envFrom" . | indent 12 }}
           env:
-            {{- include "airflow.env" . | indent 12 }}
+            {{- $env | indent 12 }}
             # have dumb-init only send signals to direct child process (needed for celery workers to warm shutdown)
             - name: DUMB_INIT_SETSID
               value: "0"
diff --git a/charts/airflow/values.yaml b/charts/airflow/values.yaml
index 9b7b987..96d60ab 100644
--- a/charts/airflow/values.yaml
+++ b/charts/airflow/values.yaml
@@ -629,6 +629,9 @@ scheduler:
       ##
       schedulerAgeBeforeCheck: 180
 
+  ## extra environment for just the scheduler Pods
+  extraEnv: {}
+
   ## extra pip packages to install in the scheduler Pods
   ##
   ## ____ EXAMPLE _______________
@@ -786,6 +789,9 @@ web:
     timeoutSeconds: 5
     failureThreshold: 6
 
+  ## extra environment for just the web Pods
+  extraEnv: {}
+
   ## extra pip packages to install in the web Pods
   ##
   ## ____ EXAMPLE _______________
@@ -955,6 +961,9 @@ workers:
     ##
     intervalSeconds: 900
 
+  ## extra environment for just the worker Pods
+  extraEnv: {}
+
   ## extra pip packages to install in the worker Pod
   ##
   ## ____ EXAMPLE _______________
@@ -1068,6 +1077,9 @@ triggerer:
     timeoutSeconds: 60
     failureThreshold: 5
 
+  ## extra environment for just the triggerer Pods
+  extraEnv: {}
+
   ## extra pip packages to install in the triggerer Pod
   ##
   ## ____ EXAMPLE _______________
@@ -1204,6 +1216,9 @@ flower:
     timeoutSeconds: 5
     failureThreshold: 6
 
+  ## extra environment for just the flower Pods
+  extraEnv: {}
+
   ## extra pip packages to install in the flower Pod
   ##
   ## ____ EXAMPLE _______________

Are you willing & able to help?

  • I am able to submit a PR!
  • I can help test the feature!
@mb-m mb-m added the kind/enhancement kind - new features or changes label Aug 1, 2022
@thesuperzapper
Copy link
Member

@mb-m it's been a while since that change, but I think the main reason was to reduce the risk that people set inconsistent airflow configs across their cluster, either because they don't define them consistently, or because they don't restart all the Pods after a secret which set an env is updated (as Secret/ConfigMap updates are not propagated to env until the container restarts).

I am not against introducing an extraEnv value for each Deployment, but there are a number of things to consider:

  • in a Container's env list, the FIRST instance of a key takes precedence
    • so we want order in the "airflow.env" template to be: "chart managed envs" --> xxxx.extraEnv --> airflow.extraEnv
  • we should improve the values-validation.tpl that checks people are not setting chart managed envs to also check xxxx.extraEnv and airflow.extraEnv (which we don't actually check right now)
    • NOTE: in the interest of not looping over every airflow.extraEnv and xxxx.extraEnv for every variable, we might create something like a hash-map in a variable name to quickly check which envs were set
  • there are a small number of places where we loop over airflow.extraEnv to detect things for the post-install NOTES.txt, we need to extend these checks to also check xxxx.extraEnv

Also, I would love to know what extra features we are missing from the sync deployments, could you share what you have added?

@mb-m
Copy link
Contributor Author

mb-m commented Aug 5, 2022

@thesuperzapper thanks for getting back to me so quickly (and apologies for my delayed response back - I was totally useless a few days ago after a positive test result and symptoms - and then I thought I'd sent this response and hadn't). That's an interesting set of concerns, for sure, I've always tended to err on the side that the user should know what they are doing, but I can totally see why protecting them from themselves is worthwhile. To be honest, I'm expecting the xxxx.extraEnv to be small though, and its usage to be limited.

As to the syncer - the big one that's missing for me is "roles" (so actual permissions) - for instance, I turn off can_trigger on the instance in question, but the truth is that our system isn't just airflow - it's airflow with a layer to scan lots of git repositories organisation-wide and generate DAGs from individual "steps" (which might themselves expand into multiple parts of a DAG and task), and join that all together. (It turns out just to do the git-sync at scale is about 1000 lines of python, and to do all the rest is about 8000-odd lines of code, plus tests). But that means that our "syncer" pod - which started off as just being an entity syncer - like the different bits that you have (although we have the entities in the synced dags/ directory, so you just need to commit them there without any apply, and they'll get updated), ended up being a place for all the extra bits that went with supporting this system (the program that cleans up dead DAGs (and reports on it); a program to sync S3 back into git (and manage the commits in a sane way - because it turns out that other people find that kind of thing hard) and other such things, including queries across the database that give us monitoring on the status of tasks, pool usage etc. Quite a few of these aren't easily supported just by the extraManifests - because you want them to have access to the same extraContainers, and environment, which means that when the chart changes style, you have to follow that (we already have to do that a bit for one of the extraContainers too, so we can generate the DAGs - which ends up being a bit ugly - we need the airflowdb so we don't auto-instantiate a role that doesn't exist (this is also why the role syncer helps of course) - because we give our users access to dags based on where we scanned the steps from).

There's an extra part here, which is that we've been doing this weird hack (we currently use the google oauth) where we ask users to register, we capture their google oauth id and only then update our list of users and roles to sync. As I'd like us to move away from Google, being able to do the sync for "any account with this email address" coming from oauth is something we're likely to put in our version of the syncer soon. This also means we can (to some extent) pre-prepare users - they can register, and then within 5-10 minutes of registering, they'll have the right permissions - because we can potentially export the email address permissions from elsewhere.

I like your suggestions though - I'd probably disagree that you need to go as far as a hashmap because the number of these extraEnvs is still relatively small, and there are so many other things that are slow in helm that looping all of these bits is the least of my worries - but I think that a template for each possible place where the envs might be set is enough. I'll have a go on getting the other bits together as a PR.

@thesuperzapper
Copy link
Member

@mb-m I am interested to see your approach or any suggestions you have to make the chart better for you (whether as a PR, or in an issue).

Also, there is actually a PR to add syncing for FAB Roles #537, but I haven't looked into the problem yet, so I have no idea if that PR attacks the problem in the best way.

@mb-m
Copy link
Contributor Author

mb-m commented Aug 16, 2022

Before I wrap this fully up into a PR and have a proper testcase, how does the attached look - am I capturing the kinds of things you were thinking?

charts-airflow.patch

@thesuperzapper
Copy link
Member

@mb-m it's quite difficult to look at in a patch form, can you make a PR so we can discuss it there?

(even if its not fully ready yet, we can discuss it more in your PR)

mb-m pushed a commit to mb-m/airflow-helm-charts that referenced this issue Aug 19, 2022
As per airflow-helm#629, add support for .extraEnv for the various pod types, including some of the validation and warnings.
This is generally better where we either use an environment variable that's something new, or that we are overriding
something in airflow.config (eg. setting the log level at DEBUG on workers but INFO everywhere else)

Signed-off-by: Matthew Byng-Maddick <[email protected]>
mb-m pushed a commit to mb-m/airflow-helm-charts that referenced this issue Aug 19, 2022
…d types

As per airflow-helm#629, add support for .extraEnv for the various pod types, including some of the validation and warnings.
This is generally better where we either use an environment variable that's something new, or that we are overriding
something in airflow.config (eg. setting the log level at DEBUG on workers but INFO everywhere else)

Signed-off-by: Matthew Byng-Maddick <[email protected]>
mb-m pushed a commit to mb-m/airflow-helm-charts that referenced this issue Aug 19, 2022
…d types

As per airflow-helm#629, add support for .extraEnv for the various pod types, including some of the validation and warnings.
This is generally better where we either use an environment variable that's something new, or that we are overriding
something in airflow.config (eg. setting the log level at DEBUG on workers but INFO everywhere else)

Signed-off-by: Matthew Byng-Maddick <[email protected]>
@mb-m
Copy link
Contributor Author

mb-m commented Aug 23, 2022

@thesuperzapper hopefully you've seen that a PR now exists for this (it's minorly changed from my original patch, but I have it demonstrably working (using a patched version of the chart) in our environment). I'd love to hear your thoughts!

@stale
Copy link

stale bot commented Oct 22, 2022

This issue has been automatically marked as stale because it has not had activity in 60 days.
It will be closed in 7 days if no further activity occurs.

Thank you for your contributions.


Issues never become stale if any of the following is true:

  1. they are added to a Project
  2. they are added to a Milestone
  3. they have the lifecycle/frozen label

@stale stale bot added the lifecycle/stale lifecycle - this is stale label Oct 22, 2022
@mb-m
Copy link
Contributor Author

mb-m commented Nov 9, 2022

This is still waiting on the PR (comment just to stop staleness)

@stale stale bot removed the lifecycle/stale lifecycle - this is stale label Nov 9, 2022
@thesuperzapper thesuperzapper added this to the airflow-8.6.2 milestone Nov 11, 2022
mb-m pushed a commit to mb-m/airflow-helm-charts that referenced this issue Jun 6, 2023
…d types

As per airflow-helm#629, add support for .extraEnv for the various pod types, including some of the validation and warnings.
This is generally better where we either use an environment variable that's something new, or that we are overriding
something in airflow.config (eg. setting the log level at DEBUG on workers but INFO everywhere else)

Signed-off-by: Matthew Byng-Maddick <[email protected]>
mb-m pushed a commit to mb-m/airflow-helm-charts that referenced this issue Aug 30, 2023
…d types

As per airflow-helm#629, add support for .extraEnv for the various pod types, including some of the validation and warnings.
This is generally better where we either use an environment variable that's something new, or that we are overriding
something in airflow.config (eg. setting the log level at DEBUG on workers but INFO everywhere else)

Signed-off-by: Matthew Byng-Maddick <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement kind - new features or changes
Projects
None yet
2 participants