Adjust node pools in data-processing clusters #330

gfr10598 · 2021-04-06T20:36:16Z

The most recent change to the etl k8s configs failed to launch pods in staging, because I had set the 8-core node pool to 0 instances.

Also, in all projects, the utilization is low, because there are more nodes than needed for the requested pods.

All the pools should be updated to use appropriate auto-scaling configs.

gfr10598 · 2021-04-06T20:44:49Z

Today I am:
changing the sandbox default pool to allow 1 node per zone, and removing zone us-east1-d.
changing the staging 8-core parser-pool1 to allow 0-2 nodes per zone.
changing the staging 4-core parser-pool to allow 0-1 nodes per zone.

Tomorrow, I intend to change prod to set up auto-scaling for parser, default, and gardener pools.
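For reference, the staging changes above could be expressed roughly as follows. This is only a sketch: it builds `gcloud container clusters update` command lines and prints them rather than running them. The cluster name `data-processing` and project `mlab-staging` come from this issue; the `us-east1` region and the assumption that the cluster is regional are guesses (a zonal cluster would take `--zone` instead), and the `PoolAutoscaling` helper is hypothetical.

```python
import shlex
from dataclasses import dataclass
from typing import List

CLUSTER = "data-processing"  # cluster name taken from the issue title
REGION = "us-east1"          # ASSUMPTION: regional cluster in us-east1


@dataclass(frozen=True)
class PoolAutoscaling:
    """Hypothetical helper: per-zone autoscaling bounds for one node pool."""
    project: str
    pool: str
    min_per_zone: int
    max_per_zone: int


def gcloud_args(cfg: PoolAutoscaling) -> List[str]:
    """Build (but do not run) the gcloud command that sets autoscaling bounds."""
    if not 0 <= cfg.min_per_zone <= cfg.max_per_zone:
        raise ValueError(f"invalid bounds for {cfg.pool}: "
                         f"{cfg.min_per_zone}-{cfg.max_per_zone}")
    return [
        "gcloud", "container", "clusters", "update", CLUSTER,
        f"--project={cfg.project}",
        f"--region={REGION}",
        f"--node-pool={cfg.pool}",
        "--enable-autoscaling",
        f"--min-nodes={cfg.min_per_zone}",  # lower bound, per zone
        f"--max-nodes={cfg.max_per_zone}",  # upper bound, per zone
    ]


# The two staging pools named in this comment.
STAGING_POOLS = [
    PoolAutoscaling("mlab-staging", "parser-pool1", 0, 2),  # 8-core pool
    PoolAutoscaling("mlab-staging", "parser-pool", 0, 1),   # 4-core pool
]

if __name__ == "__main__":
    for cfg in STAGING_POOLS:
        print(shlex.join(gcloud_args(cfg)))
```

Printing the commands instead of executing them leaves room to review each one before applying it to a shared cluster.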

gfr10598 · 2021-04-06T20:46:19Z

Removing zone us-east1-d from the default pool in mlab-sandbox made gardener unschedulable, because of where its persistent volume is located. I restored us-east1-d and adjusted auto-scaling to allow 1-2 nodes per zone.

Later changed to 0-2 nodes per zone.

gfr10598 · 2021-04-06T21:02:20Z

Subsequently updated mlab-sandbox parser-pool1 to allow 0-2 nodes per zone as well.

laiyi-ohlsen · 2021-04-12T17:09:05Z

@gfr10598 can you enumerate the steps that are needed before the next gardener release to production?

gfr10598 · 2021-04-21T14:55:25Z

I just deleted the mlab-staging parser-pool1 node pool from the data-processing cluster. Kubernetes had restarted the parsers, and they were scheduled onto the wrong node pool. Deleting the pool should prevent this from happening again.
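Deleting the pool is the fix that was applied here. A complementary guard, not something done in this issue, would be to pin the parser pods to the intended pool with a `nodeSelector` on the `cloud.google.com/gke-nodepool` label that GKE sets on its nodes. The sketch below only shows the shape of that change to a pod spec; the `pin_to_pool` helper and the `etl-parser` container name are hypothetical, and using `parser-pool` as the intended pool is an assumption.

```python
import json

# Node label that GKE sets to the name of the node pool a node belongs to.
GKE_NODEPOOL_LABEL = "cloud.google.com/gke-nodepool"


def pin_to_pool(pod_spec: dict, pool: str) -> dict:
    """Return a copy of pod_spec whose nodeSelector requires the given pool."""
    pinned = dict(pod_spec)
    selector = dict(pinned.get("nodeSelector", {}))  # keep existing selectors
    selector[GKE_NODEPOOL_LABEL] = pool
    pinned["nodeSelector"] = selector
    return pinned


if __name__ == "__main__":
    # Hypothetical, heavily trimmed parser pod spec.
    spec = {"containers": [{"name": "etl-parser"}]}
    # ASSUMPTION: parser-pool is the pool the parsers are meant to run in.
    print(json.dumps(pin_to_pool(spec, "parser-pool"), indent=2))
```

With a selector like this, a restarted pod stays Pending if the intended pool has no capacity rather than landing in another pool.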

gfr10598 · 2021-04-21T15:03:30Z

See m-lab/etl#985 related to propagating errors from etl to gardener.

autolabel bot added the review/triage label (Team should review and assign priority) on Apr 6, 2021.

laiyi-ohlsen added the bug (Something isn't working) and pipeline labels and removed the review/triage label (Team should review and assign priority) on Apr 12, 2021.
