feature: add option to allow cinder CSI availability zone override #366

Open
satishdotpatel opened this issue Apr 29, 2024 · 5 comments

@satishdotpatel

My worker nodes and Cinder storage run in different availability zones, and because of that I get affinity issues when creating PVs and deployments. By default, at cluster create time, k8s and the CSI driver use the same AZ. I would like an option that tells magnum-cluster-api to override the CSI AZ so that it differs from the worker/compute AZ (for example, a csi_availability_zone label applied to the template or at cluster create).

As you can see in the following output, topology.cinder.csi.openstack.org/zone=general and topology.kubernetes.io/zone=general point to the same AZ, but I would like to change the CSI AZ to something else.

(venv-openstack) root@os-eng-ctrl-01:/tmp# kubectl describe node kube-acrsf-default-worker-rxmg6-t7wzd
Name:               kube-acrsf-default-worker-rxmg6-t7wzd
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=gen.c4-m8-d40
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=Boston-eng
                    failure-domain.beta.kubernetes.io/zone=general
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=kube-acrsf-default-worker-rxmg6-t7wzd
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.cluster.x-k8s.io/nodegroup=default-worker
                    node.kubernetes.io/instance-type=gen.c4-m8-d40
                    topology.cinder.csi.openstack.org/zone=general
                    topology.kubernetes.io/region=Boston-eng
                    topology.kubernetes.io/zone=general
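
To check whether the two zone labels agree on a node, something like the following could be used. It is only a sketch: it parses `key=value` label lines shaped like the output above, and the helper names `parse_labels` and `zones` are hypothetical, not part of kubectl or any OpenStack tool.

```python
# Label keys taken from the `kubectl describe node` output above.
CSI_ZONE = "topology.cinder.csi.openstack.org/zone"
K8S_ZONE = "topology.kubernetes.io/zone"


def parse_labels(lines):
    """Turn 'key=value' label lines into a dict (hypothetical helper)."""
    labels = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that are not key=value pairs
            labels[key] = value
    return labels


def zones(labels):
    """Return (compute zone, Cinder CSI zone) for one node."""
    return labels.get(K8S_ZONE), labels.get(CSI_ZONE)


node_labels = parse_labels([
    "topology.cinder.csi.openstack.org/zone=general",
    "topology.kubernetes.io/region=Boston-eng",
    "topology.kubernetes.io/zone=general",
])
compute_az, csi_az = zones(node_labels)
print(f"compute AZ={compute_az}, CSI AZ={csi_az}, same={compute_az == csi_az}")
```
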
@mnaser
Member

mnaser commented Apr 29, 2024

I think this is a two-part issue:

StorageClass not including the correct mapping
By default, the StorageClass doesn't include anything about the availability zones in which the volume is available, which makes it impossible to create the volume in certain situations. We should add this data to the StorageClass as follows:

...
allowedTopologies:
- matchLabelExpressions:
  - key: topology.cinder.csi.openstack.org/zone
    values:
    - nova
...
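
For illustration, a complete StorageClass carrying that block could be generated as below. This is a minimal sketch, not what magnum-cluster-api does: the function name and the StorageClass name `cinder-az-nova` are made up, and the manifest is printed as JSON (which kubectl also accepts) to avoid a YAML dependency.

```python
import json


def storage_class(name, zones):
    """Build a StorageClass manifest restricted to the given Cinder zones."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "cinder.csi.openstack.org",
        "allowedTopologies": [
            {
                "matchLabelExpressions": [
                    {
                        "key": "topology.cinder.csi.openstack.org/zone",
                        "values": list(zones),
                    }
                ]
            }
        ],
    }


# "cinder-az-nova" is a hypothetical name; "nova" is the zone from the snippet above.
manifest = storage_class("cinder-az-nova", ["nova"])
print(json.dumps(manifest, indent=2))
```

The output could be saved to a file and applied with `kubectl apply -f`.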

Allow cross-AZ attachments
While the above can be sorted out, the other issue is that in an environment where cross_az_attach is set to true in the cloud, we need to enable ignore-volume-az in the [BlockStorage] section if the user requires this. This would be handled by setting [BlockStorage]/ignore-volume-az to true in the Cinder CSI plugin configuration file.
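
As a rough sketch of that change, the snippet below adds the option to an INI-style config text using Python's `configparser`. The sample `[Global]` content and the function name are assumptions for illustration only. The real plugin configuration may use syntax that `configparser` does not round-trip exactly (for example quoted values), so treat this as a demonstration of the target setting rather than a tool to run against a live cluster.

```python
import configparser
import io

# Hypothetical starting config; a real file carries credentials and more options.
original = """\
[Global]
auth-url = https://keystone.example.com/v3
"""


def enable_ignore_volume_az(conf_text):
    """Return conf_text with [BlockStorage]/ignore-volume-az set to true."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("BlockStorage"):
        parser.add_section("BlockStorage")
    parser.set("BlockStorage", "ignore-volume-az", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


updated = enable_ignore_volume_az(original)
print(updated)
```
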

@satishdotpatel
Author

I don't think the allowedTopologies approach will work for this case. But I might consider it a bug in csi-cinder that it doesn't give you a way to set topology.cinder.csi.openstack.org/zone correctly on the nodes/worker.

@satishdotpatel
Author

How do we tell end users to set [BlockStorage]/ignore-volume-az in their k8s cluster? Is there any way to set this during k8s cluster create using Magnum?

@mnaser
Member

mnaser commented Apr 30, 2024

> I don't think the allowedTopologies approach will work for this case. But I might consider it a bug in csi-cinder that it doesn't give you a way to set topology.cinder.csi.openstack.org/zone correctly on the nodes/worker.

I am not following here. Cinder CSI doesn't know anything about your topology, so for it to understand the topology, we need to provide that information to it.

> How do we tell end users to set [BlockStorage]/ignore-volume-az in their k8s cluster? Is there any way to set this during k8s cluster create using Magnum?

For this one, I think you'll probably have to edit the CCM configuration (I can't remember whether it lives in a ConfigMap or in a file under /etc/kubernetes).

@noonedeadpunk
Contributor

I can only +1 the issue here.

On top of that, the current version of the driver does not seem to respect the Availability Zone hint passed to Magnum.
As a result, workloads end up spread randomly across AZs.

Setting cross_az_attach: True is a rather niche solution, as in many cases having separate storage per AZ is part of the AZ design, for example where AZs are spread geographically to the point that storage latency between AZs is not acceptable.
While one can argue that these should be regions (and I would tend to agree), it might still be the case for some deployments.

So ideally, when the user passes a hint, the AZ where the workload cluster is spawned should be respected.

noonedeadpunk added a commit to noonedeadpunk/magnum-cluster-api that referenced this issue Nov 29, 2024
At the moment, volumes for workers are spawned in a random
AZ, completely disregarding the user's request for an Availability Zone.

In a design where cross_az_attach is not possible, an attempt
to add a volume to a worker has a high failure rate due to the random
selection of the AZ.

This patch suggests using the availability_zone provided to the cluster by
supplying the `availability` parameter to Cinder CSI [1].

This ensures that the volume is created in the same zone as the workers,
preventing failures where cross_az_attach is False.

[1] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md#supported-parameters

Relates-to: vexxhost#366
noonedeadpunk added a commit to noonedeadpunk/magnum-cluster-api that referenced this issue Nov 29, 2024
At the moment, volumes for workers are spawned in the default
AZ, completely disregarding the user's request for an Availability Zone.

In a design where cross_az_attach is disabled, an attempt
to add a volume to a worker has a high failure rate due to the fallback
to the default scheduling zone, unless allow_availability_zone_fallback
is disabled.

This patch adds a configuration option `cross_az_attach`, which is set
to True by default to align with typical Nova behavior. In that case the
AZ is set to `nova`, following the CSI default [1].

When `cross_az_attach` is set to False, the AZ for the volume is set to
the cluster AZ value.
This ensures that the volume is created in the same zone as the workers,
preventing failures.

[1] https://github.com/kubernetes/cloud-provider-openstack/blob/d228854cf58e7b4ed93d5e7ba68ab639450e3449/docs/cinder-csi-plugin/using-cinder-csi-plugin.md#supported-parameters

Relates-to: vexxhost#366
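
The selection rule described in this commit message can be sketched roughly as follows. This is not the code from the patch: the function names are hypothetical, and the only assumptions taken from the text are that `cross_az_attach` defaults to True, that the default AZ is `nova`, and that the result is passed to Cinder CSI as the `availability` StorageClass parameter.

```python
def volume_availability_zone(cluster_az, cross_az_attach=True):
    """Pick the AZ for volumes, following the rule in the commit message."""
    if cross_az_attach:
        # Cross-AZ attach allowed: keep the CSI default zone.
        return "nova"
    # Cross-AZ attach disabled: volumes must live in the cluster's own AZ.
    return cluster_az


def storage_class_parameters(cluster_az, cross_az_attach=True):
    """Build the StorageClass `parameters` map for Cinder CSI (illustrative)."""
    return {"availability": volume_availability_zone(cluster_az, cross_az_attach)}


print(storage_class_parameters("general"))
print(storage_class_parameters("general", cross_az_attach=False))
```
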
3 participants