CSI: Node.ExpandVolume gets wrong staging path when volume is in use #24465
Comments
Hi @dani! I was a little surprised to discover that online resize was supported at all! But it looks like Nomad's CSI library version is older than the addition of the Capabilities for

What you're seeing is definitely weird, but I can't quite tell at a glance what the issue is. The code in the Nomad client that sends the RPC to the plugin is here in

"poc" is the namespace, right? The only two ways I could see that missing here are:

So this will definitely need more investigation. I'll mark it for a closer look.
Indeed, poc is the namespace where the volume (and the job using it) is created. The namespace is correctly populated when creating the volume
Nomad version
Operating system and Environment details
AlmaLinux 9.4
Using Nomad from pre-built Linux AMD64 binaries
Ceph CSI 3.12.2
Issue
Most operations with Ceph RBD volumes work (so I guess my setup is correct), except for one thing: resizing a volume while it's in use (by altering min_capacity and max_capacity, then registering the volume again with nomad volume register volume.hcl). For example, if I try to resize the postgres-data[1] volume, in the "poc" namespace:
Logs from the corresponding ceph-csi node show the same error
The problem is that the CSI node gets the staging path as
/local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/
but the real staging path is
/local/csi/staging/poc/postgres-data[1]/rw-file-system-single-node-writer/
(the name of the namespace the volume is registered in is missing).
Inside the Ceph RBD node
I can resize correctly when the volume is not in use. The issue might be related to the fix for this bug
Other CSI plugins may also be affected, but I can only reproduce it with Ceph (I tried democratic-csi iSCSI against a TrueNAS server and had no issue).
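To make the mismatch between the two staging paths above concrete, here is a minimal Python sketch. It is not Nomad's code: the helper name staging_path and its parameters are hypothetical, and the layout (root, "staging", optional namespace, volume ID, usage mode) is only inferred from the two paths reported in this issue.

```python
import posixpath

# Hypothetical helper, NOT Nomad's actual implementation. The path layout is
# inferred from the two paths quoted in this issue.
def staging_path(root, volume_id, usage, namespace=None):
    """Build a staging path, optionally including the volume's namespace."""
    parts = [root, "staging"]
    if namespace:
        # The namespace segment is what appears to be dropped on ExpandVolume.
        parts.append(namespace)
    parts += [volume_id, usage]
    return posixpath.join(*parts)

ROOT = "/local/csi"
USAGE = "rw-file-system-single-node-writer"

# Path where the volume is really staged (namespace included).
expected = staging_path(ROOT, "postgres-data[1]", USAGE, namespace="poc")
# Path the CSI node receives during the online resize (namespace missing).
received = staging_path(ROOT, "postgres-data[1]", USAGE)

print(expected)
print(received)
```

The two results differ only by the "poc" segment, which matches the symptom: the plugin is asked to expand a filesystem at a path where nothing is staged.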
Reproduction steps
Expected Result
The volume should be resized
Actual Result
The Ceph CSI node fails because it gets an incorrect staging path (from Nomad? I'm not a CSI expert).
The only workaround is to stop the job, do the resize, then start the job again.