
CSI: Node.ExpandVolume gets wrong staging path when volume is in use #24465

Open
dani opened this issue Nov 15, 2024 · 2 comments

dani commented Nov 15, 2024

Nomad version

Nomad v1.9.3
BuildDate 2024-11-11T16:35:41Z
Revision d92bf1014886c0ff9f882f4a2691d5ae8ad8131c

Operating system and Environment details

AlmaLinux 9.4
Using Nomad from pre-built Linux AMD64 binaries
Ceph CSI 3.12.2

Issue

Most operations with Ceph RBD volumes work (so I assume my setup is correct), except for one thing: resizing a volume while it's in use (by increasing capacity_min and capacity_max in the volume spec, then registering the volume again with nomad volume register volume.hcl). For example, when I try to resize the postgres-data[1] volume in the "poc" namespace:

Error registering volume: Unexpected response code: 500 (rpc error: unable to update volume: 1 error occurred:
	* CSI.NodeExpandVolume error: node plugin returned an internal error, check the plugin allocation logs for more information: rpc error: code = Internal desc = Failed as missing stash (internal open /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json: no such file or directory))

Logs from the corresponding ceph-csi node show the same error:

2024-11-15 13:47:29.000	E1115 13:47:29.863468       1 utils.go:245] ID: 7874 Req-ID: 0001-0024-cbfda0a8-461a-4577-9be1-e229acb2bac5-0000000000000006-1db44304-55c6-4200-854f-d315a86375db GRPC error: rpc error: code = Internal desc = Failed as missing stash (internal open /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json: no such file or directory)
2024-11-15 13:47:29.000	E1115 13:47:29.863457       1 nodeserver.go:1136] ID: 7874 Req-ID: 0001-0024-cbfda0a8-461a-4577-9be1-e229acb2bac5-0000000000000006-1db44304-55c6-4200-854f-d315a86375db failed to find image metadata: Failed as missing stash (internal open /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json: no such file or directory)

The problem is that the CSI node plugin receives the staging path as /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/, but the real staging path is /local/csi/staging/poc/postgres-data[1]/rw-file-system-single-node-writer/ (the namespace the volume is registered in is missing from the path).

Inside the Ceph RBD node plugin:

sh-5.1# ls -l /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json
ls: cannot access '/local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json': No such file or directory
sh-5.1# ls -l /local/csi/staging/poc/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json
-rw-------. 1 root root 210 Nov 15 13:32 '/local/csi/staging/poc/postgres-data[1]/rw-file-system-single-node-writer/image-meta.json'

I can resize correctly when the volume is not in use. The issue might be related to the fix for this bug.

Other CSI plugins might also be affected, but I can only reproduce it with Ceph (I tried democratic-csi iSCSI against a TrueNAS server with no issue).
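
For reference, the resize is requested with a volume spec roughly like the one below, re-registered with nomad volume register -namespace poc volume.hcl. This is a minimal sketch only: the capacity values are illustrative, and the real spec also carries the external_id plus the Ceph secrets and parameters blocks, which are omitted here.

id        = "postgres-data[1]"
name      = "postgres-data-1"
type      = "csi"
plugin_id = "rbd.ceph-csi"

# Increased from the previous values to request the expansion (sizes illustrative)
capacity_min = "50GiB"
capacity_max = "50GiB"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

mount_options {
  fs_type = "xfs"
}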

Reproduction steps

  • Create a Ceph RBD volume in a specific namespace
  • Run a job using this volume
  • Try to resize the volume while it's in use

Expected Result

The volume should be resized

Actual Result

The Ceph CSI node plugin fails because it receives an incorrect staging path (from Nomad? I'm not a CSI expert).

The only workaround is to stop the job, do the resize, then start the job again.


tgross commented Nov 15, 2024

Hi @dani! I was a little surprised to discover that online resize was supported at all! But it looks like Nomad's CSI library version predates the addition of the VolumeExpansion plugin capability, where plugins can declare whether they support online vs. offline resize.

What you're seeing is definitely weird, but I can't quite tell at a glance what the issue is. The code in the Nomad client that sends the RPC to the plugin is here in (volumeManager).ExpandVolume. The RPC call that's sent from the server to the client is created here in (CSIVolume).NodeExpand. There's not much in the way of logic in either place.
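
To make the mismatch concrete, here's a tiny standalone illustration of the two paths from the report (not Nomad's actual code; the variable names are made up); the only difference is whether the volume's namespace appears between the staging root and the volume ID:

package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	stagingRoot := "/local/csi/staging"
	namespace := "poc"
	volumeID := "postgres-data[1]"
	usage := "rw-file-system-single-node-writer"

	// Path the node plugin complained about in the NodeExpandVolume error (no namespace component).
	reported := filepath.Join(stagingRoot, volumeID, usage)
	// Path where the volume was actually staged at mount time.
	staged := filepath.Join(stagingRoot, namespace, volumeID, usage)

	fmt.Println(reported) // /local/csi/staging/postgres-data[1]/rw-file-system-single-node-writer
	fmt.Println(staged)   // /local/csi/staging/poc/postgres-data[1]/rw-file-system-single-node-writer
}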

"poc" is the namespace, right? The only two ways I could see that missing here are:

  • The volume object in the state store is missing its namespace somehow, so it's empty when we assign it here. You could diagnose that via nomad volume status -namespace poc 'postgres-data[1]' to verify the namespace field is set.
  • The staging point isn't visible to Nomad here. But I don't see a way for that to happen without Nomad never having been able to mount the volume in the first place!

So this will definitely need more investigation. I'll mark it for a closer look.

@tgross tgross moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage Nov 15, 2024
@tgross tgross changed the title Can't resize a Ceph RBD volume when it's in use CSI: Node.ExpandVolume gets wrong staging path when volume is in use Nov 15, 2024

dani commented Nov 15, 2024

Indeed, poc is the namespace where the volume (and the job using it) is created. The namespace is correctly populated when creating the volume:

[dbd@laptop-103 ~]$ nomad volume status -namespace poc 'postgres-data[1]'
ID                   = postgres-data[1]
Name                 = postgres-data-1
Namespace            = poc
External ID          = 0001-0024-cbfda0a8-461a-4577-9be1-e229acb2bac5-0000000000000006-1db44304-55c6-4200-854f-d315a86375db
Plugin ID            = rbd.ceph-csi
Provider             = rbd.csi.ceph.com
Version              = v3.12.2
Capacity             = 47 GiB
Schedulable          = true
Controllers Healthy  = 1
Controllers Expected = 1
Nodes Healthy        = 6
Nodes Expected       = 6
Access Mode          = single-node-writer
Attachment Mode      = file-system
Mount Options        = fs_type: xfs flags: [REDACTED]
Namespace            = poc

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
73ba6c96  4ae656ea  server      31       run      running  1h58m ago  1h57m ago
