-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] virtio-net: Guest can't NACK LRO #3905
Comments
Related: #2433 |
Hi, |
I've been experimenting with a FreeBSD guest for a GitHub Actions CI use case and ran into this as well. In case it helps anyone, here are two workarounds:
@bchalios I saw your post about this in slack:
Would you accept a PR for this? I'm willing to work on a proper fix but wanted to ask before digging in. |
Hey @acj, we had already started working on this. Should be able to post a PR in a week or so. I can ping you once it's out so you'll let me know what you think. Does that work for you? |
I'm not really sure that it's not impacted. eBPF support (which is currently all the rage on Linux) seems to fail miserably because of this bug (and some others). |
Hi @pkit. Could you please provide a reproducer for the issue. We do have some work in progress to fix this, just got de-prioritized due to other work. One thing missing was an integration test for making sure our fix works in Linux systems. |
Currently, we assume that the guest virtio-net driver acknowledges all the offload features we offer to it and then subsequently set the corresponding TAP features. This might be the case for the Linux guest kernel drivers we are currently using, but it is not necessarily the case. FreeBSD for example, might try to NACK some of them in some cases: firecracker-microvm#3905. This commit, changes the Firecracker implementation of VirtIO device to only setup the TAP offload features that correspond to the ones that the guest driver acknowledged. Signed-off-by: Nikita Zakirov <[email protected]> Signed-off-by: Babis Chalios <[email protected]>
Currently, we assume that the guest virtio-net driver acknowledges all the offload features we offer to it and then subsequently set the corresponding TAP features. This might be the case for the Linux guest kernel drivers we are currently using, but it is not necessarily the case. FreeBSD for example, might try to NACK some of them in some cases: firecracker-microvm#3905. This commit, changes the Firecracker implementation of VirtIO device to only setup the TAP offload features that correspond to the ones that the guest driver acknowledged. Signed-off-by: Nikita Zakirov <[email protected]> Signed-off-by: Babis Chalios <[email protected]>
Currently, we assume that the guest virtio-net driver acknowledges all the offload features we offer to it and then subsequently set the corresponding TAP features. This might be the case for the Linux guest kernel drivers we are currently using, but it is not necessarily the case. FreeBSD for example, might try to NACK some of them in some cases: firecracker-microvm#3905. This commit, changes the Firecracker implementation of VirtIO device to only setup the TAP offload features that correspond to the ones that the guest driver acknowledged. Signed-off-by: Nikita Zakirov <[email protected]> Signed-off-by: Babis Chalios <[email protected]>
Currently, we assume that the guest virtio-net driver acknowledges all the offload features we offer to it and then subsequently set the corresponding TAP features. This might be the case for the Linux guest kernel drivers we are currently using, but it is not necessarily the case. FreeBSD for example, might try to NACK some of them in some cases: #3905. This commit, changes the Firecracker implementation of VirtIO device to only setup the TAP offload features that correspond to the ones that the guest driver acknowledged. Signed-off-by: Nikita Zakirov <[email protected]> Signed-off-by: Babis Chalios <[email protected]>
#4680 changes Firecracker to properly negotiate features with the guest. I will close this now, but please let us know if this problem persists. |
Currently, we assume that the guest virtio-net driver acknowledges all the offload features we offer to it and then subsequently set the corresponding TAP features. This might be the case for the Linux guest kernel drivers we are currently using, but it is not necessarily the case. FreeBSD for example, might try to NACK some of them in some cases: firecracker-microvm#3905. This commit, changes the Firecracker implementation of VirtIO device to only setup the TAP offload features that correspond to the ones that the guest driver acknowledged. Signed-off-by: Nikita Zakirov <[email protected]> Signed-off-by: Babis Chalios <[email protected]>
Describe the bug
When FreeBSD boots in Firecracker, it disables LRO features because it doesn't want to allocate 64 kB buffers. (Maybe this should change in FreeBSD, but that's how our virtio-net driver currently does things.)
Firecracker computes the
acked_features
value but seems to completely ignore it(?); in any case, Firecracker assumes that the guest can handle >1536 byte segments despite FreeBSD disabling the relevant options. This results in the device hanging since Firecracker attempts (and fails) repeatedly to write a packet which doesn't fit onto the guest I/O ring.To Reproduce
I'm not sure why, but github always seems to trigger this while I've never seen it with other networking. Possibly because github is very close to my test instance in us-east-1?
Expected behaviour
Firecracker should pay attention to the features accepted by the guest, and not send large segments if the guest has said it doesn't want them.
Firecracker should probably also drop packets rather than entering an infinite loop trying to write them into a buffer which they don't fit into. Obviously this isn't a trivial change since buffer space might open up later; I don't know enough about these bits to suggest the right solution here.
Environment
Firecracker @ 1ca6f8c plus PVH support patches. Linux 5.15 host; FreeBSD 14.0 guest; amd64 architecture.
Additional context
Checks
The text was updated successfully, but these errors were encountered: