Attached devices are not reported to kubelet upon device plugin restarts #392

Open
zshi-redhat opened this issue Oct 25, 2021 · 6 comments

@zshi-redhat
Collaborator

zshi-redhat commented Oct 25, 2021

What happened?

The device plugin (with a linkType selector) reports X devices to kubelet, and Y of them (Y < X) are allocated to SR-IOV Pods.
The device plugin then restarts and reports only X-Y devices to kubelet.

The cause is that the linkType attribute (retrieved via netlink in the host namespace) is empty when devices are attached to SR-IOV pods, so the attached devices are filtered out when the linkType selector runs.
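
The filtering effect described above can be illustrated with a minimal sketch. The `device` type, the `filterByLinkType` helper, and the PCI addresses are hypothetical stand-ins, not the plugin's actual types or selector code; the sketch only assumes that a selector keeps devices whose link type matches one of the configured values.

```go
package main

import "fmt"

// device is a hypothetical, simplified stand-in for the plugin's device record.
type device struct {
	PCIAddress string
	LinkType   string // empty when the link type could not be read on the host
}

// filterByLinkType keeps only devices whose LinkType equals one of the wanted values.
func filterByLinkType(devs []device, wanted []string) []device {
	var kept []device
	for _, d := range devs {
		for _, w := range wanted {
			if d.LinkType == w {
				kept = append(kept, d)
				break
			}
		}
	}
	return kept
}

func main() {
	devs := []device{
		{PCIAddress: "0000:3b:02.0", LinkType: "ether"},
		{PCIAddress: "0000:3b:02.1", LinkType: ""}, // attached to a pod, link type unknown
		{PCIAddress: "0000:3b:02.2", LinkType: "ether"},
	}
	kept := filterByLinkType(devs, []string{"ether"})
	fmt.Printf("reported %d of %d devices\n", len(kept), len(devs))
}
```

The device with the empty link type never matches `ether`, which is how X devices shrink to X-Y after a restart.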

What did you expect to happen?

The device plugin reports X devices to kubelet after restarting.

Anything else we need to know?

Netlink interface is used to retrieve linkType in the network namespace where device plugin runs:

var (
        // getLinkByName is a function that retrieves nl.Link object according to
        // a provided netdev name.
        getLinkByName = nl.LinkByName
)

// GetLinkAttrs returns a net device's link attributes.
func GetLinkAttrs(ifName string) (*nl.LinkAttrs, error) {
        link, err := getLinkByName(ifName)
        if err != nil {
                return nil, fmt.Errorf("error getting link attributes for net device %s %v", ifName, err)
        }
        return link.Attrs(), nil
}

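        // Caller excerpt: linkType stays empty when ifName is empty,
        // because the netlink lookup is skipped entirely.
        linkType := ""
        if len(ifName) > 0 {
                la, err := utils.GetLinkAttrs(ifName)
                if err != nil {
                        return nil, err
                }
                linkType = la.EncapType
        }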
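
The indirection through a function variable in the snippet above makes the lookup easy to stub. Below is a minimal, self-contained sketch of that pattern; `linkAttrs`, `lookupLinkAttrs`, and `linkTypeFor` are hypothetical names, and the sketch assumes the interface name comes back empty once the netdev has left the host namespace.

```go
package main

import (
	"errors"
	"fmt"
)

// linkAttrs is a hypothetical stand-in for the netlink library's link attributes.
type linkAttrs struct {
	EncapType string
}

// lookupLinkAttrs is a package-level function variable so tests can replace it,
// mirroring the getLinkByName indirection in the quoted code.
var lookupLinkAttrs = func(ifName string) (*linkAttrs, error) {
	return nil, errors.New("no real netlink lookup in this sketch")
}

// linkTypeFor mirrors the quoted caller logic: no interface name means
// no lookup, so the link type stays empty.
func linkTypeFor(ifName string) (string, error) {
	if ifName == "" {
		return "", nil
	}
	la, err := lookupLinkAttrs(ifName)
	if err != nil {
		return "", fmt.Errorf("error getting link attributes for net device %s: %w", ifName, err)
	}
	return la.EncapType, nil
}

func main() {
	// Stub the lookup, as a unit test would.
	lookupLinkAttrs = func(ifName string) (*linkAttrs, error) {
		return &linkAttrs{EncapType: "ether"}, nil
	}
	free, _ := linkTypeFor("ens1f1v0")
	attached, _ := linkTypeFor("") // assumption: no netdev name is visible on the host
	fmt.Printf("free=%q attached=%q\n", free, attached)
}
```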

Component Versions

Please fill in the below table with the version numbers of components used.

| Component | Version |
| --- | --- |
| SR-IOV Network Device Plugin | master |

Config Files

Config file locations may be config dependent.

Device pool config file location (Try '/etc/pcidp/config.json')
{
  "resourceList": [
    {
      "resourceName": "intelnics",
      "selectors": {
        "vendors": [
          "8086"
        ],
        "pfNames": [
          "ens1f1#0-9"
        ],
        "rootDevices": [
          "0000:3b:00.1"
        ],
        "linkTypes": [
          "ether"
        ],
        "IsRdma": false,
        "NeedVhostNet": false
      },
      "SelectorObj": null
    },
    {
      "resourceName": "mlxnics",
      "selectors": {
        "vendors": [
          "15b3"
        ],
        "pfNames": [
          "ens8f1"
        ],
        "rootDevices": [
          "0000:d8:00.1"
        ],
        "linkTypes": [
          "ether"
        ],
        "IsRdma": false,
        "NeedVhostNet": false
      },
      "SelectorObj": null
    }
  ]
}
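
For readers who want to inspect which resource pools carry a `linkTypes` selector, here is a small sketch that decodes a config of the shape shown above. The struct names are hypothetical and cover only a subset of the fields; the plugin's own config types may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// selectors covers only a subset of the selector fields from the config above.
type selectors struct {
	Vendors     []string `json:"vendors"`
	PfNames     []string `json:"pfNames"`
	RootDevices []string `json:"rootDevices"`
	LinkTypes   []string `json:"linkTypes"`
}

// resource is a hypothetical, simplified view of one resourceList entry.
type resource struct {
	ResourceName string    `json:"resourceName"`
	Selectors    selectors `json:"selectors"`
}

type resourceConfList struct {
	ResourceList []resource `json:"resourceList"`
}

// parseConfig decodes the device pool config; unknown fields are ignored.
func parseConfig(raw []byte) (*resourceConfList, error) {
	var conf resourceConfList
	if err := json.Unmarshal(raw, &conf); err != nil {
		return nil, fmt.Errorf("parsing device pool config: %w", err)
	}
	return &conf, nil
}

func main() {
	raw := []byte(`{"resourceList":[{"resourceName":"intelnics",
		"selectors":{"vendors":["8086"],"pfNames":["ens1f1#0-9"],"linkTypes":["ether"]}}]}`)
	conf, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	for _, r := range conf.ResourceList {
		fmt.Printf("%s: linkTypes=%v\n", r.ResourceName, r.Selectors.LinkTypes)
	}
}
```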
@zshi-redhat
Collaborator Author

It seems all the other device selectors (driver, vendor, device, pciAddress, pfNames, rootDevices) are still retrievable even when the devices are attached (not in the host namespace); linkType is the exception.

Q: Is there a way to get linkType when VFs are moved to a container namespace?

@zshi-redhat
Collaborator Author

/cc @adrianchiris any thoughts?

@adrianchiris
Contributor

One idea I had is to leverage the PodResourcesLister service exposed by kubelet [1] to add the "missing" devices.

[1] https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources
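
A rough sketch of how such a merge might look, independent of the actual gRPC client: `allocatedLister` is a hypothetical interface standing in for whatever wraps kubelet's pod resources endpoint, and `staticLister`, the resource name, and the PCI addresses are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// allocatedLister is a hypothetical abstraction over a client that reports
// device IDs already handed out to pods for a given resource.
type allocatedLister interface {
	AllocatedDeviceIDs(resourceName string) ([]string, error)
}

// staticLister is a fake lister backed by a map, used here in place of a real client.
type staticLister map[string][]string

func (s staticLister) AllocatedDeviceIDs(resourceName string) ([]string, error) {
	return s[resourceName], nil
}

// mergeDevices returns the sorted union of discovered and already-allocated device IDs.
func mergeDevices(discovered []string, lister allocatedLister, resourceName string) ([]string, error) {
	allocated, err := lister.AllocatedDeviceIDs(resourceName)
	if err != nil {
		return nil, fmt.Errorf("listing allocated devices for %s: %w", resourceName, err)
	}
	seen := make(map[string]struct{}, len(discovered)+len(allocated))
	for _, id := range discovered {
		seen[id] = struct{}{}
	}
	for _, id := range allocated {
		seen[id] = struct{}{}
	}
	merged := make([]string, 0, len(seen))
	for id := range seen {
		merged = append(merged, id)
	}
	sort.Strings(merged)
	return merged, nil
}

func main() {
	discovered := []string{"0000:3b:02.0", "0000:3b:02.2"}
	lister := staticLister{"intelnics": {"0000:3b:02.1"}}
	merged, err := mergeDevices(discovered, lister, "intelnics")
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)
}
```

Keeping the lister behind an interface would let the discovery code be tested without a running kubelet.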

@zshi-redhat
Collaborator Author

> One idea I had is to leverage the PodResourcesLister service exposed by kubelet [1] to add the "missing" devices.
>
> [1] https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources

I think it could be an option, but we would need to add k8s API access to the device plugin.
I was trying to find a way to get the link type reliably via a kernel interface, in which case it could also easily be used in the Probe function if needed.

@adrianchiris
Contributor

> I think it could be an option, but we would need to add k8s API access to the device plugin.

It's gRPC to the kubelet socket, so no access to the k8s API is needed (at least by the device plugin).

> I was trying to find a way to get the link type reliably via a kernel interface, in which case it could also easily be used in the Probe function if needed.

While writing this, I remembered that we might be able to leverage devlink for this. If we switch to devlink for determining the link type, we should be OK.

It requires the device driver to support this, though, so we need to check Intel NICs and others as well.
In the worst case, we could fall back to determining the link type from the netdev itself.

[root@xxx]# devlink port show
pci/0000:06:00.0/65535: type eth netdev enp2s0f0 flavour physical port 0
pci/0000:06:00.1/131071: type eth netdev enp2s0f1 flavour physical port 1
pci/0000:06:00.3/327680: type eth netdev enp2s0f0vf0 flavour <unknown flavour> port 0
pci/0000:06:00.4/393216: type eth netdev enp6s0f0v1 flavour <unknown flavour> port 0
pci/0000:06:00.5/458752: type eth netdev enp6s0f0v2 flavour <unknown flavour> port 0
pci/0000:06:00.6/524288: type eth netdev enp6s0f0v3 flavour <unknown flavour> port 0
pci/0000:06:00.7/589824: type eth netdev enp6s0f0v4 flavour <unknown flavour> port 0
pci/0000:06:01.0/655360: type eth netdev enp6s0f0v5 flavour <unknown flavour> port 0
pci/0000:06:01.1/720896: type eth netdev enp6s0f0v6 flavour <unknown flavour> port 0
pci/0000:06:01.2/786432: type eth netdev enp6s0f0v7 flavour <unknown flavour> port 0
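
To make the output format above concrete, here is a small sketch that extracts the PCI address, port type, and netdev name from one `devlink port show` line. `devlinkPort` and `parseDevlinkPort` are hypothetical names, and parsing CLI text is only for illustration; a real implementation would more likely query devlink over netlink directly.

```go
package main

import (
	"fmt"
	"strings"
)

// devlinkPort holds the fields of interest from one "devlink port show" line.
type devlinkPort struct {
	PCIAddress string
	Type       string
	Netdev     string
}

// parseDevlinkPort parses a line such as
// "pci/0000:06:00.3/327680: type eth netdev enp2s0f0vf0 flavour <unknown flavour> port 0".
func parseDevlinkPort(line string) (devlinkPort, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 || !strings.HasPrefix(fields[0], "pci/") {
		return devlinkPort{}, fmt.Errorf("not a PCI devlink port line: %q", line)
	}
	// The handle looks like "pci/<address>/<port index>:".
	handle := strings.Split(strings.TrimSuffix(fields[0], ":"), "/")
	if len(handle) != 3 {
		return devlinkPort{}, fmt.Errorf("unexpected port handle: %q", fields[0])
	}
	port := devlinkPort{PCIAddress: handle[1]}
	// Scan the remaining tokens for the "type" and "netdev" keys.
	for i := 1; i+1 < len(fields); i++ {
		switch fields[i] {
		case "type":
			port.Type = fields[i+1]
		case "netdev":
			port.Netdev = fields[i+1]
		}
	}
	return port, nil
}

func main() {
	line := "pci/0000:06:00.3/327680: type eth netdev enp2s0f0vf0 flavour <unknown flavour> port 0"
	port, err := parseDevlinkPort(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s type=%s netdev=%s\n", port.PCIAddress, port.Type, port.Netdev)
}
```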

@zshi-redhat
Collaborator Author

Some old NICs may not support devlink queries, as mentioned in PR #395.
I proposed a temporary fix that removes the default setting for linkType in the sriov operator webhook: k8snetworkplumbingwg/sriov-network-operator#203

SchSeba added a commit to SchSeba/sriov-network-device-plugin that referenced this issue Nov 23, 2021
Try to use devlink api to configure the linkType to be `ether`.
If the card doesn't support this we fall back to netlink.

This commit is a partial fix for k8snetworkplumbingwg#392
There are still cards that don't support the netlink api, like intel xv710

Signed-off-by: Sebastian Sch <[email protected]>