Role Name: roce_backend
The role adds to a Kubernetes cluster the ability to use additional RoCE-enabled NICs, based on SR-IOV virtual functions, in pod deployments.
- SR-IOV supported server platform
- SR-IOV enabled in the NIC firmware (for Mellanox adapters, please refer to https://community.mellanox.com/s/article/howto-configure-sr-iov-for-connectx-4-connectx-5-with-kvm--ethernet-x#jive_content_id_I_Enable_SRIOV_on_the_Firmware)
- Kubernetes cluster is deployed by DeepOps deployment tools
Consult your hardware documentation for the BIOS-specific settings required to enable support for SR-IOV networking.
The role requires an additional Ethernet fabric for high-performance pod network interfaces. A scale-out L2 fabric with VXLAN-BGP-EVPN over Mellanox Onyx is recommended. A network deployment example with switch configuration files can be found at https://github.com/Mellanox/roce_backend_at_scale.
The settable variables for the role must be provided in vars/main.yml.
- SR-IOV resources for high-performance pod network interfaces - sriov_resources. Each entry of sriov_resources must have:
  - pf_name - physical adapter interface name
  - vlan_id - VLAN ID for virtual function interfaces
  - res_name - resource pool name
  - network_name - network name for annotation in the pod YAML configuration
An example sriov_resources definition for four interfaces is provided below.
sriov_resources:
  - pf_name: ens9f0
    vlan_id: 111
    res_name: "sriov_111"
    network_name: sriov111
  - pf_name: ens10f0
    vlan_id: 112
    res_name: "sriov_112"
    network_name: sriov112
  - pf_name: ens11f0
    vlan_id: 113
    res_name: "sriov_113"
    network_name: sriov113
  - pf_name: ens12f0
    vlan_id: 114
    res_name: "sriov_114"
    network_name: sriov114
- Hardware adapter vendor - vendor. Default - 15b3.
vendor: 15b3
- Virtual function device ID - dev_id. Default - 101c (MT28908 Family [ConnectX-6 Virtual Function]). Detailed information about all Mellanox device IDs can be found at https://devicehunt.com/view/type/pci/vendor/15B3.
Supported values
- 101c - MT28908 Family [ConnectX-6 Virtual Function]
- 101a - MT28800 Family [ConnectX-5 Ex Virtual Function]
- 1018 - MT27800 Family [ConnectX-5 Virtual Function]
- 1016 - MT27710 Family [ConnectX-4 Lx Virtual Function]
- 1014 - MT27700 Family [ConnectX-4 Virtual Function]
dev_id: 101c
- Number of virtual functions to activate - num_vf.
num_vf: 8
- Mellanox OFED download location and image file name - mofed_site_place, mofed_file_name.
mofed_site_place: "MLNX_OFED-4.6-1.0.1.1"
mofed_file_name: "MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.iso"
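To illustrate how vendor, dev_id, pf_name, and res_name relate to one another, the fragment below is a minimal sketch of a device plugin configuration for the first sriov_resources entry. It assumes the selector-based config.json format of the upstream SR-IOV network device plugin; the ConfigMap name and namespace are assumptions, and the configuration the role actually generates may differ.

```yaml
# Hypothetical ConfigMap for the SR-IOV network device plugin (not taken from the role).
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config          # assumed name
  namespace: kube-system        # assumed namespace
data:
  config.json: |
    {
      "resourceList": [
        {
          "resourceName": "sriov_111",
          "selectors": {
            "vendors": ["15b3"],
            "devices": ["101c"],
            "pfNames": ["ens9f0"]
          }
        }
      ]
    }
```

In this sketch, resourceName corresponds to res_name, vendors to vendor, devices to dev_id, and pfNames to pf_name, so each sriov_resources entry would yield one resource pool.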
During installation, the role uses the DeepOps config/inventory file to deploy and provision the Kubernetes components.
The role installs the following components:
- Mellanox OFED with virtual function activation
- Python modules
- Multus CNI for attaching multiple network interfaces to a pod
- Universal SR-IOV device plugin with specific configuration
- Universal SR-IOV CNI
- Specific Network provisioning with NetworkAttachmentDefinition
- DHCP CNI for providing IP addresses from the existing infrastructure to SR-IOV based NICs in pod deployments
- The latest version of Kubeflow/MPI-Operator
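For orientation, a NetworkAttachmentDefinition for the sriov111 network might look roughly like the sketch below. It combines the SR-IOV CNI, the VLAN ID, and DHCP-based IP assignment listed above. The resource prefix intel.com (believed to be the upstream device plugin default) and the cniVersion value are assumptions; the definition the role actually provisions may differ.

```yaml
# Hypothetical NetworkAttachmentDefinition for one sriov_resources entry.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov111                                            # network_name
  annotations:
    # Ties the network to the device plugin resource pool (res_name).
    # The "intel.com" prefix is an assumption; check your node's allocatable resources.
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_111
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "sriov111",
      "type": "sriov",
      "vlan": 111,
      "ipam": { "type": "dhcp" }
    }
```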
ansible-playbook -l k8s-cluster playbooks/k8s-cluster/roce.yaml
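Once the playbook has completed, a pod can request an SR-IOV interface by naming the network in its annotations and requesting the matching resource. The manifest below is a minimal sketch, not part of the role: the pod name, container image, and the intel.com resource prefix are assumptions, and the IPC_LOCK capability is included because RDMA applications commonly need to lock memory.

```yaml
# Hypothetical test pod attaching one SR-IOV virtual function from the sriov111 network.
apiVersion: v1
kind: Pod
metadata:
  name: roce-test-pod                        # hypothetical name
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov111    # network_name from sriov_resources
spec:
  containers:
    - name: app
      image: ubuntu:18.04                    # placeholder image
      command: ["sleep", "infinity"]
      securityContext:
        capabilities:
          add: ["IPC_LOCK"]                  # commonly needed for RDMA memory registration
      resources:
        requests:
          intel.com/sriov_111: 1             # assumed prefix + res_name
        limits:
          intel.com/sriov_111: 1
```

The exact resource name to request can be confirmed from the allocatable resources reported by `kubectl describe node`.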
BSD
author: Vitaliy Razinkov
email: [email protected]
company: Mellanox Technologies