[Bug] Interface Netkit is not found in the VM #5020

Open · 3 tasks done
Anghille opened this issue Jan 29, 2025 · 0 comments

Describe the bug

Following the Kata documentation, I successfully installed Kata Containers and its corresponding runtimes on my Kubernetes cluster. But upon creating a pod from a deployment, I got the following error:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: "Unsupported network interface: netkit": unknown

The error is self-explanatory: using Cilium with netkit is apparently not supported yet. Will this be added to Firecracker?

Along the same lines, we use Cilium with XDP (eXpress Data Path), BIGTCP64, and eBPF. I was wondering whether Firecracker would work in that setup?
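
In the meantime, I assume a temporary workaround (not confirmed) is to revert Cilium's bpf.datapathMode to its documented default of veth, so that kata-fc pods get an interface type the shim already handles. A minimal sketch of that change to the Helm values below:

# Hypothetical workaround, assuming only the netkit interface type is the problem:
# switch the pod datapath back to the default veth mode and re-deploy the chart.
bpf:
  datapathMode: "veth"   # default; "netkit" is what triggers the shim error above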

To Reproduce

  1. Install any Kubernetes version 1.28+ and deploy Cilium with the following Helm values:
# -- (string) Kubernetes service host
k8sServiceHost: "<API Version>"
# -- (string) Kubernetes service port
k8sServicePort: "6443"

logOptions:
  format: json

# -- Configure the client side rate limit for the agent and operator
#
# If the amount of requests to the Kubernetes API server exceeds the configured
# rate limit, the agent and operator will start to throttle requests by delaying
# them until there is budget or the request times out.
k8sClientRateLimit:
  # -- (int) The sustained request rate in requests per second.
  # @default -- 5 for k8s up to 1.26. 10 for k8s version 1.27+
  qps: 80
  # -- (int) The burst request rate in requests per second.
  # The rate limiter will allow short bursts with a higher rate.
  # @default -- 10 for k8s up to 1.26. 20 for k8s version 1.27+
  burst: 150

# -- Roll out cilium agent pods automatically when configmap is updated.
rollOutCiliumPods: true

tolerations:
  - key: "CriticalAddonsOnly"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "node.cilium.io/agent-not-ready"
    operator: "Exists"
    effect: "NoSchedule"

priorityClassName: "system-cluster-critical"

# -- Enable installation of PodCIDR routes between worker
# nodes if worker nodes share a common L2 network segment.
autoDirectNodeRoutes: true

# -- Enable bandwidth manager to optimize TCP and UDP workloads and allow
# for rate-limiting traffic from individual Pods with EDT (Earliest Departure
# Time) through the "kubernetes.io/egress-bandwidth" Pod annotation.
bandwidthManager:
  # -- Enable bandwidth manager infrastructure (also prerequirement for BBR)
  enabled: true
 
  # -- Activate BBR TCP congestion control for Pods
  bbr: true

# -- Configure L2 announcements
l2announcements:
  # -- Enable L2 announcements
  enabled: true
  # -- If a lease is not renewed for X duration, the current leader is considered dead, a new leader is picked
  leaseDuration: 20s
  # -- The interval at which the leader will renew the lease
  leaseRenewDeadline: 10s
  # -- The timeout between retries if renewal fails
  leaseRetryPeriod: 2s

pmtuDiscovery:
  # -- Enable path MTU discovery to send ICMP fragmentation-needed replies to
  # the client
  enabled: true

bpf:
  # -- Configure the level of aggregation for monitor notifications.
  # Valid options are none, low, medium, maximum.
  monitorAggregation: none

  # -- (bool) Enable native IP masquerade support in eBPF
  # @default -- `false`
  masquerade: true

  # Needed to avoid some externalIP error with native routing
  #  Disable ExternalIP mitigation (CVE-2020-8554)
  disableExternalIPMitigation: true

  # -- (bool) Configure the eBPF-based TPROXY to reduce reliance on iptables rules
  # for implementing Layer 7 policy.
  # @default -- `false`
  tproxy: true

  # -- (list) Configure explicitly allowed VLAN id's for bpf logic bypass.
  # [0] will allow all VLAN id's without any filtering.
  # @default -- `[]`
  vlanBypass: [0]

  # -- (string) Mode for Pod devices for the core datapath (veth, netkit, netkit-l2, lb-only)
  # @default -- `veth`
  datapathMode: "netkit"

# -- Enable BPF clock source probing for more efficient tick retrieval.
# default: false
bpfClockProbe: true

cni:
  exclusive: false

# -- Specify which network interfaces can run the eBPF datapath. This means
# that a packet sent from a pod to a destination outside the cluster will be
# masqueraded (to an output device IPv4 address), if the output device runs the
# program. When not specified, probing will automatically detect devices that have
# a non-local route. This should be used only when autodetection is not suitable.
# devices: ""

# -- Enable Kubernetes EndpointSlice feature in Cilium if the cluster supports it.
enableK8sEndpointSlice: true

ciliumEndpointSlice:
  # -- Enable Cilium EndpointSlice feature.
  enabled: true

envoyConfig:
  # -- Enable CiliumEnvoyConfig CRD
  # CiliumEnvoyConfig CRD can also be implicitly enabled by other options.
  enabled: true

gatewayAPI:
  # -- Enable support for Gateway API in cilium
  # This will automatically set enable-envoy-config as well.
  # default: false
  enabled: true

externalIPs:
  # -- Enable ExternalIPs service support.
  enabled: true

# -- Configure socket LB
socketLB:
  # -- Enable socket LB
  enabled: true
  # -- Disable socket lb for non-root ns. This is used to enable Istio routing rules.
  # hostNamespaceOnly: false
  # -- Enable terminating pod connections to deleted service backends.
  terminatePodConnections: true

hubble:
  # -- Enable Hubble (true by default).
  enabled: true

  metrics:
    enabled:
      - dns:query;ignoreAAAA
      - drop
      - tcp
      - flow
      - icmp
      - http
  
  redact:
    enabled: false
    http:
      urlQuery: true
      userInfo: true
      headers:
        deny:
          - Authorization
          - Proxy-Authorization
  relay:
    # -- Enable Hubble Relay (requires hubble.enabled=true)
    enabled: true

    # -- Roll out Hubble Relay pods automatically when configmap is updated.
    rollOutPods: true

    # -- Node tolerations for envoy scheduling to nodes with taints
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
    tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoSchedule"
  ui:
    enabled: true
    # -- Roll out Hubble-ui pods automatically when configmap is updated.
    rollOutPods: true

    # -- Node tolerations for envoy scheduling to nodes with taints
    # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
    tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoSchedule"

ipam:
  # -- Configure IP Address Management mode.
  # ref: https://docs.cilium.io/en/stable/network/concepts/ipam/
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["10.32.0.0/15"]
    clusterPoolIPv4MaskSize: 24

# -- Configure the eBPF-based ip-masq-agent
ipMasqAgent:
  enabled: true
  config:
    #nonMasqueradeCIDRs:
    #  - 10.32.0.0/15 # podCIDR
    #  - 10.42.0.0/15 # serviceCIDR
    masqLinkLocal: false
    masqLinkLocalIPv6: false

# -- Configure the kube-proxy replacement in Cilium BPF datapath
# Valid options are "true", "false", "disabled" (deprecated), "partial" (deprecated), "strict" (deprecated).
# ref: https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/
kubeProxyReplacement: true

# -- healthz server bind address for the kube-proxy replacement.
# To enable set the value to '0.0.0.0:10256' for all ipv4
# addresses and this '[::]:10256' for all ipv6 addresses.
# By default it is disabled.
kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"

localRedirectPolicy: false

# -- Configure maglev consistent hashing
# see. https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/#maglev
maglev:
  # -- tableSize is the size (parameter M) for the backend table of one
  # service entry
  tableSize: 16381

  # -- hashSeed is the cluster-wide base64 encoded seed for the hashing
  hashSeed: "QeSiyNQZ0sCFgeQ0"

# -- (string) Allows to explicitly specify the IPv4 CIDR for native routing.
# When specified, Cilium assumes networking for this CIDR is preconfigured and
# hands traffic destined for that range to the Linux network stack without
# applying any SNAT.
# Generally speaking, specifying a native routing CIDR implies that Cilium can
# depend on the underlying networking stack to route packets to their
# destination. To offer a concrete example, if Cilium is configured to use
# direct routing and the Kubernetes CIDR is included in the native routing CIDR,
# the user must configure the routes to reach pods, either manually or by
# setting the auto-direct-node-routes flag.
ipv4NativeRoutingCIDR: "10.32.0.0/15" # default ""

# -- cilium-monitor sidecar.
monitor:
  # -- Enable the cilium-monitor sidecar.
  # default: false
  enabled: true

# -- Configure service load balancing
loadBalancer:
  # -- algorithm is the name of the load balancing algorithm for backend
  # selection e.g. random or maglev
  algorithm: maglev

  # -- mode is the operation mode of load balancing for remote backends
  # e.g. snat, dsr, hybrid
  # see. https://github.com/cilium/cilium/issues/8979
  # see. https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/
  # default: snat
  mode: dsr

  # -- acceleration is the option to accelerate service handling via XDP
  # Applicable values can be: disabled (do not use XDP), native (XDP BPF
  # program is run directly out of the networking driver's early receive
  # path), or best-effort (use native mode XDP acceleration on devices
  # that support it).
  # default: disabled
  acceleration: disabled # nodes are not compatible :(

  # See. https://github.com/cilium/cilium/issues/29900
  # -- dsrDispatch configures whether IP option or IPIP encapsulation is
  # used to pass a service IP and port to remote backend
  dsrDispatch: opt #opt or geneve

  # -- L7 LoadBalancer
  l7:
    # -- Enable L7 service load balancing via envoy proxy.
    # The request to a k8s service, which has specific annotation e.g. service.cilium.io/lb-l7,
    # will be forwarded to the local backend proxy to be load balanced to the service endpoints.
    # Please refer to docs for supported annotations for more configuration.
    #
    # Applicable values:
    #   - envoy: Enable L7 load balancing via envoy proxy. This will automatically set enable-envoy-config as well.
    #   - disabled: Disable L7 load balancing by way of service annotation.
    backend: envoy


## -- Configure N-S k8s service loadbalancing
nodePort:
  # -- Enable the Cilium NodePort service implementation.
  enabled: true

# -- Configure prometheus metrics on the configured port at /metrics
prometheus:
  enabled: true

# Configure Cilium Envoy options.
envoy:
  # -- Enable Envoy Proxy in standalone DaemonSet.
  enabled: true

  # -- Configure Cilium Envoy Prometheus options.
  # Note that some of these apply to either cilium-agent or cilium-envoy.
  prometheus:
    # -- Enable prometheus metrics for cilium-envoy
    enabled: true

  # -- Roll out cilium envoy pods automatically when configmap is updated.
  # default: false
  rollOutPods: true

  debug:
    admin:
      # -- Enable admin interface for cilium-envoy.
      # This is useful for debugging and should not be enabled in production.
      enabled: true

  # -- Node tolerations for envoy scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoSchedule"

# -- Enable native-routing mode or tunneling mode.
# Possible values:
#   - ""
#   - native
#   - tunnel
# @default -- `"tunnel"`
routingMode: "native"

operator:
  # -- Roll out cilium-operator pods automatically when configmap is updated.
  rollOutPods: true

  # -- Number of replicas to run for the cilium-operator deployment
  replicas: 1

  # -- HostNetwork setting
  # default: true
  hostNetwork: true
  
  # -- Node tolerations for envoy scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoSchedule"
    - key: "node.cilium.io/agent-not-ready"
      operator: "Exists"
      effect: "NoSchedule"

nodeinit:
  # -- Enable the node initialization DaemonSet
  enabled: true
  
  # -- Node tolerations for envoy scheduling to nodes with taints
  # ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: "CriticalAddonsOnly"
      operator: "Equal"
      value: "true"
      effect: "NoExecute"
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoSchedule"
  2. Install Kata Containers with the Firecracker runtime:
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-rbac/base/kata-rbac.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml
  3. Run any pod with the Kata Containers Firecracker runtime class (a sketch of the kata-fc RuntimeClass it references follows the pod spec):
apiVersion: v1
kind: Pod
metadata:
  name: webserver
spec:
  runtimeClassName: kata-fc
  containers:
  - name: webserver
    image: nginx:latest
    ports:
    - containerPort: 80
  - name: webwatcher
    image: afakharany/watcher:latest
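
For reference, the kata-fc runtime class used above is normally registered by kata-deploy; if it is missing, a minimal RuntimeClass sketch (handler name assumed to match the kata-deploy default) looks like:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-fc
handler: kata-fc   # assumed: the containerd handler name installed by kata-deploy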

Expected behaviour

The nginx pod should be running as expected.

Environment

Please supply the following information:

  • Firecracker version: I don't know exactly, since it was installed via kata-deploy, but it should be one of the latest releases

  • Host and guest kernel versions: 6.8.0-51 (host)

  • Rootfs used: ext4

  • Architecture: x86_64

  • Any other relevant software versions: Ubuntu 24.04.1 (host OS)

  • Have you searched the Firecracker Issues database for similar problems?

  • Have you read the existing relevant Firecracker documentation?

  • Are you certain the bug being reported is a Firecracker issue?
