
taskrun pods not scheduled due to affinity issues #8422

Open
alanreynosov opened this issue Dec 6, 2024 · 1 comment

Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

alanreynosov commented Dec 6, 2024

Expected Behavior

The TaskRun pod is scheduled and runs.

Actual Behavior

The TaskRun pod is not scheduled because its pod affinity conditions are not met.

Steps to Reproduce the Problem

  1. Create a kn function: func create myfunc -l go
  2. Deploy the kn function: func deploy --remote --registry ttl.sh

Additional Info

  • Kubernetes version:

    Output of kubectl version:

kubectl version
Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.0+k3s1
  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

tkn version
Client version: 0.38.1
Pipeline version: v0.66.0

This is running on a vcluster.

With the latest Tekton version I get this affinity error:

0/8 nodes are available: 1 Too many pods, 3 node(s) didn't match Pod's node affinity/selector, 4 node(s) didn't match pod affinity rules. preemption: 0/8 nodes are available: 1 No preemption victims found for incoming pod, 7 Preemption is not helpful for scheduling.
kubectl get po
NAME                                                READY   STATUS    RESTARTS   AGE
testn-pack-upload-pipeline-run-77rlh-scaffold-pod   0/1     Pending   0          46m
affinity-assistant-82b5fdbd2b-0                     1/1     Running   0          46m
kubectl get po demo-pack-upload-pipeline-run-test-scaffold-pod -ojson | jq -r '.status.conditions[]'        
{
  "lastProbeTime": null,
  "lastTransitionTime": "2024-12-09T16:48:04Z",
  "message": "0/4 nodes are available: 1 node(s) had untolerated taint {node-pool-dev: dev}, 3 node(s) didn't match pod affinity rules. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.",
  "reason": "Unschedulable",
  "status": "False",
  "type": "PodScheduled"
}
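As an aside, the "untolerated taint {node-pool-dev: dev}" part of that message is separate from the affinity problem and can be worked around with a toleration in Tekton's pod template. A minimal sketch (the PipelineRun and Pipeline names are hypothetical, and the taint's effect is not shown in the message, so it is omitted to match any effect):

apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  name: example-run        # hypothetical name
spec:
  pipelineRef:
    name: example-pipeline # hypothetical name
  taskRunTemplate:
    podTemplate:
      tolerations:
        - key: node-pool-dev
          operator: Equal
          value: dev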

Checking the pending pod's affinity conditions, I find this:

  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/component: affinity-assistant
              app.kubernetes.io/instance: affinity-assistant-65ba113202
          topologyKey: kubernetes.io/hostname

So evidently no running pod carries labels matching the required condition.
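A quick way to confirm the mismatch (a sketch; the pod name is taken from the output above, and the label keys assume Tekton's standard affinity-assistant labels):

# Labels on the running affinity-assistant pod:
kubectl get pods -l app.kubernetes.io/component=affinity-assistant --show-labels

# Instance label the pending pod requires:
kubectl get po testn-pack-upload-pipeline-run-77rlh-scaffold-pod \
  -o jsonpath='{.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution[0].labelSelector.matchLabels}'

If the instance label differs between the two outputs, the scheduler can never place the pod.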

Why am I not reporting this at the Knative project repo? Because this works more or less fine on Tekton Pipelines version 0.49.0.
Why not keep using that version? Because runs intermittently finish or get stuck waiting on something; I am not sure what, since there is no affinity report, but when a run fails the tekton-pipelines controller logs an error about a missing pod:

{"severity":"error","timestamp":"2024-12-04T10:11:40.379Z","logger":"tekton-pipelines-controller","caller":"controller/controller.go:566","message":"Reconcile error","commit":"c802069","knative.dev/controller":"github.com.tektoncd.pipeline.pkg.reconciler.taskrun.Reconciler","knative.dev/kind":"tekton.dev.TaskRun","knative.dev/traceid":"8e4d29ab-74fa-49d8-a405-0f318cb59f99","knative.dev/key":"default/devfunc-pack-upload-pipeline-run-228jf-scaffold","duration":0.004444159,"error":"pods \"devfunc-pack-upload-pipeline-run-228jf-scaffold-pod\" not found","stacktrace":"knative.dev/pkg/controller.(*Impl).handleErr\n\tknative.dev/[email protected]/controller/controller.go:566\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/[email protected]/controller/controller.go:543\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/[email protected]/controller/controller.go:491"}

edit:

it also fails when running on GKE
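For anyone blocked by this, one possible workaround (a sketch, not a fix for the underlying bug) is to disable the affinity assistant through Tekton's feature-flags ConfigMap, which removes the pod-affinity requirement at the cost of the co-scheduling guarantee for TaskRuns sharing a workspace. This assumes the default tekton-pipelines namespace:

kubectl patch configmap feature-flags -n tekton-pipelines \
  --type merge -p '{"data":{"disable-affinity-assistant":"true"}}'

On recent releases the newer coschedule flag in the same ConfigMap covers this behavior as well.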

alanreynosov added the kind/bug label Dec 6, 2024
metacoma commented Dec 16, 2024

I have encountered the same issue:

$ tkn version
Client version: 0.39.0
Pipeline version: v0.66.0
Dashboard version: v0.53.0
$ kubectl version
Client Version: v1.31.0
Kustomize Version: v5.4.2
Server Version: v1.31.3+k3s1
$ kubectl get pods
NAME                                                        READY   STATUS    RESTARTS   AGE
affinity-assistant-6177472940-0                             1/1     Running   0          8m42s
my-function-pack-git-pipeline-run-dns8x-fetch-sources-pod   0/1     Pending   0          8m42s

$ kubectl get po my-function-pack-git-pipeline-run-dns8x-fetch-sources-pod  -o yaml
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-12-16T17:18:35Z"
    message: '0/1 nodes are available: 1 node(s) didn''t match pod affinity rules.
      preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: BestEffort
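The same label check applies here (a sketch; pod names are from the output above, and the dot-escaping in jsonpath follows kubectl's convention for label keys):

# Instance label on the running assistant:
kubectl get pod affinity-assistant-6177472940-0 \
  -o jsonpath='{.metadata.labels.app\.kubernetes\.io/instance}'

# Instance label the pending pod requires:
kubectl get pod my-function-pack-git-pipeline-run-dns8x-fetch-sources-pod \
  -o jsonpath='{.spec.affinity.podAffinity.requiredDuringSchedulingIgnoredDuringExecution[0].labelSelector.matchLabels}'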

metacoma added a commit to mindwm/mindwm-gitops that referenced this issue Dec 16, 2024