
[KubeUP][Scaleout] 6 system pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true #1334

Closed
q131172019 opened this issue Feb 4, 2022 · 2 comments

q131172019 (Collaborator) commented Feb 4, 2022

What happened:
Ran kube-up.sh with the mizar network provider, without setting DISABLE_NETWORK_SERVICE_SUPPORT=true, to start a 1 x 1 scale-out cluster. 6 pods (coredns, coredns-default, event-exporter, kubernetes-dashboard, l7-default-backend, metrics-server) failed to start and are not in the Running state, which differs slightly from issue #1337.

$ kubectl get pods -AT |grep -v Running
TENANT   NAMESPACE     NAME                                                      HASHKEY               READY   STATUS              RESTARTS   AGE
system   kube-system   coredns-75c65c444f-ncdth                                  8112594621559564708   0/1     ContainerCreating   0          19m
system   kube-system   coredns-default-carltenant-012722-tp-1-fc5d7b556-fkbtt    5639473102057060281   0/1     CrashLoopBackOff    6          19m
system   kube-system   event-exporter-v0.2.5-868dff6494-pkld4                    7344285700895751079   0/1     CrashLoopBackOff    6          19m
system   kube-system   kubernetes-dashboard-848965699-w2mx5                      7124903576439153021   0/1     ContainerCreating   0          19m
system   kube-system   l7-default-backend-6497bc5bf6-z4ls5                       820844122596721447    0/1     CrashLoopBackOff    6          19m
system   kube-system   metrics-server-v0.3.3-5f994fcb77-qwgtj                    9113278416233267155   1/2     CrashLoopBackOff    6          19m
$ kubectl get pods -AT |more
TENANT   NAMESPACE     NAME                                                      HASHKEY               READY   STATUS              RESTARTS   AGE
system   default       mizar-daemon-carltenant-012722-rp-1-minion-group-592l     8536270369083156335   1/1     Running             0          12m
system   default       mizar-daemon-carltenant-012722-tp-1-master                8536270369083156335   1/1     Running             0          18m
system   default       mizar-operator-carltenant-012722-tp-1-master              4897509646032053362   1/1     Running             0          12m
system   kube-system   arktos-network-controller-carltenant-012722-tp-1-master   6039142322731055076   1/1     Running             0          18m
system   kube-system   coredns-75c65c444f-ncdth                                  8112594621559564708   0/1     ContainerCreating   0          20m
system   kube-system   coredns-default-carltenant-012722-tp-1-fc5d7b556-fkbtt    5639473102057060281   0/1     CrashLoopBackOff    6          20m
system   kube-system   etcd-empty-dir-cleanup-carltenant-012722-rp-1-master      6416262497714047119   1/1     Running             0          11m
system   kube-system   etcd-empty-dir-cleanup-carltenant-012722-tp-1-master      6416262497714047119   1/1     Running             0          19m
system   kube-system   etcd-server-carltenant-012722-rp-1-master                 6268924934619826925   1/1     Running             0          12m
system   kube-system   etcd-server-carltenant-012722-tp-1-master                 1228365644270276432   1/1     Running             0          19m
system   kube-system   etcd-server-events-carltenant-012722-rp-1-master          7474375747461145051   1/1     Running             0          12m
system   kube-system   etcd-server-events-carltenant-012722-tp-1-master          6619750444035813212   1/1     Running             0          19m
system   kube-system   event-exporter-v0.2.5-868dff6494-pkld4                    7344285700895751079   0/1     CrashLoopBackOff    6          20m
system   kube-system   fluentd-gcp-scaler-74b46b8776-b56zl                       9018864907502037401   1/1     Running             0          20m
system   kube-system   fluentd-gcp-v3.2.0-5pcgf                                  8343423451879686010   1/1     Running             0          12m
system   kube-system   fluentd-gcp-v3.2.0-cc7st                                  5427642711625129074   1/1     Running             0          19m
system   kube-system   fluentd-gcp-v3.2.0-nxjnc                                  8568787939644734734   1/1     Running             0          12m
system   kube-system   heapster-v1.6.0-beta.1-57874ccf9d-khdd5                   4598786992029713871   2/2     Running             2          20m
system   kube-system   kube-addon-manager-carltenant-012722-rp-1-master          5014754618061431440   1/1     Running             0          12m
system   kube-system   kube-addon-manager-carltenant-012722-tp-1-master          5014754618061431440   1/1     Running             0          19m
system   kube-system   kube-apiserver-carltenant-012722-rp-1-master              245430988817997797    1/1     Running             0          8m29s
system   kube-system   kube-apiserver-carltenant-012722-tp-1-master              7349134437032311414   1/1     Running             0          19m
system   kube-system   kube-controller-manager-carltenant-012722-rp-1-master     4971968134422270638   1/1     Running             0          9m2s
system   kube-system   kube-controller-manager-carltenant-012722-tp-1-master     2644362116934241034   1/1     Running             1          19m
system   kube-system   kube-dns-autoscaler-748b78969c-8dg4z                      4173996508600644137   1/1     Running             0          20m
system   kube-system   kube-proxy-carltenant-012722-rp-1-master                  2955930285773073542   1/1     Running             0          12m
system   kube-system   kube-proxy-carltenant-012722-rp-1-minion-group-592l       2955930285773073542   1/1     Running             0          12m
system   kube-system   kube-proxy-carltenant-012722-tp-1-master                  8282377931504847538   1/1     Running             0          19m
system   kube-system   kube-scheduler-carltenant-012722-tp-1-master              4099904641534921225   1/1     Running             6          19m
system   kube-system   kubernetes-dashboard-848965699-w2mx5                      7124903576439153021   0/1     ContainerCreating   0          20m
system   kube-system   l7-default-backend-6497bc5bf6-z4ls5                       820844122596721447    0/1     CrashLoopBackOff    6          20m
system   kube-system   l7-lb-controller-v1.2.3-carltenant-012722-rp-1-master     5013348416365870850   1/1     Running             0          12m
system   kube-system   l7-lb-controller-v1.2.3-carltenant-012722-tp-1-master     5013348416365870850   1/1     Running             0          18m
system   kube-system   metrics-server-v0.3.3-5f994fcb77-qwgtj                    9113278416233267155   1/2     CrashLoopBackOff    6          20m
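To separate the two failure modes above (ContainerCreating vs. CrashLoopBackOff), inspecting pod events and previous-container logs is the usual next step. The commands below are a generic sketch using standard kubectl subcommands and pod names taken from the listing; they are not output captured from this cluster, and any tenant flag this Arktos build may require is omitted here:

$ # Pods stuck in ContainerCreating: the Events section usually shows CNI/sandbox errors
$ kubectl -n kube-system describe pod coredns-75c65c444f-ncdth
$ # Pods in CrashLoopBackOff: check the log of the last crashed container
$ kubectl -n kube-system logs l7-default-backend-6497bc5bf6-z4ls5 --previous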

What you expected to happen:

These 6 pods should be in the Running state.

How to reproduce it (as minimally and precisely as possible):
The code is the POC code plus mizar PR1320.

$ unset KUBE_GCE_MASTER_PROJECT KUBE_GCE_NODE_PROJECT KUBE_GCI_VERSION KUBE_GCE_MASTER_IMAGE KUBE_GCE_NODE_IMAGE KUBE_CONTAINER_RUNTIME NETWORK_PROVIDER DISABLE_NETWORK_SERVICE_SUPPORT

$ export KUBEMARK_NUM_NODES=100 NUM_NODES=1 SCALEOUT_CLUSTER=true SCALEOUT_TP_COUNT=1 SCALEOUT_RP_COUNT=1 RUN_PREFIX=carltenant-012722 NETWORK_PROVIDER=mizar

$ export MASTER_DISK_SIZE=500GB MASTER_ROOT_DISK_SIZE=500GB KUBE_GCE_ZONE=us-west2-b MASTER_SIZE=n1-highmem-32 \
    NODE_SIZE=n1-highmem-16 NODE_DISK_SIZE=500GB GOPATH=$HOME/go KUBE_GCE_ENABLE_IP_ALIASES=true \
    KUBE_GCE_PRIVATE_CLUSTER=true CREATE_CUSTOM_NETWORK=true KUBE_GCE_INSTANCE_PREFIX=${RUN_PREFIX} \
    KUBE_GCE_NETWORK=${RUN_PREFIX} ENABLE_KCM_LEADER_ELECT=false ENABLE_SCHEDULER_LEADER_ELECT=false \
    ETCD_QUOTA_BACKEND_BYTES=8589934592 SHARE_PARTITIONSERVER=false LOGROTATE_FILES_MAX_COUNT=200 \
    LOGROTATE_MAX_SIZE=200M KUBE_ENABLE_APISERVER_INSECURE_PORT=true KUBE_ENABLE_PROMETHEUS_DEBUG=true \
    KUBE_ENABLE_PPROF_DEBUG=true TEST_CLUSTER_LOG_LEVEL=--v=2 HOLLOW_KUBELET_TEST_LOG_LEVEL=--v=2 \
    GCE_REGION=us-west2-b

$ ./cluster/kube-up.sh
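As the title notes, the failure is specific to runs where DISABLE_NETWORK_SERVICE_SUPPORT=true is not set. A minimal sketch of the contrast case, assuming everything else in the environment above stays the same (issue #1337 covers a similar run with slightly different failures):

$ export DISABLE_NETWORK_SERVICE_SUPPORT=true   # the flag left unset in the repro above
$ ./cluster/kube-up.sh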

Anything else we need to know?:
N/A

Environment:

  • Arktos version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools: kubeup 1 x 1 scale-out
  • Network plugin and version (if this is a network-related bug):
  • Others:
q131172019 changed the title [KubeUP][Scaleout] 6 pods (coredns, coredns-default, event-exporter, kube-proxy, l7-default-backend, metrics-server) failed to start → [KubeUP][Scaleout] 6 pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true Feb 4, 2022
q131172019 changed the title [KubeUP][Scaleout] 6 pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true → [KubeUP][Scaleout] 6 system pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true Feb 4, 2022
Sindica added P2 and removed P2 labels Feb 7, 2022
Sindica (Collaborator) commented Feb 7, 2022

Resolving as we have separate issues to track.

Sindica closed this as completed Feb 7, 2022
Sindica (Collaborator) commented Feb 7, 2022

Duplicate issues:
#1339
#1333
#1302
#1301
#1300
