
[KubeUP][Scaleout] 6 system pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true #1334

Closed
q131172019 opened this issue Feb 4, 2022 · 2 comments

q131172019 (Collaborator) commented Feb 4, 2022

What happened:
Ran kube-up.sh with the mizar network provider, without setting DISABLE_NETWORK_SERVICE_SUPPORT=true, to start a 1 x 1 scale-out cluster. 6 pods (coredns, coredns-default, event-exporter, kubernetes-dashboard, l7-default-backend, metrics-server) failed to start and are not in the Running state, which differs slightly from issue #1337.

$ kubectl get pods -AT |grep -v Running
TENANT   NAMESPACE     NAME                                                      HASHKEY               READY   STATUS              RESTARTS   AGE
system   kube-system   coredns-75c65c444f-ncdth                                  8112594621559564708   0/1     ContainerCreating   0          19m
system   kube-system   coredns-default-carltenant-012722-tp-1-fc5d7b556-fkbtt    5639473102057060281   0/1     CrashLoopBackOff    6          19m
system   kube-system   event-exporter-v0.2.5-868dff6494-pkld4                    7344285700895751079   0/1     CrashLoopBackOff    6          19m
system   kube-system   kubernetes-dashboard-848965699-w2mx5                      7124903576439153021   0/1     ContainerCreating   0          19m
system   kube-system   l7-default-backend-6497bc5bf6-z4ls5                       820844122596721447    0/1     CrashLoopBackOff    6          19m
system   kube-system   metrics-server-v0.3.3-5f994fcb77-qwgtj                    9113278416233267155   1/2     CrashLoopBackOff    6          19m
$ kubectl get pods -AT |more
TENANT   NAMESPACE     NAME                                                      HASHKEY               READY   STATUS              RESTARTS   AGE
system   default       mizar-daemon-carltenant-012722-rp-1-minion-group-592l     8536270369083156335   1/1     Running             0          12m
system   default       mizar-daemon-carltenant-012722-tp-1-master                8536270369083156335   1/1     Running             0          18m
system   default       mizar-operator-carltenant-012722-tp-1-master              4897509646032053362   1/1     Running             0          12m
system   kube-system   arktos-network-controller-carltenant-012722-tp-1-master   6039142322731055076   1/1     Running             0          18m
system   kube-system   coredns-75c65c444f-ncdth                                  8112594621559564708   0/1     ContainerCreating   0          20m
system   kube-system   coredns-default-carltenant-012722-tp-1-fc5d7b556-fkbtt    5639473102057060281   0/1     CrashLoopBackOff    6          20m
system   kube-system   etcd-empty-dir-cleanup-carltenant-012722-rp-1-master      6416262497714047119   1/1     Running             0          11m
system   kube-system   etcd-empty-dir-cleanup-carltenant-012722-tp-1-master      6416262497714047119   1/1     Running             0          19m
system   kube-system   etcd-server-carltenant-012722-rp-1-master                 6268924934619826925   1/1     Running             0          12m
system   kube-system   etcd-server-carltenant-012722-tp-1-master                 1228365644270276432   1/1     Running             0          19m
system   kube-system   etcd-server-events-carltenant-012722-rp-1-master          7474375747461145051   1/1     Running             0          12m
system   kube-system   etcd-server-events-carltenant-012722-tp-1-master          6619750444035813212   1/1     Running             0          19m
system   kube-system   event-exporter-v0.2.5-868dff6494-pkld4                    7344285700895751079   0/1     CrashLoopBackOff    6          20m
system   kube-system   fluentd-gcp-scaler-74b46b8776-b56zl                       9018864907502037401   1/1     Running             0          20m
system   kube-system   fluentd-gcp-v3.2.0-5pcgf                                  8343423451879686010   1/1     Running             0          12m
system   kube-system   fluentd-gcp-v3.2.0-cc7st                                  5427642711625129074   1/1     Running             0          19m
system   kube-system   fluentd-gcp-v3.2.0-nxjnc                                  8568787939644734734   1/1     Running             0          12m
system   kube-system   heapster-v1.6.0-beta.1-57874ccf9d-khdd5                   4598786992029713871   2/2     Running             2          20m
system   kube-system   kube-addon-manager-carltenant-012722-rp-1-master          5014754618061431440   1/1     Running             0          12m
system   kube-system   kube-addon-manager-carltenant-012722-tp-1-master          5014754618061431440   1/1     Running             0          19m
system   kube-system   kube-apiserver-carltenant-012722-rp-1-master              245430988817997797    1/1     Running             0          8m29s
system   kube-system   kube-apiserver-carltenant-012722-tp-1-master              7349134437032311414   1/1     Running             0          19m
system   kube-system   kube-controller-manager-carltenant-012722-rp-1-master     4971968134422270638   1/1     Running             0          9m2s
system   kube-system   kube-controller-manager-carltenant-012722-tp-1-master     2644362116934241034   1/1     Running             1          19m
system   kube-system   kube-dns-autoscaler-748b78969c-8dg4z                      4173996508600644137   1/1     Running             0          20m
system   kube-system   kube-proxy-carltenant-012722-rp-1-master                  2955930285773073542   1/1     Running             0          12m
system   kube-system   kube-proxy-carltenant-012722-rp-1-minion-group-592l       2955930285773073542   1/1     Running             0          12m
system   kube-system   kube-proxy-carltenant-012722-tp-1-master                  8282377931504847538   1/1     Running             0          19m
system   kube-system   kube-scheduler-carltenant-012722-tp-1-master              4099904641534921225   1/1     Running             6          19m
system   kube-system   kubernetes-dashboard-848965699-w2mx5                      7124903576439153021   0/1     ContainerCreating   0          20m
system   kube-system   l7-default-backend-6497bc5bf6-z4ls5                       820844122596721447    0/1     CrashLoopBackOff    6          20m
system   kube-system   l7-lb-controller-v1.2.3-carltenant-012722-rp-1-master     5013348416365870850   1/1     Running             0          12m
system   kube-system   l7-lb-controller-v1.2.3-carltenant-012722-tp-1-master     5013348416365870850   1/1     Running             0          18m
system   kube-system   metrics-server-v0.3.3-5f994fcb77-qwgtj                    9113278416233267155   1/2     CrashLoopBackOff    6          20m
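To separate the two failure modes above (ContainerCreating vs. CrashLoopBackOff), inspecting pod events and previous-container logs is the usual next step. The commands below are a generic sketch using standard kubectl subcommands and pod names taken from the listing; they are not output captured from this cluster, and any tenant flag this Arktos build may require is omitted here:

$ # Pods stuck in ContainerCreating: the Events section usually shows CNI/sandbox errors
$ kubectl -n kube-system describe pod coredns-75c65c444f-ncdth
$ # Pods in CrashLoopBackOff: check the log of the last crashed container
$ kubectl -n kube-system logs l7-default-backend-6497bc5bf6-z4ls5 --previous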

What you expected to happen:

These 6 pods should be in the Running state.

How to reproduce it (as minimally and precisely as possible):
The code is the POC code plus mizar PR1320.

$ unset KUBE_GCE_MASTER_PROJECT KUBE_GCE_NODE_PROJECT KUBE_GCI_VERSION KUBE_GCE_MASTER_IMAGE KUBE_GCE_NODE_IMAGE KUBE_CONTAINER_RUNTIME NETWORK_PROVIDER DISABLE_NETWORK_SERVICE_SUPPORT

$ export KUBEMARK_NUM_NODES=100 NUM_NODES=1 SCALEOUT_CLUSTER=true SCALEOUT_TP_COUNT=1 SCALEOUT_RP_COUNT=1 RUN_PREFIX=carltenant-012722 NETWORK_PROVIDER=mizar

$ export MASTER_DISK_SIZE=500GB MASTER_ROOT_DISK_SIZE=500GB KUBE_GCE_ZONE=us-west2-b MASTER_SIZE=n1-highmem-32 \
    NODE_SIZE=n1-highmem-16 NODE_DISK_SIZE=500GB GOPATH=$HOME/go KUBE_GCE_ENABLE_IP_ALIASES=true \
    KUBE_GCE_PRIVATE_CLUSTER=true CREATE_CUSTOM_NETWORK=true KUBE_GCE_INSTANCE_PREFIX=${RUN_PREFIX} \
    KUBE_GCE_NETWORK=${RUN_PREFIX} ENABLE_KCM_LEADER_ELECT=false ENABLE_SCHEDULER_LEADER_ELECT=false \
    ETCD_QUOTA_BACKEND_BYTES=8589934592 SHARE_PARTITIONSERVER=false LOGROTATE_FILES_MAX_COUNT=200 \
    LOGROTATE_MAX_SIZE=200M KUBE_ENABLE_APISERVER_INSECURE_PORT=true KUBE_ENABLE_PROMETHEUS_DEBUG=true \
    KUBE_ENABLE_PPROF_DEBUG=true TEST_CLUSTER_LOG_LEVEL=--v=2 HOLLOW_KUBELET_TEST_LOG_LEVEL=--v=2 \
    GCE_REGION=us-west2-b

$ ./cluster/kube-up.sh
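As the title notes, the failure is specific to runs where DISABLE_NETWORK_SERVICE_SUPPORT=true is not set. A minimal sketch of the contrast case, assuming everything else in the environment above stays the same (issue #1337 covers a similar run with slightly different failures):

$ export DISABLE_NETWORK_SERVICE_SUPPORT=true   # the flag left unset in the repro above
$ ./cluster/kube-up.sh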

Anything else we need to know?:
N/A

Environment:

  • Arktos version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools: kubeup 1 x 1 scale-out
  • Network plugin and version (if this is a network-related bug):
  • Others:
q131172019 changed the title [KubeUP][Scaleout] 6 pods (coredns, coredns-default, event-exporter, kube-proxy, l7-default-backend, metrics-server) failed to start → [KubeUP][Scaleout] 6 pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true Feb 4, 2022
q131172019 changed the title [KubeUP][Scaleout] 6 pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true → [KubeUP][Scaleout] 6 system pods failed to start when mizar without setting DISABLE_NETWORK_SERVICE_SUPPORT=true Feb 4, 2022
Sindica added P2 and removed P2 labels Feb 7, 2022
Sindica (Collaborator) commented Feb 7, 2022

Resolving as we have separate issues to track.

Sindica closed this as completed Feb 7, 2022
Sindica (Collaborator) commented Feb 7, 2022

Duplicate issues:
#1339
#1333
#1302
#1301
#1300
