kong-controller stops fetching EndpointSlices and updating kong-gateways #6567
Comments
We face a similar issue on KIC 3.3. Kong somehow stops updating the Upstreams. Neither restarting the control plane nor the data plane helps.
Hi @MarkusFlorian79, what version did you downgrade to?
@lindeskar
@MarkusFlorian79 any reason for KIC 3.2.2? Did the later patch versions have problems as well?
I also spotted the issue with KIC 3.2.0 and Kong 3.7.0, and it persists after upgrading to 3.2.2. The problem comes up when a new gateway pod is spawned: the controller keeps using the old IP. After recreating the controller pod, all gateways receive valid config immediately (a sketch of this workaround follows).
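A minimal sketch of that workaround, assuming the controller runs as a Deployment named kong-controller in the kong namespace (both names are assumptions, not taken from this thread):

```shell
# Recreate the controller pod so it rediscovers the current gateway pods.
kubectl rollout restart deployment/kong-controller -n kong
# Wait for the new controller pod to become ready.
kubectl rollout status deployment/kong-controller -n kong
```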
@jjchambl
Same issue for us, described in further detail here: Kong/gateway-operator#140 (comment)
@azdobylak are you running 2 KIC replicas or just one? We had seen this behavior in Kong/KIC v2 with Gateway Discovery turned on and running KIC with 2 replicas. We haven't experienced this in v3 yet.
Is there an existing issue for this?
Current Behavior
A few times per day we see the kong-controller enter a state where it stops fetching EndpointSlices and therefore stops updating the kong-gateways with new configuration. The bad state lasts for about 30 minutes before an unknown trigger makes everything go back to normal.
This affects traffic going through the kong-gateways if upstreams changed during the bad kong-controller state, because the kong-gateways are never made aware of those changes.
The cluster where the issue occurs makes heavy use of spot Nodes, which leads to frequent updates of the available Pods in Services (a way to observe this churn is sketched below).
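One way to watch that churn directly, with nothing assumed beyond standard kubectl:

```shell
# Stream EndpointSlice updates across all namespaces as spot Nodes come and go.
kubectl get endpointslices -A --watch
```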
The issue also affects Kong itself if a kong-gateway Pod is replaced during the bad state. Logs show that the kong-controller is not aware of the new kong-gateway Pod and still tries to reach the old one.
--
During the issue, two errors are constantly logged:
not ready for proxying: no configuration available (empty configuration present)
Failed to fill in defaults for plugin
The second error references a previously running kong-gateway Pod, not the newly added one. I think these errors are symptoms of a greater issue where something in the kong-controller gets stuck.
Debug logs show that "Fetching EndpointSlices" and "Sending configuration to gateway clients" stop entirely during the bad state (a quick check for this is sketched below).
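A minimal check for this, assuming the controller Deployment is named kong-controller in the kong namespace and log_level is set to debug; during the bad state the count drops to zero:

```shell
# Count recent occurrences of the two debug messages; a healthy controller
# logs both regularly. Deployment name and namespace are assumptions.
kubectl logs deployment/kong-controller -n kong --since=10m \
  | grep -cE 'Fetching EndpointSlices|Sending configuration to gateway clients'
```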
Expected Behavior
The kong-controller keeps fetching EndpointSlices and keeps updating the kong-gateways.
Steps To Reproduce
Values for the ingress chart:
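The actual values file was attached to the original report and is not reproduced here. Purely as an illustration, a sketch of how such an installation might be set up with debug logging enabled, assuming the kong/ingress umbrella chart (the value path is an assumption):

```shell
# Hypothetical install sketch; the chart value path is an assumption based on
# the kong/ingress umbrella chart, not the reporter's actual values.
helm repo add kong https://charts.konghq.com
helm upgrade --install kong kong/ingress -n kong --create-namespace \
  --set controller.ingressController.env.log_level=debug
```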
Kong Ingress Controller version
kong/kubernetes-ingress-controller:3.3
from the Helm chart (the digest matches 3.3.1)
Kubernetes version
Anything else?
Debug log filtered for kong-gateway Pod IPs:
kong-controller-debug-2.txt
172.19.7.208: kong-gateway running
172.19.0.152: kong-gateway running
172.19.2.48: kong-gateway stopped 14:56
172.19.1.164: kong-gateway started 14:56 and stopped 17:10
172.19.0.154: kong-gateway started 17:10
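A filtered view like the one above can be produced along these lines (deployment name, namespace, and time window are assumptions):

```shell
# Filter controller debug logs for the gateway Pod IPs listed above.
kubectl logs deployment/kong-controller -n kong --since=6h \
  | grep -E '172\.19\.(7\.208|0\.152|2\.48|1\.164|0\.154)'
```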