Asynchronous synchronization of Beyla cache #1358
select {
case <-time.After(mp.cfg.SyncTimeout):
	klog().Warn("kubernetes cache has not been synced after timeout. The kubernetes attributes might be incomplete."+
		" Consider increasing the BEYLA_KUBE_INFORMERS_SYNC_TIMEOUT value", "timeout", mp.cfg.SyncTimeout)
case err, ok := <-done:
	if ok {
		return nil, fmt.Errorf("failed to initialize Kubernetes informers: %w", err)
	}
We hadn't noticed a bug here.
If the timeout fires, the "informers" variable returned later would be nil, because it is only assigned once the InitInformers
function returns successfully from another goroutine.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1358      +/-   ##
==========================================
- Coverage   79.06%   72.06%   -7.01%
==========================================
  Files         145      144       -1
  Lines       14648    14661      +13
==========================================
- Hits        11581    10565    -1016
- Misses       2489     3395     +906
- Partials      578      701     +123
Alleviates the following issues: #1349 and #1354
The Beyla cache service did not accept connections until it had a complete copy of the Kubernetes cache.
In big clusters (~1000 nodes), that meant each cache instance took several minutes to initialize completely, and Beyla instances kept reporting errors in the meantime.
This PR enables the service as soon as the process starts and keeps accepting Beyla connections while the cache is still synchronizing. Beyla clients do not receive the synchronization signal until the cache service is fully synchronized.
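The behavior described above can be sketched roughly as follows. This is an illustrative sketch only, not the implementation in this PR: `CacheService`, `Event`, `MarkSynced` and `Subscribe` are hypothetical names, and the network layer is replaced by plain channels. Clients are accepted and receive updates right away, while the "synced" signal is withheld until the service marks its own cache as synchronized.

```go
package main

import "sync"

// Event is a hypothetical message sent to a Beyla client.
type Event struct {
	Kind string // "update" or "synced"
}

// CacheService is a hypothetical cache service that serves clients
// while its own cache is still synchronizing.
type CacheService struct {
	once   sync.Once
	synced chan struct{} // closed once the initial sync completes
}

func NewCacheService() *CacheService {
	return &CacheService{synced: make(chan struct{})}
}

// MarkSynced signals that the service holds a complete cache copy.
// It is safe to call more than once.
func (s *CacheService) MarkSynced() {
	s.once.Do(func() { close(s.synced) })
}

// Subscribe accepts a client immediately and forwards updates to it.
// The "synced" event is emitted once, and only after MarkSynced.
func (s *CacheService) Subscribe(updates <-chan Event) <-chan Event {
	out := make(chan Event)
	go func() {
		defer close(out)
		synced := s.synced
		for {
			select {
			case ev, ok := <-updates:
				if !ok {
					return // update source closed, end the stream
				}
				out <- ev
			case <-synced:
				out <- Event{Kind: "synced"}
				synced = nil // a nil channel never fires, so this case runs once
			}
		}
	}()
	return out
}
```

With this shape, a client that connects early starts receiving updates at once, and can treat the later "synced" event as the point from which its local view is complete.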