-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Handle Race Condition for Istio Gateway Secret #2078
base: main
Are you sure you want to change the base?
fix: Handle Race Condition for Istio Gateway Secret #2078
Conversation
FYI: we also had some checkmarx related issues: #2075 |
WDYT about my comment on the panic? Should the goroutine exit with a bootstrap error? @c-pius |
kcpClient client.Client | ||
type Handler struct { | ||
kcpClient client.Client | ||
kcpClientset *kubernetes.Clientset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it make more sense only use Clientset
in this case, then this package is decoupled with controller-runtime
func (gsh *GatewaySecretHandler) StartRootCertificateWatch(clientset *kubernetes.Clientset, | ||
log logr.Logger, | ||
) { | ||
func (h *Handler) StartRootCertificateWatch() { | ||
ctx, cancel := context.WithCancel(context.TODO()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use TODO context, in this case, use background
} | ||
|
||
func (h *Handler) handleAlreadyCreatedRootCertificate(ctx context.Context) { | ||
rootCASecrets, err := h.kcpClientset.CoreV1().Secrets(istioNamespace).List(ctx, apimetav1.ListOptions{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use .List
but not .Get
?
FieldSelector: fields.OneTermEqualSelector(apimetav1.ObjectNameField, kcpRootSecretName).String(), | ||
}) | ||
if err != nil { | ||
log.Error(err, "unable to start watching root certificate") | ||
h.log.Error(err, "unable to start watching root certificate") | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you panic in goruntine, it just crashed this goruntine itself, but in your case, if can't watching root certificate, the whole lifecycle manager should not started, right? That's basically the reason why no secret found, the lifecycle manager still running, instead it should crashed and new instance get created.
In this case, you need to send error back to go main goruntine and panic there. So you should use channel here.
Another option is not design whole StartRootCertificateWatch
as goruntine, instead, before WatchEvents
, you can still running in the main goruntine, you only need to make WatchEvents
as a dedicated thread, it can simplify your panic handling.
if len(rootCASecrets.Items) != 1 { | ||
h.log.Error(errExpectedOneRootCASecret, errExpectedOneRootCASecret.Error(), | ||
"found", len(rootCASecrets.Items)) | ||
panic(fmt.Errorf("%w: found %d", errExpectedOneRootCASecret, len(rootCASecrets.Items))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, if you panic in a goruntine, the main thread is still running. https://github.com/kyma-project/lifecycle-manager/pull/2078/files#r1865020057
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check my comments
Fix for #2004