fix(k8sprocessor): owner meta deletion grace period #1256
Adds a deferred deletion queue to the owner metadata cache, allowing time for telemetry data to be processed with correct metadata after the owning resource(s) have been deleted. Signed-off-by: Christian Kruse <[email protected]>
There seems to be a large amount of commented-out code in one of the test files that this PR touches. I would suggest deleting it if possible.
op.deleteMu.Lock()
for i, d := range op.deleteQueue {
	if d.ts.Add(gracePeriod).After(now) {
		break
	}
	cutoff = i + 1
}
toDelete := op.deleteQueue[:cutoff]
op.deleteQueue = op.deleteQueue[cutoff:]
op.deleteMu.Unlock()
I feel this is worth its own method, and then the Unlock can be deferred.
@swiatekm-sumo are you cool with yanking the commented-out benchmarks? They had drifted a bit even before this CL. I saw that you had been "using" them in an earlier PR/issue discussion.
The overall approach looks good to me. I'd like:
- Comments for the new functions
- A test that actually measures the deletion delay
- Maybe a manual E2E test in Kubernetes to make sure this is ok? I'm not sure this is necessary, but it's not too difficult to do using the Helm Chart integration tests. Let me know if you want to try it, but if not, I'm ok merging this change as-is.
@swiatekm-sumo Updated the godocs and added tests with assertions that the grace period is honored. I have dug into the E2E tests and may yet validate the change manually, but I had to fight with the test suite a bit to get it up and running.
Closes #1242