-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prober should not consider Failed
or Terminating
Machines, Unhealthy
Nodes.
#124
Conversation
/kind enhancement |
/test pull-dependency-watchdog-check-vulnerabilities |
The Gardener project currently lacks enough active contributors to adequately respond to all PRs.
You can:
/lifecycle stale |
The Gardener project currently lacks enough active contributors to adequately respond to all PRs.
You can:
/lifecycle rotten |
@@ -58,30 +60,30 @@ func testDeletePredicateFunc(g *WithT, numWorkers int) bool { | |||
func TestUpdatePredicateFunc(t *testing.T) { | |||
tests := []struct { | |||
title string | |||
oldNumWorkers int | |||
newNumWorkers int | |||
oldWorkerNames []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just write a small function that generates the worker names and retain the old way as that is more flexible and easier to change. Same elsewhere as well.
@@ -150,9 +150,10 @@ func testWeederDedicatedEnvTest(t *testing.T) { | |||
for _, test := range tests { | |||
ctx, cancelFn := context.WithCancel(context.Background()) | |||
testEnv, reconciler := setupWeederEnv(ctx, t, test.apiServerFlags) | |||
ns := testutil.CreateTestNamespace(ctx, g, reconciler.Client, strings.ToLower(test.name)) | |||
testNs := testutil.GenerateRandomAlphanumericString(g, 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do not need to use this utility function any longer. We should perhaps remove this now.
Use https://pkg.go.dev/k8s.io/apimachinery/pkg/util/rand#String instead.
Change this at all places where this utility function is used.
* Check the role/rolebinding in the shoot for the dependency-watchdog-probe related service account in the `kube-system` | ||
namespace. Add rules for listing leases in the `kube-node-lease` namespace. | ||
* Scale down the DWD prober deployment in the garden namespace (in the shooted-seed) and start a local DWD process by | ||
providing the prober config and the kubeconfig of the shooted seed as command line flags - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to add a new line for commands that are enclosed within bash
else the formatting is not going to be correct.
> process. To do that you will have to the following: | ||
> * Create a new `DNSRecord` containing the new shoot Kube API server endpoint. This will create a provider specific | ||
route (e.g. In case of AWS it will create a AWS-Route53 endpoint) | ||
> * To ensure that a call to this endpoint is routed to the Kube API server of the shoot do the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need all this? Please revisit and change the docs.
internal/prober/errors/errors.go
Outdated
// ProbeError is the error type for probe errors. It contains the error code, the cause of the error, and the error message. | ||
// It is used by prober to record its last error and is currently only used for unit tests. | ||
type ProbeError struct { | ||
Code ErrorCode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add doc strings for all exposed fields
internal/prober/fakes/k8s/client.go
Outdated
|
||
func (c *fakeClient) getRecordedObjectCollectionError(method ClientMethod, namespace string, labelSelector labels.Selector, objGVK schema.GroupVersionKind) error { | ||
for _, errRecord := range c.errorRecords { | ||
if errRecord.method == method && errRecord.resourceNamespace == namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can combine these into a single IF condition and perhaps add some comments as well.
if errRecord.method == method &&
errRecord.resourceNamespace == namespace &&
(errRecord.resourceGVK == objGVK || // if it is a direct GVK match
(labelSelector == nil && errRecord.labels == nil) || // if there is no label selector and no labels in the error record
labelSelector != nil && labelSelector.Matches(errRecord.labels)) { // if there is a label selector, and it matches the labels in the error record
return errRecord.err
}
} | ||
|
||
func TestAPIServerProbeFailure(t *testing.T) { | ||
testCases := []probeTestCase{ | ||
{name: "Unknown error is returned by api server", discoveryError: errFoo}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you seems to have removed this test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed. The other cases are sufficient.
internal/test/miscellaneous.go
Outdated
} | ||
|
||
// GenerateRandomAlphanumericString generates a random alphanumeric string of the given length. | ||
func GenerateRandomAlphanumericString(g *WithT, length int) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is no longer required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
LGTM label has been added. Git tree hash: e83c81d405913128845165aff0a9e9367cc3f0f8
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: unmarshall The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
This PR does the following:-
Prober will no longer consider
Failed
orTerminating
Machines in candidate lease calculation.Prober will no longer consider
Unhealthy
Nodes in candidate lease calculation. The criteria for a node to beUnhealthy
is as follows:-a.) The Shoot resource has a section to define node Conditions per worker to be checked by MCM for marking a machine
Unknown
See:- https://github.com/gardener/gardener/blob/98fb871eb6045f69715d9f8b39dd84f9637f1fa0/example/90-shoot.yaml#L138. If not specified DWD will use the default node conditions that MCM uses (See :- https://github.com/gardener/machine-controller-manager/blob/11c2983db70c3c13a7a95d51f9de9896e69d4599/pkg/util/provider/app/options/options.go#L60).b.) If any of these node conditions is set to
True
, then DWD will consider the nodeUnhealthy
and not consider it in candidate lease calculation.Replace mocks with fake clients for unit tests.
If there is only one candidate lease present in the cluster, DWD probe will not perform any scaling operations.
Enabled Pprof endpoint to allow profiling of DWD.
Which issue(s) this PR fixes:
Fixes #123 #126
Fixes part of #110
Special notes for your reviewer:
Release note: