DNSPolicy scale test #615

Draft
wants to merge 1 commit into main
Conversation

@mikenairn (Member) commented Jan 13, 2025

Adds a DNSPolicy-specific scale test using kube-burner.

Part of #928

Based on the existing scale test, but with a focus on DNSPolicy and shared hostnames being updated by multiple dns operator instances.

The workload will create multiple instances of the dns operator in separate namespaces (kuadrant-dns-operator-x), and multiple test namespaces (scale-test-x) that the corresponding dns operator is configured to watch. The number of dns operator instances and test namespaces created is determined by the JOB_ITERATIONS environment variable.
In each test namespace a test app and service are deployed, and one or more gateways are created, as determined by the NUM_GWS environment variable. The number of listeners added to each gateway is determined by the NUM_LISTENERS environment variable.
Each listener hostname is generated from the listener number and the KUADRANT_ZONE_ROOT_DOMAIN environment variable. In each test namespace a dns provider credential is also created; its type is determined by the DNS_PROVIDER environment variable, and additional environment variables may need to be set depending on the provider type.
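As a rough illustration of how those variables drive the workload, the kube-burner job definition could look something like the fragment below. This is a sketch only: the job name, template file name and variable plumbing are assumptions for illustration, not the exact contents of the config added in this PR.

```yaml
jobs:
  # One iteration per dns operator instance / test namespace (scale-test-x) pair.
  - name: dnspolicy-scale-test
    jobIterations: {{ .JOB_ITERATIONS }}
    namespacedIterations: true
    namespace: scale-test
    objects:
      # Hypothetical template name; replicas controls the number of gateways per namespace.
      - objectTemplate: gateway.yaml
        replicas: {{ .NUM_GWS }}
        inputVars:
          NUM_LISTENERS: "{{ .NUM_LISTENERS }}"
          KUADRANT_ZONE_ROOT_DOMAIN: "{{ .KUADRANT_ZONE_ROOT_DOMAIN }}"
```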

Requires:

Comments/Thoughts:

  • Kube-burner does not have the concept of running workloads across multiple instances, which was one of the asks in this issue. It is probably possible to run multiple kube-burner tasks simultaneously using the same configuration in order to have multiple updates to the same record set from multiple clusters, but there would be no orchestration from kube-burner's point of view. It would also require the use of a single Thanos instance instead of one deployed on each cluster.
  • For these workloads to be of any use we need good metrics and alerts that are expected to fire when things are not working. It's not a test suite with assertions on state; instead it relies on alerts firing in order to fail the test run.
  • Separating the DNS Operator specific templates/metrics/alerts into the dns operator repo makes sense as long as we have a similar scale test in that repo. TBD if we do want that.

Alerts
A small list of alerts that I realised would be useful, but really there are probably hundreds required.

  • Alert when a gateway has not been assigned an address within an appropriate amount of time (can be hit quite easily when using kind if you only have a few IPs available). This isn't strictly a Kuadrant issue.
  • Alert when DNSRecords are in a failing state for a given amount of time.
  • Alert if the managers are restarting an unexpected number of times during the test run. Hit this as part of the DNSRecord scale test and wrote an alert for this here (a rough sketch of such alert entries follows this list).
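As a rough sketch, kube-burner alert profile entries for a couple of these could look like the following. The PromQL is illustrative only: the restart alert uses the standard kube-state-metrics restart counter, while the DNSRecord expression assumes a hypothetical metric name that a real alerts file would replace with whatever the dns operator actually exposes.

```yaml
# Illustrative alert profile entries; thresholds and the DNSRecord metric name are assumptions.
- expr: increase(kube_pod_container_status_restarts_total{namespace=~"kuadrant-dns-operator-.*"}[10m]) > 0
  description: dns operator manager restarted during the test run
  severity: error
- expr: sum(kuadrant_dnsrecord_ready{status="false"}) > 0 # hypothetical metric name
  description: one or more DNSRecords have been in a failing state
  severity: warning
```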

Adds a DNSPolicy specific scale test using kube-burner.

The workload will create multiple instances of the dns operator in
separate namespaces (kuadrant-dns-operator-x), and multiple test
namespaces (scale-test-x) that the corresponding dns operator is
configured to watch.  The number of dns operator instances and test
namespaces created is determined by the `JOB_ITERATIONS` environment
variable.
In each test namespace a test app and service is deployed and one or
more gateways are created determined by the `NUM_GWS` environment
variable.  The number of listeners added to the gateway is determined by
the `NUM_LISTENERS` environment variable.
Each listener hostname is generated using the listener number and the
`KUADRANT_ZONE_ROOT_DOMAIN` environment variable.  In each test
namespace a dns provider credential is created, the type created is
determined by the `DNS_PROVIDER` environment variable, additional
environment variables may need to be set depending on the provider type.

Signed-off-by: Michael Nairn <[email protected]>
@awk 'BEGIN {FS = ":.*?## "} /^[a-zA-Z_-]+:.*?## / {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' $(MAKEFILE_LIST)
.PHONY: help
help: ## Display this help.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-15s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
mikenairn (Member Author) commented:
Optional change, it just brings it in line with the help in other repos; you can use `##@ foo` to add sections:

Before:

$ make help
commit-acceptance              Runs pre-commit linting checks
reformat                       Reformats testsuite with black
test                           Run all non mgc tests
authorino                      Run only authorino related tests
authorino-standalone           Run only test capable of running with standalone Authorino
limitador                      Run only Limitador related tests
kuadrant                       Run all tests available on Kuadrant
kuadrant-only                  Run Kuadrant-only tests
multicluster                   Run Multicluster only tests
dnstls                         Run DNS and TLS tests
disruptive                     Run disruptive tests
kuadrantctl                    Run Kuadrantctl tests
poetry                         Installs poetry with all dependencies
poetry-no-dev                  Installs poetry without development dependencies
polish-junit                   Remove skipped tests and logs from passing tests
reportportal                   Upload results to reportportal. Appropriate variables for juni2reportportal must be set
help                           Print this help
clean                          Clean all objects on cluster created by running this testsuite. Set the env variable USER to delete after someone else
test-scale-dnspolicy           Run DNSPolicy scale tests.
kube-burner                    Download kube-burner locally if necessary.

After:

$ make help

Usage:
  make <target>
  commit-acceptance  Runs pre-commit linting checks
  reformat         Reformats testsuite with black
  test             Run all non mgc tests
  authorino        Run only authorino related tests
  authorino-standalone  Run only test capable of running with standalone Authorino
  limitador        Run only Limitador related tests
  kuadrant         Run all tests available on Kuadrant
  kuadrant-only    Run Kuadrant-only tests
  multicluster     Run Multicluster only tests
  dnstls           Run DNS and TLS tests
  disruptive       Run disruptive tests
  kuadrantctl      Run Kuadrantctl tests
  poetry           Installs poetry with all dependencies
  poetry-no-dev    Installs poetry without development dependencies
  polish-junit     Remove skipped tests and logs from passing tests
  reportportal     Upload results to reportportal. Appropriate variables for juni2reportportal must be set
  help             Display this help.
  clean            Clean all objects on cluster created by running this testsuite. Set the env variable USER to delete after someone else

Scale Testing
  test-scale-dnspolicy  Run DNSPolicy scale tests.

Build Dependencies
  kube-burner      Download kube-burner locally if necessary.

Deploy the observability stack:
```shell
#kubectl apply --server-side -k github.com/mikenairn/dns-operator/config/observability?ref=add_scale_test
kubectl apply --server-side -k github.com/kuadrant/dns-operator/config/observability?ref=main # Run twice if it fails the first time
```
mikenairn (Member Author) commented:
This needs to be updated.

I think a single kustomization in the kuadrant-operator that can be applied without needing to pull down the repo would be useful, and easier than having to run this set of steps: https://github.com/Kuadrant/kuadrant-operator/tree/main/config/observability#deploying-the-observabilty-stack. It depends how varied the observability setup gets.

Thanos setup should likely be a secondary optional task when working with a single cluster.

@david-martin Any opinions/thoughts on this?

Member commented:
The cmd you have here should be enough to deploy the stack (without Thanos or the example alerts & dashboards).
You would typically need to run it twice if the CRDs don't already exist:
once so that the CRDs get registered, and a second time (once they exist) so that the CRs can be created without errors.

- kind: DNSPolicy
  apiVersion: kuadrant.io/v1alpha1
  labelSelector: {kube-burner-job: dnspolicy-scale-test-loadbalanced}
{{ end }}
mikenairn (Member Author) commented:
I copied the pattern from the original scale test of adding this cleanup to remove the DNSPolicies. I understand why it's here, but it is fairly frustrating that we need it.

I wonder if we should revisit the cleanup of policies/records/secrets and see if there is some reasonable way we can prevent secrets being removed before all resources referencing them are deleted.

@maleck13 I know we have discussed this before, but I think it's probably the single most annoying thing about testing anything DNS related.
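For context, the cleanup being discussed is a kube-burner delete job along the lines of the sketch below; only the DNSPolicy object entry is taken from the actual change, while the job name and surrounding options are assumptions.

```yaml
  # Sketch of the surrounding delete job; name and options here are illustrative.
  - name: cleanup-dnspolicies
    jobType: delete
    waitForDeletion: true
    objects:
      - kind: DNSPolicy
        apiVersion: kuadrant.io/v1alpha1
        labelSelector: {kube-burner-job: dnspolicy-scale-test-loadbalanced}
```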

- https://raw.githubusercontent.com/{{.DNS_OPERATOR_GITHUB_ORG}}/dns-operator/refs/heads/{{.DNS_OPERATOR_GITREF}}/test/scale/alerts.yaml
indexer:
  type: local
  metricsDirectory: ./metrics
mikenairn (Member Author) commented:
I have the alerts and metrics being pulled from the dns operator repo here, but I imagine we could have these pulled from multiple sources, i.e. kuadrant-operator, the testsuite repo and other components, where each defines its own metrics/alerts specific to the resources it provides.

The metrics/alerts configured are, from what I can gather, really needed to make the most out of kube-burner runs, since alerts firing during the run are what will tell us whether things are working or not, and what we would need to improve on if we feel these types of scale tests are useful.
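To illustrate, the alerts list shown above could be extended with entries from other components; the additional URL and local path below are hypothetical and do not exist today:

```yaml
  - https://raw.githubusercontent.com/{{.DNS_OPERATOR_GITHUB_ORG}}/dns-operator/refs/heads/{{.DNS_OPERATOR_GITREF}}/test/scale/alerts.yaml
  # Hypothetical additional sources, one per component owning the resources under test:
  - https://raw.githubusercontent.com/Kuadrant/kuadrant-operator/refs/heads/main/test/scale/alerts.yaml
  - ./alerts/testsuite-alerts.yaml
```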

Member commented:
I wonder if the alerts and metrics files should be maintained in this repo for easier maintenance in the context of running and maintaining tests.
The alternative could result in extra toil, particularly when working out the details of assertions for a test.
