The CNF Test Suite can be run in production mode (using an executable) or in developer mode (using Crystal directly). See the pseudo code documentation for examples of how the internals of WIP tests might work.
# Production mode
./cnf-testsuite <testname>
# Developer mode
crystal src/cnf-testsuite.cr <testname>
⭐ *Note: All usage commands in this document will use the production (binary executable) syntax unless otherwise stated.
- ✔️ indicates implemented into stable release
- 💡 indicates Proof of Concept
- 📝 indicates To Do
- ❌ indicates WARNINGS*
- ✔️ PASSED indicates it meets best practice, positive points given.
- ⏭ SKIPPED indicates the test was skipped (output should provide a reason), no points given.
- ❌ FAILED indicates the test failed, negative points given.
This is the command to build the binary executable if in developer mode or using the source install method (requires crystal):
crystal build src/cnf-testsuite.cr
./cnf-testsuite validate_config cnf-config=[PATH_TO]/cnf-testsuite.yml
./cnf-testsuite all cnf-config=<path_to_your_config_file>/cnf-testsuite.yml
./cnf-testsuite all poc cnf-config=<path_to_your_config_file>/cnf-testsuite.yml
crystal src/cnf-testsuite.cr workload cnf-config=<path_to_your_config_file>/cnf-testsuite.yml
./cnf-testsuite workload
./cnf-testsuite platform
./cnf-testsuite help
./cnf-testsuite cleanup
# cmd line
./cnf-testsuite -l debug test
crystal src/cnf-testsuite.cr -- -l debug test
LOGLEVEL=DEBUG ./cnf-testsuite test
⭐ Note: When setting log level, the following is the order of precedence:
- CLI or Command line flag
- Environment variable
- CNF-Testsuite Config file
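The precedence above can be sketched as a small shell helper. This is only an illustration of the ordering, not the test suite's own implementation; the function name `resolve_log_level` and its three positional arguments are hypothetical.

```shell
# Hypothetical helper: pick the effective log level from three sources.
# Arguments: $1 = CLI flag value, $2 = environment variable value,
#            $3 = config file value. An empty string means "not set".
resolve_log_level() {
  cli_value="$1"
  env_value="$2"
  config_value="$3"
  if [ -n "$cli_value" ]; then
    # The command line flag wins over everything else.
    printf '%s\n' "$cli_value"
  elif [ -n "$env_value" ]; then
    # The environment variable is used only when no flag was given.
    printf '%s\n' "$env_value"
  elif [ -n "$config_value" ]; then
    # The config file is the last source consulted.
    printf '%s\n' "$config_value"
  else
    # Nothing was set; this sketch does not guess a default.
    return 1
  fi
}
```

For example, `resolve_log_level "" "$LOGLEVEL" info` would return the value of `LOGLEVEL` when it is set, and `info` otherwise.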
Also, setting the verbose option for many tasks will add extra output to help with debugging.
./cnf-testsuite test_name verbose
See https://github.com/crystal-ameba/ameba for more details. Follow the INSTALL guide, starting at the Source Install, for more details on running cnf-testsuite in developer mode.
shards install # only for first install
crystal bin/ameba.cr
./cnf-testsuite compatibility
./cnf-testsuite increase_decrease_capacity
./cnf-testsuite increase_capacity
./cnf-testsuite decrease_capacity
Remediation for failing this test:
Check out the kubectl docs for how to manually scale your CNF.
Also, here is some info about things that could cause failures.
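As a rough sketch of manual scaling, the helper below wraps `kubectl scale`. It assumes the CNF runs as a Deployment; the function name `scale_deployment` and the `KUBECTL` override variable are hypothetical and only exist so the command can be previewed without a cluster.

```shell
# Sketch: scale a Deployment to a given replica count.
# Arguments: $1 = deployment name, $2 = desired replica count.
# Set KUBECTL=echo to print the command instead of running it.
scale_deployment() {
  deployment="$1"
  replicas="$2"
  "${KUBECTL:-kubectl}" scale "deployment/$deployment" --replicas="$replicas"
}
```

Calling `scale_deployment my-cnf 3` followed by `scale_deployment my-cnf 1` (with `my-cnf` standing in for your own deployment name) would exercise both an increase and a decrease in capacity.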
./cnf-testsuite helm_chart_published
Remediation for failing this test:
Make sure your CNF helm charts are published in a Helm Repository.
./cnf-testsuite helm_chart_valid
Remediation for failing this test:
Make sure your helm charts pass lint tests.
./cnf-testsuite helm_deploy
Remediation for failing this test:
Make sure your helm charts are valid and can be deployed to clusters.
./cnf-testsuite rollback
Remediation for failing this test:
Ensure that you can upgrade your CNF using the Kubectl Set Image command, then rollback the upgrade using the Kubectl Rollout Undo command.
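A minimal sketch of that upgrade-then-rollback sequence follows. It assumes the CNF is a Deployment; the function name, its arguments, and the `KUBECTL` override are hypothetical, while `kubectl set image`, `kubectl rollout status`, and `kubectl rollout undo` are the standard kubectl subcommands.

```shell
# Sketch: upgrade a Deployment's container image, then roll the change back.
# Arguments: $1 = deployment name, $2 = container name, $3 = new image.
# Set KUBECTL=echo to print the commands instead of running them.
upgrade_then_rollback() {
  deployment="$1"
  container="$2"
  new_image="$3"
  kubectl_cmd="${KUBECTL:-kubectl}"
  # Trigger a rolling update to the new image and wait for it to finish.
  "$kubectl_cmd" set image "deployment/$deployment" "$container=$new_image" || return 1
  "$kubectl_cmd" rollout status "deployment/$deployment" || return 1
  # Revert to the previous revision and wait for the rollback to finish.
  "$kubectl_cmd" rollout undo "deployment/$deployment" || return 1
  "$kubectl_cmd" rollout status "deployment/$deployment"
}
```

If any step fails, the function stops and returns a non-zero status, which is usually the point at which to inspect the rollout before retrying.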
./cnf-testsuite rolling_update
Remediation for failing this test:
Ensure that you can successfully perform a rolling upgrade of your CNF using the Kubectl Set Image command.
./cnf-testsuite rolling_version_change
Remediation for failing this test:
Ensure that you can successfully roll back the software version of your CNF by using the Kubectl Set Image command.
./cnf-testsuite rolling_downgrade
Remediation for failing this test:
Ensure that you can successfully change the software version of your CNF back to an older version by using the Kubectl Set Image command.
./cnf-testsuite cni_compatible
Remediation for failing this test:
Ensure that your CNF is compatible with Calico, Cilium and other available CNIs.
./cnf-testsuite alpha_k8s_apis
Remediation for failing this test:
Make sure your CNFs are not utilizing any Kubernetes alpha APIs. You can learn more about Kubernetes API versioning here.
Details for Compatibility, Installability and Upgradability Tests To Do's
📝 (To Do) To check if the CNF's CNI plugin accepts valid calls from the CNI specification
crystal src/cnf-testsuite.cr cni_spec
crystal src/cnf-testsuite.cr api_snoop_beta
crystal src/cnf-testsuite.cr api_snoop_general_apis
crystal src/cnf-testsuite.cr small_autoscaling
📝 (To Do) To test large scale autoscaling
crystal src/cnf-testsuite.cr large_autoscaling
crystal src/cnf-testsuite.cr network_chaos
📝 (To Do) To test if the CNF control layer uses external retry logic
crystal src/cnf-testsuite.cr external_retry
./cnf-testsuite microservice
./cnf-testsuite reasonable_image_size
Remediation for failing this test:
Ensure your CNF's image size is under 5GB.
./cnf-testsuite reasonable_startup_time
Remediation for failing this test:
Ensure that your CNF gets into a running state within 30 seconds.
./cnf-testsuite single_process_type
Remediation for failing this test:
Ensure that there is only one process type within a container. This does not count against child processes, e.g. nginx or httpd could be a parent process with 10 child processes and pass this test, but if both nginx and httpd were running, this test would fail.
./cnf-testsuite service_discovery
Remediation for failing this test:
Make sure the CNF exposes any of its containers as a Kubernetes Service. You can learn more about Kubernetes Service here.
./cnf-testsuite shared_database
Remediation for failing this test:
Make sure that your CNF's containers are not sharing the same database.
./cnf-testsuite specialized_init_system
Remediation for failing this test:
Use init systems that are purpose-built for containers like tini, dumb-init, s6-overlay.
./cnf-testsuite state
./cnf-testsuite node_drain
Please note that this test requires a cluster with at least two schedulable nodes.
Remediation for failing this test: Ensure that your CNF can be successfully rescheduled when a node fails or is drained.
./cnf-testsuite volume_hostpath_not_found
Remediation for failing this test: Ensure that none of the containers in your CNFs are using hostPath to mount volumes.
./cnf-testsuite no_local_volume_configuration
Remediation for failing this test: Ensure that your CNF isn't using any persistent volumes that use a local mount point.
./cnf-testsuite elastic_volume
Remediation for failing this test: Set up and use elastic persistent volumes instead of local storage.
./cnf-testsuite database_persistence
Remediation for failing this test: Select a database configuration that uses statefulsets and elastic storage volumes.
./cnf-testsuite resilience
./cnf-testsuite pod_network_latency
Remediation for failing this test: Ensure that your CNF doesn't stall or get into a corrupted state when network degradation occurs. A mitigation strategy (in this case, keeping the timeout, i.e. access latency, low) could be to use some middleware that can switch traffic based on SLO parameters.
./cnf-testsuite disk_fill
Remediation for failing this test: Ensure that your CNF is resilient and doesn't stall when heavy IO causes a degradation in storage resource availability.
./cnf-testsuite pod_delete
Remediation for failing this test: Ensure that your CNF is resilient and doesn't fail on a forced/graceful pod failure on specific or random replicas of an application.
./cnf-testsuite pod_memory_hog
Remediation for failing this test: Ensure that your CNF is resilient to heavy memory usage and can maintain some level of availability.
./cnf-testsuite pod_io_stress
Remediation for failing this test: Ensure that your CNF is resilient to continuous and heavy disk IO load and can maintain some level of availability.
./cnf-testsuite pod_network_corruption
Remediation for failing this test: Ensure that your CNF is resilient to a lossy/flaky network and can maintain a level of availability.
./cnf-testsuite pod_network_duplication
Remediation for failing this test: Ensure that your CNF is resilient to erroneously duplicated packets and can maintain a level of availability.
./cnf-testsuite pod_dns_error
Remediation for failing this test: Ensure that your CNF is resilient to DNS resolution failures and can maintain a level of availability.
./cnf-testsuite liveness
Remediation for failing this test: Ensure that your CNF has a Liveness Probe configured.
./cnf-testsuite readiness
Remediation for failing this test: Ensure that your CNF has a Readiness Probe configured.
./cnf-testsuite observability
./cnf-testsuite log_output
Remediation for failing this test: Make sure applications and CNFs are sending log output to STDOUT and/or STDERR.
./cnf-testsuite prometheus_traffic
Remediation for failing this test: Install and configure Prometheus for your CNF.
./cnf-testsuite routed_logs
Remediation for failing this test: Install and configure fluentd or fluentbit to collect data and logs. See more at fluentd.org for fluentd or fluentbit.io for fluentbit.
./cnf-testsuite open_metrics
Remediation for failing this test: Ensure that your CNF is publishing OpenMetrics compatible metrics.
./cnf-testsuite tracing
Remediation for failing this test: Ensure that your CNF is both using & publishing traces to Jaeger.
./cnf-testsuite security
./cnf-testsuite container_sock_mounts
Remediation for failing this test:
Make sure your CNF doesn't mount /var/run/docker.sock, /var/run/containerd.sock or /var/run/crio.sock on any containers.
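To make the rule concrete, here is a small sketch that flags a host path when it matches one of the three runtime sockets named above. The function name `is_runtime_socket` is hypothetical, and the real test may inspect more than an exact path match.

```shell
# Returns 0 (true) if the given host path is one of the container runtime
# sockets listed in the remediation text, and 1 (false) otherwise.
is_runtime_socket() {
  case "$1" in
    /var/run/docker.sock|/var/run/containerd.sock|/var/run/crio.sock)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A review script could call this for every hostPath volume found in the CNF's manifests and report any that return true.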
./cnf-testsuite external_ips
Remediation for failing this test: Make sure not to define external IPs in your Kubernetes service configuration.
./cnf-testsuite privileged_containers
Remediation for failing this test:
Remove privileged capabilities by setting securityContext.privileged to false. If you must deploy a Pod as privileged, add other restrictions to it, such as a network policy or Seccomp, and still remove all unnecessary capabilities.
./cnf-testsuite privilege_escalation
Remediation for failing this test: If your application does not need it, make sure the allowPrivilegeEscalation field of the securityContext is set to false. See more at ARMO-C0016
./cnf-testsuite symlink_file_system
Remediation for failing this test: To mitigate this vulnerability without upgrading kubelet, you can disable the VolumeSubpath feature gate on kubelet and kube-apiserver, or remove any existing Pods using subPath or subPathExpr feature.
./cnf-testsuite sysctls
Remediation for failing this test: The spec.securityContext.sysctls field must be unset or not used.
./cnf-testsuite application_credentials
Remediation for failing this test: Use Kubernetes secrets or Key Management Systems to store credentials.
./cnf-testsuite host_network
Remediation for failing this test: Only connect PODs to the hostNetwork when it is necessary. If not, set the hostNetwork field of the pod spec to false, or completely remove it (false is the default). Allow only those PODs that must have access to host network by design.
./cnf-testsuite service_account_mapping
Remediation for failing this test: Disable automatic mounting of service account tokens to PODs either at the service account level or at the individual POD level, by specifying the automountServiceAccountToken: false. Note that POD level takes precedence.
./cnf-testsuite ingress_egress_blocked
Remediation for failing this test:
By default, you should disable or restrict Ingress and Egress traffic on all pods.
./cnf-testsuite insecure_capabilities
Remediation for failing this test:
Remove all insecure capabilities which aren’t necessary for the container.
./cnf-testsuite non_root_containers
Remediation for failing this test:
If your application does not need root privileges, make sure to define runAsUser and runAsGroup under the PodSecurityContext to use user ID 1000 or higher, do not turn on the allowPrivilegeEscalation bit, and make sure runAsNonRoot is true.
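The guidance above can be expressed as a simple check. This sketch is illustrative only: the function name and its positional arguments are hypothetical, the values are assumed to have been read from the pod spec already, and it covers runAsUser, runAsNonRoot and allowPrivilegeEscalation but not runAsGroup.

```shell
# Sketch: check security context values against the non-root guidance.
# Arguments: $1 = runAsUser, $2 = runAsNonRoot, $3 = allowPrivilegeEscalation.
meets_non_root_guidance() {
  run_as_user="$1"
  run_as_non_root="$2"
  allow_priv_esc="$3"
  # Reject an unset or non-numeric user ID before comparing it.
  case "$run_as_user" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # All three conditions must hold for the check to pass.
  [ "$run_as_user" -ge 1000 ] &&
    [ "$run_as_non_root" = "true" ] &&
    [ "$allow_priv_esc" != "true" ]
}
```

For example, `meets_non_root_guidance 1000 true false` succeeds, while a user ID of 0 fails regardless of the other two values.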
To configure the Falco driver to be used for this test, please refer to docs/falco-config.md.
./cnf-testsuite host_pid_ipc_privileges
Remediation for failing this test:
Apply least privilege principle and remove hostPID and hostIPC from the yaml configuration privileges unless they are absolutely necessary.
./cnf-testsuite linux_hardening
Remediation for failing this test:
Use AppArmor, Seccomp, SELinux and Linux Capabilities mechanisms to restrict containers' abilities to utilize unwanted privileges.
./cnf-testsuite resource_policies
Remediation for failing this test:
Define LimitRange and ResourceQuota policies to limit resource usage for namespaces or in the deployment/POD yamls.
./cnf-testsuite immutable_file_systems
Remediation for failing this test:
Set the filesystem of the container to read-only when possible. If the container's application needs to write into the filesystem, it is possible to mount secondary filesystems for specific directories where the application requires write access.
./cnf-testsuite hostpath_mounts
Remediation for failing this test:
Refrain from using a hostPath mount.
./cnf-testsuite selinux_options
Remediation for failing this test: Ensure the following guidelines are followed for any cluster resource that allow SELinux options.
- If the SELinux option `type` is set, it should only be one of the allowed values: `container_t`, `container_init_t`, or `container_kvm_t`.
- SELinux options `user` or `role` should not be set.
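The two guidelines above can be sketched as a single check. The function name `selinux_options_ok` and its arguments are hypothetical; the values are assumed to have been extracted from a resource's seLinuxOptions beforehand, with an empty string meaning the option is not set.

```shell
# Sketch: validate SELinux options against the guidelines above.
# Arguments: $1 = type, $2 = user, $3 = role (empty string = not set).
selinux_options_ok() {
  se_type="$1"
  se_user="$2"
  se_role="$3"
  # The user and role options must not be set at all.
  [ -z "$se_user" ] || return 1
  [ -z "$se_role" ] || return 1
  # The type option may be unset or one of the allowed values.
  case "$se_type" in
    ''|container_t|container_init_t|container_kvm_t) return 0 ;;
    *) return 1 ;;
  esac
}
```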
Details for Security Tests To Do's
📝 (To Do) To check if there are any shells running in the container
crystal src/cnf-testsuite.cr shells
📝 (To Do) To check if there are any protected directories or files that are accessed from within the container
crystal src/cnf-testsuite.cr protected_access
./cnf-testsuite configuration_lifecycle
./cnf-testsuite default_namespace
Remediation for failing this test:
Ensure that your CNF is configured to use a Namespace and is not using the default namespace.
./cnf-testsuite latest_tag
Remediation for failing this test:
When specifying container images, always specify a tag and ensure to use an immutable tag that maps to a specific version of an application Pod. Remove any usage of the latest tag, as it is not guaranteed to always point to the same version of the image.
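As an illustration, the sketch below extracts the tag from an image reference and reports whether it resolves to latest. The function names are hypothetical and the parsing is simplified. It relies on the fact that an image reference with no tag is pulled as latest, and it treats a reference pinned by digest as not using latest.

```shell
# Print the tag of an image reference, or "latest" when no tag is given.
image_tag() {
  ref="${1%%@*}"      # drop any @sha256:... digest suffix
  last="${ref##*/}"   # keep the last path component so a registry port
                      # such as localhost:5000 is not mistaken for a tag
  case "$last" in
    *:*) printf '%s\n' "${last##*:}" ;;
    *)   printf '%s\n' "latest" ;;
  esac
}

# Returns 0 (true) if the image reference resolves to the latest tag.
uses_latest_tag() {
  case "$1" in
    *@*) return 1 ;;  # pinned by digest, so the tag does not float
  esac
  [ "$(image_tag "$1")" = "latest" ]
}
```

With this, `uses_latest_tag nginx` and `uses_latest_tag nginx:latest` are both true, while `uses_latest_tag nginx:1.25.3` is false.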
./cnf-testsuite require_labels
Remediation for failing this test:
Make sure to define the app.kubernetes.io/name label under metadata for your CNF.
./cnf-testsuite versioned_tag
Remediation for failing this test:
When specifying container images, always specify a tag and ensure to use an immutable tag that maps to a specific version of an application Pod. Remove any usage of the latest tag, as it is not guaranteed to always point to the same version of the image.
./cnf-testsuite nodeport_not_used
Remediation for failing this test:
Review all Helm Charts & Kubernetes Manifest files for the CNF and remove all occurrences of the nodePort field in your configuration. Alternatively, configure a service or use another mechanism for exposing your container.
./cnf-testsuite hostport_not_used
Remediation for failing this test:
Review all Helm Charts & Kubernetes Manifest files for the CNF and remove all occurrences of the hostPort field in your configuration. Alternatively, configure a service or use another mechanism for exposing your container.
./cnf-testsuite hardcoded_ip_addresses_in_k8s_runtime_configuration
Remediation for failing this test:
Review all Helm Charts & Kubernetes Manifest files of the CNF and look for any hardcoded usage of IP addresses. If any are found, you will need to use an operator or some other method to abstract the IP management out of your configuration in order to pass this test.
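A first pass over the manifests can be done with a pattern search. The sketch below is a rough helper and not the test's own logic: the function name is hypothetical, it reads manifest text on standard input, and the pattern does not validate octet ranges, so it also reports values such as 0.0.0.0 or any other four-part dotted number.

```shell
# Sketch: print (with line numbers) every input line that contains something
# shaped like an IPv4 address. Exits non-zero when nothing is found.
find_hardcoded_ips() {
  grep -nE '(^|[^0-9.])([0-9]{1,3}\.){3}[0-9]{1,3}([^0-9.]|$)'
}
```

One way to use it is to pipe rendered manifests into it, for example the output of `helm template` for your chart, and review each reported line by hand.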
./cnf-testsuite secrets_used
Rules for the test: The whole test passes if any workload resource in the cnf uses a (non-exempt) secret. If no workload resources use a (non-exempt) secret, the test is skipped.
Remediation for failing this test:
Remove any sensitive data stored in configmaps or environment variables, and instead use K8s Secrets for storing such data. Alternatively, you can use an operator or some other method to abstract hardcoded sensitive data out of your configuration.
./cnf-testsuite immutable_configmap
Remediation for failing this test: Use immutable configmaps for any non-mutable configuration data.
./cnf-testsuite 5g
./cnf-testsuite smf_upf_core_validator
./cnf-testsuite suci_enabled
./cnf-testsuite ran
./cnf-testsuite oran_e2_connection
./cnf-testsuite platform
./cnf-testsuite k8s_conformance
Remediation for failing this test: Check that Sonobuoy can be successfully run and passes without failure on your platform. Any failures found by Sonobuoy will provide debug and remediation steps required to get your K8s cluster into a conformant state.
./cnf-testsuite clusterapi_enabled
Remediation for failing this test: Enable ClusterAPI and start using it to manage the provisioning and lifecycle of your Kubernetes clusters.
./cnf-testsuite platform:hardware_and_scheduling
./cnf-testsuite platform:oci_compliant
Remediation for failing this test:
Check if your Kubernetes Platform is using an OCI Compliant Runtime. If your platform is not using an OCI Compliant Runtime, you'll need to switch to a new runtime that is OCI Compliant in order to pass this test.
./cnf-testsuite platform:resilience poc
./cnf-testsuite platform:worker_reboot_recovery poc destructive
Remediation for failing this test:
Reboot a worker node in your Kubernetes cluster and verify that the node can recover and re-join the cluster in a schedulable state. Workloads should also be rescheduled to the node once it's back online.
./cnf-testsuite platform:security
./cnf-testsuite platform:cluster_admin
Remediation for failing this test: You should apply least privilege principle. Make sure cluster admin permissions are granted only when it is absolutely necessary. Don't use subjects with high privileged permissions for daily operations.
See more at ARMO-C0035
./cnf-testsuite platform:control_plane_hardening
Remediation for failing this test:
Set the insecure-port flag of the API server to zero.
See more at ARMO-C0005
./cnf-testsuite platform:exposed_dashboard
Remediation for failing this test:
Update dashboard version to v2.0.1 or above.
./cnf-testsuite platform:helm_tiller
Remediation for failing this test: Switch to using Helm v3+ and make sure not to pull any images with the name tiller in them.