Skip to content

An easy way to determine the health of Kubernetes resources.

License

Notifications You must be signed in to change notification settings

xterm-x11/kube-health

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

kube-health

kube-health is a library and a kubectl plugin to evaluate the health of Kubernetes resources. It aims at unifying and making it easier to understand the health of individual objects without requiring to know all the nuances of different kinds.

Features:

  • Unified health reporting of Kubernetes resources.
  • Decomposing the health of a high-level object (e.g. deployment) to lower-level components (e.g. pods and containers) for faster root cause analysis.
  • Wait for reconciliation.
  • Differentiating between progressing and stalled status.
  • Combine the command with others, e.g. kubectl apply.
  • Use via CLI, Prometheus/Grafana or as a library.
  • Extensibility for implementing non-standard health evaluation logic.

Installation

Use one of these methods:

  • Get binaries for Linux and Mac are available as tarballs from the releases page.
  • Using go install:
    go install github.com/inecas/kube-health@latest
  • Building from source with:
    make build

CLI Usage

The most basic use is simply asking about the status of a particular object.

kube-health <object-type>/<object-name>

Screenshot

Besides the health of the object itself, it shows the details from sub-resources (including tail of logs of the failed container in this case).

By default, the sub-resources are only displayed for objects in abnormal state. Use -H to show details for objects with healthy (OK) status as well.

It's possible to combine kube-health with kubectl apply via a pipe:

kubectl apply -f <manifest-file> -o=yaml | kube-health -

kube-health allows waiting for reconciliation via additional flags.

Screenshot

There are multiple waiting strategies implemented:

  • --wait-progress|-W - wait while there is are some objects still progressing (regardless of the final result).
  • --wait-ready|-R - wait until all the objects are in OK state
  • --wait-forever|-F - continuously poll for the status regardless of the results.

Exit codes

  • 0 - all resources are OK
  • 1 - some resources in Warning state
  • 2 - some resources in Error state
  • 3 - some resources in Unknown state
  • 128 - error during evaluation

If some resources are progressing, 8 is added to the exit code: use bitwise AND to extract this information.

Use with Prometheus/Grafana

Besides using kube-health from command line, it is possible to leverage the functionality on the server side as well, e.g. exporting resources health via monitoring stack.

Grafana dashboard

  1. Get the binaries for kube-health-monitor from the releases page or build it from source with:
    make build-monitor
  2. Create a monitor.yaml file. See the example monitor yaml files for more details.
  3. Run the monitor process that continuously monitors the objects from definition and exports it via Prometheus metrics:
    kube-health-monitor --config <path/to/my/monitor.yaml> -v1
  4. Configure Prometheus to scan the target (exposed at localhost:8080 by default).
  5. Import one of the example Grafana dashboard files and update based on your needs.

Motivation

Kubernetes ecosystem encourages use of certain conventions when reporting status of the objects. One of the core units of the status is the condition. Unfortunately, in some cases the condition is desired to be True (e.g. Ready), while it's False for others (e.g. OutOfSpace or Degraded).

With the concept of eventual consistency, it's also important to be able to quickly tell whether the abnormal state is still expected to change or it's stuck and needs manual intervention.

Another common case is an object composed by some lower-level components. There are some conventions here as well (e.g. using ownerReference) for capturing this relations.

This project tries to leverage available conventions and cover the common cases to build better user experience around objects status reporting. The main idea could be summarized with:

  1. if a resource follows common practices, it should work out of box.
  2. if it doesn't, it's still possible to extend kube-health to support it (and ideally enhance the resource's API to follow the conventions.)

Project Status

The project should be far enough to be usable out-of-the-box. It's however still in early stage of development and the APIs should not be considered stable yet.

Prior Art

These projects played an important role during the development of kube-health:

Developer Docs

For more details on structure of the code and developer guides, see the developer docs.

About

An easy way to determine the health of Kubernetes resources.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Go 99.3%
  • Makefile 0.7%