Convergence metrics
WORK IN PROGRESS
Right now we focus on highly visible and visual metrics. This is great, because we can quickly see if, when, and how things go wrong. However, we also want to:
- quantify how well the entire system is behaving,
- quantify how well individual components are behaving,
- use these metrics to judge current development,
- use these metrics to guide future development.
The easiest way to resolve this problem is to do A/B testing.
It is important that metrics that measure bad things happening measure "regret", i.e. positive values mean something bad happened, and zero is ideal. This prevents aggregate errors from canceling out. For example, if we were to measure the signed difference between desired and actual capacity, under-provisioning at one point in time would cancel out over-provisioning at another, and the aggregate could look perfect even though we were never at the right capacity.
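The cancellation problem can be seen with a small sketch. The capacity samples below are hypothetical; the point is that averaging signed errors hides misses, while a regret-style metric (non-negative per sample, zero only on target) cannot cancel:

```python
# Hypothetical capacity samples: (desired, actual) per time step.
samples = [(10, 12), (10, 8), (10, 11), (10, 9)]

# Signed errors cancel out: the mean suggests we did perfectly.
signed = [actual - desired for desired, actual in samples]
mean_signed = sum(signed) / len(signed)  # 0.0, even though every sample missed

# A regret-style metric is non-negative per sample, zero only when we
# hit the target, so aggregates cannot cancel.
regret = [abs(actual - desired) for desired, actual in samples]
mean_regret = sum(regret) / len(regret)  # 1.5, the misses stay visible

print(mean_signed, mean_regret)
```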
Under-provisioning and over-provisioning by the same amount are not equally bad. Being over capacity may cost marginally more money, but under-provisioning usually comes with service degradation.
Under- or over-provisioning by one near the desired capacity is not as bad as being off by one when the desired capacity is small, so the metric should probably measure relative rather than absolute error.
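Both considerations above could be folded into a single regret function: weight under-provisioning more heavily than over-provisioning, and scale by the desired capacity. A sketch, where the function name and the weights are placeholder assumptions, not settled design:

```python
def provisioning_regret(desired, actual, under_weight=3.0, over_weight=1.0):
    """Non-negative regret for a single capacity sample.

    Hypothetical weights penalize under-provisioning (service
    degradation) more than over-provisioning (extra cost), and the
    error is relative to the desired capacity so that a miss of one
    matters less at larger targets.
    """
    if desired <= 0:
        return 0.0
    error = (actual - desired) / desired  # relative error
    if error < 0:
        return under_weight * -error  # under-provisioned: weighted more
    return over_weight * error        # over-provisioned: weighted less

# Under by 2 hurts more than over by 2; the same absolute miss
# matters less at a larger desired capacity.
print(provisioning_regret(10, 8), provisioning_regret(10, 12))
print(provisioning_regret(100, 98))
```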
TODO: this thing needs a good name.
This metric has upsides and downsides. It is a high-level aggregate metric, which is good because it measures how much we failed to do what was asked, but bad because it doesn't tell us which individual component is responsible.