etishor edited this page Oct 18, 2014 · 3 revisions

A Histogram measures the distribution of values in a stream of data. From the Java library documentation:

Histogram metrics allow you to measure not just easy things like the min, mean, max, and standard deviation of values, but also quantiles like the median or 95th percentile.

Traditionally, the way the median (or any other quantile) is calculated is to take the entire data set, sort it, and take the value in the middle (or 1% from the end, for the 99th percentile). This works for small data sets, or batch processing systems, but not for high-throughput, low-latency services.
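To make the cost of the traditional approach concrete, here is a minimal sketch of an exact quantile computed by sorting the whole data set. The `ExactQuantile` class and its nearest-rank definition are assumptions chosen for illustration; they are not part of Metrics.NET, and other quantile definitions (for example, with interpolation) are equally valid.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not a Metrics.NET type.
public static class ExactQuantile
{
    // Nearest-rank quantile: sort a copy of the data and return the value
    // at 1-based rank ceil(q * n).
    public static double Compute(double[] values, double q)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));
        if (q <= 0.0 || q > 1.0)
            throw new ArgumentOutOfRangeException(nameof(q), "q must be in (0, 1].");

        // Every value must be kept and sorted: O(n) memory, O(n log n) time.
        var sorted = values.OrderBy(v => v).ToArray();
        var rank = (int)Math.Ceiling(q * sorted.Length);
        return sorted[rank - 1];
    }
}
```

Because every recorded value has to be retained and re-sorted for each query, memory and CPU use grow with the size of the stream, which is what makes this approach unsuitable for long-running, high-throughput services.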

The solution for this is to sample the data as it goes through. By maintaining a small, manageable reservoir which is statistically representative of the data stream as a whole, we can quickly and easily calculate quantiles which are valid approximations of the actual quantiles. This technique is called reservoir sampling.
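The idea can be sketched with the classic uniform reservoir sampling scheme (often called Algorithm R). The `UniformReservoirSketch` class below is hypothetical and only illustrates the technique; it is not the Metrics.NET implementation, it assumes a fixed capacity chosen by the caller, and it is not thread-safe.

```csharp
using System;

// Hypothetical illustration of uniform reservoir sampling; not a Metrics.NET type.
public sealed class UniformReservoirSketch
{
    private readonly long[] samples;
    private readonly Random random;
    private long count; // total number of values seen so far

    public UniformReservoirSketch(int capacity, int seed)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        samples = new long[capacity];
        random = new Random(seed);
    }

    // Number of values currently held in the reservoir.
    public int Size => (int)Math.Min(count, samples.Length);

    public void Update(long value)
    {
        count++;
        if (count <= samples.Length)
        {
            // Fill phase: keep every value until the reservoir is full.
            samples[count - 1] = value;
            return;
        }

        // Replacement phase: pick a slot uniformly in [0, count). The new value
        // is kept only if the slot falls inside the reservoir, which happens
        // with probability capacity / count.
        long slot = (long)(random.NextDouble() * count);
        if (slot < samples.Length)
            samples[slot] = value;
    }

    // Sorted copy of the current samples, ready for quantile lookups.
    public long[] Snapshot()
    {
        var copy = new long[Size];
        Array.Copy(samples, copy, copy.Length);
        Array.Sort(copy);
        return copy;
    }
}
```

Memory use stays bounded by the capacity no matter how many values are recorded, and a quantile is approximated by indexing into the sorted snapshot instead of the full data set.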

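    // Register the histogram once and reuse it for every measurement.
    private readonly Histogram histogram = Metric.Histogram("SearchResultSize", Unit.Items);

    public void Search(string keyword)
    {
        var results = ActualSearch(keyword);
        // Record how many results this search returned.
        histogram.Update(results.Length);
    }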

Out of the box, three sampling types are provided:

  • Exponentially Decaying Reservoir - produces quantiles which are representative of (roughly) the last five minutes of data
  • Uniform Reservoir - produces quantiles which are valid for the entirety of the histogram’s lifetime
  • Sliding Window Reservoir - produces quantiles which are representative of the past N measurements
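As a rough illustration of the simplest of the three, a sliding window reservoir can be modeled as a fixed-size circular buffer that always holds the most recent N measurements. The `SlidingWindowSketch` class below is a hypothetical sketch under that assumption; it is not the Metrics.NET type and it is not thread-safe.

```csharp
using System;

// Hypothetical illustration of a sliding window reservoir; not a Metrics.NET type.
public sealed class SlidingWindowSketch
{
    private readonly long[] window;
    private long count; // total number of values seen so far

    public SlidingWindowSketch(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        window = new long[size];
    }

    public void Update(long value)
    {
        // Once the buffer is full, the oldest value is overwritten.
        window[count % window.Length] = value;
        count++;
    }

    // Sorted copy of the values currently inside the window.
    public long[] Snapshot()
    {
        var copy = new long[(int)Math.Min(count, window.Length)];
        Array.Copy(window, copy, copy.Length);
        Array.Sort(copy);
        return copy;
    }
}
```

With a window of size 3, recording the values 1 through 5 leaves only 3, 4, and 5 in the snapshot, so quantiles reflect the latest measurements only.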

More information about the reservoir types can be found in the Java library documentation.