Histograms
A Histogram measures the distribution of values in a stream of data. From the Java library documentation:
Histogram metrics allow you to measure not just easy things like the min, mean, max, and standard deviation of values, but also quantiles like the median or 95th percentile.
Traditionally, the way the median (or any other quantile) is calculated is to take the entire data set, sort it, and take the value in the middle (or 1% from the end, for the 99th percentile). This works for small data sets, or batch processing systems, but not for high-throughput, low-latency services.
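To make the cost of the traditional approach concrete, here is a minimal sketch of an exact quantile calculation. The `ExactQuantile` class and its `Compute` method are hypothetical names used for illustration and are not part of the Metrics.NET API; the nearest-rank rule is one of several common quantile definitions, chosen here only for simplicity.

```csharp
using System;
using System.Linq;

public static class ExactQuantile
{
    // Returns the value at quantile q (0.0 to 1.0) using the nearest-rank rule:
    // sort a copy of the data, then take the element at rank ceil(q * n).
    public static double Compute(double[] values, double q)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("At least one value is required.", nameof(values));
        if (q < 0.0 || q > 1.0)
            throw new ArgumentOutOfRangeException(nameof(q));

        // Sorting needs every value in memory and costs O(n log n) time,
        // which is what makes this impractical on a hot request path.
        double[] sorted = values.OrderBy(v => v).ToArray();

        // Ranks are 1-based, so subtract 1 to get the array index.
        int rank = (int)Math.Ceiling(q * sorted.Length);
        return sorted[Math.Max(rank - 1, 0)];
    }
}
```

Calling `ExactQuantile.Compute(data, 0.5)` gives the median under this rule. The whole data set has to be retained and re-sorted for every query, which is the limitation the quoted text describes.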
The solution for this is to sample the data as it goes through. By maintaining a small, manageable reservoir which is statistically representative of the data stream as a whole, we can quickly and easily calculate quantiles which are valid approximations of the actual quantiles. This technique is called reservoir sampling.
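The reservoir idea can be sketched with the classic uniform reservoir sampling algorithm (often called Algorithm R). This is an illustrative sketch only: `UniformReservoirSketch` and its members are hypothetical names, it is not thread-safe, and it does not claim to mirror how Metrics.NET implements its reservoirs internally.

```csharp
using System;

public sealed class UniformReservoirSketch
{
    private readonly long[] samples;
    private readonly Random random;
    private long count; // total number of values seen so far

    public UniformReservoirSketch(int capacity, int seed)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        samples = new long[capacity];
        random = new Random(seed);
    }

    // Number of values currently held in the reservoir.
    public int Size => (int)Math.Min(count, samples.Length);

    // Number of values observed over the reservoir's lifetime.
    public long Count => count;

    public void Update(long value)
    {
        count++;
        if (count <= samples.Length)
        {
            // Fill phase: keep every value until the reservoir is full.
            samples[count - 1] = value;
            return;
        }

        // Replacement phase: draw a slot in [0, count). The new value is kept
        // only if the slot falls inside the reservoir, which happens with
        // probability capacity / count.
        long slot = (long)(random.NextDouble() * count);
        if (slot < samples.Length)
        {
            samples[slot] = value;
        }
    }

    // Returns a sorted copy of the sampled values, ready for quantile lookups.
    public long[] Snapshot()
    {
        long[] copy = new long[Size];
        Array.Copy(samples, copy, copy.Length);
        Array.Sort(copy);
        return copy;
    }
}
```

Memory use stays fixed at `capacity` values no matter how many updates arrive, and quantiles are read from the small sorted snapshot instead of the full stream.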
    private readonly Histogram histogram = Metric.Histogram("SearchResultSize", Unit.Items);

    public void Search(string keyword)
    {
        var results = ActualSearch(keyword);
        // Record how many results this search returned.
        histogram.Update(results.Length);
    }
Out of the box, three sampling types are provided:
- Exponentially Decaying Reservoir - produces quantiles which are representative of (roughly) the last five minutes of data
- Uniform Reservoir - produces quantiles which are valid for the entirety of the histogram’s lifetime
- Sliding Window Reservoir - produces quantiles which are representative of the past N measurements
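Of the three types listed above, the sliding window is the simplest to illustrate. The sketch below keeps the last N measurements in a circular buffer; `SlidingWindowSketch` is a hypothetical class written for this page, it is not thread-safe, and it should not be read as the library's actual implementation.

```csharp
using System;

public sealed class SlidingWindowSketch
{
    private readonly long[] window;
    private long count; // total number of values seen so far

    public SlidingWindowSketch(int windowSize)
    {
        if (windowSize <= 0) throw new ArgumentOutOfRangeException(nameof(windowSize));
        window = new long[windowSize];
    }

    public void Update(long value)
    {
        // Overwrite the oldest slot once the buffer has wrapped around.
        int slot = (int)(count % window.Length);
        window[slot] = value;
        count++;
    }

    // Returns a sorted copy of the values currently in the window.
    public long[] Snapshot()
    {
        int size = (int)Math.Min(count, window.Length);
        long[] copy = new long[size];
        Array.Copy(window, copy, size);
        Array.Sort(copy);
        return copy;
    }
}
```

With a window size of 3, updating with the values 1 through 5 leaves only 3, 4 and 5 in the snapshot, so quantiles reflect just the most recent measurements.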
More information about the reservoir types can be found in the Java library documentation.
To report a problem, please open a GitHub issue. For other questions and ideas, feel free to ping us: @PaulParau, @HinteaDan, @BogdanGaliceanu.