Allow specification of latency distribution in HTTP/gRPC fault injection #196
Labels
enhancement
New feature or request
needs evaluation
issue needs evaluation to assess viability or impact
Currently, only an average and a variation can be specified for the latency introduced by fault injection.
However, in most systems, latency follows a log-normal distribution with a long tail that affects the 95th and 99th percentiles (the ones most commonly used in SLOs).
Therefore, it would be interesting to allow the user to control the shape of the latency in a more precise way.
This does not necessarily mean explicitly defining the distribution of the latency to be introduced, mostly because it is important to balance expressiveness with ergonomics in the API: very precise control of the distribution at the expense of a complex or error-prone API would be more harmful than useful.
Alternatively, specifying that only a fraction of requests is affected by a given random latency could approximate the effect of increased latency in the long tail.
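To illustrate the difference, here is a minimal Python simulation, not code from the project. The function names, the `fraction` parameter, and the reading of "average and variation" as a uniform draw from `[average - variation, average + variation]` are all assumptions made for this sketch. It compares delaying every request with delaying only 10% of them.

```python
import math
import random


def uniform_latency(rng, average_ms, variation_ms):
    # Assumption: "average and variation" means a uniform draw from
    # [average - variation, average + variation].
    return rng.uniform(average_ms - variation_ms, average_ms + variation_ms)


def fractional_latency(rng, fraction, average_ms, variation_ms):
    # Hypothetical option: delay only `fraction` of the requests and
    # leave the rest untouched (0 ms of added latency).
    if rng.random() < fraction:
        return uniform_latency(rng, average_ms, variation_ms)
    return 0.0


def percentile(samples, p):
    # Nearest-rank percentile, p in (0, 100].
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


rng = random.Random(42)  # fixed seed so runs are repeatable
all_requests = [uniform_latency(rng, 500.0, 100.0) for _ in range(10_000)]
tail_only = [fractional_latency(rng, 0.10, 500.0, 100.0) for _ in range(10_000)]

for p in (50, 95, 99):
    print(f"p{p}: all={percentile(all_requests, p):.0f} ms, "
          f"tail_only={percentile(tail_only, p):.0f} ms")
```

With these settings, delaying every request shifts the median as well as the tail, while the fractional variant leaves the median at 0 ms of added latency and only moves the upper percentiles, which is closer to the long-tail behaviour described above.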