The (default) optimized `slo:sli_error:ratio_rate30d` uses an expression of `sum_over_time() / count_over_time()`. This follows 9cd3177, which changed it from `avg_over_time()`.
I'm very confused about what the difference is. The definition of an arithmetic average (mean) is `sum() / count()`, so unless there's something unusual in prom's implementation of these functions, I would expect the two expressions to be equivalent.

Prom's best practices on recording rules do mention:

> When aggregating up ratios, aggregate up the numerator and denominator separately and then divide. Do not take the average of a ratio or average of an average as that is not statistically valid.
But sloth does not preserve either the numerator or denominator, therefore doing that is not possible.
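
To make the expected equivalence concrete, here is a minimal Python sketch of the arithmetic only. The sample values are made up, and the sketch says nothing about how Prometheus actually implements `sum_over_time()`, `count_over_time()`, or `avg_over_time()`; it only shows that, on the same set of samples, sum divided by count is the arithmetic mean.

```python
import math
from statistics import fmean

# Hypothetical error-ratio samples from one range window (values are made up).
samples = [0.10, 0.02, 0.00, 0.05]

# Analogue of sum_over_time(x) / count_over_time(x).
sum_over_count = sum(samples) / len(samples)

# Analogue of avg_over_time(x): the arithmetic mean of the same samples.
avg = fmean(samples)

# Both are the same unweighted mean, up to floating-point rounding.
same = math.isclose(sum_over_count, avg)
```

If the two PromQL expressions ever give different results in practice, the cause would have to be something other than this arithmetic, for example numerical handling inside Prometheus, which this sketch does not model.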
Agreed. This seems to be just a different way of averaging ratios, as far as I can tell.
The missing information is the number of requests in each 5m period. Without that, a 5 minute period with 1 error in 10 requests (10% error rate) will be treated equally to a 5 minute period with 1,000 errors in 10,000 requests (also a 10% error rate). But the 1,000 errors should contribute significantly more to the overall 30 day error rate than the 1 error.
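
A small Python sketch of that point, using made-up counts. The two windows above share the same 10% ratio, so both methods would agree on them; the hypothetical numbers below use different ratios so the gap becomes visible. The variable names are illustrative and not taken from sloth.

```python
# Hypothetical (errors, requests) counts for two 5m windows.
windows = [(1, 10), (100, 10_000)]

# Per-window error ratios: the only thing kept when the counts are discarded.
ratios = [errors / requests for errors, requests in windows]

# Unweighted mean of ratios: what averaging the ratio samples gives,
# whether written as avg or as sum / count.
unweighted = sum(ratios) / len(ratios)

# Request-weighted ratio: total errors divided by total requests.
total_errors = sum(errors for errors, _ in windows)
total_requests = sum(requests for _, requests in windows)
weighted = total_errors / total_requests

print(f"unweighted mean of ratios: {unweighted:.4f}")
print(f"total errors / total requests: {weighted:.4f}")
```

Here the low-traffic window pulls the unweighted mean to about 5.5%, while the request-weighted error rate is close to 1%. Recovering the weighted figure requires the per-window request counts, which is exactly the information that is not preserved.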