When calculating univariate drift, you "fit" the drift on the reference. How are the drift metrics of the chunks in the reference data then calculated? - Are they compared to the overall distribution of the reference data?
Yes, that's how it is done currently, and we are aware it is not the optimal way. Good job on spotting that though 👏
So the correct way is: when calculating a drift metric for a chunk that is a subset of the reference data, the observations belonging to that chunk should be "removed" from the reference data before the comparison, just like in cross-validation. Otherwise some of the drift metrics come out lower than they should, because one dataset (the reference chunk) is a subset of the other (the whole reference). As a result, in an extreme case, one may have perfectly iid data and still see lower drift metrics on reference chunks than on monitored (analysis) data, even though with iid data they shouldn't be.
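To make the size of this effect concrete for one metric, here is a short derivation that assumes the drift metric is the two-sample Kolmogorov-Smirnov statistic. The symbols are introduced here for illustration only: $F_c$ is the empirical CDF of a reference chunk with $n_c$ observations, $F_R$ that of the whole reference with $n$ observations, and $F_{R \setminus c}$ that of the reference with the chunk removed. Since the whole reference is the chunk plus the rest, its empirical CDF is a weighted mix of the two:

$$
F_R(x) = \frac{n_c}{n} F_c(x) + \left(1 - \frac{n_c}{n}\right) F_{R \setminus c}(x)
$$

Subtracting this from $F_c(x)$ and taking the supremum gives:

$$
\sup_x \left| F_c(x) - F_R(x) \right| = \left(1 - \frac{n_c}{n}\right) \sup_x \left| F_c(x) - F_{R \setminus c}(x) \right|
$$

So under this assumption, with 10 equally sized reference chunks the statistic computed against the whole reference is exactly 10% lower than the one computed against the held-out remainder. Other metrics will shrink differently, but the direction of the bias is the same.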
We plan to fix this, either by enforcing the new, correct way, or by making it the default while keeping the old way as an option, since it may sometimes be useful because of its lower computational cost. I can't say exactly when, because our current focus is on research related to performance estimation methods.
Until we fix it, if you really want to, you can work around it yourself by fitting the calculator multiple times, each time on a subset of the reference data that does not contain the reference chunk of interest.
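The idea behind this workaround can be sketched in plain Python, without the library. This is only an illustration under stated assumptions: it uses a hand-written Kolmogorov-Smirnov statistic as a stand-in drift metric, fixed-size chunks, and toy iid data. The names `ks_statistic` and `reference_chunk_drift` are hypothetical helpers, not part of any library API. With the real calculator you would fit on `rest` and calculate on `chunk` instead.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both ECDFs are step functions, so the largest gap occurs at an observed value.
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def reference_chunk_drift(reference, chunk_size):
    """For each reference chunk, return (in_sample, held_out) KS statistics."""
    if chunk_size >= len(reference):
        raise ValueError("chunk_size must be smaller than the reference size")
    results = []
    for start in range(0, len(reference), chunk_size):
        chunk = reference[start:start + chunk_size]
        # Reference data with the chunk of interest removed.
        rest = reference[:start] + reference[start + chunk_size:]
        in_sample = ks_statistic(chunk, reference)  # current behaviour
        held_out = ks_statistic(chunk, rest)        # chunk excluded first
        results.append((in_sample, held_out))
    return results


random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(1000)]  # toy iid data
for in_sample, held_out in reference_chunk_drift(reference, chunk_size=200):
    print(f"in-sample KS={in_sample:.4f}  held-out KS={held_out:.4f}")
```

For this particular metric the in-sample value is never larger than the held-out one. With 5 equal chunks it is 0.8 times the held-out value, which shows how comparing a chunk against a reference that contains it understates drift. The cost is one extra comparison set per chunk, which is the computational overhead mentioned above.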