hierarchical iterations #52

FelixWick · 2023-10-09T15:19:06Z

In the first three iterations of the training, only selected feature groups are used, i.e., all other feature groups are excluded. From the fourth iteration onwards, all feature groups are used. The idea of such hierarchical iterations is to support the modeling of hierarchical or causal effects (e.g., mitigate confounding).
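A minimal sketch of the selection rule described above, not the code from this PR. It assumes iterations are counted from 1, and the function and constant names are hypothetical.

```python
from typing import List, Optional, Sequence

# Number of initial iterations restricted to the selected groups (from the description).
N_HIERARCHICAL_ITERATIONS = 3


def active_feature_groups(
    iteration: int,
    all_groups: Sequence[str],
    hierarchical_groups: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the feature groups used in a given (1-based) training iteration."""
    if hierarchical_groups is not None and iteration <= N_HIERARCHICAL_ITERATIONS:
        # Early iterations: keep only the selected groups, in their original order.
        return [g for g in all_groups if g in hierarchical_groups]
    # From the fourth iteration onwards (or without a selection): use everything.
    return list(all_groups)


groups = ["product", "price", "weekday"]
schedule = {i: active_feature_groups(i, groups, ["product"]) for i in range(1, 6)}
```

Here `schedule` maps iterations 1 to 3 to the selected group only and iterations 4 and 5 to all three groups.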

lbventura · 2023-10-13T12:58:56Z

cyclic_boosting/base.py

+        idea of such hierarchical iterations is to support the modeling of
+        hierarchical or causal effects (e.g., mitigate confounding).
+
+        If this argument is omitted, such no hierarchical iterations are run.


A rather pedantic comment. This argument cannot be omitted (in the strict sense that it is not defined) because in the `__init__` of `CyclicBoostingBase`, `hierarchical_feature_groups` is set to `None` and not `=Optional[Union[str, int, Tuple, FeatureID]]`. This means that `hierarchical_feature_groups = None` has to be passed to the child classes of `CyclicBoostingBase`.

TL;DR: "omitted" makes it sound as if this argument is Optional, but it is not. It is mandatory but has a default of None. Perhaps "If this argument is not explicitly set, no such hierarchical iterations are run." is better?

Will change it, thanks.
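For readers following the distinction drawn in this thread: in Python, a default value is what allows an argument to be left out of a call, while `Optional[...]` is only a type annotation. A minimal sketch with hypothetical stand-in classes (not the actual `CyclicBoostingBase`), including a child class that forwards the argument:

```python
import inspect
from typing import Optional, Tuple, Union


class BaseSketch:
    """Hypothetical stand-in for a base estimator: default None, no annotation."""

    def __init__(self, hierarchical_feature_groups=None):
        self.hierarchical_feature_groups = hierarchical_feature_groups


class AnnotatedSketch:
    """Same default, plus an Optional[...] type annotation."""

    def __init__(
        self,
        hierarchical_feature_groups: Optional[Union[str, int, Tuple]] = None,
    ):
        self.hierarchical_feature_groups = hierarchical_feature_groups


class ChildSketch(BaseSketch):
    """A child class has to forward the argument explicitly to the base class."""

    def __init__(self, hierarchical_feature_groups=None):
        super().__init__(hierarchical_feature_groups=hierarchical_feature_groups)


def default_of(cls):
    """Read the declared default of the argument from the constructor signature."""
    params = inspect.signature(cls.__init__).parameters
    return params["hierarchical_feature_groups"].default
```

Both stand-ins report the same default, so the annotation changes what a type checker sees rather than the runtime behaviour.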

lbventura · 2023-10-13T13:02:44Z

cyclic_boosting/base.py

@@ -887,8 +920,11 @@ def _check_stop_criteria(self, iterations: int, convergence_parameters: Converge
                "analysis plots."
            )

+        if iterations <= 3 and self.hierarchical_feature_groups is not None:


Rather ignorant question, why is 3 the magical number of iterations here? Is it a good balance between two criteria?
If this is in fact a parameter one can tune, then 3 should be replaced by a constant, say "TRAINING_ITERATIONS_HIER_FEATURES", and the docstring above has to be adjusted.

I think 3 is a good value, but a parameter with default 3 might be better, yes. I will change it.
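A rough sketch of what replacing the hard-coded 3 with a parameter could look like. The class and the parameter name `hierarchical_iterations` are hypothetical and only illustrate the idea; they are not taken from the merged code.

```python
from typing import Optional, Sequence


class HierarchicalSchedule:
    """Hypothetical helper: decides whether an iteration is restricted to selected groups."""

    def __init__(
        self,
        hierarchical_feature_groups: Optional[Sequence[str]] = None,
        hierarchical_iterations: int = 3,  # default mirrors the value discussed above
    ):
        if hierarchical_iterations < 0:
            raise ValueError("hierarchical_iterations must be non-negative")
        self.hierarchical_feature_groups = hierarchical_feature_groups
        self.hierarchical_iterations = hierarchical_iterations

    def in_hierarchical_phase(self, iteration: int) -> bool:
        """Same shape as the condition in the diff, with the literal 3 made configurable."""
        return (
            self.hierarchical_feature_groups is not None
            and iteration <= self.hierarchical_iterations
        )
```

With the default, `HierarchicalSchedule(["product"]).in_hierarchical_phase(3)` is true and the same call with iteration 4 is false; passing `hierarchical_iterations=5` extends the restricted phase accordingly.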

lbventura · 2023-10-13T13:04:53Z

tests/test_integration.py

+    yhat = CB_est.predict(X.copy())
+
+    mad = np.nanmean(np.abs(y - yhat))
+    np.testing.assert_almost_equal(mad, 1.699, 3)


Can I read this slight improvement in MAD (compared to the previous test) as a result of feature hierarchization, or is the difference too small to attribute it to that?

The improvement is small, yes, but I think it is thanks to the hierarchical training. Here it helps to describe the strong confounding of the price due to different products. I couldn't find a better example in our integration test.
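For context on the metric in the test above, a small sketch on made-up toy data (not the integration-test data) of how `np.nanmean(np.abs(y - yhat))` behaves and how the three-decimal comparison is expressed. The helper name is hypothetical.

```python
import numpy as np


def mean_absolute_deviation(y, yhat) -> float:
    """Mean absolute deviation between targets and predictions, ignoring NaN entries."""
    y = np.asarray(y, dtype=float)
    yhat = np.asarray(yhat, dtype=float)
    return float(np.nanmean(np.abs(y - yhat)))


# Toy values: the NaN target is skipped, leaving deviations 0.5, 0.0 and 1.0.
y = np.array([1.0, 2.0, np.nan, 4.0])
yhat = np.array([1.5, 2.0, 3.0, 3.0])

mad = mean_absolute_deviation(y, yhat)

# Third argument is the number of decimal places used for the comparison.
np.testing.assert_almost_equal(mad, 0.5, 3)
```

Because the check is pinned to three decimals, even a small change in the fitted model shows up as a changed reference value in the test.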

enable hierarchical iterations

ecb603c

FelixWick requested a review from lbventura October 9, 2023 15:19

FelixWick self-assigned this Oct 9, 2023

FelixWick linked an issue Oct 9, 2023 that may be closed by this pull request

enable hierarchical groups in optimization #4

Closed

lbventura approved these changes Oct 13, 2023

hierarchical iterations as parameter

5a1d7c8

FelixWick merged commit 550760c into main Oct 16, 2023
8 checks passed

FelixWick deleted the hierarchical_iterations branch October 16, 2023 20:05

FelixWick commented Oct 9, 2023

lbventura Oct 13, 2023

FelixWick Oct 14, 2023

lbventura Oct 13, 2023

FelixWick Oct 14, 2023

lbventura Oct 13, 2023

FelixWick Oct 14, 2023
