feat: enable feature grouping for attention mechanism #443
Conversation
Force-pushed from 177a0f9 to a12c039
- list_groups : list of list of int
    Each element is a list representing features in the same group.
    One feature should appear in maximum one group.
    Feature that don't get assign a group will be in their own group of one feature.
Features that don't get assigned
for group_pos, group in enumerate(list_groups):
    msg = f"Groups must be given as a list of list, but found {group} in position {group_pos}."  # noqa
    assert isinstance(group, list), msg
    assert len(group) > 0, "Empty groups are forbidding please remove empty groups []"
forbidding -> forbidden
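For context, here is a small, self-contained sketch of the kind of validation these asserts perform, extended with the "one feature in at most one group" rule from the docstring above. The helper name and messages are illustrative, not necessarily the PR's actual code:

```python
def check_list_groups(list_groups, input_dim):
    # hypothetical helper, sketching the checks described in this review thread
    assert isinstance(list_groups, list), "list_groups must be a list of lists"
    seen = set()
    for group_pos, group in enumerate(list_groups):
        msg = f"Groups must be given as a list of list, but found {group} in position {group_pos}."
        assert isinstance(group, list), msg
        assert len(group) > 0, "Empty groups are forbidden, please remove empty groups []"
        for feat in group:
            assert 0 <= feat < input_dim, f"Feature index {feat} is out of range."
            assert feat not in seen, f"Feature {feat} appears in more than one group."
            seen.add(feat)

check_list_groups([[0, 1], [3]], input_dim=5)  # features 2 and 4 stay in their own singleton groups
```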
out = self.feat_transformers[step](masked_x)
d = ReLU()(out[:, : self.n_d])
# explain
step_importance = torch.sum(d, dim=1)
- M_explain += torch.mul(M, step_importance.unsqueeze(dim=1))
+ M_explain += torch.mul(M_feature_level, step_importance.unsqueeze(dim=1))
I wonder if we could multiply this by the transpose at the end, to divide the importance (equally, given how the matrix is created) across each feature of the group. I don't know if it's a good idea.
There is already a mapping to get importance from the post-embedding dimensions back to the initial features. They might indeed be redundant, but I don't think I have time to think about it.
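For intuition only (not the PR's code), here is a minimal sketch of how a row-normalized group matrix can spread group-level importance equally over the features of each group, which is roughly what the transpose idea above would amount to:

```python
import torch

# 4 features grouped as [[0, 1], [2], [3]]; rows are groups, columns are features,
# and each row sums to 1 so a group's importance is split equally among its members.
group_matrix = torch.tensor([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

group_importance = torch.tensor([[0.6, 0.3, 0.1]])     # one sample, one importance per group
feature_importance = group_importance @ group_matrix   # -> [[0.3, 0.3, 0.3, 0.1]]
print(feature_importance)
```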
Force-pushed from a12c039 to 0d03b06
Can we add an example of how to use this? I'm a bit confused about it. @Optimox
@gauravbrills There is an example in this notebook: https://github.com/dreamquark-ai/tabnet/blob/develop/census_example.ipynb You simply need to give a list of groups; each group is a list with the indices of the features forming the group.
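A minimal usage sketch based on this comment (the `grouped_features` argument name is assumed here; see the linked notebook for the exact API):

```python
from pytorch_tabnet.tab_model import TabNetClassifier

# Features 0-2 form one group and features 5-6 another; any feature not listed
# stays in its own single-feature group.
clf = TabNetClassifier(grouped_features=[[0, 1, 2], [5, 6]])
# clf.fit(X_train, y_train, eval_set=[(X_valid, y_valid)])
```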
What kind of change does this PR introduce?
This PR solves #122 in a new way.
Embeddings from the same column are automatically grouped together.
Attention now works at the group level rather than the feature level, so even without specifying anything, groups are created for the different categorical features.
This also adds a new feature: users can specify sets of features they would like to be grouped together by the attention mechanism. This can be very useful when sparse features are created (for example after a TF-IDF transform); attention has a hard time using this kind of feature because of the sparsity coming from both the data and the attention. Now you can group all those features together and have a single attention weight for the whole group (see the sketch below).
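As an illustration of this use case (toy data and placeholder names, not from the PR), all TF-IDF columns can be collected into a single group:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["tabular attention", "sparse features", "feature grouping"]
X_dense = np.random.rand(3, 4)                      # 4 dense features

tfidf = TfidfVectorizer().fit_transform(texts).toarray()
X = np.hstack([X_dense, tfidf])

# all TF-IDF columns form one group; dense features keep their own groups
n_dense = X_dense.shape[1]
grouped_features = [list(range(n_dense, X.shape[1]))]
```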
Does this PR introduce a breaking change?
I'm not sure whether this should be considered a breaking change. I think so, as old trained models used with this new code will behave differently.
What needs to be documented once your changes are merged?
I think all changes are already documented in this PR.
Closing issues
closes #122