v0.7.2 - TokenClassificationExplainer (NER)
TokenClassificationExplainer (#91)
This incredible release is all thanks to a fantastic community contribution from @pabvald, who implemented the entire TokenClassificationExplainer class, along with all its tests and associated docs. A huge thank you again to Pablo for this amazing work; it has been on my to-do list for over a year, and I greatly appreciate this contribution. I know the community will too.
This new explainer is designed to work with any model in the Hugging Face Transformers package of the kind {Model}ForTokenClassification, which are models commonly used for tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.
The TokenClassificationExplainer returns a dictionary mapping each word in a given sequence to a label from the model's trained label configuration. Token classification models work on a word-by-word basis, so each word in the explainer's output maps to another dictionary with two keys, label and attribution_scores, where label is a string indicating the predicted label and attribution_scores is another dict mapping each word in the sequence to its score for the given root word key.
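As a rough sketch of how this structure can be consumed, the snippet below iterates over the returned dictionary and ranks each word's scores; the variable name attributions matches the example in the next section:

# A minimal sketch of reading the output structure described above,
# using the `attributions` dict returned in the example below.
for word, explanation in attributions.items():
    # The predicted label for this word
    print(f"{word} -> {explanation['label']}")
    # Words ranked by how strongly they contributed to this prediction
    top_words = sorted(explanation["attribution_scores"], key=lambda pair: pair[1], reverse=True)[:3]
    print("  top attributions:", top_words)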
How to use
from transformers import AutoModelForTokenClassification, AutoTokenizer
from transformers_interpret import TokenClassificationExplainer
MODEL_PATH = 'dslim/bert-base-NER'
model = AutoModelForTokenClassification.from_pretrained(MODEL_PATH)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
ner_explainer = TokenClassificationExplainer(model=model, tokenizer=tokenizer)
sample_text = "Tim Cook is CEO of Apple."
attributions = ner_explainer(sample_text)
print(attributions)
Word attribution dictionary:
{'[CLS]': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.346423320984119),
('Cook', 0.5334609978768102),
('is', -0.40334870049983335),
('CEO', -0.3101234375976895),
('of', 0.512072192130804),
('Apple', -0.17249370683345489),
('.', 0.21111967418861474),
('[SEP]', 0.0)]},
'Tim': {'label': 'B-PER',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.6097200124017794),
('Cook', 0.7418433507979225),
('is', 0.2277328676307869),
('CEO', 0.12913824237676577),
('of', 0.0658425121482477),
('Apple', 0.06830320263790929),
('.', -0.01924683905463743),
('[SEP]', 0.0)]},
'Cook': {'label': 'I-PER',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.5523936725613293),
('Cook', 0.8009957951991128),
('is', 0.1804967026709793),
('CEO', 0.12327788007775593),
('of', 0.042470529981614845),
('Apple', 0.057217721910403266),
('.', -0.020318897077615642),
('[SEP]', 0.0)]},
'is': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.24614651317657982),
('Cook', -0.009088703281476993),
('is', 0.9216954069405697),
('CEO', 0.026992140219729874),
('of', 0.2520559406534854),
('Apple', -0.09920548911190433),
('.', 0.12531705560714215),
('[SEP]', 0.0)]},
'CEO': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.3124910273039106),
('Cook', 0.3625517589427658),
('is', 0.3507524148134499),
('CEO', 0.37196988201878567),
('of', 0.645668212957734),
('Apple', -0.27458958091134866),
('.', 0.13126252757894524),
('[SEP]', 0.0)]},
'of': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.021065140560775575),
('Cook', 0.05638048932919909),
('is', 0.16774739397504396),
('CEO', 0.043009122581603866),
('of', 0.9340829137500298),
('Apple', -0.11144488868920191),
('.', 0.2854079089492836),
('[SEP]', 0.0)]},
'Apple': {'label': 'B-ORG',
'attribution_scores': [('[CLS]', 0.0),
('Tim', -0.017330599088927878),
('Cook', -0.04074196463435918),
('is', -0.08738080703156076),
('CEO', 0.23234519803002726),
('of', 0.12270125701886334),
('Apple', 0.9561624229708163),
('.', -0.08436746169241069),
('[SEP]', 0.0)]},
'.': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.052863660537099254),
('Cook', -0.0694824371223385),
('is', -0.18074653059003534),
('CEO', 0.021118463602210605),
('of', 0.06322422431822372),
('Apple', -0.6286955666244136),
('.', 0.748336093254276),
('[SEP]', 0.0)]},
'[SEP]': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.29980967625881066),
('Cook', -0.22297477338851293),
('is', -0.050889312336460345),
('CEO', 0.11157068443843984),
('of', 0.25200059104116196),
('Apple', -0.8839047143031845),
('.', -0.023808126035021283),
('[SEP]', 0.0)]}}
Visualizing explanations
With a single call to the visualize() method we get a nice inline display of which inputs drove the model to classify each token into its predicted class.
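For example, in a notebook environment (and assuming the method also accepts an optional HTML file path, as the package's other explainers do, which is an assumption here):

# Display the attribution visualization inline, e.g. in a Jupyter notebook.
ner_explainer.visualize()

# Assumption: as with the package's other explainers, a file path can be
# passed to save the visualization as HTML.
ner_explainer.visualize("ner_attributions.html")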
Ignore indexes
To save computation time, we can pass a list of token indexes to ignore. The explainer will not compute explanations for these tokens, although their attributions will still be calculated and used to explain the predictions for the other tokens.
attributions_2 = ner_explainer(sample_text, ignored_indexes=[0, 3, 4, 5])
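To check which index corresponds to which token, one option is to inspect the tokenized input directly; this is a sketch that assumes the indexes follow the same order as the word attribution dictionary above, special tokens included:

# A sketch for mapping indexes to tokens, assuming the indexes follow the
# order of the word attribution dictionary above (special tokens included).
encoded = tokenizer(sample_text)
tokens = tokenizer.convert_ids_to_tokens(encoded["input_ids"])
print(list(enumerate(tokens)))
# e.g. [(0, '[CLS]'), (1, 'Tim'), (2, 'Cook'), (3, 'is'), (4, 'CEO'), (5, 'of'), ...]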
When we visualize these attributions, the resulting display is much more concise.
Ignore labels
In a similar way, we can also tell the explainer to ignore certain labels; for example, we might not be interested in the explanations for tokens classified as 'O'.
attributions_3 = ner_explainer(sample_text, ignored_labels=['O'])
Which results in explanations only for the tokens with entity labels (Tim, Cook, and Apple in the example above).
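As a quick sanity check (a sketch that assumes tokens with ignored labels are simply absent from the returned dictionary):

# A sketch for confirming which tokens received explanations, assuming
# tokens with ignored labels are simply absent from the returned dict.
for word, explanation in attributions_3.items():
    print(word, "->", explanation["label"])
# Expected for the example above: Tim -> B-PER, Cook -> I-PER, Apple -> B-ORG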