v0.7.2 - TokenClassificationExplainer (NER)
TokenClassificationExplainer (#91)
This incredible release is all thanks to a fantastic community contribution from @pabvald, who implemented the entire TokenClassificationExplainer class, along with all its tests and associated docs. A huge thank you again to Pablo for this amazing work; it has been on my to-do list for over a year, and I greatly appreciate this contribution. I know the community will too.
This new explainer is designed to work with any model in the Hugging Face Transformers package of the kind {Model}ForTokenClassification, which are models commonly used for tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.
The TokenClassificationExplainer returns a dictionary mapping each word in a given sequence to a label from the model's trained label configuration. Token classification models work on a word-by-word basis, so each word in the explainer's output maps to another dictionary with two keys, label and attribution_scores, where label is a string indicating the predicted label and attribution_scores is another dict mapping each word in the sequence to its score for the given root word key.
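As a rough sketch of how this structure can be consumed, the snippet below iterates over the returned dictionary and ranks each word's scores; the variable name attributions matches the example in the next section:

# A minimal sketch of reading the output structure described above,
# using the `attributions` dict returned in the example below.
for word, explanation in attributions.items():
    # The predicted label for this word
    print(f"{word} -> {explanation['label']}")
    # Words ranked by how strongly they contributed to this prediction
    top_words = sorted(explanation["attribution_scores"], key=lambda pair: pair[1], reverse=True)[:3]
    print("  top attributions:", top_words)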
How to use
from transformers import AutoModelForTokenClassification, AutoTokenizer
from transformers_interpret import TokenClassificationExplainer
MODEL_PATH = 'dslim/bert-base-NER'
model = AutoModelForTokenClassification.from_pretrained(MODEL_PATH)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
ner_explainer = TokenClassificationExplainer(model=model, tokenizer=tokenizer)
sample_text = "Tim Cook is CEO of Apple."
attributions = ner_explainer(sample_text)
print(attributions)
Word attribution dictionary:
{'[CLS]': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.346423320984119),
('Cook', 0.5334609978768102),
('is', -0.40334870049983335),
('CEO', -0.3101234375976895),
('of', 0.512072192130804),
('Apple', -0.17249370683345489),
('.', 0.21111967418861474),
('[SEP]', 0.0)]},
'Tim': {'label': 'B-PER',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.6097200124017794),
('Cook', 0.7418433507979225),
('is', 0.2277328676307869),
('CEO', 0.12913824237676577),
('of', 0.0658425121482477),
('Apple', 0.06830320263790929),
('.', -0.01924683905463743),
('[SEP]', 0.0)]},
'Cook': {'label': 'I-PER',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.5523936725613293),
('Cook', 0.8009957951991128),
('is', 0.1804967026709793),
('CEO', 0.12327788007775593),
('of', 0.042470529981614845),
('Apple', 0.057217721910403266),
('.', -0.020318897077615642),
('[SEP]', 0.0)]},
'is': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.24614651317657982),
('Cook', -0.009088703281476993),
('is', 0.9216954069405697),
('CEO', 0.026992140219729874),
('of', 0.2520559406534854),
('Apple', -0.09920548911190433),
('.', 0.12531705560714215),
('[SEP]', 0.0)]},
'CEO': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.3124910273039106),
('Cook', 0.3625517589427658),
('is', 0.3507524148134499),
('CEO', 0.37196988201878567),
('of', 0.645668212957734),
('Apple', -0.27458958091134866),
('.', 0.13126252757894524),
('[SEP]', 0.0)]},
'of': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.021065140560775575),
('Cook', 0.05638048932919909),
('is', 0.16774739397504396),
('CEO', 0.043009122581603866),
('of', 0.9340829137500298),
('Apple', -0.11144488868920191),
('.', 0.2854079089492836),
('[SEP]', 0.0)]},
'Apple': {'label': 'B-ORG',
'attribution_scores': [('[CLS]', 0.0),
('Tim', -0.017330599088927878),
('Cook', -0.04074196463435918),
('is', -0.08738080703156076),
('CEO', 0.23234519803002726),
('of', 0.12270125701886334),
('Apple', 0.9561624229708163),
('.', -0.08436746169241069),
('[SEP]', 0.0)]},
'.': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.052863660537099254),
('Cook', -0.0694824371223385),
('is', -0.18074653059003534),
('CEO', 0.021118463602210605),
('of', 0.06322422431822372),
('Apple', -0.6286955666244136),
('.', 0.748336093254276),
('[SEP]', 0.0)]},
'[SEP]': {'label': 'O',
'attribution_scores': [('[CLS]', 0.0),
('Tim', 0.29980967625881066),
('Cook', -0.22297477338851293),
('is', -0.050889312336460345),
('CEO', 0.11157068443843984),
('of', 0.25200059104116196),
('Apple', -0.8839047143031845),
('.', -0.023808126035021283),
('[SEP]', 0.0)]}}
Visualizing explanations
With a single call to the visualize() method we get a nice inline display of which inputs drove the model to classify each token into its predicted class.
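For example, in a notebook environment (and assuming the method also accepts an optional HTML file path, as the package's other explainers do, which is an assumption here):

# Display the attribution visualization inline, e.g. in a Jupyter notebook.
ner_explainer.visualize()

# Assumption: as with the package's other explainers, a file path can be
# passed to save the visualization as HTML.
ner_explainer.visualize("ner_attributions.html")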
Ignore indexes
To save computation time, we can pass a list of token indexes to ignore. The explainer will not compute explanations for these tokens, although their attributions will still be calculated and used to explain the predictions for the other tokens.
attributions_2 = ner_explainer(sample_text, ignored_indexes=[0, 3, 4, 5])
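To check which index corresponds to which token, one option is to inspect the tokenized input directly; this is a sketch that assumes the indexes follow the same order as the word attribution dictionary above, special tokens included:

# A sketch for mapping indexes to tokens, assuming the indexes follow the
# order of the word attribution dictionary above (special tokens included).
encoded = tokenizer(sample_text)
tokens = tokenizer.convert_ids_to_tokens(encoded["input_ids"])
print(list(enumerate(tokens)))
# e.g. [(0, '[CLS]'), (1, 'Tim'), (2, 'Cook'), (3, 'is'), (4, 'CEO'), (5, 'of'), ...]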
When we visualize these attributions, the resulting display is much more concise.
Ignore labels
In a similar way, we can also tell the explainer to ignore certain labels; for example, we might not be interested in the explanations for tokens classified as 'O'.
attributions_3 = ner_explainer(sample_text, ignored_labels=['O'])
Which results in explanations only for the tokens with entity labels (Tim, Cook, and Apple in the example above).
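As a quick sanity check (a sketch that assumes tokens with ignored labels are simply absent from the returned dictionary):

# A sketch for confirming which tokens received explanations, assuming
# tokens with ignored labels are simply absent from the returned dict.
for word, explanation in attributions_3.items():
    print(word, "->", explanation["label"])
# Expected for the example above: Tim -> B-PER, Cook -> I-PER, Apple -> B-ORG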