Release version 0.8.1
Lots of changes, big and small, with this release:
PairwiseSequenceClassificationExplainer (#87, #82, #58)
This has been a frequently requested feature and one that I am very happy to release, especially as I have had the desire to explain the outputs of CrossEncoder models as of late.
The `PairwiseSequenceClassificationExplainer` is a variant of the `SequenceClassificationExplainer` designed to work with classification models that expect the input sequence to be two inputs separated by the model's separator token. Common examples of this are NLI models and Cross-Encoders, which are commonly used to score the similarity of two inputs to one another. This explainer calculates pairwise attributions for two passed inputs, `text1` and `text2`, using the model and tokenizer given in the constructor.
Also, since a common use case for pairwise sequence classification is to compare the similarity of two inputs, models of this nature typically have a single output node rather than one per class. The pairwise sequence classification explainer has some useful utility functions to make interpreting single-node outputs clearer.
By default, for models that output a single node, the attributions are with respect to the inputs pushing the score closer to 1.0; if you want to see the attributions with respect to scores closer to 0.0, you can pass `flip_sign=True` when calling the explainer. This is useful for similarity-based models, as the model might predict a score closer to 0.0 for the two inputs, and in that case we would flip the attributions' sign to explain why the two inputs are dissimilar.
Example Usage
For this example we are using `"cross-encoder/ms-marco-MiniLM-L-6-v2"`, a high-quality cross-encoder trained on MS MARCO, a passage-ranking dataset for question answering and machine reading comprehension.
Visualize Pairwise Classification attributions
Visualizing the pairwise attributions is no different from the sequence classification explainer. We can see that in both the `query` and the `context` there is a lot of positive attribution for the word `berlin`, as well as for the words `population` and `inhabitants` in the `context`, good signs that our model understands the textual context of the question asked. If we were more interested in highlighting the input attributions that pushed the model away from the positive class of this single-node output, we could pass `flip_sign=True`, as sketched below.
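A sketch of both steps, reusing the explainer, query, and context from the example above (the HTML file name is illustrative):

```python
# render the computed pairwise attributions as an HTML visualization
pairwise_explainer.visualize("cross_encoder_attr.html")

# recompute attributions with the sign flipped, i.e. with respect to
# the model outputting 0.0 rather than 1.0
flipped_attr = pairwise_explainer(query, context, flip_sign=True)
```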
This simply inverts the sign of the attributions, ensuring that they are with respect to the model outputting 0 rather than 1.
RoBERTa Consistency Improvements (#65)
Thanks to some great detective work by @dvsrepo, @jogonba2, @databill86, and @VDuchauffour on this issue over the last year, we've been able to identify what looks to be the main culprit responsible for the misalignment between the scores this package gives for RoBERTa-based models and their actual outputs in the transformers package.
Because this package has to create reference ids for each input type (`input_ids`, `position_ids`, `token_type_ids`) to build a baseline, we try to emulate the outputs of the model's tokenizer in an automated fashion (a rough sketch of the idea is below). For most BERT-based models this works great, but as I have learned from reading this thread (#65), there were significant issues with RoBERTa.
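As a loose illustration of what building a baseline for `input_ids` can look like (a sketch only, not the package's actual code; the helper name is made up), special tokens are kept and every other token is replaced with the pad token:

```python
def build_ref_input_ids(tokenizer, input_ids):
    """Hypothetical helper: keep special tokens, replace the rest
    with the pad token to form an attribution baseline."""
    special_ids = {tokenizer.cls_token_id, tokenizer.sep_token_id}
    return [
        tok if tok in special_ids else tokenizer.pad_token_id
        for tok in input_ids
    ]
```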
It seems that the main reason for this is that RoBERTa implements `position_ids` in a very different manner to BERT (read this and this for extra context). Since we were passing completely incorrect values for `position_ids`, it appears to have thrown the model's predictions off. This release does not fully fix the issue, but it does bypass the passing of incorrect `position_ids` by simply not passing them to the forward function. We've done this by creating a flag that recognises certain model architectures as being incompatible with how we create `position_ids`; according to the Transformers docs, when `position_ids` are not passed the model creates them automatically. So this solution should be good for most situations; however, ideally, in the future we will look into creating RoBERTa-compatible `position_ids` within the package itself.
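A hypothetical sketch of such a flag; the names below are illustrative rather than the package's real internals:

```python
# architectures whose position_ids scheme differs from BERT's; for
# these we omit position_ids and let the model create its own
ARCHS_WITHOUT_POSITION_IDS = {"roberta", "camembert", "xlm-roberta"}

def build_forward_kwargs(model, input_ids, attention_mask, position_ids):
    kwargs = {"input_ids": input_ids, "attention_mask": attention_mask}
    if model.config.model_type not in ARCHS_WITHOUT_POSITION_IDS:
        kwargs["position_ids"] = position_ids
    return kwargs
```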
Move to GH Actions
This release also moves our testing suite from CircleCI to GH Actions, which has proven easier to integrate with and much more convenient.
Other