-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add documentation notebook violin plot #97
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi there, thanks for the contribution! Really appreciate you taking this up. And sorry it took a while to get around to a review. The Github UI isn't that friendly for reviewing notebooks, so I'll resort to images.
First off, nice exposition in the intro. Couple of points:
- As described in Refactor
_violin
plot code and add examples #55, I'm intending to soft-deprecate passing in of numpy arrays (of SHAP values) as the first argument of the plotting functions -- it seems to be in the spirit that slundberg was going for in their earlier refactorings. And instead pass inExplanation
objects, which is just a wrapper around the various numpy arrays required for an "explanation".
Could you revise the example a bit to pass anExplanation
object toshap.plots.violin
. Note that you can call the Explainer directly to get an Explanation object back, in this caseshap.TreeExplainer(bst)(X)
instead of calling the.shap_values()
method which returns a numpy array -- it looks a little weird as is though, so do assign the explainer to a different variable before calling on X. - This is potentially confusing (at least, it was to me) where we say the max display is 20 but we only showed 10 features above. Shall we add in an additional sentence/clause to explain that there are only 10 columns in
X
- I find the jump to
layered_violin
is a little abrupt, we could ease the reader into the topic abit more.
Perhaps, "There are currently two supported violin plot types: 'violin' and 'layered_violin'. You can control this via theplot_type
parameter. Layered violin is identical to violin, except that outliers are not drawn as scatter points." - Also, I think let's make the layered violin example a little simpler, I think there's no need to specify
features
andcolor
-- when you providefeatures
, by default you'll get the color map version of the violin plot (i.e., the next section), which is why you had to specifycolor=blue
. This is potentially confusing for the reader. - Formatting
Thank you very much for the review. I will make the requested changes and resubmit. |
So it turns out my low hanging fruit had a worm in it, and now i am afraid to bite... If i may ask, so that I understand fully. As of now, the violin plot does not accept Explanation objects, right ?
see here : but the base_values can do the job.
Line 335 in 0821d2e
removing them seem to be okay, but i don't want to presume.
My proposal would be to correct the other points you reviewed in this PR and to leave the passing of numpy arrays for now in the doc. Then, when you definitely deprecate them, or if i can get a grasp on how to make |
Hi @adnene-guessoum, apologies for making you go down this rabbit hole. Sure, let's leave it as numpy arrays for now, perhaps the support for Explanation in |
so i made the changes, but i see there are some movements to an organisation repo if i got it right. should i close this and retry when the merge is done ? |
Nope, we're good. We can merge this here, and we'll take care of porting this change over to the main repo. Thanks for the contribution @adnene-guessoum ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the helpful review and the welcoming of new contributors. |
Related to Issue #55, I was looking for some low hanging fruits to start contributing and found this good first issue for an API example notebook related to violin plots.
I followed the example from the original package ( https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Scatter%20Density%20vs.%20Violin%20Plot%20Comparison.html?highlight=summary%20plot#Layered-violin-plot ) and the beeswarm general template.
Hope it's fine.