Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add documentation notebook violin plot #97

Merged
merged 11 commits into from
Jun 2, 2023
Merged

add documentation notebook violin plot #97

merged 11 commits into from
Jun 2, 2023

Conversation

adnene-guessoum
Copy link

Related to Issue #55, I was looking for some low hanging fruits to start contributing and found this good first issue for an API example notebook related to violin plots.

I followed the example from the original package ( https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Scatter%20Density%20vs.%20Violin%20Plot%20Comparison.html?highlight=summary%20plot#Layered-violin-plot ) and the beeswarm general template.

Hope it's fine.

@adnene-guessoum adnene-guessoum changed the title Add documentation notebook violin plot add documentation notebook violin plot May 31, 2023
@thatlittleboy thatlittleboy added the documentation Improvements or additions to documentation label Jun 1, 2023
Copy link
Collaborator

@thatlittleboy thatlittleboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there, thanks for the contribution! Really appreciate you taking this up. And sorry it took a while to get around to a review. The Github UI isn't that friendly for reviewing notebooks, so I'll resort to images.

First off, nice exposition in the intro. Couple of points:

  1. As described in Refactor _violin plot code and add examples #55, I'm intending to soft-deprecate passing in of numpy arrays (of SHAP values) as the first argument of the plotting functions -- it seems to be in the spirit that slundberg was going for in their earlier refactorings. And instead pass in Explanation objects, which is just a wrapper around the various numpy arrays required for an "explanation".
    image
    Could you revise the example a bit to pass an Explanation object to shap.plots.violin. Note that you can call the Explainer directly to get an Explanation object back, in this case shap.TreeExplainer(bst)(X) instead of calling the .shap_values() method which returns a numpy array -- it looks a little weird as is though, so do assign the explainer to a different variable before calling on X.
  2. This is potentially confusing (at least, it was to me) where we say the max display is 20 but we only showed 10 features above. Shall we add in an additional sentence/clause to explain that there are only 10 columns in X
    image
  3. I find the jump to layered_violin is a little abrupt, we could ease the reader into the topic abit more.
    image
    Perhaps, "There are currently two supported violin plot types: 'violin' and 'layered_violin'. You can control this via the plot_type parameter. Layered violin is identical to violin, except that outliers are not drawn as scatter points."
  4. Also, I think let's make the layered violin example a little simpler, I think there's no need to specify features and color -- when you provide features, by default you'll get the color map version of the violin plot (i.e., the next section), which is why you had to specify color=blue. This is potentially confusing for the reader.
  5. Formatting
    image

@adnene-guessoum
Copy link
Author

Thank you very much for the review. I will make the requested changes and resubmit.

@adnene-guessoum
Copy link
Author

adnene-guessoum commented Jun 1, 2023

So it turns out my low hanging fruit had a worm in it, and now i am afraid to bite...

If i may ask, so that I understand fully. As of now, the violin plot does not accept Explanation objects, right ?

  1. It appears to me that an Explanation object doesn't have expected_values :

https://github.com/dsgibbons/shap/blob/0821d2e1ea073562da7763236fe34c514e0bc0e8/shap/plots/_violin.py#LL61C7-L61C7

see here :

https://github.com/dsgibbons/shap/blob/0821d2e1ea073562da7763236fe34c514e0bc0e8/shap/plots/_violin.py#LL61C7-L61C7

but the base_values can do the job.

  1. However, after this change, i get a matplotlib warning regarding unused vmin / vmax ( i don't know what their purpose is, as there is no "c" parameter needed here, i feel ):

pl.scatter(shaps[nan_mask], np.ones(shap_values[nan_mask].shape[0]) * pos,

removing them seem to be okay, but i don't want to presume.

  1. After these changes however, it is also no longer possible to set colors from the top parameter. It may not have been possible from the start with Explanation objects. At this point, i don't want to touch too much as you seem to have a much clearer view of what you want to do with these violin plots.

My proposal would be to correct the other points you reviewed in this PR and to leave the passing of numpy arrays for now in the doc. Then, when you definitely deprecate them, or if i can get a grasp on how to make Explanation objects work, we can change the doc. Would that be ok ?

@thatlittleboy
Copy link
Collaborator

Hi @adnene-guessoum, apologies for making you go down this rabbit hole. Sure, let's leave it as numpy arrays for now, perhaps the support for Explanation in shap.plots.violin is not complete yet. I'll follow up on this later on.

@adnene-guessoum
Copy link
Author

so i made the changes, but i see there are some movements to an organisation repo if i got it right. should i close this and retry when the merge is done ?

@thatlittleboy
Copy link
Collaborator

Nope, we're good. We can merge this here, and we'll take care of porting this change over to the main repo.

Thanks for the contribution @adnene-guessoum !

Copy link
Collaborator

@thatlittleboy thatlittleboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

domino-effect-self-high-five

@thatlittleboy thatlittleboy merged commit ad8b8bb into dsgibbons:master Jun 2, 2023
@adnene-guessoum
Copy link
Author

Thank you for the helpful review and the welcoming of new contributors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants