Trouble calculating SHAP scores #398

Open
GKNM995 opened this issue Oct 17, 2024 · 17 comments

Comments

@GKNM995

GKNM995 commented Oct 17, 2024

Hi! I am trying to train classifiers for my project but I keep getting an error when trying to calculate SHAP scores. I tried both single and multicore runs but I get the same error either way. I can train the classifier if I completely skip over the SHAP calculations but I would love to have them.

The Simba screen freezes here:
[Image: SIMBA_screenshot]

The error message in the command line is shown here:
[Image: Error_message]

Here are the model parameters used:
pose_estimation_body_parts = 8
classifier = Body_groom
train_test_size = 0.2
under_sample_setting = None
under_sample_ratio = NaN
over_sample_setting = None
over_sample_ratio = NaN
rf_n_estimators = 2000
rf_min_sample_leaf = 1
rf_max_features = sqrt
rf_n_jobs = -1
rf_criterion = entropy
generate_rf_model_meta_data_file = None
generate_example_decision_tree = False
generate_example_decision_tree_fancy = False
generate_features_importance_log = False
generate_features_importance_bar_graph = False
compute_feature_permutation_importance = False
generate_sklearn_learning_curves = False
generate_precision_recall_curves = True
n_feature_importance_bars = 0
learning_curve_k_splits = 0
learningcurve_shuffle_data_splits = None
model_to_run = RF
train_test_split_type = BOUTS
rf_meta_data = True
generate_classification_report = True
learning_curve_data_splits = 0
generate_shap_scores = True
shap_target_present_no = 100
shap_target_absent_no = 100
shap_save_iteration = 100
shap_multiprocess = False
partial_dependency = False
class_weights = custom
class_custom_weights = {0: '1', 1: '2'}

  • OS: Windows 10
  • Python Version 3.10
  • Using a Python virtual environment
  • Simba version 2.1.7

Thank you for your help!
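
For reference, the SHAP-related settings posted above can be read back programmatically to gauge the size of the job. This is a minimal sketch, assuming the settings are available as plain `key = value` lines. The `[settings]` section header and the `summarise_shap_settings` helper are hypothetical and not part of SimBA, and treating the present and absent counts as a combined number of frames to explain is an interpretation of the parameter names.

```python
import configparser

# SHAP-related lines copied from the settings posted above.
POSTED_SETTINGS = """
rf_n_estimators = 2000
generate_shap_scores = True
shap_target_present_no = 100
shap_target_absent_no = 100
shap_save_iteration = 100
shap_multiprocess = False
"""


def summarise_shap_settings(text: str) -> dict:
    """Parse key = value lines and summarise the SHAP workload."""
    parser = configparser.ConfigParser()
    # configparser requires a section header; "settings" is a made-up name.
    parser.read_string("[settings]\n" + text)
    cfg = parser["settings"]
    present = cfg.getint("shap_target_present_no")
    absent = cfg.getint("shap_target_absent_no")
    return {
        "enabled": cfg.getboolean("generate_shap_scores"),
        "multiprocess": cfg.getboolean("shap_multiprocess"),
        # Assumption: the two counts add up to the frames that get explained.
        "frames_to_explain": present + absent,
        "trees": cfg.getint("rf_n_estimators"),
    }


if __name__ == "__main__":
    print(summarise_shap_settings(POSTED_SETTINGS))
```

Under that reading, the posted settings ask for 200 frames to be explained against a 2000-tree forest, so a slow run would not be surprising on its own.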

@sronilsson
Collaborator

sronilsson commented Oct 17, 2024

Thanks for reporting @GKNM995 - I can see what is happening. It seems the shap package version installed under Python 3.10 expects an older numpy version, or vice versa.

What version of shap (pip show shap) and numpy (pip show numpy) are you running?
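
The same information can also be collected from inside the virtual environment's Python interpreter, which avoids checking a different environment by mistake. A minimal sketch using only the standard library; the `installed_version` helper is a hypothetical name introduced here:

```python
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of a package, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


if __name__ == "__main__":
    # Run this with the same interpreter that launches SimBA.
    for name in ("shap", "numpy"):
        print(f"{name}: {installed_version(name)}")
```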

@sronilsson
Collaborator

PS. It should be shap version 0.42.0 if you are running a Python version above 3.6 (I think).

If you do pip install shap==0.42.0 and then try again, how does it look on your end?

@GKNM995
Author

GKNM995 commented Oct 17, 2024

Shap version is 0.35.0, and numpy version is 1.26.4. Gonna update the shap version and let you know how it goes.
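
The reported shap version can be compared against the recommended 0.42.0 with a simple tuple comparison. A minimal sketch, assuming plain numeric versions such as `0.35.0`; it does not handle pre-release suffixes, and `parse_version` and `is_older` are hypothetical helper names:

```python
def parse_version(version: str) -> tuple:
    """Turn a plain dotted version such as '0.35.0' into (0, 35, 0)."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed: str, recommended: str) -> bool:
    """True when the installed version sorts before the recommended one."""
    return parse_version(installed) < parse_version(recommended)


if __name__ == "__main__":
    # Versions taken from this thread: installed 0.35.0, recommended 0.42.0.
    print(is_older("0.35.0", "0.42.0"))
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank `0.9.0` above `0.42.0`.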

@GKNM995
Author

GKNM995 commented Oct 17, 2024

I get this error when I install shap 0.42.0.
[Image: error message after installing shap 0.42.0]

@sronilsson
Collaborator

Yes, that should just be a warning, which I will remove. Let me know if it runs anyway?

@GKNM995
Author

GKNM995 commented Oct 17, 2024

It seems to be running now. No new error at the moment

@sronilsson
Collaborator

thanks for letting me know!

@GKNM995
Author

GKNM995 commented Oct 18, 2024

Hi again! SHAP value calculation still won't run to the end. I let it be for the last few hours but it won't advance past the spot shown in the first message above. This time, though, there's no error message printed in the command line.

@sronilsson
Collaborator

Thanks @GKNM995 - just to confirm, it won't advance past "Calculating SHAP scores (SINGLE CORE).."?

@GKNM995
Author

GKNM995 commented Oct 18, 2024

Yeah. It gets to "Calculating SHAP scores (SINGLE CORE).." and stops. Initially thought it was just a slow process but now I think it's stuck.

@sronilsson
Collaborator

Yes, you should be able to see the progress. Let me take a look tomorrow and get back to you. In the meantime, if you enable multiprocessing for the SHAP values in the model settings menu, does it also get stuck?
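
One general way to tell a slow SHAP run from a stuck one is to process the frames in chunks and report after each chunk, in the spirit of the `shap_save_iteration = 100` setting above. This is a minimal sketch of that pattern, not SimBA's implementation: `explain_in_chunks` is a hypothetical helper, and the `explain` callable stands in for whatever SHAP explainer is actually used.

```python
from typing import Callable, Iterator, List, Sequence


def explain_in_chunks(
    rows: Sequence,
    explain: Callable[[Sequence], List],
    chunk_size: int = 100,
) -> Iterator[tuple]:
    """Yield (rows_done, total, results_so_far) after every chunk."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    results: List = []
    total = len(rows)
    for start in range(0, total, chunk_size):
        chunk = rows[start:start + chunk_size]
        results.extend(explain(chunk))
        # Hand back a copy so callers can save intermediate results safely.
        yield min(start + chunk_size, total), total, list(results)


if __name__ == "__main__":
    # Placeholder "explainer" that just sums each row; a real run would call
    # a SHAP explainer here instead.
    fake_rows = [[i, i + 1] for i in range(250)]
    for done, total, _ in explain_in_chunks(fake_rows, lambda c: [sum(r) for r in c]):
        print(f"SHAP rows complete: {done}/{total}", flush=True)
```

If no progress line appears after the first chunk for a long time, the process is more likely stuck than slow.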

@GKNM995
Author

GKNM995 commented Oct 18, 2024

I'll try that now and let you know. Thank you!

@GKNM995
Author

GKNM995 commented Oct 18, 2024

Unfortunately, it still gets stuck when I enable multiprocessing for the SHAP values in the model settings menu.

@sronilsson
Collaborator

Thanks @GKNM995 - I have been trying it this morning, with your settings, and I have not been able to hit the error :/

The only difference I can see is that I am running SimBA 2.2.3. If you update with pip install simba-uw-tf-dev --upgrade, is it still hanging? I don't think that is the cause, but I can't think of anything else at the moment.

@sronilsson
Collaborator

PS. Alternatively, I can give you an example Jupyter notebook to run it in? I don't know what's going on, but it may be something with the GUI that is causing it to freeze on your system.

@GKNM995
Author

GKNM995 commented Oct 25, 2024

Hi! I updated Simba and it ran fine. Not sure what was going on before, but I'll let you know if it happens again. Thank you so much for your help!

@sronilsson
Collaborator

Cheers @GKNM995 - just FYI, the notebooks below compute SHAP values in SimBA. They may execute more reliably outside the GUI, and run faster.

Shapley calculations: Example I (single core)

Shapley calculations: Example II (multiple cores)

Shapley calculations: Example III (GPU)
