
Poor classification results despite copying annotations from successfully trained models #401

Open
xktz89 opened this issue Nov 18, 2024 · 2 comments

Comments

xktz89 commented Nov 18, 2024

Describe the bug
I am training several models to detect grooming behavior. I trained three different DeepLabCut models on the same set of videos at low, medium, or high resolution, then analyzed the same set of experimental videos (at low, medium, or high resolution) to train three separate SimBA models. I annotated the behaviors for the low-resolution videos and successfully trained a model with good precision, recall, and F1 score. I then created a new project for the medium-resolution videos and went through the same steps as before for training (setting new video parameters, skipping outlier correction, extracting features), and then manually copied my previous annotations from the targets_inserted .csv files from the low-resolution videos (see Steps below). This classifier also had good precision, recall, and F1. However, I have tried the same procedure for the same high-resolution videos, and my classification report looks very strange.

Not_grooming
Precision: 0.999233
Recall: 1
f1-score: 0.999616
Support: 242179

grooming
Precision: 0
Recall: 0
f1-score: 0
Support: 186

I already checked labeled videos from DeepLabCut and confirmed that the high-resolution DLC model is correctly tracking body parts. I also made sure I am importing the correct high-resolution videos and tracking files. The previous classifiers had the same balance of grooming vs. not-grooming examples. I have used identical settings and training parameters. I have attempted to train a network again, and still get 0 precision, recall, and F1. I am unsure what parameters to modify to improve performance, or frankly why this classifier is performing so poorly despite using the exact same annotations and only increasing the video resolution. What might account for this?

To Reproduce
Steps to reproduce the behavior:

  1. Trained classifier to detect grooming.
  2. Created new project using same videos but at higher resolution (imported higher resolution video files AND tracking files from DLC model trained on higher resolution videos).
  3. Set video parameters with the new distance calibration (since the higher-resolution videos contain more pixels, the pixel distance differs from that of the same lower-resolution videos).
  4. Skipped outlier correction and extracted features (same as with low-resolution classifier).
  5. Created new video annotations for the high-resolution videos using the "Label Behavior" tab, then manually pasted the "groom" column of annotations from the low-resolution "targets_inserted" .csv files into the new project's "targets_inserted" .csv files.
  6. Trained classifier at default settings, and created classification report, feature importance bar graph, and calculated SHAP scores for 200 targets present and 200 targets absent.
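One way to sanity-check step 5 would be to compare the pasted annotation column file by file between the two projects. A minimal sketch (the project directory names are placeholders, and it assumes the classifier column is named `groom`, as in the steps above):

```python
from pathlib import Path

import pandas as pd


def label_counts(csv_path, clf_col="groom"):
    """Return (n_frames, n_labeled) for one targets_inserted CSV."""
    df = pd.read_csv(csv_path)
    return len(df), int(df[clf_col].sum())


# Placeholder project roots; adjust to the real SimBA project directories.
low_dir = Path("low_res_project/project_folder/csv/targets_inserted")
high_dir = Path("high_res_project/project_folder/csv/targets_inserted")

for low_csv in sorted(low_dir.glob("*.csv")):
    high_csv = high_dir / low_csv.name
    low_n, low_labeled = label_counts(low_csv)
    high_n, high_labeled = label_counts(high_csv)
    print(f"{low_csv.name}: low={low_labeled}/{low_n}  high={high_labeled}/{high_n}")
    if low_n != high_n:
        # Frame counts differ between resolutions: pasted labels are misaligned.
        print("  WARNING: frame counts differ; pasted annotations may be shifted")
```

If the labeled-frame counts differ between matching files, or the frame totals differ (so the pasted column no longer lines up with the video), that would explain a collapse in classifier performance.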

Expected behavior
I would expect good, or at least comparable, classification performance relative to the lower-resolution models.

Desktop (please complete the following information):

  • OS: Windows 10
  • SimBA v.2.2.4
  • Are you using anaconda? Yes

Additional context
Let me know if this all makes sense. Thank you in advance for any help or insights as to why this may be happening!

@sronilsson
Collaborator

Hi @xktz89!

Sounds like you have thoroughly investigated this and I don't have an immediate answer. One thing that looks suspicious:

Not_grooming
Precision: 0.999233
Recall: 1
f1-score: 0.999616
Support: 242179

grooming
Precision: 0
Recall: 0
f1-score: 0
Support: 186

This table suggests that the test set contains 186 grooming frames and more than 242k non-grooming frames, meaning that roughly 0.08% of the frames in the project_folder/csv/targets_inserted files contain grooming, which is very little data. Could anything have gone wrong when you copied and pasted the data?

If it's not a huge amount of data, could you share the project with me so I can take a look? (It can help if you omit most of the video files, which take up a lot of space.)

@sronilsson
Collaborator

@xktz89 PS:

Under the [Label behavior] tab - right at the bottom, as in the screengrab below - there is a Count annotations in project button that lets you count the number of annotated behaviors in the SimBA project. You could use that in the three different projects to get an indication of whether the labelling aligns and nothing is off.

[screenshot: Count annotations in project button at the bottom of the Label behavior tab]
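The same tally can also be done programmatically across all three projects. A rough sketch, assuming the standard SimBA project_folder/csv/targets_inserted layout and a classifier column named `groom` (the project directory names below are placeholders):

```python
from pathlib import Path

import pandas as pd


def project_annotation_totals(project_dir, clf_col="groom"):
    """Sum labeled vs. total frames over every targets_inserted CSV in a project."""
    total, labeled = 0, 0
    targets_dir = Path(project_dir, "project_folder", "csv", "targets_inserted")
    for csv_path in sorted(targets_dir.glob("*.csv")):
        col = pd.read_csv(csv_path, usecols=[clf_col])[clf_col]
        total += len(col)
        labeled += int(col.sum())
    return total, labeled


# Placeholder names for the three projects; adjust to the real paths.
for name in ("low_res_project", "medium_res_project", "high_res_project"):
    total, labeled = project_annotation_totals(name)
    pct = 100 * labeled / total if total else 0.0
    print(f"{name}: {labeled}/{total} frames labeled ({pct:.3f}%)")
```

If the three totals disagree, the copy-paste step is the most likely place where the labels were lost or shifted.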
