HawkEars predict() has scores from audio clips that did not exist #1072

Open
sammlapp opened this issue Oct 29, 2024 · 1 comment
Labels: bug (Something isn't working), module:ml (Machine Learning with PyTorch)

sammlapp commented Oct 29, 2024

The predict() and embed() outputs contain copies of scores from some other clip in rows where the audio file did not exist. When batching samples, we copy other clips into the place of missing clips to avoid errors with N/A values, but their scores need to be replaced with NA in the final score df!
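The batching behavior described above can be illustrated with a small, self-contained sketch. This is not OpenSoundscape or HawkEars code: `predict_with_placeholders` and `toy_model` are hypothetical names invented for illustration, and clips are represented as plain NumPy arrays with `None` standing in for a missing file. It only shows the general idea of padding a batch with a valid clip and then blanking the corresponding output rows.

```python
import numpy as np


def predict_with_placeholders(clips, model_fn):
    """Run model_fn on a batch, padding missing clips (None) with a valid one.

    Hypothetical helper for illustration only; not part of OpenSoundscape.
    """
    valid = [i for i, c in enumerate(clips) if c is not None]
    if not valid:
        raise ValueError("no valid clips in batch")
    placeholder = clips[valid[0]]

    # Substitute a real clip for each missing one so the batch can be stacked
    batch = np.stack([c if c is not None else placeholder for c in clips])
    scores = model_fn(batch).astype(float)

    # The step the issue says is needed: blank out rows that came from placeholders
    missing = np.array([c is None for c in clips])
    scores[missing] = np.nan
    return scores


def toy_model(batch):
    """Stand-in for a classifier: returns two 'scores' per clip (mean and max)."""
    return np.column_stack([batch.mean(axis=1), batch.max(axis=1)])


clips = [np.array([1.0, 2.0, 3.0]), None, np.array([4.0, 5.0, 6.0])]
scores = predict_with_placeholders(clips, toy_model)
print(scores)
```

Without the final masking step, the second row would silently carry the scores of the first clip, which matches the symptom reported here.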

As a temporary workaround, please replace scores with NaN for any rows in the output where the start_time index level is NaN:

import numpy as np
import opensoundscape

hawkears = opensoundscape.ml.bioacoustics_model_zoo.Hawkears()
preds = hawkears.predict(...)
# rows for missing audio files have NaN in the start_time index level
nan_mask = preds.index.get_level_values('start_time').isna()
preds[nan_mask] = np.nan

# same for embeddings:
emb = hawkears.embed(...)
nan_mask = emb.index.get_level_values('start_time').isna()
emb[nan_mask] = np.nan
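To check the masking logic without loading a model, here is a minimal sketch on a synthetic score table. The index level names `file` and `end_time`, the file names, and the score columns are assumptions made for the example; only `start_time` is taken from the workaround above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a predict() output: the last row represents a
# missing audio file, so its start_time and end_time are NaN.
index = pd.MultiIndex.from_tuples(
    [("a.wav", 0.0, 3.0), ("a.wav", 3.0, 6.0), ("missing.wav", np.nan, np.nan)],
    names=["file", "start_time", "end_time"],
)
preds = pd.DataFrame(
    {"species_a": [0.1, 0.9, 0.9], "species_b": [0.5, 0.2, 0.2]},
    index=index,
)

# Same mask as in the workaround: True where start_time is NaN
nan_mask = preds.index.get_level_values("start_time").isna()

# Overwrite every score column in the masked rows
preds.loc[nan_mask, :] = np.nan

n_masked = int(nan_mask.sum())
print(f"masked {n_masked} row(s)")
print(preds)
```

Printing or logging the number of masked rows is a cheap way to see how many files were missing in a given run.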

This is not occurring with the CNN() class, so it seems specific to something in the model zoo.

sammlapp added the bug and module:ml labels Oct 29, 2024

sammlapp commented

Hm, actually I can't reproduce this right now. I need to investigate why it happened.
