HawkEars predict() has scores from audio clips that did not exist #1072

sammlapp · 2024-10-29T15:19:23Z

predict() and embed() outputs contain copies of scores from some other clip in rows where the audio file did not exist. When batching samples, we copy other clips into the place of missing clips to avoid errors with missing values, but those rows need to be replaced with NaN in the final score df!

As a temporary workaround, please replace scores with NaN for any rows in the output where the start_time is NaN:

import numpy as np
import opensoundscape

hawkears = opensoundscape.ml.bioacoustics_model_zoo.Hawkears()
preds = hawkears.predict(...)
# rows for audio files that did not exist have NaN as start_time in the index
nan_mask = preds.index.get_level_values('start_time').isna()
preds[nan_mask] = np.nan

# same for embeddings:
emb = hawkears.embed(...)
nan_mask = emb.index.get_level_values('start_time').isna()
emb[nan_mask] = np.nan
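
To try the masking step without HawkEars or any audio files, here is a minimal sketch on a toy DataFrame. The index level names 'file' and 'end_time', the column names, and the helper name blank_missing_clip_rows are assumptions made for illustration; only 'start_time' comes from the workaround above.

```python
import numpy as np
import pandas as pd


def blank_missing_clip_rows(scores: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where rows whose 'start_time' index level is NaN are all NaN."""
    nan_mask = scores.index.get_level_values("start_time").isna()
    out = scores.copy()
    out[nan_mask] = np.nan  # boolean array selects rows, as in the workaround
    return out


# Toy stand-in for a predict() output (hypothetical values):
# the last row mimics a missing file that received a copy of another clip's scores.
index = pd.MultiIndex.from_tuples(
    [("a.wav", 0.0, 3.0), ("a.wav", 3.0, 6.0), ("missing.wav", np.nan, np.nan)],
    names=["file", "start_time", "end_time"],
)
toy_preds = pd.DataFrame(
    {"species_a": [0.9, 0.1, 0.1], "species_b": [0.2, 0.7, 0.7]},
    index=index,
)

cleaned = blank_missing_clip_rows(toy_preds)
print(cleaned)
```

The helper works on a copy so the original output stays available for comparison while investigating.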

This is not occurring with the CNN() class, so it seems specific to something in the model zoo.

sammlapp · 2024-10-31T18:30:00Z

Hm, actually I can't reproduce this right now. I need to investigate why it happened.
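
When trying to reproduce, a check along these lines might help flag affected outputs: it returns rows that have no valid start_time but still carry scores. This is only a sketch on toy data; the helper name rows_with_scores_but_no_clip and the index level names other than 'start_time' are assumptions.

```python
import numpy as np
import pandas as pd


def rows_with_scores_but_no_clip(scores: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose 'start_time' index level is NaN but which contain non-NaN values."""
    nan_mask = scores.index.get_level_values("start_time").isna()
    has_scores = scores.notna().any(axis=1).to_numpy()
    return scores[nan_mask & has_scores]


# Toy example (hypothetical values): one missing file that still has scores.
idx = pd.MultiIndex.from_tuples(
    [("a.wav", 0.0, 3.0), ("missing.wav", np.nan, np.nan)],
    names=["file", "start_time", "end_time"],
)
toy = pd.DataFrame({"species_a": [0.9, 0.9]}, index=idx)

suspect = rows_with_scores_but_no_clip(toy)
print(len(suspect), "suspect row(s)")
```

An empty result means no row for a missing clip carries scores in that output.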

sammlapp added the bug (Something isn't working) and module:ml (Machine Learning with PyTorch) labels on Oct 29, 2024
