ENH: `vak.predict` should record which input was used for each prediction #777

NickleDave · 2024-09-22T15:41:03Z

Currently `vak.predict.frame_classification` saves an output csv in which we infer the annotated audio file from the name of the "frames path", the array file that contains the input frames to the model.

This has a couple of drawbacks:

- If there are no annotations for one of the frames paths, because the model does not predict any (non-background) segments (as in add logic to handle predicted annotations with no segments #393), then that frames path doesn't appear in the predicted annotations at all.
- The inferred audio file name can be wrong if we change the frames path name (e.g. for CMACBench experiments, where we have multiple frames files derived from the same audio file and give those frames paths different names based on some condition, like spectrogram parameters or which group appears in the dataset). This means we end up inferring a notated_path that is an audio filename that doesn't exist.
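To make the second drawback concrete, here is a minimal sketch. The inference rule (dropping the final extension of the frames file name) and all file names are hypothetical, chosen only to illustrate the failure mode; this is not necessarily how vak infers the name.

```python
from pathlib import Path


def infer_audio_name(frames_path: str) -> str:
    """Hypothetical rule: the audio file name is the frames file name
    with its final extension removed."""
    return Path(frames_path).stem


# Frames file named directly after the audio file: inference works.
plain = infer_audio_name("frames/bird1_song.wav.npz")

# Frames file renamed to encode a condition (here, a made-up spectrogram
# parameter): the inferred name is not an audio file that exists.
renamed = infer_audio_name("frames/bird1_song.wav.nfft512.npz")

print(plain)
print(renamed)
```

Any rule that derives the audio name from the frames path has this problem as soon as the frames path naming scheme changes.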

So instead we should somehow track the exact frames path used for each prediction. I think what would be extra nice here would be to save a new csv file that just adds columns to the splits_csv_path dataset. That way we carry along any metadata we might have added as columns in the splits_csv_path (e.g., species, animal ID, some other arbitrary group like "unit" or "song") that we can use for downstream analysis. We're most of the way to this already, since we iterate over the "frames_path" column when we generate predictions.
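A rough sketch of what this could look like with pandas. Everything here is an assumption for illustration: the function name `add_prediction_columns`, the `predict_fn` callable, the new column names, and the toy data are all hypothetical and are not part of vak's API.

```python
import pandas as pd


def add_prediction_columns(splits_df, predict_fn):
    """Return a copy of ``splits_df`` with prediction columns added.

    ``predict_fn`` is a hypothetical callable that maps a frames path to a
    list of predicted segments (possibly empty). Every input keeps its own
    row, so inputs with no predicted segments are still recorded.
    """
    out = splits_df.copy()
    out["n_predicted_segments"] = [
        len(predict_fn(frames_path)) for frames_path in out["frames_path"]
    ]
    out["has_predicted_segments"] = out["n_predicted_segments"] > 0
    return out


# Toy stand-in for the dataset loaded from splits_csv_path.
splits_df = pd.DataFrame(
    {
        "frames_path": ["bird1_a.frames.npy", "bird1_b.frames.npy"],
        "species": ["bengalese finch", "bengalese finch"],
        "animal_id": ["bird1", "bird1"],
    }
)


def fake_predict(frames_path):
    """Stand-in for a model: predicts one segment for file 'a', none for 'b'."""
    return [("syl", 0.1, 0.2)] if "_a." in frames_path else []


result = add_prediction_columns(splits_df, fake_predict)
print(result)
```

The result could then be written out with `result.to_csv(...)`. Because the rows come from the input dataset rather than from the predicted annotations, the exact frames path and any metadata columns are preserved, including for inputs where nothing was predicted.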

Eventually we should do this for other predict functions as well, although I'm not sure what it will look like there if we're no longer using "frames_path".

NickleDave self-assigned this Oct 25, 2024

NickleDave added the "ENH: enhancement" label (enhancement; new feature or request) Oct 25, 2024

NickleDave added this to vak-1.1 Oct 25, 2024

NickleDave moved this to Todo in vak-1.1 Oct 25, 2024
