Add training data extraction to segments tab #463

Open
kahst opened this issue Oct 7, 2024 · 2 comments

@kahst
Owner

kahst commented Oct 7, 2024

We should add extraction of training samples from annotation files (e.g., selection tables) to the segments tab, so that people can build their training data sets without having to extract snippets manually in Raven.

This is how it could work:

  • create two options in the segments tab: "Create segments for review" and "Create segments for training"
  • for training segments:
    • parse the selection table and convert timestamps to fixed-length segments (e.g., 3 s) based on each segment's overlap with the bounding box
      • 4.2-7.2 becomes 3.0-6.0 and 6.0-9.0
      • 2.9-6.2 becomes only 3.0-6.0, because its overlap with 0.0-3.0 (0.1 s) and with 6.0-9.0 (0.2 s) is below the minimum overlap (set to 0.5 s in this example)
    • merge labels for each segment
    • save in folder with multi-label format that is compatible with our training data workflow
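
The conversion and label-merging steps above could be sketched roughly as follows. This is only a minimal illustration, not existing code from the project: the function names `annotation_to_segments` and `merge_labels`, the parameter names, and the example labels are all hypothetical, and it assumes annotations are already parsed into `(start, end, label)` tuples in seconds. Saving to the multi-label folder format is left out because that format is not specified here.

```python
import math
from collections import defaultdict


def annotation_to_segments(start, end, seg_length=3.0, min_overlap=0.5):
    """Return the fixed-length windows that the annotation overlaps
    by at least `min_overlap` seconds (hypothetical helper)."""
    segments = []
    first = math.floor(start / seg_length)  # index of first window touched
    last = math.ceil(end / seg_length)      # one past the last window touched
    for i in range(first, last):
        seg_start = i * seg_length
        seg_end = seg_start + seg_length
        overlap = min(end, seg_end) - max(start, seg_start)
        if overlap >= min_overlap:
            segments.append((seg_start, seg_end))
    return segments


def merge_labels(annotations, seg_length=3.0, min_overlap=0.5):
    """Collect all labels that fall into the same fixed-length window.

    `annotations` is an iterable of (start, end, label) tuples (assumed format).
    """
    merged = defaultdict(set)
    for start, end, label in annotations:
        for seg in annotation_to_segments(start, end, seg_length, min_overlap):
            merged[seg].add(label)
    return {seg: sorted(labels) for seg, labels in merged.items()}


# Placeholder labels, using the two timestamp examples from the list above
result = merge_labels([(4.2, 7.2, "species_a"), (2.9, 6.2, "species_b")])
```

With these two annotations, the 3.0-6.0 window ends up with both labels and the 6.0-9.0 window with only the first one.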
@max-mauermann
Collaborator

How do we want to handle annotations that lie exactly on the edge between two segments but don't have enough overlap with either one?
For example: 2.9-3.3

@kahst
Owner Author

kahst commented Oct 26, 2024

I'd say we assign these to the segment with the most overlap (in your case 3-6). Since the annotation is so short, we can still assume that a significant percentage of the marked vocalization is covered by the segment.
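
A rough sketch of that fallback rule, assuming 3 s segments and a 0.5 s minimum overlap as in the examples above. The function name `assign_segments` and its parameters are hypothetical, not existing code; ties between equally overlapping segments go to the earlier one here, which is an arbitrary choice.

```python
import math


def assign_segments(start, end, seg_length=3.0, min_overlap=0.5):
    """Map an annotation to fixed-length windows (hypothetical helper).

    Windows with at least `min_overlap` seconds of overlap are kept.
    If no window qualifies, fall back to the single window with the
    most overlap so that short annotations are not dropped.
    """
    candidates = []
    first = math.floor(start / seg_length)
    last = math.ceil(end / seg_length)
    for i in range(first, last):
        seg_start = i * seg_length
        seg_end = seg_start + seg_length
        overlap = min(end, seg_end) - max(start, seg_start)
        if overlap > 0:
            candidates.append((overlap, (seg_start, seg_end)))

    if not candidates:
        return []

    kept = [seg for overlap, seg in candidates if overlap >= min_overlap]
    if kept:
        return kept

    # Fallback: nothing reached the minimum, so take the best match.
    # max() returns the first maximum, so ties go to the earlier window.
    best_overlap, best_seg = max(candidates, key=lambda c: c[0])
    return [best_seg]


edge_case = assign_segments(2.9, 3.3)
```

For 2.9-3.3 the overlaps are about 0.1 s with 0.0-3.0 and about 0.3 s with 3.0-6.0, so the annotation lands in 3.0-6.0 only.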
