Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Media Pipe Pose Estimation + Visualization #203

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

brukew
Copy link

@brukew brukew commented Nov 19, 2024

Description

Implemented Media Pipe pose estimation. Given an image path, will return a PoseSkeleton object with landmarks of each individual in the image.

Related Issue(s)

https://github.com/orgs/sensein/projects/45/views/3?pane=issue&itemId=82951656&issue=sensein%7Csenselab%7C173

Motivation and Context

This is the initial structure for pose estimation which is a valuable signal for behavior analysis. I will expand to more models and functionality, and with that, this will be more generalized.

How Has This Been Tested?

I tested with different kinds of images and attempts to access invalid properties of the PoseSkeleton object. Unit tests for the new functions + also manually tested for proper visualization.

Screenshots (if appropriate):

Types of changes

Created PoseSkeleton object that contains pose information for individuals in an image. Currently supports MediaPipe pose estimation + visualization functionality.

Checklist:

  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • My code follows the code style of this project.

@brukew brukew changed the title 173 task pose estimation Media Pipe Pose Estimation + Visualization Nov 19, 2024
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 First Pull Request 🎉

Welcome to Senselab, and thank you for submitting your first pull request! We’re thrilled to have your contribution. Our team will review it as soon as possible. Stay engaged, and let’s make behavioral data analysis even more powerful together!

@brukew brukew requested a review from fabiocat93 November 19, 2024 21:37
@fabiocat93
Copy link
Collaborator

Thank you, @brukew, for the updates. I have a few suggestions and points to address based on our previous discussions:

  1. Reorganize Code Structure
    Please re-organize your code by separating data structures from functionalities (what we refer to as tasks in Senselab).

    • The skeleton should be treated as a data structure. It should define the skeleton and optionally include utilities for visualization. Alternatively, visualization can be handled as a standalone task.
    • Pose estimation should be a task (generating the skeleton as an output). All human pose estimation models (e.g., MediaPipe, AlphaPose) should conform to a consistent data structure.
    • To encourage generalizability, I suggest integrating a second pose estimation tool, such as YOLO due to its simplicity.
  2. Model Inclusion
    The current approach of including a model within the source code (e.g., src/senselab/video/tasks/pose_estimation/models/pose_landmarker.task) makes the package unnecessarily heavy. Instead, please ensure models are downloaded as needed. You can take inspiration from this example.

  3. Documentation
    Please add a dedicated documentation page:

    • Explain human pose estimation as a task, its purpose, and supported models.
    • For instance, you can reference this documentation for MediaPipe.
    • Feel free to draw inspiration from the existing audio task documentation (though some sections are incomplete).
  4. Tutorial
    Create a Jupyter Notebook tutorial to demonstrate:

    • How to use the interface.
    • What functionalities are available.
      Add this under a video folder in the tutorial/ directory.
  5. Failing Tests
    I noticed two tests are failing:

    • test_valid_image_single_person
      • AssertionError: "Input and output image shapes should match."
    • test_visualization_single_person
      • ValueError: "Input image must contain three-channel BGR data."
        Please double-check these tests to ensure they pass.

Copy link
Collaborator

@fabiocat93 fabiocat93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brukew, I have commented some required changes.

@fabiocat93 fabiocat93 added enhancement New feature or request release minor Minor release to-test labels Nov 20, 2024
@fabiocat93 fabiocat93 marked this pull request as draft November 20, 2024 02:38
@brukew
Copy link
Author

brukew commented Nov 20, 2024

Nice, thank you for the feedback @fabiocat93. I will address your comments and ask questions as I go.

@brukew brukew force-pushed the 173-task-pose-estimation branch from e1a7696 to 269f0a6 Compare November 24, 2024 21:21
@fabiocat93
Copy link
Collaborator

hi @brukew , did you have any time to work further on this?

@brukew
Copy link
Author

brukew commented Dec 29, 2024

hey @fabiocat93, slowed down on development towards the end of the semester. I will continue this month.

@brukew
Copy link
Author

brukew commented Jan 8, 2025

@fabiocat93 lmk what you think about having separate estimator classes for different models. also, I kept the landmark names as they are originally listed but wondering if I should homogenize them at all.

@fabiocat93
Copy link
Collaborator

@fabiocat93 lmk what you think about having separate estimator classes for different models. also, I kept the landmark names as they are originally listed but wondering if I should homogenize them at all.

Good question. You can have multiple estimators, each being a child of the same abstract class and producing a skeleton following the same universal structure

@brukew
Copy link
Author

brukew commented Jan 12, 2025

addressed model download by downloading models to video/tasks/pose_estimation/models upon use. Let me know if there is a better alternative to this that is more in line with senselab use.

@brukew
Copy link
Author

brukew commented Jan 12, 2025

also wondering how I can ensure tutorial works (and cells are run in tutorial) if changes haven't been pushed to main branch @fabiocat93. should I wait until after?

@fabiocat93
Copy link
Collaborator

addressed model download by downloading models to video/tasks/pose_estimation/models upon use. Let me know if there is a better alternative to this that is more in line with senselab use.

Good for now. But you raised a good point. We will need to implement a customizable cache folder for the package (of course with a default value)

@fabiocat93
Copy link
Collaborator

also wondering how I can ensure tutorial works (and cells are run in tutorial) if changes haven't been pushed to main branch @fabiocat93. should I wait until after?

When testing the tutorial locally, you can install the package directly from a specific branch instead of relying on pip install senselab. This allows you to ensure the tutorial runs correctly with the latest changes before they are merged into the main branch

@brukew brukew marked this pull request as ready for review January 13, 2025 21:41
@brukew brukew requested a review from fabiocat93 January 13, 2025 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request minor Minor release release to-test
Projects
None yet
2 participants