Model usage in the wild and other observations #2

Open
thoppe opened this issue Oct 18, 2019 · 3 comments
thoppe commented Oct 18, 2019

Thanks again for putting this together. After combing through about a dozen movies, I can say that the models work brilliantly on almost every scene. They struggle with action scenes, but that's an understandable failure, as the shot itself is sometimes dynamic. Amazing work!

I've been experimenting with watching only specific types of shots to see how this alters the expression or dynamic of the film. I've put together a fun video of only the non-speaking medium close-up shots here:

https://www.youtube.com/watch?v=K0_O34eoC68&feature=youtu.be

BEFRAME is an AI-powered project to ONLY keep the scenes where the character is framed and not speaking. They can just "be" in the frame. Explore the characters and director's choices from Legally Blonde, The Exorcist, Fight Club, Pitch Perfect, Die Hard, Pretty Woman, The Princess Bride, and Requiem For a Dream.
Each movie is first clipped by visual content, and then analyzed for shot type. Only the Medium Close-up (MCU) shots are preserved. Google's speech detection is used to filter out any shots with detected words. Finally, the shots are strung back together in sequence.
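
For anyone following along, the filtering step described above can be sketched roughly as follows. This is a minimal sketch, not the actual BEFRAME code: the `Shot` record, the `shot_type` labels, and the `has_speech` flag are hypothetical stand-ins for the outputs of the cut detector, the shot-type model, and the speech detector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    """One clipped shot. All fields are hypothetical stand-ins."""
    start: float      # start time in seconds
    end: float        # end time in seconds
    shot_type: str    # label from a shot-type classifier, e.g. "MCU", "CU", "WS"
    has_speech: bool  # True if a speech detector found any words in the shot


def keep_silent_shots(shots: List[Shot], keep_type: str = "MCU") -> List[Shot]:
    """Keep shots of one type with no detected speech, in original order."""
    kept = [s for s in shots if s.shot_type == keep_type and not s.has_speech]
    # Sort by start time so the shots can be strung back together in sequence.
    return sorted(kept, key=lambda s: s.start)


# Made-up example data, deliberately out of order.
shots = [
    Shot(12.0, 15.5, "MCU", False),
    Shot(0.0, 4.0, "WS", False),
    Shot(4.0, 9.0, "MCU", True),
    Shot(9.0, 12.0, "MCU", False),
]
selected = keep_silent_shots(shots)
```

Swapping `keep_type` or inverting the `has_speech` check gives the other filters one might try, such as close-ups with dialogue.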

I think there's a lot more one could do with this! If you've got any feedback let me know, otherwise feel free to close this issue as it's just a comment.

@rsomani95
Owner

@thoppe this is really interesting! The video is quite fun to watch, and a nice direction to explore. Perhaps as you add different filtering criteria -- CUs with dialogue, WS only, etc. -- you'll start to see some more interesting patterns.

I saw you used PySceneDetect to split the movie up. Interestingly, I've also looked into cut detection techniques, and PySceneDetect definitely seemed like the best option to start with. It's a great tool, and gets the job done fairly well, but ICYMI, you should be aware that it isn't perfect. I opened #96 discussing this.
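
To illustrate one way a threshold-based cut detector can miss a transition, here is a toy sketch. It is not PySceneDetect's code and makes no claim about what #96 covers: `detect_cuts` is a hypothetical helper, each frame is reduced to a single made-up score (think mean brightness), and the threshold value is arbitrary.

```python
from typing import List, Sequence


def detect_cuts(frame_scores: Sequence[float], threshold: float = 30.0) -> List[int]:
    """Return each index i where the jump from frame i-1 to frame i exceeds threshold."""
    return [
        i
        for i in range(1, len(frame_scores))
        if abs(frame_scores[i] - frame_scores[i - 1]) > threshold
    ]


# Made-up per-frame scores: an abrupt cut versus a gradual fade to the same level.
hard_cut = [10, 11, 10, 90, 91, 90]
fade = [10, 30, 50, 70, 90, 90]

cuts_in_hard_cut = detect_cuts(hard_cut)
cuts_in_fade = detect_cuts(fade)
```

The abrupt change is flagged at frame 3, while the fade never exceeds the per-frame threshold and goes undetected, even though both clips end up at the same score.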

There are deep learning approaches that work better, but they aren't in a user-friendly format yet. I intend to fill this gap soon, adapting the solution for cinema. You might find these interesting:

  1. Fast Video Shot Transition Localization with Deep Structured Models. Paper. Code.
  2. Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks. Paper.

Also, this.


> I think there's a lot more one could do with this!

I agree! In addition to exploring a similar approach with the different filtering criteria mentioned above, the next step could be combining other tools with this one. For example, looking at all the shots of a particular character and how they vary with the character's role in the film. Also, looking at the usage of colour schemes and how these vary, etc. Does that make sense?

I did some work with color (my first coding project ever), but haven't released it yet. I'll share it here once I do.

I'll leave this issue open for discussion of potential directions we could take. I'm curious, what do you plan to do next with this? :)

For some inspiration: http://cinemetrics.fredericbrodbeck.de

@1933874502
Hey, can you share this model? The author doesn't reply.

@pbaneto

pbaneto commented Nov 27, 2024

Hello, do you know anything about the models? Could you share them if you have access?

4 participants