Shot boundary detection (SBD) is the process of automatically detecting the boundaries between shots in video.
- TransNet: A deep network for fast detection of common shot transitions(arXiv 2019) [paper]
- TransNet V2: An effective deep network architecture for fast shot transition detection(arXiv 2020) [paper code]
Video moment localization, also known as video moment retrieval(VMR) or natural language video localization (NLVL) or temporal sentence grounding in videos (TSGV), aiming to retrieve a temporal moment that semantically corresponds to a language query from an untrimmed video.
- A Survey on Video Moment Localization(*ACM Computing Surveys)[paper]
- Temporal Sentence Grounding in Videos: A Survey and Future Directions(arXiv 2022)[paper]
Cognitive Science has known since last century that humans consistently segment videos into meaningful temporal chunks.This segmentation happens naturally, without pre-defined event categories and without being explicitly asked to do so.This task aims at localizing the moments where humans naturally perceive event boundaries.
- Generic Event Boundary Detection: A Benchmark for Event Segmentation (arXiv 2021) [paper]