This project focuses on developing a model to map football players' positions on a 2D map using footage from Duke Kunshan University's football games. The initial phase involves using stable video footage, with plans to later incorporate dynamic livestream videos with optical flow for more advanced tracking and detection.
Using the Ultralytics YOLOv8 model, we detect bounding boxes around moving players. The model, pretrained on weights from large-scale datasets, is highly accurate and can be found here.
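For reference, a minimal detection sketch using the Ultralytics API might look like the following; the weights file name, input frame path, and confidence threshold are illustrative assumptions rather than the project's exact settings:

```python
# Minimal YOLOv8 detection sketch (weights file and threshold are assumptions).
from ultralytics import YOLO

model = YOLO("yolov8x.pt")                       # pretrained Ultralytics weights
results = model.predict("frame.jpg", conf=0.25)  # run inference on a single frame

# Each result holds bounding boxes, class ids, and confidence scores
for box in results[0].boxes:
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    cls_id = int(box.cls[0])
    conf = float(box.conf[0])
    print(cls_id, conf, (x1, y1, x2, y2))
```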
Although player detection is highly effective, challenges persist with ball detection due to suboptimal lighting conditions at DKU compared to better-lit stadiums. This lighting issue complicates detection both for the model and human observers. Efforts are underway to enhance data labeling for improved model training.
The model currently recognizes three categories: players, referees, and the football. Additional modifications are planned to separately identify goalkeepers to facilitate team differentiation.
Player tracking employs ByteTrack, a robust multi-object tracking algorithm designed to maintain player identities across video frames under complex conditions (see the sketch after this list). The key components of ByteTrack include:

- IoU Tracking: IoU (Intersection over Union) tracking first matches player detections frame-to-frame based on bounding box overlap, which works well for objects that move little between frames.
- Byte Association: For fast-moving or temporarily occluded players, ByteTrack builds a cost matrix from IoU scores between unmatched detections and existing tracks, then applies an assignment algorithm such as the Hungarian method to find the best matches.
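As a rough sketch of how this tracking step can be wired up, the Ultralytics interface exposes ByteTrack as a tracker configuration; the weights file and video path below are placeholders:

```python
# Frame-by-frame tracking with ByteTrack via the Ultralytics tracker interface.
# "yolov8x.pt" and "match.mp4" are placeholder names, not the project's files.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8x.pt")
cap = cv2.VideoCapture("match.mp4")

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # persist=True keeps tracker state between frames so IDs stay consistent
    results = model.track(frame, persist=True, tracker="bytetrack.yaml")
    boxes = results[0].boxes
    if boxes.id is not None:  # id is None when nothing is currently tracked
        for track_id, xyxy in zip(boxes.id.int().tolist(), boxes.xyxy.tolist()):
            print(f"track {track_id}: {xyxy}")

cap.release()
```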
The approach to team prediction involves applying a green mask to the field, then extracting and analyzing the average color of each player's uniform with the green background removed, focusing on the upper half of the jersey.
Colors are compared in the Lab color space, where Euclidean distances track perceived color differences more closely, which is crucial for distinguishing team colors reliably.
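A minimal sketch of this team-assignment idea, assuming OpenCV, an illustrative HSV range for the grass, and placeholder reference colors for the two teams:

```python
# Mask out the green pitch, average the remaining jersey pixels in the upper
# half of the player crop, and compare to each team's reference color in Lab
# space. The HSV green range and reference colors are assumptions.
import cv2
import numpy as np

GREEN_LOW = np.array([35, 40, 40])     # assumed HSV lower bound for grass
GREEN_HIGH = np.array([85, 255, 255])  # assumed HSV upper bound for grass

def jersey_color_lab(frame_bgr, box):
    """Average Lab color of the upper half of a player crop, grass removed."""
    x1, y1, x2, y2 = box
    crop = frame_bgr[y1:y1 + (y2 - y1) // 2, x1:x2]           # upper half only
    hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    not_green = cv2.inRange(hsv, GREEN_LOW, GREEN_HIGH) == 0  # keep non-grass pixels
    lab = cv2.cvtColor(crop, cv2.COLOR_BGR2LAB)
    return lab[not_green].mean(axis=0)

def assign_team(color_lab, team_a_lab, team_b_lab):
    """Pick the team whose reference Lab color is closer (Euclidean distance)."""
    dist_a = np.linalg.norm(color_lab - team_a_lab)
    dist_b = np.linalg.norm(color_lab - team_b_lab)
    return "A" if dist_a < dist_b else "B"
```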
We apply coordinate transformation using a homography matrix to accurately map out player positions onto a 2D field representation.
Because the camera is static, we use ImageJ to find the keypoint coordinates manually. We use the following labels for each keypoint:
| Code | Description |
|---|---|
| TLC | Top Left Corner |
| TRC | Top Right Corner |
| TR6MC | Top Right 6-yard box Middle Center |
| TL6MC | Top Left 6-yard box Middle Center |
| TR6ML | Top Right 6-yard box Middle Left |
| TL6ML | Top Left 6-yard box Middle Left |
| TR18MC | Top Right 18-yard box Middle Center |
| TL18MC | Top Left 18-yard box Middle Center |
| TR18ML | Top Right 18-yard box Middle Left |
| TL18ML | Top Left 18-yard box Middle Left |
| TRArc | Top Right Arc |
| TLArc | Top Left Arc |
| RML | Right Midline |
| RMC | Right Middle Center |
| LMC | Left Middle Center |
| LML | Left Midline |
| BLC | Bottom Left Corner |
| BRC | Bottom Right Corner |
| BR6MC | Bottom Right 6-yard box Middle Center |
| BL6MC | Bottom Left 6-yard box Middle Center |
| BR6ML | Bottom Right 6-yard box Middle Left |
| BL6ML | Bottom Left 6-yard box Middle Left |
| BR18MC | Bottom Right 18-yard box Middle Center |
| BL18MC | Bottom Left 18-yard box Middle Center |
| BR18ML | Bottom Right 18-yard box Middle Left |
| BL18ML | Bottom Left 18-yard box Middle Left |
| BRArc | Bottom Right Arc |
| BLArc | Bottom Left Arc |
For one of the frames, we get the following transformation:
The homography is generally accurate enough for us to get reliable results for analysis.
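A minimal sketch of the mapping step, assuming OpenCV and placeholder pixel/pitch coordinates in place of the full set of labeled keypoints above:

```python
# Map image coordinates to 2D pitch coordinates with a homography.
# The pixel values and pitch dimensions below are placeholders; the real
# correspondences come from the manually labeled keypoints listed above.
import cv2
import numpy as np

# Image-space keypoints (pixels), e.g. TLC, TRC, BRC, BLC measured in ImageJ
image_pts = np.array([[102, 88], [1815, 95], [1790, 1030], [130, 1022]],
                     dtype=np.float32)

# Corresponding pitch-space keypoints (metres) on the 2D field model
pitch_pts = np.array([[0, 0], [105, 0], [105, 68], [0, 68]], dtype=np.float32)

H, _ = cv2.findHomography(image_pts, pitch_pts, cv2.RANSAC)

def project_to_pitch(x, y):
    """Map a point (e.g. a player's foot position) from pixels to pitch coords."""
    pt = np.array([[[x, y]]], dtype=np.float32)
    return cv2.perspectiveTransform(pt, H)[0, 0]

print(project_to_pitch(960, 540))  # e.g. bottom-center of a player's bounding box
```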
Here is a summary that encompasses all the processes carried out in this project.
The next step is to use livestream videos from the Suzhou College Football League to extract useful data from the games for analysis.