RetinaNet implementation

Hello all!

In this repository I played around with RetinaNet and implemented it for object detection of traffic signs. The data can be found here: https://www.kaggle.com/datasets/valentynsichkar/traffic-signs-dataset-in-yolo-format

Architecture

I based the repo on this paper: https://arxiv.org/pdf/1708.02002.pdf The authors use a FPN combined with a focal loss to tackle the class imbalance. They use a ResNet50 backbone and put a regression and classifciation head on top of the backbone.

For my implementation I used a ResNet34 backbone due to GPU memory issues. Additionally, I used the anchor offset computation proposed in the Yolov3 paper by:

Center Coordinates: $$[ b_x = \sigma(t_x) + c_x ]$$ $$[ b_y = \sigma(t_y) + c_y ]$$
- Width and Height: $$[ b_w = p_w \cdot e^{t_w} ]$$ $$[ b_h = p_h \cdot e^{t_h} ]$$

This stabled my gradients. I trained for a couple of epochs and was able to detect the correct objects. Unfortunately, the bounding boxes are not that accurate yet. here is an example:

There is still room for improvement. NMS has to be applied and the bounding box predictions can be more accurate. My class accuracy is around 75% and no object accuracy is about 99%, which is already decent considering I trained only a couple hundred epochs with this small dataset. MAP is around 50% with a threshold of 0.3.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
src		src
.gitignore		.gitignore
README.md		README.md
img.png		img.png
img_1.png		img_1.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RetinaNet implementation

Hello all!

Architecture

About

Releases

Packages

Languages

Coluding/RetinaNet

Folders and files

Latest commit

History

Repository files navigation

RetinaNet implementation

Hello all!

Architecture

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages