face-mask-detector

Real-time face mask detection using deep learning, with alert system 💻🔔

System Overview

It detects human faces with or without a mask in real time, even in a crowd, shows a live count of each, and notifies the user (an officer) when the status is Danger.

System Modules:

  1. Deep Learning Model: I trained YOLOv2, YOLOv3 and YOLOv4 on my own dataset. YOLOv4 achieved 93.95% mAP on the test set and YOLOv3 achieved 90% mAP, even though the test set contains realistic blurred images and small, medium and large faces, which represent real-world images of average quality.

  2. Alert System: It monitors the mask and no-mask counts and has 3 statuses:

    1. Safe: When all people are wearing a mask.
    2. Warning: When at least 1 person is without a mask.
    3. Danger (+ SMS alert): When some ratio of people are without a mask.

Table of Contents

  1. Face-Mask Dataset
    1. Image Sources
    2. Image Annotation
    3. Dataset Description
  2. Deep Learning Models
    1. Training
    2. Model Performance
    3. Inference
      1. Detection on Image
      2. Detection on Video
      3. Detection on WebCam
  3. Alert System
  4. Suggestions to improve Performance
  5. References

Face-Mask Dataset

1. Image Sources

2. Image Annotation

3. Dataset Description

  • Dataset is split into 3 sets:

| Set | Number of images | Objects with mask | Objects without mask |
|---|---|---|---|
| Training Set | 700 | 3047 | 868 |
| Validation Set | 100 | 278 | 49 |
| Test Set | 120 | 503 | 156 |
| Total | 920 | 3828 | 1073 |

Deep Learning Models

1. Training

  • Install Darknet for Mac or Windows first.
  • I have trained YOLOv2, YOLOv3 and YOLOv4.
  • Use the following (Linux) command to train:
./darknet detector train obj.data yolov3.cfg darknet53.conv.74
  • On Windows, use darknet.exe instead of ./darknet.
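
If you launch training from a script, the platform difference above can be handled in one place. This is only a minimal sketch: the helper name `darknet_train_command` and its `system` parameter are hypothetical and not part of this repository, and it assumes the Darknet binary sits in the current directory.

```python
import platform


def darknet_train_command(data_file, cfg_file, init_weights, system=None):
    """Build the Darknet training command as an argument list.

    `system` defaults to the current platform; it is a parameter only so
    that the choice of binary can be checked without switching machines.
    """
    system = system or platform.system()
    # darknet.exe on Windows, ./darknet everywhere else (as described above).
    binary = "darknet.exe" if system == "Windows" else "./darknet"
    return [binary, "detector", "train", data_file, cfg_file, init_weights]


if __name__ == "__main__":
    cmd = darknet_train_command("obj.data", "yolov3.cfg", "darknet53.conv.74")
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start training.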

YOLOv2 Training details

  • Data File = obj.data
  • Cfg file = yolov2.cfg
  • Pretrained weights for initialization = yolov2.conv.23
  • Main Configs from yolov2.cfg:
    • learning_rate=0.001
    • batch=64
    • subdivisions=16
    • steps=1000,4700,5400
    • max_batches = 6000
    • i.e. approx. epochs = (6000*64)/700 ≈ 549
  • YOLOv2 Training results: 0.674141 avg loss
  • Weights of YOLOv2 trained on Face-mask Dataset: yolov2_face_mask.weights

YOLOv3 Training details

  • Data File = obj.data
  • Cfg file = yolov3.cfg
  • Pretrained weights for initialization = darknet53.conv.74
  • Main Configs from yolov3.cfg:
    • learning_rate=0.001
    • batch=64
    • subdivisions=32
    • steps=4800,5400
    • max_batches = 6000
    • i.e. approx. epochs = (6000*64)/700 ≈ 549
  • YOLOv3 Training results: 0.355751 avg loss
  • Weights of YOLOv3 trained on Face-mask Dataset: yolov3_face_mask.weights

YOLOv4 Training details

  • Data File = obj.data
  • Cfg file = yolov4-obj.cfg
  • Pretrained weights for initialization = yolov4.conv.137
  • Main Configs from yolov4-obj.cfg:
    • learning_rate=0.001
    • batch=64
    • subdivisions=64
    • steps=4800,5400
    • max_batches = 6000
    • i.e. approx. epochs = (6000*64)/700 ≈ 549
  • YOLOv4 Training results: 1.19 avg loss
  • Weights of YOLOv4 trained on Face-mask Dataset: yolov4_face_mask.weights
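
The approximate epoch count quoted in the training details above follows from max_batches, batch and the number of training images (700). A small sketch of that arithmetic; the function name `approx_epochs` is just for illustration:

```python
def approx_epochs(max_batches, batch, num_train_images):
    """Each Darknet iteration processes `batch` images, so the number of
    passes over the training set is (iterations * batch) / images."""
    return max_batches * batch / num_train_images


if __name__ == "__main__":
    # Same settings for all three models: max_batches=6000, batch=64, 700 images.
    print(round(approx_epochs(6000, 64, 700), 1))  # 548.6
```

Note that subdivisions does not enter the calculation: it only splits each batch into smaller chunks to fit GPU memory.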

2. Model Performance

  • Below is the comparison of YOLOv2, YOLOv3 and YOLOv4 on 3 sets.
  • Metric is mAP@0.5, i.e. mean Average Precision at an IoU threshold of 0.5.
  • Frames per Second (FPS) was measured on Google Colab GPU - Tesla P100-PCIE using Darknet command: link

| Model | Training Set | Validation Set | Test Set | FPS |
|---|---|---|---|---|
| YOLOv2 | 83.83% | 74.50% | 78.95% | 45 FPS |
| YOLOv3 | 99.75% | 87.16% | 90.18% | 23 FPS |
| YOLOv4 | 99.65% | 88.38% | 93.95% | 22 FPS |

  • Note: For more detailed evaluation of model, click on model name above.
  • Conclusion:
    • YOLOv2 has high bias and high variance, thus poor performance.
    • YOLOv3 has low bias and medium variance, thus good performance.
    • YOLOv4 has low bias and medium variance, thus good performance.
    • The models can still be made to generalize better, as discussed in section 4. Suggestions to improve Performance.

3. Inference

  • You can run model inference or detection on image/video/webcam.
  • Two ways:
    1. Using Darknet itself
    2. Using Inference script (detection + alert)
  • Note: If you use the YOLOv4 weights and cfg for inference, make sure you have opencv>=4.4.0; otherwise you will get an "Unsupported activation: mish" error.
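
To fail early instead of hitting the mish error mid-run, you can check the OpenCV version before loading YOLOv4. A minimal sketch; `version_tuple` and `supports_mish` are hypothetical helper names, not functions from the inference scripts:

```python
import re


def version_tuple(version):
    """Parse the leading 'major.minor[.patch]' of a version string into ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_mish(opencv_version):
    """True if this OpenCV version meets the opencv>=4.4.0 requirement."""
    return version_tuple(opencv_version) >= (4, 4, 0)


if __name__ == "__main__":
    print(supports_mish("4.3.0"))  # False
    print(supports_mish("4.4.0"))  # True
```

In a script, call it as `supports_mish(cv2.__version__)` after `import cv2`.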

3.1 Detection on Image

  • Use command:

     ./darknet detector test obj.data yolov3.cfg yolov3_face_mask.weights input/1.jpg -thresh 0.45
    

    OR

  • Use inference script

     python mask-detector-image.py -y yolov3-mask-detector -i input/1.jpg
    
  • Output Image:

    1_output.jpg
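
The -thresh 0.45 flag in the Darknet command drops low-confidence detections. The sketch below shows the same idea in plain Python, together with the per-class counting that a live count needs. It is an illustration only: the `(label, confidence)` tuples and the class names "mask" and "no-mask" are assumptions, not the actual data structures of mask-detector-image.py.

```python
from collections import Counter


def count_by_class(detections, threshold=0.45):
    """Count detections per class label, ignoring those below `threshold`.

    `detections` is an iterable of (label, confidence) pairs (assumed format).
    """
    return Counter(
        label for label, confidence in detections if confidence >= threshold
    )


if __name__ == "__main__":
    detections = [("mask", 0.91), ("no-mask", 0.62), ("mask", 0.30)]
    counts = count_by_class(detections)
    print(counts["mask"], counts["no-mask"])  # 1 1
```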

3.2 Detection on Video

  • Use command:

     ./darknet detector demo obj.data yolov3.cfg yolov3_face_mask.weights <video-file> -thresh 0.45
    

    OR

  • Use inference script

     python mask-detector-video.py -y yolov3-mask-detector -i input/airport.mp4 -u 1
    
  • Output Video:

3.3 Detection on WebCam

  • Use command: (just remove input video file)

     ./darknet detector demo obj.data yolov3.cfg yolov3_face_mask.weights -thresh 0.45
    

    OR

  • Use inference script: (just remove input video file)

     python mask-detector-video.py -y yolov3-mask-detector -u 1
    
  • Output Video:

Note

  • All the results (images and videos) shown are outputs of YOLOv3; you can use YOLOv4 for better results.

Alert System

  • The alert system is part of the inference script code.
  • You can modify the SMS alert code in the script to customize the ratio that triggers the SMS.
  • It monitors the mask and no-mask counts and has 3 statuses:
    1. Safe: When all people are wearing a mask.
    2. Warning: When at least 1 person is without a mask.
    3. Danger (+ SMS alert): When some ratio of people are without a mask.
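
The three statuses above can be sketched as a small function. This is not the code from the inference script: the name `alert_status` and the default `danger_ratio=0.5` are assumptions made for illustration, since the actual ratio is whatever you configure in the script.

```python
def alert_status(mask_count, no_mask_count, danger_ratio=0.5):
    """Map the live counts to one of the three statuses.

    danger_ratio is the (assumed) fraction of people without a mask at
    which the status switches from Warning to Danger.
    """
    if no_mask_count == 0:
        return "Safe"  # everyone detected is wearing a mask (or nobody detected)
    total = mask_count + no_mask_count
    if no_mask_count / total >= danger_ratio:
        return "Danger"  # this is where the SMS alert would be triggered
    return "Warning"  # at least one person without a mask


if __name__ == "__main__":
    print(alert_status(5, 0))  # Safe
    print(alert_status(9, 1))  # Warning
    print(alert_status(1, 1))  # Danger
```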

Suggestions to improve Performance

  • As described earlier, YOLOv4 gives 93.95% mAP on the test set. This can be improved with the following tips:

    1. Use more training data.
    2. Use more data augmentation for the training data.
    3. Train with a larger network resolution by setting height=640 and width=640 in your .cfg file (any multiple of 32 works).
    4. For detection, use an even larger network resolution, such as 864x864.
    5. Try YOLOv5 or other object detection algorithms such as SSD, Faster R-CNN, RetinaNet, etc., as they are very good as of now (year 2020).
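
Since the network resolution must be a multiple of 32, a quick check before editing the .cfg file can save a failed run. A minimal sketch with hypothetical helper names:

```python
def is_valid_resolution(size, multiple=32):
    """True if `size` is a positive multiple of `multiple`."""
    return size > 0 and size % multiple == 0


def nearest_valid_resolution(size, multiple=32):
    """Round `size` to the nearest multiple of `multiple` (at least one multiple)."""
    return max(multiple, round(size / multiple) * multiple)


if __name__ == "__main__":
    print(is_valid_resolution(640), is_valid_resolution(864))  # True True
    print(nearest_valid_resolution(600))  # 608
```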

References