Skip to content
This repository has been archived by the owner on Jun 28, 2020. It is now read-only.

pimonteiro/Covid-19-Detector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

96 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Covid-19-Detector

Covid-19 XRay Detector This project was made on behalf of a subject on our 4th year of the Master's Degree on Data Science. It's goal is to achieve a model viable enough to produce good results on distinguishing cases of Pneumonia, Covid-19 and the absence of these. Due to the lack of a single source of data to train our models, we used a data-generator from "COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images", made by Linda Wang, Zhong Qiu Lin and Alexander Wong. Link here: https://github.com/lindawangg/COVID-Net. The only part we used was the data-generator provided but with some modifications to attend our needs. Thank you to all the participants of the COVID-Net for their hard-work on making it this far. The reason we didn't simply fork the main repository was to make it simpler for evaluation.

About Team

The developer team is three master students from University of Minho, Braga, Portugal.

Dataset

The dataset we used, as explained above, was generated using a script from the COVID-Net repository. The file was then uploaded to my personal drive, where we must unzip it in order to use it. Link for the dataset: https://drive.google.com/open?id=1EtA92aU1GmcRCR00hTxmyC2kfQIEYx-L

Being the data so unbalaced, a new dataset was made to fix this problem: https://drive.google.com/file/d/1_X_cO5INBY3EfSGEYlMe6hmc4ni2bujH/view?usp=sharing

Train_Data Original Balanced
COVID-19 223 1561
Pneumonia 5451 1700
Normal 7966 1700
Test_Data
COVID-19 31
Pneumonia 594
Normal 885

This was created by removing a lot of records with Pneumonia and Normal, randomly and generating new data of COVID-19 using some transformations: each real image was transformed 6 times. This was also made before the computation to speed up the process of training the various models (making image augmentation during the training caused epochs to take between 5 to 10 minutes to complete).

The exploration of these datasets can be found in the Covid_19_Data_Exploration.ipynb notebook.

Models

Here we discuss the models created by our group. All notebooks with their respective model follow the pipeline exposed in the standard notebook, Covid_19_Standard_Notebook.ipynb, and are displayed on the following table:

Link to model Short description Articles About
Covid_19_First Example This notebook was the first developed to test and understand the augmentation techniques, transfer learning. Article
Covid_19_VGG16 This notebook uses the VGG16 network architecture. Article
Covid_19_VGG19 This notebook uses the VGG19 network architecture. Article
Covid_19_XCeption This notebook uses the XCeption network architecture Article
Covid_19_ResNet50 This notebook uses the ResNet50 network architecture Article
Covid_19_LeNet This notebook uses the LeNet network architecture. Article
Covid_19_MobileNetV2 This notebook uses the MobileNetV2 network architecture Article
Covid_19_CXRP This notebook uses a newly created architecture - CXRP (Covid Xray Profiler)
Covid_19_BraDetect This notebook uses a newly created architecture - BraDetect
Covid_19_SCNN This notebook uses a newly created architecture - SCNN (Simple Covid-19 Neural Network)

Conclusion

Accuracy Precision Recall Notes
Covid-19 Normal Pneumonia Covid-19 Normal Pneumonia
VGG16 0.90 0.68 0.94 0.86 0.84 0.90 0.90 Retrained
VGG19 0.92 0.68 0.96 0.87 0.84 0.91 0.93 Retrained
XCeption 0.79 0.78 0.91 0.68 0.23 0.74 0.89 Freezed; Version 2
ResNet50 0.82 0.81 0.98 0.70 0.81 0.72 0.97 Freezed
MobileNetV2 0.86 1.00 0.97 0.75 0.71 0.79 0.97 Retrained; Overfitted
CXRP-3B 0.88 0.42 0.91 0.89 0.90 0.90 0.84 3 Blocks
LeNet 0.87 0.52 0.89 0.87 0.81 0.91 0.81
SCNN 0.84 0.29 0.93 0.81 0.77 0.83 0.86 2 Blocks
BraDetect 0.85 0.56 0.91 0.79 0.74 0.85 0.86

Of all the explored models the best one performance wise was the CXRP-3B, with a good balance in recall for all the different classes. One might think the VGG19 had a better development, but in fact the model isn't consistent enough to be have trustable predictions - the training appeared very random, even tho the predictions were right. All other models still had a very good performance, all with close numbers of accuracy, precision and recall. Despite others having better accuracy than the CXRP-3B, we tried to focus on the Recall metric because it represents the percentage of positive cases of each class that were correctly assigned - we want to avoid giving false negatives, as well as always try to have the most confident of not having Covid-19 - reason being why we want to maximize the recall of Covid-19.

To try and query the final model, please run this notebook.

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •