Transfer Learning with EfficientNet For The Classification of Brain Tumor MR Images
An End-to-End Deep Learning Project
In this project, we developed and deployed a classifier that assigns brain MR images to four classes: Glioma, Meningioma, Pituitary, and No Tumor.
Using transfer learning with EfficientNet B3, the final model achieved an accuracy of 99.23% on the test set.
🧠 Table of Contents:

- About the project 📃
- Data Pre-Processing ✂️
  - The Dataset
  - Libraries
  - Cropping the original images
- Model Developing & Deploying 🛠️ 🚀
  - Make the development of applications faster by creating a series of useful functions
    - Generate training, testing, and validation dataframes (image paths and labels) from the training and testing directories
    - Manage the balance of the dataset by adding augmented images to minority classes
    - Generate batches of augmented/normalized data for the training, testing, and validation dataframes
    - Display examples of training images
    - Create models using transfer learning with EfficientNet (B0, B3, B5, B7)
    - Custom model callback
    - Train the model
    - Display metrics (loss and accuracy) for training and validation
    - Make predictions on the test set
    - Confusion matrix
    - ROC & AUC curves
  - Export and deploy the trained model
- Conclusions!
The automatic classification of medical images plays a vital role in the diagnosis, growth prediction, and treatment of brain tumors. The earlier a brain tumor is diagnosed, the more likely it is to respond to treatment, which ultimately improves the survival rate for patients. Manually classifying brain tumors in large medical image databases is one of the most time-consuming and labor-intensive clinical tasks. As a result, automatic detection and classification procedures are desirable and worthwhile.
This project focuses on multi-class brain tumor classification using transfer learning with a pre-trained Convolutional Neural Network (CNN), EfficientNet B3.
To achieve better classification results, I applied image cropping techniques specifically used in the medical imaging field. In addition to adding augmented images to minority classes to balance the dataset, I also customized a callback that automatically adjusts the learning rate during training for the best accuracy.
Brain Tumor Classification (MRI) Dataset from Kaggle
Distribution of images in the training dataset directory by brain tumor class:
- Glioma: 1321
- Meningioma: 1339
- No Tumor: 1595
- Pituitary: 1457
Distribution of images in the testing dataset directory by brain tumor class:
- Glioma: 300
- Meningioma: 306
- No Tumor: 405
- Pituitary: 300
Source: Johns Hopkins Medical Center
Example MR images from the dataset, one per class: Glioma, Meningioma, No Tumor, Pituitary.
Glioma
- Gliomas, which arise from the gluey supportive (glial) cells that surround and support neurons, account for about 33% of all brain tumors.
- Depending on its location and growth rate, a glioma can affect brain function and be life-threatening.
Meningioma
- Meningiomas generally arise from the membranes that surround the brain and spinal cord (central nervous system).
- About 90% of meningioma tumors are benign (not cancerous).
Pituitary
- A pituitary tumor forms in the pituitary gland near the brain and can cause changes in hormone levels in the body.
- Most pituitary tumors are noncancerous growths.
- Pituitary cancers are very rare
- Opencv-python 4.6.0.66
- imutils
  - A series of convenience functions that make basic image processing operations such as translation, rotation, resizing, skeletonization, displaying Matplotlib images, sorting contours, and detecting edges easier with OpenCV and both Python 2.7 and Python 3.
- matplotlib
- PIL
  - The Python Imaging Library adds image processing capabilities to your Python interpreter.
- seaborn
- shutil
- sklearn
- tensorflow
- keras
- time
- tqdm
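
A minimal sketch of the imports these libraries imply (the grouping and exact submodules are assumptions; the notebook may import them differently):

```python
# Core utilities and image handling
import os
import shutil
import time
import cv2                     # opencv-python
import imutils
import numpy as np
import pandas as pd
from PIL import Image
from tqdm import tqdm

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Modeling and evaluation
import tensorflow as tf
from tensorflow import keras
from sklearn.metrics import confusion_matrix, classification_report
```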
- To make the MR images uniformly sized, resize all of them to 256×256 pixels.
- Noise was another major issue with the MR images, so we crop out the unnecessary portions of the input images to improve their quality (see the sketch below).
- Export the cropped MR images to a separate directory, ready for feeding the model.
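
A hedged sketch of one common contour-based cropping approach with OpenCV and imutils, assuming grayscale thresholding and the extreme points of the largest contour; the exact threshold values, margins, and helper name used in the notebook may differ:

```python
import cv2
import imutils

def crop_brain_region(image, margin=0):
    """Crop an MR image to the bounding box of the largest contour (the brain region)."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    # Threshold the image, then erode/dilate to remove small specks of noise
    thresh = cv2.threshold(gray, 45, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.erode(thresh, None, iterations=2)
    thresh = cv2.dilate(thresh, None, iterations=2)

    # Find the largest external contour and its extreme points
    cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    if not cnts:
        return cv2.resize(image, (256, 256))   # nothing detected; keep the original frame
    c = max(cnts, key=cv2.contourArea)

    left = tuple(c[c[:, :, 0].argmin()][0])
    right = tuple(c[c[:, :, 0].argmax()][0])
    top = tuple(c[c[:, :, 1].argmin()][0])
    bottom = tuple(c[c[:, :, 1].argmax()][0])

    # Crop around the brain region, then resize to the common input size
    cropped = image[top[1] - margin:bottom[1] + margin,
                    left[0] - margin:right[0] + margin]
    return cv2.resize(cropped, (256, 256))
```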
| | filepaths | labels |
|---|---|---|
| 2209 | /content/drive/MyDrive/0_data_science/mri_brai... | meningioma |
| 907 | /content/drive/MyDrive/0_data_science/mri_brai... | glioma |
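
A sketch of how such a filepath/label dataframe could be built from class-per-subfolder directories; the directory paths, the validation split size, and the `make_dataframe` helper name are assumptions for illustration:

```python
import os
import pandas as pd
from sklearn.model_selection import train_test_split

def make_dataframe(data_dir):
    """Build a dataframe of (filepath, label) pairs from class sub-directories."""
    filepaths, labels = [], []
    for label in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue
        for fname in os.listdir(class_dir):
            filepaths.append(os.path.join(class_dir, fname))
            labels.append(label)
    return pd.DataFrame({'filepaths': filepaths, 'labels': labels})

train_df = make_dataframe('cropped/Training')   # hypothetical paths
test_df = make_dataframe('cropped/Testing')

# Hold out part of the training data for validation
train_df, valid_df = train_test_split(train_df, test_size=0.1,
                                      stratify=train_df['labels'], random_state=42)
```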
- Create a new folder named 'new_aug' for storing augmented images.
- Apply image augmentation by rotating, shifting, and flipping horizontally.
- Merge the newly augmented image paths and labels with the existing training/testing/validation dataframes.
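
A hedged sketch of topping up minority classes with augmented images saved to 'new_aug'; the target count (size of the largest class), transform ranges, and image size are assumptions:

```python
import os
import pandas as pd
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug_dir = 'new_aug'
os.makedirs(aug_dir, exist_ok=True)

# Rotation, shifts, and horizontal flips, as described above
gen = ImageDataGenerator(rotation_range=20, width_shift_range=0.1,
                         height_shift_range=0.1, horizontal_flip=True)

target = train_df['labels'].value_counts().max()   # size of the largest class
aug_rows = []

for label, count in train_df['labels'].value_counts().items():
    needed = target - count
    if needed <= 0:
        continue
    class_dir = os.path.join(aug_dir, label)
    os.makedirs(class_dir, exist_ok=True)
    subset = train_df[train_df['labels'] == label]
    flow = gen.flow_from_dataframe(subset, x_col='filepaths', y_col='labels',
                                   target_size=(256, 256), batch_size=1,
                                   class_mode=None, shuffle=True,
                                   save_to_dir=class_dir, save_prefix='aug')
    for _ in range(needed):        # each call writes one augmented image to disk
        next(flow)
    aug_rows.append(pd.DataFrame({
        'filepaths': [os.path.join(class_dir, f) for f in os.listdir(class_dir)],
        'labels': label}))

# Merge the augmented image paths with the original training dataframe
train_df = pd.concat([train_df] + aug_rows, ignore_index=True)
```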
- Using the flow_from_dataframe function, generate batches of tensor image data with real-time data augmentation.
  - TensorFlow can use a dataframe to determine which images to use and which class each image belongs to.
- In the test_gen, choose the batch size and test steps such that `batch_size * test_steps` equals the number of samples in the test set. This ensures that we run through the whole test set exactly once.
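
A sketch of the three generators, assuming a 256×256 target size and a training batch size of 32; the test batch size is chosen so that `batch_size * test_steps` exactly covers the test set:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_size = (256, 256)
batch_size = 32

train_gen = ImageDataGenerator(horizontal_flip=True).flow_from_dataframe(
    train_df, x_col='filepaths', y_col='labels', target_size=img_size,
    class_mode='categorical', batch_size=batch_size, shuffle=True)

valid_gen = ImageDataGenerator().flow_from_dataframe(
    valid_df, x_col='filepaths', y_col='labels', target_size=img_size,
    class_mode='categorical', batch_size=batch_size, shuffle=False)

# Pick the largest batch size <= 80 that divides the test set evenly,
# so that batch_size * test_steps == number of test samples
n_test = len(test_df)
test_batch_size = max(b for b in range(1, 81) if n_test % b == 0)
test_steps = n_test // test_batch_size

test_gen = ImageDataGenerator().flow_from_dataframe(
    test_df, x_col='filepaths', y_col='labels', target_size=img_size,
    class_mode='categorical', batch_size=test_batch_size, shuffle=False)
```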
- Display the first 25 images, along with their labels, from the first batch of the training data generator.
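
A small sketch of that display step, assuming the train_gen defined above yields unscaled (0-255) pixel values:

```python
import matplotlib.pyplot as plt
import numpy as np

images, labels = next(train_gen)                  # first batch
class_names = list(train_gen.class_indices.keys())

plt.figure(figsize=(12, 12))
for i in range(min(25, len(images))):
    plt.subplot(5, 5, i + 1)
    plt.imshow(images[i].astype('uint8'))         # generator yields float pixel values
    plt.title(class_names[np.argmax(labels[i])], fontsize=9)
    plt.axis('off')
plt.show()
```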
- Train our model using EfficientNet with transferred ImageNet weights as the base model.
- Remove the top layer from the base model and set the remaining layers to non-trainable.
- By doing this, we can pass the image data through the pretrained model and get an output feature tensor.
- Using this output as the input to our additional dense layers, we can train only those layers.
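
A hedged sketch of the transfer-learning model with EfficientNet B3 as the frozen base; the pooling mode, dense layer size, dropout rate, and optimizer are assumptions:

```python
from tensorflow import keras

num_classes = 4
base = keras.applications.EfficientNetB3(
    include_top=False,                 # drop the original classification head
    weights='imagenet',                # transferred ImageNet weights
    input_shape=(256, 256, 3),
    pooling='max')
base.trainable = False                 # freeze the pretrained layers

model = keras.Sequential([
    base,
    keras.layers.BatchNormalization(),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(num_classes, activation='softmax'),
])

model.compile(optimizer=keras.optimizers.Adamax(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])
```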
When training is complete, the model weights are set to those of the epoch with the lowest validation loss.

`callbacks = [LR_ASK(model, epochs, ask_epoch, dwell=True, factor=.4)]`

- model: the compiled model (so the callback can read and set its weights and learning rate)
- epochs: an integer, the number of epochs to run, as specified in model.fit
- ask_epoch: an integer; if ask_epoch = 10, the model trains for 10 epochs before asking for user input
- User input:
  - 'h': halt the training process
  - an integer value, e.g. 5: continue training for 5 more epochs, then ask the user for a new value
- dwell: a boolean value (True/False)
  - If True, the callback keeps track of the current epoch's validation loss and compares it with the lowest validation loss so far.
    - If the current validation loss is lower than the lowest so far, the current epoch's weights are saved as the best weights.
    - If the current validation loss exceeds the lowest so far, the learning rate is lowered by multiplying the current learning rate by factor (a float value between 0 and 1).
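
A condensed, hypothetical sketch of a callback with the behaviour described above (ask for user input every ask_epoch epochs, and "dwell" on the best weights); the actual LR_ASK implementation in the notebook is more elaborate:

```python
import numpy as np
import tensorflow as tf

class LRAskSketch(tf.keras.callbacks.Callback):
    """Sketch: ask the user whether to continue, and 'dwell' on the best weights."""

    def __init__(self, model, epochs, ask_epoch, dwell=True, factor=0.4):
        super().__init__()
        self.ask_model = model
        self.epochs = epochs
        self.ask_epoch = ask_epoch
        self.dwell = dwell
        self.factor = factor
        self.lowest_vloss = np.inf
        self.best_weights = None

    def on_epoch_end(self, epoch, logs=None):
        vloss = logs['val_loss']
        if vloss < self.lowest_vloss:
            # Improvement: remember these weights as the best so far
            self.lowest_vloss = vloss
            self.best_weights = self.ask_model.get_weights()
        elif self.dwell:
            # No improvement: restore the best weights and lower the learning rate
            lr = float(tf.keras.backend.get_value(self.ask_model.optimizer.learning_rate))
            tf.keras.backend.set_value(self.ask_model.optimizer.learning_rate, lr * self.factor)
            self.ask_model.set_weights(self.best_weights)

        if epoch + 1 == self.ask_epoch:
            ans = input("Enter h to halt, or an integer for how many more epochs to run: ")
            if ans == 'h':
                self.ask_model.stop_training = True
            else:
                self.ask_epoch += int(ans)

    def on_train_end(self, logs=None):
        # Leave the model with the weights from the lowest-validation-loss epoch
        if self.best_weights is not None:
            self.ask_model.set_weights(self.best_weights)
```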
- Plot the change in accuracy and loss over the training epochs for the training and validation datasets.
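
A small sketch of that plot, assuming `history = model.fit(...)` was captured:

```python
import matplotlib.pyplot as plt

def plot_history(history):
    """Plot training vs. validation loss and accuracy over the epochs."""
    epochs = range(1, len(history.history['loss']) + 1)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    ax1.plot(epochs, history.history['loss'], label='training loss')
    ax1.plot(epochs, history.history['val_loss'], label='validation loss')
    ax1.set_xlabel('epoch'); ax1.set_ylabel('loss'); ax1.legend()
    ax2.plot(epochs, history.history['accuracy'], label='training accuracy')
    ax2.plot(epochs, history.history['val_accuracy'], label='validation accuracy')
    ax2.set_xlabel('epoch'); ax2.set_ylabel('accuracy'); ax2.legend()
    plt.show()
```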
- Define a function that takes in a test generator and an integer test_steps (a sketch follows the report below).
- A confusion matrix is generated from the test-set predictions.
- Generate the classification report:
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| glioma | 1.0000 | 0.9767 | 0.9882 | 300 |
| meningioma | 0.9838 | 0.9902 | 0.9870 | 306 |
| notumor | 0.9951 | 1.0000 | 0.9975 | 405 |
| pituitary | 0.9901 | 1.0000 | 0.9950 | 300 |
| accuracy | | | 0.9924 | 1311 |
| macro avg | 0.9922 | 0.9917 | 0.9919 | 1311 |
| weighted avg | 0.9924 | 0.9924 | 0.9924 | 1311 |
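
A sketch of the evaluation step behind the confusion matrix and the report above, assuming the test_gen and test_steps defined earlier and that test_gen was built with shuffle=False:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report

def evaluate_on_test(model, test_gen, test_steps):
    """Predict on the test generator, then show a confusion matrix and a classification report."""
    preds = model.predict(test_gen, steps=test_steps)
    y_pred = np.argmax(preds, axis=1)
    y_true = test_gen.classes                    # valid because shuffle=False in test_gen
    class_names = list(test_gen.class_indices.keys())

    cm = confusion_matrix(y_true, y_pred)
    sns.heatmap(cm, annot=True, fmt='d',
                xticklabels=class_names, yticklabels=class_names)
    plt.xlabel('predicted'); plt.ylabel('true'); plt.show()

    print(classification_report(y_true, y_pred,
                                target_names=class_names, digits=4))
```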