As access to the assignments is lost after finishing the course, this repository helps you work through them locally in an IDE. The aim is to help you brush up on the concepts taught by Professor Andrew Ng.
Summary of each assignment for quick reference:
AIM: Sequential and Functional API implementations of TensorFlow models.
Number of assignments: 2 (Convolution_model_Step_by_Step, Convolution_model_Application). Only the second assignment (Convolution_model_Application) is TensorFlow-based and is included in this repository.
Link to datasets: Happy Face, hand signs
Assignment 1: Binary classification on the Happy Face dataset.
- Build and train a ConvNet in TF for a binary classification problem using the Sequential API.
- Dataset: The data is stored in h5 format. More details can be found here.
- Train and test data are provided.
- The training dataset has 600 images of shape (64, 64, 3), so the full batch can be represented as (600, 64, 64, 3).
- The test dataset has 150 images with the same dimensions as the training set.
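A minimal sketch of what this Sequential binary classifier can look like; the layer sizes and hyperparameters below are illustrative and may differ from the graded notebook:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of a Sequential binary classifier for (64, 64, 3) Happy Face images.
# Layer choices are illustrative, not the exact graded solution.
happy_model = tf.keras.Sequential([
    layers.ZeroPadding2D(padding=3, input_shape=(64, 64, 3)),
    layers.Conv2D(32, kernel_size=7, strides=1),
    layers.BatchNormalization(axis=3),
    layers.ReLU(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(1, activation='sigmoid'),  # single unit for the binary label
])

happy_model.compile(optimizer='adam',
                    loss='binary_crossentropy',
                    metrics=['accuracy'])
# happy_model.fit(X_train, Y_train, epochs=10, batch_size=16)  # X_train: (600, 64, 64, 3)
```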
Assignment 2: Multiclass classification on the hand signs dataset.
- Build and train a ConvNet in TF for multiclass classification using the Functional API.
- Dataset: The data is stored in h5 format, like Assignment 1.
- The main differences are the use of the more flexible Functional API and one-hot encoding of the labels.
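A minimal Functional API sketch for the multiclass case; the filter counts and pool sizes are illustrative, and the labels are assumed to be one-hot encoded (e.g. with tf.one_hot):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of a Functional API ConvNet for 6-class hand-sign classification.
inputs = tf.keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(8, kernel_size=4, padding='same', activation='relu')(inputs)
x = layers.MaxPooling2D(pool_size=8, strides=8, padding='same')(x)
x = layers.Conv2D(16, kernel_size=2, padding='same', activation='relu')(x)
x = layers.MaxPooling2D(pool_size=4, strides=4, padding='same')(x)
x = layers.Flatten()(x)
outputs = layers.Dense(6, activation='softmax')(x)  # one unit per sign class

sign_model = tf.keras.Model(inputs=inputs, outputs=outputs)
sign_model.compile(optimizer='adam',
                   loss='categorical_crossentropy',  # expects one-hot labels
                   metrics=['accuracy'])
# Y_train_oh = tf.one_hot(Y_train, depth=6)  # one-hot encode integer labels
```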
AIM: ResNet and MobileNet model implementations.
Number of assignments: 2 (Residual Network and MobileNet)
Link to datasets: hand signs, Alpaca/Not Alpaca
Assignment 1: ResNet50
- Implement the basic building blocks of ResNets in a deep neural network using Keras.
- Put the building blocks (identity block and convolutional block) together to implement and train a state-of-the-art neural network for image classification.
- Implement a skip connection in your network.
- Dataset: hand signs dataset stored in h5 format. Same as C4W1A1.
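A minimal sketch of the identity block with its skip connection, in the Keras Functional style (filter counts and layer details are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def identity_block(X, f, filters):
    """Sketch of a ResNet identity block: a three-conv main path plus a
    skip connection that adds the block's input back in before the final ReLU."""
    F1, F2, F3 = filters
    X_shortcut = X  # save the input for the skip connection

    X = layers.Conv2D(F1, kernel_size=1, strides=1, padding='valid')(X)
    X = layers.BatchNormalization(axis=3)(X)
    X = layers.Activation('relu')(X)

    X = layers.Conv2D(F2, kernel_size=f, strides=1, padding='same')(X)
    X = layers.BatchNormalization(axis=3)(X)
    X = layers.Activation('relu')(X)

    X = layers.Conv2D(F3, kernel_size=1, strides=1, padding='valid')(X)
    X = layers.BatchNormalization(axis=3)(X)

    X = layers.Add()([X, X_shortcut])  # skip connection
    return layers.Activation('relu')(X)
```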
Assignment 2: MobileNet
- Create a dataset from a directory.
- Preprocess and augment data using the Sequential API.
- Adapt a pretrained model to new data and train a classifier using the Functional API and MobileNet.
- Fine-tune the final layers of the model to capture high-level details near the end of the network and potentially improve accuracy.
- Dataset: The model was trained on the Alpaca/Not Alpaca dataset.
- The images are stored in two folders named ‘Alpaca’ and ‘Not Alpaca’; these folder names become the two classes to be classified.
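A minimal transfer-learning sketch along those lines, assuming a frozen MobileNetV2 base and a local image directory; the 'dataset/' path, image size, and classification head are assumptions, not the graded code:

```python
import tensorflow as tf

IMG_SIZE = (160, 160)

# Create a dataset from a directory; class names are inferred from the
# 'Alpaca' and 'Not Alpaca' folder names (the 'dataset/' path is a placeholder).
train_ds = tf.keras.utils.image_dataset_from_directory(
    'dataset/', image_size=IMG_SIZE, batch_size=32)

# Data augmentation built with the Sequential API.
augmenter = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.2),
])

# Pretrained MobileNetV2 base, frozen for the first round of training.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights='imagenet')
base_model.trainable = False

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = augmenter(inputs)
x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)  # single logit: Alpaca vs Not Alpaca
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
# For fine-tuning, set base_model.trainable = True, freeze all but the last
# few layers, and recompile with a much smaller learning rate.
```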
AIM: YOLO and U-Net model implementations.
Number of assignments: 2 (YOLO model and U-Net model)
Link to datasets: drive.ai dataset, CARLA dataset
Assignment 1: YOLO
- Detect objects in a car detection dataset
- Implement non-max suppression to increase accuracy
- Implement intersection over union
- Handle bounding boxes, a type of image annotation popular in deep learning
- As the YOLO model is computationally expensive to train, the pre-trained weights are used.
- Refer to this notebook for a detailed understanding of the implementation.
- NOTE: There was a problem loading yolo.h5 in the Python script: a ‘Bad Marshal Data’ error. Refer to the following link for the possible reasons. The project was run on TF 2.9.1 and Python 3.9.
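A minimal sketch of intersection over union for corner-format boxes; the box format is an assumption, and non-max suppression itself can be delegated to TensorFlow's built-in op:

```python
import tensorflow as tf

def iou(box1, box2):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)

    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area

# Non-max suppression over scored boxes can use TensorFlow's built-in op:
# keep = tf.image.non_max_suppression(boxes, scores, max_output_size=10,
#                                     iou_threshold=0.5)
```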
Assignment 2: U-Net
- Build your own U-Net
- Explain the difference between a regular CNN and a U-Net
- Implement semantic image segmentation on the CARLA self-driving car dataset
- Apply sparse categorical crossentropy for pixelwise prediction
- Dataset: CARLA dataset
- The CARLA dataset has two subfolders (CameraRGB and CameraSeg).
- CameraRGB contains the actual images, each of shape (600, 800, 3).
- CameraSeg contains the corresponding masks, also of shape (600, 800, 3); however, only the first channel carries the label. There are 13 different classes in the mask dataset.
- Here is a similar notebook for image semantic segmentation with U-Net on the complete CARLA dataset, including splitting the dataset into training, validation, and test sets.
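A minimal sketch of how the masks can be reduced to single-channel labels and how the model can be compiled for pixelwise prediction; the file path and the build_unet helper are hypothetical placeholders:

```python
import tensorflow as tf

def load_mask(mask_path):
    """Read a CARLA CameraSeg mask and keep only the first channel,
    which carries the class label (values 0-12)."""
    mask = tf.io.read_file(mask_path)
    mask = tf.image.decode_png(mask, channels=3)
    return mask[:, :, 0:1]  # (600, 800, 1) integer labels

# Sparse categorical crossentropy compares integer pixel labels directly
# against the 13-way output, so no one-hot encoding of the masks is needed.
# unet = build_unet(input_shape=(96, 128, 3), n_classes=13)  # hypothetical builder
# unet.compile(optimizer='adam',
#              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
#              metrics=['accuracy'])
```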
AIM: Face recognition and neural style transfer.
Number of assignments: 2 (Face Recognition and Neural Style Transfer)
Important links: keras_facenet_model
Assignment 1: Face Recognition
- Differentiate between face recognition and face verification.
- Implement one-shot learning to solve a face recognition problem.
- Apply the triplet loss function to learn a network's parameters in the context of face recognition.
- Explain how to pose face recognition as a binary classification problem.
- Map face images into 128-dimensional encodings using a pretrained model.
- Perform face verification and face recognition with these encodings.
- Dataset: Images to test the verification on
- A pre-trained model is used.
- The assignment focuses on how to use the pre-trained model for the face recognition and verification problems.
- Note: Loading "model.h5" on the local interpreter with Python v3.9 and TensorFlow 2.9.1 throws "ValueError: bad marshal data". It is recommended to run this assignment on Google Colab or use Python v3.7.13 and TensorFlow 2.8.2.
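A minimal sketch of the triplet loss over (anchor, positive, negative) 128-dimensional encodings, following the standard formulation with margin alpha:

```python
import tensorflow as tf

def triplet_loss(y_true, y_pred, alpha=0.2):
    """Triplet loss on face encodings. y_pred packs the (anchor, positive,
    negative) encodings; y_true is unused but kept for the Keras loss signature."""
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)  # ||f(A)-f(P)||^2
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)  # ||f(A)-f(N)||^2
    basic_loss = pos_dist - neg_dist + alpha
    return tf.reduce_sum(tf.maximum(basic_loss, 0.0))
```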
Assignment 2: Neural Style Transfer
- Implement the neural style transfer algorithm
- Generate novel artistic images using the algorithm
- Define the style cost function for Neural Style Transfer
- Define the content cost function for Neural Style Transfer
- A pre-trained VGG19 model was used.
- Learned how to implement the style cost function and the content cost function, and how to optimize the combined cost.
- No particular dataset is needed for training, as a pre-trained model is used; one can choose any content and style images to experiment with.
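For reference, a minimal sketch of the Gram-matrix-based style cost for a single layer; the normalization follows the standard formulation and is not necessarily the exact notebook code:

```python
import tensorflow as tf

def gram_matrix(A):
    """Gram matrix of unrolled activations A with shape (n_C, n_H * n_W)."""
    return tf.matmul(A, tf.transpose(A))

def layer_style_cost(a_S, a_G):
    """Style cost for one layer, given activations of shape (1, n_H, n_W, n_C)
    from the style image (a_S) and the generated image (a_G)."""
    _, n_H, n_W, n_C = a_G.get_shape().as_list()
    a_S = tf.transpose(tf.reshape(a_S, [-1, n_C]))  # (n_C, n_H * n_W)
    a_G = tf.transpose(tf.reshape(a_G, [-1, n_C]))
    GS, GG = gram_matrix(a_S), gram_matrix(a_G)
    return tf.reduce_sum(tf.square(GS - GG)) / (4 * (n_C * n_H * n_W) ** 2)
```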