Here is the final output of the project:
- Creates SageMaker notebook instance, IAM role, S3 bucket for output
```bash
aws cloudformation create-stack --stack-name ShipDetection --template-body file://cloudformation/infrastructure.yaml --capabilities CAPABILITY_NAMED_IAM
```
- Check the status until it returns `CREATE_COMPLETE`
```bash
aws cloudformation describe-stacks --stack-name ShipDetection --query 'Stacks[0].StackStatus'
```
- Keep track of the S3 bucket name
```bash
aws cloudformation describe-stacks --stack-name ShipDetection --query 'Stacks[0].Outputs[0].OutputValue'
```
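If you'd rather script this, here's a minimal sketch that waits for the stack to finish and captures the bucket name into a shell variable. It assumes the bucket name is the first (or only) stack output, matching the query above.

```bash
# Block until the stack reaches CREATE_COMPLETE (errors out on rollback)
aws cloudformation wait stack-create-complete --stack-name ShipDetection

# Capture the bucket name from the stack outputs (assumes it's the first output)
BUCKET_NAME=$(aws cloudformation describe-stacks \
  --stack-name ShipDetection \
  --query 'Stacks[0].Outputs[0].OutputValue' \
  --output text)
echo "S3 bucket: ${BUCKET_NAME}"
```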
TODO:
- Figure out a clever way to give consumers of this project read access to the S3 bucket in my account so I don't need to toggle public access manually (one possible approach is sketched below).
- Add this to-be-created repo to the startup of the notebook so all of the files (e.g. train_ship_segmentations_v2.csv) are already there for future users.
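One possible approach for the first TODO, as a sketch only: attach a bucket policy that grants anonymous read on the objects instead of toggling public access by hand. The bucket name below is a placeholder, and the bucket's (and account's) Block Public Access settings have to permit public policies for this to take effect.

```bash
# Hypothetical example: substitute the bucket name from the stack output
aws s3api delete-public-access-block --bucket your-bucket-name
aws s3api put-bucket-policy --bucket your-bucket-name --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadForProjectConsumers",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::your-bucket-name/*"
  }]
}'
```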
Shortcut: Continue to Step 3
OR
Long way: Continue to the DIY steps
Once the CloudFormation code sets up your SageMaker notebook and S3 bucket, you are free to continue. This step leverages my cleaned, transformed, and hosted data.
- You'll need the name of the S3 bucket you created here for SageMaker output.
- Open `notebooks/ship-object-detection.ipynb` and run through the steps.
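If you prefer the CLI to the console for opening the notebook, something like the following works. The instance name is whatever the CloudFormation template assigned, so the placeholder below is just that: a placeholder.

```bash
# Find the notebook instance the stack created
aws sagemaker list-notebook-instances --query 'NotebookInstances[].NotebookInstanceName' --output text

# Generate a presigned URL for it (substitute the name from the listing above)
aws sagemaker create-presigned-notebook-instance-url --notebook-instance-name <your-notebook-instance-name>
```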
QUESTION:
- It's odd that Kaggle provides test data for this competition but it doesn't seem to be used here. I wonder if it should be combined with the training data and then split out again later. Maybe I'm missing something.
The long way? Good for you! Remember, you'll only need to follow these steps if you want to start completely from scratch. I am hosting training and validation data in my S3 bucket to simplify the process. That's not your style, though, is it?
Added time: 6 hours (mostly waiting)
- Spin up an EC2 instance or follow these instructions locally
- Install python3 & pip3
- Install the Kaggle API
- Provide your API credentials locally (see the setup sketch after this list)
- Download the data
```bash
kaggle competitions download -c airbus-ship-detection
```
- Unzip the training data (the archive name is assumed to be `train_v2.zip`; adjust to whatever the Kaggle download actually produced)
```bash
mkdir training && unzip -qo train_v2.zip -d training
```
- Unzip the test data (again assuming the archive is named `test_v2.zip`)
```bash
mkdir test && unzip -qo test_v2.zip -d test
```
- Upload training data
```bash
aws s3 sync training s3://your-bucket-name/training/
```
- Upload test data
```bash
aws s3 sync test s3://your-bucket-name/test/
```
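A minimal setup sketch for the install and credential steps above, assuming you already have a Kaggle account and have downloaded an API token (kaggle.json) from your account page:

```bash
# Install the Kaggle CLI
pip3 install kaggle

# Put the API token where the CLI expects it and lock down its permissions
mkdir -p ~/.kaggle
cp /path/to/kaggle.json ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json
```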
Expect to download and transfer about 29 GB.
|  | Total size | Total objects |
|---|---|---|
| Training | 27.0 GB | 192556 |
| Test | 2.2 GB | 15606 |
This step shows you how to use a cited notebook to convert masks to bounding boxes.
Added time: 1-2 hours
- Open `notebooks/from-masks-to-bounding-boxes.ipynb` and run through the steps.
TODO:
- Possibly move `data/train_ship_segmentations_v2.csv` from the repo to S3 (sketched below)
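If that move happens, it should be a one-liner; the bucket name here is a placeholder:

```bash
aws s3 cp data/train_ship_segmentations_v2.csv s3://your-bucket-name/data/
```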
This step walks you through modifying the data to create annotation files in JSON format.
Added time: 1-2 hours
- Open `notebooks/data-prep.ipynb` and run through the steps.
- Tear down the CloudFormation stack (see the note below if the delete gets stuck)
```bash
aws cloudformation delete-stack --stack-name ShipDetection
```
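Note that CloudFormation can't delete a bucket that still contains objects, so if the stack delete stalls, empty the output bucket first and then wait for the deletion to finish. The bucket name is the one reported in the stack outputs.

```bash
# Empty the output bucket so CloudFormation can remove it
aws s3 rm s3://your-bucket-name --recursive

# Block until the stack is gone
aws cloudformation wait stack-delete-complete --stack-name ShipDetection
```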
- Remove the SageMaker endpoint (leaving it running will rack up your bill!)
- Make sure you only have the one endpoint
```bash
aws sagemaker list-endpoints
```
- Assuming you do, list the endpoints again and pass the first one into a delete command
```bash
aws sagemaker list-endpoints --query 'Endpoints[0].EndpointName' --output text | xargs -I {} aws sagemaker delete-endpoint --endpoint-name {}
```
- Remove the SageMaker model
- Make sure you only have the one model
```bash
aws sagemaker list-models --query 'Models[].ModelName' --output text
```
- Assuming you do, list the models again and pass the first one into a delete command
```bash
aws sagemaker list-models --query 'Models[0].ModelName' --output text | xargs -I {} aws sagemaker delete-model --model-name {}
```
- Remove the SageMaker endpoint configs
- Make sure you only have the one config
```bash
aws sagemaker list-endpoint-configs --query 'EndpointConfigs[].EndpointConfigName' --output text
```
- Assuming you do, list the configs again and pass the first one into a delete command
```bash
aws sagemaker list-endpoint-configs --query 'EndpointConfigs[0].EndpointConfigName' --output text | xargs -I {} aws sagemaker delete-endpoint-config --endpoint-config-name {}
```
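As a final sanity check, a quick sweep to confirm nothing billable is left behind; all of these should come back empty once cleanup is done.

```bash
aws sagemaker list-endpoints --query 'Endpoints[].EndpointName' --output text
aws sagemaker list-models --query 'Models[].ModelName' --output text
aws sagemaker list-endpoint-configs --query 'EndpointConfigs[].EndpointConfigName' --output text
aws sagemaker list-notebook-instances --query 'NotebookInstances[].NotebookInstanceName' --output text
```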