This project contains code written to analyze disaster data from Appen and to build a model for an API that classifies disaster messages.
The project also includes a web app where an emergency worker can input a new message and get classification results in several categories. This is a multi-output classification task.
- Installation
- Project Motivation
- Project Overview
- Setup Instructions
- Results
- Acknowledgements
## Installation

There are no major libraries required to run the code beyond what is provided in the Anaconda Python distribution. The code can be run with any version of Python 3.
## Project Motivation

Following a disaster, response organizations receive millions of communications, either directly or via social media, at the very time when they have the least capacity to filter out the most important messages. Typically, different organizations handle different parts of the problem: one might be in charge of water, another of blocked roads, another of fire. Yet often only about one in a thousand messages is relevant to disaster response professionals. Supervised machine learning approaches, which are more accurate than keyword searches, are therefore used to analyze the data and determine which organization should respond to which need.
## Project Overview

There are three main components of this project:
The first component is implemented in `process_data.py`; as sketched below, it:

- Loads the `messages` and `categories` datasets.
- Merges the two datasets.
- Cleans the data.
- Stores the cleaned data in a SQLite database.
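A minimal sketch of those steps is below. It assumes the two CSVs share an `id` column and that the `categories` column encodes labels as `label-0;label-1;...`, as in the standard disaster response dataset; the table name `DisasterMessages` is illustrative, not necessarily what `process_data.py` uses.

```python
import sys

import pandas as pd
from sqlalchemy import create_engine


def main(messages_filepath, categories_filepath, database_filepath):
    # Load the messages and categories datasets
    messages = pd.read_csv(messages_filepath)
    categories = pd.read_csv(categories_filepath)

    # Merge the two datasets on their shared id column (assumed layout)
    df = messages.merge(categories, on="id")

    # Expand the semicolon-separated categories column
    # ("related-1;request-0;...") into one numeric column per label
    cat_df = df["categories"].str.split(";", expand=True)
    cat_df.columns = cat_df.iloc[0].str.rsplit("-", n=1).str[0]
    for col in cat_df.columns:
        cat_df[col] = cat_df[col].str[-1].astype(int)

    # Clean: swap in the expanded labels and drop duplicate rows
    df = pd.concat([df.drop(columns="categories"), cat_df], axis=1)
    df = df.drop_duplicates()

    # Store the result in a SQLite database
    engine = create_engine(f"sqlite:///{database_filepath}")
    df.to_sql("DisasterMessages", engine, index=False, if_exists="replace")


if __name__ == "__main__":
    main(*sys.argv[1:4])
```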
The second component, given in `train_classifier.py`, is a machine learning pipeline that (see the sketch below):

- Loads data from the SQLite database.
- Splits the dataset into training and testing sets.
- Builds a text processing and machine learning pipeline.
- Trains and tunes a model using GridSearchCV.
- Outputs the results on the test set.
- Exports the final model as a pickle file.
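A minimal sketch of such a multi-output pipeline, using scikit-learn's `TfidfVectorizer` and a `MultiOutputClassifier` over a random forest; the table name, column layout, and hyperparameter grid are assumptions, not necessarily what `train_classifier.py` uses:

```python
import pickle
import sys

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline
from sqlalchemy import create_engine


def main(database_filepath, model_filepath):
    # Load data from the SQLite database
    engine = create_engine(f"sqlite:///{database_filepath}")
    df = pd.read_sql_table("DisasterMessages", engine)
    X = df["message"]
    Y = df.drop(columns=["id", "message", "original", "genre"])  # assumed layout

    # Split the dataset into training and testing sets
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=42)

    # Text processing + multi-output machine learning pipeline
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english")),
        ("clf", MultiOutputClassifier(RandomForestClassifier())),
    ])

    # Train and tune the model over a small grid using GridSearchCV
    param_grid = {"clf__estimator__n_estimators": [50, 100]}
    model = GridSearchCV(pipeline, param_grid=param_grid, cv=3)
    model.fit(X_train, Y_train)

    # Output results on the test set, one category at a time
    Y_pred = model.predict(X_test)
    for i, col in enumerate(Y.columns):
        print(col)
        print(classification_report(Y_test[col], Y_pred[:, i]))

    # Export the final model as a pickle file
    with open(model_filepath, "wb") as f:
        pickle.dump(model, f)


if __name__ == "__main__":
    main(*sys.argv[1:3])
```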
The third component is the web app (sketched below):

- Here I use my knowledge of Flask, HTML, CSS, and JavaScript to build the web app.
- I also add data visualizations to the web app using Plotly.
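A minimal sketch of such an app, assuming the database and model produced by the two pipelines above; the routes, template names (`master.html`, `go.html`), file paths, port, and column layout are assumptions, not necessarily what `run.py` uses:

```python
import json
import pickle

import pandas as pd
from flask import Flask, render_template, request
from plotly.utils import PlotlyJSONEncoder
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the cleaned data and the trained model (paths are illustrative)
engine = create_engine("sqlite:///../data/DisasterResponse.db")
df = pd.read_sql_table("DisasterMessages", engine)
with open("../models/classifier.pkl", "rb") as f:
    model = pickle.load(f)


@app.route("/")
def index():
    # Plotly bar chart of message counts per genre, passed to the
    # template as JSON for rendering with Plotly.js
    genre_counts = df.groupby("genre").count()["message"]
    graphs = [{
        "data": [{"type": "bar",
                  "x": list(genre_counts.index),
                  "y": list(genre_counts.values)}],
        "layout": {"title": "Distribution of Message Genres"},
    }]
    graph_json = json.dumps(graphs, cls=PlotlyJSONEncoder)
    return render_template("master.html", graphJSON=graph_json)


@app.route("/go")
def go():
    # Classify the message typed into the search form and show
    # the predicted label for every category
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    results = dict(zip(df.columns[4:], labels))  # label columns assumed to start at index 4
    return render_template("go.html", query=query, classification_result=results)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)
```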
## Setup Instructions

- Run the following commands in the project's root directory to set up your database and model:
  - To run the ETL pipeline that cleans the data and stores it in the database:
    `python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db`
  - To run the ML pipeline that trains the classifier and saves it:
    `python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl`
- Go to the `app` directory: `cd app`
- Run your web app: `python run.py`
- Click on the web link to visualize the app.
## Results

Here I provide a snapshot of the built web app.
## Acknowledgements

- This project was inspired by the Data Science Nanodegree program at Udacity.
- The data was provided by Appen.