Travel Insurance Dataset

Can I predict if customers will buy travel insurance?

These data were acquired from Kaggle.

Project Description

This notebook uses the Travel Insurance Dataset to run hypothesis tests and build supervised machine learning models (i.e. classification models) in order to predict whether a customer will purchase travel insurance and to better understand customers who do and do not purchase travel insurance.

Goals:

Perform data wrangling and exploratory data analysis (EDA)
Plot the data informatively
Create various statistical and machine learning models to better understand the dataset
Optimize the classification model
Test the optimized classification model

Altogether, these aims should allow us to predict whether someone is likely to purchase travel insurance given our model and to better understand the relationships between the features in these data and purchasing travel insurance.

Conclusions

Large families buy more travel insurance than smaller families, but at a rate that is below chance, indicating a potentially untapped market.
People who buy travel insurance do appear to be slightly older than those who do not.
Wealthier customers are more likely to purchase travel insurance.
Whether someone has ever traveled abroad appears to be the most important factor in determining whether they will buy travel insurance. According to logistic regression modeling, having traveled abroad increases the odds of purchasing travel insurance by over 400%.
Other important positive predictors include frequent-flyer status and chronic disease.
According to logistic regression and linear discriminant analysis, important features that predict TravelInsurance include EverTravelledAbroad, FrequentFlyer, Family Members, and Chronic Diseases.
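
To make the "over 400%" figure concrete, here is a minimal sketch of how a logistic regression coefficient translates into a percent change in odds. The coefficient value used below is hypothetical and is not taken from the notebook; only the conversion formula (odds ratio = exp(coefficient)) is standard.

```python
import math


def odds_change_percent(coef: float) -> float:
    """Percent change in the odds for a one-unit increase in a feature.

    In logistic regression, exp(coef) is the odds ratio, so
    (exp(coef) - 1) * 100 is the percent change in the odds.
    """
    return (math.exp(coef) - 1.0) * 100.0


def coef_for_percent_change(percent: float) -> float:
    """Inverse of odds_change_percent: the coefficient implied by a percent change."""
    return math.log(1.0 + percent / 100.0)


# Hypothetical coefficient for a binary feature such as EverTravelledAbroad.
# This number is an assumption for illustration, NOT a result from the notebook.
example_coef = 1.7

print(f"Odds ratio: {math.exp(example_coef):.2f}")
print(f"Percent change in odds: {odds_change_percent(example_coef):.0f}%")

# A 400% increase in odds corresponds to an odds ratio of 5, i.e. coef = ln(5).
print(f"Coefficient implied by +400%: {coef_for_percent_change(400):.3f}")
```

Note that a 400% increase in the odds means the odds are multiplied by five, which is not the same as a fivefold increase in the probability of purchase.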

A variety of non-linear models outperformed linear models. All models generalized well, as did an ensemble voting model that included both linear and non-linear models.

(see TravelInsurancePrediction.ipynb for all details about model performance)
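
As a rough illustration of what an ensemble voting model combining linear and non-linear learners can look like in scikit-learn, consider the sketch below. The choice of estimators, the soft-voting setting, and the synthetic data from `make_classification` are all assumptions made for this example; the notebook's actual ensemble and hyperparameters may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): replace with the prepared features
# and the TravelInsurance target from the real dataset.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# One linear and one non-linear model (illustrative choices only).
voting_model = VotingClassifier(
    estimators=[
        ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",  # average the predicted class probabilities
)
voting_model.fit(X_train, y_train)

# Comparing train and held-out accuracy gives a quick check on generalization.
train_acc = voting_model.score(X_train, y_train)
test_acc = voting_model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

A small gap between training and held-out accuracy is one simple indication that a model generalizes well.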

Installation Instructions

Clone or download the repository.
Essential items:
- dataset from Kaggle (TravelInsurancePrediction.csv)
- utils module
- Ensure all requirements are installed. Best practice is to create a virtual environment, as described below:
```
$ python3 -m venv travel_ins/
$ source travel_ins/bin/activate
$ pip install -r requirements.txt
```
- Then run the cells in the notebook (TravelInsurancePrediction.ipynb)

Requirements

ipython
matplotlib
numpy
pandas
scipy
seaborn
scikit-learn
statsmodels

(see requirements.txt for version details and more dependencies used in the development environment)
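
Before opening the notebook, it may help to confirm that the packages listed above are importable in the active environment. The snippet below is a small sketch using only the standard library; it is not part of the repository. Note that `ipython` and `scikit-learn` are imported under the names `IPython` and `sklearn`.

```python
import importlib.util

# Import names for the listed requirements (pip name -> import name).
required = {
    "ipython": "IPython",
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "pandas": "pandas",
    "scipy": "scipy",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "statsmodels": "statsmodels",
}

# find_spec returns None when a top-level module cannot be located.
missing = [
    pip_name
    for pip_name, import_name in required.items()
    if importlib.util.find_spec(import_name) is None
]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed requirements are importable.")
```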

License

The MIT License

migueldiazacevedo/travel_insurance_prediction