migueldiazacevedo/stroke_risk_prediction
Stroke Risk: A Machine Learning Approach

The Johns Hopkins Hospital -- Baltimore, MD

Project Aim
Develop and deploy a predictive model that estimates whether a patient is likely to have a stroke (i.e., that identifies patients at high stroke risk).

Technical Objectives:

  • Import data
  • Perform exploratory data analysis
  • Perform statistical inference
  • Visualize the data
  • Develop a variety of machine learning models
  • Assess the quality of these models
  • Gain insights about meaningful features that relate to stroke likelihood
  • Deploy the machine learning model via a Flask application
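The modeling and assessment objectives above can be illustrated with a minimal sketch. It is not the notebook's actual code: the synthetic data, the degree-2 expansion, the `class_weight="balanced"` setting, and the ROC AUC metric are all assumptions chosen for illustration. Only the general idea of a polynomial logistic regression comes from this README.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (assumption): two numeric features, minority positive class.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
# The outcome depends on a squared term, which a degree-2 expansion can capture.
logits = 1.5 * X[:, 0] ** 2 + X[:, 1] - 3.0
y = (rng.random(1000) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Polynomial logistic regression: expand features, scale them, fit a linear classifier.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Predicted probability of the positive class, scored on the held-out split.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"ROC AUC on held-out synthetic data: {auc:.3f}")
```

Wrapping the steps in a single pipeline means the same preprocessing is applied at training and prediction time, which also makes the fitted object convenient to pickle for deployment.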

File Descriptions

data/healthcare-dataset-stroke-data.csv:
The dataset used for training and evaluating the models.

utils
Contains utility scripts used in the notebook.
plots.py: Functions for data visualization.
stats_ML.py: Functions for statistical analysis and machine learning tasks.

stroke_risk.ipynb: Jupyter notebook for data analysis, model development, and evaluation.
requirements.txt: List of required Python packages.

deployment
Contains all files needed to deploy the model as an API endpoint for predictions on new data.
stroke_risk_deployment.py: Script to deploy the trained model using Flask.
deployment_requirements.txt: List of Python packages required to build the Docker image.
model.pkl: The final model chosen for this analysis (Polynomial Logistic Regression).
Dockerfile: Dockerfile to create a Docker image of this model.
test_request.py: Python script for testing the deployed model.
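As a rough illustration of how a Flask prediction endpoint and a test request can fit together, here is a minimal sketch. It does not reproduce stroke_risk_deployment.py or test_request.py: the `/predict` route, the `features` JSON key, the `stroke_risk` response field, the `DummyModel` class, and the feature values are all hypothetical. A real deployment would load the pickled model instead of the stand-in.

```python
from flask import Flask, jsonify, request


class DummyModel:
    """Hypothetical stand-in for the unpickled model; returns a fixed risk."""

    def predict_proba(self, rows):
        return [[0.9, 0.1] for _ in rows]


# Assumption: a real service would unpickle model.pkl here instead.
model = DummyModel()

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Reject requests that are not JSON objects carrying a "features" entry.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict) or "features" not in payload:
        return jsonify(error="expected JSON with a 'features' list"), 400
    # Probability of the positive class for the single submitted row.
    proba = model.predict_proba([payload["features"]])[0][1]
    return jsonify(stroke_risk=float(proba))


# Exercise the endpoint in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.post("/predict", json={"features": [60.0, 1.0, 0.0, 110.5, 27.3]})
result = response.get_json()
print(response.status_code, result)
```

Using the test client keeps the example self-contained. Against a running container, the same JSON body could be sent with the `requests` package listed in the requirements.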

Note: stroke_risk.ipynb is the notebook that contains my model-building work.

Getting Started

Prerequisites
Make sure you have Python 3.10 installed. You can download it from python.org.

Installation: Clone the repository

Create a virtual environment:

python3 -m venv stroke_risk/
source stroke_risk/bin/activate   # On Windows, use `stroke_risk\Scripts\activate`

Install the required packages:

pip install -r requirements.txt

Running the Notebook
To explore the data and run the models, start Jupyter Notebook:

jupyter notebook

Open stroke_risk.ipynb in the browser and run the cells to perform data analysis, model training, and evaluation.

Requirements
This project uses the following packages:
flask~=3.0.3
IPython~=8.22.2
ipykernel~=6.29.3
jupyter_client~=8.6.0
jupyter_core~=5.7.1
jupyter_server~=2.13.0
matplotlib~=3.8.3
notebook~=7.1.1
numpy~=1.26.4
pandas~=2.2.1
python~=3.10.13
qtconsole~=5.5.1
requests~=2.31.0
scipy~=1.12.0
seaborn~=0.13.2
scikit-learn~=1.4.1.post1
xgboost~=2.0.3
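To compare an environment against pins like these, a small standard-library sketch along the following lines may help. The helper names `parse_pins` and `installed_version` are hypothetical. The script only reports pinned and installed versions side by side and does not evaluate the `~=` compatibility rule itself.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name -> pinned version for lines of the form name~=version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that use a different specifier.
        if not line or line.startswith("#") or "~=" not in line:
            continue
        name, version = line.split("~=", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# A few pins copied from the list above; a real check would read requirements.txt.
pins = parse_pins("flask~=3.0.3\nnumpy~=1.26.4\nscikit-learn~=1.4.1.post1\n")
for name, pinned in pins.items():
    print(f"{name}: pinned ~={pinned}, installed {installed_version(name)}")
```

The `python` entry in the list likely refers to the interpreter rather than a pip-installable package, so a check like this may report it as not installed.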

License

MIT

For any questions or issues, please contact [email protected]
