Stroke Risk: A Machine Learning Approach

The Johns Hopkins Hospital -- Baltimore, MD

Project Aim
Develop and deploy a predictive model that estimates whether a patient is likely to have a stroke (i.e., identifies patients at high stroke risk).

Technical Objectives:

Import data
Perform exploratory data analysis
Perform statistical inference
Visualize data
Develop a variety of machine learning models
Assess the quality of these models
Gain insights about meaningful features that relate to stroke likelihood
Deploy the machine learning model via a Flask application
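The modeling and assessment objectives above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn, uses synthetic data in place of the real dataset, and the pipeline steps, polynomial degree, class weighting, and ROC AUC metric are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic, imbalanced stand-in for the stroke data.
# The 5% positive rate is an assumption made for illustration only.
X, y = make_classification(
    n_samples=2000,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    weights=[0.95, 0.05],
    random_state=0,
)

# Stratify so the rare positive class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, expand to degree-2 polynomial terms, then fit logistic regression.
model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Column 1 of predict_proba holds the probability of the positive class.
risk = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, risk)
print(f"ROC AUC on held-out synthetic data: {auc:.3f}")
```

Wrapping the preprocessing and the classifier in one pipeline means the same transformations are applied at training and prediction time, which also makes the fitted object straightforward to pickle for deployment.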

File Descriptions

data/healthcare-dataset-stroke-data.csv:
The dataset used for training and evaluating the models.
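As a rough sketch of the import step, the CSV could be loaded with pandas along these lines. The helper name `load_stroke_data`, the target column name `stroke`, and the sample columns are assumptions made for this example; an in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd


def load_stroke_data(source):
    """Read a CSV (path or file-like object) and split features from the target."""
    df = pd.read_csv(source)
    # Assumption: the label column is called "stroke".
    target = df["stroke"]
    features = df.drop(columns=["stroke"])
    return features, target


# Tiny in-memory sample used in place of data/healthcare-dataset-stroke-data.csv.
sample = io.StringIO(
    "id,age,bmi,stroke\n"
    "1,67,36.6,1\n"
    "2,61,,0\n"
    "3,80,32.5,0\n"
)
X, y = load_stroke_data(sample)

# Quick checks typical of early exploratory analysis.
print(X.isna().sum())                    # missing values per column
print(y.value_counts(normalize=True))    # class balance
```

With the repository checked out, the same helper would be called with the path to the CSV file instead of the in-memory sample.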

utils
Contains utility scripts used in the notebook.
plots.py: Functions for data visualization.
stats_ML.py: Functions for statistical analysis and machine learning tasks.

stroke_risk.ipynb: Jupyter notebook for data analysis, model development, and evaluation.
requirements.txt: List of required Python packages.

deployment
Contains all files needed to deploy the model as an API endpoint for predictions on new data.
stroke_risk_deployment.py: Script to deploy the trained model using Flask.
deployment_requirements.txt: List of required Python packages for building the Docker image.
model.pkl: The final model chosen for this analysis (Polynomial Logistic Regression).
Dockerfile: Dockerfile to create a Docker image of this model.
test_request.py: Python script for testing the deployed model.
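To illustrate how a Flask prediction endpoint of this kind might look, here is a minimal sketch. It is not the contents of stroke_risk_deployment.py: the `/predict` route, the `features` JSON key, the `create_app` factory, the example feature values, and the `DummyModel` stand-in are all hypothetical names chosen for the example.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app that serves predictions from a fitted model."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True)
        if not payload or "features" not in payload:
            return jsonify(error="expected JSON with a 'features' list"), 400
        # predict_proba returns one row per sample; index 1 is the positive class.
        proba = model.predict_proba([payload["features"]])[0][1]
        return jsonify(stroke_probability=float(proba))

    return app


class DummyModel:
    """Hypothetical stand-in so the sketch runs without model.pkl."""

    def predict_proba(self, rows):
        return [[0.9, 0.1] for _ in rows]


app = create_app(DummyModel())

# Exercise the endpoint in-process with Flask's built-in test client.
client = app.test_client()
response = client.post("/predict", json={"features": [67, 1, 0, 228.69, 36.6]})
print(response.status_code, response.get_json())
```

In a real deployment, the dummy model would presumably be replaced by the object unpickled from model.pkl, and the app would be started inside the Docker container rather than called through the test client.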

Note: stroke_risk.ipynb is the notebook that contains my model-building work.

Getting Started

Prerequisites
Make sure you have Python 3.10 installed. You can download it from python.org.

Installation: Clone the repository

Create a virtual environment:

python3 -m venv stroke_risk/
source stroke_risk/bin/activate   # On Windows, use `stroke_risk\Scripts\activate`

Install the required packages:

pip install -r requirements.txt

Running the Notebook
To explore the data and run the models, start Jupyter Notebook:

jupyter notebook

Open stroke_risk.ipynb in the browser and run the cells to perform data analysis, model training, and evaluation.

Requirements
This project uses the following packages:
flask~=3.0.3
IPython~=8.22.2
ipykernel~=6.29.3
jupyter_client~=8.6.0
jupyter_core~=5.7.1
jupyter_server~=2.13.0
matplotlib~=3.8.3
notebook~=7.1.1
numpy~=1.26.4
pandas~=2.2.1
python~=3.10.13
qtconsole~=5.5.1
requests~=2.31.0
scipy~=1.12.0
seaborn~=0.13.2
scikit-learn~=1.4.1.post1
xgboost~=2.0.3
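One way to see how an environment compares with the pins above is a small standard-library check like the one below. It is a sketch only: the helper names `parse_pins` and `installed_versions` are made up for this example, and it reports versions side by side rather than evaluating the `~=` compatible-release rule. Entries that are not installed as pip distributions may show up as `None`.

```python
from importlib import metadata


def parse_pins(text):
    """Map package name -> pinned version from lines like 'flask~=3.0.3'."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that do not use the ~= operator.
        if not line or line.startswith("#") or "~=" not in line:
            continue
        name, version = line.split("~=", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_versions(pins):
    """Return name -> installed version, or None when the package is absent."""
    found = {}
    for name in pins:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


pins = parse_pins("flask~=3.0.3\nnumpy~=1.26.4\n")
for name, installed in installed_versions(pins).items():
    print(f"{name}: pinned ~={pins[name]}, installed {installed}")
```

Passing the text of requirements.txt to `parse_pins` would cover the full list.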

License

MIT

For any questions or issues, please contact [email protected]

Repository: migueldiazacevedo/stroke_risk_prediction
