This repository consists of machine learning experiments in the form of a notebook. It explores using various Python machine learning and data science libraries to build a model capable of predicting whether or not someone has heart disease based on their medical attributes.
Install Miniconda, for detailed setup check
- Pandas
- NumPy
- Matplotlib
- Seaborn (for heatmaps)
- Scikit-Learn
- Jupyter Notebook
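
To confirm the environment is ready, a quick check along these lines may help. It is only a sketch: the helper name `missing_packages` is hypothetical, and the import names (`sklearn` for Scikit-Learn, `notebook` for Jupyter Notebook) are assumed to be the modules these packages are normally imported as.

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed above.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn", "notebook"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any name reported as missing can then be installed into the active Conda environment before opening the notebook.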
- Problem Definition
- Data Exploration
- Evaluation
- Features
- Modelling
- Experimentation
In a statement,
Given clinical parameters about a patient, can we predict whether or not they have heart disease?
The original data came from the Cleveland database in the UCI Machine Learning Repository.
Download it from the UCI Heart Disease Data Set or Kaggle.
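
Once downloaded, the data could be loaded with Pandas roughly as follows. This is a minimal sketch, not the notebook's own code: the helper `load_heart_data` is hypothetical, and the in-memory CSV holds made-up rows that only stand in for the real file so the example runs on its own.

```python
import io

import pandas as pd


def load_heart_data(source):
    """Read the CSV and check that the label column is present."""
    df = pd.read_csv(source)
    if "target" not in df.columns:
        raise ValueError("expected a 'target' column")
    return df


# Made-up rows for illustration only; they are NOT taken from the dataset.
sample_csv = io.StringIO("age,sex,chol,target\n50,1,200,0\n60,0,250,1\n")
df = load_heart_data(sample_csv)

print(df.shape)
print(df["target"].value_counts())
```

With the real file, pass its path instead of `sample_csv`; the file name depends on where you saved the download.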
- age: age in years
- sex: sex (1 = male; 0 = female)
- cp: chest pain type
  - Value 0: typical angina
  - Value 1: atypical angina
  - Value 2: non-anginal pain
  - Value 3: asymptomatic
- trestbps: resting blood pressure (in mm Hg on admission to the hospital)
- chol: serum cholesterol in mg/dl
- fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
- restecg: resting electrocardiographic results
  - Value 0: normal
  - Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
  - Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
- thalach: maximum heart rate achieved
- exang: exercise induced angina (1 = yes; 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: the slope of the peak exercise ST segment
  - Value 0: upsloping
  - Value 1: flat
  - Value 2: downsloping
- ca: number of major vessels (0-3) colored by fluoroscopy
- thal: 0 = normal; 1 = fixed defect; 2 = reversible defect
- target: 0 = no disease, 1 = disease
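
The integer codes above can be turned into readable labels during exploration. The sketch below assumes the column names from the data dictionary; the helper `decode_codes` and the two example rows are hypothetical and made up for illustration, not taken from the dataset.

```python
import pandas as pd

# Code-to-label mappings copied from the data dictionary above.
SEX_LABELS = {0: "female", 1: "male"}
CP_LABELS = {
    0: "typical angina",
    1: "atypical angina",
    2: "non-anginal pain",
    3: "asymptomatic",
}
SLOPE_LABELS = {0: "upsloping", 1: "flat", 2: "downsloping"}


def decode_codes(df):
    """Return a copy of df with sex, cp and slope codes replaced by labels."""
    out = df.copy()
    out["sex"] = out["sex"].map(SEX_LABELS)
    out["cp"] = out["cp"].map(CP_LABELS)
    out["slope"] = out["slope"].map(SLOPE_LABELS)
    return out


# Made-up rows for illustration only.
example = pd.DataFrame({"sex": [1, 0], "cp": [0, 3], "slope": [2, 1]})
decoded = decode_codes(example)
print(decoded)
```

Decoded labels are handy for plots and tables; the models themselves still need the numeric codes.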
- Logistic Regression
- Random Forest Classifier
- K-Nearest Neighbours
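
A baseline comparison of these three models might look like the sketch below. It is not the notebook's actual code: it uses synthetic data from `make_classification` as a stand-in for the heart disease table (13 features, matching the attributes listed above without `target`), and the split ratio, `random_state`, and `max_iter` values are assumptions chosen only to make the example reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data: 13 features and a binary label.
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "K-Nearest Neighbours": KNeighborsClassifier(),
}

# Fit each model on the training split and record test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Accuracy on synthetic data says nothing about performance on the real dataset; the point is only the shape of the fit-and-score loop.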