https://www.kaggle.com/datasets/fedesoriano/heart-failure-prediction
- Summary statistics of numerical features.
- Missing value detection.
- Duplicate detection.
- Target class balance check.
- Outlier detection and handling.
- Distribution of numerical and categorical features.
- Histograms of features by class.
A) Numerical features:
B) Categorical features:
- Pairplot of features.
- Correlation between features.
- Encoding (quantization) of categorical features.
- Splitting the dataset into training, validation, and test sets.
- Feature scaling step.
- Performing backward sequential feature selection (SFS).
- GridSearchCV to find the best hyperparameters.
- Validating models.
- Choosing the best number of principal components (PCA).
- Data transformation.
- GridSearchCV to find the best hyperparameters.
- Validating models.
- Training and testing chosen models.
- Plotting testing results.
- Visualization of predictions.
A) Random forest (RF) predictions using PCA:
B) RF predictions using SFS:
- Confusion matrix.
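
As a minimal sketch of the final step, a binary confusion matrix can be computed directly from two label lists. The function name `confusion_matrix_binary` and the toy labels are hypothetical and are not taken from the dataset or the notebook. The label coding (1 = heart disease, 0 = normal) is an assumption, and the layout follows the common convention of rows for true labels and columns for predicted labels.

```python
from collections import Counter


def confusion_matrix_binary(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels (rows: true, columns: predicted)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count how often each (true, predicted) pair occurs.
    counts = Counter(zip(y_true, y_pred))
    return [
        [counts[(0, 0)], counts[(0, 1)]],
        [counts[(1, 0)], counts[(1, 1)]],
    ]


# Toy labels, made up for illustration (assumed coding: 1 = heart disease, 0 = normal).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix_binary(y_true, y_pred)
(tn, fp), (fn, tp) = matrix

# Metrics derived from the four cells.
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(matrix)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```

In a medical screening setting, the false-negative cell (a true 1 predicted as 0) is usually the most important one to inspect, which is why recall is reported alongside accuracy here.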