- This project addresses a three-class classification problem: predicting fetal health as (i) Normal, (ii) Suspect, or (iii) Pathological.
- The data is pre-processed to remove outliers, and the models are evaluated with cross-validation. Models used include Decision Tree, Random Forest, XGBoost and Naive Bayes.
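
The outlier rule and fold count are not stated here. As a rough sketch only, assuming the common 1.5 × IQR rule and 5 folds (both assumptions), the two steps could look like this in base R. The helper names `remove_iqr_outliers` and `make_folds` and the toy data are hypothetical and are not taken from the project code:

```r
# Hypothetical helper: drop rows where any listed numeric column falls outside
# [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]. The 1.5 * IQR rule is an assumption.
remove_iqr_outliers <- function(df, cols) {
  keep <- rep(TRUE, nrow(df))
  for (col in cols) {
    q <- quantile(df[[col]], probs = c(0.25, 0.75), na.rm = TRUE)
    spread <- q[2] - q[1]
    lower <- q[1] - 1.5 * spread
    upper <- q[2] + 1.5 * spread
    keep <- keep & df[[col]] >= lower & df[[col]] <= upper
  }
  df[keep, , drop = FALSE]
}

# Hypothetical helper: assign each of n rows to one of k folds at random.
make_folds <- function(n, k = 5) {
  sample(rep(seq_len(k), length.out = n))
}

# Toy data standing in for the real CSV (values are made up).
set.seed(42)
toy <- data.frame(
  baseline_value = c(rnorm(99, mean = 133, sd = 10), 400),  # 400 is an obvious outlier
  fetal_health   = factor(sample(1:3, 100, replace = TRUE))
)

clean <- remove_iqr_outliers(toy, "baseline_value")
folds <- make_folds(nrow(clean), k = 5)

# Each fold serves once as the held-out set.
for (i in 1:5) {
  train_set <- clean[folds != i, ]
  test_set  <- clean[folds == i, ]
  # Fit a model on train_set and score it on test_set here.
}
```

Averaging the per-fold scores gives a less optimistic accuracy estimate than a single train/test split.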
Reproduction Child-healthcare Classification
- This dataset has over 2100 records
- Fetal_health is the target variable
- It has 22 input variables, including:
  - Baseline value: Baseline Fetal Heart Rate (FHR)
  - Accelerations: Number of accelerations per second
  - Fetal_movement: Number of fetal movements per second
  - Uterine_contractions: Number of uterine contractions per second
  - Light_decelerations: Number of LDs per second
  - Severe_decelerations: Number of SDs per second
  - Prolongued_decelerations: Number of PDs per second
  - Abnormal_short_term_variability: Percentage of time with abnormal short term variability
  - Mean_value_of_short_term_variability: Mean value of short term variability
  - Percentage_of_time_with_abnormal_long_term_variability: Percentage of time with abnormal long term variability
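
Because the target is categorical, it usually needs to be a factor before it is passed to a classifier in R. The sketch below assumes the CSV codes the three classes as 1, 2 and 3 in the order Normal, Suspect, Pathological; that coding is an assumption and should be checked against the actual file. The toy data frame is made up for illustration:

```r
# Toy stand-in for the target column; the 1/2/3 coding is an assumption.
toy <- data.frame(fetal_health = c(1, 1, 2, 3, 1, 2))

# Convert the numeric codes into a labelled factor.
toy$fetal_health <- factor(
  toy$fetal_health,
  levels = c(1, 2, 3),
  labels = c("Normal", "Suspect", "Pathological")
)

# Class counts, useful for spotting class imbalance.
class_counts <- table(toy$fetal_health)
print(class_counts)
```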
The following algorithms were implemented, listed with their accuracy:
- Naïve Bayes (Accuracy: 0.81)
- Decision Tree (Accuracy: 0.82)
- XGBoost (Accuracy: 0.84)
- Random Forest (Accuracy: 0.87)
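
For reference, accuracy is the share of predictions that match the true class. The sketch below shows one way to compute it from a confusion matrix in base R. The label vectors are made up for illustration and are not results from this project:

```r
classes <- c("Normal", "Suspect", "Pathological")

# Made-up true labels and model predictions (illustration only).
actual <- factor(
  c("Normal", "Normal", "Suspect", "Pathological", "Normal", "Suspect"),
  levels = classes
)
predicted <- factor(
  c("Normal", "Suspect", "Suspect", "Pathological", "Normal", "Normal"),
  levels = classes
)

# Rows are predictions, columns are true classes.
conf_mat <- table(Predicted = predicted, Actual = actual)

# Correct predictions sit on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

The caret package listed in the install steps also provides `confusionMatrix()`, which reports accuracy together with per-class statistics.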
Install R and RStudio by following their official documentation.
- Install both together and ensure version compatibility. If either is already installed, upgrade it to the latest version.
install.packages("ggplot2")
install.packages("ROSE")
install.packages("rpart")
install.packages("rpart.plot")
install.packages("randomForest")
install.packages("e1071")
install.packages("xgboost")
install.packages("caret")
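
As an optional alternative, the sketch below checks which of these packages are already installed so that only the missing ones need to be downloaded. The helper name `missing_packages` is hypothetical; the install call is left commented out so the snippet has no side effects until you enable it:

```r
# Packages named in the install steps of this README.
required <- c("ggplot2", "ROSE", "rpart", "rpart.plot",
              "randomForest", "e1071", "xgboost", "caret")

# Hypothetical helper: return the packages that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install)
```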
- Download the code file and make sure the corresponding CSV file is in the same directory.
- Set the working directory to that folder with the setwd command.
- Install the required packages, then run the code.
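
The steps above might look like the following sketch. The directory and the file name `fetal_health.csv` are placeholders, since the CSV name is not given here; replace both with your own values:

```r
# Placeholder values; replace with your project folder and CSV file name.
project_dir <- "."
csv_file <- "fetal_health.csv"

# Point R at the folder that holds the script and the CSV.
setwd(project_dir)

if (file.exists(csv_file)) {
  fetal <- read.csv(csv_file)
  print(dim(fetal))   # number of rows and columns
  str(fetal)          # column names and types
} else {
  message("Could not find ", csv_file, " in ", getwd())
}
```

Checking for the file before reading it gives a clearer message than the error `read.csv` raises when the working directory is wrong.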