This project compares the performance of different ML algorithms on classification and regression tasks using the scikit-learn framework. Classification performance is evaluated by the area under the ROC and PR curves, and regression performance by MSE and R2 scores.
Classification datasets:
- Diabetic Retinopathy
- Default of credit card clients
- Breast Cancer Wisconsin
- Statlog (Australian credit approval)
- Statlog (German credit data)
- Steel Plates Faults
- Adult
- Yeast
- Thoracic Surgery Data
- Seismic-Bumps
Classification algorithms:
- k-nearest neighbours classification
- Support vector classification
- Decision tree classification
- Random forest classification
- AdaBoost classification
- Logistic regression (for classification)
- Gaussian naive Bayes classification
- Neural network classification
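As an illustration of the classification evaluation described above, here is a minimal sketch that scores three of the listed classifiers by ROC AUC and by average precision. It is not the project's own code: the dataset loader (scikit-learn's built-in copy of the Breast Cancer Wisconsin (Diagnostic) data), the train/test split, and all hyperparameters are assumptions chosen for brevity. Note that `average_precision_score` is a step-wise summary of the PR curve and can differ slightly from a trapezoidal area.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: use scikit-learn's bundled Breast Cancer Wisconsin (Diagnostic) data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Assumption: default-ish hyperparameters, picked only for illustration.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gaussian naive Bayes": GaussianNB(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class drives both curve-based metrics.
    proba = model.predict_proba(X_test)[:, 1]
    scores[name] = {
        "roc_auc": roc_auc_score(y_test, proba),
        "pr_auc": average_precision_score(y_test, proba),
    }
    print(f"{name}: ROC AUC={scores[name]['roc_auc']:.3f}, "
          f"PR AUC={scores[name]['pr_auc']:.3f}")
```

Both metrics use predicted probabilities rather than hard labels, so they compare how well each model ranks the test samples, independent of any single decision threshold.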
Regression datasets:
- Wine Quality
- Communities and Crime
- QSAR aquatic toxicity
- Parkinson Speech
- Facebook metrics
- Bike Sharing
- Student Performance
- Concrete Compressive Strength
- SGEMM GPU kernel performance
- Merck Molecular Activity Challenge (from Kaggle)
Regression algorithms:
- Support vector regression
- Decision tree regression
- Random forest regression
- AdaBoost regression
- Gaussian process regression
- Linear regression
- Neural network regression
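The regression evaluation can be sketched in the same way. The example below is only an illustration under stated assumptions: it uses synthetic data from `make_regression` rather than any of the datasets listed above, and the split and hyperparameters are arbitrary choices, not the project's settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic linear data with Gaussian noise stands in for a real dataset.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Assumption: illustrative hyperparameters only.
reg_models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

reg_scores = {}
for name, model in reg_models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    reg_scores[name] = {
        "mse": mean_squared_error(y_test, pred),  # lower is better
        "r2": r2_score(y_test, pred),             # 1.0 is a perfect fit
    }
    print(f"{name}: MSE={reg_scores[name]['mse']:.2f}, "
          f"R2={reg_scores[name]['r2']:.3f}")
```

MSE depends on the scale of the target, so it is mainly useful for comparing models on the same dataset, while R2 is scale-free and easier to compare across datasets.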
Setup:
- Install Anaconda
- Create a conda env that contains Python 3.7.5:
conda create -n your_env_name python=3.7.5
- Activate the environment (do this every time you open a new terminal):
conda activate your_env_name
- Install the requirements into this conda env:
pip install --requirement requirements.txt
- Run the jupyter notebook:
jupyter notebook
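Before opening the notebooks, it can help to confirm that the activated environment has the expected interpreter and packages. The following is a small sketch using only the standard library. The package names in `REQUIRED` are an assumption, since the contents of requirements.txt are not shown here, and `missing_packages` is a hypothetical helper, not part of the project.

```python
import importlib.util
import sys

# Assumption: these import names are guesses; adjust to match requirements.txt.
REQUIRED = ("numpy", "sklearn", "notebook")


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python", sys.version.split()[0])
missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Run it with the conda env activated. If the reported Python version is not 3.7.5, or packages are reported missing, the environment was probably not activated in the current terminal.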