Stacking, or stacked generalization, is an ensemble learning technique in which multiple base models, or "weak learners," are trained and their predictions are combined by a metamodel to improve predictive power. The ISOVIS research group at LNU has created StackGenVis, a visual analytics system that helps users optimize performance metrics, manage input data (including feature selection), and choose top-performing algorithms. The current version of StackGenVis uses a single Linear Regression metamodel. This work investigates how alternative metamodels affect the predictive performance of StackGenVis, using the provided data and charts for comparison.
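As a rough illustration of the comparison described above, the sketch below swaps the metamodel of a stacked ensemble and compares cross-validated scores. It uses scikit-learn's `StackingRegressor`, not StackGenVis itself; the synthetic dataset, the three base models, and the Ridge alternative are all assumptions chosen for brevity.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for the real dataset (assumption).
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

# Illustrative base models; not the algorithms used by StackGenVis.
base_models = [
    ("knn", KNeighborsRegressor(n_neighbors=5)),
    ("tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ("ridge", Ridge(alpha=1.0)),
]

# Candidate metamodels: the linear baseline plus one hypothetical alternative.
metamodels = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=10.0),
}


def compare_metamodels(X, y, cv=5):
    """Return the mean cross-validated R^2 for each candidate metamodel."""
    scores = {}
    for name, meta in metamodels.items():
        # The metamodel is fitted on out-of-fold predictions of the base models.
        stack = StackingRegressor(estimators=base_models, final_estimator=meta, cv=5)
        scores[name] = cross_val_score(stack, X, y, cv=cv, scoring="r2").mean()
    return scores


scores = compare_metamodels(X, y)
for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

Only `final_estimator` changes between runs, so any difference in the scores can be attributed to the metamodel rather than to the base models or the data split.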
Script used and tested in the Numer.ai ML competition; see additional details at https://docs.numer.ai/tournament/learn
The notebook includes the following steps:
- Feature selection using univariate selection and by fitting a model with each feature importance as a threshold
- Model testing and parameter tuning for Lasso Regression, Ridge Regression, XGBoost and ElasticNet models
- Model ensembling using stacked generalization
- Analysis performed to discover the correlation (if any) between the USD/RUB exchange rate and the Brent oil price.
- Intended as a home project for a DS course. The script includes exploratory analysis steps, data wrangling steps, and a check of simple machine learning models.
- MNIST k-NN Classification using numpy
- k-NN Classification using scikit-learn
- k-NN Regression using numpy
- Multivariate regression using gradient descent
- Polynomial regression
- Multivariate Logistic Regression
- Nonlinear logistic regression
- Regularization techniques
- Support Vector Machines with Gaussian and ANOVA (own implementation) kernels
- Support Vector Machines - One versus all MNIST
- An effect of ensembling
- Simple Neural Network - Fashion MNIST
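
The k-NN classification item in the list above can be sketched with plain numpy as follows. This is a minimal illustration, not the notebook's code: the function name `knn_predict` and the tiny two-cluster dataset are hypothetical, and MNIST is replaced by hand-made points to keep the example self-contained.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k=3):
    """Predict labels for X_test by majority vote among the k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    diffs = X_test[:, None, :] - X_train[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Indices of the k closest training points for each test point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbor_labels = y_train[nearest]
    # Majority vote; ties resolve to the smallest label because argmax
    # returns the first maximum of the bincount.
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])


# Tiny hand-made dataset (assumption): two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.05, 0.05], [5.0, 5.1]])

predictions = knn_predict(X_train, y_train, X_test, k=3)
print(predictions)  # [0 1]
```

Broadcasting computes all distances at once, which is simple but uses memory proportional to n_test × n_train × n_features; for a dataset the size of MNIST the test set would typically be processed in batches.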