With over 45,000 participating teams and individuals, the Housing Prices competition is one of the most popular machine learning competitions on Kaggle. The goal of the competition is to predict the sale price of residential homes in Ames, Iowa, using 79 explanatory variables.
- Imputed missing data using simple, KNN, and MICE imputers
- Created a variety of new binary features (e.g. “HasGarage” and “HasBasement”), features representing the number of years since the house was last remodeled, and features indicating the proximity to the train station
- Explored the effect of features on the sale price using scatter and bar plots
- Compared the effect on RMSE of using log-transformed versus untransformed sale price
- Optimized a Ridge regression model using GridSearchCV, obtaining a top 9% score on the public Kaggle leaderboard
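
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, runs on a small synthetic table rather than the real competition data, and the alpha grid, the median imputation strategy, and the "YearsSinceRemodel" feature name are illustrative choices. The column names "GrLivArea", "GarageArea", "YearRemodAdd", and "YrSold" follow the Ames dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Ames data (NOT the real competition data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "GrLivArea": rng.normal(1500, 400, n),
    "GarageArea": rng.choice([0.0, 300.0, 500.0], n),
    "YearRemodAdd": rng.integers(1950, 2006, n).astype(float),
    "YrSold": rng.integers(2006, 2011, n).astype(float),
})
# Sale price generated so that it is roughly linear on the log scale.
y = np.exp(11 + 0.0004 * X["GrLivArea"] + 0.0003 * X["GarageArea"]
           + rng.normal(0, 0.1, n))

# Blank out some values so the imputer has something to fill in.
X.loc[rng.choice(n, 20, replace=False), "GrLivArea"] = np.nan

# Engineered features: a binary flag and years since the last remodel.
X["HasGarage"] = (X["GarageArea"] > 0).astype(int)
X["YearsSinceRemodel"] = X["YrSold"] - X["YearRemodAdd"]

# Impute, scale, then fit Ridge regression.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])

# Train on log1p(price); predictions are mapped back with expm1.
model = TransformedTargetRegressor(
    regressor=pipeline, func=np.log1p, inverse_func=np.expm1
)

# Tune the regularization strength with 5-fold cross-validation.
search = GridSearchCV(
    model,
    param_grid={"regressor__ridge__alpha": [0.1, 1.0, 10.0, 100.0]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)

best_alpha = search.best_params_["regressor__ridge__alpha"]
cv_rmse = -search.best_score_  # RMSE in original price units
print(f"best alpha: {best_alpha}, CV RMSE: {cv_rmse:.0f}")
```

To compare against the untransformed target, the same grid search can be run on `pipeline` directly with the parameter name `ridge__alpha`. Keeping the imputer inside the pipeline means it is refit on each training fold, which avoids leaking validation data into the imputation; `KNNImputer` or scikit-learn's MICE-inspired `IterativeImputer` could be swapped in at the same step.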
Click here to go to the Kaggle competition page.