The Pune House Price Prediction project is written in R and uses a house-pricing dataset for the city of Pune obtained from Kaggle. The project covers a data science pipeline from data preprocessing and feature extraction to model training and deployment.
Of the machine learning algorithms applied, multiple linear regression achieved the highest accuracy: 81% over 10-fold cross-validation.
The system considers the following parameters: house location (chosen from 30+ locations in Pune), total area in square feet, BHK size (1, 2, or more), number of bathrooms, and number of bedrooms.
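As a rough illustration of how these parameters can feed a multiple linear regression in R, here is a minimal sketch. It does not reproduce the project's code: the data is synthetic, the column names (location, total_sqft, bhk, bath, price), the three example localities, and the price rule are all assumptions made for the example, and the bedroom count is left out for brevity.

```r
# Hypothetical sketch: synthetic data standing in for the Kaggle Pune dataset.
set.seed(42)
n <- 200
houses <- data.frame(
  location   = factor(sample(c("Baner", "Kothrud", "Wakad"), n, replace = TRUE)),
  total_sqft = round(runif(n, 400, 3000)),
  bhk        = sample(1:4, n, replace = TRUE),
  bath       = sample(1:4, n, replace = TRUE)
)

# Made-up price rule (in lakhs) plus noise, purely for illustration.
houses$price <- 10 + 0.05 * houses$total_sqft + 8 * houses$bhk +
  3 * houses$bath + rnorm(n, sd = 10)

# Multiple linear regression on the input parameters.
fit <- lm(price ~ location + total_sqft + bhk + bath, data = houses)

# Predict the price of one new house described by the same parameters.
new_house <- data.frame(
  location   = factor("Baner", levels = levels(houses$location)),
  total_sqft = 1200,
  bhk        = 2,
  bath       = 2
)
predicted_price <- predict(fit, newdata = new_house)
print(predicted_price)
```

Encoding location as a factor lets lm() expand it into one indicator variable per locality, which is one common way to handle a categorical input with 30+ levels.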
The code is written in R, which is well suited to data analysis and exploratory data analysis. The system also uses visualization libraries such as Plotly, ggplot, and matplotlib to produce a variety of plots, including scatter plots and bar plots.
The following algorithms were applied:
Multiple Linear Regression (81% accuracy)
Support Vector Machine (SVM) (78% accuracy)
Random Forest (RF) (75% accuracy)
Decision Tree (DT) (72% accuracy)
After that, k-fold cross-validation was performed on the linear regression and random forest models, and linear regression proved to be the most accurate, with an average accuracy of 78%.
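The k-fold procedure can be sketched in base R as follows. This is an illustrative example under stated assumptions, not the project's actual evaluation code: the data is synthetic, the column names are hypothetical, and "accuracy" is assumed here to mean the R-squared score on each held-out fold, since the original metric is not specified.

```r
# Hypothetical sketch: synthetic data standing in for the real dataset.
set.seed(1)
n <- 300
d <- data.frame(
  total_sqft = runif(n, 400, 3000),
  bhk        = sample(1:4, n, replace = TRUE)
)
d$price <- 5 + 0.05 * d$total_sqft + 8 * d$bhk + rnorm(n, sd = 12)

# Assign every row to one of k folds at random.
k <- 10
fold_id <- sample(rep(1:k, length.out = n))

# For each fold: train on the other k - 1 folds, score on the held-out fold.
r2_per_fold <- sapply(1:k, function(i) {
  train <- d[fold_id != i, ]
  test  <- d[fold_id == i, ]
  fit   <- lm(price ~ total_sqft + bhk, data = train)
  pred  <- predict(fit, newdata = test)
  # R-squared on the held-out fold (assumed stand-in for "accuracy").
  1 - sum((test$price - pred)^2) / sum((test$price - mean(test$price))^2)
})

# The cross-validated score is the average over the k folds.
mean_r2 <- mean(r2_per_fold)
print(round(r2_per_fold, 3))
print(mean_r2)
```

Averaging over folds gives a less optimistic estimate than a single train/test split, because every row is used for testing exactly once.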
The frontend interface is built with the Shiny package for R.
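A minimal sketch of what such a Shiny frontend might look like is shown below. The input IDs, labels, example localities, and the estimate_price() helper are all hypothetical; in particular, estimate_price() is a placeholder with made-up coefficients, where a real app would presumably call predict() on the trained model.

```r
library(shiny)

# Input form for the house parameters (IDs and choices are hypothetical).
ui <- fluidPage(
  titlePanel("Pune House Price Prediction"),
  sidebarLayout(
    sidebarPanel(
      selectInput("location", "Location",
                  choices = c("Baner", "Kothrud", "Wakad")),
      numericInput("sqft", "Total area (sq. ft.)", value = 1000, min = 100),
      numericInput("bhk", "BHK", value = 2, min = 1),
      numericInput("bath", "Bathrooms", value = 2, min = 1)
    ),
    mainPanel(textOutput("price"))
  )
)

# Placeholder pricing rule with made-up coefficients; it ignores location.
# A real app would replace this with predict() on the trained model.
estimate_price <- function(location, sqft, bhk, bath) {
  10 + 0.05 * sqft + 8 * bhk + 3 * bath
}

# Recompute the estimate whenever any input changes.
server <- function(input, output, session) {
  output$price <- renderText({
    est <- estimate_price(input$location, input$sqft, input$bhk, input$bath)
    paste("Estimated price:", round(est, 1), "lakhs")
  })
}

# Build the app object; launch it interactively with runApp(app).
app <- shinyApp(ui, server)
```

Keeping the pricing logic in a plain function, separate from the server, makes it possible to test the prediction step without starting the web interface.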