This project analyses and predicts purchases made on Black Friday using XGBoost, TfidfTransformer and ExtraTreesRegressor. The idea and the dataset are taken from Analytics Vidhya, where the problem is part of a hackathon.
Since the true output for the test dataset is not provided, the project is divided into four cases, with the same logic applied to different splits of the data.
In the first case (case 1), the input dataset is split 90%/10%: 90% is used for training and the remaining 10% for testing.
In the second case (case 2), the input dataset is split 70%/30%: 70% is used for training and 30% for testing.
In the third case (case 3), the input dataset is divided equally, i.e. 50% for training and 50% for testing.
In the fourth case (case 4), the training, testing and sample datasets are taken from Analytics Vidhya and used without any changes. Here the sample dataset has a default ‘purchase’ value of 9000 for all users.
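The splits in cases 1 to 3 can be sketched roughly as below. This is only an illustration using the standard library, not the code from the notebooks (which may well use a library helper such as scikit-learn's train_test_split instead). The names `split_rows` and `CASE_TEST_FRACTIONS`, the `seed` parameter and the placeholder rows are assumptions made for this example.

```python
import random

# Test fractions for cases 1-3 as described above.
# Case 4 uses the provided train/test files as-is, so it needs no split.
CASE_TEST_FRACTIONS = {"case 1": 0.10, "case 2": 0.30, "case 3": 0.50}


def split_rows(rows, test_fraction, seed=0):
    """Shuffle rows reproducibly and return (train_rows, test_rows).

    Hypothetical helper for illustration only.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    rows = list(range(1000))  # placeholder for the rows of the input dataset
    for name, fraction in CASE_TEST_FRACTIONS.items():
        train, test = split_rows(rows, fraction)
        print(name, "train rows:", len(train), "test rows:", len(test))
```

Fixing the seed keeps the three cases reproducible, so differences in the scores come from the split size rather than from a different shuffle on each run.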
Language used: Python
Framework used: Anaconda (Jupyter Notebook)
I hope you get something out of it. All suggestions and queries are always welcome :)