Cascade Cup '22, organized by the Consulting and Analytics Club, IIT Guwahati, consisted of 3 rounds: ML Quiz, ML Hackathon, and Data Analysis Report. This repository contains our submissions: the notebook we submitted for the ML Hackathon round and the report we submitted for the final Data Analysis Report round.
https://www.kaggle.com/c/cascade-cup-22/overview
In this repo: https://github.com/yash-shimpi/Cascade-Cup-Neural-Demons/blob/main/notebook.ipynb
The dataset was imbalanced: 450,000 rows of delivery orders, with a 97:3 ratio of non-cancelled to cancelled orders. The most significant improvement came from class weighting, which penalises the model more heavily when it mispredicts the minority (cancelled) class. Combined with effective EDA and some data manipulation, this led to AUC-ROC scores of 0.81 and 0.83 on the public and private leaderboards respectively.
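As a rough illustration of how class weighting shifts the penalty toward the minority class, here is a minimal sketch in plain Python. It is not the code from the notebook: the function names are hypothetical, the labels are a toy sample that only mirrors the 97:3 ratio, and the weighting formula is the common "balanced" heuristic (the same one scikit-learn uses for `class_weight="balanced"`), which we assume here for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Binary cross-entropy where each row's loss is scaled by its class weight."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        row_loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * row_loss
        weight_sum += weights[y]
    return total / weight_sum


# Toy labels mirroring the 97:3 ratio (0 = not cancelled, 1 = cancelled).
labels = [0] * 97 + [1] * 3
weights = balanced_class_weights(labels)

# A model that always predicts "not cancelled" with high confidence.
always_negative = [0.01] * len(labels)
unweighted = weighted_log_loss(labels, always_negative, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, always_negative, weights)
```

With these weights a mistake on a cancelled order costs roughly 32 times as much as a mistake on a non-cancelled one, so a model that ignores the minority class scores much worse under the weighted loss than under the unweighted one.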
In this repo: https://github.com/yash-shimpi/Cascade-Cup-Neural-Demons/blob/main/Team%20Neural%20Demons.pdf
The problem statement asked us to prepare a report analyzing the data we had been given. Using various Matplotlib plots, we explored the relations between cancellation and the other factors in the data. The report is divided into 3 sub-reports: rider analysis, order analysis, and rider-order analysis.
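The kind of exploration described above can be sketched as grouping orders by one factor and plotting the cancellation rate per group. This is only an illustrative sketch: the column names (`hour`, `cancelled`), the helper names, and the four rows are made up for the example and are not taken from the competition data or the report.

```python
from collections import defaultdict


def cancellation_rate_by(rows, key):
    """Return {group value: fraction of cancelled orders} for the given column."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        cancelled[row[key]] += row["cancelled"]
    return {group: cancelled[group] / totals[group] for group in totals}


def plot_rates(rates, xlabel):
    """Draw a bar chart of cancellation rate per group (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the rest runs without it

    fig, ax = plt.subplots()
    ax.bar([str(group) for group in rates], list(rates.values()))
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Cancellation rate")
    return fig


# Made-up rows; "hour" and "cancelled" are hypothetical column names.
orders = [
    {"hour": 9, "cancelled": 0},
    {"hour": 9, "cancelled": 1},
    {"hour": 18, "cancelled": 0},
    {"hour": 18, "cancelled": 0},
]
rates = cancellation_rate_by(orders, "hour")
```

Calling `plot_rates(rates, "Hour of day")` would then produce one bar per group; the same helper can be reused for rider-level or order-level factors.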