This project addresses weather forecasting with Machine Learning and Big Data tools, to show whether it is possible to make valuable predictions of meteorological conditions based only on previously observed meteorological data. The task is therefore a classification problem: given a set of weather measurements, predict which meteorological condition will occur.
For further details you can refer to the presentation slides or to the Python Notebook (also published on DataBricks).
This project was developed during the 2020-2021 academic year for the Big Data Computing course at Sapienza University of Rome.
The dataset comes from Kaggle and contains hourly weather measurements for 36 cities, collected from 2012 to 2017. These five years of data amount to approximately 45,000 measurements per city of temperature, humidity, air pressure, and the like.
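
As a rough sanity check on these figures, the sketch below counts the hourly slots in a five-year window and multiplies the per-city figure by the number of cities. The start and end dates are assumptions chosen only for illustration (the text above says just "from 2012 to 2017"), and `hourly_slots` is a hypothetical helper, not part of the project code.

```python
from datetime import datetime


def hourly_slots(start: datetime, end: datetime) -> int:
    """Number of whole hours between two timestamps (end exclusive)."""
    return int((end - start).total_seconds() // 3600)


# Assumed bounds for illustration; the exact dates are not stated above.
ASSUMED_START = datetime(2012, 1, 1)
ASSUMED_END = datetime(2017, 1, 1)

# Figures taken from the dataset description.
N_CITIES = 36
APPROX_ROWS_PER_CITY = 45_000

slots_in_five_years = hourly_slots(ASSUMED_START, ASSUMED_END)
approx_total_readings = N_CITIES * APPROX_ROWS_PER_CITY

print(f"Hourly slots in the assumed window: {slots_in_five_years}")
print(f"Approximate readings per variable, all cities: {approx_total_readings}")
```

Five calendar years of hourly data give a little under 44,000 slots, so a figure of roughly 45,000 per city is of the right order and suggests a collection window somewhat longer than exactly five years. Across 36 cities that is on the order of 1.6 million readings per measured variable.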