CSU Machine Learning 2015 project: our entry in the Kaggle competition "How much did it rain? II"
To use this code, download the train and test data from https://www.kaggle.com/c/how-much-did-it-rain-ii/data and unzip them into this folder. Then process them with master_processing.py: in a Python prompt, import the module and call its processor function. Many of our scripts depend on a file named average.csv in this directory, which is the output of running processor on the training data with its default options. Others rely on nomissing.csv, which can be created with the writestats function in stats.py in the BasicAnalysis folder. Most rely on avtest.csv, which is obtained by running processor on the test data with the filt option set to False. Unfortunately, the data files are too large to include in the repository.
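Because the scripts fail if these derived files are absent, it can help to check for them before running anything. The following is a minimal sketch and not part of the repository: the helper name missing_inputs and the REQUIRED_FILES table are hypothetical, and the hints only restate the steps described above. It assumes the three CSV files are expected in the directory you pass in (the repository root by default).

```python
from pathlib import Path

# File names are taken from the README; the hints restate its instructions.
# This table and the helper below are illustrative, not repository code.
REQUIRED_FILES = {
    "average.csv": "run processor from master_processing.py on the training data with default options",
    "nomissing.csv": "run writestats from stats.py in the BasicAnalysis folder",
    "avtest.csv": "run processor on the test data with filt set to False",
}


def missing_inputs(directory="."):
    """Return {file name: hint} for every required file not found in directory."""
    root = Path(directory)
    return {
        name: hint
        for name, hint in REQUIRED_FILES.items()
        if not (root / name).is_file()
    }


if __name__ == "__main__":
    missing = missing_inputs()
    if not missing:
        print("All derived data files are present.")
    for name, hint in missing.items():
        print(f"Missing {name}: {hint}")
```

Running the file as a script from the repository root prints one line per missing file with a reminder of how to generate it.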
The folders correspond to the different approaches we took to the problem. BasicAnalysis computes some useful statistics about the data and fits traditional regression models. DecisionTrees uses the XGBoost library to train boosted random forests. NN trains basic neural networks using the Keras library. The convolutional neural network approach can be found in convnet.py.