The file hw5 loads the original dataset and does nothing else. To run the .py files, all the required libraries must be installed on the machine. Look at the beginning of the three files to see the imports. In particular, you need to install xgboost and tensorflow, which are the less common packages used.
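As a quick way to check the installation before running anything, a small sketch like the one below may help. It only looks up package names without importing them. The list contains just the two packages named above; this is an assumption, and it should be extended with whatever other imports appear at the top of the three files.

```python
import importlib.util

# Assumed list: only xgboost and tensorflow are named in this README.
# Extend it with the other imports found at the top of the three scripts.
REQUIRED = ["xgboost", "tensorflow"]


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
        print("Try: pip install " + " ".join(missing))
    else:
        print("All listed packages are installed.")
```

If a package is reported as missing, installing it with pip (or the package manager of your environment) should be enough.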
The files must be run in this order:
preprocessing.py -> it will create the .csv files with the preprocessed data
training.py -> it will create the .pickle file with the trained models
testing.py -> it will print the accuracies
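The run order above could also be automated with a small helper script. This is only a sketch and not part of the uploaded files: it assumes the three scripts sit in the current folder, and the names PIPELINE and run_pipeline are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from this README; they are assumed to be in the current folder.
PIPELINE = ["preprocessing.py", "training.py", "testing.py"]


def run_pipeline(scripts, runner=subprocess.run):
    """Run each script in order with the current interpreter.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so later steps do not run after a failed one.
    """
    for script in scripts:
        print("Running", script)
        runner([sys.executable, script], check=True)


if __name__ == "__main__":
    not_found = [s for s in PIPELINE if not Path(s).is_file()]
    if not_found:
        print("Not found in the current folder:", ", ".join(not_found))
    else:
        run_pipeline(PIPELINE)
```

Stopping at the first failure is deliberate here, since training.py depends on the .csv files from preprocessing.py and testing.py depends on the .pickle file from training.py.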
Along with the 3 .py files, I have also uploaded the trained models and the preprocessed data, so it is possible to run testing.py directly. If you also want to run preprocessing and training, you need to download the original dataset from: https://www.kaggle.com/miroslavsabo/young-people-survey/