A Comparative Study of Hadoop MapReduce, Apache Spark & Apache Flink for Data Science
- thesis.pdf is a copy of the thesis itself.
- The thesis directory contains the LaTeX source and figures used to produce my thesis. The PDF itself was generated in Overleaf.
- The scraper directory contains a web scraper, written in Python, that was used to help shortlist papers for categorisation in the survey.
- usability-study-data.tsv is a tab-separated values file containing all the anonymised survey data collected in the usability study.