This repository has been archived by the owner on Mar 22, 2019. It is now read-only.

README.md


Master's Thesis

A Comparative Study of Hadoop MapReduce, Apache Spark & Apache Flink for Data Science

  • thesis.pdf is a copy of the thesis itself.
  • The thesis directory contains the LaTeX source and figures used to produce the thesis. The PDF itself was generated in Overleaf.
  • The scraper directory contains a web scraper, written in Python, which was used to assist in shortlisting papers for categorisation in the survey.
  • usability-study-data.tsv is a tab-separated values file containing all the anonymised survey data collected during the usability study.
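
For readers who want to inspect the survey data, the following is a minimal sketch of one way to load a tab-separated file with Python's standard library. It assumes the file is UTF-8 encoded and that its first row holds the column names; neither is stated in this README, and the function name `load_survey_rows` is hypothetical.

```python
import csv
from pathlib import Path


def load_survey_rows(path):
    """Read a tab-separated file into a list of dicts keyed by the header row.

    Assumption: the first row of the file contains the column names.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    data_file = Path("usability-study-data.tsv")
    # Only attempt to read the file if it is present in the working directory.
    if data_file.exists():
        rows = load_survey_rows(data_file)
        print(f"{len(rows)} rows loaded")
        if rows:
            print("columns:", list(rows[0].keys()))
```

All values are returned as strings, so any numeric columns would need converting before analysis.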