In this repository, we're building an educational hands-on Data Science project for students who are participating in TechAcademy's curriculum. We analyze the recent coronavirus outbreak and teach Data Science in Python or R along the way.
- simple line plot for keyword "coronavirus" without any data wrangling
- Stock data: facet_wrap by share (airlines, healthcare, Slack), add a line that marks when the crisis began
- Gilead Sciences, Moderna, Lufthansa, overall index (S&P500)
- Stock and Trends data in one plot
- Line plot of the number of infected (group_by time) (data cleaning: replace NA with 0)
- Stacked area plot (top 5 countries, see Economist). Use group_by country, date and arrange, filter (many hints, time-consuming)
- Generate the total data set by merging confirmed with deaths and recovered by ID
- Calculate net infected and plot it directly as an area plot
- Visualize four time series in several plots (top 9 countries)
- Bar plot for the most recent date: by country
- Second plot: mortality rate by country (top n) (either deaths/confirmed or deaths/recovered or deaths/net infected), highlight the top 5 by confirmed (or net infected)
- Why is there a kink in the total number of infected around Feb 10? -> Testing criteria changed
- Mortality rate: Why is it problematic to calculate the mortality rate during a pandemic? How would you calculate the true rate?
- How does the political attitude towards testing the population affect the number of infected and number of deaths? (Italy vs. US)
- Why do the top 5 countries by confirmed cases have different mortality rates (tests)?
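
The data preparation steps in the list above (replace NA with 0, merge confirmed with deaths and recovered by ID, calculate net infected, compute a mortality rate for the most recent date) could look roughly like this in Python with pandas. This is only a minimal sketch: the toy data, the column names (`id`, `country`, `date`, `confirmed`, `deaths`, `recovered`) and the definition of net infected as confirmed minus deaths minus recovered are assumptions for illustration, not the project's actual data set or solution.

```python
import pandas as pd

# Hypothetical toy data in long format: one row per (id, country, date).
# All names and values are illustrative assumptions.
confirmed = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "country": ["A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02"]
    ),
    "confirmed": [100, 200, None, 50],
})
deaths = pd.DataFrame({"id": [1, 2, 3, 4], "deaths": [2, 5, 0, 1]})
recovered = pd.DataFrame({"id": [1, 2, 3, 4], "recovered": [10, 45, None, 8]})

# Merge the three tables by ID into one total data set.
total = confirmed.merge(deaths, on="id").merge(recovered, on="id")

# Data cleaning: replace NA with 0 in the count columns.
counts = ["confirmed", "deaths", "recovered"]
total[counts] = total[counts].fillna(0).astype(int)

# Assumed definition: net infected = confirmed - deaths - recovered.
total["net_infected"] = (
    total["confirmed"] - total["deaths"] - total["recovered"]
)

# Worldwide totals per date (input for a line or area plot).
per_date = total.groupby("date")[counts + ["net_infected"]].sum()

# Most recent snapshot with a naive mortality rate (deaths / confirmed),
# which is one of the three options named in the list above.
latest = total[total["date"] == total["date"].max()].copy()
latest["mortality_rate"] = latest["deaths"] / latest["confirmed"]

# Top n countries by confirmed cases (n = 5 in the tasks above).
top_n = latest.nlargest(5, "confirmed")
```

`per_date` has one row per date and can be passed to a plotting library of your choice. Note that `deaths / confirmed` divides by zero for any country with no confirmed cases, so real data may need a filter first.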
4. Visualization with Maps (prepare the data set so that the task becomes easier vs. give hints: Lara will take another look at the approach)
Static Map
- select most recent snapshot
- World Maps: individual data points
- color by number of infected
- Include country borders and color by number of infected in each country (rworldmap) \
Dynamic Map
- Create dynamic world map, get detailed information by clicking on each data point (leaflet)
Idea: R: labels, Python: dynamic with pop-ups
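
For the Python variant, selecting the most recent snapshot and building a clickable map might be sketched as follows. The notes only name leaflet for the dynamic map, so the use of the third-party `folium` package here is an assumption, as are the toy data, the column names and the helper `build_map`. The sketch scales marker size by case count instead of coloring by it.

```python
import pandas as pd

# Hypothetical data points with coordinates; names and values are illustrative.
points = pd.DataFrame({
    "place": ["X", "X", "Y", "Y"],
    "lat": [10.0, 10.0, 50.0, 50.0],
    "lon": [20.0, 20.0, 8.0, 8.0],
    "date": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02"]
    ),
    "confirmed": [5, 12, 40, 90],
})

# Select the most recent snapshot.
snapshot = points[points["date"] == points["date"].max()]


def build_map(snapshot):
    """Return a folium map with one clickable marker per data point."""
    import folium  # third-party; imported here so the data step runs without it

    world = folium.Map(location=[20, 0], zoom_start=2)
    for row in snapshot.itertuples(index=False):
        folium.CircleMarker(
            location=[row.lat, row.lon],
            radius=3 + row.confirmed ** 0.5,  # marker grows with case count
            popup=f"{row.place}: {row.confirmed} confirmed",
            fill=True,
        ).add_to(world)
    return world
```

With `folium` installed, `build_map(snapshot).save("map.html")` writes an HTML file that can be opened in a browser. Clicking a marker shows its pop-up.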