This code is a derivative work of the Imperial College study Estimating the number of infections and the impact of nonpharmaceutical interventions (NPI) on COVID-19 in 11 European countries. This repository applies the same NPI effect fitting and modelling to the regions (and possibly departements) of France. The code has been modified:
- To make it more modular and facilitate reuse.
- To handle additional data streams beyond the ECDC.
- To validate the stability of the predictions with time.
This code is part of the data against covid-19 citizens' initiative for open data and open source code around the COVID-19 pandemic.
Looking to contribute? Check the contributing section below on ways you can help and then go to our projects page!
Simulation results for analysis are in our sister repository.
Mid-March conversations between the data against covid-19 initiative and hospital managers revealed a need for local predictions of the evolution of the COVID-19 pandemic. France has been unevenly hit by the spread of the novel coronavirus, and in order to most effectively allocate resources on a national level, an understanding of local progression is critical.
Code for modelling estimated deaths and cases for COVID19 from Report 13 published by MRC Centre for Global Infectious Disease Analysis, Imperial College London: Estimating the number of infections and the impact of nonpharmaceutical interventions on COVID-19 in 11 European countries.
If you are looking for the individual based model used in Report 9, please look here
This is the release related to report 21, where we use mobility data to estimate the situation in Brazil. All other code is still the same.
To run this code, you can either run the base-Brazil.r file directly or, from the command line, set the current directory to the repository directory and run the following command:
Rscript base-Brazil.r
The code should be run in full mode to obtain any results. Estimating anything without running the full model is discouraged; only a full run should be used to get results.
The instructions for the European and Italy code are the same as before (see version 3 and version 4). This release is specific to the Brazil report.
This is the release related to report 20, where we use mobility data to estimate the situation in Italy. All other code is still the same.
To run this code, you can either source the base-italy.r file in RStudio inside the project or, from the command line, set the current directory to the repository directory and run the following command:
Rscript base-italy.r base-italy google interventions '~ -1 + residential + transit + averageMobility' '~ -1 + residential + transit + averageMobility'
The code for scenarios runs only in full mode, not in short-run or debug mode. Estimating anything without running the full model is discouraged; only a full run should be used to get results.
The instructions for the European code are below. This release is specific to the Italy report.
In this update, we first extended our model from version 2 to have 'partial-pooling' for lockdown across all countries. This means we now have a global lockdown effect, with each country also having its own lockdown effect. We also made our code modular, made the Stan code faster (with help from the community), and now create CSV outputs as well.
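As a rough illustration of what partial pooling means here (this is not the Stan model in this repository), the sketch below assumes that a country's total lockdown effect is a shared global effect plus a country-specific deviation, and that the reproduction number is multiplied by exp(-effect) while lockdown is active. The function name, the parameter names and all numeric values are hypothetical.

```python
import math


def lockdown_rt(r0, global_effect, country_effect, lockdown_active):
    """Return Rt for one country on one day.

    Assumption for illustration only: the lockdown effect is the sum of a
    global term shared by all countries and a country-specific deviation,
    and it scales R0 by exp(-effect) while lockdown is active.
    """
    if not lockdown_active:
        return r0
    return r0 * math.exp(-(global_effect + country_effect))


# Hypothetical values: one shared effect, small per-country deviations.
global_effect = 1.0
country_deviations = {"country_a": 0.2, "country_b": -0.1}

for name, deviation in country_deviations.items():
    rt = lockdown_rt(3.0, global_effect, deviation, lockdown_active=True)
    print(name, round(rt, 3))
```

In a fitted model the deviations would be shrunk towards zero by a shared prior, so countries with little data stay close to the global effect.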
You can directly get csv files here and new model description here
- The Python code is currently not updated and won't work. It is only valid for the version 1 model and data.
- base_general.r and base_general.stan, base_general_speed.stan and base_general_speed2.stan are now valid models for version 2 only.
In this update we extend our original model to include (a) population saturation effects, (b) prior uncertainty on the infection fatality ratio and (c) a more balanced prior on intervention effects. We also (d) included another 3 countries (Greece, the Netherlands and Portugal). The updated technical detail is available here.
You can directly look at our results here
This repository has code for replication purposes. The bleeding edge code and advancements are done in a private repository. Ask report authors for any collaborations.
To see the full readme from the original repository please consult either the readme on upstream-master or the ICL readme.
The original readme includes more details on configuring and running the model.
In an attempt to analyse and predict progression of the epidemic in France, the model from the study of non-pharmaceutical interventions on the basis of death data produced and published by Imperial is used in conjunction with the latest available French regional data from opencovid19-fr/data.
'Live' data sources updated regularly:
- opencovid19-fr/data French regions and departments covid-19 death data;
- ECDC European countries death data;
- INSEE data on the breakdown of French population by age preprocessed in scrouzet/covid19-incrementality.
'Static' sources not updated:
- EHPAD population age breakdown from the DREES;
- Infection Fatality Ratio (IFR) provided by the original repository, calculated by the ICL MRC Centre for Global Infectious Disease Analysis in report 12.
The original code is developed on .csv files downloaded from the ECDC. These are then converted to .rds.
The processing of French regional and departmental data is performed in 3 steps:
- Download and pre-process CSV: data/update-french-regional-data.sh
- Process to ECDC format CSV: data/extract_opencovidfr_2_ICL.py
- Format pre-processed data to RDS: data/fetch-region-france.r
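The three steps above can be chained from a single script. The sketch below is one possible wrapper, not part of the repository: the script paths are the ones listed above, while the interpreter chosen for each file extension, the absence of command-line arguments, and the function names are assumptions. It defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path

# Script paths come from the three steps listed above.
STEPS = [
    ("Download and pre-process CSV", "data/update-french-regional-data.sh"),
    ("Process to ECDC format CSV", "data/extract_opencovidfr_2_ICL.py"),
    ("Format pre-processed data to RDS", "data/fetch-region-france.r"),
]

# Assumption: which interpreter runs each kind of script.
INTERPRETERS = {".sh": "bash", ".py": "python", ".r": "Rscript"}


def build_commands(steps=STEPS):
    """Return (label, argv) pairs in the order the steps must run."""
    commands = []
    for label, script in steps:
        interpreter = INTERPRETERS[Path(script).suffix.lower()]
        commands.append((label, [interpreter, script]))
    return commands


def run_pipeline(dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for label, argv in build_commands():
        print(f"{label}: {' '.join(argv)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Run it from the repository root so that the relative data/ paths resolve.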
- The data from opencovid19-fr are cumulative deaths since the start of the epidemic.
- Deaths in nursing homes (EHPAD) are reported separately to those in hospitals.
- For all French regions only the hospital deaths are available.
These observations led to the following choices in the processing of the opencovid19-fr data:
- Geographical "Regions" and "departements" only consider deaths at hospital.
- Three additional regions are defined:
  - France-OC19: France's death data in hospitals and EHPAD as provided by opencovid19-fr/data;
  - France-Hopitaux: France's death data from hospitals;
  - France-EHPAD: France's death data from nursing homes (EHPAD).
Separating the hospital and EHPAD data permits an acceptable fit on the French data despite the change in data reporting half-way through the period.
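To make the choices above concrete, here is a minimal sketch of how the three national series could be derived. It assumes the source figures are running totals, that daily counts are obtained by differencing, and that France-OC19 is the sum of the hospital and EHPAD series. The function name, the handling of downward corrections, and the numbers are all illustrative and not taken from the repository's scripts.

```python
def cumulative_to_daily(cumulative):
    """Convert running totals into daily counts.

    Assumption: a total that drops below an earlier value is a reporting
    correction, so that day is recorded as 0 rather than a negative count.
    """
    daily = []
    highest_so_far = 0
    for total in cumulative:
        daily.append(max(total - highest_so_far, 0))
        highest_so_far = max(total, highest_so_far)
    return daily


# Made-up cumulative counts for four consecutive days.
hospital_cumulative = [0, 2, 5, 9]
ehpad_cumulative = [0, 0, 3, 4]

series = {
    "France-Hopitaux": cumulative_to_daily(hospital_cumulative),
    "France-EHPAD": cumulative_to_daily(ehpad_cumulative),
}
# Assumption: the combined series is the day-by-day sum of the two sources.
series["France-OC19"] = [
    hospital + ehpad
    for hospital, ehpad in zip(series["France-Hopitaux"], series["France-EHPAD"])
]

for name, values in series.items():
    print(name, values)
```

Keeping the two sources as separate series means a late start in EHPAD reporting shows up only in France-EHPAD and France-OC19, leaving the hospital series unaffected.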
There are 3 ways to contribute:
- Running the code (it's expensive! 10-20h of runtime on 4 core desktop machine);
- Analysing forecasting accuracy;
- Developing the software itself: improving usability, modelling accuracy and data handling.
- Check the todo items in the run projects page;
- Set up your environment: docker and conda are supported.
- Test your setup; this should run without errors:
Rscript base-region-france.r --debug
- Run the model; this takes HOURS:
Rscript base-region-france.r --full
- Upload the .csv files generated in results/base-full-yyyymmddTHHMMSS-JOBID/ to the result repository.
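Before uploading, it can help to locate the most recent full run automatically. The sketch below assumes run folders follow the results/base-full-yyyymmddTHHMMSS-JOBID/ naming shown above, with the CSV files directly inside; the JOBID format is unknown, so anything after the timestamp is accepted. The function names are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches e.g. base-full-20200501T120000-1234 (the JOBID part is free-form).
RUN_DIR_PATTERN = re.compile(r"^base-full-(\d{8}T\d{6})-(.+)$")


def parse_run_dir(name):
    """Return (timestamp, job_id) for a full-run folder name, else None."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
    except ValueError:
        # Digits that do not form a valid date and time.
        return None
    return stamp, match.group(2)


def latest_run_csvs(results_root="results"):
    """List the CSV files of the most recent full run, or [] if none exist."""
    root = Path(results_root)
    if not root.is_dir():
        return []
    runs = []
    for child in root.iterdir():
        parsed = parse_run_dir(child.name) if child.is_dir() else None
        if parsed is not None:
            runs.append((parsed[0], child))
    if not runs:
        return []
    _, newest = max(runs, key=lambda item: item[0])
    return sorted(newest.glob("*.csv"))


if __name__ == "__main__":
    for csv_path in latest_run_csvs():
        print(csv_path)
```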
If you have previously run the code and wish to reprocess some data to the latest formats, run command:
Rscript reprocess-stanfit.r --all
For more options on reprocessing your data:
Rscript reprocess-stanfit.r --help
- The results are stored in two folders results and figures.
- The results folder holds the stored Stan fits and the data used for plotting.
- The figures folder holds the images with daily cases, daily deaths and Rt for all countries.
While model and feature development is welcome, you can also contribute by analysing the results in detail. To that end, we are in the process of making results available. To analyse model performance:
- Fork the repository;
- Check the todo items in the analysis projects page;
- Download the result .csv files from the result repository.
- Suggest a new analysis, preferably as a jupyter notebook.
- Fork the repository;
- Check the todo items in the development projects page;
- Submit a pull request against the most appropriate branch, depending on what you have added.
Much of the discussion is done in the data against covid-19 slack that you can join here. If you are not part of it feel free to submit an issue on this repository.
- master is the production branch; modelling for prediction is run on this code.
- france-regions is a development branch for features looking to improve modelling and processing of French regions.
- upstream-master is an exact mirror of the original model.
- community-contribs is the branch to pull in contributions from the rest of the community that is actively developing this covid19model.
- modularisation is a development branch of features which can be useful to other community projects.
- Original ICL report:
Seth Flaxman, Swapnil Mishra, Axel Gandy et al. Estimating the number of infections and the impact of nonpharmaceutical interventions on COVID-19 in 11 European countries. Imperial College London (30-03-2020) doi: https://doi.org/10.25561/77731