Repository with various data science projects, organized in branches (more coming soon).
Links to several repositories of publicly available data are stored in data_sources.json,
which has the following structure:
data_sources.json:
|--Global:
   |--source:link
|--Nations:
   |--Regions:
      |--Provinces:
         |--source:link
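
A minimal sketch of how the links could be read from this file. It assumes only that every leaf is a source:link pair of strings, so it works whatever the exact nesting is. The function name iter_sources and the sample entries (including the example.org URLs) are hypothetical and not taken from the real data_sources.json.

```python
import json
from typing import Any, Iterator, Tuple


def iter_sources(node: Any, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], str, str]]:
    """Yield (path, source_name, link) for every string leaf in a nested mapping."""
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        if isinstance(value, str):
            # A string value is treated as a source:link pair.
            yield path, key, value
        else:
            # Anything else is treated as a nested level (Global, Nations, ...).
            yield from iter_sources(value, path + (key,))


# Hypothetical content; the real data_sources.json may differ.
SAMPLE = """
{
  "Global": {"example_global": "https://example.org/global.csv"},
  "Nations": {
    "Regions": {
      "Provinces": {"example_province": "https://example.org/province.json"}
    }
  }
}
"""

sources = list(iter_sources(json.loads(SAMPLE)))
for path, name, link in sources:
    print("/".join(path), name, link)
```

To read the actual file, replace json.loads(SAMPLE) with json.load on an open handle to data_sources.json.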
Ingestion pipelines need to take care of the following:
- file format: csv, json, etc.
- based on the file format, select the longitude, latitude and timestamp values (where present) (in progress)
- logging errors (in progress)
- build a harmonized file containing only those three columns
- harmonize values (need to avoid duplicating data when it is present in different sources)
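
The steps in the list above could be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the repository's actual pipeline: the function names (parse_records, normalize, harmonize), the assumption that sources already use the column names longitude, latitude and timestamp, and the sample data are all hypothetical. Duplicates are detected by exact match on the three normalized values, which is only one possible rule.

```python
import csv
import io
import json
import logging

logger = logging.getLogger("ingest")

COLUMNS = ("longitude", "latitude", "timestamp")


def parse_records(text, file_format):
    """Return a list of records from CSV or JSON text; log and skip anything else."""
    try:
        if file_format == "csv":
            return list(csv.DictReader(io.StringIO(text)))
        if file_format == "json":
            data = json.loads(text)
            return data if isinstance(data, list) else [data]
        logger.error("unsupported file format: %s", file_format)
    except (csv.Error, json.JSONDecodeError) as exc:
        logger.error("could not parse %s input: %s", file_format, exc)
    return []


def normalize(record):
    """Keep only the three target columns; missing or invalid values become None."""
    row = {}
    for col in COLUMNS:
        value = record.get(col)
        if value in (None, ""):
            value = None
        elif col == "timestamp":
            # Assumption: timestamps are compared as given, without parsing.
            value = str(value).strip()
        else:
            try:
                value = float(value)
            except (TypeError, ValueError):
                logger.error("invalid %s value: %r", col, value)
                value = None
        row[col] = value
    return row


def harmonize(batches):
    """Merge (text, file_format) batches, dropping rows already seen in another source."""
    seen = set()
    rows = []
    for text, file_format in batches:
        for record in parse_records(text, file_format):
            if not isinstance(record, dict):
                logger.error("skipping non-mapping record: %r", record)
                continue
            row = normalize(record)
            key = tuple(row[col] for col in COLUMNS)
            if key not in seen:
                seen.add(key)
                rows.append(row)
    return rows


# Hypothetical sample data: the JSON row repeats the first CSV row.
csv_text = (
    "longitude,latitude,timestamp,name\n"
    "12.5,41.9,2020-03-01,Rome\n"
    "9.19,45.46,2020-03-01,Milan\n"
)
json_text = '[{"longitude": 12.5, "latitude": 41.9, "timestamp": "2020-03-01", "extra": 1}]'

harmonized = harmonize([(csv_text, "csv"), (json_text, "json")])
```

Here harmonized holds two rows, because the JSON record matches the first CSV row once coordinates are converted to floats. The result can be written to the harmonized file with csv.DictWriter using COLUMNS as the field names.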