Home
This wiki contains documentation for viewser for versions 6.0.0 and above. Much of what follows can also be found in the README.md file in the root directory of this repository.
Older versions of viewser (<6.0.0) retrieve data from a source which is no longer maintained, and these versions can no longer be used - previous users who wish to continue to use the service should upgrade to the latest version of viewser (see below).
The viewser package is a client which interacts via an HTTPS connection with a remote service, usually known as views3, currently hosted at the Peace Research Institute Oslo (PRIO).
views3 has an actively maintained Postgres database containing on the order of 10,000 features of various kinds (e.g. conflict, economic, developmental, environmental), mostly defined at one of two main temporal levels of analysis (month, year) and one of two main spatial levels of analysis (country, priogrid). Most features are regularly updated, some (e.g. UCDP GED) on monthly timescales.
The viewser client allows users to fetch raw data from the database and apply a wide variety of mathematical transforms to each feature.
It communicates with a service running on a remote server, which bears all the computational load, thereby relieving users' machines. Communication follows a simple polling model: the client keeps pinging the service until it receives either an error message or the requested data. In the meantime, the service returns status messages giving users some idea of how much longer they need to wait for their data.
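The polling model described above can be sketched in a few lines of Python. This is a minimal illustration, not viewser's actual client code: the function names, the `(state, payload)` reply format and the fake service are all assumptions made for the example.

```python
import time


def poll_until_done(check_status, interval_seconds=1.0, max_attempts=60, sleep=time.sleep):
    """Call check_status() repeatedly until it reports data or an error.

    check_status is a hypothetical callable returning a (state, payload) tuple,
    where state is one of "pending", "error" or "done".
    """
    for attempt in range(1, max_attempts + 1):
        state, payload = check_status()
        if state == "done":
            return payload
        if state == "error":
            raise RuntimeError(f"service reported an error: {payload}")
        # While the request is pending, the payload is a status message for the user.
        print(f"[attempt {attempt}] {payload}")
        sleep(interval_seconds)
    raise TimeoutError(f"no result after {max_attempts} attempts")


def make_fake_service(pending_messages, result):
    """Stand-in for the remote service: some status messages, then the data."""
    replies = iter([("pending", msg) for msg in pending_messages] + [("done", result)])
    return lambda: next(replies)


fake_service = make_fake_service(["queued", "computing transforms"], {"rows": 3})
data = poll_until_done(fake_service, interval_seconds=0.0)
print(data)
```

The loop stops on the first "done" or "error" reply, and the attempt limit keeps a client from polling forever if the service never answers.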
Users define what data and which transforms (which can be arbitrarily chained) they require using a Python class called a Queryset, which makes the specification of the requested dataset declarative - users do not need to worry about the various lower-level processes needed to produce a certain dataset. In particular, any aggregation or disaggregation between different levels of analysis is performed automatically. The package allows users to simply state what they want in a straightforward fashion that the data service understands, and requests are fulfilled on the fly, while being cached in case they are needed again.
The service eventually returns a single pandas dataframe (compressed for transfer) containing all requested data.
The following is a brief Quickstart guide - detailed instructions can be found at Detailed installation instructions.
viewser is a Python package publicly available via pip. It was developed for macOS and Linux platforms and has not been tested on Windows.
It is very strongly recommended that viewser be installed in a dedicated conda environment.
Conda can be obtained here (the miniforge installer is recommended):
https://conda.io/projects/conda/en/latest/user-guide/install/index.html
A minimal viewser environment can be created by executing
conda create -n viewser python=3.11
Once the environment is activated by
conda activate viewser
viewser itself can be installed via
pip install viewser
The viewser package is regularly updated, so users should frequently update it by executing
pip install --upgrade viewser
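Since versions older than 6.0.0 can no longer retrieve data, it can be useful to check which version is installed in the active environment. The sketch below uses only the Python standard library; the helper names are illustrative and are not part of viewser, and the version parsing is deliberately simple (it reads only the leading numeric components).

```python
import re
from importlib import metadata

MINIMUM_SUPPORTED = (6, 0, 0)  # versions below this use the retired data source


def parse_version(version_string):
    """Turn the leading numeric part of a version into a 3-tuple, e.g. '6.5' -> (6, 5, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"cannot parse version: {version_string!r}")
    parts = [int(p) for p in match.group(0).split(".")][:3]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_supported(version_string):
    """True if the given version is at least MINIMUM_SUPPORTED."""
    return parse_version(version_string) >= MINIMUM_SUPPORTED


def installed_viewser_version():
    """Return the installed viewser version string, or None if it is not installed."""
    try:
        return metadata.version("viewser")
    except metadata.PackageNotFoundError:
        return None


installed = installed_viewser_version()
if installed is None:
    print("viewser is not installed in this environment")
elif is_supported(installed):
    print(f"viewser {installed} is supported")
else:
    print(f"viewser {installed} is too old; run: pip install --upgrade viewser")
```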
Once viewser is installed, it needs to be configured to set the URL from which it fetches data. This can be achieved by
viewser config set REMOTE_URL https://viewser.viewsforecasting.org
To open this wiki in a browser window from the terminal, run:
viewser help wiki
The viewser client can be used in two ways:
viewser functionality is exposed via a CLI on your system after installation. An overview of available commands can be seen by running viewser --help.
The CLI is envisaged mainly as a tool to help users with issues such as selecting appropriate transforms, exploring the database, determining the level of analysis of a given feature, etc.
Show all features in the database:
viewser features list <loa>
with <loa> being one of ['pgm', 'cm', 'pgy', 'cy', 'pg', 'c', 'am', 'a', 'ay', 'm', 'y']
Show all transforms sorted by level of analysis:
viewser transforms list
Show all transforms available at a particular level of analysis:
viewser transforms at_loa <loa>
with <loa> being one of ['any', 'country_month', 'priogrid_month', 'priogrid_year']
Show docstring for a particular transform:
viewser transform show <transform-name>
List querysets stored in the queryset database:
viewser querysets list
Produce the code required to generate a queryset:
viewser querysets show <queryset-name>
The full functionality of viewser is exposed via its API for use in scripts and notebooks.
The two fundamental objects used to define what data is fetched by the client are the Queryset and the Column, where a Queryset consists of one or more Columns, and a Column is one raw feature to which zero or more transforms have been applied.
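The relationship between these two objects can be illustrated with a small toy model. The classes below are hypothetical stand-ins written for this example only; they are not viewser's Queryset and Column classes and do not reproduce their method signatures, and the feature and transform names are invented. See the Queryset Basics guide for the real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyColumn:
    """One raw feature plus an ordered chain of zero or more transform names."""
    name: str
    feature: str
    transforms: tuple = ()

    def with_transform(self, transform_name):
        # Return a new column with one more transform appended to the chain.
        return ToyColumn(self.name, self.feature, self.transforms + (transform_name,))


@dataclass(frozen=True)
class ToyQueryset:
    """A named collection of columns at a single level of analysis."""
    name: str
    loa: str
    columns: tuple = ()

    def with_column(self, column):
        # Return a new queryset with one more column added.
        return ToyQueryset(self.name, self.loa, self.columns + (column,))

    def describe(self):
        lines = [f"{self.name} @ {self.loa}"]
        for col in self.columns:
            chain = " -> ".join((col.feature,) + col.transforms)
            lines.append(f"  {col.name}: {chain}")
        return "\n".join(lines)


# Hypothetical feature and transform names, chosen only for illustration.
queryset = (
    ToyQueryset("example", "priogrid_month")
    .with_column(ToyColumn("raw_feature", "feature_a"))
    .with_column(
        ToyColumn("lagged_feature", "feature_b")
        .with_transform("fill_missing")
        .with_transform("lag_1")
    )
)
print(queryset.describe())
```

The specification is purely declarative: the objects record which features and transforms are wanted, and computing them is left to the service.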
Follow the links below for guides to creating querysets and fetching data.
- Queryset Basics: How to define new querysets
- Getting Data: How to retrieve data from the views3 service
- Drift Detection: Detecting possible anomalies in data retrieved through viewser
ViEWS 3 relies on several conventions for data and naming, to make exchange and interoperation between packages easier:
Documentation is provided for each of the constituent ViEWS 3 packages.
Notebooks are also available, which show complete workflows using the various packages in concert. Note that package names are written with a dash (-) on PyPI, and with an underscore (_) on GitHub, due to differing naming conventions.
- viewser is the entrypoint for interacting with the ViEWS 3 cloud, which provides data for the ViEWS team.
- views-transformation-library contains data transformation functions available in viewser.
- views-runs provides helper classes for model run management (soon to be deprecated).
- stepshift implements the step-shifting algorithm used for predictive modelling.
- views-partitioning is used to partition data for training.