The aim of this project is to showcase how graph query languages can be used to analyse a list of assets and their owners. For example, we can estimate in a natural way how a perturbation to the system, i.e. a problem with one node, will affect other nodes in the network.
This project is divided into four main parts:
- first, the synthetic data generation;
- second, the export of the generated data to a GraphML file;
- third, the preparation to connect the Jupyter notebook to the Gremlin server;
- fourth, the analysis itself.
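The first two parts can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual generator: the entity names (`E0`, `E1`, ...), the `share` edge attribute, the range of share values and the helper names `generate_ownership` and `to_graphml` are all assumptions made for the example. Shares are drawn independently, so the total ownership of an asset is not normalised to 100%.

```python
import random
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def generate_ownership(n_entities=6, max_assets=2, seed=42):
    """Return (names, edges); each edge is an (owner, asset, share) tuple.

    Hypothetical generator: every entity owns a random share of one to
    `max_assets` other entities.
    """
    rng = random.Random(seed)
    names = [f"E{i}" for i in range(n_entities)]
    edges = []
    for owner in names:
        others = [n for n in names if n != owner]
        for asset in rng.sample(others, rng.randint(1, max_assets)):
            edges.append((owner, asset, round(rng.uniform(0.05, 0.5), 2)))
    return names, edges


def to_graphml(names, edges):
    """Serialise the ownership list as a GraphML document (a string)."""
    ET.register_namespace("", GRAPHML_NS)
    tag = lambda name: f"{{{GRAPHML_NS}}}{name}"
    root = ET.Element(tag("graphml"))
    # Declare the edge attribute that carries the ownership share.
    ET.SubElement(root, tag("key"), {
        "id": "share", "for": "edge",
        "attr.name": "share", "attr.type": "double",
    })
    graph = ET.SubElement(root, tag("graph"), id="G", edgedefault="directed")
    for name in names:
        ET.SubElement(graph, tag("node"), id=name)
    for i, (owner, asset, share) in enumerate(edges):
        edge = ET.SubElement(graph, tag("edge"),
                             id=f"e{i}", source=owner, target=asset)
        ET.SubElement(edge, tag("data"), key="share").text = str(share)
    return ET.tostring(root, encoding="unicode")


names, edges = generate_ownership()
graphml_text = to_graphml(names, edges)
```

Writing `graphml_text` to a file gives something a GraphML reader can load. The Gremlin side may expect additional keys (for example vertex and edge labels), which this sketch does not cover.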
I will post a step-by-step setup guide in a separate file called "CONFIG_README".
The data generation is done in Python in a JupyterLab project. The analysis of the data is done via queries to a Gremlin server; the presentation of the results is generated with Python.
To run this project you will need JupyterLab and Gremlin Server installed on your machine. A variation of this project will later be ported to Azure to demonstrate another workflow.
The aim here is to showcase how simple it becomes to design queries over the graph when using a graph traversal language such as Gremlin. After generating the graph, we will use Gremlin in two ways:
- use the Gremlin console to prototype our queries;
- use Python libraries to query the Gremlin server directly from the Jupyter notebook, so that the results are immediately available for generating a visual report.
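As a rough sketch of the second approach, the snippet below uses the `gremlinpython` driver to open a connection and run a trivial traversal. It assumes a Gremlin server listening locally on the default port 8182 with a traversal source named `g`, and a driver version that provides `with_remote`. The helper names `gremlin_url` and `count_vertices` are hypothetical. The setup actually used by this project will be described in "CONFIG_README".

```python
def gremlin_url(host="localhost", port=8182):
    """Build the WebSocket endpoint of the Gremlin server.

    Host and port are assumptions; 8182 is the server's default port.
    """
    return f"ws://{host}:{port}/gremlin"


def count_vertices(url):
    """Connect to the server at `url` and return the number of vertices."""
    # Imported lazily so that gremlin_url() works without the driver installed.
    from gremlin_python.driver.driver_remote_connection import (
        DriverRemoteConnection,
    )
    from gremlin_python.process.anonymous_traversal import traversal

    connection = DriverRemoteConnection(url, "g")
    try:
        g = traversal().with_remote(connection)
        # Same query as `g.V().count()` in the Gremlin console.
        return g.V().count().next()
    finally:
        connection.close()
```

Calling `count_vertices(gremlin_url())` from a notebook cell is a quick way to check that the notebook can reach the server. Depending on the driver version, running it inside Jupyter may need extra event-loop handling.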
The questions we will implement are:
- given a node, are there feedback loops? How many? What is the proportion of indirect self-ownership?
- given a node, how strongly may this node affect other nodes? Which nodes are affected? In which proportion?
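To make these questions concrete, here is a small plain-Python reference computation on a toy graph, which could be used to cross-check the Gremlin query results. It rests on assumptions that are not fixed by this README: edges point from owner to asset, a feedback loop is a simple cycle through the node, the weight of a path is the product of the ownership shares along it, and the effect on a node is the sum of the weights of all simple paths leading to it. The `OWNS` data and the function names are hypothetical.

```python
# Toy data (hypothetical): OWNS[owner][asset] = share of the asset held.
OWNS = {
    "A": {"B": 0.5},
    "B": {"C": 0.4, "A": 0.2},
    "C": {"A": 0.1},
}


def simple_paths(graph, start, target):
    """Yield (path, weight) for each simple path from start to target.

    With start == target this yields the simple cycles through that node.
    The weight is the product of the shares along the path.
    """
    stack = [(start, [start], 1.0)]
    while stack:
        node, path, weight = stack.pop()
        for nxt, share in graph.get(node, {}).items():
            if nxt == target:
                yield path + [nxt], weight * share
            elif nxt not in path:
                stack.append((nxt, path + [nxt], weight * share))


def self_ownership(graph, node):
    """Return (number of feedback loops, summed weight of those loops)."""
    loops = list(simple_paths(graph, node, node))
    return len(loops), sum(weight for _, weight in loops)


def impact(graph, source):
    """Map each node reachable from source to its summed path weight."""
    nodes = set(graph) | {a for assets in graph.values() for a in assets}
    result = {}
    for target in nodes - {source}:
        total = sum(w for _, w in simple_paths(graph, source, target))
        if total > 0:
            result[target] = total
    return result


n_loops, indirect_share = self_ownership(OWNS, "A")
affected = impact(OWNS, "A")
```

In this toy graph, node `A` sits on two loops (`A -> B -> A` and `A -> B -> C -> A`), and the nodes affected by `A` are `B` and `C`. Enumerating simple paths grows quickly with graph size, so this is only suitable for small test graphs.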