Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the display for larger networks #34

Open
albertocottica opened this issue Jul 8, 2014 · 15 comments
Open

Improve the display for larger networks #34

albertocottica opened this issue Jul 8, 2014 · 15 comments

Comments

@albertocottica
Copy link
Member

For example edge bundling:

http://www.win.tue.nl/~dholten/papers/forcebundles_eurovis.pdf

Already at the Edgeryders level (hundreds of nodes, thousands of edges) the visualization becomes chaotic.

Another possibility is adding a metagraph representation (see tulip: https://www.youtube.com/watch?v=hbhOfQgQbZY) , where the metanodes could be the classes of the modularity-maximizing partition, already computed by Edgesense. Problem: the Louvain algorithm is non-deterministic, so this representation is blurry at the edges.

@luca
Copy link
Contributor

luca commented Aug 24, 2015

Other ways to improve graph visualization without edge bundling:

  1. Remove loops as per Issue 44.
  2. Change the layout to one based on arrows. This would eliminate approximately 20-25% of edges from the visualization.
  3. Reduce to Simmelian backbone. Requires extra computatio, reduces edges by 60-80%.

@luca
Copy link
Contributor

luca commented Aug 24, 2015

Also relevant is #44

@luca
Copy link
Contributor

luca commented Aug 24, 2015

Other options:

  • when the density of the network overcomes a certain threshold then the graph is drawed as unidirected, that is only one edge is drawed between alice and bob (vs 1 from alice to bob and another from bob to alice) e.g. the threshold could be when there are more than 4 edges per each node
  • Leave to the user the choice of the minimal edge weight to show, that is if the user choses to show only nodes of weight 3 or more then all the edges of weight 1 or 2 are not shown. Edges with weight > 1 represent repeated interactions

@luca
Copy link
Contributor

luca commented Aug 24, 2015

Another dimension to consider when addressing the complexity of the network is that of the time: there are communities like InnovatoriPA that are very old (years) and for which the timeline filter becomes useless.
For these cases it might make sense to limit the timeframe handled in the representation or to "compress" the history earlier than 12 months ago to a single timestep.

also see #71

@luca luca changed the title Add a layout for larger networks Improve the display for larger networks Aug 24, 2015
@albertocottica
Copy link
Member Author

Here are some comments from my network scientists friends:

@renoust:

"Of course, the Simmelian backbone will be a good solution, and should test what it gives you on your network. This method has been developed for affiliation networks (like facebook schools, and zachary karate club), so I'm unsure of what you might get. I've tested on document networks constructed from keyword affiliation, and there were nothing really to be interpreted.
Your other solution for meta graph could also be a nice idea, maybe based on something else than Louvain.
In general, I'd say all these methods make the graph easy to read as a whole, but difficult to interpret and inspect individual nodes and links.
Another solution can be setting your layout from a filtered the graph, like you do with time, but maybe from participation. If you can identify what is a (semantically) weak/meaningless link, you can filter those links out; or get inspired here: Neal Z. (2013). Identifying statistically significant edges in one-mode projections.Social Network Analysis and Mining 3, 915 – 924

You could also separate your whole graph into 2 subgraphs, for example, low participation and new comers + neighborhood at 1 and their induced subgraph, and another more dense of the induced subgraph all none new comers.

Well these are just ideas, I hope it helps!"

@melancon:

"I’ve played quite a lot with simmelian backbone, at the same time I was playing with Zachary Neal’s trick to project affiliation networks on their « main » component.

I have both coded in python already if that can be useful. I did run the simmelian backbone approach on the FP7 EU data in the context of the study performed in collaboration with PwC. I can tell you about it. I remember I used it to try and put my hands on the « most important » SMEs.

I also know Brandes had a trick to improve the computation time of the backbone when dealing with larger graphs.

Neal’s trick is also quite good, but will work best when dealing with a bipartite graph (affiliation) A+B with a small number of B nodes (A people, B events they attend in his case). I know since his 2013 paper (he presented during a INSITE workshop meeting organized by Rozenblat) he generalized his approach to weighted bipartite graphs.

Would be worth trying it all on your data."

So the good news is that there already is usable Python code (Guy writes quite beautifully, everything is classes and superclasses, so his code is probably actually reusable). The bad news is that both the Simmelian backbone and, even more so, Neal's method were dreamed up for bipartite graphs, and ours is not really a bipartite graph.

Let's talk soon and make a decision, @luca and @jenkin

@luca
Copy link
Contributor

luca commented Oct 16, 2015

a few comments:

if i understand correctly, the Simmelian backbone and Neal's method may or may not help with simplifying the network. It depends on the network. Moreover it might be difficult to implement but having the python code could help.

I'd say that this tells us that at best those methods should be optionals, and be explicitly triggered by the user, rather than a default behaviour of the processing script.

I also see some difficulty in communicating the meaning of some of these techniques to the visualization users as they are quite unaware of the network science behind all this.

I like the suggestion from @renoust that is can be good enough setting your layout from a filtered the graph, like you do with time, but maybe from participation.

I suggest we explore what filtering can be done that makes the display clearer while using only the simpler data we collect. An example could be make the links more transparent the less weight they have (weight here meaning the degree == number of singular links between A and B that are merged in an edge)

@albertocottica
Copy link
Member Author

The tradeoff is inevitable: a simpler graphic representation means either a loss of information or a more complex way to aggregate the data. @jenkin knows that in a different, less formal context (hackathon, so anything goes) I reduced a network of collaboration by dreaming up something I called the "stable collaboration graph", which was defined as the network of organisations that had collaborated at least twice. This graph said something different from the source one, but what it said was relevant to the research question.

Of the two methods discussed above, I believe that the Simmelian backbone is not dependent on the bipartite structure of the original social interaction. It is a way to reduce a network by picking only the edges that, based on social theory, should represent the stronger ties. Neal's method, as I remember it, takes a different approach based on co-occurrence: it generalises to non-social networks, but it does need a bipartite graph to start with.

I like @luca's proposal of playing with transparency. I also like very much the idea that the reduced graph should be triggered by the user and not a default.

I am going to ask @melancon to process the GEXF file from Edgesense via Simmelian backbone. If he agrees, we can spend some time looking at it and figuring out what it does.

@renoust
Copy link

renoust commented Oct 16, 2015

Quickly reading,
+1 on your idea of transparency, please also consider z-ordering (I mean
semi-transparent edges can also accumulate and occlude, so tweak the order
of drawing edges to have a better effect)
Anyway, whatever the method you choose to filter your edges (simmelian
backbone, statistical relevance, or any other property), it's generally a
good idea to trigger manually or not the filtering (within the memory
possibilities).
I agree with @melancon that you can try first all different methods on the
side before making any decision, especially since you'll have readily
running python code for Tulip, and Tulip can import GEXF.

Benjamin

On 16 October 2015 at 16:22, Alberto Cottica [email protected]
wrote:

The tradeoff is inevitable: a simpler graphic representation means either
a loss of information or a more complex way to aggregate the data. @jenkin
https://github.com/jenkin knows that in a different, less formal
context (hackathon, so anything goes) I reduced a network of collaboration
by dreaming up something I called the "stable collaboration graph", which
was defined as the network of organisations that had collaborated at least
twice. This graph said something different from the source one, but
what it said was relevant to the research question.

I like @luca https://github.com/luca's proposal of playing with
transparency. I also like very much the idea that the reduced graph should
be triggered by the user and not a default.

I am going to ask @melancon https://github.com/melancon to process the
GEXF file from Edgesense via Simmelian backbone. If he agrees, we can spend
some time looking at it and figuring out what it does.


Reply to this email directly or view it on GitHub
#34 (comment).

@jenkin
Copy link
Contributor

jenkin commented Oct 21, 2015

I think playing with opacity is a great idea, look at the attached image... I used a log scale to map weight to opacity (from 0.15 to 1), without considering z-index, and I also changed the node border color (from white to black).

edgesense_w_opacity

In this screenshot all edges between two nodes are aggregated and merged (so the network is undirected) and loops are simply removed.

@albertocottica
Copy link
Member Author

Another easy win is to draw the graph as undirected. This means that, when i => j AND j=> i, Javascript would only draw one edge. Assuming half of the interactions are reciprocated, this would lead to eliminating 1/4 of all edges. Maybe we could have a "simplified visualization" tab with (1) a directed graph that (2) uses "ghosting" . But I think for online communities the age of interactions is more important than their weight.

@luca
Copy link
Contributor

luca commented Oct 26, 2015

@jenkin can you try using the age of interactions in determining the opacity? (example: calculate the weight by summing a number that is between 0 and 1 depending on the age of the edge)

I'd use a two state button (or a check) instead of a tab to select the simplified view (we'll add a contextual help to explain the choices done when simplifying)

@jenkin
Copy link
Contributor

jenkin commented Oct 30, 2015

Here a simple implementation of time-based edges weight: Dataninja@fad04f1

@jenkin
Copy link
Contributor

jenkin commented Nov 14, 2015

Done in last PR, but without control exposed to the user.

@luca
Copy link
Contributor

luca commented Nov 25, 2015

@jenkin thanks for the PR, I've merged it.
Question: if we were to expose the control to the user, how would you do it? Can you add it?

luca added a commit that referenced this issue Dec 2, 2015
@albertocottica albertocottica modified the milestones: Beta, Version 2 (or is it 1?) Dec 9, 2015
@albertocottica
Copy link
Member Author

The time filter helps a lot. Not closing the issue, though, as it contains many interesting ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants