Feature request: Plot graph using graphviz #48

Closed
pinpox opened this issue Jun 17, 2021 · 32 comments · Fixed by #106
Comments

@pinpox

pinpox commented Jun 17, 2021

It would be nice to be able to visualise the connections between notes as a graph. I have seen this functionality in other applications that implement a Zettelkasten-style notebook and find it very helpful. Here is an example of how that could look.

I propose a command, e.g. zk graph, that just outputs the source for a graph in Graphviz format, so that it can be piped to a file or rendered using dot.

Here is another example of a very useful visualisation, though I haven't been able to run it on my zk notes folder.

image

Maybe there already is a solution for this I'm not aware of, if so please let me know!

@mickael-menu
Member

I had something similar in mind 👍

I thought about having something like zk graph --format <format> [filtering], using the same filtering options as list to get a subset of the graph.

Format could be json, dot, svg and more. An ascii format would be great, but that might stretch the terminal a bit too much!

An --interactive mode (mutually exclusive with --format) serving an HTML version in the browser, which would allow filtering the graph interactively, would be nice as well.

Initially I wanted to produce a JSON version, because it might be easier to pipe through other programs, including D3.js (which the project you shared uses). But I'm still trying to figure out the best JSON schema. There are a few specs out there but unfortunately nothing really standard:

Making a prototype in DOT could be a first step. There's even a Go package to generate DOT graphs. I'm just wondering if DOT is really appropriate when you have so many nodes. But it could be useful for a small subset of your ZK.

@pinpox
Author

pinpox commented Jun 18, 2021

Cool, let me know if you need help testing!

JSON would be very nice indeed, as it would allow feeding the output into something that renders mustache templates. I'm specifically thinking of mustache-go here, which I already use to render static content, e.g. for documentation.

This would also make it possible to create a dot or svg mustache template, probably without any further code.

EDIT: I'm playing around with the list --format option, trying to output the necessary information as JSON. Looking at the docs, the only thing missing would be the ability to list the links of a note as a template value. Any ideas?

 zk list  -P --format='{title="{{title}}", tags=[{{#metadata.tags}}"{{.}}",{{/metadata.tags}}]}'
{title="Bitwarden", tags=["bitwarden",]}
{title="Filebrowser", tags=[]}
{title="Information Surprise", tags=["information-surprise","math","paper","Claude Shannon","information","surprise","entropy",]}

@mickael-menu
Member

Sure, I'll keep you posted on this issue.

I'm playing around with the list --format option, trying to output the necessary information as JSON. Looking at the docs, the only thing missing would be the ability to list the links of a note as a template value. Any ideas?

I also want to add a json option to the list --format. Doing it manually is a bit cumbersome, and if you have quotes in your note titles they will break the JSON. Actually, that might be a better start: if we expose the links in list, we could process a graph from it, even though a dedicated graph JSON schema would be nicer to work with for zk graph.
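To illustrate the quoting problem mentioned above, here is a small Python sketch. It is not zk code: `naive_row`, `safe_row` and `is_valid_json` are hypothetical helper names, and the title is made up. It compares JSON assembled by string concatenation (as a hand-written template would do) with JSON produced by a serializer.

```python
import json


def naive_row(title: str) -> str:
    # Mimics a hand-written template such as '{"title": "{{title}}"}':
    # the title is pasted in without any escaping.
    return '{"title": "' + title + '"}'


def safe_row(title: str) -> str:
    # json.dumps escapes quotes, backslashes and control characters.
    return json.dumps({"title": title})


def is_valid_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


title = 'The "Zettelkasten" method'  # made-up title containing quotes
print(is_valid_json(naive_row(title)))
print(is_valid_json(safe_row(title)))
```

The first check fails because the embedded quotes end the string early; the second passes because the serializer escapes them.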

I need to refactor some stuff in the models to have links more readily available. I expect that it could make the index retrieval slower for large ZKs, as it will need SQL joins to retrieve the links. Maybe adding an option --with-links could help for this.

@pinpox
Author

pinpox commented Jun 18, 2021

Good point, though I'd assume the ZKs would have to be really big for it to be an issue performance-wise. Fully agree on the json option, that would be hugely useful and allow for great integration with a lot of other tools!

@mickael-menu
Member

I have a 30k-note ZK for testing and will see if it's an issue with it. If not, we can include links by default and add an option to disable them.

@mickael-menu
Member

I've been working on the JSON output for zk list: #52

I added a new {{json x}} template helper which outputs valid JSON values. We can rewrite your example to print JSON safely:

$ zk list --format '{ "title": {{json title}}, "tags": {{json tags}} }'

(The Handlebars template parser trips on }}}, so I added an extra space)

With the new --header and --footer options, you can output a valid JSON array:

$ zk list --format '{ "title": {{json title}}, "tags": {{json tags}} }' --delimiter , --header [ --footer ]

I built upon this to add two new formats json and jsonl (JSON Lines) which are shortcuts for:

  • --format json -> --format "{{json .}}" --delimiter , --header [ --footer "]\n"
  • --format jsonl -> --format "{{json .}}" --delimiter "\n" --footer "\n"
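As a rough sketch of how a consumer might read the two formats, the following Python parses a JSON array and JSON Lines into the same list. The sample strings are made up to follow the header/delimiter/footer scheme described above and are not real zk output; `parse_json_array` and `parse_json_lines` are hypothetical names.

```python
import json


def parse_json_array(text: str) -> list:
    # The whole output is one JSON document: "[" + items joined by "," + "]".
    return json.loads(text)


def parse_json_lines(text: str) -> list:
    # Each non-empty line is its own JSON document.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample outputs for two notes.
array_output = '[{"title": "A"},{"title": "B"}]\n'
lines_output = '{"title": "A"}\n{"title": "B"}\n'

print(parse_json_array(array_output) == parse_json_lines(lines_output))
```

JSON Lines can be processed one note at a time, which may matter for large notebooks, while the array form is what most tools expecting a single JSON document want.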

@pinpox
Author

pinpox commented Jun 20, 2021

Very useful, thank you!
Would it be possible to add an array of links as well? With only the tags, a directed graph showing the connections between notes is still not possible, I think?

@mickael-menu
Member

No indeed. I actually added this today, but I'm not 100% sure I want to take this direction with zk list. Adding links gets quite hierarchical and feels a bit arbitrary (for example, why not have backlinks then?). Also, I've pondered which information about the target should be part of the link. Probably its title and path, but IMO it starts to feel redundant in the JSON.

Technically you should be able to post-process zk list --format json to retrieve the links yourself using zk list --linked-by <source path> --format json. This way you can get all the metadata you might need.
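A minimal Python sketch of that post-processing idea follows. It assumes zk is on the PATH, that it is run inside a notebook, and that each note object has a "path" field as in the JSON output shown in this thread. `run_zk`, `parse_notes` and `collect_links` are hypothetical helper names, and treating empty output as "no notes" is an assumption.

```python
import json
import subprocess
from typing import Callable, Dict, List


def run_zk(args: List[str]) -> str:
    """Run zk with the given arguments and return its standard output."""
    result = subprocess.run(["zk", *args], check=True, capture_output=True, text=True)
    return result.stdout


def parse_notes(output: str) -> List[dict]:
    # Assumption: empty output means that no notes matched.
    return json.loads(output) if output.strip() else []


def collect_links(run: Callable[[List[str]], str] = run_zk) -> Dict[str, List[str]]:
    """Map each note path to the paths of the notes it links to."""
    links = {}
    for note in parse_notes(run(["list", "--format", "json"])):
        # One extra zk call per note, using the --linked-by filter.
        linked = parse_notes(run(["list", "--linked-by", note["path"], "--format", "json"]))
        links[note["path"]] = [target["path"] for target in linked]
    return links
```

Calling `collect_links()` spawns one zk process per note, so this approach is likely to be slow on large notebooks; the `run` parameter exists only so the function can be tried with canned output.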

It's not a cop out though, I still see the benefit of having the links but I'm exploring the zk graph command instead. I think I will settle on the JSON format from D3.js as this lib is quite popular for these use cases and it is quite clean:

{
    "nodes": [
        { "id": 1, "name": "A" },
        { "id": 2, "name": "B" }
    ],
    "links": [
        { "source": 1, "target": 2 }
    ]
}

I will just add some extra metadata in the objects. Do you have any idea what might be needed there?

JGF is more specified, but at the same time I didn't see concrete applications using it.

@pinpox
Author

pinpox commented Jun 20, 2021

Just a thought: Maybe it would be wise to adhere to the do one thing, do it right principle here. I see more value in having a general JSON format representing the data, than some format specific to D3 or any other tool. You could declare the actual graphing as a non-feature and provide an example of how to use external tools (like d3 or dot) to generate graphs. Adding this to each note:

    "links": [
        "titleOrPath1", "titleOrPath2"
    ]

in addition to the output you already implemented above with the --format json flag, would be totally enough I think. What I have in mind would be not generating the graph's source from zk directly, but piping the json data to whatever tool I might want to use.

Example

I'll add a full example of what I personally would like to do. Take it with a grain of salt, this might be subjective, I'm just trying to give some ideas.

Assume the following notes:

Note 1
---
title: testnote1
created: 2021-06-20
visibility: private
language: en
tags:
- testnote1
---

# testnote1

This is the first test note, [[testnote2]] is related to this.
Note 2
---
title: testnote2
created: 2021-06-20
visibility: private
language: en
tags:
- testnote2
---

# testnote2

Here is a second test note. [[testnote3]] is related and [[testnote4]] aswell.
Note 3
---
title: testnote3
created: 2021-06-20
visibility: private
language: en
tags:
- testnote3
---

# testnote3

The third note.
Note 4
---
title: testnote4
created: 2021-06-20
visibility: private
language: en
tags:
- testnote4
---

# testnote4

Having four notes is enough for an example.

Running zk list --format json --no-pager -m "testnote*" | jq already gives most of the output I need; I have added the links array manually to each note for this example. Assume the command would print the following output:

Expected output with `--format json`
[
{
  "path": "testnote3.md",
  "title": "testnote3",
  "link": "[testnote3](testnote3)",
  "lead": "# testnote3",
  "body": "# testnote3\n\nThe third note.",
  "snippets": [
    "# testnote3\n\nThe third note."
  ],
  "rawContent": "edited/removed to keep example short",
  "wordCount": 18,
  "tags": [
    "testnote3"
  ],
  "metadata": {
    "created": "2021-06-20",
    "language": "en",
    "tags": [
      "testnote3"
    ],
    "title": "testnote3",
    "visibility": "private"
  },
  "created": "2021-06-20T14:14:45.031440861Z",
  "modified": "2021-06-20T14:14:55.598864042Z",
  "checksum": "5ae38fe0ff556bc37df7b9bfcce7893a0d090a6258c27f85dfaa441a23717fc1",
  "links": []
},
{
  "path": "testnote4.md",
  "title": "testnote4",
  "link": "[testnote4](testnote4)",
  "lead": "# testnote4",
  "body": "# testnote4\n\nHaving four notes is enough for an example.",
  "snippets": [
    "# testnote4\n\nHaving four notes is enough for an example."
  ],
  "rawContent": "edited/removed to keep example short",
  "wordCount": 23,
  "tags": [
    "testnote4"
  ],
  "metadata": {
    "created": "2021-06-20",
    "language": "en",
    "tags": [
      "testnote4"
    ],
    "title": "testnote4",
    "visibility": "private"
  },
  "created": "2021-06-20T14:15:02.689644096Z",
  "modified": "2021-06-20T14:15:44.49852715Z",
  "checksum": "70dd52ebab26ca37f483c0a36277bd9121ebe4a5bcb8aad9ad5335915a904e91",
  "links": []
},
{
  "path": "testnote1.md",
  "title": "testnote1",
  "link": "[testnote1](testnote1)",
  "lead": "# testnote1",
  "body": "# testnote1\n\nThis is the first test note, [[testnote2]] is related to this.",
  "snippets": [
    "# testnote1\n\nThis is the first test note, [[testnote2]] is related to this."
  ],
  "rawContent": "edited/removed to keep example short",
  "wordCount": 26,
  "tags": [
    "testnote1"
  ],
  "metadata": {
    "created": "2021-06-20",
    "language": "en",
    "tags": [
      "testnote1"
    ],
    "title": "testnote1",
    "visibility": "private"
  },
  "created": "2021-06-20T14:13:19.435952767Z",
  "modified": "2021-06-20T14:13:39.439388055Z",
  "checksum": "b216e9e1192701e4f9582a6964cbc8df93250977fd0d34ee410edaa557ecbc1f",
  "links": ["testnote2"]
},
{
  "path": "testnote2.md",
  "title": "testnote2",
  "link": "[testnote2](testnote2)",
  "lead": "# testnote2",
  "body": "# testnote2\n\nHere is a second test note. [[testnote3]] is related and [[testnote4]] aswell.",
  "snippets": [
    "# testnote2\n\nHere is a second test note. [[testnote3]] is related and [[testnote4]] aswell."
  ],
  "rawContent": "edited/removed to keep example short",
  "wordCount": 27,
  "tags": [
    "testnote2"
  ],
  "metadata": {
    "created": "2021-06-20",
    "language": "en",
    "tags": [
      "testnote2"
    ],
    "title": "testnote2",
    "visibility": "private"
  },
  "created": "2021-06-20T14:13:45.750153716Z",
  "modified": "2021-06-20T14:14:36.358996502Z",
  "checksum": "f3e54b0c4702663bc4e9f81c71567b04d5d1a4127c83eb2cf3a194299dc9c939",
  "links": ["testnote3", "testnote4"]
}
]

Any mustache parser (here I'm using this simple go cli tool) would make it easy to render a graph from this.

I have saved the above to a file notes.json and the following as graph.mustache:

digraph D {

{{#.}}
"{{title}}" [shape=box, label="{{title}}" ]
{{#links}}
"{{title}}" -> "{{.}}" [color="black", style=dashed]
{{/links}}
{{/.}}
}

Running cat notes.json | mustache graph.mustache | dot -Tpng > g.png produces the following image:

g

With a different template, any other graphing tool, e.g. d3 could be "supported" in the same way, without requiring any development on zk itself.
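For comparison, the same flat list can be turned into DOT without a template engine. This is only a sketch in Python of what the mustache template above does, assuming each note has a "title" and an optional "links" array of target titles as in the hand-edited example. `dot_quote` and `notes_to_dot` are hypothetical names, and the two sample notes are made up.

```python
def dot_quote(text: str) -> str:
    # In DOT quoted strings, the double quote is the character that must be escaped.
    return '"' + text.replace('"', '\\"') + '"'


def notes_to_dot(notes: list) -> str:
    """Render a list of notes (each with a title and optional links) as DOT."""
    lines = ["digraph D {"]
    for note in notes:
        title = dot_quote(note["title"])
        lines.append(f"{title} [shape=box, label={title}]")
        for target in note.get("links", []):
            lines.append(f'{title} -> {dot_quote(target)} [color="black", style=dashed]')
    lines.append("}")
    return "\n".join(lines)


# Made-up sample in the shape of the example output above.
sample = [
    {"title": "testnote1", "links": ["testnote2"]},
    {"title": "testnote2", "links": []},
]
print(notes_to_dot(sample))
```

The printed text can be piped to dot in the same way as the template output. Unlike the plain template, this version escapes quotes in titles.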

TL;DR

I would like to be able to do this:


zk list --format json --no-pager -m "testnote*" | mustache graph.mustache | dot -Tpng > picture.png

And get the image above. Sorry for the long comment, I hope you can find something useful in it. The only missing piece of the puzzle is adding "links": [ "title1", "title2" ] to the json you have implemented.

@mickael-menu
Member

Thanks for the food for thought, different use cases are always super helpful!

Maybe it would be wise to adhere to the do one thing, do it right principle here. I see more value in having a general JSON format representing the data, than some format specific to D3 or any other tool.

I fully agree, that's why I wanted to find a standard spec to represent graphs. D3's format is quite clean but you convinced me to use a custom format more semantically related to the note domain, although I will keep the overall structure of D3.

{
    "notes": [
        { "path": "a.md", "title": "A" },
        { "path": "b.md", "title": "B" }
    ],
    "links": [
        { "source": "a.md", "target": "b.md" }
    ]
}

If you look at JGF, Sigma or even DOT, they all follow the same structure:

  • A list of nodes/vertices with labels and IDs
  • A list of edges/links connecting two IDs

Keeping this structure would make it trivial to convert to other formats with jq or a template engine like mustache.

You could declare the actual graphing as a non-feature and provide an example of how to use external tools (like d3 or dot) to generate graphs.

I think interconnectedness is an important part of a healthy notebook and I want to have links and graphs as first-class citizens in zk. I think using zk list with your example is not sufficient because:

  • What about if you want to use different node labels than the note title?
  • We are missing some interesting metadata which could make the graph more readable, such as:
    • Popover on hover to show the link snippet.
    • Colored arrows to show the link relationship (e.g. proves, refutes, example).
  • It's not easy with the flat list structure to perform queries such as finding the backlinks of a given note or the shortest path between two notes.
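To make the last point concrete, here is a short Python sketch of both queries over the notes/links structure proposed above (with "source" and "target" keys). The sample graph is made up, and `backlinks` and `shortest_path` are hypothetical helpers, not zk features.

```python
from collections import defaultdict, deque

# Made-up graph in the proposed structure: a.md -> b.md -> c.md
graph = {
    "notes": [
        {"path": "a.md", "title": "A"},
        {"path": "b.md", "title": "B"},
        {"path": "c.md", "title": "C"},
    ],
    "links": [
        {"source": "a.md", "target": "b.md"},
        {"source": "b.md", "target": "c.md"},
    ],
}


def backlinks(graph: dict, path: str) -> list:
    """Paths of all notes that link to the given note."""
    return [link["source"] for link in graph["links"] if link["target"] == path]


def shortest_path(graph: dict, start: str, goal: str):
    """Breadth-first search along link direction; returns None if unreachable."""
    adjacency = defaultdict(list)
    for link in graph["links"]:
        adjacency[link["source"]].append(link["target"])
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in adjacency[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(backlinks(graph, "c.md"))
print(shortest_path(graph, "a.md", "c.md"))
```

With a separate edge list, both queries are a single pass or a standard traversal; with links nested inside each note, the edge list would have to be rebuilt first.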

Also, having a dedicated zk graph command introduces a namespace for specific flags. For example, an option to include or exclude outbound links to notes that do not match the given filtering options.

@pinpox
Author

pinpox commented Jun 20, 2021

I see your point, having a better or more widely used and common structure is definitely a plus!

  • What about if you want to use different node labels than the note title?
  • We are missing some interesting metadata which could make the graph more readable, such as:
    Popover on hover to show the link snippet.
    Colored arrows to show the link relationship (e.g. proves, refutes, example).

This is mostly why I put the links in the nodes themselves in my example. Assuming the link only has a "target", the source is always the note itself, which gives the template access to any of its fields. E.g. I used title in the example when doing this:

{{#links}}
"{{title}}" -> "{{.}}" [color="black", style=dashed]
{{/links}}

But I could have used any other field. Using the title is not very wise actually, as there could be multiple notes with the same title in a Zettelkasten that does not use the title as file name. I suppose the path could be considered a unique identifier and would be a better choice, since there is no ID as such in the output.
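A quick way to see whether titles are safe to use as node IDs in a given notebook is to count them. The Python below is a sketch over a made-up list of notes with "path" and "title" fields; `duplicate_titles` is a hypothetical helper.

```python
from collections import Counter


def duplicate_titles(notes: list) -> list:
    """Titles that occur on more than one note, sorted alphabetically."""
    counts = Counter(note["title"] for note in notes)
    return sorted(title for title, count in counts.items() if count > 1)


# Made-up notes: two different files share the title "Inbox".
notes = [
    {"path": "work/inbox.md", "title": "Inbox"},
    {"path": "home/inbox.md", "title": "Inbox"},
    {"path": "ideas.md", "title": "Ideas"},
]
print(duplicate_titles(notes))
```

Any title reported here would collapse several notes into one graph node, whereas the paths stay distinct.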

It's not easy with the flat list structure to perform queries such as finding the backlinks of a given note or the shortest path between two notes.

I agree this could be an issue. I'll look into JGF and Sigma in more detail, I'm not very familiar with how to save graphs as json yet. That being said, there is always the option of using jq, which would be able to transform the data in any way needed. I think as long as its performance is reasonable, theoretically any format that has the relevant data in it somewhere could be converted as needed with a jq query.

I personally would not mind having to use jq to transform the data to my personal needs. It's obviously your decision what format you want to use in the end. I maybe should rephrase my feature request as "add a way to print out the data as json, with all necessary fields needed to create arbitrary graphs with it".

Thanks for looking into this by the way!

@mickael-menu
Member

@pinpox I noticed you forked the project. If you want to test it out until I add zk graph, here's a patch of the changes I made to add links in zk list: links.patch.txt

@pinpox
Author

pinpox commented Jun 26, 2021

Hey, thanks for the patch!
I'm currently using a nix derivation that builds zk from a branch/repo of my choice to test it. My package, which installs zk 0.5.0 from GitHub, is at https://github.com/pinpox/nixos/blob/main/packages/zk/default.nix and I'll keep an eye on the development of this. If you want me to test anything in particular, let me know. The easiest way for me is to just use a feature branch as base for building the package.

@mickael-menu
Member

mickael-menu commented Jun 26, 2021

@pinpox You can try out the feature/graph branch. Still a work in progress but it should give you everything you need with:

zk graph --format json [filtering options as usual]

You can use the path, sourcePath and targetPath properties as node IDs. Only the links between two notes matching the given filtering options will be added, so you should not have any "dangling links".

Also, I removed the ./go script to build the project; you can use make or make install directly now. But it looks like you were not using it anyway in your nix package.

Please share any cool script you come up with to generate the actual graphs 👍

@pinpox
Author

pinpox commented Jun 26, 2021

Awesome! This makes it easy to generate graphs. Here is a working example:

File template.mustache:

digraph D {

{{#notes}}
"{{path}}" [shape=box, label="{{title}}" ]
{{/notes}}

{{#links}}
"{{sourcePath}}" -> "{{targetPath}}" [color="black", style=dashed]
{{/links}}
}

Then do:

zk graph --format json | mustache template.mustache | dot -Tpng > pic.png

You can also use neato, fdp, sfdp, twopi or circo instead of dot (see here for details) to alter the resulting layout.

Example using sfdp, which seems to look the best for my test data:
sfdp
Obviously the graphs can be styled as you want; the template is just a minimal example.

I'm working on an example to also show tags.

@mickael-menu
Member

mickael-menu commented Jun 27, 2021

Nice, thanks!

I think you should be able to adapt the code of the link you shared easily. Just need to expand the $data variable inside after converting the JSON to the D3.js format with jq.
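The conversion mentioned here is suggested with jq; as an alternative sketch in Python, the function below maps the zk graph JSON to the D3-style nodes/links shape shown earlier in the thread. It assumes the "path", "title", "sourcePath" and "targetPath" properties used in the template above. `to_d3` is a hypothetical helper and the sample data is made up.

```python
import json


def to_d3(graph: dict) -> dict:
    """Convert {"notes": [...], "links": [...]} into D3-style nodes and links."""
    return {
        # The note path is the node ID, the title becomes the display name.
        "nodes": [{"id": note["path"], "name": note["title"]} for note in graph["notes"]],
        "links": [
            {"source": link["sourcePath"], "target": link["targetPath"]}
            for link in graph["links"]
        ],
    }


# Made-up sample in the shape of the zk graph output discussed above.
sample = {
    "notes": [{"path": "a.md", "title": "A"}, {"path": "b.md", "title": "B"}],
    "links": [{"sourcePath": "a.md", "targetPath": "b.md"}],
}
print(json.dumps(to_d3(sample), indent=2))
```

In practice the input would come from the saved output of zk graph --format json rather than from an inline sample.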

Using sfdp is great for a smaller subset of notes:

zk graph --format json -m "editor" | mustache graph.mustache | sfdp -Tpng > pic.png

pic

But a JS solution would probably be better for larger graphs, like in Obsidian:

zk graph --format json | mustache graph.mustache | sfdp -Tpng > pic.png

pic

@pinpox
Author

pinpox commented Jun 27, 2021

Definitely! I'll try to get a working example with d3. One thing I noticed: there are "double links" when a note links multiple times to another. Is that intended/wanted? Of course it can be filtered with jq, but maybe the link edges should be unique?

@mickael-menu
Member

One thing I noticed: there are "double links" when a note links multiple times to another. Is that intended/wanted?

Yes because each link might have different associated metadata.
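If a renderer only needs one edge per pair of notes, the duplicates can be collapsed after the fact. The sketch below does this in Python rather than jq, assuming the "sourcePath" and "targetPath" properties; `unique_edges` and the added "count" key are hypothetical, and any per-link metadata is dropped.

```python
from collections import Counter


def unique_edges(links: list) -> list:
    """Collapse repeated source/target pairs into one edge with a count."""
    counts = Counter((link["sourcePath"], link["targetPath"]) for link in links)
    return [
        {"sourcePath": source, "targetPath": target, "count": count}
        for (source, target), count in counts.items()
    ]


# Made-up links: a.md links to b.md twice.
links = [
    {"sourcePath": "a.md", "targetPath": "b.md"},
    {"sourcePath": "a.md", "targetPath": "b.md"},
    {"sourcePath": "b.md", "targetPath": "c.md"},
]
print(unique_edges(links))
```

The count could be used to draw thicker lines for notes that reference each other often.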

@pinpox
Author

pinpox commented Jun 30, 2021

I got a first working example with d3:
image
I still want to improve it a bit and will then upload it to some public repository, but if you want to take a look in the meantime the files used are here:
https://gist.github.com/pinpox/17496ff61ad4012dccb238d8f3deca81

The json data is directly understood by the script, just run zk graph --format json > data/notes2.json and it should work. I filtered out broken links, as they mess with d3.
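One possible way to do such filtering before the data reaches d3 is sketched below in Python. It is not the code from the gist: `drop_broken_links` is a hypothetical helper that keeps only links whose "sourcePath" and "targetPath" both refer to a note present in the data, and the sample is made up.

```python
def drop_broken_links(graph: dict) -> dict:
    """Return a copy of the graph without links to or from unknown notes."""
    known = {note["path"] for note in graph["notes"]}
    links = [
        link
        for link in graph["links"]
        if link["sourcePath"] in known and link["targetPath"] in known
    ]
    return {**graph, "links": links}


# Made-up sample: the second link points at a note that is not in the list.
sample = {
    "notes": [{"path": "a.md"}, {"path": "b.md"}],
    "links": [
        {"sourcePath": "a.md", "targetPath": "b.md"},
        {"sourcePath": "a.md", "targetPath": "missing.md"},
    ],
}
print(drop_broken_links(sample)["links"])
```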

@mickael-menu
Member

Nice, thanks! Chrome gave me a CORS error, but I could test it easily by running python3 -m http.server in the folder.
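The error most likely arises because the page loads its JSON data with a request that browsers block for pages opened straight from disk; serving the folder over HTTP avoids that. As a sketch of what the one-liner does, the Python below starts the standard library's static file server for a directory and fetches one file from it. `make_server` and `fetch_once` are hypothetical helper names.

```python
import functools
import http.client
import http.server
import threading


def make_server(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Static file server for `directory`; port 0 lets the OS pick a free port."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def fetch_once(directory: str, name: str) -> bytes:
    """Serve `directory` briefly and fetch one file over HTTP, as a browser would."""
    server = make_server(directory)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        connection.request("GET", "/" + name)
        body = connection.getresponse().read()
        connection.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

For interactive use, `make_server(".", 8000).serve_forever()` behaves much like python3 -m http.server, except that it only listens on the local interface.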

On neuron they have been working on graphs with D3 as well: srid/neuron#589

We probably need some fine-tuning for the attraction settings. Or maybe use a much smaller font and zooming.
Screenshot 2021-07-01 at 11 32 40

@pinpox
Author

pinpox commented Jul 1, 2021

Yes, the attraction simulation is not optimal. Specifically this example would be a better fit:

When using D3’s force layout with a disjoint graph, you typically want the positioning forces (d3.forceX and d3.forceY) instead of the centering force (d3.forceCenter). The positioning forces, unlike the centering force, prevent detached subgraphs from escaping the viewport.

(Taken from this example: https://observablehq.com/@d3/disjoint-force-directed-graph?collection=@d3/d3-force )

I have tried modifying my code to use the example settings:

  const simulation = d3.forceSimulation(nodes)
      .force("link", d3.forceLink(links).id(d => d.id))
      .force("charge", d3.forceManyBody())
      .force("x", d3.forceX())
      .force("y", d3.forceY());

But I can't get it to work. It seems the different d3 versions don't work in exactly the same way, and I haven't been able to fix it. Using the example above should result in the nodes being stretched apart into a circle when disconnected. I'm sure it's not that complicated, but my JavaScript skills are not that good and I'm struggling to figure out how to convert these observablehq.com notebooks to plain JavaScript/HTML files.

There is another example here that also looks way better, but I have the same problem.
If you know how to use forceX() and forceY() instead of the forces used in my example, please let me know. I'll keep trying and report back if I get it to work. The second example also uses d3 color-legend, which seems to help with the visibility/overlapping of the labels by using a better positioning algorithm.

@mickael-menu
Member

I actually never worked with D3 before but I've seen it being used a lot for this use case. If you're able to make the font size much smaller that could be a good workaround.

@pinpox
Author

pinpox commented Jul 1, 2021

Small update. Got the correct forces working. The labels are still messy and overlapping, but I think I'm on the right track now. The simulation now prevents disconnected nodes from "flying away"; they all nicely arrange into a circle. I'll either add collision to the text labels too or use some placement algorithm; there seem to be a few options. Another, simpler way might be just making the links longer or the text smaller.

image

Here is the updated source

EDIT: updated force parameters and text size. Better now.

@pinpox
Author

pinpox commented Jul 10, 2021

@mickael-menu Is https://github.com/mickael-menu/zk/releases/tag/v0.6.0 compatible with the command above? The graph subcommand seems to be missing?

@mickael-menu
Member

Not yet, the graph feature is a significant change and I need more time to clean up the code. I merged main in feature/graph so it should be up to date with the recent changes.

@mickael-menu
Member

@pinpox I merged the feature in main. Let me know if everything is fine and I'll do a proper release.

If you used the old feature/graph branch, you might need to reset your database with rm .zk/notebook.db.

@pinpox
Author

pinpox commented Nov 14, 2021

@mickael-menu Looks good! I built from master and the command seems to produce correct JSON.

Maybe it would be a good idea to document an example of how to render a proper graph with it somewhere in the docs?

@mickael-menu
Member

That's a good idea. I didn't spend much time playing with this though, would you consider writing a Markdown page dedicated to the graph command?

@almereyda

almereyda commented Nov 16, 2021

As a late addition to the reference of the non-working zetteltools in the OP, the adjustments in joashxu/zetteltools@master...almereyda:master allowed me to render the graph of links between the notes in my Zettelkasten with it.

The Zettelkasten is a view on a Logseq Journal.
$ cat graph.sh
#!/usr/bin/env bash
set -m

# https://unix.stackexchange.com/questions/50179/what-happens-when-you-delete-a-hard-link
rm -rf ./zk/*.md

# https://unix.stackexchange.com/questions/40634/in-ubuntu-is-there-a-way-to-virtually-merge-two-folders-without-unionfs-or-aufs
cp -anl "$PWD/journals/"* zk/
cp -anl "$PWD/pages/"* zk/

zk -W zk graph --format json > graph/notes.json

http -q graph & xdg-open http://localhost:8000 && fg

http is a small webserver that works similarly to python -m http.server.

In the graph/ subdirectory we also find forked revisions of index.html and graph.js ¹ ² supplied by the pinpox gists above.

@pinpox
Author

pinpox commented Nov 17, 2021

@almereyda I'm struggling a bit to understand. Did you use the gists, your forked zetteltools repo or both?
Could you provide a complete minimal working example? I cloned your fork and running python3 zettvis.py ~/Notes/ gives no links and no titles on hover:

image

That's a good idea. I didn't spend much time playing with this though, would you consider writing a Markdown page dedicated to the graph command?

@mickael-menu I'd like to include a full working example in the documentation with the above if possible. The forces in zetteltools seem to make more sense, at least the example image in the upstream repo looks better.

@mickael-menu
Member

@mickael-menu I'd like to include a full working example in the documentation with the above if possible. The forces in zetteltools seem to make more sense, at least the example image in the upstream repo looks better.

Sure, feel free to open a PR on the repo 👍

@TSoli

TSoli commented Jul 16, 2024

@almereyda I'm struggling a bit to understand. Did you use the gists, your forked zetteltools repo or both? Could you provide a complete minimal working example? I cloned your fork and running python3 zettvis.py ~/Notes/ gives no links and no titles on hover:

image

That's a good idea. I didn't spend much time playing with this though, would you consider writing a Markdown page dedicated to the graph command?

@mickael-menu I'd like to include a full working example in the documentation with the above if possible. The forces in zetteltools seem to make more sense, at least the example image in the upstream repo looks better.

I have the same issue. With your gist it also seems not all the notes fit in the frame and I am not sure how to zoom. Did anyone figure out how to get the zetteltools version working or a nicer graph with zoom/pan?

4 participants