Add flexible options to archetype graphs #19

Merged
joeloskarsson merged 24 commits into mllam:main from the archetype_changes branch
Sep 12, 2024

Conversation

joeloskarsson
Contributor

@joeloskarsson joeloskarsson commented Sep 4, 2024

Describe your changes

This PR is a collection of modifications and new options for the existing graph archetypes. The motivations for these changes are:

  1. To allow for more flexibility in graph creation.
  2. (enabled by 1.) Bring the graph archetypes closer to their original formulations in our 2023 paper / in neural-lam.

Specifically, the changes included in this PR are:

  1. Allow for specifying rel_max_dist when connecting graphs using the within_radius method. rel_max_dist specifies the radius relative to the maximum edge length in the mesh graph. For multi-scale and hierarchical graphs, this is the maximum edge length of the bottom-level intra-level edges.
  2. Split the refinement_factor argument into two: a grid_refinement_factor describing the refinement between the grid nodes and the bottom mesh level, and a level_refinement_factor describing the refinement between levels in the graph hierarchy. Naturally, flat graphs only have grid_refinement_factor.
  3. In hierarchical graphs, make sure that the grid is only connected to the bottom level of the hierarchy when g2m and m2g are created.
  4. Change default archetypes to use new options and match the graph creation from neural-lam.
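
As a rough illustration of item 1, the sketch below turns a relative distance into an absolute radius by scaling the longest mesh edge. The helper names (`max_edge_length`, `radius_from_rel_max_dist`) and the toy mesh are hypothetical and are not taken from the weather-model-graphs code; only the idea "radius = rel_max_dist times the maximum mesh edge length" comes from the description above.

```python
import math


def max_edge_length(positions, edges):
    """Longest Euclidean edge length, given node positions and (u, v) edge pairs."""
    return max(math.dist(positions[u], positions[v]) for u, v in edges)


def radius_from_rel_max_dist(positions, edges, rel_max_dist):
    """Absolute connection radius as a fraction of the longest mesh edge."""
    return rel_max_dist * max_edge_length(positions, edges)


# Toy mesh (an assumption for illustration): one square cell with unit
# spacing, including both diagonals as edges.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3), (1, 2)]

radius = radius_from_rel_max_dist(positions, edges, rel_max_dist=0.51)
print(f"within_radius radius: {radius:.4f}")
```

For a multi-scale or hierarchical graph, the same computation would be applied to the bottom-level intra-level edges only, as stated in item 1.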

Minor:

  1. Fix spelling of Oskarsson 😜
  2. Change docstrings to clarify that the archetypes are LAM graphs that are inspired by corresponding global graphs (Keisler, GraphCast).
  3. Add vim gitignore
  4. Add documentation and test for changes

Issue Link

Fixes #17 in archetype docstrings. Perhaps a similar clarification should be made in the actual documentation before closing #17? I'm not sure whether that is within the scope of this PR. Comment below with what you think.

(Somewhat unintentionally) this also seems to fix #12 😄 I think that is just because of the changes to the flat (Keisler) graph creation, which is a bit more robust now.

Type of change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

  • My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the documentation to cover introduced code changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have given the PR a name that clearly describes the change, written in imperative form (context).
  • I have requested a reviewer and an assignee (assignee is responsible for merging)

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section
    reflecting type of change (add section where missing):
    • added: when you have added new functionality
    • changed: when default behaviour of the code has been changed
    • fixes: when your contribution fixes a bug

Checklist for assignee

  • PR is up to date with the base branch
  • the tests pass
  • author has added an entry to the changelog (and designated the change as added, changed or fixed)
  • Once the PR is ready to be merged, squash commits and merge the PR.

@joeloskarsson joeloskarsson self-assigned this Sep 4, 2024
@joeloskarsson
Contributor Author

Example use of different refinement factors:

Grid-refinement=2, Level-refinement=3
[Figure: graph_g2_l3]

Grid-refinement=3, Level-refinement=2
[Figure: graph_g3_l2]
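
To give a feel for how the two factors interact, here is a minimal sketch that counts mesh nodes along one axis for each level. It assumes that a refinement factor is the ratio of node counts along an axis and that counts are rounded down; the function `mesh_nodes_per_level`, the grid size and the number of levels are hypothetical and not taken from the library, whose exact node placement may differ.

```python
def mesh_nodes_per_level(n_grid, grid_refinement_factor, level_refinement_factor, n_levels):
    """Approximate node count along one axis for each mesh level, bottom level first."""
    counts = []
    # The bottom mesh level is coarsened relative to the grid.
    n = n_grid // grid_refinement_factor
    for _ in range(n_levels):
        counts.append(n)
        # Each level above is coarsened relative to the level below it.
        n = n // level_refinement_factor
    return counts


# The two configurations from the comment above, on a hypothetical grid
# with 36 nodes along the axis and three mesh levels.
print(mesh_nodes_per_level(36, grid_refinement_factor=2, level_refinement_factor=3, n_levels=3))
print(mesh_nodes_per_level(36, grid_refinement_factor=3, level_refinement_factor=2, n_levels=3))
```

Under these assumptions the first configuration gives a denser bottom level that thins out quickly, while the second starts coarser and thins out more slowly.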

Contributor

@sadamov sadamov left a comment

Thanks @joeloskarsson for this PR, making the graph creation process even more modular and flexible.

  • Why is the default radius d=0.51? Is that obvious, or should we document it?

PS: Just used weather-model-graphs for the first time, loving the Docs notebooks @leifdenby 🫶

src/weather_model_graphs/create/archetype.py (outdated, resolved)
Comment on lines +80 to +81
g2m_connectivity="within_radius",
m2g_connectivity="nearest_neighbours",
Contributor

I understand that these are "historical" graphs, but what is the rationale behind having different connectivity in g2m and m2g? (Mostly asking for educational purposes; I think the code is fine.)

Contributor Author

I don't think I have ever read a thorough explanation; I have mostly used this for historical reasons as well. One motivation could be based on what we think g2m and m2g do. g2m is supposed to aggregate grid information up to the mesh. It could then be useful to include a large area around each mesh node as the "information aggregation window", and overlaps between these windows are not a problem. The purpose of m2g is to extract the information from the mesh to determine the final prediction in each grid node. At this point we expect this information to be localised to the closest mesh nodes, so in a sense m2g only performs a fancy interpolation between the closest mesh nodes. If we connect to the closest mesh nodes, we don't expect mesh nodes further away to contribute more information.
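
The difference between the two connectivity styles can be seen on a toy example. The sketch below is plain Python and does not use the weather-model-graphs API; the helper names, the 5x5 grid, the 3x3 mesh, the choice of k=4 and the assumption that the longest mesh edge is the cell diagonal are all illustrative.

```python
import math


def within_radius_edges(sources, targets, radius):
    """Connect each target to every source lying within `radius` of it."""
    return [
        (s, t)
        for t, t_pos in targets.items()
        for s, s_pos in sources.items()
        if math.dist(s_pos, t_pos) <= radius
    ]


def nearest_neighbour_edges(sources, targets, k):
    """Connect each target to its k closest sources."""
    edges = []
    for t, t_pos in targets.items():
        closest = sorted(sources, key=lambda s: math.dist(sources[s], t_pos))[:k]
        edges.extend((s, t) for s in closest)
    return edges


# Toy geometry: a 5x5 grid with unit spacing and a 3x3 mesh with spacing 2.
grid = {("g", x, y): (float(x), float(y)) for x in range(5) for y in range(5)}
mesh = {("m", x, y): (float(x), float(y)) for x in (0, 2, 4) for y in (0, 2, 4)}

# Assumption: the longest mesh edge is the diagonal of a mesh cell.
mesh_spacing = 2.0
radius = 0.51 * math.sqrt(2.0) * mesh_spacing

# g2m: each mesh node aggregates from all grid nodes inside its window.
g2m_edges = within_radius_edges(grid, mesh, radius)
# m2g: each grid node reads from a fixed number of closest mesh nodes.
m2g_edges = nearest_neighbour_edges(mesh, grid, k=4)

print(len(g2m_edges), "g2m edges,", len(m2g_edges), "m2g edges")
```

In this setup the aggregation windows overlap (a grid node between mesh nodes sends to several of them), whereas every grid node receives from exactly k mesh nodes regardless of geometry.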

src/weather_model_graphs/create/archetype.py (outdated, resolved)
src/weather_model_graphs/create/base.py (outdated, resolved)
graph : networkx.Graph
    Graph to check for levels
attr : str
    Attribute to split on existance of
Contributor

@sadamov sadamov Sep 5, 2024

I don't understand this sentence. Could you clarify: existence of what?

Contributor Author

Clarified this

Co-authored-by: sadamov <[email protected]>
@leifdenby
Member

Specifically, the changes included in this PR are:

1. Allow for specifying `rel_max_dist` when connecting graphs using `within_radius` method. `rel_max_dist` specifies the radius as a relative distance to the maximum edge length in the mesh graph. Specifically, for multi-scale and hierarchical graphs this is the maximum edge length of the bottom level intra-level edges.

2. Change the `refinement_factor` argument into two: a `grid_refinement_factor` describing the refinement between grid nodes and the bottom mesh level and a `level_refinement_factor` describing the refinement factor between levels in the graph hierarchy. Naturally flat graphs only have `grid_refinement_factor`.

3. In hierarchical graphs, make sure that the grid is only connected to the bottom level of the hierarchy when g2m and m2g are created.

4. Change default archetypes to use new options and match the graph creation from neural-lam.

Very happy with these changes @joeloskarsson! They are a great addition 🚀

@joeloskarsson
Contributor Author

@sadamov I added an explanation about why the g2m default radius is 0.51, with a link to this visualization: https://www.desmos.com/calculator/sqqz0ka4ho
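
The explanation itself is not repeated in this thread. One geometric reading, offered here only as an assumption, is that a radius of half the mesh cell diagonal is the smallest one for which circles around the mesh nodes cover every point of a square cell, so a value slightly above 0.5 leaves a small margin. The sketch below checks this numerically; `covers_cell` is a hypothetical helper, and it assumes a square mesh whose longest edge is the cell diagonal.

```python
import math


def covers_cell(rel_max_dist, spacing=1.0, samples=21):
    """Check whether sampled points of one square mesh cell all lie within
    the radius of at least one of the four corner mesh nodes."""
    # Assumption: the longest mesh edge is the diagonal of the cell.
    radius = rel_max_dist * math.sqrt(2.0) * spacing
    corners = [(0.0, 0.0), (spacing, 0.0), (0.0, spacing), (spacing, spacing)]
    for i in range(samples):
        for j in range(samples):
            point = (spacing * i / (samples - 1), spacing * j / (samples - 1))
            if min(math.dist(point, c) for c in corners) > radius:
                return False
    return True


# With an odd number of samples per axis, the cell centre (the point
# farthest from all corners) is among the sampled points.
print("rel_max_dist=0.49 covers the cell:", covers_cell(0.49))
print("rel_max_dist=0.51 covers the cell:", covers_cell(0.51))
```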

@joeloskarsson
Contributor Author

@TomasLandelius @leifdenby do you want to look over this more before we merge, or are you happy with Simon's review? It would be nice to get this merged so I can continue building on it (I will still build on this in new branches, but it will be easier to open PRs for those branches later if this is merged).

Member

@leifdenby leifdenby left a comment

This PR fixes all the mistakes I made with the archetype graphs implementations and adds so many small improvements too! Thank you again for doing this work ❤️ It was a joy to review

src/weather_model_graphs/create/archetype.py (resolved)
src/weather_model_graphs/networkx_utils.py (resolved)
src/weather_model_graphs/create/base.py (resolved)
src/weather_model_graphs/create/base.py (resolved)
@leifdenby
Member

The only missing element is an update to the changelog, then this is good to go in :)

@leifdenby
Member

I just noticed one more thing: it would be good to clear the output of the example notebooks. For example, https://github.com/joeloskarsson/weather-model-graphs/blob/archetype_changes/docs/creating_the_graph.ipynb is > 10 MB in your PR. I should work out how to clear the content with pre-commit before the notebooks are committed. The rendered content is created automatically when the gh-action builds the documentation site and pushes to the https://mllam.github.io/weather-model-graphs site
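
For reference, clearing notebook output amounts to emptying the `outputs` list and resetting `execution_count` on every code cell of the notebook JSON. The sketch below does this with the standard library only; the function names and the in-memory demo notebook are illustrative, and this is not the pre-commit setup used by the project.

```python
import json


def clear_notebook_outputs(notebook):
    """Remove outputs and execution counts from all code cells (in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path):
    """Load an .ipynb file, clear its outputs and write it back."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_notebook_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")


# Small in-memory notebook to show the effect without touching the disk.
demo = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
cleared = clear_notebook_outputs(demo)
print(json.dumps(cleared["cells"][1], indent=1))
```

Dedicated tools such as nbstripout exist for doing this automatically on commit, which is likely a better fit than a hand-rolled script.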

@TomasLandelius

OK with me.

@joeloskarsson
Contributor Author

Happy that you like these changes!

Yes, I wasn't entirely sure how to work with the documentation. When I originally committed the notebooks with output, the diff became so large that it was completely unreadable. I first thought that was quite a big downside to having docs in Jupyter notebooks, but I see now that once you clear the output the diff actually becomes as readable as a diff for any other code 😄 And it makes a lot of sense to not commit the notebook output.

I've now also updated the changelog, so I will go ahead and merge this.

@joeloskarsson joeloskarsson merged commit b468ed5 into mllam:main Sep 12, 2024
3 checks passed
@joeloskarsson joeloskarsson deleted the archetype_changes branch September 12, 2024 12:15
@leifdenby leifdenby added this to the v0.2.0 milestone Nov 20, 2024
Development

Successfully merging this pull request may close these issues.

  • Clarify LAM/global scope and relation to existing works
  • No mesh nodes returned for odd dimensions
4 participants