-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nx-cugraph: add triangles and clustering algorithms #4093
Conversation
Note that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, just a couple of questions/feedback.
] | ||
|
||
|
||
@networkx_algorithm(plc="triangle_count", version_added="24.02") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subjective feedback: I remember we discussed this but FWIW this is still a bit surprising. I have to take a beat and remember that these params are metadata and not important to functionality when reading the code. I suspect this could result in another issue like this someday since it's not self-documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it help to rename plc
to _plc
to signal that this parameter doesn't matter if you don't know what it's for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to use version_added
to update the docstring, but I haven't fully baked that yet.
We could also make plc
a code comment, which is a less useful, but also potentially less confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, a leading underscore could help emphasize these are "internal" params. I was mainly bringing it up for awareness, and this can (and probably should, unless you'd actually prefer to address it here) be done in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh, so we have a reason to merge all open nx-cugraph PRs ;)
return 0 | ||
degree = int((G.src_indices == nodes).sum()) | ||
return 2 * numer / (degree * (degree - 1)) | ||
# What about self-edges? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This leaves me curious; what happens with self-edges? I'm guessing the degree value for a node with a self-edge might be artificially high, but I hate to guess. Is this a limitation we need to address (perhaps as a FIXME) and/or something we need to include in extra_docstrings
? Maybe at a minimum can you update the comment to elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkX is underspecified. There are no tests with self-edges. Perhaps we could add a test to match NetworkX, or we could try to upstream tests to NetworkX. The latter will require the NetworkX team being intentional about what is supposed to happen in this case. Should we ignore self-edges or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to ignore selfloops and comparison tests added.
@@ -26,6 +26,6 @@ def is_bipartite(G): | |||
G = _to_graph(G) | |||
# Counting triangles may not be the fastest way to do this, but it is simple. | |||
node_ids, triangles, is_single_node = _triangles( | |||
G, None, symmetrize="union" if G.is_directed else None | |||
G, None, symmetrize="union" if G.is_directed() else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this discovered by the new tests? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the tests.
/merge |
As discussed here: #4093 (comment) Authors: - Erik Welch (https://github.com/eriknw) Approvers: - Rick Ratzel (https://github.com/rlratzel) URL: #4106
NetworkX tests are somewhat underspecified regarding how to handle self-loops for these algorithms. Also, I'm not sure if transitivity is supposed to work on directed graphs.
Once #4071 is merged, it should be easy to add
is_bipartite
function (and maybe others?).