
Implemented MinkowskiKNNEdges #630

Merged
merged 7 commits into from
Jan 12, 2024

Conversation

AMHermansen
Collaborator

Implementation of MinkowskiKNNEdges.

Current implementation is written purely in python, and has a larger than necessary memory overhead. Current memory consumption is O(n^2), where n is the largest number of pulses in a single event. A custom implementation in cpp/cuda would reduce this to O(n*k) where k is number of edges.
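For illustration, a minimal pure-Python sketch of the O(n^2)-memory pairwise approach described above (all names, the signature, and the use of a signed interval are my assumptions, not the actual graphnet implementation):

```python
def pairwise_minkowski_distance(coords, space_coords=None, time_coord=3, c=1.0):
    """Signed pairwise Minkowski intervals between all pulses.

    `coords` is a list of [x, y, z, t] rows. Materializing the full n x n
    matrix is the O(n^2) memory cost mentioned above; a cpp/cuda kernel
    could keep only the k best candidates per node, giving O(n*k).
    """
    space_coords = space_coords or [0, 1, 2]
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]  # the full n x n matrix
    for i in range(n):
        for j in range(n):
            spatial = sum((coords[i][k] - coords[j][k]) ** 2 for k in space_coords)
            temporal = (c * (coords[i][time_coord] - coords[j][time_coord])) ** 2
            dist[i][j] = spatial - temporal  # >0 space-like, <0 time-like
    return dist
```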

The code is currently working, but I'll try to construct some sort of unit test to verify it.

AMHermansen and others added 2 commits November 15, 2023 17:31
There was a bug where source/target was swapped.
@RasmusOrsoe
Collaborator

@AMHermansen I think this could be a nice addition to the repo!

It would be fantastic if you could add a unit test that asserts that the code works, and some kind of quantification of computation speed for future reference (see attached plot as an example).
[attached benchmark plot]

The overhead of measuring the execution speed as shown above is quite small, and I'd be happy to share the code via slack that I used to make the plot.

@AMHermansen
Collaborator Author

> @AMHermansen I think this could be a nice addition to the repo!
>
> It would be fantastic if you could add a unit test that asserts that the code works, and some kind of quantification of computation speed for future reference (see attached plot as an example).
>
> The overhead of measuring the execution speed as shown above is quite small, and I'd be happy to share the code via slack that I used to make the plot.

If my memory of algorithmic complexity serves me right, the time complexity should be O(B*n^2), where B is the number of batches and n is the largest number of pulses in a batch, but I'll make a concrete benchmark that shows speeds on the hardware that I'm using.
(It didn't feel any slower than the current KNNEdges, but this was done during CPU preprocessing, which I have never experienced being a bottleneck during training.)

Do you have a suggestion for values of k and n that I should try? (I believe only n should matter, but I can probably test both.) And should I also benchmark on CPU vs. GPU?

I'm assuming that randomly generated data suffices for the computational benchmark?

@RasmusOrsoe
Collaborator

> Do you have a suggestion for values of k and n that I should try? (I believe only n should matter, but I can probably test both.) And should I also benchmark on CPU vs. GPU?

I think it will be sufficient to probe k = [4, 8, 16] and n in a range similar to the x-axis of the plot above, so ~10 to ~10,000. The idea here is not to probe the phase space completely, but to give a rough idea of how the execution time scales with the number of pulses and k.
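A rough timing harness for such a sweep could look like this (a hypothetical sketch on random data, not the actual script used for the plot above; `build_edges` stands in for any edge-definition callable):

```python
import random
import time

def time_edge_definition(build_edges, n, k, repeats=5):
    """Average wall-clock time of one edge-definition call on random pulses.

    `build_edges` is any callable taking (coords, k). Sweep k in [4, 8, 16]
    and n from ~10 to ~10,000 to reproduce the suggested probe range.
    """
    coords = [[random.random() for _ in range(4)] for _ in range(n)]
    build_edges(coords, k)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        build_edges(coords, k)
    return (time.perf_counter() - start) / repeats
```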

> I'm assuming that randomly generated data suffices for the computational benchmark?

Yes, that is perfectly fine.

@AMHermansen
Collaborator Author

Hello @RasmusOrsoe
I've now done the benchmark, and the normal KNN is definitely faster. I've also run a training script with this MKNN, and there the entire training pipeline didn't seem to be much slower. (My usual experience with CPU load during runs is that I/O and the forward/backward pass on the GPU are much more dominant factors for overall training time.)

For a run where I was cropping graphs to have at most 256 nodes, I didn't notice any particular slowdown from this definition, so the increased computation time seems to be acceptable at least up to 256 nodes.

I would imagine that a custom cpp/cuda implementation would reach speeds similar to the knn benchmark. Should we continue with this implementation, or do you find the slowdown to be too great?
[attached plot: edge definition benchmark]

@RasmusOrsoe
Collaborator

@AMHermansen Thank you for this very thorough test! I think the execution times here are well within the margins of acceptance - great job! I don't think it's necessary to optimize this further.

AMHermansen marked this pull request as ready for review December 12, 2023 20:19
@AMHermansen
Collaborator Author

@RasmusOrsoe Could you have a look at this PR? I'm still not entirely sure that the row/col order is correct. I think the easiest way for you to verify this is in the unit test, where I include expected, which contains how I expect the edge_index to look. (I.e., verify that I haven't ordered rows/cols the wrong way.)

RasmusOrsoe (Collaborator) left a comment


Hey @AMHermansen,

Sorry for the wait; I had overlooked your request for review.

It looks to me like the format of the edge_index is correct and has the right dtype. Generally, the format should be [2, n_edges]. I took a look at your expected edge indices and they look correct to me!
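For reference, a toy example of that format (hypothetical values, 3 fully connected nodes, so k = 2 neighbours each):

```python
# edge_index has shape [2, n_edges]: row 0 holds the source node indices,
# row 1 the corresponding target node indices (dtype int64 when a torch
# tensor is used; plain lists here to keep the sketch dependency-free).
edge_index = [
    [0, 0, 1, 1, 2, 2],  # sources
    [1, 2, 0, 2, 0, 1],  # targets
]
n_edges = len(edge_index[0])
```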

I've left a few superficial comments.


Returns: Matrix of shape (n, m) of all pairwise Minkowski distances.
"""
space_coords = space_coords or [0, 1, 2]
Collaborator

Is there a particular reason why you don't let space_coords default to [0,1,2] instead?

Collaborator Author

If I understand your question right, you're asking why I have space_coords=None in the function definition and then later do space_coords = space_coords or [0, 1, 2].

The reason is that I generally avoid having mutable objects as default parameters, since IMO Python treats mutable defaults in a non-intuitive way, and it can create some hard-to-debug bugs.

An example from the Python Documentation would be:

def f(a, L=[]):
    L.append(a)
    return L

print(f(1))  # prints [1]
print(f(2))  # prints [1, 2]
print(f(3))  # prints [1, 2, 3]

The above behavior is related to how Python treats function definitions. More information can be found in Section 4.8.1 and the related warning in the link above.
The same behavior is what was fixed in #633.
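The usual fix, and the pattern used in this PR, is a None default resolved inside the function body. A minimal version of the documentation example with that fix applied:

```python
def f(a, L=None):
    if L is None:  # a fresh list is created on every call
        L = []
    L.append(a)
    return L

print(f(1))  # prints [1]
print(f(2))  # prints [2]
```

Note that the `L = L or []` spelling behaves the same when L is None, but it also replaces an explicitly passed empty list; `is None` is the stricter check.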

self.nb_nearest_neighbours = nb_nearest_neighbours
self.c = c
self.time_like_weight = time_like_weight
self.space_coords = space_coords or [0, 1, 2]
Collaborator

See comment above

AMHermansen merged commit f8d88b8 into graphnet-team:main Jan 12, 2024
12 checks passed