Memory overflow for large instance #6

Open · yetinam opened this issue Sep 5, 2024 · 5 comments
Labels: enhancement (New feature or request)

yetinam commented Sep 5, 2024

Hi @dttrugman ,

I'm struggling with the memory consumption of GrowClust3D for a large instance. My input contains about 350,000 events and 21 million differential travel times. I've tried running on a machine with 375 GB of memory, but unfortunately the computation runs out of memory. The memory consumption grows slowly over time, i.e., the code only crashed after processing 2.6 million pairs.

From a theoretical standpoint, I'm not sure why the memory consumption of GrowClust3D should grow over time, so I had a look into the code. My suspicion is that the memory explosion comes from the dictionary cid2pairD1, which maps each cluster to the indices of all pairs originating from that cluster. In each merge, the entries of the two clusters are combined and stored under one of the cluster keys. However, the entry for the now merged, and thereby defunct, cluster is not cleared. I assume that over time the dictionary gets very large because the indices of many pairs end up stored under many clusters. As far as I understand, the entry for the defunct cluster could be cleared after the merge because it will never be requested again. However, I don't understand the code well enough (and don't really know how to use Julia), so I wanted to ask for your opinion on the memory issue and its potential cause.
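
To make the suspected pattern concrete, here is a rough sketch of the kind of fix I have in mind. This is not the actual GrowClust3D code; only cid2pairD1 is a real name from the source, and my Julia may well be off:

```julia
# Rough sketch of the suspected merge pattern, not the actual GrowClust3D code.
# cid2pairD1 maps a cluster id to the indices of all pairs originating from it.
cid2pairD1 = Dict{Int,Vector{Int}}(1 => [10, 11], 2 => [12, 13])

# Merge cluster `drop` into cluster `keep`:
keep, drop = 1, 2
append!(cid2pairD1[keep], cid2pairD1[drop])

# Without this line, the defunct cluster keeps its pair-index list alive in the
# dictionary forever, so the memory held by cid2pairD1 grows with every merge:
delete!(cid2pairD1, drop)
```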

dttrugman (Owner) commented:

Hi @yetinam,

Thanks for letting me know about this issue. Let me look into this problem and see what I can do; it's been a while since I've studied some of these lower-level functions that may be causing the bottleneck.

Daniel

dttrugman added the enhancement (New feature or request) label on Sep 6, 2024
dttrugman (Owner) commented:

OK, I took a look and indeed, those two dictionaries were not being cleared. If this was the issue, it should now be fixed in the latest update to the code.

yetinam commented Sep 6, 2024

Thanks! I'll update my version and give it a try next week.

yetinam commented Sep 11, 2024

I got around to testing the new version, but I'm still running into memory trouble. I got the code (in the old version) to run successfully with 750 GB of memory, but the new version still crashes with 375 GB. I don't really have a good setup to measure the exact memory consumption, though, so I can't tell whether the fix reduced the memory usage at all.
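
For reference, the best I could probably do next time is a crude check from within Julia along these lines (Sys.maxrss and Base.summarysize are standard Julia calls; where exactly to hook this into the GrowClust3D run script is just my guess):

```julia
# Peak resident set size of the Julia process so far, in GB
println("max RSS: ", round(Sys.maxrss() / 1e9; digits=1), " GB")

# Approximate in-memory size of the suspect dictionary, in GB
println("cid2pairD1: ", round(Base.summarysize(cid2pairD1) / 1e9; digits=1), " GB")
```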

dttrugman (Owner) commented:

Thanks for checking. I'm not sure where the bottleneck is at this point, but I'll keep looking for areas of improvement.
