Use standard NumPy random number generators in metric spaces to limit RAM usage #179

Open
rhugonnet opened this issue Apr 28, 2024 · 0 comments


rhugonnet commented Apr 28, 2024

Right now, NumPy's standard random number generators (rng = np.random.default_rng(seed=)) do not work when passed to ProbabilisticMetricSpace or RasterMetricSpace. Only the legacy ones do (the equivalent of np.random.seed(), now defined as np.random.RandomState), but they are probably not that useful in our case, since we don't need to exactly reproduce random sampling from old scripts. The legacy generators also use a lot of memory for a random choice without replacement, which is exactly what we use: numpy/numpy#14169.

So for instance, if we only want to use 10,000 samples from 1 billion for the variogram estimation, the legacy version will still create an array of 1 billion points in the background using tons of RAM 😅.
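
For illustration, here is a minimal sketch of the kind of sampling that the newer Generator API allows. The helper name sample_indices is hypothetical and not part of the library. It assumes that Generator.choice with replace=False does not materialize the full population when the sample is small relative to it, which is my understanding of current NumPy behaviour.

```python
import numpy as np


def sample_indices(n_population, n_samples, seed=None):
    """Hypothetical helper: draw unique indices from range(n_population).

    Uses the newer Generator API (np.random.default_rng) instead of the
    legacy np.random.RandomState.
    """
    rng = np.random.default_rng(seed)
    # replace=False gives unique indices; shuffle=False skips the final
    # shuffling of the drawn indices, which is not needed to pick a subsample.
    return rng.choice(n_population, size=n_samples, replace=False, shuffle=False)


# 10,000 unique indices out of 1 billion candidates
idx = sample_indices(1_000_000_000, 10_000, seed=42)
print(idx.shape, np.unique(idx).size)
```

The same call on a legacy RandomState would, as described above, build the full population array first, so this is only meant to show why switching generators should help.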

Will try to fix this at the same time as #178!
