-
I've come across some strange behaviour while experimenting with some simple machine learning code. Running the non-present version any number of times, I see the expected results. I surmise that the GPU card itself is somehow retaining state between program executions.
Replies: 6 comments
-
The only state ArrayFire retains across sessions is the kernel binaries saved to disk, which avoids recompiling on the target machine at every application startup. This caching started with the recent 3.7.2 release, and the cached binaries don't hard-code any runtime information. In every session (i.e. each time the ArrayFire library is loaded into memory) it starts afresh; the only global state populated on startup is the information about which devices are accessible. I don't think even vendor drivers retain any application-specific state other than cached kernel binaries, which they keep for speed. I know NVIDIA does this; I'm not sure about AMD, but they probably cache kernel binaries too. Either way, none of these store any app-specific info. From what you describe, this looks like a seed-related issue, but I can't be certain without seeing reproducible code, preferably a standalone example.
-
Would it be acceptable to you if I uploaded the code to GitHub? It consists of a slightly modified version of a small machine learning library written in Rust. I haven't found a way to boil the problem down yet.
(In reply to pradeep's comment of 23/07/2020: https://github.com/arrayfire/arrayfire-rust/issues/231#issuecomment-662846833)
-
I suspected the Rust rand-0.7.3 / OS interaction, but discounted it as less likely than the GPU harbouring state, given how prevalent rand is in the Rust ecosystem.
When you say 'seed related issue', bear in mind that I'm talking about two independent program executables that don't create or retain anything on disk. I see no other way for them to influence each other than through OS state (the /dev/random infrastructure, perhaps) or hardware. And one is most definitely influencing the other.
-
I can't speak for drivers with 100% certainty, but ArrayFire for sure doesn't keep any global state that persists across sessions. A short code snippet that reproduces the issue would be great. As you said in the original description, if it is related to random number generation, we have only a couple of random number generation functions. Have you tried using those functions in standalone code, as a bottom-up approach? That is, gradually add your application logic (the code surrounding the random number generation) to this standalone program to find which addition breaks the output consistency. That should help narrow down the problem too.
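A minimal sketch of that bottom-up approach, using only the Rust standard library. Everything here is hypothetical and for illustration only: `SplitMix64` is a stand-in for a seeded random source (it is not ArrayFire's generator), the stage functions stand in for application logic, and `add_hidden_state` deliberately simulates state carried over between runs. The idea is to rerun each growing prefix of the pipeline with the same seed and report the first stage whose addition makes the output differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical stand-in for a seeded random source (SplitMix64).
// In a real test this role would be played by the library's own RNG calls.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    // Uniform value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

// One step of application logic, added to the pipeline one at a time.
type Stage = fn(&mut Vec<f64>, &mut SplitMix64);

// Stage 0: fill the buffer from the seeded generator.
fn init_weights(data: &mut Vec<f64>, rng: &mut SplitMix64) {
    data.extend((0..8).map(|_| rng.next_f64()));
}

// Stage 1: a purely deterministic transformation.
fn scale(data: &mut Vec<f64>, _rng: &mut SplitMix64) {
    for x in data.iter_mut() {
        *x *= 0.5;
    }
}

// Simulated hidden state that survives from one run to the next.
static HIDDEN_COUNTER: AtomicU64 = AtomicU64::new(0);

// Stage 2: deliberately broken, its result depends on earlier runs.
fn add_hidden_state(data: &mut Vec<f64>, _rng: &mut SplitMix64) {
    let carried = HIDDEN_COUNTER.fetch_add(1, Ordering::SeqCst) as f64;
    for x in data.iter_mut() {
        *x += carried;
    }
}

// Order-sensitive fingerprint of the exact bit patterns (FNV-style mixing).
fn fingerprint(data: &[f64]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for x in data {
        h ^= x.to_bits();
        h = h.wrapping_mul(0x0000_0100_0000_01B3);
    }
    h
}

// Run the given stages from a fresh, explicitly seeded generator.
fn run_pipeline(stages: &[Stage], seed: u64) -> u64 {
    let mut rng = SplitMix64::new(seed);
    let mut data = Vec::new();
    for stage in stages {
        stage(&mut data, &mut rng);
    }
    fingerprint(&data)
}

// Grow the pipeline one stage at a time; return the index of the first
// stage whose addition makes two same-seed runs disagree.
fn first_inconsistent_stage(stages: &[Stage], seed: u64) -> Option<usize> {
    (1..=stages.len())
        .find(|&n| run_pipeline(&stages[..n], seed) != run_pipeline(&stages[..n], seed))
        .map(|n| n - 1)
}

fn main() {
    let stages: [Stage; 3] = [init_weights, scale, add_hidden_state];
    match first_inconsistent_stage(&stages, 42) {
        Some(i) => println!("output consistency breaks at stage {}", i),
        None => println!("all stages are reproducible"),
    }
}
```

With the three sample stages, the check flags index 2, the stage that reads the simulated hidden state. For the case described in this thread, where two separate executables appear to affect each other, the same comparison could be made by printing the fingerprint from each program and comparing it across process runs instead of within one process.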
-
@progman1 Have you figured out what the problem is? If it is something related to system setup, please share your workaround; it would be helpful for any future users who face a similar issue. Thank you.
-
FYI: if by chance this has anything to do with arrayfire/arrayfire#2980, a couple of randu- and randn-related issues are being handled in that upstream PR. Closing due to inactivity.