Allow for randomized sampling #181
Added tool_random_mode and tool_periodic_mode to identify whether the tool uses periodic sampling, random sampling, or possibly a combination of both (e.g., every 20th timestep, gather data with 50% probability).
Fixing to use float rather than int for sampling probability
Sample output of two independent runs showing sampling of Kernel_logger. The skip rate of the sampler is set to 0 (every Kokkos kernel invocation is profiled/logged). The sampler probability, the new environment variable and feature in this PR, is set to 1.0%, which means that each kernel invocation has a 1% chance of being logged, independent of any other kernel invocation. The two runs produce different numbers of samples, showing that the sampling is random and non-deterministic. The number of samples is roughly right: stream issues 600 sets of 4 kernel invocations, and half of them, i.e., 600*4/2 = 1200 kernels, will be logged at most given the skip rate of 1; applying the 1% probability to 1200 gives 12. The sampler outputs on the order of 12 logs in the output below. More runs might show that the sampler sometimes outputs something like 14 or 15 samples.
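The arithmetic above can be checked with a small sketch. The helper names below are hypothetical and not part of this PR; the sketch assumes that a skip rate of `skip` means every `(skip + 1)`-th invocation is a candidate for logging, and that each candidate is logged independently, so the sample count follows a binomial distribution.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the PR): number of invocations that reach the
// probability check, assuming every (skip + 1)-th invocation is a candidate.
std::uint64_t candidate_invocations(std::uint64_t total, std::uint64_t skip) {
  // A skip of max uint64_t is treated as "never sample" to avoid overflow.
  if (skip == std::numeric_limits<std::uint64_t>::max()) return 0;
  return total / (skip + 1);
}

// Expected number of logged kernels: n * p, with p given in percent.
double expected_samples(std::uint64_t candidates, double prob_percent) {
  assert(prob_percent >= 0.0 && prob_percent <= 100.0);
  return static_cast<double>(candidates) * prob_percent / 100.0;
}

// Standard deviation of the binomial sample count: sqrt(n * p * (1 - p)).
double samples_stddev(std::uint64_t candidates, double prob_percent) {
  assert(prob_percent >= 0.0 && prob_percent <= 100.0);
  const double p = prob_percent / 100.0;
  return std::sqrt(static_cast<double>(candidates) * p * (1.0 - p));
}
```

With 600*4 = 2400 invocations and a skip rate of 1, `candidate_invocations` gives 1200 and `expected_samples(1200, 1.0)` gives 12. The standard deviation is about 3.4, so runs that log 14 or 15 kernels are within one standard deviation of the expected count.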
These two outputs are from two different runs, both with no skipping of logging/profiling, unlike the previous case where every other kernel invocation was logged/profiled (changing
Run 1
Run 2
One note on the previous runs: the device ID is wrong. That is a matter for another PR.
static uint64_t kernelSampleSkip =
    101;  // Default skip rate of every 100 invocations
static float tool_prob_num =
    1.0;  // Default probability of 1 percent of all invocations
Set the defaults to the max of uint64_t for kernelSampleSkip and -1 for tool_prob_num.
        "sampling probability to 0 percent; none of the invocations of "
        "a Kokkos Kernel will be profiled.\n");
    tool_prob_num = 0.0;
  }
If both tool_prob_num < 0 and kernelSampleSkip is the max of uint64_t, set tool_prob_num to 10.0.
Add an error check that both of them are not set at the same time.
The tool_prob_num default has been set to the requested default. The kernelSampleSkip default is part of a new PR that focuses just on the correct matching of sampled kernels.
Maximum uint64_t for kernelSampleSkip and -1.0 for tool_prob_num.
In this case, only use the probability that was set. Note: an alternative is to exit gracefully. Feedback welcome here.
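The default handling discussed in this thread could look roughly like the sketch below. It is not the PR's code: `resolve_config`, `SamplerConfig`, and `SamplingMode` are hypothetical names, and the sketch assumes that "this case" means both values were set by the user, in which case only the probability is used.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sentinel defaults suggested in review: "unset" markers for each setting.
constexpr std::uint64_t kSkipUnset = std::numeric_limits<std::uint64_t>::max();
constexpr float kProbUnset = -1.0f;

enum class SamplingMode { Periodic, Random };

// Hypothetical result type (not from the PR).
struct SamplerConfig {
  SamplingMode mode;
  std::uint64_t skip;   // kSkipUnset when periodic sampling is not used
  float prob_percent;   // kProbUnset when random sampling is not used
};

// Resolve the effective configuration from the two raw settings.
SamplerConfig resolve_config(std::uint64_t skip, float prob_percent) {
  assert(prob_percent <= 100.0f);
  const bool skip_set = (skip != kSkipUnset);
  const bool prob_set = (prob_percent >= 0.0f);
  if (!skip_set && !prob_set) {
    // Neither was set: fall back to a probability of 10 percent.
    return {SamplingMode::Random, kSkipUnset, 10.0f};
  }
  if (prob_set) {
    // Probability set (alone or together with skip): use only the probability.
    return {SamplingMode::Random, kSkipUnset, prob_percent};
  }
  // Only the skip rate was set: periodic sampling.
  return {SamplingMode::Periodic, skip, kProbUnset};
}
```

The alternative mentioned above, exiting gracefully when both are set, would replace the second branch with an error message and an early exit.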
Fix #180.
The sampler will allow the user to use either periodic sampling or random sampling via an environment variable or kokkos-tools-args.
The solution should allow for possibly a combination of both (e.g., every 20th invocation of a Kokkos::parallel_for, gather time spent with probability 63%).
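A minimal sketch of the combined mode described above, assuming a periodic gate followed by an independent random gate. `CombinedSampler` and its members are hypothetical names for illustration and are not the Kokkos Tools sampler API.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical sampler (not from Kokkos Tools): every `period`-th invocation
// is a candidate, and each candidate is profiled with probability
// `prob_percent` percent, independently of all other candidates.
class CombinedSampler {
 public:
  CombinedSampler(std::uint64_t period, double prob_percent, std::uint64_t seed)
      : period_(checked_period(period)),
        accept_(to_fraction(prob_percent)),
        rng_(seed) {}

  // Call once per kernel invocation; returns true if it should be profiled.
  bool should_sample() {
    ++count_;
    if (count_ % period_ != 0) return false;  // periodic gate
    return accept_(rng_);                     // random gate
  }

 private:
  static std::uint64_t checked_period(std::uint64_t period) {
    assert(period >= 1);
    return period;
  }
  static double to_fraction(double prob_percent) {
    assert(prob_percent >= 0.0 && prob_percent <= 100.0);
    return prob_percent / 100.0;
  }

  std::uint64_t period_;
  std::bernoulli_distribution accept_;
  std::mt19937_64 rng_;
  std::uint64_t count_ = 0;
};
```

For the example in the issue, `CombinedSampler sampler(20, 63.0, seed)` would consider every 20th invocation and profile each of those with probability 63%. Setting the period to 1 gives purely random sampling, and setting the probability to 100 gives purely periodic sampling.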