add script to run hash algorithm benchmark #336
Signed-off-by: Spencer Schrock <[email protected]>
As far as hashing is concerned, bytes are bytes. By generating our own bytes, we avoid I/O associated with reading models from disk. While we could read the model into memory, recreating the filesystem seems complicated. Signed-off-by: Spencer Schrock <[email protected]>
Thank you, it looks great. I have a few comments / discussion starters to make this useful both for humans and machines (plotting, comparing between runs)
benchmarks/exp_hash.py
    hasher = _get_hasher(algorithm)

    def hash(hasher=hasher, size=size):
        hasher.update(data[:size])
Should we reinitialize the hasher too under the measured scope? We can make _get_hasher return just the constructor and call it here.
Since this is in the innermost loop, it's always a new hasher. I don't think we need to reset anything.
But each call of hash from timeit just hashes the data; it doesn't measure the time it takes to init the hasher.
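To make the distinction in this thread concrete, here is a minimal sketch of the two ways the timed scope could be drawn. It is not the code from benchmarks/exp_hash.py: the function names, the use of `hashlib.sha256`, and the zero-filled buffer are all assumptions chosen for illustration.

```python
import hashlib
import timeit

# Hypothetical in-memory buffer (1 MiB of zero bytes); the real script's
# data generation is not shown in this conversation.
data = bytes(1024 * 1024)


def time_update_only(size, number=10):
    """Time only hasher.update(); the hasher is built outside the timed call."""
    hasher = hashlib.sha256()  # constructed once, not measured

    def run():
        # The same hasher keeps absorbing data on every repetition.
        hasher.update(data[:size])

    return timeit.timeit(run, number=number)


def time_init_and_update(size, number=10):
    """Time construction, update, and digest together on every repetition."""

    def run():
        hasher = hashlib.sha256()  # constructed inside the measured scope
        hasher.update(data[:size])
        hasher.digest()

    return timeit.timeit(run, number=number)


if __name__ == "__main__":
    for size in (1024, 1024 * 1024):
        print(size, time_update_only(size), time_init_and_update(size))
```

For large inputs the two measurements would be expected to converge, since construction cost is fixed; for very small inputs the second variant would include a proportionally larger setup overhead.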
Summary
Builds upon the work in #306 and starts to define individual experiments. This one is aimed specifically at hashing algorithms.
As far as hashing is concerned, bytes are bytes. By generating our own bytes, we avoid I/O associated with reading models from disk. While we could read actual models into memory, recreating the filesystem seems unnecessary for this benchmark.
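As a rough illustration of the "bytes are bytes" idea, the sketch below generates a buffer in memory and feeds it to a hasher with no disk access. The helper names, the use of `os.urandom`, and the default `sha256` algorithm are assumptions for illustration and may differ from what the script in this PR does.

```python
import hashlib
import os


def generate_data(size):
    """Return `size` random bytes generated in memory (hypothetical helper)."""
    return os.urandom(size)


def hash_bytes(data, algorithm="sha256"):
    """Hash a bytes object with a hashlib algorithm and return the hex digest."""
    hasher = hashlib.new(algorithm)
    hasher.update(data)
    return hasher.hexdigest()


if __name__ == "__main__":
    payload = generate_data(1024)
    print(hash_bytes(payload))
```

Because the input never touches the filesystem, any timing wrapped around `hash_bytes` would reflect the hashing work rather than read latency.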
Release Note
NONE
Documentation
NONE