You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There is a subtle incorrectness in random_walk_fastest. It does not return the first position 0, compared to other two implementations. The following code demonstrates this:
This may look as a minor distinction, but may cause bugs. There are two ways to remove this inconsistency: exclude initial 0 position from first two functions (trivial) or prepend it to the last function. There are such options for the latter path:
Convert steps to numpy list after cumsum and concat with [0]
Concatenate np.array([0])
Allocate empty array in advance, and use out parameter of cumsum
My implementation of these options:
N=10_000# Option 1: Convert `steps` to python list after `cumsum` and concat with `[0]`defrandom_walk_option1(n=N):
steps=np.random.choice([-1,+1], n)
steps=np.cumsum(steps)
return [0] +steps.tolist()
# Option 2.1: Concatenate `np.array([0])`defrandom_walk_option2_1(n=N):
steps=np.random.choice([-1,+1], n)
steps=np.cumsum(steps)
returnnp.concatenate((np.array([0]), steps))
# Option 2.2: Concatenate `[0]`defrandom_walk_option2_2(n=N):
steps=np.random.choice([-1,+1], n)
steps=np.cumsum(steps)
returnnp.concatenate(([0], steps))
# Option 3: Allocate empty array in advance, and use `out` parameter of `cumsum`defrandom_walk_option3_1(n=N):
walk=np.empty(n+1)
walk[0] =0steps=np.random.choice([-1,+1], n)
np.cumsum(steps, out=walk[1:])
returnwalkdefrandom_walk_option3_2(n=N):
walk=np.zeros(n+1)
steps=np.random.choice([-1,+1], n)
np.cumsum(steps, out=walk[1:])
returnwalk
While option_2_1 is slightly better, the benchmark is not stable. Maybe it is because the difference in the number of operations is up to the constant, the difference is not significant?
Fixing it in a book may be nasty/non-trivial because will break the simplicity of implementation and add more confusion for the reader :)
The text was updated successfully, but these errors were encountered:
Thanks for the report (you're totally right) and all the proposed fixes!
I think Option 1 is the most readable and can go inside the book. Can you make a PR?
There is a subtle incorrectness in
random_walk_fastest
. It does not return the first position0
, compared to other two implementations. The following code demonstrates this:This may look as a minor distinction, but may cause bugs. There are two ways to remove this inconsistency: exclude initial
0
position from first two functions (trivial) or prepend it to the last function. There are such options for the latter path:steps
to numpy list aftercumsum
and concat with[0]
np.array([0])
out
parameter ofcumsum
My implementation of these options:
I also benchmarked them to see the difference: https://colab.research.google.com/drive/19OJdrD4SZLk4ug6OI6DC6wq3XwHQPAgr?usp=sharing
While
option_2_1
is slightly better, the benchmark is not stable. Maybe it is because the difference in the number of operations is up to the constant, the difference is not significant?Fixing it in a book may be nasty/non-trivial because will break the simplicity of implementation and add more confusion for the reader :)
The text was updated successfully, but these errors were encountered: