We noticed that the graph for random projection execution times is very interesting.
For some reason, when n=1553 the execution is orders of magnitude slower than when n=1550: in our case, going from less than half a second to 6.5 seconds.
It seems that this affects literally everyone except one group, the fastest one.
I have two questions: why is this the case, and is it intentional to test that value?
Many thanks.
This behavior occurs pretty much every year (though in different puzzles, for different
reasons), but I think you are the first person to ever actually ask why. When there were
orals, I would often ask why every submission spikes by orders of magnitude at some
values, and often people had never noticed it or thought about it. So kudos for asking
the question.
For the first question: it is not intentional to test that value. The only thing
I make sure of is that I'm not testing "nice" values, so I try to avoid too many multiples
of 32 or 64 in the scale factors. My assumption (based on experience) is that we will hit
"interesting" values eventually.
As to why: I try to avoid looking at implementations while the exercise is running, so I'm not sure, but I'm guessing (I genuinely have no idea - everything from this point is pure speculation just from looking at the graph) that most implementations are running on the GPU to the
right-hand side of about 800 (for the blah results). GPUs have certain architectural
properties that encourage certain batch sizes (this is implicitly covered in one of the
lectures, though you need to think through the implications). They also have a memory
interface, which means there is a certain preferred access granularity there too. The choice
of work-group size is going to determine how your iteration space gets mapped
onto these hardware features. So if certain scale factors map badly to the underlying
hardware, is there anything you can do to make the iteration space map better, while
keeping the same scale?
(Note that this is all speculation, but is not deliberately intended to mislead - I may
just be wrong in my guesses)
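To make the padding idea concrete, here is a minimal sketch of one way an awkward scale factor such as n=1553 can be mapped onto a hardware-friendly iteration space. It is written in CUDA purely for illustration (the coursework may well use a different API), and every name in it (projectKernel, TILE, roundUp) is made up for this example; it is not the reference solution, just a demonstration of launching over a padded grid and masking the extra threads.

```cuda
// Hypothetical sketch: pad an awkward problem size (e.g. n = 1553) up to a
// multiple of a warp-friendly tile size, launch over the padded iteration
// space, and mask off the threads that fall outside the real data.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int TILE = 32;  // threads per block; a warp-aligned tile size

// Round x up to the next multiple of m (e.g. 1553 -> 1568 when m = 32).
__host__ __device__ constexpr int roundUp(int x, int m) {
    return ((x + m - 1) / m) * m;
}

// Toy "projection" kernel: y = A * x for an n x n matrix stored with a
// padded leading dimension ldA, so every row starts on an aligned boundary.
__global__ void projectKernel(const float* A, const float* x, float* y,
                              int n, int ldA) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;            // mask threads in the padded region
    float acc = 0.0f;
    for (int col = 0; col < n; ++col)
        acc += A[row * ldA + col] * x[col];
    y[row] = acc;
}

int main() {
    int n   = 1553;                  // the "awkward" scale factor
    int ldA = roundUp(n, TILE);      // padded leading dimension: 1568

    float *A, *x, *y;
    cudaMallocManaged(&A, sizeof(float) * ldA * n);
    cudaMallocManaged(&x, sizeof(float) * n);
    cudaMallocManaged(&y, sizeof(float) * n);
    for (int i = 0; i < n; ++i) {    // trivial fill so the result is checkable
        x[i] = 1.0f;
        for (int j = 0; j < n; ++j) A[i * ldA + j] = 1.0f / n;
    }

    // Launch over the padded iteration space so every block is full.
    int blocks = roundUp(n, TILE) / TILE;
    projectKernel<<<blocks, TILE>>>(A, x, y, n, ldA);
    cudaDeviceSynchronize();

    printf("y[0] = %f (expect ~1.0)\n", y[0]);
    cudaFree(A); cudaFree(x); cudaFree(y);
    return 0;
}
```

The same idea applies under other names elsewhere (work-group size and global work size in OpenCL): the launch geometry and the storage layout are padded to something the hardware likes, while the computed result still corresponds to the original scale.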