Ivy Bridge (2.3GHz i7 3615QM)
Christopher Celio edited this page Feb 28, 2014 · 6 revisions
- blue (unit-stride)
- green (cache-line stride)
- red (random stride)
The difference between random stride and cache-line stride demonstrates the power of Intel's prefetchers. A unit-stride pointer chase runs at full pipeline speed, even over an array that is sitting in off-chip DRAM. The cache-line stride is not much slower: about 6ns of latency per load, even though each load is dependent on the previous load.
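To make the three access patterns concrete, here is a minimal sketch of how such pointer-chase arrays could be built. It is not the benchmark's actual source: the function names (`init_strided`, `init_random`, `chase`) are hypothetical, the 16-element stride assumes 64-byte cache lines with 4-byte elements, and the random variant uses Sattolo's shuffle (with a simple linear congruential generator) as one convenient way to get a single cycle through the whole array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines and 4-byte elements. */
#define CACHE_LINE_WORDS 16

/* Each element stores the index of the next element to visit.
 * stride = 1 gives a unit-stride chase, stride = CACHE_LINE_WORDS
 * touches one element per cache line. */
void init_strided(uint32_t *arr, size_t n, size_t stride)
{
    assert(n > 0 && stride > 0);
    for (size_t i = 0; i < n; i++)
        arr[i] = (uint32_t)((i + stride) % n);
}

/* Random chase: Sattolo's shuffle turns the identity permutation into a
 * single cycle, so the chase visits every element before repeating. */
void init_random(uint32_t *arr, size_t n, uint32_t seed)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        arr[i] = (uint32_t)i;
    for (size_t i = n - 1; i > 0; i--) {
        seed = seed * 1664525u + 1013904223u;  /* small LCG, illustrative only */
        size_t j = (size_t)(seed >> 8) % i;    /* j in [0, i-1] */
        uint32_t tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}

/* The chase itself: every load address comes from the previous load,
 * so the loads cannot overlap and the loop exposes load-to-use latency. */
uint32_t chase(const uint32_t *arr, size_t steps)
{
    uint32_t idx = 0;
    for (size_t s = 0; s < steps; s++)
        idx = arr[idx];
    return idx;
}
```

Timing `chase` over a large step count and dividing the elapsed time by the number of steps gives a per-load latency. The return value should be consumed (printed or stored) so the compiler cannot discard the loop.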
- blue (1 thread)
- green (2 threads)
- red (4 threads)
- cyan (8 threads)
These graphs show both the per-thread request bandwidth and the aggregate bandwidth of Ivy Bridge. Each thread performs a number of independent pointer chases. As more threads are added, per-thread bandwidth drops. Also, fewer parallel requests can be handled at higher levels of the memory hierarchy.
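As a rough illustration of what "a number of independent pointer chases" per thread can look like, the sketch below advances several chains in lockstep inside one loop. Loads within a chain are still serialized, but loads from different chains do not depend on each other, so the hardware is free to keep several misses in flight. The names `chase_interleaved`, `request_bandwidth`, and `MAX_CHAINS` are hypothetical and the code is an assumption about the general technique, not the benchmark's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHAINS 16  /* illustrative upper bound on chains per thread */

/* Advance `chains` independent pointer chases through the same array,
 * one step each per iteration. starts[c] is the first index of chain c. */
uint32_t chase_interleaved(const uint32_t *arr, const uint32_t *starts,
                           size_t chains, size_t steps)
{
    uint32_t idx[MAX_CHAINS];
    assert(chains > 0 && chains <= MAX_CHAINS);

    for (size_t c = 0; c < chains; c++)
        idx[c] = starts[c];

    for (size_t s = 0; s < steps; s++)
        for (size_t c = 0; c < chains; c++)
            idx[c] = arr[idx[c]];   /* independent of the other chains */

    /* Combine the final indices so none of the chains is optimized away. */
    uint32_t sum = 0;
    for (size_t c = 0; c < chains; c++)
        sum += idx[c];
    return sum;
}

/* Requests per second issued by one thread over a measured interval. */
double request_bandwidth(size_t chains, size_t steps, double seconds)
{
    return (double)(chains * steps) / seconds;
}
```

Running one such loop per thread and summing the per-thread `request_bandwidth` values would give an aggregate figure comparable in spirit to the one plotted above.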
| | Add/Mul Mix | Adds Only | Add/Mul Double | SIMD Add |
|---|---|---|---|---|
| MFLops | 4310 | 3010 | 8645 | 5360 |