Kayla (1.4 GHz ARM Cortex A9)
Christopher Celio edited this page Feb 28, 2014 · 6 revisions
Kayla has four ARMv7 Cortex A9 cores. These are narrow out-of-order processors running at 1.4 GHz.
Plot legend:
- blue (unit-stride)
- green (cache-line stride)
- red (random stride)

Observations:
- Kayla lacks prefetchers (cache-line-stride and random-stride performance is identical)
- the L1 data cache is 32 kB, with an access latency of 2.85 ns (4 cycles at 1.4 GHz)
- the L2 cache is 1 MB, with an access latency of ~20 ns
- off-chip DRAM access latency is ~145 ns
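The latencies above are easier to compare once expressed in core cycles. The following is a minimal sketch of that conversion, using only the 1.4 GHz clock and the nanosecond figures quoted in the list; the helper name `ns_to_cycles` is made up for illustration.

```python
CLOCK_GHZ = 1.4  # core clock quoted on this page

def ns_to_cycles(latency_ns, clock_ghz=CLOCK_GHZ):
    """Convert a latency in nanoseconds to core clock cycles."""
    # 1 GHz is one cycle per nanosecond, so cycles = ns * GHz.
    return latency_ns * clock_ghz

# Latencies as listed above (the L2 and DRAM values are approximate).
latencies_ns = {"L1": 2.85, "L2": 20.0, "DRAM": 145.0}
cycles = {level: ns_to_cycles(ns) for level, ns in latencies_ns.items()}

for level, count in cycles.items():
    print(f"{level}: {latencies_ns[level]} ns is about {count:.0f} cycles")
```

Under these numbers the L1 hit costs about 4 cycles, the L2 about 28, and DRAM roughly 200.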
Performance becomes muddled around 512 kB to 1 MB. According to the ARM Cortex A9 manual, the TLB has at most 128 entries, which with 4 kB pages gives a TLB reach of 512 kB. A single core accessing more than 512 kB will therefore see at least some performance degradation from TLB misses.
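The TLB-reach figure can be checked with a few lines of arithmetic. This sketch assumes 4 kB pages (the page size is not stated on this page) and uses the 128-entry upper bound quoted from the manual; the function names are illustrative only.

```python
TLB_ENTRIES = 128      # upper bound quoted from the Cortex A9 manual
PAGE_SIZE = 4 * 1024   # assumption: 4 kB pages

def tlb_reach(entries=TLB_ENTRIES, page_size=PAGE_SIZE):
    """Bytes of memory that can be mapped by the TLB at once."""
    return entries * page_size

def pages_needed(working_set, page_size=PAGE_SIZE):
    """Number of pages a contiguous working set spans (ceiling division)."""
    return (working_set + page_size - 1) // page_size

for kb in (256, 512, 1024):
    pages = pages_needed(kb * 1024)
    fits = pages <= TLB_ENTRIES
    print(f"{kb:5d} kB -> {pages:4d} pages, fits in TLB: {fits}")
```

A 1 MB array spans 256 pages under this assumption, twice what a 128-entry TLB can hold, which is consistent with the degradation starting past 512 kB.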
These graphs show both the per-thread request bandwidth and the aggregate bandwidth of the Cortex A9. Each thread performs a number of independent pointer chases. As more threads are added, per-thread bandwidth drops.
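For readers unfamiliar with pointer chasing, the sketch below shows how the three access patterns from the legend could be laid out as a ring in which each load depends on the previous one. It is an illustration only, not the benchmark's source: the names `visiting_order`, `build_chain`, and `chase` are hypothetical, and the default of 8 elements per cache line assumes 32-byte lines with 4-byte elements. Python demonstrates the access order here; it will not reproduce the measured latencies.

```python
import random

def visiting_order(num_elements, pattern, elements_per_line=8, seed=0):
    """Return the order in which a chase visits the array elements."""
    if pattern == "unit":
        return list(range(num_elements))
    if pattern == "cache_line":
        # Touch one element per cache line, then the next offset in each line.
        return [i for offset in range(elements_per_line)
                for i in range(offset, num_elements, elements_per_line)]
    if pattern == "random":
        order = list(range(num_elements))
        random.Random(seed).shuffle(order)
        return order
    raise ValueError(f"unknown pattern: {pattern}")

def build_chain(order):
    """Link the elements into one ring: next_index[i] is visited after i."""
    next_index = [0] * len(order)
    for here, there in zip(order, order[1:] + order[:1]):
        next_index[here] = there
    return next_index

def chase(next_index, start, steps):
    """Follow the ring; each lookup depends on the previous result."""
    i = start
    for _ in range(steps):
        i = next_index[i]
    return i

chain = build_chain(visiting_order(64, "cache_line"))
print(chase(chain, 0, 64))  # a full lap returns to the starting element
```

Several independent chases would each walk their own ring, so their loads can overlap while the loads within one chase cannot.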
Performance is comparable to, if a little worse than, Rocket.
| | Add/Mul Mix | Adds Only |
|---|---|---|
| MFLops | 788.9 | 955.7 |