Optimisation of array/image/projdata algebra #1545
Note that the
Timings are reported to stdout as:
Some of the code is multi-threaded, in which case the wall-clock time should be lower than the CPU time (which sums time over all threads). The wall-clock time also includes "system time" and "waiting time" though, for instance time spent doing memory allocation. Presumably, that explains why the "create_proj_data_in_mem_no_init" wall-clock time is longer than the "init" one, since in the current code they do exactly the same thing. My guess is that at the first call the OS spends time allocating the memory, whereas a "delete" doesn't return the memory to the OS but keeps it for the process, so that allocation is a lot faster in subsequent calls. (This could be tested by swapping those 2 tests around, and by reading up on memory allocation!)
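To make the wall-clock versus CPU-time distinction concrete, here is a minimal sketch of how the "swap the tests around" experiment could be measured. It is not the STIR timing code: `Timing`, `time_call` and `allocate_fill_free` are hypothetical names made up for illustration, and the plain `float` buffer merely stands in for the projection data created in the tests.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <ctime>
#include <memory>

// Wall-clock and CPU time (in seconds) for one measured call.
struct Timing
{
  double wall_s;
  double cpu_s;
};

// Time an arbitrary callable with both clocks.
// std::chrono::steady_clock measures elapsed (wall-clock) time.
// std::clock() measures processor time used by the process; on POSIX systems
// this is summed over all threads, so it can exceed the wall-clock time.
template <class F>
Timing time_call(F&& f)
{
  const auto wall_start = std::chrono::steady_clock::now();
  const std::clock_t cpu_start = std::clock();
  f();
  const std::clock_t cpu_end = std::clock();
  const auto wall_end = std::chrono::steady_clock::now();
  return { std::chrono::duration<double>(wall_end - wall_start).count(),
           static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC };
}

// Hypothetical stand-in for one "create in memory" test: allocate a buffer,
// write to every element so that the memory is really touched, then free it
// again when the unique_ptr goes out of scope.
// Caveat: an optimising compiler may remove some of this work, so check the
// generated code or the numbers before drawing conclusions.
inline float allocate_fill_free(std::size_t n, float value)
{
  assert(n > 0);
  std::unique_ptr<float[]> buf(new float[n]);
  for (std::size_t i = 0; i < n; ++i)
    buf[i] = value;
  return buf[n - 1];
}
```

Calling `time_call` twice in a row on the same lambda, e.g. `time_call([] { allocate_fill_free(100000000, 1.0f); })`, and comparing the two `wall_s` values would show whether the first allocation is the slow one regardless of which test runs first. What actually happens depends on the allocator and the OS, so the outcome is something to measure, not assume.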
Currently, we have simple loops for numerical operations, e.g. STIR/src/include/stir/VectorWithOffset.inl, lines 692 to 693 in 2eb11a9.
A few Array operations were recently parallelised, e.g. STIR/src/include/stir/Array.inl, lines 335 to 341 in 2eb11a9.
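For readers without the repository at hand, the two styles look roughly like the sketch below. This is not the STIR code at the referenced lines: it uses `std::vector<float>` instead of the STIR classes, and the function names are hypothetical. The second version assumes OpenMP is enabled at compile time (e.g. `-fopenmp`); without it the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simple serial loop for an element-wise in-place addition: a[i] += b[i].
inline void add_in_place_serial(std::vector<float>& a, const std::vector<float>& b)
{
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] += b[i];
}

// The same operation with the loop iterations shared between OpenMP threads.
// Each iteration writes to a different element, so no synchronisation is needed.
// A signed loop index is used because older OpenMP versions require it.
inline void add_in_place_parallel(std::vector<float>& a, const std::vector<float>& b)
{
  assert(a.size() == b.size());
  const long long n = static_cast<long long>(a.size());
#pragma omp parallel for
  for (long long i = 0; i < n; ++i)
    a[static_cast<std::size_t>(i)] += b[static_cast<std::size_t>(i)];
}
```

For an operation this cheap per element, memory bandwidth is likely to limit the speed-up from threading, so timing both versions on realistic array sizes would be needed before deciding which operations are worth parallelising.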
There are multiple steps on this:
I thought I'd create this Discussion to get some ideas/experiences together.