The slowness of variable-length lists nested more than one level deep is a known issue. The problem is that this data is serialized in a different (non-columnar) way, so the NumPy tricks that work for simpler, columnar data do not apply here.

Here's where I talked about this problem at CHEP 2019 (Figure 3 is based on a hacked performance study): https://arxiv.org/abs/2001.06307 and this year: https://arxiv.org/abs/2102.13516 (Figure 1 uses AwkwardForth, a new formalism that is yet to be integrated into Uproot).

Uproot 3 could read these data faster because it delayed deserialization: it made Awkward 0 ObjectArrays, breaking the Awkward Array formalism so that you couldn't slice them like other arrays. Having a different public interface because of an internal difference in ROOT serialization was itself a problem: I ended up having to explain and apologize for this interface difference a lot. In Awkward 1, it was a goal to have all arrays behave the same way, regardless of where they came from. However, that means that Uproot 4 must deserialize the data eagerly.

Since your code selects one element, Uproot 3 deserializes only that one, whereas Uproot 4 has to deserialize everything to give you that one. That's why the speed is different.

I'm working on file-writing at the moment, but integrating AwkwardForth is what will let Uproot 4 deserialize these objects as fast as ROOT (Figure 1 of that new paper). AwkwardForth is a few times slower than compiled, optimized C++, but many times faster than pure Python and about as fast as the data transfer from RAM to CPU (so computation is not the primary bottleneck, especially if there's any decompression involved).
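To make the eager-versus-delayed distinction concrete, here is a minimal pure-Python sketch. It does not use ROOT's actual byte layout or any Uproot internals: the toy format (a count followed by length-prefixed lists of floats) and the names `serialize`, `deserialize_all`, `LazyEvents`, and `eager_events` are all assumptions made up for illustration. It only shows why a length-prefixed, non-columnar layout has to be walked sequentially, and how decoding on access touches just the requested entry.

```python
import struct

def serialize(event):
    """Toy non-columnar layout (NOT ROOT's real format): an outer count,
    then each inner list as its own count followed by its floats."""
    parts = [struct.pack(">i", len(event))]
    for inner in event:
        parts.append(struct.pack(">i", len(inner)))
        parts.append(struct.pack(f">{len(inner)}f", *inner))
    return b"".join(parts)

def deserialize_all(buf):
    """Walk the bytes sequentially: each inner list's position is only
    known after reading the lengths of all lists before it."""
    (n_outer,) = struct.unpack_from(">i", buf, 0)
    pos = 4
    event = []
    for _ in range(n_outer):
        (n_inner,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        event.append(list(struct.unpack_from(f">{n_inner}f", buf, pos)))
        pos += 4 * n_inner
    return event

class LazyEvents:
    """Hypothetical delayed deserialization: keep raw bytes per entry
    and decode an entry only when it is indexed."""
    def __init__(self, buffers):
        self._buffers = buffers
        self.decoded = 0  # counts how many entries were actually decoded

    def __getitem__(self, index):
        self.decoded += 1
        return deserialize_all(self._buffers[index])

def eager_events(buffers):
    """Hypothetical eager deserialization: decode every entry up front."""
    return [deserialize_all(buf) for buf in buffers]

# Three toy entries, each a list of variable-length lists of floats.
events = [[[1.0, 2.0], [3.0]], [[4.5], [5.5, 6.5, 7.5]], [[8.0]]]
buffers = [serialize(event) for event in events]

lazy = LazyEvents(buffers)
one_entry = lazy[1]             # decodes only entry 1
everything = eager_events(buffers)  # decodes all three entries
```

In this sketch, selecting one entry from `lazy` decodes one buffer, while the eager path decodes every buffer before any selection can happen. The price of the lazy container is that it is not an ordinary array and cannot be sliced like one.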
-
Hello,
I'm new to uproot, and either I don't know how to use it optimally or I've found a significant inefficiency.
In my TTree there are many branches, but almost all of the data is concentrated in a vector<trace>, where trace is a class with 6 vectors inside. ROOT's TTree essentially splits it into 6 branches of vector<vector>. This is not what TTree was designed for, and ROOT unfortunately needs to read the whole vector at once (see https://root-forum.cern.ch/t/the-optimal-way-to-store-variable-tracks-count-in-a-ttree if interested), but with uproot I've stumbled upon a very significant slowdown.
The TTree has 1001 events. The vectors are variable-length, but in every entry their dimensions are roughly 180*1000. The TTree is 4.5 GB, almost all of which is those vectors. Compression is off.
Reading just 1 of those 6 vector<vector> branches in a single entry takes ROOT roughly 0.07 s. Uproot3 takes 3 s, Uproot4 takes 1 s.
The code is (where event=500):
root:
entries = t.Draw("traces.SimSignal_X", "", "goff", 1, event-1)
uproot3:
energy_root = t.array("traces.SimSignal_X")[event-1]
uproot4:
energy_root = t["traces.SimSignal_X"].array(entry_start=event-1, entry_stop=event)
However, when I try to read 3 of the vector<vector> branches and do some multiplication, it takes ROOT around 2 s and uproot3 9 s, and with uproot4 it takes so long that I gave up waiting. The code is:
root:
entries = t.Draw("traces[100].SimSignal_X[150]*traces[100].SimSignal_Y[150]*traces[100].SimSignal_Z[150]*traces[10].SimSignal_X[150]*traces[50].SimSignal_Z[150]", "", "goff")
uproot3:
uproot4:
In the end I gave up even on a less demanding example, which also seemed to run endlessly:
energy_root1 = t["traces.SimSignal_X"].array()[:,0,150]
The TTree printout for the relevant part is:
Am I doing something wrong, especially comparing uproot3 to uproot4, or did I stumble upon some uproot problem? I will be grateful for any advice.
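For reference, the slice `[:,0,150]` in the last example means "for every entry, take trace 0 and sample 150". Below is a minimal pure-Python sketch of the same selection done over bounded entry ranges rather than over the whole branch at once. The function `select_in_chunks` and the `read_entries(start, stop)` callable are hypothetical names; with Uproot 4, `read_entries` could presumably wrap `t["traces.SimSignal_X"].array(entry_start=start, entry_stop=stop)` from the single-entry example above. This does not make deserialization itself any faster. It only limits how much is held in memory at once and makes progress observable.

```python
def select_in_chunks(read_entries, num_entries, chunk_size, trace_index, sample_index):
    """Apply the equivalent of [:, trace_index, sample_index] one entry range at a time.

    read_entries(start, stop) is a hypothetical callable that returns the
    entries in [start, stop) as nested sequences (entry -> trace -> sample).
    """
    selected = []
    for start in range(0, num_entries, chunk_size):
        stop = min(start + chunk_size, num_entries)
        chunk = read_entries(start, stop)
        # Same selection as chunk[:, trace_index, sample_index] on a jagged array.
        selected.extend(event[trace_index][sample_index] for event in chunk)
    return selected

# Toy stand-in for a branch: 5 entries, 2 traces each, 3 samples per trace,
# with sample value 10 * entry + sample so results are easy to check.
toy_events = [[[float(10 * e + s) for s in range(3)] for _ in range(2)] for e in range(5)]
calls = []

def toy_read_entries(start, stop):
    """Hypothetical reader that records which entry ranges were requested."""
    calls.append((start, stop))
    return toy_events[start:stop]

result = select_in_chunks(toy_read_entries, len(toy_events), 2, 0, 1)
```

Like the original slice, this assumes every entry has at least `trace_index + 1` traces and every selected trace has at least `sample_index + 1` samples; otherwise the indexing raises an error.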