I'll go for the easy one first: "Executor" is capitalized in the docs because it's this: https://docs.python.org/3/library/concurrent.futures.html#executor-objects

There's a way to make Sphinx docs cross-reference, but I've never set it up. Actually, anything that duck-types with this kind of Executor will work: it has to have a

Probably the most common Executor one might use for I/O is the ThreadPoolExecutor, since the ProcessPoolExecutor would have to shuttle data through a socket after it reads it, which effectively means twice the I/O.

I did some early experiments with ThreadPoolExecutors (see the plot under the Uproot 3 documentation for this feature). The only cases where you get much benefit from it are those in which decompression is a computationally intensive task, such as the LZMA in that example. Decompression routines take large blocks of data by pointer into compiled code that releases the GIL. If, instead of decompression, it's
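To make the duck-typing point concrete, here is a minimal sketch using only the standard library. It assumes the relevant interface is the `concurrent.futures` one: a `submit(task, *args)` method that returns an object with a `result()` method. The post does not spell out exactly what Uproot requires, so treat that as an assumption. `InlineExecutor` and `decompress_all` are hypothetical names made up for this illustration, and `lzma.decompress` stands in for the kind of GIL-releasing decompression work described above.

```python
import lzma
from concurrent.futures import Future, ThreadPoolExecutor


class InlineExecutor:
    """Hypothetical duck-typed executor: runs each task immediately in the calling thread."""

    def submit(self, task, *args, **kwargs):
        # Return an already-completed Future so callers can still use .result().
        future = Future()
        try:
            future.set_result(task(*args, **kwargs))
        except Exception as err:
            future.set_exception(err)
        return future

    def shutdown(self, wait=True):
        # Nothing to clean up: no threads or processes were started.
        pass


def decompress_all(executor, blocks):
    """Submit one decompression task per block and collect the results in order."""
    futures = [executor.submit(lzma.decompress, block) for block in blocks]
    return [future.result() for future in futures]


# Illustrative data only: eight compressible 100 kB payloads.
payloads = [bytes([i]) * 100_000 for i in range(8)]
blocks = [lzma.compress(payload) for payload in payloads]

# The same calling code works with either kind of executor.
inline_result = decompress_all(InlineExecutor(), blocks)
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded_result = decompress_all(pool, blocks)
```

Both executors give identical results. Whether the thread pool is actually faster depends on how much of the work runs in compiled code with the GIL released, which is the point made above about LZMA.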
As for passing down an interpretation, yes it is possible. I wasn't sure that I retained this feature from Uproot 3 because it's so hard to describe, and I guess I didn't document it for that reason, but if you pass a dict as

Here's an example that reads

>>> import uproot, skhep_testdata
>>> tree = uproot.open(skhep_testdata.data_path("uproot-Zmumu.root"))["events"]
>>> tree.arrays(["Px", "Py", "Pz"])
>>> tree.arrays(
... {"px1": uproot.AsDtype(">f8"), "py1": uproot.AsDtype(">i8"), "pz1": uproot.AsDtype("<f8")},
... entry_stop=1,
... ).tolist()
[{'px1': -41.1952876442, 'py1': 4625600239601887419, 'pz1': -8.641303535866137e-247}]

If some branches should use default interpretations, passing

>>> tree.arrays({"px1 + py1": uproot.AsDtype(">f8")})
...
uproot.exceptions.KeyInFileError: not found: 'px1 + py1'
Available keys: 'px1', 'py1', 'pz1', 'pt1', 'px2', 'py2', 'E1', 'phi1', 'Q1', 'pz2', 'pt2', 'M', 'Type',
'eta1', 'E2', 'phi2', 'Q2', 'Run', 'eta2', 'Event'
in file /home/jpivarski/.local/skhepdata/uproot-Zmumu.root
in object /events;1

Oh, that's nice. If using a dict, the keys must be branch names; computed expressions are not allowed. That's good.

This is a difficult feature to explain, including why you'd ever even want it; it would make the documentation even denser than it is. But when you need it, you need it, so I'm glad you asked in a Discussion.
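To see why forcing the "wrong" interpretation onto a branch gives output like the huge integer and the tiny float in the example above, it may help to reinterpret a single 8-byte value by hand. This is a rough sketch with the standard-library `struct` module, not Uproot itself. The number `1.5` is an arbitrary illustrative value, not one read from the file, and the format codes `">d"`, `">q"`, and `"<d"` play the roles of the `">f8"`, `">i8"`, and `"<f8"` dtypes.

```python
import struct

value = 1.5  # arbitrary illustrative number, not taken from the ROOT file

# Eight bytes as they would be stored for a big-endian float64.
raw = struct.pack(">d", value)

# Same bytes, three different interpretations.
as_be_float = struct.unpack(">d", raw)[0]  # matching interpretation: recovers the value
as_be_int = struct.unpack(">q", raw)[0]    # big-endian int64: the raw bit pattern as an integer
as_le_float = struct.unpack("<d", raw)[0]  # byte-swapped float64: an unrelated number

print(as_be_float, as_be_int, as_le_float)
```

Only the interpretation that matches how the bytes were written recovers the original number; the other two are valid reads of the same bytes that simply mean something else.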
This is related to an older discussion [1], funnily from one of my colleagues ;)

I ran into the same issue while porting code from uproot3 to uproot4. In uproot3 we used the .lazyarray() implementation to iterate through very large branches and were able to pass our custom interpretations where uproot3 could not figure those out. uproot4 is also failing for a couple of branches but does a much better job in general. Unfortunately, for one of our largest branches it still fails, so we need to help out with a custom interpretation.

As you mentioned in the discussion[2], the .iterate() implementation in uproot4 is fairly complicated and at that time it did not support custom interpretations. I found in the current docs[2] that there is an interpretation_executor parameter:

I looked at the code but could not figure out how to create such an executor. I found TrivialExecutor but not much documentation, so I thought I'd ask first before I spend hours hacking, only to figure out that it's a dead end ;)

https://github.com/scikit-hep/uproot4/blob/53c5a99e3f5e428769542e140b1d0c76a4997fa9/src/uproot/source/futures.py#L58

Anyway, I thought that this was something new and that there is now a mechanism to hook into the interpretation, but then I saw via git blame that interpretation_executor was already present when the discussion[2] was ongoing ;) which might be this aforementioned "dead end".

So my question is: is there a preferred way to teach uproot upstream how to interpret something (via the class name) so that it automagically knows which interpretation to pick when accessing data? I was going to dive into the sources, but the mechanics are different from uproot3 and I am sure it's much more effective if you point me in the right direction.

Below is the full session with one of our publicly available test files, which shows all the streamers and also demonstrates that the custom interpretation passed to .array() is working nicely.

In this particular case, the streamer interpretation of KM3NETDAQ::JDAQSummaryFrame is missing, but uproot4 parsed correctly that the corresponding branch is a vector<KM3NETDAQ::JDAQSummaryFrame>.

[1] #354
[2] https://uproot.readthedocs.io/en/latest/uproot.behaviors.TBranch.TBranch.html?highlight=iterate#iterate