Sensitivity of step_size parameter #926
Replies: 1 comment 1 reply
Dear Jim, uprooters et al.,

I am running an uproot program which reads in .root trees, conducts an analysis, and writes trees into output .root files. The program uses `utils.make_chunk_events` from `UprootFramework`. Each of the ntuples being read contains a varying number of .root files, which amount to varying sizes per ntuple (from ~0.5 GB to ~1.5 TB, corresponding to ~1 to ~350 files per ntuple).
Recently, it has occurred to me that the program is extremely sensitive to the `step_size` parameter, both in terms of calculation speed and memory usage. For example, when running with the nominal 1.5 GB (as given in the documentation), the chunks (see above) become very large and the memory usage of the program increases dramatically (~O(20 GB)). Conversely, reducing the `step_size` to 50 MB (75 MB and 100 MB give similar results) reduces the chunk size and requires much less memory during the run. Reducing `step_size` too far, to about 2 MB, makes the chunks extremely small and the total running time quite long. Moreover, the memory usage with a ~2 MB `step_size` is not kept low during a long run: it grows to about 6 GB after roughly an hour.
It seems there is a non-linear dependence of running time and memory usage on `step_size`, and it is hard to understand what the optimal `step_size` is for a given input-ntuple size. Similarly, it seems that the memory usage keeps growing during a run, and it is not clear whether it ever reaches a steady state. To that effect, I thought setting `parallel=False` might help the program be more memory-conservative, but it is unclear whether it does; that is hard to test for a run that is anticipated to take many hours to complete.
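For what it is worth, the only cheap check I can think of is printing the peak resident memory after each chunk in a small standalone test, roughly as sketched below (again with placeholder names, and with plain `uproot.iterate` standing in for the framework wrapper):

```python
import resource

import uproot

files = ["ntuple_part1.root:nominal"]  # placeholder for one small test ntuple

for i, chunk in enumerate(uproot.iterate(files, ["jet_pt"], step_size="50 MB")):
    # Peak resident set size so far (KiB on Linux); if it keeps climbing from
    # chunk to chunk, something is holding on to previously read chunks.
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"chunk {i}: {len(chunk)} entries, peak RSS ~ {peak_kib / 1024:.0f} MiB")
```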
Could someone kindly provide some information about these issues? Having read the documentation, it is very unclear how to tune `step_size` correctly (and whether `parallel=True/False` has anything to do with restricting memory consumption). Given that this program is meant to run over a very large number of datasets on a batch system, it would be very helpful to understand, and hence optimise, before committing a lot of computing resources.

Many thanks in advance.
Roy