Example showing how to visualise 20GB of data using plothist and pyarrow. The data are generated/loaded in batches so that the required memory is only ~2GB. The hardware used to measure the running times is reported at the end of this page.
- generate_data.py generates a data file data.pq of ~20GB using pyarrow (takes ~3min).
- plot_data.py plots the generated data using plothist and pyarrow (takes ~30s).
- plot_categorised_weighted_data.py plots the generated data using plothist and pyarrow, taking into account different data categories and a specific weight for each entry (takes ~1min).
This is the resulting plot of plot_data.py:
And the result of plot_categorised_weighted_data.py:
Hardware used to measure the running times:
- CPU: Intel Core i5-1235U
- RAM: 16GB DDR4, 3200 MT/s
- Disk: 500GB SSD, Read/Write Speed: 2.1/1.1 GB/s
This repository is dedicated to N.K.R.