Feat/Data handler to save data #188

Merged on Feb 27, 2024 (28 commits)
Commits
9093c5d
add functionality to select new data folder based on idx
nulinspiratie Feb 20, 2024
cdbd19e
started adding save_data
nulinspiratie Feb 21, 2024
5e5957e
basic data handler
nulinspiratie Feb 21, 2024
36d40d4
add numpy array processor
nulinspiratie Feb 22, 2024
9b90b0e
add xarray data handler
nulinspiratie Feb 22, 2024
2c61e3a
working tests, added init
nulinspiratie Feb 22, 2024
d4e7be9
add DataHandler.path
nulinspiratie Feb 22, 2024
695ca21
docs + small changes
nulinspiratie Feb 22, 2024
6e7556a
Added initialization name
nulinspiratie Feb 23, 2024
46f3681
Proper sorting of data folders
nulinspiratie Feb 23, 2024
6cbc66e
add documentation
nulinspiratie Feb 25, 2024
4cad629
add optional xarray
nulinspiratie Feb 25, 2024
c452b7a
lower min xarray version
nulinspiratie Feb 25, 2024
087388f
add xarray as to poetry extras
nulinspiratie Feb 25, 2024
798ce63
modify workflow to allow xarray
nulinspiratie Feb 25, 2024
5af5273
remove underscore for workflow
nulinspiratie Feb 25, 2024
8d52ad9
update lock file
nulinspiratie Feb 25, 2024
84a6b4e
remove min_size numpy array
nulinspiratie Feb 25, 2024
c8ee97a
add test xarray skip if not installed
nulinspiratie Feb 25, 2024
7d620b4
Reduce performance test duration
nulinspiratie Feb 25, 2024
bc07ff1
added `additional_files`
nulinspiratie Feb 25, 2024
2bbef32
black formatting
nulinspiratie Feb 25, 2024
99c1853
added info on auto using filename as name
nulinspiratie Feb 25, 2024
c5ed718
Update changelog and readme
nulinspiratie Feb 26, 2024
9bd4419
Fix attempt: windows \ to /
nulinspiratie Feb 26, 2024
47e308a
fix: create create_data without creating
nulinspiratie Feb 26, 2024
3f22f64
fix import pathlib
nulinspiratie Feb 26, 2024
a9e14ea
Allow multiple saves
nulinspiratie Feb 26, 2024
2 changes: 1 addition & 1 deletion .github/workflows/on-pull-request.yml
@@ -35,7 +35,7 @@ jobs:
poetry-

- name: Set up the project
-run: poetry install --extras configbuilder
+run: poetry install --extras "configbuilder datahandler"

- name: Check formatting
run: poetry run poe check-format
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
## [Unreleased]
### Added
- simulator - ``create_simulator_controller_connections`` can now be used to create the connections between a subset of a large cluster.
- results - ``DataHandler`` can be used to save data (values, matplotlib figures, numpy/xarray arrays) to the local file storage.

### Changed
- config/waveform_tools - Added sampling rate argument with default value set to 1GS/s to the waveforms.
2 changes: 1 addition & 1 deletion README.md
@@ -12,7 +12,7 @@ It includes:

* [QUA Loops Tools](qualang_tools/loops/README.md) - This library includes tools for parametrizing QUA for_ loops using the numpy (linspace, arange, logspace) methods or by directly inputting a numpy array.
* [Plotting Tools](qualang_tools/plot/README.md) - This library includes tools to help handling plots from QUA programs.
-* [Result Tools](qualang_tools/results/README.md) - This library includes tools for handling and fetching results from QUA programs.
+* [Result Tools](qualang_tools/results/README.md) - This library includes tools for handling and fetching results from QUA programs, and saving them to the local file storage.
* [Units Tools](qualang_tools/units/README.md) - This library includes tools for using units (MHz, us, mV...) and converting data to other units (demodulated data to volts for instance).
* [Analysis Tools](qualang_tools/analysis/README.md) - This library includes tools for analyzing data from experiments.
It currently has a two-states discriminator for analyzing the ground and excited IQ blobs.
66 changes: 46 additions & 20 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions pyproject.toml
@@ -36,6 +36,7 @@ waitress = { version = "^2.0.0", optional = true }
dill = { version = "^0.3.4", optional = true }
pypiwin32 = { version = "^223", optional = true }
ipython = { version = "^7.31.1", optional = true }
xarray = { version = "^2023.0.0", optional = true }
scikit-learn = "^1.0.2"

[tool.poetry.dev-dependencies]
@@ -48,6 +49,7 @@ setuptools = "^69.0.2"
[tool.poetry.extras]
interplot = ["dill", "pypiwin32", "ipython"]
configbuilder = ["pandas", "dash", "dash-html-components", "dash-core-components", "dash-bootstrap-components", "dash-cytoscape", "dash-table", "dash-dangerously-set-inner-html", "docutils", "waitress"]
datahandler = ["xarray"]

[tool.black]
line-length = 120
119 changes: 119 additions & 0 deletions qualang_tools/results/README.md
@@ -156,3 +156,122 @@ for i in range(len(freqs_external)): # Loop over the LO frequencies
# Process and plot the results
...
```


## Data handler
The `DataHandler` is used to easily save data once a measurement has been performed.
It saves data into an automatically generated folder with folder structure:
`{root_data_folder}/%Y-%m-%d/#{idx}_{name}_%H%M%S`.
- `{root_data_folder}`: The root folder for all data, defined once at the start.
- `%Y-%m-%d`: All datasets are first ordered by date.
- `{idx}`: Datasets are identified by an incrementer (starting at `#1`).
Whenever a save is performed, the index of the last saved dataset is determined and
increased by 1.
- `{name}`: Each data folder has a name.
- `%H%M%S`: The time is also specified.
This structure can be changed in `DataHandler.folder_structure`.
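
As an illustration of how such a folder structure can be expanded into a path, here is a minimal sketch. It is not the `DataHandler` implementation: the helper `build_data_folder` and its arguments are hypothetical, and only the pattern string is taken from the description above.

```python
from datetime import datetime
from pathlib import Path


def build_data_folder(root_data_folder, idx, name, now, folder_structure="%Y-%m-%d/#{idx}_{name}_%H%M%S"):
    """Hypothetical helper: expand the folder structure for one dataset."""
    # Expand the date and time directives (%Y, %m, %d, %H, %M, %S) first ...
    relative = now.strftime(folder_structure)
    # ... then fill in the {idx} and {name} placeholders.
    relative = relative.format(idx=idx, name=name)
    return Path(root_data_folder) / relative


folder = build_data_folder("C:/data", idx=152, name="T1_measurement", now=datetime(2024, 2, 24, 9, 52, 14))
print(folder.as_posix())
# C:/data/2024-02-24/#152_T1_measurement_095214
```

Expanding the date directives before filling in the placeholders means that a `%` character in the name is left untouched.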

Data is generally saved using the command `data_handler.save_data(data=data, name="msmt_name")`,
where `data` is a dictionary.
The data is saved to the JSON file `data.json` in the data folder, but non-serialisable
types are saved into separate files. The following non-serialisable types are currently
supported:
- Matplotlib figures
- Numpy arrays
- Xarrays
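
The idea of moving non-serialisable values out of the JSON file can be sketched as follows for numpy arrays. This is an illustrative sketch under stated assumptions, not the library's code: `save_with_arrays` is a hypothetical helper, and the `./arrays.npz#<key>` reference format is copied from the `data.json` example in this README.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_with_arrays(data, folder):
    """Hypothetical helper: write `data` to data.json, moving numpy arrays into arrays.npz."""
    folder = Path(folder)
    arrays = {}
    serialisable = {}
    for key, value in data.items():
        if isinstance(value, np.ndarray):
            # Store the array separately and leave a reference in the JSON data
            arrays[key] = value
            serialisable[key] = f"./arrays.npz#{key}"
        else:
            serialisable[key] = value
    if arrays:
        np.savez(folder / "arrays.npz", **arrays)
    (folder / "data.json").write_text(json.dumps(serialisable, indent=4))
    return serialisable


with tempfile.TemporaryDirectory() as tmp:
    saved = save_with_arrays({"T1": 5e-6, "IQ_array": np.array([[1, 2, 3], [4, 5, 6]])}, tmp)
    # Read the array back to check that it survived the round trip
    with np.load(Path(tmp) / "arrays.npz") as stored:
        restored = stored["IQ_array"]
```

Matplotlib figures and xarray objects would need their own handling, which is not covered by this sketch.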


### Basic example
```python
import matplotlib.pyplot as plt
import numpy as np

from qualang_tools.results import DataHandler

# Assume a measurement has been performed, and all results are collected here
T1_data = {
    "T1": 5e-6,
    "T1_figure": plt.figure(),
    "IQ_array": np.array([[1, 2, 3], [4, 5, 6]]),
}

# Initialize the DataHandler
data_handler = DataHandler(root_data_folder="C:/data")

# Save results
data_folder = data_handler.save_data(data=T1_data, name="T1_measurement")
print(data_folder)
# C:/data/2024-02-24/#152_T1_measurement_095214
# This assumes the save was performed at 2024-02-24 at 09:52:14
```
After calling `data_handler.save_data()`, three files are created in `data_folder`:
- `T1_figure.png`
- `arrays.npz` containing all the numpy arrays
- `data.json` which contains:
```
{
"T1": 5e-06,
"T1_figure": "./T1_figure.png",
"IQ_array": "./arrays.npz#IQ_array"
}
```
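
Reading the data back requires telling plain values apart from file references. A minimal sketch, assuming that every reference is a string starting with `./` with an optional `#key` suffix, as in the example above. `split_reference` is a hypothetical helper and not part of `DataHandler`.

```python
import json


def split_reference(value):
    """Hypothetical helper: return (filename, key) for a file reference, or None for a plain value."""
    if isinstance(value, str) and value.startswith("./"):
        # "./arrays.npz#IQ_array" -> ("./arrays.npz", "IQ_array")
        filename, _, key = value.partition("#")
        return filename, (key or None)
    return None


data_json = '{"T1": 5e-06, "T1_figure": "./T1_figure.png", "IQ_array": "./arrays.npz#IQ_array"}'
references = {name: split_reference(value) for name, value in json.loads(data_json).items()}
print(references)
```

An array reference could then be resolved with `numpy.load(data_folder / filename)[key]`. Note that this sketch would misclassify an ordinary string value that happens to start with `./`.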

### Creating a data folder
A data folder can be created in two ways:
```python
# Method 1: explicitly creating data folder
data_folder_properties = data_handler.create_data_folder(name="new_data_folder")

# Method 2: Create when saving results
data_folder = data_handler.save_data(data=T1_data, name="T1_measurement")
```
Note that the methods return different results.
The method `DataHandler.save_data` simply returns the path to the newly created data folder, whereas `DataHandler.create_data_folder` returns a dict with additional information on the data folder such as the `idx`.
This additional information can also be accessed after calling `DataHandler.save_data` through the attribute `DataHandler.path_properties`.

### Saving multiple times
A `DataHandler` object can be used to save multiple times to different data folders:
```python

data_handler = DataHandler(root_data_folder="C:/data")

T1_data = {...}

# Save results
data_folder = data_handler.save_data(data=T1_data, name="T1_measurement")
# C:/data/2024-02-24/#1_T1_measurement_095214

T1_modified_data = {...}

data_folder = data_handler.save_data(data=T1_modified_data, name="T1_measurement")
# C:/data/2024-02-24/#2_T1_measurement_095217
```
The second call to `DataHandler.save_data` creates a new data folder where the incrementer is increased by 1.
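
One way the next index could be determined is by scanning the existing folder names. The sketch below is an assumption about the approach, not the actual implementation: `next_idx` is a hypothetical helper, and it only looks inside a single date folder, whereas this README does not state which folders the real incrementer inspects.

```python
import re
import tempfile
from pathlib import Path


def next_idx(date_folder):
    """Hypothetical helper: return 1 + the highest #idx among the sub-folders of date_folder."""
    pattern = re.compile(r"#(\d+)_")
    indices = []
    for path in Path(date_folder).iterdir():
        match = pattern.match(path.name)
        if path.is_dir() and match:
            # Compare indices as integers so that #10 sorts after #2
            indices.append(int(match.group(1)))
    return max(indices, default=0) + 1


with tempfile.TemporaryDirectory() as tmp:
    first_idx = next_idx(tmp)  # no data folders yet
    for folder_name in ["#1_T1_measurement_095214", "#2_T1_measurement_095217", "#10_rabi_101500"]:
        (Path(tmp) / folder_name).mkdir()
    following_idx = next_idx(tmp)

print(first_idx, following_idx)
```

Parsing the index as an integer matters here: sorting the folder names as plain strings would place `#10` before `#2`.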

### Manually adding additional files to data folder
After a data folder has been created, its path can be accessed from `DataHandler.path`.
This allows you to add additional files:

```python
data_folder = data_handler.save_data(data)
assert data_folder == data_handler.path  # the new data folder is also stored in data_handler.path

(data_handler.path / "test_file.txt").write_text("I'm adding a file to the data folder")
```

### Auto-saving additional files to data folder
In many cases certain files need to be added every time a data folder is created.
Instead of having to manually add these files each time, they can be specified beforehand:

```python
DataHandler.additional_files = {
    "configuration.py": "configuration.py"
}
```
Each key is the path of a source file, relative to the current working directory, and the corresponding value is the target filepath relative to the data folder.
The key does not have to be a relative path; it can also be an absolute path.
This can be useful if you want to autosave a specific file at a fixed location on your hard drive.
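
The key/value mapping can be illustrated with a short sketch of the copy step. This is an assumption about the behaviour described above, not the library's code: `copy_additional_files` and its `working_dir` argument are hypothetical, and how `DataHandler` treats a missing source file is not stated here.

```python
import shutil
import tempfile
from pathlib import Path


def copy_additional_files(additional_files, data_folder, working_dir="."):
    """Hypothetical helper: copy each source file (key) to its target path (value) in data_folder."""
    copied = []
    for source, target in additional_files.items():
        # Joining with an absolute source path leaves that absolute path unchanged
        source_path = Path(working_dir) / source
        target_path = Path(data_folder) / target
        target_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(source_path, target_path)
        copied.append(target_path)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    data_folder = Path(tmp) / "data"
    work.mkdir()
    data_folder.mkdir()
    (work / "configuration.py").write_text("qubit_IF = 50e6\n")
    copied = copy_additional_files({"configuration.py": "configuration.py"}, data_folder, working_dir=work)
    copied_text = copied[0].read_text()
```

In this sketch a missing source file raises `FileNotFoundError` from `shutil.copy`.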

### Use filename as name
Often the name of the current file is a good choice for the data folder name, so that it does not have to be specified manually.
This can be done by creating the data handler as follows:

```python
from pathlib import Path
data_handler = DataHandler(name=Path(__file__).stem)
```
4 changes: 3 additions & 1 deletion qualang_tools/results/__init__.py
@@ -2,4 +2,6 @@
from qualang_tools.results.results import progress_counter
from qualang_tools.results.results import wait_until_job_is_paused

-__all__ = ["fetching_tool", "progress_counter", "wait_until_job_is_paused"]
+from qualang_tools.results.data_handler import DataHandler, data_processors
+
+__all__ = ["fetching_tool", "progress_counter", "wait_until_job_is_paused", "DataHandler", "data_processors"]
6 changes: 6 additions & 0 deletions qualang_tools/results/data_handler/__init__.py
@@ -0,0 +1,6 @@
from .data_folder_tools import *
from . import data_processors
from .data_processors import DEFAULT_DATA_PROCESSORS
from .data_handler import *

# Entries of __all__ must be strings, otherwise `from ... import *` raises a TypeError
__all__ = [*data_folder_tools.__all__, "data_processors", "DEFAULT_DATA_PROCESSORS", *data_handler.__all__]