Feat/Data handler to save data #188

Merged
merged 28 commits into from
Feb 27, 2024

nulinspiratie (Contributor) commented Feb 22, 2024

Data handler

This PR adds the DataHandler because Qualibrate needs a standardized way to save data.

Introduction

The DataHandler makes it easy to save data once a measurement has been performed.
It saves the data into an automatically generated folder with the following structure:
{root_data_folder}/%Y-%m-%d/#{idx}_{name}_%H%M%S.

  • root_data_folder: the root folder for all data, defined once at the start
  • %Y-%m-%d: datasets are first grouped by date
  • {idx}: datasets are identified by an incrementing index (starting at #1).
    Whenever a save is performed, the index of the last saved dataset is determined and
    increased by 1.
  • name: each data folder has a name
  • %H%M%S: the time of the save is also included

This structure can be changed in DataHandler.folder_structure.
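
As a rough illustration of the folder naming described above, the following sketch builds a path of the form {root_data_folder}/%Y-%m-%d/#{idx}_{name}_%H%M%S using only the standard library. The helpers next_index and build_data_folder are hypothetical names, not part of DataHandler, and the sketch assumes that the index is counted across all date folders under the root; the PR text does not state how the real implementation determines the last index.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path


def next_index(root: Path) -> int:
    """Return one more than the highest '#<idx>_' prefix found under root.

    Assumption: indices are shared across all date folders.
    """
    if not root.exists():
        return 1
    indices = [
        int(match.group(1))
        for child in root.glob("*/*")  # <date folder>/<data folder>
        if (match := re.match(r"#(\d+)_", child.name))
    ]
    return max(indices, default=0) + 1


def build_data_folder(root: Path, name: str, now: datetime) -> Path:
    """Compose the data folder path for a save performed at `now`."""
    date_part = now.strftime("%Y-%m-%d")
    time_part = now.strftime("%H%M%S")
    return root / date_part / f"#{next_index(root)}_{name}_{time_part}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    when = datetime(2024, 2, 24, 9, 52, 14)

    first = build_data_folder(root, "T1_measurement", when)
    first.mkdir(parents=True)  # create it so the next call sees index 1
    second = build_data_folder(root, "T1_measurement", when)

    print(first.relative_to(root).as_posix())
    print(second.relative_to(root).as_posix())
```

Because the second path is built after the first folder exists, its index is one higher than the first.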

Data is generally saved using the command data_handler.save_data(data=data, name="msmt_name"),
where data is a dictionary.
The data is saved to the JSON file data.json in the data folder, but non-serialisable
types are saved into separate files. The following non-serialisable types are currently
supported:

  • Matplotlib figures
  • Numpy arrays
  • Xarrays

Basic example

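import matplotlib.pyplot as plt
import numpy as np
from qualang_tools.results import DataHandler  # import path assumed from the README location

# Assume a measurement has been performed, and all results are collected here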
T1_data = {
    "T1": 5e-6,
    "T1_figure": plt.figure(),
    "IQ_array": np.array([[1, 2, 3], [4, 5, 6]])
}

# Initialize the DataHandler
data_handler = DataHandler(root_data_folder="C:/data")

# Save results
data_folder = data_handler.save_data(data=T1_data, name="T1_measurement")
print(data_folder)
# C:/data/2024-02-24/#152_T1_measurement_095214
# This assumes the save was performed on 2024-02-24 at 09:52:14

After calling data_handler.save_data(), three files are created in data_folder:

  • T1_figure.png
  • arrays.npz containing all the numpy arrays
  • data.json which contains:
    {
        "T1": 5e-06,
        "T1_figure": "./T1_figure.png",
        "IQ_array": "./arrays.npz#IQ_array"
    }
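
The file layout above can be reproduced and read back with a small round trip. This is only a sketch of the idea, not the DataHandler implementation: save_with_references and load_with_references are hypothetical helpers, only NumPy arrays are handled (figures and xarray objects are left out), and the "./arrays.npz#key" reference format is taken from the data.json example above.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_with_references(data: dict, folder: Path) -> None:
    """Write arrays to arrays.npz and everything else to data.json."""
    arrays = {key: value for key, value in data.items() if isinstance(value, np.ndarray)}
    # Replace each array by a string reference into arrays.npz
    json_ready = {
        key: f"./arrays.npz#{key}" if key in arrays else value
        for key, value in data.items()
    }
    if arrays:
        np.savez(folder / "arrays.npz", **arrays)
    (folder / "data.json").write_text(json.dumps(json_ready, indent=4))


def load_with_references(folder: Path) -> dict:
    """Read data.json and swap './arrays.npz#key' strings for the stored arrays."""
    raw = json.loads((folder / "data.json").read_text())
    loaded = {}
    for key, value in raw.items():
        if isinstance(value, str) and value.startswith("./arrays.npz#"):
            file_name, array_key = value[2:].split("#", 1)
            with np.load(folder / file_name) as archive:
                loaded[key] = archive[array_key]
        else:
            loaded[key] = value
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    measurement = {"T1": 5e-6, "IQ_array": np.array([[1, 2, 3], [4, 5, 6]])}
    save_with_references(measurement, folder)

    stored = json.loads((folder / "data.json").read_text())
    restored = load_with_references(folder)

print(stored)
print(restored["IQ_array"].shape)
```

Keeping the arrays in a single arrays.npz and storing only string references in data.json keeps the JSON file human-readable while the bulk data stays in a binary format.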
    

Creating a data folder

A data folder can be created in two ways:

# Method 1: explicitly creating data folder
data_folder_properties = data_handler.create_data_folder(name="new_data_folder")

# Method 2: create the data folder when saving results
data_folder = data_handler.save_data(data=T1_data, name="T1_measurement")

Note that the methods return different results.
The method DataHandler.save_data simply returns the path to the newly-created data folder, whereas DataHandler.create_data_folder returns a dict with additional information on the data folder such as the idx.
This additional information can also be accessed after calling DataHandler.save_data through the attribute DataHandler.path_properties.

Manually adding additional files to data folder

After a data folder has been created, its path can be accessed from DataHandler.path.
This allows you to add additional files:

data_folder = data_handler.save_data(data)
assert data_folder == data_handler.path  # the returned path is also stored in data_handler.path

(data_handler.path / "test_file.txt").write_text("I'm adding a file to the data folder")

Auto-saving additional files to data folder

In many cases certain files need to be added every time a data folder is created.
Instead of having to manually add these files each time, they can be specified beforehand:

DataHandler.additional_files = {
    "configuration.py": "configuration.py"
}

Each key is a source path relative to the current working directory, and the corresponding value is the target file path relative to the data folder.
The key does not have to be a relative path; it can also be an absolute path.
This is useful if you want to auto-save a specific file that sits at a fixed location on your hard drive.
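
The key/value convention can be illustrated with a short standard-library sketch. The function copy_additional_files is a hypothetical helper, not the DataHandler implementation; it only assumes what the text above states, namely that relative keys resolve against the working directory, absolute keys are used as-is, and values are target paths inside the data folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_additional_files(additional_files: dict, data_folder: Path, cwd: Path) -> list:
    """Copy each source path (key) to its target path (value) inside data_folder."""
    copied = []
    for source, target in additional_files.items():
        source_path = Path(source)
        if not source_path.is_absolute():
            source_path = cwd / source_path  # relative keys resolve against cwd
        target_path = data_folder / target
        target_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_path, target_path)
        copied.append(target_path)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp) / "work"
    data_folder = Path(tmp) / "data"
    work_dir.mkdir()
    data_folder.mkdir()
    (work_dir / "configuration.py").write_text("config = {}\n")

    copied = copy_additional_files(
        {"configuration.py": "configuration.py"}, data_folder, cwd=work_dir
    )
    copied_name = copied[0].name
    copied_text = copied[0].read_text()

print(copied_name)
```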


github-actions bot commented Feb 22, 2024

Unit Test Results

394 tests: 391 passed, 3 skipped, 0 failed (1 suite, 1 file, 26s)

Results for commit a9e14ea.

♻️ This comment has been updated with latest results.

nulinspiratie (Contributor, Author) commented

The tests are failing because I haven't added xarray as a required package but as an optional package. @yomach @TheoLaudatQM any recommendations? Should I skip tests if xarray isn't installed?

Four resolved review threads on qualang_tools/results/README.md (two of them on outdated code).

yomach (Collaborator) commented Feb 23, 2024

You can tell the tests to build with these packages. I'll take a look next week.

yomach (Collaborator) commented Feb 23, 2024

@nulinspiratie Check out commit 2bc4181, and specifically this change:
[image]
This is how you tell poetry to install extra packages for the testing.

yomach requested a review from yonatanrqm on February 23, 2024 20:59
yomach (Collaborator) commented Feb 23, 2024

Wait, I don't see you added xarray as a required package at all?

nulinspiratie (Contributor, Author) commented

@yomach yeah, I just noticed the pyproject.toml file was never committed. I fixed it now, and all the tests are passing.

yomach (Collaborator) left a comment

Please also add it to the CHANGELOG.md and to the main README.md file (in the root directory).

nulinspiratie (Contributor, Author) commented

Done, @yomach!

nulinspiratie (Contributor, Author) commented

@TheoLaudatQM @yonatanrqm any comments? If possible, I'm hoping to have this merged tomorrow.

nulinspiratie merged commit 1afde01 into main on Feb 27, 2024 (2 checks passed)
nulinspiratie deleted the data_handler branch on February 27, 2024 08:50
3 participants