From 1b2f9747e2f9bfd28b2ec8941b5553f91c188539 Mon Sep 17 00:00:00 2001 From: Laurent Paul Vallet Date: Thu, 23 Nov 2023 09:44:28 +0100 Subject: [PATCH 1/6] Create 0290-metadata-reorganization.md formatting code blocks and adding Compatibility issues --- design/0290-metadata-reorganization.md | 183 +++++++++++++++++++++++++ 1 file changed, 183 insertions(+) create mode 100644 design/0290-metadata-reorganization.md diff --git a/design/0290-metadata-reorganization.md b/design/0290-metadata-reorganization.md new file mode 100644 index 0000000..015152d --- /dev/null +++ b/design/0290-metadata-reorganization.md @@ -0,0 +1,183 @@ +# Proposal: Metadata organization improvement + +Author(s): Laurent Vallet Last updated: 22/11/2023 Discussion at https://github.com/PlanktoScope/PlanktoScope/issues/44 + +## Abstract + +This design document proposes a reorganization of the metadata variables around 3 json files in order to : +- Improve the logic behind the configuration files to facilitate the search of a specific variable +- Share easily the config.json file to apply Planktoscope settings to another user without any personal information or specific hardware configuration +- Use a more precise hardware.json file to debug and have all the information about a specific machine for Fairscope use +- Improve data quality by adding more relevant scientific variables to the metadata +- Create a personal_info.json file that could be later on encrypted and used to commit modifications to the machine, export directly to Ecotaxa and other user oriented future features + +## Background + +For now, the PlanktoScope software uses 2 configuration files : +- hardware.json stored in https://github.com/PlanktoScope/device-backend/blob/main/default-configs/ +- config.json stored in https://github.com/PlanktoScope/PlanktoScope/blob/master/software/node-red-dashboard/default-configs/ + +The hardware.json file contains parameters to configure the hardware but not the caracteristics of the hardware. 
+The config.json file contains the GUI parameters selected by the user to make the acquisition of the data. +Both are used by the Node Red flows. +The variables stored in these files and/or set by the user via the Node Red interface are used to generate the metadata.json file that is exported by the acquisition process. + +An inventory of all the metadata was made : https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link +Only variables with the prefixes sample_, acq_, object_, process_ are exported to the metadata.json. + +## Proposal + +We propose to add a third file with the personal informations of the user and its scientific campaign : personal_info.json +We also propose to add metadata variables that could improve informations contained in Ecotaxa concerning the data acquired with the Planktoscope. +We finally propose to dispatch existing metadatas between those 3 files according to the following rules : + +- The hardware.json file should only have informations concerning the caracteristics of the machine. + This file should not be modified by the user except for the hardware case version of the planktoscope and its serial number where we cannot retrieve automatically the information + +- The config.json file should only store acquisition settings and camera settings selected by the user via the GUI + Exchanging this file should asssure a comparable result. + +- The personal_info.json file should store informations entered by the user via the GUI on the id of the user/team, campaign data informations and Ecotaxa login infos in order to use Ecotaxa API to directly export data. + This file could be used between team members using Planktoscopes for the same campaign/study to assure an identical protocol/equipment used information. 
+ +Here is the current content of hardware.json : +''' +{ + "stepper_reverse": false, + "microsteps": 256, + "focus_steps_per_mm": 40, + "pump_steps_per_ml": 2045, + "focus_max_speed": 5, + "pump_max_speed": 50, + "stepper_type": "pscope_hat", + "red_gain": 2.4, + "blue_gain": 1.35, + "analog_gain": 1.0, + "digital_gain": 1.0, + "acq_fnumber_objective": 12, + "process_pixel_fixed": 0.88 +} +''' +Here is what we propose for its content : +''' +{ + "inst_serial_number": "U072", + "acq_inst_name": "PlanktoScope", + "acq_inst_version": "v2.6.1", + "acq_rpi_model": "Raspberry Pi 4 4Go", + "acq_camera_model": "Raspberry Pi High Quality Camera", + "acq_HAT_model": "FairScope_HAT v1.3", + "acq_objective_focal_length": 12, + "acq_tube_focal_length": 25, + "acq_LED_model": "Adafruit - 754", + "acq_pump_model": "Kamoer - KAS B12 SF", + "acq_flowcell_model": "FairScope Capillary - 300 um", + "microsteps": 256, + "pump_max_speed": 50, + "pump_steps_per_ml": 2045, + "stepper_reverse": false, + "process_id": "1", + "process_pixel_size": 0.75 +} +''' +Here is the current content of config.json : +''' +{ + "sample_project": "Project's name", + "sample_id": 1, + "sample_ship": "Vessel name", + "sample_operator": "Operator's name", + "sample_sampling_gear": "net", + "sample_gear_net_opening": 40, + "acq_id": 1, + "acq_instrument": "PlanktoScope v2.5", + "acq_celltype": 300, + "acq_minimum_mesh": 10, + "acq_maximum_mesh": 200, + "acq_volume": 1, + "object_depth_min": 1, + "object_depth_max": 2, + "process_id": 1, + "nb_frame": 100, + "sleep_before": 0.5, + "imaging_pump_volume": 0.01 +} +''' +Here is what we propose for its content : +''' +{ + "acq_camera_iso": 100, + "acq_focus_max_speed": 5, + "acq_camera_shutter_speed": 125, + "acq_camera_white_balance": false, + "acq_volume_interframe": 0.01, + "acq_nb_frames" : 10, + "focus_steps_per_mm": 40, + "sleep_before": 0.5, + "object_camera_gain_analog": 1, + "object_camera_gain_digital": 1, + "object_camera_gain_red": 1.5, + 
"object_camera_gain_blue": 1.9 +} +''' +Here is an example of the content of personal_info.json : +''' +{ + "sample_project": "FairScope Factory Settings", + "sample_operator": "Thibaut Pollina", + "sample_vessel": "La Baraka", + "sample_method": "Pump Samplers", + "sample_id": "Sample_1", + "sample_net_mesh_size": 20, + "sample_sieve_mesh size": 200, + "sample_net_mouth_diameter": 30, + "acq_id": "Tank_B", + "object_depth_min": 0, + "object_depth_max": 0, + "process_id": "1" +} +''' +This new file could be created at home/pi/PlanktoScope/ + +## Rationale + +Having 3 files instead of 2 to simplify and reorganize the metadata may seem to be counter productive. +Maybe just a proper formatting of the json file with personal_info: {"key":"value", ...}, settings:{"key":"value", ...} can be another solution. +The key aspect of this proposal is to know where to look at when debugging, modifying the use of the Planktoscope and to be more user friendly in order to facilitate the growth of a "non-developpers community based users". +The use and limitations of prefixes (acq_, object_, sample_, process_) suppose that we respect EcoTaxa's specifications. + +## Compatibility + +There should be no compatibility issue unless we couple this proposal with the loading of hardware.json with caracteristics stored in the HAT EEPROM. + +## Implementation + +The implementation impacts both Node Red and device-backend. + +### Device Backend +The files impacted are listed below : +device-backend/control/pscopehat/planktoscope/imager/_init_.py +device-backend/control/adafruithat/planktoscope/imager/_init_.py +device-backend/control/pscopehat/planktoscope/stepper.py +device-backend/control/adafruithat/planktoscope/stepper.py + +The modification consists of loading the config.json file in addition of the hardware.json to retrieve the variables that we moved from one another. +Changing also the name of the variables if their names were modified. 
+ +### Node Red +The files impacted are listed below : +adafruithat.json +pscopehat.json + +The modification involves loading the new variables implemented in hardware.json and config.json, renaming the variables loaded in global context of node red to match the names in the json files. +Creating the personal_info.json if it does not exists. +Loading personal_info metadatas in the global context. +Adding new fields to retrieve these new metadatas in the GUI. +Writing these metadatas in the 3 json files at the end of each process (Sample, Optic config, Fluidic acquisition) + + +## Open issues (if applicable) + +Maybe in a community based meeting we should fuel this proposal with recommandations from users, which metadata they would like to retrieve with their exported datas. +Is there a version of the software that is not compatible with an old version of the hardware (to evaluate the compatibility issues) ? + From d8dfa8f594e18f2c84ea9861cd12097bc22c7590 Mon Sep 17 00:00:00 2001 From: Laurent Paul Vallet Date: Thu, 23 Nov 2023 09:49:26 +0100 Subject: [PATCH 2/6] Update 0290-metadata-reorganization.md Formatting code blocks properly. --- design/0290-metadata-reorganization.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/design/0290-metadata-reorganization.md b/design/0290-metadata-reorganization.md index 015152d..2f13691 100644 --- a/design/0290-metadata-reorganization.md +++ b/design/0290-metadata-reorganization.md @@ -41,7 +41,7 @@ We finally propose to dispatch existing metadatas between those 3 files accordin This file could be used between team members using Planktoscopes for the same campaign/study to assure an identical protocol/equipment used information. 
Here is the current content of hardware.json : -''' +``` { "stepper_reverse": false, "microsteps": 256, @@ -57,9 +57,9 @@ Here is the current content of hardware.json : "acq_fnumber_objective": 12, "process_pixel_fixed": 0.88 } -''' +``` Here is what we propose for its content : -''' +``` { "inst_serial_number": "U072", "acq_inst_name": "PlanktoScope", @@ -79,9 +79,9 @@ Here is what we propose for its content : "process_id": "1", "process_pixel_size": 0.75 } -''' +``` Here is the current content of config.json : -''' +``` { "sample_project": "Project's name", "sample_id": 1, @@ -102,9 +102,9 @@ Here is the current content of config.json : "sleep_before": 0.5, "imaging_pump_volume": 0.01 } -''' +``` Here is what we propose for its content : -''' +``` { "acq_camera_iso": 100, "acq_focus_max_speed": 5, @@ -119,9 +119,9 @@ Here is what we propose for its content : "object_camera_gain_red": 1.5, "object_camera_gain_blue": 1.9 } -''' +``` Here is an example of the content of personal_info.json : -''' +``` { "sample_project": "FairScope Factory Settings", "sample_operator": "Thibaut Pollina", @@ -136,7 +136,7 @@ Here is an example of the content of personal_info.json : "object_depth_max": 0, "process_id": "1" } -''' +``` This new file could be created at home/pi/PlanktoScope/ ## Rationale From 5fb48115b5197131fa995fe4a97cc606ed14c97d Mon Sep 17 00:00:00 2001 From: Ethan Li Date: Mon, 27 Nov 2023 16:32:50 -0800 Subject: [PATCH 3/6] Clean up formatting and grammar in the design document --- design/0290-metadata-reorganization.md | 324 +++++++++++++------------ 1 file changed, 170 insertions(+), 154 deletions(-) diff --git a/design/0290-metadata-reorganization.md b/design/0290-metadata-reorganization.md index 2f13691..c61b1b3 100644 --- a/design/0290-metadata-reorganization.md +++ b/design/0290-metadata-reorganization.md @@ -1,183 +1,199 @@ # Proposal: Metadata organization improvement -Author(s): Laurent Vallet Last updated: 22/11/2023 Discussion at 
https://github.com/PlanktoScope/PlanktoScope/issues/44 +Author(s): Laurent Vallet ([@LaurentPV](https://github.com/LaurentPV)), Ethan Li ([@ethanjli](https://github.com/ethanjli)) -## Abstract +Last updated: 2023-11-27 -This design document proposes a reorganization of the metadata variables around 3 json files in order to : -- Improve the logic behind the configuration files to facilitate the search of a specific variable -- Share easily the config.json file to apply Planktoscope settings to another user without any personal information or specific hardware configuration -- Use a more precise hardware.json file to debug and have all the information about a specific machine for Fairscope use -- Improve data quality by adding more relevant scientific variables to the metadata -- Create a personal_info.json file that could be later on encrypted and used to commit modifications to the machine, export directly to Ecotaxa and other user oriented future features +Discussion at: -## Background +## Abstract -For now, the PlanktoScope software uses 2 configuration files : -- hardware.json stored in https://github.com/PlanktoScope/device-backend/blob/main/default-configs/ -- config.json stored in https://github.com/PlanktoScope/PlanktoScope/blob/master/software/node-red-dashboard/default-configs/ +This design document proposes a reorganization of the metadata fields into three JSON files, in order to: -The hardware.json file contains parameters to configure the hardware but not the caracteristics of the hardware. -The config.json file contains the GUI parameters selected by the user to make the acquisition of the data. -Both are used by the Node Red flows. -The variables stored in these files and/or set by the user via the Node Red interface are used to generate the metadata.json file that is exported by the acquisition process. +- Have a more logical organization of metadata fields between the configuration files, to make it easier to find specific fields. 
+- Make it easy for a user to share their `config.json` file, so that they can their Planktoscope settings can be reused on other machines or by other users, but without sharing any personal information or specific hardware configurations. +- Have a more precise `hardware.json` file for debugging and for holding all information about a specific machine (e.g. which will be useful for FairScope). +- Create a `personal_info.json` file which eventually could be encrypted and used to commit modifications to the machine, export directly to Ecotaxa, or provide other other user-oriented features in the future. +- Improve data quality by adding more relevant scientific variables to the metadata. + +## Background -An inventory of all the metadata was made : https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link -Only variables with the prefixes sample_, acq_, object_, process_ are exported to the metadata.json. +Currently, the PlanktoScope software stores metadata in two configuration files: + +- [`hardware.json`](https://github.com/PlanktoScope/device-backend/tree/v2023.9.0-beta.1/default-configs) + - This file contains parameters to configure the hardware, but it does not describe the characteristics of the hardware. + - Depending on the PlanktoScope hardware version selected by the user in the Node-RED dashboard, a `hardware.json` file for that version is copied from `/home/pi/device-backend/default-configs` to `/home/pi/PlanktoScope`. 
+ - Here is an example of the contents of the `hardware.json` file: + ``` + { + "stepper_reverse": false, + "microsteps": 256, + "focus_steps_per_mm": 40, + "pump_steps_per_ml": 2045, + "focus_max_speed": 5, + "pump_max_speed": 50, + "stepper_type": "pscope_hat", + "red_gain": 2.4, + "blue_gain": 1.35, + "analog_gain": 1.0, + "digital_gain": 1.0, + "acq_fnumber_objective": 12, + "process_pixel_fixed": 0.88 + } + ``` + +- [`config.json`](https://github.com/PlanktoScope/PlanktoScope/tree/software/v2023.9.0-beta.1/software/node-red-dashboard/default-configs) + - This file contains the GUI parameters selected by the user in the Node-RED dashboard for data acquisition. + - Depending on the PlanktoScope hardware version selected by the user in the Node-RED dashboard, a `hardware.json` file for that version is copied from `/home/pi/PlanktoScope/software/node-red-dashboard/default-configs` to `/home/pi/PlanktoScope` + - Here is an example of the contents of the `config.json` file: + ``` + { + "sample_project": "Project's name", + "sample_id": 1, + "sample_ship": "Vessel name", + "sample_operator": "Operator's name", + "sample_sampling_gear": "net", + "sample_gear_net_opening": 40, + "acq_id": 1, + "acq_instrument": "PlanktoScope v2.5", + "acq_celltype": 300, + "acq_minimum_mesh": 10, + "acq_maximum_mesh": 200, + "acq_volume": 1, + "object_depth_min": 1, + "object_depth_max": 2, + "process_id": 1, + "nb_frame": 100, + "sleep_before": 0.5, + "imaging_pump_volume": 0.01 + } + ``` + +Both configuration files are used by the Node-RED dashboard. +We have a ["Metadata Compilation" spreadsheet](https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link) which describes every metadata field, including metadata fields which are persistently stored in those files and metadata fields which are set by the user via the Node-RED dashboard but not persisted in files. 
+Both types of metadata fields are used to generate a `metadata.json` file which is exported by the Python backend's `ImagerProcess` module as part of image acquisition. +Only fields from our "Metadata Compilation" spreadsheet with field names containing one of the following prefixes are exported to the `metadata.json` file: +- `sample_` +- `acq_` +- `object_` +- `process_` ## Proposal -We propose to add a third file with the personal informations of the user and its scientific campaign : personal_info.json -We also propose to add metadata variables that could improve informations contained in Ecotaxa concerning the data acquired with the Planktoscope. -We finally propose to dispatch existing metadatas between those 3 files according to the following rules : - -- The hardware.json file should only have informations concerning the caracteristics of the machine. - This file should not be modified by the user except for the hardware case version of the planktoscope and its serial number where we cannot retrieve automatically the information - -- The config.json file should only store acquisition settings and camera settings selected by the user via the GUI - Exchanging this file should asssure a comparable result. - -- The personal_info.json file should store informations entered by the user via the GUI on the id of the user/team, campaign data informations and Ecotaxa login infos in order to use Ecotaxa API to directly export data. - This file could be used between team members using Planktoscopes for the same campaign/study to assure an identical protocol/equipment used information. 
- -Here is the current content of hardware.json : -``` -{ - "stepper_reverse": false, - "microsteps": 256, - "focus_steps_per_mm": 40, - "pump_steps_per_ml": 2045, - "focus_max_speed": 5, - "pump_max_speed": 50, - "stepper_type": "pscope_hat", - "red_gain": 2.4, - "blue_gain": 1.35, - "analog_gain": 1.0, - "digital_gain": 1.0, - "acq_fnumber_objective": 12, - "process_pixel_fixed": 0.88 -} -``` -Here is what we propose for its content : -``` -{ - "inst_serial_number": "U072", - "acq_inst_name": "PlanktoScope", - "acq_inst_version": "v2.6.1", - "acq_rpi_model": "Raspberry Pi 4 4Go", - "acq_camera_model": "Raspberry Pi High Quality Camera", - "acq_HAT_model": "FairScope_HAT v1.3", - "acq_objective_focal_length": 12, - "acq_tube_focal_length": 25, - "acq_LED_model": "Adafruit - 754", - "acq_pump_model": "Kamoer - KAS B12 SF", - "acq_flowcell_model": "FairScope Capillary - 300 um", - "microsteps": 256, - "pump_max_speed": 50, - "pump_steps_per_ml": 2045, - "stepper_reverse": false, - "process_id": "1", - "process_pixel_size": 0.75 -} -``` -Here is the current content of config.json : -``` -{ - "sample_project": "Project's name", - "sample_id": 1, - "sample_ship": "Vessel name", - "sample_operator": "Operator's name", - "sample_sampling_gear": "net", - "sample_gear_net_opening": 40, - "acq_id": 1, - "acq_instrument": "PlanktoScope v2.5", - "acq_celltype": 300, - "acq_minimum_mesh": 10, - "acq_maximum_mesh": 200, - "acq_volume": 1, - "object_depth_min": 1, - "object_depth_max": 2, - "process_id": 1, - "nb_frame": 100, - "sleep_before": 0.5, - "imaging_pump_volume": 0.01 -} -``` -Here is what we propose for its content : -``` -{ - "acq_camera_iso": 100, - "acq_focus_max_speed": 5, - "acq_camera_shutter_speed": 125, - "acq_camera_white_balance": false, - "acq_volume_interframe": 0.01, - "acq_nb_frames" : 10, - "focus_steps_per_mm": 40, - "sleep_before": 0.5, - "object_camera_gain_analog": 1, - "object_camera_gain_digital": 1, - "object_camera_gain_red": 1.5, - 
"object_camera_gain_blue": 1.9 -} -``` -Here is an example of the content of personal_info.json : -``` -{ - "sample_project": "FairScope Factory Settings", - "sample_operator": "Thibaut Pollina", - "sample_vessel": "La Baraka", - "sample_method": "Pump Samplers", - "sample_id": "Sample_1", - "sample_net_mesh_size": 20, - "sample_sieve_mesh size": 200, - "sample_net_mouth_diameter": 30, - "acq_id": "Tank_B", - "object_depth_min": 0, - "object_depth_max": 0, - "process_id": "1" -} -``` -This new file could be created at home/pi/PlanktoScope/ +We propose to add a third file, to be named `personal_info.json`, which will store the user's personal information and information about the scientific mission for which the PlanktoScope is being operated. +We also propose to add some more metadata fields to improve the metadata exported to Ecotaxa. +Finally, we propose to reorganize existing metadata fields between three files, according to the following rules: + +- The `hardware.json` file should only have information about the characteristics of the machine. + This file should not be modified by the user except for selecting the PlanktoScope's hardware version and the machine's serial number, and only when we cannot retrieve the information automatically (such as from the custom PlanktoScope HAT's EEPROM). 
+ - Here is an example of our proposal for the contents of the `hardware.json` file: + ``` + { + "inst_serial_number": "U072", + "acq_inst_name": "PlanktoScope", + "acq_inst_version": "v2.6.1", + "acq_rpi_model": "Raspberry Pi 4 4Gb", + "acq_camera_model": "Raspberry Pi High Quality Camera", + "acq_HAT_model": "FairScope_HAT v1.3", + "acq_objective_focal_length": 12, + "acq_tube_focal_length": 25, + "acq_LED_model": "Adafruit - 754", + "acq_pump_model": "Kamoer - KAS B12 SF", + "acq_flowcell_model": "FairScope Capillary - 300 um", + "microsteps": 256, + "pump_max_speed": 50, + "pump_steps_per_ml": 2045, + "stepper_reverse": false, + "process_id": "1", + "process_pixel_size": 0.75 + } + ``` +- The `config.json` file should only store acquisition settings and camera settings selected by the user via the GUI. + Exchanging this file should assure a comparable result. + - Here is an example of our proposal for the contents of the `config.json` file: + ``` + { + "acq_camera_iso": 100, + "acq_focus_max_speed": 5, + "acq_camera_shutter_speed": 125, + "acq_camera_white_balance": false, + "acq_volume_interframe": 0.01, + "acq_nb_frames" : 10, + "focus_steps_per_mm": 40, + "sleep_before": 0.5, + "object_camera_gain_analog": 1, + "object_camera_gain_digital": 1, + "object_camera_gain_red": 1.5, + "object_camera_gain_blue": 1.9 + } + ``` +- The `personal_info.json` file should store information entered by the user (via the GUI) about the identity of the user/team, information about the mission where the PlanktoScope is being deployed, and Ecotaxa login credentials for the PlanktoScope software to interact with the Ecotaxa API (especially for exporting data directly to EcoTaxa). + This file could be shared between team members using Planktoscopes for the same mission, to assure that an identical protocol/equipment used information. 
+ - Here is an example of our proposal for the contents of the `personal_info.json` file: + ``` + { + "sample_project": "FairScope Factory Settings", + "sample_operator": "Thibaut Pollina", + "sample_vessel": "La Baraka", + "sample_method": "Pump Samplers", + "sample_id": "Sample_1", + "sample_net_mesh_size": 20, + "sample_sieve_mesh size": 200, + "sample_net_mouth_diameter": 30, + "acq_id": "Tank_B", + "object_depth_min": 0, + "object_depth_max": 0, + "process_id": "1" + } + ``` + - This new file could be stored at `/home/pi/PlanktoScope/`, which is where the two other configuration files are currently stored. ## Rationale -Having 3 files instead of 2 to simplify and reorganize the metadata may seem to be counter productive. -Maybe just a proper formatting of the json file with personal_info: {"key":"value", ...}, settings:{"key":"value", ...} can be another solution. -The key aspect of this proposal is to know where to look at when debugging, modifying the use of the Planktoscope and to be more user friendly in order to facilitate the growth of a "non-developpers community based users". -The use and limitations of prefixes (acq_, object_, sample_, process_) suppose that we respect EcoTaxa's specifications. +Having three files instead of two to simplify and reorganize the metadata may seem to be counterproductive. +An alternative solution could be to improve the organization of the json file with nested objects, such as `personal_info: {"key":"value", ...}` and `settings:{"key":"value", ...}`. +The key aspect of this proposal is to know where to go when debugging, modifying the use of the Planktoscope; and to be more user-friendly in order to facilitate the growth of a community of users who aren't software developers. +The use and limitations of prefixes (`acq_`, `object_`, `sample_`, `process_`) suppose that we respect EcoTaxa's specifications. 
## Compatibility -There should be no compatibility issue unless we couple this proposal with the loading of hardware.json with caracteristics stored in the HAT EEPROM. +There should be no compatibility issue unless we couple this proposal with the loading of `hardware.json` with caracteristics stored in the HAT EEPROM. ## Implementation -The implementation impacts both Node Red and device-backend. +The implementation impacts both the Node-RED dashboard and the Python backend. + +### Python Backend +The following files will be impacted: + +- `device-backend/control/pscopehat/planktoscope/imager/_init_.py` +- `device-backend/control/adafruithat/planktoscope/imager/_init_.py` +- `device-backend/control/pscopehat/planktoscope/stepper.py` +- `device-backend/control/adafruithat/planktoscope/stepper.py` + +These files will need to be modified to: + +- Load the `config.json` file in addition to the `hardware.json` file, in order to retrieve the metadata field values which this proposal moved from `hardware.json` latter file to the `config.json` file. +- Change the names of metadata fields according to this proposal. -### Device Backend -The files impacted are listed below : -device-backend/control/pscopehat/planktoscope/imager/_init_.py -device-backend/control/adafruithat/planktoscope/imager/_init_.py -device-backend/control/pscopehat/planktoscope/stepper.py -device-backend/control/adafruithat/planktoscope/stepper.py +### Node-RED Dashboard +The following files will be impacted: -The modification consists of loading the config.json file in addition of the hardware.json to retrieve the variables that we moved from one another. -Changing also the name of the variables if their names were modified. 
+- `adafruithat.json` +- `pscopehat.json` -### Node Red -The files impacted are listed below : -adafruithat.json -pscopehat.json +The files will need to be modified to: -The modification involves loading the new variables implemented in hardware.json and config.json, renaming the variables loaded in global context of node red to match the names in the json files. -Creating the personal_info.json if it does not exists. -Loading personal_info metadatas in the global context. -Adding new fields to retrieve these new metadatas in the GUI. -Writing these metadatas in the 3 json files at the end of each process (Sample, Optic config, Fluidic acquisition) +- Load the new variables implemented in `hardware.json` and `config.json`. +- Rename global variables in the Node-RED flows to match the metadata field names in the JSON files. +- Create the `personal_info.json` file if it does not already exist. +- Load metadata fields from `personal_info.json` as global variables in the Node-RED flows. +- Add new GUI input fields to set the values of new metadata fields as needed. +- Write metadata values to the three JSON files at the end of each process (Sample, Optic Configuration, Fluidic Acquisition) ## Open issues (if applicable) -Maybe in a community based meeting we should fuel this proposal with recommandations from users, which metadata they would like to retrieve with their exported datas. -Is there a version of the software that is not compatible with an old version of the hardware (to evaluate the compatibility issues) ? +- Maybe in a community-based meeting we should iterate on this proposal with recommendations from users, e.g. about which metadata fields they would like to retrieve with their exported data. +- Is there a version of the software that is not compatible with an old version of the hardware (to evaluate the compatibility issues)? 
From 568a723160deb5fb40b9099ddd41f0168550851c Mon Sep 17 00:00:00 2001 From: Ethan Li Date: Mon, 27 Nov 2023 17:00:47 -0800 Subject: [PATCH 4/6] Fix some typos I had caused, and further improve clarity --- design/0290-metadata-reorganization.md | 36 ++++++++++++++------------ 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/design/0290-metadata-reorganization.md b/design/0290-metadata-reorganization.md index c61b1b3..43c329e 100644 --- a/design/0290-metadata-reorganization.md +++ b/design/0290-metadata-reorganization.md @@ -10,10 +10,10 @@ Discussion at: This design document proposes a reorganization of the metadata fields into three JSON files, in order to: -- Have a more logical organization of metadata fields between the configuration files, to make it easier to find specific fields. -- Make it easy for a user to share their `config.json` file, so that they can their Planktoscope settings can be reused on other machines or by other users, but without sharing any personal information or specific hardware configurations. +- Organize metadata fields between the configuration files in a more logical way which makes it easier to find specific fields. +- Make it easy for a user to share their `config.json` file, so that they can make their Planktoscope settings available for reuse on other machines or by other users, but without sharing any personal information or hardware-specific information (e.g. machine serial number). - Have a more precise `hardware.json` file for debugging and for holding all information about a specific machine (e.g. which will be useful for FairScope). -- Create a `personal_info.json` file which eventually could be encrypted and used to commit modifications to the machine, export directly to Ecotaxa, or provide other other user-oriented features in the future. 
+- Create a `personal_info.json` file which eventually could be encrypted and used to commit modifications to the machine, upload datasets to Ecotaxa, or provide other other user-oriented features in the future. - Improve data quality by adding more relevant scientific variables to the metadata. ## Background @@ -43,8 +43,8 @@ Currently, the PlanktoScope software stores metadata in two configuration files: ``` - [`config.json`](https://github.com/PlanktoScope/PlanktoScope/tree/software/v2023.9.0-beta.1/software/node-red-dashboard/default-configs) - - This file contains the GUI parameters selected by the user in the Node-RED dashboard for data acquisition. - - Depending on the PlanktoScope hardware version selected by the user in the Node-RED dashboard, a `hardware.json` file for that version is copied from `/home/pi/PlanktoScope/software/node-red-dashboard/default-configs` to `/home/pi/PlanktoScope` + - This file contains inputs entered by the user in the Node-RED dashboard to describe their sample and to configure image acquisition. + - Depending on the PlanktoScope hardware version selected by the user in the Node-RED dashboard, a `config.json` file for that version is copied from `/home/pi/PlanktoScope/software/node-red-dashboard/default-configs` to `/home/pi/PlanktoScope` - Here is an example of the contents of the `config.json` file: ``` { @@ -69,10 +69,9 @@ Currently, the PlanktoScope software stores metadata in two configuration files: } ``` -Both configuration files are used by the Node-RED dashboard. -We have a ["Metadata Compilation" spreadsheet](https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link) which describes every metadata field, including metadata fields which are persistently stored in those files and metadata fields which are set by the user via the Node-RED dashboard but not persisted in files. 
-Both types of metadata fields are used to generate a `metadata.json` file which is exported by the Python backend's `ImagerProcess` module as part of image acquisition. -Only fields from our "Metadata Compilation" spreadsheet with field names containing one of the following prefixes are exported to the `metadata.json` file: +Both of these configuration files are used by the Node-RED dashboard for saving metadata persistently across restart, but some metadata information set by the user in the Node-RED dashboard is not persisted. +Both persisted and unpersisted metadata fields are assembled into a `metadata.json` file for each raw dataset, which is created by the Python backend's `ImagerProcess` module as part of image acquisition. +We have a ["Metadata Compilation" spreadsheet](https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link) which describes every metadata field; only fields from that spreadsheet with field names containing one of the following prefixes are exported to the `metadata.json` file: - `sample_` - `acq_` - `object_` @@ -80,12 +79,13 @@ Only fields from our "Metadata Compilation" spreadsheet with field names contain ## Proposal -We propose to add a third file, to be named `personal_info.json`, which will store the user's personal information and information about the scientific mission for which the PlanktoScope is being operated. +We propose to add a third file, to be named `personal_info.json`, which will store the user's personal information as well as information about the scientific mission for which the PlanktoScope is being operated. We also propose to add some more metadata fields to improve the metadata exported to Ecotaxa. Finally, we propose to reorganize existing metadata fields between three files, according to the following rules: -- The `hardware.json` file should only have information about the characteristics of the machine. 
- This file should not be modified by the user except for selecting the PlanktoScope's hardware version and the machine's serial number, and only when we cannot retrieve the information automatically (such as from the custom PlanktoScope HAT's EEPROM). +- The `hardware.json` file should only have information about the hardware characteristics of the PlanktoScope. + This file should only be modified by the user when we cannot determine the information automatically (such as from the custom PlanktoScope HAT's EEPROM). + In such situations, the user should only need to select the PlanktoScope's hardware version and its serial number (assuming their PlanktoScope has a standard hardware configuration). - Here is an example of our proposal for the contents of the `hardware.json` file: ``` { @@ -146,18 +146,22 @@ Finally, we propose to reorganize existing metadata fields between three files, "process_id": "1" } ``` - - This new file could be stored at `/home/pi/PlanktoScope/`, which is where the two other configuration files are currently stored. + - This new file could be stored at `/home/pi/PlanktoScope/`, alongside the two other configuration files. ## Rationale +By making the naming and organization of metadata fields more logical, we can make it easier for developers and users to find the necessary metadata fields when they inspect the metadata files as part of debugging or modifying their Planktoscopes. + Having three files instead of two to simplify and reorganize the metadata may seem to be counterproductive. An alternative solution could be to improve the organization of the json file with nested objects, such as `personal_info: {"key":"value", ...}` and `settings:{"key":"value", ...}`. -The key aspect of this proposal is to know where to go when debugging, modifying the use of the Planktoscope; and to be more user-friendly in order to facilitate the growth of a community of users who aren't software developers. 
-The use and limitations of prefixes (`acq_`, `object_`, `sample_`, `process_`) suppose that we respect EcoTaxa's specifications. +However, separating different groups of metadata fields into different files based on when/how those metadata files need to be changed/shared makes it easy to replace the values of one group of fields just by overwriting one file. +This is an advantage of having multiple metadata files rather than a single metadata file. + +The use and limitations of prefixes (`acq_`, `object_`, `sample_`, `process_`) in the metadata field names are motivated by following EcoTaxa's metadata field naming system. ## Compatibility -There should be no compatibility issue unless we couple this proposal with the loading of `hardware.json` with caracteristics stored in the HAT EEPROM. +There should be no compatibility issue unless we couple this proposal with the loading of `hardware.json` metadata fields from data stored in the PlanktoScope HAT's EEPROM. ## Implementation From 9e8ea800d226616b43a6d6cda67d26ed94c3d73d Mon Sep 17 00:00:00 2001 From: Ethan Li Date: Mon, 27 Nov 2023 17:01:59 -0800 Subject: [PATCH 5/6] Rename design document file to pad the proposal number to 5 digits --- ...etadata-reorganization.md => 00290-metadata-reorganization.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename design/{0290-metadata-reorganization.md => 00290-metadata-reorganization.md} (100%) diff --git a/design/0290-metadata-reorganization.md b/design/00290-metadata-reorganization.md similarity index 100% rename from design/0290-metadata-reorganization.md rename to design/00290-metadata-reorganization.md From 726bb007307be309e2fa41faf9f0c9f7022d0cb5 Mon Sep 17 00:00:00 2001 From: Ethan Li Date: Mon, 27 Nov 2023 17:12:59 -0800 Subject: [PATCH 6/6] Fix some typos and factual errors I had introduced --- design/00290-metadata-reorganization.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/design/00290-metadata-reorganization.md
b/design/00290-metadata-reorganization.md index 43c329e..11c2941 100644 --- a/design/00290-metadata-reorganization.md +++ b/design/00290-metadata-reorganization.md @@ -13,7 +13,7 @@ This design document proposes a reorganization of the metadata fields into three - Organize metadata fields between the configuration files in a more logical way which makes it easier to find specific fields. - Make it easy for a user to share their `config.json` file, so that they can make their Planktoscope settings available for reuse on other machines or by other users, but without sharing any personal information or hardware-specific information (e.g. machine serial number). - Have a more precise `hardware.json` file for debugging and for holding all information about a specific machine (e.g. which will be useful for FairScope). -- Create a `personal_info.json` file which eventually could be encrypted and used to commit modifications to the machine, upload datasets to Ecotaxa, or provide other other user-oriented features in the future. +- Create a `personal_info.json` file which eventually could be encrypted and used to commit modifications to the machine, upload datasets to Ecotaxa, or provide other user-oriented features in the future. - Improve data quality by adding more relevant scientific variables to the metadata. ## Background @@ -44,7 +44,8 @@ Currently, the PlanktoScope software stores metadata in two configuration files: - [`config.json`](https://github.com/PlanktoScope/PlanktoScope/tree/software/v2023.9.0-beta.1/software/node-red-dashboard/default-configs) - This file contains inputs entered by the user in the Node-RED dashboard to describe their sample and to configure image acquisition. 
- - Depending on the PlanktoScope hardware version selected by the user in the Node-RED dashboard, a `config.json` file for that version is copied from `/home/pi/PlanktoScope/software/node-red-dashboard/default-configs` to `/home/pi/PlanktoScope` + - Depending on the HAT type specified for the PlanktoScope distro setup scripts to create the PlanktoScope SD card image, a default `config.json` file for the latest hardware version of the HAT type (v2.1 for `adafruithat`, v2.6 for `pscopehat`) is copied from `/home/pi/PlanktoScope/software/node-red-dashboard/default-configs` to `/home/pi/PlanktoScope`. + This is done in order to set the `acq_instrument` field to a reasonable default value, as a workaround for the storage of that metadata field in `config.json` rather than `hardware.json`. - Here is an example of the contents of the `config.json` file: ``` { @@ -69,8 +70,8 @@ Currently, the PlanktoScope software stores metadata in two configuration files: } ``` -Both of these configuration files are used by the Node-RED dashboard for saving metadata persistently across restart, but some metadata information set by the user in the Node-RED dashboard is not persisted. -Both persisted and unpersisted metadata fields are assembled into a `metadata.json` file for each raw dataset, which is created by the Python backend's `ImagerProcess` module as part of image acquisition. +Both of these configuration files are used by the Node-RED dashboard for saving metadata persistently across restarts, but some metadata information set by the user in the Node-RED dashboard is not persisted. +Persisted and unpersisted metadata fields are assembled into a `metadata.json` file for each raw dataset, which is created by the Python backend's `ImagerProcess` module as part of image acquisition. 
We have a ["Metadata Compilation" spreadsheet](https://docs.google.com/spreadsheets/d/1TSIaOFEIMvvYyqAFrsiZxVtGXZvWVdZbWO_LU-2A_TE/edit?usp=drive_link) which describes every metadata field; only fields from that spreadsheet with field names containing one of the following prefixes are exported to the `metadata.json` file: - `sample_` - `acq_`