Parse multiplexing datasets into Fractal #34

Closed
jluethi opened this issue Jul 12, 2022 · 10 comments

Collaborator

jluethi commented Jul 12, 2022

A typical multiplexing dataset will consist of multiple input folders. Each folder contains the images of a multiplexing cycle.

They all belong to the same physical plate though. Thus, if we parse multiplexing data, we want to parse multiple folders into a single Zarr file, give the user the ability to name every channel for each cycle and retain the cycle information in metadata somewhere.

I created a tiny example dataset for this:
/data/active/fractal/3D/PelkmansLab/CardiacMultiplexing/Multiplex_2x2_singleWell

It is structured the following way:

.
├── cycle1
└── cycle2

Both cycles contain the same B03 well & 2x2 sites, with channels C01, C02 & C03. When parsed to OME-Zarr, I would expect to have channels 0-5 (with some names based on user inputs). The sites in both cycles are basically the same (except for very small shifts between cycles).
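
As a rough illustration of the "channels 0-5" expectation above, the sketch below assigns one global channel index to each (cycle, channel) pair. This is not Fractal code: the `cycles` dictionary and the `flatten_channels` helper are hypothetical names made up for this example, and the folder contents are assumed from the description of the tiny dataset.

```python
# Hypothetical input: cycle folder name -> channel IDs found in that folder.
# The values mirror the example dataset described above (assumption).
cycles = {
    "cycle1": ["C01", "C02", "C03"],
    "cycle2": ["C01", "C02", "C03"],
}


def flatten_channels(cycles):
    """Give every (cycle, channel) pair one global index, in insertion order."""
    flat = []
    for cycle_name, channel_ids in cycles.items():
        for channel_id in channel_ids:
            flat.append(
                {"index": len(flat), "cycle": cycle_name, "channel": channel_id}
            )
    return flat


channels = flatten_channels(cycles)
for ch in channels:
    print(ch["index"], ch["cycle"], ch["channel"])
```

Keeping the cycle name next to each global index is one simple way to retain the cycle information that would otherwise be lost once all channels sit in a single array.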

Collaborator Author

jluethi commented Aug 29, 2022

Regarding how data containing multiple acquisitions is structured in the OME-Zarr file:
On one level, multiple cycles/acquisitions are just multiple channels for the same plate. But we need to have metadata to know which channels belong to a given acquisition.
The OME-NGFF spec already has some logic built in for defining "acquisitions", see here: https://ngff.openmicroscopy.org/latest/

In that logic, we would save multiple "field_of_views" / folders of images per well, instead of just one image folder per well. Each image folder would then contain the images for a given acquisition.

I'm not sure yet how downstream reading of this would work. I don't think the napari-ome-zarr plugin would display this correctly: it currently displays only the first field of view per well, so it would potentially just ignore the other cycles if we save it that way. It is also a somewhat complicated mix of having multiple images per well (when FOVs are saved separately, which we aren't doing) and multiple acquisitions per well.

jluethi transferred this issue from fractal-analytics-platform/fractal-client on Sep 2, 2022
Collaborator Author

jluethi commented Sep 8, 2022

I started a discussion on loading multiple acquisitions here: ome/ome-zarr-py#225

Big picture, I think it would be promising to use the acquisition metadata and save one OME-Zarr image per cycle for each well. If that's where the OME-NGFF community is going and we can add support for that to ome-zarr-py, then proceeding with this logic would be a good plan. It should make the parsing quite easy to handle, but would require a bit of work on the ome-zarr-py side to support visualization at the plate level.

jluethi added the "High Priority Current Priorities & Blocking Issues" label on Oct 5, 2022
Collaborator Author

jluethi commented Oct 11, 2022

Main take-aways from the discussion:

  1. We can start using the acquisition metadata. It makes sense to save our images to OME-Zarr that way. This means that wells in our multi-well plate will look like this:
.
└── B
    └── 03
        ├── 0   # Acquisition 1 => First cycle, contains all channels for the first cycle
        ├── 1   # Acquisition 2
        ├── 2   # Acquisition 3
        └── 3   # Acquisition 4

Each Image at the well level is its own acquisition with the correct metadata. See https://ngff.openmicroscopy.org/latest/#well-md for an example of a well with 2 acquisitions (they also have multiple images per acquisition, which we don't do).

Also, the plate metadata contains information about all the acquisitions that are present, see here: https://ngff.openmicroscopy.org/latest/#plate-md

  2. We will have to come up with a proposed implementation for how to load multiple acquisitions with the ome-zarr-py library at the plate level. See the discussion on this in "Handling “acquisitions” in plate & well reading" (ome/ome-zarr-py#225).
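
To make the first take-away more concrete, here is a minimal sketch of what the well-level and plate-level attributes could look like for the B/03 layout above. It only builds plain dictionaries and does not use ome-zarr-py or Fractal. The key names follow my reading of the OME-NGFF v0.4 well and plate metadata and should be checked against the spec pages linked above. The helper names `well_metadata` and `plate_metadata` and the acquisition names `cycle1`, `cycle2`, ... are assumptions made for this example.

```python
import json


def well_metadata(n_acquisitions):
    """Well attributes with one image per acquisition (i.e. per cycle)."""
    return {
        "well": {
            "images": [
                {"path": str(i), "acquisition": i} for i in range(n_acquisitions)
            ],
            "version": "0.4",
        }
    }


def plate_metadata(rows, columns, well_paths, n_acquisitions):
    """Plate attributes listing every acquisition present on the plate."""
    wells = []
    for path in well_paths:
        row_name, column_name = path.split("/")
        wells.append(
            {
                "path": path,
                "rowIndex": rows.index(row_name),
                "columnIndex": columns.index(column_name),
            }
        )
    return {
        "plate": {
            # Acquisition names are hypothetical labels, not from the thread.
            "acquisitions": [
                {"id": i, "name": f"cycle{i + 1}"} for i in range(n_acquisitions)
            ],
            "rows": [{"name": r} for r in rows],
            "columns": [{"name": c} for c in columns],
            "wells": wells,
            "version": "0.4",
        }
    }


well = well_metadata(4)
plate = plate_metadata(["B"], ["03"], ["B/03"], 4)
print(json.dumps(well, indent=2))
print(json.dumps(plate, indent=2))
```

In this sketch the image path (`0`, `1`, ...) and the acquisition id happen to coincide because there is exactly one image per acquisition; with several images per acquisition the two would differ.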

Collaborator

tcompa commented Nov 16, 2022

The Multiplex dataset in fractal-analytics-platform/fractal-client#213 is a bit too large (1G per cycle).
@jluethi could we produce a smaller one?

It would be even better if it included some differences between cycles (e.g. the cycles have different numbers/names of channels), so that we can discuss the correct way to store this information.

Collaborator Author

jluethi commented Nov 16, 2022

Ok, I'll create a smaller test set like the tiny test that has the following:
3 cycles:
Cycle 1: 4 channels
Cycle 2: 3 channels
Cycle 3: 2 channels

Channel names will be a mix of distinct and not (each cycle has a DAPI channel, but the other channels have different names)

Given you want them < 1GB, let's do 2 FOVs, 2 Z planes

Do we need a second tiny one to test the multi-well version of this?

Collaborator Author

jluethi commented Nov 17, 2022

Test set is now available here:

UZH path: /data/active/fractal/3D/PelkmansLab/CardiacMultiplexing/tiny_multiplexing
FMI path: TBD
Image count: 48
Metadata: Yes

Description:
Synthetic multiplexing test data for Fractal, derived from the 20200810-CardiomyocyteDifferentiation14 data from Joel Lüthi.
Contains 2 wells, B03 & B05. Only 2 FOVs and 2 Z slices each (slices 5 & 6 of the stack, renamed to 1 & 2) + synthetic metadata files for each cycle.

Cycle 1: Only contains DAPI (C01)
Cycle 2: Contains DAPI (C01) & Na/K ATPase (C03)
Cycle 3: Contains DAPI (C01), HSP60 (C02) & bTubulin (C03)
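
As a quick sanity check of the description above, the sketch below recomputes the image count and looks for channel IDs whose label changes between cycles. It assumes one image file per (well, FOV, Z slice, channel), which is not stated explicitly above; the variable names are hypothetical.

```python
wells = ["B03", "B05"]
n_fovs = 2
n_z = 2

# Channel ID -> label for each cycle, copied from the description above.
cycle_channels = {
    1: {"C01": "DAPI"},
    2: {"C01": "DAPI", "C03": "Na/K ATPase"},
    3: {"C01": "DAPI", "C02": "HSP60", "C03": "bTubulin"},
}

# Assumption: one image file per (well, FOV, Z slice, channel).
n_channels_total = sum(len(chans) for chans in cycle_channels.values())
image_count = len(wells) * n_fovs * n_z * n_channels_total
print(image_count)  # 2 * 2 * 2 * 6 = 48

# Collect all labels seen for each channel ID across cycles.
labels_by_id = {}
for chans in cycle_channels.values():
    for channel_id, label in chans.items():
        labels_by_id.setdefault(channel_id, set()).add(label)

# Channel IDs that mean different things in different cycles.
ambiguous = sorted(cid for cid, labels in labels_by_id.items() if len(labels) > 1)
print(ambiguous)  # ['C03']
```

Under that assumption the result matches the stated image count of 48. The check also shows that C03 is Na/K ATPase in cycle 2 but bTubulin in cycle 3, so a channel ID alone does not identify a channel across cycles.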

Collaborator

tcompa commented Nov 21, 2022

Version 0.4.0 should be ready for testing, and then we can close this (parsing) issue.
The discussion is still open about parameters and workflows, to be continued elsewhere.

Collaborator Author

jluethi commented Nov 21, 2022

Great!
Do you have an example of how to use it that you can add to the demo repo? Or example code for how to call the parsing function that I can use to build the example from? :)
Will switch to the 0.4.0 version for my current test and then hopefully get to testing the multiplexing parsing soon :)

Collaborator

tcompa commented Nov 21, 2022

This example should work:
https://github.com/fractal-analytics-platform/fractal-demos/tree/main/examples-v1/02_cardio_multiplexing

(reminder: you should modify the INPUT_PATH variable)

Collaborator Author

jluethi commented Nov 24, 2022

This is working with #223 and the 0.4.6 core tasks release :)

jluethi closed this as completed on Nov 24, 2022
Repository owner moved this from TODO to Done in Fractal Project Management Nov 24, 2022