Add the "decollage" process for raw microscope output to the package #22
Conversation
@Kzra your feedback is particularly appreciated. If you're still generating new images on a regular basis, this should be directly useful to you now, and faster, as we're no longer re-reading the large TIFF every time we extract a small window. If not, I guess the next step is to add a binary classifier, attach a probably-junk flag and maybe a confidence metric to each output image, with the option of simply not sending anything on to the cloud at this stage, and see if we can do that with the new object_store_api.
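To illustrate where the speed-up comes from: the collage is loaded once and the small windows are sliced out of the in-memory array, instead of re-reading the TIFF per window. A minimal sketch, assuming the collage is already a numpy array and the window coordinates come from metadata (the function and box layout here are hypothetical, not the package's actual API):

```python
import numpy as np

def extract_windows(collage, boxes):
    """Slice each (top, left, height, width) box out of a collage loaded once."""
    return [collage[top:top + h, left:left + w] for top, left, h, w in boxes]

# Stand-in for a large FlowCam collage: load once, slice many times.
collage = np.arange(100 * 100).reshape(100, 100)
windows = extract_windows(collage, [(0, 0, 10, 20), (50, 40, 5, 5)])
print([w.shape for w in windows])  # → [(10, 20), (5, 5)]
```

Each slice is a cheap numpy view, so extraction cost no longer scales with the size of the source TIFF.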
Great work speeding up the decollage process.
Great work adding the decollage process.
See #21 for the context for this and links to the original: this moves a rough script from the internal project and refactors it for use in a future, as yet unspecified, pipeline.
To test

- Run the unit tests
- Run from the command line (stopgap)

The last argument there is an "experiment name" used to name the output files. This is a stopgap set of changes; I didn't want to go any further as it's still not completely clear how the workflow fits together. #9
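As a sketch of how the experiment name might feed into output file naming (the exact pattern the script uses isn't shown here, so the function and filename scheme below are assumptions):

```python
from pathlib import Path

def output_name(out_dir, experiment, index, ext="tif"):
    """Build a per-window output path from the experiment name and window index."""
    return Path(out_dir) / f"{experiment}_{index}.{ext}"

print(output_name("out", "lake_survey_2023", 7))  # → out/lake_survey_2023_7.tif
```

Keeping the experiment name in every filename makes the extracted windows traceable back to their source run once they land in the object store.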
What this doesn't cover
One discovery here is that there's a lot of metadata for individual images, based on segmentation and shape analysis that happens onboard the FlowCam - a lot more detail than I thought we'd have access to.
Given we don't have a really clear use case for it, I haven't attempted to do anything with it here, but I can see the output usefully being either dropped into the object store and picked up for use with dask via intake, or indexed in a lightweight database like sqlite/datasette.
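For the sqlite option, the per-image metadata could be indexed with nothing beyond the standard library; a rough sketch (the column names here are hypothetical placeholders, not the FlowCam's actual fields):

```python
import sqlite3

# In-memory database for illustration; pointing at a file instead would
# give datasette something to serve.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (filename TEXT, area REAL, diameter REAL)")
rows = [
    ("lake_survey_2023_0.tif", 512.0, 25.4),
    ("lake_survey_2023_1.tif", 128.0, 12.7),
]
conn.executemany("INSERT INTO images VALUES (?, ?, ?)", rows)
conn.commit()

# Example query: images above an area threshold.
big = conn.execute("SELECT filename FROM images WHERE area > 200").fetchall()
print(big)  # → [('lake_survey_2023_0.tif',)]
```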