There's a Miro board which formed a walkthrough picture of the end-to-end flow from instrument to inference, made when this project was first proposed as one that the RSE group could pick up as an onboarding exercise. Our limited Miro plan isn't great for collaboration or reuse, though. The experience of writing diagrams as dotfiles was good, either with sketchviz or a VSCode extension. They need a pipeline to render them and publish them, embedded in markdown, to Pages; that pipeline would be useful elsewhere too.
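As a rough idea of what the render step could look like, here is a minimal sketch in Python. It assumes the Graphviz `dot` binary is available wherever the pipeline runs, and the function names (`dot_command`, `markdown_embed`, `render_all`) and the flat directory layout are made up for illustration, not taken from this repo.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def dot_command(src: Path, out_dir: Path) -> list[str]:
    """Build the Graphviz command that renders one dotfile to SVG."""
    return ["dot", "-Tsvg", str(src), "-o", str(out_dir / (src.stem + ".svg"))]


def markdown_embed(src: Path) -> str:
    """Markdown image link to the rendered SVG (assumes it sits next to the page)."""
    return f"![{src.stem}]({src.stem}.svg)"


def render_all(src_dir: Path, out_dir: Path) -> list[str]:
    """Render every *.dot file in src_dir and return one markdown embed per file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    embeds = []
    for src in sorted(src_dir.glob("*.dot")):
        # Only shell out when Graphviz is installed; the embed line is still
        # produced so the markdown can be checked without it.
        if shutil.which("dot"):
            subprocess.run(dot_command(src, out_dir), check=True)
        embeds.append(markdown_embed(src))
    return embeds
```

A CI job could call `render_all` on the diagrams directory and paste the returned lines into the markdown that gets published to Pages.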
The wider aim here is to identify components in other projects that could be tested with this use case (such as on-prem Apache Beam or Airflow installations) and to provide some natural prioritisation for the laundry list of possible next steps; cf. the places to intervene that emerged from the Miro diagram, listed below. The list is probably not exhaustive, and the existing work only covers the last stages:
Workflow stages
Linking the sample acquisition date and time to the output data of the instrument (in this case the FlowCam)
Navigate security concerns about saving output straight onto a network drive rather than manual transfer
Package up the input-to-analysis-ready processing in a way that could be run as a pipeline, e.g. by Airflow
Process to poll for new source data, process them, and upload to object storage without manual triggering
Binary classifier(s) to sift volumes of uninteresting data to save on excess cloud storage
Establish the best running consensus we can on managing credentials for server-side use with cloud storage
Same consensus but for client-side, handling SSO and whether it makes sense to lean on Posit Connect to do that
Write-safe options for metadata, general preferences for vector and document stores that are easy to audit and (re)deploy
Standards-oriented metadata catalogue interfaces
Proof of concept feature extraction from images and sight of its future applications
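For the "poll for new source data" stage in the list above, one possible shape is sketched below. This is only an illustration: the `*.tif` pattern is an assumption about what the instrument writes, `handle` is a hypothetical stand-in for whatever does the processing and the upload to object storage, and the in-memory `seen` set would need to be persisted somewhere in real use.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def new_files(watch_dir: Path, seen: set[str], pattern: str = "*.tif") -> list[Path]:
    """Files in watch_dir matching pattern that have not been handled yet."""
    return [p for p in sorted(watch_dir.glob(pattern)) if p.name not in seen]


def poll_once(
    watch_dir: Path,
    seen: set[str],
    handle: Callable[[Path], None],
    pattern: str = "*.tif",
) -> int:
    """Run one polling pass and return how many new files were handled."""
    count = 0
    for path in new_files(watch_dir, seen, pattern):
        handle(path)         # hypothetical: process the file, then upload it
        seen.add(path.name)  # mark as seen only after handling succeeded
        count += 1
    return count
```

Something like an Airflow schedule or a plain loop with a sleep could call `poll_once` repeatedly, which is what removes the manual trigger.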
https://nerc-ceh.github.io/plankton_ml/diagrams/ - this is unfinished as it stands. The diagrams that exist were really useful for prompting conversation with support teams about future options for #20 and the pipelines got reused elsewhere.
I think this is a nice visual output that helps with project communication and is useful to refer back to. I would like to complete the drafts as a relative priority.