Add jupyter service
consolethinks authored Apr 16, 2024
1 parent 3e3eb09 commit b27b3f5
Showing 8 changed files with 220 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
package.json
package-lock.json
node_modules/
.ipynb_checkpoints/
*.ipynb
2 changes: 2 additions & 0 deletions README.md
@@ -41,11 +41,13 @@ graph TD
mongodb[mongodb**] --> backend
backend --> frontend[frontend**]
backend --> searchapi
backend --> jupyter
end
proxy -.- backend
proxy -.- frontend
proxy -.- searchapi
proxy -.- jupyter
```

Services flagged with `*` have extra internal dependencies that are not shared across the two backend versions; those flagged with `**` depend explicitly on the `BE_VERSION` value. The README of each service lists these dependencies.
1 change: 1 addition & 0 deletions docker-compose.yaml
@@ -5,3 +5,4 @@ include:
- ./services/frontend/docker-compose.yaml
- ./services/searchapi/docker-compose.yaml
- ./services/proxy/docker-compose.yaml
- ./services/jupyter/docker-compose.yaml
23 changes: 23 additions & 0 deletions services/jupyter/README.md
@@ -0,0 +1,23 @@
# Jupyter Notebook

This Jupyter Notebook instance is preconfigured with an example notebook that shows the usage of [Pyscicat](https://github.com/scicatproject/pyscicat).

## [Pyscicat Notebook](./config/notebooks/pyscicat.ipynb)
This notebook demonstrates the major actions Pyscicat supports:
* logging in to the SciCat backend
* dataset creation
* datablock creation
* attachment upload

## [.env file](./config/.env)
This file contains the environment variables for connecting to the backend service deployed by this project.
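
A minimal sketch of how a notebook could read these variables and fail early when one is missing. It assumes the three names defined in `config/.env` (`BE_BASE_URL`, `USERNAME`, `PASSWORD`); the helper name `load_backend_settings` is hypothetical and not part of Pyscicat.

```python
import os


def load_backend_settings(environ=os.environ) -> dict:
    """Collect the connection variables, raising if any is missing or empty.

    `load_backend_settings` is an illustrative helper, not a Pyscicat function.
    """
    names = ("BE_BASE_URL", "USERNAME", "PASSWORD")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}
```

Inside the container, calling `load_backend_settings()` with no arguments reads the process environment populated from the `.env` file.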

## [Thumbnail image](./config/notebooks/example_files/thumbnail.png)
An example image used for the attachment upload demonstration.

## Default configuration
This service depends only on the backend service, since it demonstrates communication with the backend through Pyscicat.

The notebooks are mounted into the container from the [config/notebooks](config/notebooks/) directory. Changes to these notebooks should *not* be contributed back to this repository unless that is intentional. If you do want to upstream changes to these notebooks, clear all cell outputs from them first.
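
One way to clear the results before committing is sketched below, assuming the notebook is a standard nbformat 4 JSON file. It uses only the standard library; the function names are illustrative. The `--clear-output` option of `jupyter nbconvert` is an alternative if that tool is available.

```python
import json
from pathlib import Path


def clear_notebook_outputs(notebook: dict) -> dict:
    """Reset outputs and execution counts on every code cell of the notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: Path) -> None:
    """Rewrite the notebook at `path` in place with all results cleared."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_notebook_outputs(notebook)
    path.write_text(json.dumps(cleaned, indent=1) + "\n", encoding="utf-8")
```

For example, `clear_notebook_file(Path("config/notebooks/pyscicat.ipynb"))` would clean the example notebook when run from the service directory.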

The [main readme](../../README.md) covers all dependencies of this package.
4 changes: 4 additions & 0 deletions services/jupyter/config/.env
@@ -0,0 +1,4 @@
BE_BASE_URL="http://backend:3000/api/v3/"
USERNAME="admin"
PASSWORD="2jf70TPNZsS"
NOTEBOOK_ARGS="--NotebookApp.token=''"
175 changes: 175 additions & 0 deletions services/jupyter/config/notebooks/pyscicat.ipynb
@@ -0,0 +1,175 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# install pyscicat\n",
"!pip install 'pyscicat>=0.4.4'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create scicat client\n",
"\n",
"from pyscicat.client import ScicatClient\n",
"import os\n",
"\n",
"base_url = os.environ.get('BE_BASE_URL')\n",
"username = os.environ.get('USERNAME')\n",
"password = os.environ.get('PASSWORD')\n",
"\n",
"client = ScicatClient(base_url=base_url, username=username, password=password)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create (raw) dataset\n",
"\n",
"from pyscicat.model import Ownable, RawDataset\n",
"from datetime import datetime\n",
"\n",
"ownable = Ownable(\n",
" ownerGroup=\"example_research_group\", # only obligatory element of this model\n",
" accessGroups=['group1', 'group2'],\n",
" instrumentGroup='instrument1'\n",
")\n",
"\n",
"dataset = RawDataset(\n",
" contactEmail=\"[email protected]\", # must be valid email in format\n",
" creationTime=datetime(year=2024, month=4, day=12, hour=12, minute=35, second=30).isoformat(),\n",
" datasetName=\"example_dataset\",\n",
" description=\"You can insert a lengthy description here\",\n",
" instrumentId=\"example.instrument.id\",\n",
" isPublished=False,\n",
" keywords=[\"important\", \"biology\", \"example\"],\n",
" license=\"Public Domain\",\n",
" numberOfFiles = 6,\n",
" orcidOfOwner=\"0000-0001-5109-3700\",\n",
" owner=\"Emmett Exemplum, PhD\",\n",
" ownerEmail=\"example.exemplum@some_uni.ch\",\n",
" size=589824, # in bytes!\n",
" sourceFolder=\"/datasets/example_dataset\", # this will have to reflect the retrieval location for the archival system \n",
" #sourceFolderHost=\"earth.net\", # same as above but the network host part (instead of filesystem)\n",
" validationStatus=\"valid\",\n",
" version=\"4.0.0\", # optional\n",
" scientificMetadata={}, # optional\n",
" principalInvestigator=\"Mr. Irvine Investigator\",\n",
" creationLocation=\"University of Example, Exemplia\",\n",
" #dataFormat=\"someformat\", # optional\n",
" sampleId=\"example.sample.id\",\n",
" **ownable.model_dump()\n",
")\n",
"\n",
"# attempt to create dataset, *CAN* throw ScicatCommError\n",
"dataset_id = client.datasets_create(dataset)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Add datablock to dataset\n",
"\n",
"from pyscicat.model import DataFile, CreateDatasetOrigDatablockDto\n",
"from datetime import datetime\n",
"\n",
"# disable info level logging\n",
"import logging\n",
"logging.getLogger().setLevel(logging.FATAL)\n",
"\n",
"files = [\n",
" \"file1.txt\",\n",
" \"someimage.raw\",\n",
" \"thumbnail/someimage.jpg\",\n",
" \"thumbnail/FoilHoles.jpg\",\n",
" \"transformations.json\",\n",
" \"FoilHoles.raw\"\n",
"]\n",
"file_sizes = [\n",
" 100,\n",
" 204800,\n",
" 20480,\n",
" 20480,\n",
" 1000,\n",
" 342964\n",
"]\n",
"file_times = [\n",
" datetime(year=2024, month=4, day=12, hour=12, minute=35, second=30),\n",
" datetime(year=2024, month=2, day=4, hour=5, minute=56, second=39),\n",
" datetime(year=2024, month=3, day=30, hour=19, minute=4, second=53),\n",
" datetime(year=2024, month=4, day=2, hour=16, minute=25, second=37),\n",
" datetime(year=2024, month=4, day=12, hour=8, minute=13, second=44),\n",
" datetime(year=2024, month=2, day=4, hour=4, minute=24, second=17)\n",
"]\n",
"dataFileList = [\n",
" DataFile(path=p, size=s, time=t.isoformat()) for (p, s, t) in zip(files, \n",
" file_sizes, \n",
" file_times)\n",
"]\n",
"\n",
"data_block = CreateDatasetOrigDatablockDto(\n",
" size=576, version=1, dataFileList=dataFileList\n",
")\n",
"\n",
"client.datasets_origdatablock_create(dataset_id, data_block);"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Upload attachment\n",
"\n",
"from pyscicat.client import encode_thumbnail\n",
"from pyscicat.model import Attachment\n",
"\n",
"attachment = Attachment(\n",
" datasetId=dataset_id,\n",
" thumbnail=encode_thumbnail(\"example_files/thumbnail.png\", \"png\"),\n",
" caption=\"Example thumbnail image as attachment\",\n",
" **ownable.model_dump()\n",
")\n",
"\n",
"client.datasets_attachment_create(attachment);"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
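
The dataset cell in this notebook declares `size=589824` and `numberOfFiles=6`, and both values can be derived from the file list used in the datablock cell. The sketch below shows that derivation with the sizes copied from the notebook; `summarize_files` is a hypothetical helper, not part of Pyscicat. Note that 589824 bytes is exactly 576 KiB, which is the value the datablock cell passes as `size`; whether the backend expects bytes there is not stated in this commit.

```python
def summarize_files(file_sizes: list[int]) -> dict:
    """Derive the file count and total size in bytes from per-file sizes."""
    return {"numberOfFiles": len(file_sizes), "size": sum(file_sizes)}


# Per-file sizes in bytes, copied from the datablock cell of the notebook.
file_sizes = [100, 204800, 20480, 20480, 1000, 342964]

summary = summarize_files(file_sizes)
print(summary)                  # file count and total size in bytes
print(summary["size"] // 1024)  # the same total expressed in KiB
```

Deriving both numbers from one list keeps the dataset and datablock metadata from drifting apart when files are added or removed.
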
13 changes: 13 additions & 0 deletions services/jupyter/docker-compose.yaml
@@ -0,0 +1,13 @@
services:
jupyter:
image: quay.io/jupyter/base-notebook:x86_64-notebook-7.1.2
depends_on:
backend:
condition: service_healthy
labels:
- traefik.http.routers.jupyter.rule=Host(`jupyter.localhost`)
- traefik.http.services.jupyter.loadbalancer.server.port=8888
volumes:
- ./config/notebooks:/home/jovyan/notebooks
env_file:
- config/.env
