add JUPYTER_MODE for JupyterHub #77
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -123,21 +123,22 @@ services: | |
- ./config/yarn-write-policy.json:/config/yarn-write-policy.json | ||
- ./scripts/minio_create_bucket_entrypoint.sh:/scripts/minio_create_bucket_entrypoint.sh | ||
|
||
dev_notebook: | ||
dev_jupyterlab: | ||
build: | ||
context: . | ||
dockerfile: Dockerfile | ||
container_name: spark-dev-notebook | ||
container_name: dev-jupyterlab | ||
ports: | ||
- "4041:4041" | ||
depends_on: | ||
- spark-master | ||
- minio-create-bucket | ||
environment: | ||
- NOTEBOOK_PORT=4041 | ||
- JUPYTER_MODE=jupyterlab | ||
- YARN_RESOURCE_MANAGER_URL=http://yarn-resourcemanager:8032 | ||
- SPARK_MASTER_URL=spark://spark-master:7077 | ||
- SPARK_DRIVER_HOST=spark-dev-notebook | ||
- SPARK_DRIVER_HOST=dev-jupyterlab | ||
- MINIO_URL=http://minio:9002 | ||
- MINIO_ACCESS_KEY=minio-readwrite | ||
- MINIO_SECRET_KEY=minio123 | ||
|
@@ -151,34 +152,94 @@ services: | |
volumes: | ||
- ./cdr/cdm/jupyter:/cdm_shared_workspace | ||
|
||
user_notebook: | ||
user-jupyterlab: | ||
build: | ||
context: . | ||
dockerfile: Dockerfile | ||
container_name: spark-user-notebook | ||
container_name: user-jupyterlab | ||
ports: | ||
- "4042:4042" | ||
depends_on: | ||
- spark-master | ||
- minio-create-bucket | ||
environment: | ||
- NOTEBOOK_PORT=4042 | ||
- JUPYTER_MODE=jupyterlab | ||
- YARN_RESOURCE_MANAGER_URL=http://yarn-resourcemanager:8032 | ||
- SPARK_MASTER_URL=spark://spark-master:7077 | ||
- SPARK_DRIVER_HOST=spark-user-notebook | ||
- SPARK_DRIVER_HOST=user-jupyterlab | ||
- MINIO_URL=http://minio:9002 | ||
- MINIO_ACCESS_KEY=minio-readonly | ||
- MINIO_SECRET_KEY=minio123 | ||
- S3_YARN_BUCKET=yarn | ||
- MAX_EXECUTORS=4 | ||
# TODO: create postgres user w/ only write access to the hive tables | ||
# TODO: create postgres user w/ only read access to the hive tables | |
- POSTGRES_USER=hive | ||
- POSTGRES_PASSWORD=hivepassword | ||
- POSTGRES_DB=hive | ||
- POSTGRES_URL=postgres:5432 | ||
volumes: | ||
- ./cdr/cdm/jupyter/user_shared_workspace:/cdm_shared_workspace/user_shared_workspace | ||
|
||
dev_jupyterhub: | ||
build: | ||
context: . | ||
dockerfile: Dockerfile | ||
container_name: dev-jupyterhub | ||
ports: | ||
- "4043:4043" | ||
depends_on: | ||
- spark-master | ||
- minio-create-bucket | ||
environment: | ||
- NOTEBOOK_PORT=4043 | ||
- JUPYTER_MODE=jupyterhub | ||
- YARN_RESOURCE_MANAGER_URL=http://yarn-resourcemanager:8032 | ||
- SPARK_MASTER_URL=spark://spark-master:7077 | ||
- SPARK_DRIVER_HOST=dev-jupyterhub | |
- MINIO_URL=http://minio:9002 | ||
- MINIO_ACCESS_KEY=minio-readwrite | ||
- MINIO_SECRET_KEY=minio123 | ||
- S3_YARN_BUCKET=yarn | ||
- MAX_EXECUTORS=4 | ||
- POSTGRES_USER=hive | ||
- POSTGRES_PASSWORD=hivepassword | ||
- POSTGRES_DB=hive | ||
- POSTGRES_URL=postgres:5432 | ||
- USAGE_MODE=dev | ||
volumes: | ||
- ./cdr/cdm/jupyter:/cdm_shared_workspace | ||
- ./cdr/cdm/jupyter/jupyterhub/users_home:/jupyterhub/users_home | ||
|
||
user_jupyterhub: | ||
build: | ||
context: . | ||
dockerfile: Dockerfile | ||
container_name: user-jupyterhub | ||
ports: | ||
- "4044:4044" | ||
depends_on: | ||
- spark-master | ||
- minio-create-bucket | ||
environment: | ||
- NOTEBOOK_PORT=4044 | ||
- JUPYTER_MODE=jupyterhub | ||
- YARN_RESOURCE_MANAGER_URL=http://yarn-resourcemanager:8032 | ||
- SPARK_MASTER_URL=spark://spark-master:7077 | ||
- SPARK_DRIVER_HOST=user-jupyterhub | ||
- MINIO_URL=http://minio:9002 | ||
- MINIO_ACCESS_KEY=minio-readonly | ||
- MINIO_SECRET_KEY=minio123 | ||
- S3_YARN_BUCKET=yarn | ||
- MAX_EXECUTORS=4 | ||
- POSTGRES_USER=hive | ||
- POSTGRES_PASSWORD=hivepassword | ||
- POSTGRES_DB=hive | ||
- POSTGRES_URL=postgres:5432 | ||
Comment on lines +236 to +239

Not really the point of this PR, but it makes me nervous that users have access to these creds...

yea. We do have a todo item for this.

Even that would make me nervous. It means that any user could blow away the hive tables.

I updated the todo item to

Users won't need to create tables?

I guess another question is: if users and devs are all in the same jupyterhub instance, is it possible to have different environments?

Users won't, since they don't have MinIO write permission. I am hoping it can be configured. But TBD.

Ok, so users = read only, devs = write, essentially.

If we could ever get the remote metastore working, that might provide some protection as well, not sure.
||
volumes: | ||
- ./cdr/cdm/jupyter/jupyterhub/users_home:/jupyterhub/users_home | ||
|
||
postgres: | ||
image: postgres:16.3 | ||
restart: always | ||
|
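The services in this compose file appear to use the container name as `SPARK_DRIVER_HOST`, which is easy to get out of sync when a service is renamed. Below is a minimal sketch of a check for that, assuming the flat layout shown in the diff (a `container_name:` line followed later by a `- SPARK_DRIVER_HOST=` entry in the same service). `check_driver_hosts` is a hypothetical helper and is not part of this PR.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): read compose YAML on stdin and
# report every SPARK_DRIVER_HOST that differs from the most recently seen
# container_name. Assumes container_name appears before the environment
# block in each service, as it does in the diff above.
check_driver_hosts() {
  awk '
    /container_name:/ { name = $2 }
    /SPARK_DRIVER_HOST=/ {
      # Strip everything up to and including the "=", leaving only the value.
      sub(/.*SPARK_DRIVER_HOST=/, "")
      host = $1
      if (host != name) {
        printf "mismatch: container_name=%s SPARK_DRIVER_HOST=%s\n", name, host
        bad = 1
      }
    }
    # Exit non-zero if any mismatch was reported.
    END { exit bad + 0 }
  '
}
```

It could be run as `check_driver_hosts < docker-compose.yaml` (the file name is an assumption; the page does not show it), for example as a pre-commit step.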
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,5 @@ | ||
#!/bin/bash | ||
|
||
echo "starting jupyter notebook" | ||
|
||
# Ensure NOTEBOOK_DIR is set | ||
if [ -z "$NOTEBOOK_DIR" ]; then | ||
echo "ERROR: NOTEBOOK_DIR is not set. Please run setup.sh first." | ||
|
@@ -10,17 +8,28 @@ fi | |
|
||
mkdir -p "$NOTEBOOK_DIR" && cd "$NOTEBOOK_DIR" | ||
|
||
# install Plotly extension | ||
jupyter labextension install [email protected] | ||
|
||
# install ipywidgets extension | ||
jupyter labextension install @jupyter-widgets/[email protected] | ||
if [ "$JUPYTER_MODE" = "jupyterlab" ]; then | ||
echo "starting jupyterlab" | ||
# install Plotly extension | ||
jupyter labextension install [email protected] | ||
|
||
# install ipywidgets extension | ||
jupyter labextension install @jupyter-widgets/[email protected] | ||
|
||
# Start Jupyter Lab | ||
jupyter lab --ip=0.0.0.0 \ | ||
--port="$NOTEBOOK_PORT" \ | ||
--no-browser \ | ||
--allow-root \ | ||
--notebook-dir="$NOTEBOOK_DIR" \ | ||
--ServerApp.token='' \ | ||
--ServerApp.password='' | ||
elif [ "$JUPYTER_MODE" = "jupyterhub" ]; then | ||
echo "starting jupyterhub" | ||
|
||
# Start Jupyter Lab | ||
jupyter lab --ip=0.0.0.0 \ | ||
--port="$NOTEBOOK_PORT" \ | ||
--no-browser \ | ||
--allow-root \ | ||
--notebook-dir="$NOTEBOOK_DIR" \ | ||
--ServerApp.token='' \ | ||
--ServerApp.password='' | ||
echo "TO BE IMPLEMENTED" | ||
else | ||
echo "ERROR: JUPYTER_MODE is not set to jupyterlab or jupyterhub. Please set JUPYTER_MODE to either jupyterlab or jupyterhub." | ||
exit 1 | ||
fi |
We probably only need one JupyterHub instance eventually, controlling permissions based on different user groups. But I haven't completely sorted it out yet.
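As an aside on the entrypoint diff above: the `if`/`elif`/`else` check on `JUPYTER_MODE` could also be written as a `case` statement in a small function, which keeps the list of accepted modes in one place. This is only a sketch of that alternative. `validate_jupyter_mode` is a hypothetical name, and the function only validates and announces the mode, it does not start anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): accept only the two modes the
# entrypoint script knows about and fail loudly on anything else.
validate_jupyter_mode() {
  case "$1" in
    jupyterlab|jupyterhub)
      echo "starting $1"
      ;;
    *)
      # Error goes to stderr so callers capturing stdout are not confused.
      echo "ERROR: JUPYTER_MODE must be jupyterlab or jupyterhub (got '${1:-unset}')." >&2
      return 1
      ;;
  esac
}
```

In the script it would be called near the top as `validate_jupyter_mode "$JUPYTER_MODE" || exit 1`, before any extension installs run.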