generated from ebmdatalab/notebook-template
Commit
0 parents
commit d6373cb
Showing 15 changed files with 1,071 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Don't allow windows checkouts to convert `\n` to `\r\n`, as this | ||
# breaks stuff that is meant to be run in linux-in-docker | ||
* text=auto eol=lf |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Credentials for accessing BigQuery | ||
bq-service-account.json | ||
|
||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# jupyter | ||
.ipynb_checkpoints | ||
.ipython/ | ||
.jupyter/ | ||
.local/ | ||
|
||
# sublime test/pycharm | ||
.idea/ | ||
.DS_Store | ||
|
||
# Emacs | ||
*~ | ||
|
||
# Linux trash directories | ||
.Trash-*/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
# The Bennett Institute's default notebook environment | ||
|
||
|
||
## Running Jupyter Lab | ||
|
||
You will need to have installed Git and Docker; please see the | ||
[`INSTALLATION_GUIDE.md`](INSTALLATION_GUIDE.md) for further details. | ||
|
||
Windows and Linux users should double-click the `jupyter-lab` file. | ||
Users on macOS should double-click `jupyter-lab-mac-os` instead. | ||
|
||
This will build a Docker image with all software requirements installed, | ||
start a new Jupyter Lab server, and then provide a link to access this | ||
server. | ||
|
||
The first time you run this command it may take some time to download | ||
and install the necessary software. Subsequent runs should be much | ||
faster. | ||
|
||
|
||
## Adding or updating Python packages | ||
|
||
To install a new package: | ||
|
||
* Add it to the bottom of the `requirements.in` file. | ||
* From the Jupyter Lab Launcher page, choose "Terminal" (in the | ||
"Other" section). | ||
* Run: | ||
```sh | ||
pip-compile -v | ||
``` | ||
This will automatically update your `requirements.txt` file to | ||
include the new package. (The `-v` just means "verbose" so you can | ||
see progress, as this command can take a while to run.) | ||
* Shut down the Jupyter server and re-run the `jupyter-lab` launcher | ||
script. | ||
* Docker should automatically install the new package before starting | ||
the server. | ||
|
||
To update an existing package the process is the same as above except | ||
that instead of running `pip-compile -v` you should run: | ||
```sh | ||
pip-compile -v --upgrade-package <package_name> | ||
``` | ||
|
||
To update _all_ packages you can run: | ||
```sh | ||
pip-compile -v --upgrade | ||
``` | ||
|
||
|
||
## Importing from `lib` | ||
|
||
We used to have configuration which made Python files in the top-level | ||
`lib` directory importable. However this did not work reliably and users | ||
developed a variety of different workarounds. We now no longer make any | ||
changes to Python's default import behaviour. Depending on what | ||
workarounds you already have in place this may make no difference to | ||
you, or it may break your imports. | ||
|
||
If you find your imports no longer work and you have imports of the | ||
form: | ||
```python | ||
from lib import my_custom_library | ||
``` | ||
Then you should move the `lib` directory to be inside `notebooks` and it | ||
should work. | ||
|
||
If your imports no longer work and they are of the form: | ||
```python | ||
import my_custom_library | ||
``` | ||
Then you can move `lib/my_custom_library.py` to | ||
`notebooks/my_custom_library.py`. | ||
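
The first fix works because Python searches the directories on `sys.path` for imports, and a notebook kernel normally starts with the notebook's own directory as its working directory. The following is a minimal sketch of that mechanism, not part of the template: the temporary directory stands in for `notebooks`, the `greet` function is hypothetical, and the empty `__init__.py` is optional on Python 3 (it is included only to make `lib` a regular package).

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the project's `notebooks` directory.
notebooks_dir = Path(tempfile.mkdtemp()) / "notebooks"
lib_dir = notebooks_dir / "lib"
lib_dir.mkdir(parents=True)

# Optional on Python 3; makes `lib` a regular package.
(lib_dir / "__init__.py").write_text("")

# Hypothetical module at notebooks/lib/my_custom_library.py.
(lib_dir / "my_custom_library.py").write_text(
    "def greet():\n    return 'hello from lib'\n"
)

# Simulate a kernel whose working directory is `notebooks`:
# that directory is then searched first when resolving imports.
sys.path.insert(0, str(notebooks_dir))
importlib.invalidate_caches()

from lib import my_custom_library

result = my_custom_library.greet()
print(result)
```

With `lib` left at the top level of the repository, the same `from lib import ...` line fails unless something else adds the repository root to `sys.path`.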
|
||
|
||
## Diffing notebook files | ||
|
||
By default, changes to `.ipynb` files do not produce easily readable | ||
diffs in GitHub. One solution is to enable the "[Rich Jupyter Notebook | ||
Diffs][richdiff]" preview feature. You can find this by clicking your | ||
account icon in the top right of the GitHub interface, choosing "Feature | ||
preview", then "Rich Jupyter Notebook Diffs" and then "Enable". | ||
|
||
[richdiff]: https://github.blog/changelog/2023-03-01-feature-preview-rich-jupyter-notebook-diffs/ | ||
|
||
Another option is to use [Jupytext][jupytext], which we have pre-added to the | ||
list of installed packages. You can use either the `percent` or | ||
`markdown` formats to create notebooks which have naturally readable | ||
diffs, at the cost of not being able to save the outputs of cells within | ||
the notebook. | ||
|
||
[jupytext]: https://jupytext.readthedocs.io/en/latest/ | ||
|
||
To use the "paired" format in which a traditional `.ipynb` file is saved | ||
alongside a pure-Python variant inside a `diffable_python` directory, | ||
add a file called `jupytext.toml` to the root of your repo containing | ||
these lines: | ||
```toml | ||
[formats] | ||
"notebooks/" = "ipynb" | ||
"notebooks/diffable_python/" = "py:percent" | ||
``` | ||
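
For orientation, a `py:percent` file is ordinary Python in which `# %%` comment lines mark cell boundaries and `# %% [markdown]` cells hold Markdown as comments. The sketch below shows roughly what a paired file in `diffable_python` might contain; the cell contents are purely illustrative and not taken from the template.

```python
# %% [markdown]
# # Example analysis
# Markdown cells are stored as comments, so the file stays valid Python.

# %%
# A code cell: everything up to the next `# %%` marker belongs to it.
values = [1, 2, 3]

# %%
# Cell outputs are not stored here, only the code that produces them.
total = sum(values)
print(total)
```

Because the file is plain text with one cell after another, a change to a single cell shows up as a small, line-based diff.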
|
||
To prevent `.ipynb` files from showing in GitHub diffs, add these lines | ||
to the bottom of the `.gitattributes` file: | ||
``` | ||
# Don't show notebook files when diffing in GitHub | ||
notebooks/**/*ipynb linguist-generated=true | ||
``` | ||
|
||
|
||
## How to invite people to cite | ||
|
||
Once a project is completed, please use the instructions [here](https://guides.github.com/activities/citable-code/) to deposit a copy of your code with Zenodo. You will need a free Zenodo account to do this. This creates a DOI. Once you have it, please add it to the README. | ||
|
||
If there is a paper associated with this code, please change the 'how to cite' section to the citation and DOI for the paper. This allows us to build up citation credit. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# syntax=docker/dockerfile:1.2 | ||
FROM python:3.12-bookworm | ||
|
||
# Install apt packages, using the host cache | ||
COPY packages.txt /tmp/packages.txt | ||
RUN --mount=target=/var/lib/apt/lists,type=cache,sharing=locked \ | ||
--mount=target=/var/cache/apt,type=cache,sharing=locked \ | ||
rm -f /etc/apt/apt.conf.d/docker-clean \ | ||
&& apt-get update \ | ||
&& sed 's/#.*//' /tmp/packages.txt \ | ||
| xargs apt-get -y --no-install-recommends install | ||
|
||
# Install Python packages, using the host cache | ||
COPY requirements.txt /tmp/requirements.txt | ||
RUN --mount=type=cache,target=/root/.cache \ | ||
python -m pip install --no-deps --requirement /tmp/requirements.txt | ||
|
||
# Without this, the Jupyter terminal defaults to /bin/sh which is much less | ||
# usable | ||
ENV SHELL=/bin/bash | ||
# Jupyter writes various runtime files to $HOME so we need that to be writable | ||
# regardless of which user we run as | ||
ENV HOME=/tmp | ||
# Allow Jupyter to be configured from within the workspace | ||
ENV JUPYTER_CONFIG_DIR=/workspace/jupyter-config | ||
# This variable is only needed for the `ebmdatalab` package: | ||
# https://pypi.org/project/ebmdatalab/ | ||
ENV EBMDATALAB_BQ_CREDENTIALS_PATH=/workspace/bq-service-account.json | ||
|
||
# Run any necessary post-installation tasks | ||
COPY postinstall.sh /tmp/postinstall.sh | ||
RUN /tmp/postinstall.sh | ||
|
||
RUN mkdir /workspace | ||
WORKDIR /workspace |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
## Docker environment | ||
|
||
### Why Docker? | ||
|
||
Software Engineers and Developers need to collaborate on software together. In our team, we use Jupyter | ||
Notebooks to carry out research. Our work requires use of existing software packages. A common problem | ||
is that different team members have different versions of these packages on their machine and work on | ||
different operating systems. This means there are sometimes problems with running shared code. This is | ||
particularly a problem when using a Windows machine. | ||
|
||
Docker allows you to run identical software on all platforms. It creates containers that are guaranteed | ||
to be identical on any system that can run Docker. The exact specification of the environment is | ||
recorded in the `Dockerfile`, and distributing this file guarantees that all team members | ||
have the same setup. Because each container is its own entity, team members can have multiple projects | ||
on their machine at the same time without creating clashes between different versions of a package. | ||
|
||
### Installation | ||
|
||
#### | ||
|
||
Windows and Macs have different installation processes. Regardless of machine, you will have to install | ||
Docker and make an account on the [Docker Website](https://docs.docker.com/). | ||
|
||
Please follow the installation instructions on the [Docker website](https://docs.docker.com/install/) to complete this step. | ||
Docker Desktop is preferred over Docker Toolbox because it offers native support via Hyper-V containers. It requires | ||
Windows 10 64-bit Pro, Enterprise, or Education (Build 15063 or later), and the Hyper-V and Containers | ||
Windows features must be enabled (all of which are the case on our standard university laptop | ||
installs; if Hyper-V has not been enabled, [follow the instructions here](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v)). | ||
|
||
Docker Toolbox runs Docker inside a Linux VirtualBox virtual machine via Docker Machine, and therefore offers a functional but sub-optimal experience. | ||
|
||
|
||
|
||
#### Windows | ||
|
||
First install Docker Desktop onto your machine. Windows users who log into an Active Directory domain | ||
(i.e. a network login) may find they lack permissions to start Docker correctly. If | ||
so, follow [these instructions](https://github.com/docker/for-win/issues/785#issuecomment-344805180). | ||
|
||
It is best to install using the default settings. You may be asked to enable Hyper-V and Containers, | ||
which you should do. At least one user found the box already ticked but had to untick and re-tick it | ||
to get this to enable correctly (detailed in issue [#4](https://github.com/ebmdatalab/custom-docker/issues/4)). | ||
|
||
When starting Docker, it takes a while to actually start up - up to 5 minutes. While it's doing so, an animation runs in the notification area: | ||
|
||
![image](https://user-images.githubusercontent.com/211271/72052991-14a8c000-32be-11ea-948f-575a3c84bc3b.png) | ||
|
||
Another notification appears when it's finished. | ||
|
||
"Running" means there's a Docker service running on your computer, to which you can connect using the command line. You can check it's up and running by opening a Command Prompt and entering `docker info`, which should output a load of diagnostics. | ||
|
||
To be able to access the Windows filesystem from the Docker container (and therefore do development inside Jupyter with results appearing in a place visible to Git), you must explicitly share your hard drive in the Docker settings (click the system tray Docker icon; select "settings"; select "shared drives"). | ||
|
||
##### Network login issues | ||
|
||
When installing from the office, and logged in as a network user, there have been permission problems | ||
that have been solved by adding the special "Authenticated Users" group to the `docker-users` group, per [this comment](https://github.com/docker/for-win/issues/785#issuecomment-327237998) (screenshot of place to do it [here](https://github.com/docker/for-win/issues/785#issuecomment-344805180)). | ||
|
||
Finally, note that when authentication changes (e.g. different logins), you sometimes have to reauthorise Docker's "Shared Drives" (click the system tray Docker icon; select "settings"; select "shared drives"; click Reset credentials; retick the drive to share; Apply). | ||
|
||
#### Macs | ||
|
||
Follow the instructions from the Docker website. You may have to restart your computer during installation. | ||
|
||
Once you have Docker installed, you will need to log in. Docker can be opened from the Applications folder, | ||
and once you have logged in, the Docker icon should appear in the top menu bar (i.e. next to the battery icon, etc.). | ||
|
||
![image](https://user-images.githubusercontent.com/25401512/75257439-dff4b780-57dc-11ea-9ae8-592e1570bc71.png) | ||
|
||
Once this is running, you should be able to use Docker. | ||
|
||
#### Gotchas | ||
|
||
- The first time you use Docker or use a new Docker template, please be aware that it takes a long time to make the build. | ||
It is easy to think that it has frozen, but it will take quite a while to get going. | ||
|
||
If this is the case, look at this cat whilst you wait: | ||
|
||
![Alt Text](https://media.giphy.com/media/vFKqnCdLPNOKc/giphy.gif) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# The Bennett Institute's skeleton notebook environment | ||
|
||
|
||
## Getting started with this skeleton project | ||
|
||
This is a skeleton project for creating a reproducible, cross-platform | ||
analysis notebook, using Docker. | ||
|
||
Developers and analysts using this skeleton for new development should | ||
refer to [`DEVELOPERS.md`](DEVELOPERS.md) for instructions on getting | ||
started. Update this `README.md` so it is a suitable introduction to | ||
your project. | ||
|
||
|
||
## Running Jupyter Lab | ||
|
||
You will need to have installed Git and Docker; please see the | ||
[`INSTALLATION_GUIDE.md`](INSTALLATION_GUIDE.md) for further details. | ||
|
||
Windows and Linux users should double-click the `jupyter-lab` file. | ||
Users on macOS should double-click `jupyter-lab-mac-os` instead. | ||
|
||
Note: if double-clicking the `jupyter-lab` file opens the file in VS Code, you | ||
should instead right-click on the file and open it with Git for Windows. | ||
|
||
This will build a Docker image with all software requirements installed, | ||
start a new Jupyter Lab server, and then provide a link to access this | ||
server. | ||
|
||
The first time you run this command it may take some time to download | ||
and install the necessary software. Subsequent runs should be much | ||
faster. | ||
|
||
Note: if running the command fails with: | ||
|
||
``` | ||
docker: Error response from daemon: user declined directory sharing C:\path\to\directory | ||
``` | ||
|
||
you should open the Docker dashboard, and then under Settings -> Resources -> | ||
File Sharing, add the appropriate path. | ||
|
||
|
||
## How to cite | ||
|
||
XXX Please change to either a paper (if published) or the repo. You may find it helpful to use Zenodo DOI (see [`DEVELOPERS.md`](DEVELOPERS.md#how-to-invite-people-to-cite) for further information) |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Ignore runtime-generated config | ||
/lab | ||
/labconfig |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
#!/bin/bash | ||
|
||
# We want launcher shell scripts which can be directly executed from the file | ||
# manager GUI without requiring a terminal. On Windows this requires an | ||
# extension of ".sh", on macOS this requires either no extension or the | ||
# extension ".command". There's no way to jointly satisfy these requirements so | ||
# we need two launchers with different extensions, one of which just | ||
# immediately executes the other. | ||
|
||
# Unset CDPATH to prevent `cd` potentially behaving unexpectedly | ||
unset CDPATH | ||
cd "$( dirname "${BASH_SOURCE[0]}")" | ||
|
||
exec ./jupyter-lab.sh |