Merge changes to teaser into main branch #1

Merged 6 commits on Oct 24, 2024
39 changes: 15 additions & 24 deletions _quarto.yml
@@ -34,40 +34,31 @@ website:
contents:
- team.md
- schedule.md
- setup.qmd
- section: Orientation
- section: Set-up
contents:
- text: RStudio
href: content/02-rstudio.qmd
- text: Local set-up - R
- set-up.qmd
- text: R and RStudio on your computer
href: content/02-local-setup-r.qmd
- text: R and RStudio on the JupyterHub
href: content/02-jupyterhub-setup-r.qmd
- text: Git-Authentication
href: content/02-git.md
- text: Git-JupyterLab
href: content/02-git-jupyter.md
- text: Git-RStudio
href: content/02-git-rstudio.md
- coc.md
- section: "Orientation"
contents:
- text: Reproducible Reports
href: content/01-repro-reports.qmd
- text: Intro to RStudio
href: content/02-rstudio.qmd
- text: JupyterHubs
href: content/01-intro-to-jupyterhub.qmd
- section: "Tutorials"
contents:
- section: "Welcome"
contents:
- text: Welcome
href: content/01-welcome.md
- text: Reproducible Reports
href: content/01-repro-reports.qmd
- text: Jupyter hubs
href: content/01-intro-to-jupyterhub.qmd
- section: "Tutorials"
contents:
- text: Week 1
href: tutorials/tutorial-1.qmd
- text: Week 2
href: tutorials/tutorial-2.qmd
- text: Week 3
href: tutorials/tutorial-3.qmd
- text: Week 4
href: tutorials/tutorial-4.qmd
- text: Coming!
- text: <img width=100px>
- text: <img src="/images/noaa_emblem_logo-2022.png" width=100px>
href: https://fisheries.noaa.gov
@@ -79,7 +70,7 @@ website:
href: https://nmfs-openscapes.2i2c.cloud/
text: "JupyterHub"
- icon: wechat
href: https://github.com/nmfs-opensci/Quarto-Workshop-2024/discussions
href: https://mail.google.com/mail/u/0/#chat/space/AAAAuuftzDk
text: "Discussions"
- icon: github
href: https://github.com/nmfs-opensci/Quarto-Workshop-2024
27 changes: 19 additions & 8 deletions content/01-repro-reports.qmd
@@ -1,16 +1,27 @@
---
title: Intro the Cloud
title: Reproducible Reports
---

![](../images/cloud-overview.png)
Today we are working with a JupyterHub in Azure, while the data we are accessing is on AWS us-west-2. This means we cannot really do the lower right option of 'cloud native' computing. If our JupyterHub were on AWS us-west-2, then we could direct connect to the S3 buckets and it would be as if we had downloaded the data. We can effectively "attach" a cloud drive with petabytes of data to our virtual machine.
Reproducible reports are documents that combine analysis code, outputs (like plots or tables), and narratives to ensure that results can be easily replicated and verified by others. They are commonly used in scientific research, data science, and analytics to maintain transparency and reliability. By incorporating both the code and results into one document (or a collection of documents), anyone can run the same analysis on the same data (or updated data) and obtain the same results.
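
As a minimal sketch of the idea described above, the commands below write a tiny Quarto document whose narrative pulls its numbers from code, then render it if the Quarto CLI is available. The file name `report.qmd` is hypothetical, the example uses R's built-in `cars` dataset, and `engine: knitr` assumes that R and the knitr package are installed; none of this comes from the workshop materials.

```bash
# Hypothetical file name used only for this illustration.
qmd_file="report.qmd"

# Write a small Quarto document. The quoted 'EOF' delimiter stops the shell
# from expanding the backticks and the $ inside the R expressions.
cat > "$qmd_file" <<'EOF'
---
title: "Minimal reproducible report"
format: html
engine: knitr
---

The built-in `cars` dataset has `r nrow(cars)` rows and a mean speed of
`r round(mean(cars$speed), 1)` mph.
EOF

# Render only when the Quarto CLI is on the PATH; report a failed render
# (for example, when R is missing) instead of stopping.
if command -v quarto >/dev/null 2>&1; then
  quarto render "$qmd_file" || echo "render failed; check that R and knitr are installed"
else
  echo "quarto not found on PATH; wrote $qmd_file only"
fi
```

Because the numbers in the sentence are computed at render time, re-rendering against updated data would update the text without any manual editing.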

Today we will be showing how to "stream cloud data without downloads". Because we are on Azure and the data are in S3 us-west-2 buckets, we have to use "https" access. Be aware that *https access without downloads is painfully slow on cloud-ignorant netCDF files*. Of course, downloading large datasets is also painfully slow. This is one of the reasons why providing cloud-optimized data formats is so important.
### Examples of NMFS reproducible reports

JupyterHubs (and virtual machines) can be easily spun up on any cloud provider and computing is cheap (storage is expensive). You do not need to install the compute environments. The common workflow is to run a provided docker image with `docker run`.
Join the NMFS R User Group on October 29th 12pm PT/3pm ET to hear 5 examples of how NMFS groups have used Quarto (and RMarkdown) to create more efficient and reproducible fisheries reports. [Add to calendar](https://calendar.google.com/calendar/event?action=TEMPLATE&tmeid=cGRyaThvbTgwdHBxOHZxZ2Y4bXJ1cGptdTFfMjAyNDEwMjJUMTkwMDAwWiBub2FhLmdvdl82MHJmbjdtbDlycGNobDYzdnM0YWY5bjAxOEBn&tmsrc=noaa.gov_60rfn7ml9rpchl63vs4af9n018%40group.calendar.google.com).

---
<hr>

Sam Schiano gave a great talk October 3rd on the NOAA Fisheries Stock Assessment Workflow project. She reviewed different approaches that the stock assessment teams across the science centers have taken to creating 'templates' for stock assessment reports.

<center>
{{< video https://www.youtube.com/embed/Q8XJTTkjcts width="400" height="300">}}
</center>

<hr>

The Alaska Fisheries Science Center MML program gave a talk about how they transformed MML SARs to a reproducible and more efficient workflow.

<center>
{{< video https://www.youtube.com/embed/wqbwcCXbFL8 width="400" height="300">}}
</center>

Lecture on NASA earth data in the cloud by Michele Thornton (NASA Openscapes) [Video](https://www.youtube.com/watch?v=E5Dpeap16hU)

{{< video https://www.youtube.com/watch?v=E5Dpeap16hU width="400" >}}
10 changes: 0 additions & 10 deletions content/01-welcome.md

This file was deleted.

20 changes: 0 additions & 20 deletions content/02-earthdata.md

This file was deleted.

30 changes: 30 additions & 0 deletions content/02-jupyterhub-setup-r.qmd
@@ -0,0 +1,30 @@
---
title: R/RStudio on JupyterHub
---

Everything you need is installed for you. Note, before you can log in, we will need to add you to the access group.

* <https://nmfs-openscapes.2i2c.cloud>
* Click Start Server and accept defaults
* Click the RStudio button. If you don't see it, click the blue button in the top left to open a new 'Launcher' window.

![](content/img/jhub-launcher.png)

## Access to the JupyterHub

### GitHub username

* Create a GitHub account (if you don’t already have one) at <https://github.com>. Advice for choosing a GitHub username: this is a professional username that you will use in work settings. GitHub accounts are not anonymous; this is for sharing work. Using your real name is common.
* Write down your username and password; you will need to log in during the course!
* Here is [a video](https://www.youtube.com/watch?v=nHXw4mGoqiE) showing the whole process

### Get on JupyterHub

Once you have submitted your GitHub username and have been accepted as a member of the nmfs-openscapes organization, you can log into the JupyterHub.

<https://nmfs-openscapes.2i2c.cloud>

* **Choose the default Py-R base geospatial image**. [Watch a video](https://youtu.be/o99jZWHqKi8) of the login process and basic JupyterHub orientation.

* Your home directory is yours and no one else can see it.
* To share files, you can connect to a GitHub repository or use the `shared` directory. Everyone can read and write to this directory. Please don't delete content that is not your own.
68 changes: 8 additions & 60 deletions content/02-local-setup-r.qmd
@@ -2,70 +2,18 @@
title: Setting up on your computer - R users
---

# Running locally
You can also use R and RStudio that you have installed locally.

If you are set up locally to run R and have tidyverse installed, then for most of the tutorials, you only need to install `earthdatalogin` and `terra` with
## Install R 4+

```
devtools::install_github("boettiger-lab/earthdatalogin")
install.packages("terra")
```
Any version of R 4+ will be fine.

RStudio should alert you if there are other packages that you need to install.
https://cran.r-project.org/

# Running from Posit Cloud
## Install RStudio Desktop

The standard [Posit Cloud](https://posit.cloud/) has most of the packages you will need. For most of the tutorials, you only need to install `earthdatalogin` and `terra` with
Please update to the latest version of RStudio Desktop: https://posit.co/download/rstudio-desktop/. Quarto comes bundled with RStudio so you do not need to install it separately.

```
devtools::install_github("boettiger-lab/earthdatalogin")
install.packages("terra")
```
## Install the needed packages

# Running from a Docker file

If you don't have RStudio or R installed, the easiest way to run the R tutorials locally is to use the Docker container `py-rocket-geospatial`; this is the environment that is being used in the JupyterHub. You'll need containerization software such as [Docker Desktop](https://www.docker.com/products/docker-desktop/) or [Podman](https://podman.io/) installed.

## Start the docker image

Make sure Docker or Podman is running (open the Desktop application), then run the container:

Windows, Linux or Intel-chip Mac:
```bash
docker run -p 8888:8888 --cap-add SYS_PTRACE --security-opt seccomp=unconfined ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial:latest
```

Apple-chip Mac (probably will not work):
```bash
docker run -p 8888:8888 --platform linux/amd64 --cap-add SYS_PTRACE --security-opt seccomp=unconfined ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial:latest
```

::: {.callout-tip}
If you're using Podman, simply replace `docker` with `podman` in this command and any that follow.
:::

After `docker run`, there will be a long printout in the terminal window that includes instructions for how to access the server (`To access the server...`). Copy and paste either of the URLs into a web browser. You should be greeted with a JupyterLab dashboard as in the live demo.

## Clone the tutorials

Clone and then cd into the repo. Open a terminal and run
```bash
cd ~
git clone https://github.com/nmfs-opensci/EDMW-3B-tutorials
```

The Python tutorials are in the `tutorials/python` directory.


## Connect your local files to image

Clone and then cd into the repo. Open a terminal and run
```bash
git clone https://github.com/nmfs-opensci/EDMW-3B-tutorials
cd EDMW-3B-tutorials
```

Windows, Linux or Intel-chip Mac:
```bash
docker run -p 8888:8888 --cap-add SYS_PTRACE --security-opt seccomp=unconfined -v /$(pwd):/home/jovyan/ ghcr.io/nmfs-opensci/container-images/py-rocket-geospatial:latest
```
We will update this section closer to the workshops.
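
The requirements above (R 4+, and Quarto bundled with RStudio) can be checked from a terminal. The following is a rough sketch, not part of the workshop materials: `r_version_ok` is a hypothetical helper name, and the check assumes that `Rscript` and `quarto` are on your PATH, which is not guaranteed when Quarto was installed only as part of RStudio.

```bash
# Hypothetical helper: succeeds when the major part of a version string
# such as "4.3.1" is 4 or higher.
r_version_ok() {
  local major="${1%%.*}"
  [ "$major" -ge 4 ]
}

# Check the installed R version, if Rscript can be found.
if command -v Rscript >/dev/null 2>&1; then
  r_version="$(Rscript -e 'cat(as.character(getRversion()))')"
  if r_version_ok "$r_version"; then
    echo "R $r_version meets the R 4+ requirement"
  else
    echo "R $r_version is older than R 4; please update"
  fi
else
  echo "Rscript not found on PATH"
fi

# Check for the Quarto command-line tool.
if command -v quarto >/dev/null 2>&1; then
  echo "Quarto $(quarto --version) is available"
else
  echo "quarto not found on PATH (the copy bundled with RStudio may still work inside RStudio)"
fi
```

If `quarto` is not found here, rendering from within RStudio may still work, since RStudio can use its bundled copy.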
2 changes: 1 addition & 1 deletion content/02-rstudio.qmd
@@ -4,7 +4,7 @@ title: RStudio - R

::: {.callout-note icon=false}

In this tutorial, we will provide a brief introduction to:
In this tutorial, we will provide a brief introduction to RStudio using the JupyterHub.

1. Open RStudio in the JupyterHub
2. Basic navigation around **RStudio**: the 4 main panels and menus
Binary file modified content/sst.nc
Binary file not shown.