update links for sda-*
blankdots committed Nov 13, 2023
1 parent 1f8fb93 commit 17f54d5
Showing 8 changed files with 42 additions and 50 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

Recommended provisioning methods provided for production are:

* on a [Kubernetes cluster](https://github.com/neicnordic/sda-helm/), using `kubernetes` and `helm` charts;
* on a [Kubernetes cluster](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts), using `kubernetes` and `helm` charts;
* on a [Docker Swarm cluster](https://github.com/neicnordic/LocalEGA-deploy-swarm), using `gradle` and `docker swarm`.

## Architecture
@@ -11,30 +11,30 @@ SDA is divided into several components, which can be deployed either for Federat

### Core Components

Source code for core components (unless specified otherwise) is available at: https://github.com/neicnordic/sda-pipeline
Source code for core components is available at: https://github.com/neicnordic/sensitive-data-archive

| Component | Role |
|---------------|------|
| inbox | SFTP, S3 or HTTPS server, acting as a dropbox, where user credentials are fetched from CentralEGA or via ELIXIR AAI. https://github.com/neicnordic/sda-s3proxy/ or https://github.com/neicnordic/sda-inbox-sftp |
| inbox | SFTP, S3 or HTTPS server, acting as a dropbox, where user credentials are fetched from CentralEGA or via LifeScience AAI. [s3inbox](https://github.com/neicnordic/sensitive-data-archive/tree/main/sda/cmd/s3inbox/s3inbox.md) or [sftp-inbox](https://github.com/neicnordic/sensitive-data-archive/tree/main/sda-sftp-inbox) |
| intercept | The intercept service relays messages between the queue provided by the federated service and local queues. **(Required for Federated EGA use case)** |
| ingest | Split the Crypt4GH header and move the remainder to the storage backend. No cryptographic task, nor access to the decryption keys. |
| verify | Decrypt the stored files and checksum them against their embedded checksum. |
| archive | Storage backend: as a regular file system or as a S3 object store. |
| finalize | Handle the so-called _Accession ID_ to filename mappings from CentralEGA. |
| mapper | The mapper service registers the mapping of accessionIDs (IDs for files) to datasetIDs. |
| data out API | Provides a download/data access API for streaming archived data either in encrypted or decrypted format - source at: https://github.com/neicnordic/sda-doa |
| download | Provides a download/data access API for streaming (decrypted) archived data - source at: https://github.com/neicnordic/sda-download |

### Associated components

| Component | Role |
|---------------|------|
| db | A Postgres database with appropriate schemas and isolations https://github.com/neicnordic/sda-db/ |
| mq | A (local) RabbitMQ message broker with appropriate accounts, exchanges, queues and bindings, connected to the CentralEGA counter-part. https://github.com/neicnordic/sda-mq/ |
| db | A [Postgres database](https://github.com/neicnordic/sensitive-data-archive/tree/main/postgresql) with appropriate schemas and isolations |
| mq | A [(local) RabbitMQ](https://github.com/neicnordic/sensitive-data-archive/tree/main/rabbitmq) message broker with appropriate accounts, exchanges, queues and bindings, connected to the CentralEGA counter-part. |


### Stand-alone components

| Component | Role |
|---------------|------|
| metadata | Component used in the standalone version of SDA. Provides an interface and backend to submit metadata and associate it with a file in the Archive. https://github.com/neicnordic/sda-metadata-mirror/ with UI https://github.com/neicnordic/FormSubmission_UI |
| orchestrate | Component that automates ingestion in stand-alone deployments of SDA Pipeline https://github.com/neicnordic/sda-orchestration |
19 changes: 8 additions & 11 deletions docs/connection.md
@@ -10,16 +10,15 @@ The RabbitMQ message brokers of each SDA instance are the **only**
components with the necessary credentials to connect to Central EGA
message broker.

We call `CEGAMQ` and `LocalMQ` (Local Message Broker, also known as
`sda-mq`), the RabbitMQ message brokers of, respectively, `Central EGA`
and `SDA`/`LocalEGA`.
We call `CEGAMQ` and `LocalMQ` (Local Message Broker, sometimes known as `sda-mq`),
the RabbitMQ message brokers of, respectively, `Central EGA` and `SDA`/`LocalEGA`.

Local Message Broker
--------------------

> NOTE:
> Source code repository for MQ component is available at:
> [https://github.com/neicnordic/sda-mq](https://github.com/neicnordic/sda-mq)
> [sensitive-data-archive RabbitMQ](https://github.com/neicnordic/sensitive-data-archive/tree/main/rabbitmq)

### Configuration
@@ -85,7 +84,6 @@ following queues, in the default `vhost`:
Name | Purpose
:----------------|:---------------------------------------
archived | Archived files.
backup | Signal files to backup
completed | Files are backed up
error | User-related errors
files | Receive notification for ingestion from `CEGAMQ` or Orchestrator
@@ -147,7 +145,7 @@ Central EGA and any Local EGAs. Central EGA's messages are
JSON-formatted.

The JSON schemas can be found in:
<https://github.com/neicnordic/sda-pipeline/tree/master/schemas>
<https://github.com/neicnordic/sensitive-data-archive/tree/main/sda/schemas>

When a `Submission Inbox` sends an `upload` message to CentralEGA it contains the
following:
@@ -210,7 +208,7 @@ of messages:
> sha256 checksum will be calculated by `Ingest` service.
The message received from Central EGA to start ingestion at a Federated EGA node.
Processed by the sda-pipeline `ingest` service.
Processed by the `ingest` service.

```javascript
{
@@ -258,9 +256,8 @@ adding the [Accession ID]{.title-ref}.
```

`Finalize` service should receive the message below and assign the
`Accession ID` to the corresponding file and send a message to `backup`
queue for the backup services or in case there is no backup service to
the `completed` queue.
`Accession ID` to the corresponding file, then send a message to the `completed` queue
once the `Accession ID` has been set (in the case of Federated EGA, this also means the backup copy has been made).

```javascript
{
@@ -275,7 +272,7 @@ the `completed` queue.
}
```

The message sent from the sda-pipeline `finalize` service to the `backup` service via `completed` queue.
The message sent from the `finalize` service to the `completed` queue.

```javascript
{
```
2 changes: 1 addition & 1 deletion docs/dataout.md
@@ -154,7 +154,7 @@ SDA-download
Recommended provisioning method for production is:

- on a `kubernetes cluster` using the [helm
chart](https://github.com/neicnordic/sda-helm/).
chart](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts).

`sda-download` focuses on enabling deployment of a stand-alone version
of SDA, with features such as:
10 changes: 5 additions & 5 deletions docs/db.md
@@ -1,21 +1,21 @@
Database Setup
==============

We use a Postgres database (version 13+ ) to store intermediate data, in
We use a Postgres database (version 15+) to store intermediate data, in
order to track progress in file ingestion. The `lega` database schema is
documented below.

> NOTE:
> Source code repository for DB component is available at:
> <https://github.com/neicnordic/sda-db>
> <https://github.com/neicnordic/sensitive-data-archive/tree/main/postgresql>
The database container will initialize and create the necessary database
structure and functions if started with an empty area. Procedures for
*backing up the database* are important but considered out of scope for
the secure data archive project.

Look at [the SQL
definitions](https://github.com/neicnordic/sda-db/tree/master/initdb.d)
definitions](https://github.com/neicnordic/sensitive-data-archive/tree/main/postgresql/initdb.d)
if you are also interested in the database triggers.

Configuration
@@ -82,8 +82,8 @@ both the database initialization scripts (and bumping the bootstrapped
schema version) as well as creating the corresponding migration script
to perform the changes on a database in use.

Migration scripts should be placed in `/migratedb.d/` in the sda-db repo
(<https://github.com/neicnordic/sda-db>). We recommend naming them
Migration scripts should be placed in `/migratedb.d/` in the *sensitive-data-archive* repo
(<https://github.com/neicnordic/sensitive-data-archive/tree/main/postgresql>). We recommend naming them
corresponding to the schema version they provide migration to. There is
an "empty" migration script (`01.sql`) that can be used as a
template.
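
The naming recommendation can be sketched as a small shell helper. It assumes that migration scripts are named after the two-digit, zero-padded schema version they migrate to, as the `01.sql` template suggests. The helper name `migration_name` and the padding width are assumptions made for illustration, not something the repository defines.

```bash
# Sketch only: derive a migration script name from the schema version it
# migrates to. migration_name is a hypothetical helper; the two-digit
# padding is inferred from the existing 01.sql template.
migration_name() {
    printf '%02d.sql\n' "$1"
}

migration_name 7   # prints "07.sql"
```

Pass the version as a plain decimal integer (for example `7`, not `07`), since `printf` may interpret leading zeros as octal.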
16 changes: 2 additions & 14 deletions docs/deploy.md
@@ -9,22 +9,10 @@ Swarm](https://docs.docker.com/engine/swarm/) for production.

The production deployment repositories are:

- [Kubernetes Helm charts](https://github.com/neicnordic/sda-helm/);
- [Kubernetes Helm charts](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts);
- [Docker Swarm
deployment](https://github.com/neicnordic/LocalEGA-deploy-swarm/).

The following container images are used in the deployments:

- `neicnordic/sda-pipeline`, provides the LocalEGA services (minimal
container with static binary and support files).
- `neicnordic/sda-mq`, provides the broker (mq) service (based on
*rabbitmq:3.8.16-management-alpine*;
- `neicnordic/sda-db`, provides the database service (based on
*postgres:13-alpine3.14*);
- `neicnordic/sda-inbox-sftp`, provides the inbox service via sftp
(based on Apache Mina, container base
*openjdk:13-alpine*);
- `neicnordic/sda-doa`, provides the data out service (Data Out API);
- `neicnordic/sda-s3-proxy`, provides the inbox service via a s3 proxy
(S3 proxy inbox, minimal container with static binary and support
files).
- `neicnordic/sensitive-data-archive`, provides the SDA services as well as PostgreSQL and RabbitMQ configuration.
21 changes: 15 additions & 6 deletions docs/setup.md
@@ -2,19 +2,28 @@ Installation
============

The sources for SDA can be downloaded and installed from the [NeIC
Github repo](https://github.com/neicnordic/sda-pipeline).
Github repo](https://github.com/neicnordic/sensitive-data-archive).

In order to build binaries:
```bash
$ git clone https://github.com/neicnordic/sda-pipeline.git
$ go build
$ git clone https://github.com/neicnordic/sensitive-data-archive.git
$ cd sensitive-data-archive/sda
$ for p in cmd/*; do go build -buildvcs=false -o "${p/cmd\//sda-}" "./$p"; done
```
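
As a rough illustration of the build loop: the bash substitution `${p/cmd\//sda-}` turns each `cmd/<name>` directory into a binary named `sda-<name>`. The sketch below previews that mapping without building anything. `binary_name` is a hypothetical helper introduced here, and it uses the POSIX prefix-strip form `${1#cmd/}`, which should give the same result for any path that starts with `cmd/`.

```bash
# Sketch only: preview the binary names produced by the build loop.
# binary_name is a hypothetical helper, not part of the repository.
binary_name() {
    # Strip the leading "cmd/" and add the "sda-" prefix.
    printf 'sda-%s\n' "${1#cmd/}"
}

binary_name cmd/s3inbox   # prints "sda-s3inbox"
```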

To set up a Go workspace for developing the source code:
```bash
$ git clone https://github.com/neicnordic/sensitive-data-archive.git
$ cd sensitive-data-archive
$ go work init
$ go work use ./sda
$ cd sda
```

The recommended method is however to use one of our deployment
strategies:

- [Kubernetes Helm charts](https://github.com/neicnordic/sda-helm/);
- [Docker
Swarm](https://github.com/neicnordic/LocalEGA-deploy-swarm/).
- [Kubernetes Helm charts](https://github.com/neicnordic/sensitive-data-archive/tree/main/charts);
- [Docker Swarm](https://github.com/neicnordic/LocalEGA-deploy-swarm/).

Configuration
-------------
8 changes: 4 additions & 4 deletions docs/submission.md
@@ -12,7 +12,7 @@ Structure of the message and its contents are described in

> NOTE:
> Source code repository for Submission components is available at:
> <https://github.com/neicnordic/sda-pipeline>
> <https://github.com/neicnordic/sensitive-data-archive>
### Ingestion Workflow

@@ -133,7 +133,7 @@ Mina SSHD.

> NOTE:
> Sources are located at the separate repository:
> <https://github.com/neicnordic/sda-inbox-sftp> Essentially, it's a
> <https://github.com/neicnordic/sensitive-data-archive/tree/main/sda-sftp-inbox> Essentially, it's a
> Spring-based Maven project, integrated with the
> [Local Message Broker](connection.md#local-message-broker).
@@ -142,12 +142,12 @@ Mina SSHD.

> NOTE:
> Sources are located at the separate repository:
> <https://github.com/neicnordic/sda-s3proxy>
> <https://github.com/neicnordic/sensitive-data-archive/blob/main/sda/cmd/s3inbox/>
The S3 Proxy uses access tokens as the main authentication mechanism.

The sda authentication service
(<https://github.com/neicnordic/sda-auth>) is designed to convert CEGA
(<https://github.com/neicnordic/sensitive-data-archive/tree/main/sda-auth>) is designed to convert CEGA
REST endpoint authentication to a JWT that can be used when uploading to
the S3 proxy.

4 changes: 1 addition & 3 deletions docs/tests.md
@@ -13,6 +13,4 @@ scenarios, users will utilize the system as a whole.
> Unit tests and integration tests are automatically executed with every
> push and PR to the `NeIC Github repo` via Github Actions.
In order to replicate integration tests on a local machine see:
[sda-pipeline Local testing
howto](https://github.com/neicnordic/sda-pipeline/tree/master/dev_utils#readme)
**TO ADD: Local Setup Testing**
