Commit

Merge pull request #1958 from reebhub/RDoc-3119_AWS-SQS-ETL
Added Snowflake and Amazon SQS ETL pages (Server, Studio, Client-API)
ppekrol authored Dec 20, 2024
2 parents 1da3523 + 0f528e3 commit 3b98a9d
Showing 70 changed files with 2,945 additions and 71 deletions.
Original file line number Diff line number Diff line change
@@ -90,7 +90,7 @@ There are three authentication methods available:

{NOTE: }

<a id="example-basic" /> __Example__:
<a id="example-basic" /> **Example**:

---

@@ -105,7 +105,7 @@ There are three authentication methods available:
{NOTE/}
{NOTE: }

<a id="delete-processed-documents" /> __Delete processed documents__:
<a id="delete-processed-documents" /> **Delete processed documents**:

---

Original file line number Diff line number Diff line change
@@ -61,7 +61,7 @@ Before setting up the ETL task, define a connection string that the task will us

{NOTE: }

<a id="example-basic" /> __Example - basic__:
<a id="example-basic" /> **Example - basic**:

---

@@ -76,7 +76,7 @@ Before setting up the ETL task, define a connection string that the task will us
{NOTE/}
{NOTE: }

<a id="delete-processed-documents" /> __Example - delete processed documents__:
<a id="delete-processed-documents" /> **Example - delete processed documents**:

---

Original file line number Diff line number Diff line change
@@ -34,30 +34,29 @@
RavenDB produces messages to broker queues via the following Queue ETL tasks:

* **Kafka ETL Task**
You can define a Kafka ETL Task from the [Studio](../../../../studio/database/tasks/ongoing-tasks/kafka-etl-task)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/kafka).
You can define a Kafka ETL Task from [Studio](../../../../studio/database/tasks/ongoing-tasks/kafka-etl-task)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/kafka).
* **RabbitMQ ETL Task**
You can define a RabbitMQ ETL Task from the [Studio](../../../../studio/database/tasks/ongoing-tasks/rabbitmq-etl-task)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/rabbit-mq).
You can define a RabbitMQ ETL Task from [Studio](../../../../studio/database/tasks/ongoing-tasks/rabbitmq-etl-task)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/rabbit-mq).
* **Azure Queue Storage ETL Task**
You can define an Azure Queue Storage ETL Task from the [Studio](../../../../studio/database/tasks/ongoing-tasks/azure-queue-storage-etl)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/azure-queue).
You can define an Azure Queue Storage ETL Task from [Studio](../../../../studio/database/tasks/ongoing-tasks/azure-queue-storage-etl)
or using the [Client API](../../../../server/ongoing-tasks/etl/queue-etl/azure-queue).

---

These ETL tasks:
The above ETL tasks:

* **Extract** selected data from RavenDB documents from specified collections.
* **Transform** the data to new JSON objects.
* Wrap the JSON objects as [CloudEvents messages](https://cloudevents.io) and **Load** them to the designated message broker.
* Wrap the JSON objects as [CloudEvents messages](https://cloudevents.io)
and **Load** them to the designated message broker.
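
The wrapping step can be pictured with a small sketch. This is not RavenDB's implementation: the `wrap_as_cloudevent` helper, the sample document, and the `source` and `type` values are all hypothetical. Only the envelope attribute names (`specversion`, `id`, `source`, `type`, `time`, `datacontenttype`, `data`) come from the CloudEvents 1.0 specification.

```python
import json
import uuid
from datetime import datetime, timezone


def wrap_as_cloudevent(payload, source, event_type):
    """Wrap a transformed JSON object in a CloudEvents 1.0 envelope (hypothetical helper)."""
    return {
        # Context attributes that the CloudEvents 1.0 spec marks as required.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional context attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # The object produced by the transformation step.
        "data": payload,
    }


# Hypothetical source document (Extract).
document = {
    "Id": "orders/1-A",
    "Company": "companies/1-A",
    "Lines": [{"Quantity": 2, "PricePerUnit": 10.0}],
}

# Hypothetical transformation to a new JSON object (Transform).
transformed = {
    "Id": document["Id"],
    "TotalCost": sum(line["Quantity"] * line["PricePerUnit"] for line in document["Lines"]),
}

# Wrap the object; a real task would then send it to the broker (Load).
message = wrap_as_cloudevent(
    transformed,
    source="/example/ravendb/orders-etl",  # assumed value, for illustration only
    event_type="example.order",            # assumed value, for illustration only
)
serialized = json.dumps(message)
```

A consumer reading `serialized` from the queue would find the transformed object under `data` and the message identity under `id` and `source`.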

{PANEL/}

{PANEL: Data delivery}

{NOTE: }

#### What is transferred
#### What is transferred:

* **Documents only**
A Queue ETL task transfers documents only.
@@ -66,10 +65,7 @@ These ETL tasks:
JSON objects produced by the task's transformation script are wrapped
and delivered as [CloudEvents Messages](../../../../server/ongoing-tasks/etl/queue-etl/overview#cloudevents).

{NOTE/}
{NOTE: }

#### How are messages produced and consumed
#### How are messages produced and consumed:

* The Queue ETL task will send the messages it produces to the target using a **connection string**,
which specifies the destination and credentials required to authorize the connection.
@@ -79,10 +75,7 @@ These ETL tasks:
* RavenDB publishes messages to the designated brokers using [transactions and batches](../../../../server/ongoing-tasks/etl/basics#batch-processing),
creating a batch of messages and opening a transaction to the destination queue for the batch.

{NOTE/}
{NOTE: }

#### Idempotence and message duplication
#### Idempotence and message duplication:

* RavenDB is an **idempotent producer**, which typically does not send duplicate messages to queues.
* However, it is possible that duplicate messages will be sent to the broker.
@@ -93,7 +86,6 @@ These ETL tasks:
* Therefore, if processing each message only once is important to the consumer,
it is **the consumer's responsibility** to verify the uniqueness of each consumed message.
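
One way a consumer might verify uniqueness is to remember which messages it has already handled. The sketch below is only an illustration under stated assumptions: `make_deduplicating_handler` is a hypothetical helper, the sample messages are invented, and the in-memory set stands in for the durable store a real consumer would likely need. It relies on the CloudEvents rule that the `source` and `id` attributes together identify an event.

```python
def make_deduplicating_handler(process):
    """Return a handler that calls `process` at most once per (source, id) pair."""
    seen = set()  # assumption: in-memory only; not shared across restarts or instances

    def handle(message):
        key = (message["source"], message["id"])
        if key in seen:
            return False  # duplicate delivery, skipped
        process(message["data"])
        # Record the key only after processing succeeded, so that a failed
        # message is processed again when it is redelivered.
        seen.add(key)
        return True

    return handle


# Hypothetical incoming messages; the third repeats the first.
incoming = [
    {"specversion": "1.0", "id": "A-1", "source": "/example/etl",
     "type": "example.order", "data": {"Id": "orders/1-A"}},
    {"specversion": "1.0", "id": "A-2", "source": "/example/etl",
     "type": "example.order", "data": {"Id": "orders/2-A"}},
    {"specversion": "1.0", "id": "A-1", "source": "/example/etl",
     "type": "example.order", "data": {"Id": "orders/1-A"}},
]

processed = []
handle = make_deduplicating_handler(processed.append)
results = [handle(message) for message in incoming]
```

Here `results` flags the third message as a duplicate, and `processed` holds only the two distinct orders.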

{NOTE/}
{PANEL/}

{PANEL: CloudEvents}
@@ -121,8 +113,8 @@ These ETL tasks:

{PANEL: Task statistics}

Use the Studio's [Ongoing tasks stats view](../../../../studio/database/stats/ongoing-tasks-stats/overview) to see various statistics related to data extraction, transformation,
and loading to the target broker.
Use the Studio [Ongoing tasks stats](../../../../studio/database/stats/ongoing-tasks-stats/overview) view
to see various statistics related to data extraction, transformation, and loading to the target broker.

![Queue Brokers Stats](images/overview_stats.png "Ongoing tasks stats view")

Original file line number Diff line number Diff line change
@@ -18,7 +18,7 @@
---

* This article focuses on how to create a RabbitMQ ETL task using the Client API.
To define a RabbitMQ ETL task from the Studio see [Studio: RabbitMQ ETL Task](../../../../studio/database/tasks/ongoing-tasks/rabbitmq-etl-task).
To define a RabbitMQ ETL task from Studio, see [Studio: RabbitMQ ETL Task](../../../../studio/database/tasks/ongoing-tasks/rabbitmq-etl-task).
For an **overview of Queue ETL tasks**, see [Queue ETL tasks overview](../../../../server/ongoing-tasks/etl/queue-etl/overview).

* In this page:
@@ -61,7 +61,7 @@ Before setting up the ETL task, define a connection string that the task will us

{NOTE: }

<a id="example-basic" /> __Example - basic__:
<a id="example-basic" /> **Example - basic**:

---

@@ -76,7 +76,7 @@ Before setting up the ETL task, define a connection string that the task will us
{NOTE/}
{NOTE: }

<a id="delete-processed-documents" /> __Example - delete processed documents__:
<a id="delete-processed-documents" /> **Example - delete processed documents**:

---

Original file line number Diff line number Diff line change
@@ -4,21 +4,21 @@
{NOTE: }

* The RavenDB **Azure Queue Storage ETL task** -
* **Extracts** selected data from RavenDB documents from specified collections.
* **Extracts** selected data from RavenDB documents of specified collections.
* **Transforms** the data into JSON objects.
* Wraps the JSON objects as [CloudEvents messages](https://cloudevents.io) and **Loads** them to an Azure Queue Storage.

* The Azure Queue Storage ETL task transfers **documents only**.
Document extensions like attachments, counters, time series, and revisions are not sent.
The maximum message size in Azure Queue Storage is 64KB, documents larger than this won't be loaded.
The maximum message size in Azure Queue Storage is 64KB; documents larger than this will not be loaded.

* The Azure Queue Storage enqueues incoming messages at the tail of a queue.
[Azure Functions](https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue-trigger?tabs=python-v2%2Cisolated-process%2Cnodejs-v4%2Cextensionv5&pivots=programming-language-csharp)
can be triggered to access and consume messages when the enqueued messages advance to the queue head.

---

* This page explains how to create an Azure Queue Storage ETL task using the Studio.
* This page explains how to create an Azure Queue Storage ETL task using Studio.
[Learn here](../../../../server/ongoing-tasks/etl/queue-etl/azure-queue) how to define an Azure Queue Storage ETL task using the Client API.

* In this page:
Original file line number Diff line number Diff line change
@@ -0,0 +1,164 @@
# Add Connection String Operation
---

{NOTE: }

* Use the [PutConnectionStringOperation](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#the%C2%A0putconnectionstringoperation%C2%A0method) method to define a connection string in your database.

* In this page:
* [Add a RavenDB connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-a-ravendb-connection-string)
* [Add an SQL connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-an-sql-connection-string)
* [Add a Snowflake connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-a-snowflake-connection-string)
* [Add an OLAP connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-an-olap-connection-string)
* [Add an Elasticsearch connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-an-elasticsearch-connection-string)
* [Add a Kafka connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-a-kafka-connection-string)
* [Add a RabbitMQ connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-a-rabbitmq-connection-string)
* [Add an Azure Queue Storage connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-an-azure-queue-storage-connection-string)
* [Add an Amazon SQS connection string](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#add-an-amazon-sqs-connection-string)
* [The PutConnectionStringOperation method](../../../../client-api/operations/maintenance/connection-strings/add-connection-string#the%C2%A0putconnectionstringoperation%C2%A0method)

{NOTE/}

---

{PANEL: Add a RavenDB connection string}

RavenDB connection strings are used by [RavenDB ETL Tasks](../../../../server/ongoing-tasks/etl/raven).

#### Example:
{CODE add_raven_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

#### Syntax:
{CODE raven_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{NOTE: }

**Secure servers**

To [connect to secure RavenDB servers](../../../../server/security/authentication/certificate-management#enabling-communication-between-servers:-importing-and-exporting-certificates)
you need to:
1. Export the server certificate from the source server.
2. Install it as a client certificate on the destination server.

This can be done from the Studio [Certificates view](../../../../server/security/authentication/certificate-management#studio-certificates-management-view).

{NOTE/}

{PANEL/}

{PANEL: Add an SQL connection string}

SQL connection strings are used by RavenDB [SQL ETL Tasks](../../../../server/ongoing-tasks/etl/sql).

#### Example:
{CODE add_sql_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

#### Syntax:
{CODE sql_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{PANEL/}

{PANEL: Add a Snowflake connection string}

[Snowflake connection strings](https://github.com/snowflakedb/snowflake-connector-net/blob/master/doc/Connecting.md)
are used by RavenDB [Snowflake ETL Tasks](../../../../server/ongoing-tasks/etl/snowflake).

#### Example:
{CODE add_snowflake_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{PANEL/}

{PANEL: Add an OLAP connection string}

OLAP connection strings are used by RavenDB [OLAP ETL Tasks](../../../../server/ongoing-tasks/etl/olap).

#### Example: To a local machine
{CODE add_olap_connection_string_1@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

#### Example: To a cloud-based server

* The following example shows a connection string to Amazon AWS.
* Adjust the parameters as needed if you are using other cloud-based servers (e.g. Google, Azure, Glacier, S3, FTP).
* The available parameters are listed in [ETL destination settings](../../../../server/ongoing-tasks/etl/olap#etl-destination-settings).

{CODE add_olap_connection_string_2@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

#### Syntax:
{CODE olap_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{PANEL/}

{PANEL: Add an Elasticsearch connection string}

Elasticsearch connection strings are used by RavenDB [Elasticsearch ETL Tasks](../../../../server/ongoing-tasks/etl/elasticsearch).

#### Example:
{CODE add_elasticsearch_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

#### Syntax:
{CODE elasticsearch_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{PANEL/}

{PANEL: Add a Kafka connection string}

Kafka connection strings are used by RavenDB [Kafka Queue ETL Tasks](../../../../server/ongoing-tasks/etl/queue-etl/kafka).
Learn how to add a Kafka connection string in the [Add a Kafka connection string](../../../../server/ongoing-tasks/etl/queue-etl/kafka#add-a-kafka-connection-string) section.

{PANEL/}

{PANEL: Add a RabbitMQ connection string}

RabbitMQ connection strings are used by RavenDB [RabbitMQ Queue ETL Tasks](../../../../server/ongoing-tasks/etl/queue-etl/rabbit-mq).
Learn how to add a RabbitMQ connection string in the [Add a RabbitMQ connection string](../../../../server/ongoing-tasks/etl/queue-etl/rabbit-mq#add-a-rabbitmq-connection-string) section.

{PANEL/}

{PANEL: Add an Azure Queue Storage connection string}

Azure Queue Storage connection strings are used by RavenDB [Azure Queue Storage ETL Tasks](../../../../server/ongoing-tasks/etl/queue-etl/azure-queue).
Learn how to add an Azure Queue Storage connection string in the [Add an Azure Queue Storage connection string](../../../../server/ongoing-tasks/etl/queue-etl/azure-queue#add-an-azure-queue-storage-connection-string) section.

{PANEL/}

{PANEL: Add an Amazon SQS connection string}

Amazon SQS connection strings are used by RavenDB [Amazon SQS ETL Tasks](../../../../server/ongoing-tasks/etl/queue-etl/aws-sqs).
Learn how to add an Amazon SQS connection string in [this section](../../../../server/ongoing-tasks/etl/queue-etl/aws-sqs#add-an-aws-sqs-connection-string).

{PANEL/}

{PANEL: The&nbsp;`PutConnectionStringOperation`&nbsp;method}

{CODE put_connection_string@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

| Parameters           | Type                            | Description                                              |
|----------------------|---------------------------------|----------------------------------------------------------|
| **connectionString** | `RavenConnectionString`         | Object that defines the RavenDB connection string.       |
| **connectionString** | `SqlConnectionString`           | Object that defines the SQL connection string.           |
| **connectionString** | `SnowflakeConnectionString`     | Object that defines the Snowflake connection string.     |
| **connectionString** | `OlapConnectionString`          | Object that defines the OLAP connection string.          |
| **connectionString** | `ElasticSearchConnectionString` | Object that defines the Elasticsearch connection string. |
| **connectionString** | `QueueConnectionString`         | Object that defines the connection string for the Queue ETL tasks (Kafka, RabbitMQ, Azure Queue Storage, and Amazon SQS). |

{CODE connection_string_class@ClientApi\Operations\Maintenance\ConnectionStrings\AddConnectionStrings.cs /}

{PANEL/}

## Related Articles

### Connection Strings

- [Get](../../../../client-api/operations/maintenance/connection-strings/get-connection-string)
- [Remove](../../../../client-api/operations/maintenance/connection-strings/remove-connection-string)

### ETL (Extract, Transform, Load) Tasks

- [Operations: How to Add ETL](../../../../client-api/operations/maintenance/etl/add-etl)
- [Ongoing Tasks: ETL Basics](../../../../server/ongoing-tasks/etl/basics)

### External Replication

- [External Replication Task](../../../../studio/database/tasks/ongoing-tasks/external-replication-task)
- [How Replication Works](../../../../server/clustering/replication/replication)
