feat(source): add webhook source (#73)
* transfer #2762

* edit webhook

* edit github webhook

* enrich

* fix

* Apply suggestions from code review

Co-authored-by: hengm3467 <[email protected]>
Signed-off-by: IrisWan <[email protected]>

* lower case

* Update webhook.mdx

* mark as premium and public preview

* Update integrations/sources/github-webhook.mdx

Signed-off-by: Kexiang Wang <[email protected]>

---------

Signed-off-by: IrisWan <[email protected]>
Signed-off-by: Kexiang Wang <[email protected]>
Co-authored-by: Kexiang Wang <[email protected]>
Co-authored-by: hengm3467 <[email protected]>
3 people authored Nov 28, 2024
1 parent 18964c8 commit 4d56439
Showing 7 changed files with 227 additions and 2 deletions.
1 change: 1 addition & 0 deletions changelog/product-lifecycle.mdx
@@ -22,6 +22,7 @@ Below is a list of all features in the public preview phase:

| Feature name | Start version |
| :-- | :-- |
| [Ingest data from webhook](/integrations/sources/webhook) | 2.1 |
| [Shared source](/sql/commands/sql-create-source#shared-source) | 2.1 |
| [ASOF join](/processing/sql/joins#asof-joins) | 2.1 |
| [Partitioned Postgres CDC table](/integrations/sources/postgresql-cdc#ingest-data-from-a-partitioned-table) | 2.1 |
2 changes: 1 addition & 1 deletion get-started/rw-premium-edition-intro.mdx
@@ -28,7 +28,7 @@ RisingWave Premium 1.0 is the first major release of this new edition with sever

### Connectors

<CardGroup> <Card title="Sink to Snowflake" icon="snowflake" href="/integrations/destinations/snowflake" horizontal /> <Card title="Sink to DynamoDB" icon="database" href="/integrations/destinations/amazon-dynamodb" horizontal /> <Card title="Sink to OpenSearch" icon="magnifying-glass" href="/integrations/destinations/opensearch" horizontal /> <Card title="Sink to BigQuery" icon="table" href="/integrations/destinations/bigquery" horizontal /> <Card title="Sink to SharedMergeTree table engine on ClickHouse Cloud" icon="server" horizontal /> <Card title="Sink to SQL Server" icon="database" href="/integrations/destinations/sql-server" horizontal /> <Card title="Direct SQL Server CDC source connector" icon="plug" href="/integrations/sources/sql-server-cdc" horizontal /> <Card title="Sink to Iceberg with glue catalog" icon="mountain" href="/integrations/destinations/apache-iceberg#glue-catelogs" horizontal /> </CardGroup>
<CardGroup> <Card title="Sink to Snowflake" icon="snowflake" href="/integrations/destinations/snowflake" horizontal /> <Card title="Sink to DynamoDB" icon="database" href="/integrations/destinations/amazon-dynamodb" horizontal /> <Card title="Sink to OpenSearch" icon="magnifying-glass" href="/integrations/destinations/opensearch" horizontal /> <Card title="Sink to BigQuery" icon="table" href="/integrations/destinations/bigquery" horizontal /> <Card title="Sink to SharedMergeTree table engine on ClickHouse Cloud" icon="server" horizontal /> <Card title="Sink to SQL Server" icon="database" href="/integrations/destinations/sql-server" horizontal /> <Card title="Direct SQL Server CDC source connector" icon="plug" href="/integrations/sources/sql-server-cdc" horizontal /> <Card title="Sink to Iceberg with glue catalog" icon="mountain" href="/integrations/destinations/apache-iceberg#glue-catelogs" horizontal /> <Card title="Ingest data from webhook" icon="laptop-code" href="/integrations/sources/webhook" horizontal /></CardGroup>

For users who are already using these features in 1.9.x or earlier versions, rest assured that the functionality of these features will be intact if you stay on the version. If you choose to upgrade to v2.0 or later versions, an error will show up to indicate you need a license to use the features.

124 changes: 124 additions & 0 deletions integrations/sources/github-webhook.mdx
@@ -0,0 +1,124 @@
---
title: "Ingest data from GitHub webhook"
description: "Ingest GitHub events directly into your RisingWave database for real-time processing and analytics."
sidebarTitle: GitHub webhook
---

GitHub webhooks allow you to build or set up integrations that subscribe to certain events on `GitHub.com`. When one of those events is triggered, GitHub sends an HTTP POST payload to the webhook's configured URL. Webhooks can be used to update an external issue tracker, trigger CI builds, update a backup mirror, or even deploy to your production server.

This guide will walk through the steps to set up RisingWave as a destination for GitHub webhooks.

## 1. Create a secret in RisingWave

First, create a secret in RisingWave to securely store a secret string. This secret will be used to validate incoming webhook requests from GitHub.

```sql
CREATE SECRET test_secret WITH (backend = 'meta') AS 'TEST_WEBHOOK';
```

| Parameter or clause | Description |
| :--------- | :-----------|
|`test_secret`| The name of the secret.|
| `TEST_WEBHOOK`| The secret string used for signing and verifying webhook payloads. Replace this with a secure, random string.|

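One way to produce a suitable secret string is Python's standard `secrets` module. The snippet below is only a sketch: the 32-byte length and the `webhook_secret` variable name are illustrative assumptions, not RisingWave requirements.

```python
import secrets

# Generate 32 random bytes from the OS random source and hex-encode them
# (64 hexadecimal characters). The length is an illustrative choice.
webhook_secret = secrets.token_hex(32)

# Use the printed value in CREATE SECRET ... AS '<value>' and in the
# "Secret" field of the GitHub webhook settings.
print(webhook_secret)
```
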
## 2. Create a table in RisingWave

Next, create a table configured to accept webhook data from GitHub.

```sql
CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'x-hub-signature-256', -- Example value: sha256=f37a93a68fef1505d75e920a15d0543199557be72d2182e5cf8c15d7f9a6260f
'sha256=' || encode(hmac(test_secret, data, 'sha256'), 'hex')
);
```


| Parameter or clause | Description |
| :--------- | :-----------|
| `data JSONB` | Defines the name of the column that stores the JSON payload from the webhook. Currently, only the `JSONB` type is supported for webhook tables. |
| `headers->>'...'` | Extracts the signature provided by GitHub in the `x-hub-signature-256` HTTP header. <br/> <br/> Inside the `secure_compare()` function, the HTTP headers are exposed as a JSONB object, and you can read a header value with the `->>` operator. Header names must be written in lower case in the `->>` operator; otherwise, the verification will fail. |
|`'sha256=' \|\| encode(...)` | Computes the expected signature. In the example above, it generates an `HMAC SHA-256` hash of the payload (`data`) using the secret (`test_secret`), encodes it in hexadecimal, and prefixes it with `sha256=`.|
| `secure_compare(...)` | Validates requests by matching the header signature against the computed signature, ensuring only authenticated requests are processed. The `secure_compare()` function compares two strings in a fixed amount of time, regardless of whether they are equal or not, ensuring that the comparison is secure and resistant to timing attacks. |

GitHub signs each payload with both the SHA-1 and SHA-256 HMAC algorithms and sends the results in the `x-hub-signature` and `x-hub-signature-256` headers, so you can validate against either one. The example above uses SHA-256 HMAC. To use SHA-1 instead, change `x-hub-signature-256` to `x-hub-signature` and `sha256` to `sha1` in the `VALIDATE` clause.

```sql Example using SHA-1
CREATE SECRET test_secret WITH ( backend = 'meta') AS 'TEST_WEBHOOK';
-- Webhook table example for GitHub
CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'x-hub-signature',
'sha1=' || encode(hmac(test_secret, data, 'sha1'), 'hex')
);
```
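
To test a table before connecting GitHub, it can help to compute the same signature on the client side. The following Python sketch mirrors the SQL expression `'sha256=' || encode(hmac(...), 'hex')` using only the standard library. The function name `github_signature` and the sample body are hypothetical, and the sketch assumes the signature is computed over the raw request body bytes.

```python
import hashlib
import hmac


def github_signature(secret: str, body: bytes, algo: str = "sha256") -> str:
    """Return '<algo>=<hex HMAC>' for the given secret and request body."""
    if algo not in ("sha1", "sha256"):
        raise ValueError("algo must be 'sha1' or 'sha256'")
    # hmac.new accepts a hashlib constructor as the digest module.
    digest = hmac.new(secret.encode("utf-8"), body, getattr(hashlib, algo))
    return f"{algo}={digest.hexdigest()}"


body = b'{"action": "opened"}'
print(github_signature("TEST_WEBHOOK", body))          # value for x-hub-signature-256
print(github_signature("TEST_WEBHOOK", body, "sha1"))  # value for x-hub-signature
```

The two printed strings correspond to the header values that the SHA-256 and SHA-1 variants of the `VALIDATE` clause expect.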

## 3. Set up webhook in GitHub

After configuring RisingWave to accept webhook data, set up GitHub to send events to your RisingWave instance.

### RisingWave webhook URL

The webhook URL should follow this format:
```
https://<HOST>/webhook/<database>/<schema_name>/<table_name>
```

| Parameter | Description |
|-----------|-------------|
| `HOST` | The hostname or IP address where your RisingWave instance is accessible. This could be a domain name or an IP address. |
| `database` | The name of the RisingWave database where your table resides. |
| `schema_name` | The schema name of your table, typically `public` unless specified otherwise. |
| `table_name` | The name of the table you created to receive webhook data, e.g., `wbhtable`. |

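As a small illustration of the format, the URL can be assembled from its four parts. The helper name `build_webhook_url`, the host `rw.example.com`, and the database name `dev` are assumptions made up for this sketch.

```python
def build_webhook_url(host: str, database: str, schema: str, table: str) -> str:
    """Assemble https://<HOST>/webhook/<database>/<schema_name>/<table_name>."""
    return f"https://{host}/webhook/{database}/{schema}/{table}"


print(build_webhook_url("rw.example.com", "dev", "public", "wbhtable"))
# → https://rw.example.com/webhook/dev/public/wbhtable
```
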

### Configure webhook in GitHub

For more detailed instructions, refer to the [GitHub documentation](https://docs.github.com/en/webhooks/using-webhooks/creating-webhooks#creating-a-repository-webhook).

<Steps>
<Step>
Go to your GitHub repository, and click the **Settings** tab.
</Step>
<Step>
In the left sidebar, click **Webhooks** > **Add webhook**.
</Step>
<Step>
Configure the webhook settings:

- **Payload URL**: Enter your RisingWave webhook URL.
- **Content type**: Select `application/json`.
- **Secret**: Enter the same secret string you used when creating the RisingWave secret (e.g., `TEST_WEBHOOK`, without the surrounding quotes). This ensures that GitHub signs the payloads using this secret, allowing RisingWave to validate them.
- **Which events would you like to trigger this webhook?**: Choose the events you want to subscribe to. For testing purposes, you might start with **Just the push event**.
- **Active**: Ensure the webhook is set to active.
</Step>
<Step>
Click **Add webhook** at the bottom of the page to save.
</Step>
</Steps>

## 4. Push data from GitHub via webhook

With the webhook configured, GitHub will automatically send HTTP POST requests to your RisingWave webhook URL whenever the specified events occur (e.g., pushes to the repository). RisingWave will receive these requests, validate the signatures, and insert the payload data into the target table.

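The validation step can be pictured with a short Python sketch. This is not RisingWave's implementation; it only imitates the check expressed by the `VALIDATE` clause. The function name `is_valid_delivery` and the sample payload are hypothetical.

```python
import hashlib
import hmac


def is_valid_delivery(headers: dict, body: bytes, secret: str) -> bool:
    """Return True if the x-hub-signature-256 header matches the body."""
    # Look headers up by lower-case name, as the VALIDATE clause does.
    lowered = {name.lower(): value for name, value in headers.items()}
    received = lowered.get("x-hub-signature-256", "")
    expected = "sha256=" + hmac.new(
        secret.encode("utf-8"), body, hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking where the two values first differ.
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))


body = b'{"zen": "Keep it logically awesome."}'
good = "sha256=" + hmac.new(b"TEST_WEBHOOK", body, hashlib.sha256).hexdigest()
print(is_valid_delivery({"X-Hub-Signature-256": good}, body, "TEST_WEBHOOK"))         # → True
print(is_valid_delivery({"X-Hub-Signature-256": "sha256=00"}, body, "TEST_WEBHOOK"))  # → False
```
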
## 5. Further event processing
Once ingested, the data in the table is ready for further processing. You can access the fields using `data->'field_name'` in SQL queries.

You can create a materialized view to extract specific fields from the JSON payload.

```sql
CREATE MATERIALIZED VIEW github_events AS
SELECT
data->>'action' AS action,
data->'repository'->>'full_name' AS repository_name,
data->'sender'->>'login' AS sender_login,
data->>'created_at' AS event_time
FROM wbhtable;
```
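
For readers more familiar with application code, the extraction performed by the view above corresponds roughly to the following Python sketch. The function name `extract_event_fields` and the sample payload are made up for illustration; missing keys yield `None`, much as the JSON operators yield `NULL`.

```python
import json


def extract_event_fields(raw: str) -> dict:
    """Pull the same four fields the materialized view selects."""
    payload = json.loads(raw)
    # Fall back to empty dicts so absent objects do not raise errors.
    repository = payload.get("repository") or {}
    sender = payload.get("sender") or {}
    return {
        "action": payload.get("action"),
        "repository_name": repository.get("full_name"),
        "sender_login": sender.get("login"),
        "event_time": payload.get("created_at"),
    }


sample = '{"action": "opened", "repository": {"full_name": "octo/demo"}, "sender": {"login": "octocat"}}'
print(extract_event_fields(sample))
```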

You can now query `github_events` like a regular table to perform analytics, generate reports, or trigger further processing.
68 changes: 68 additions & 0 deletions integrations/sources/webhook.mdx
@@ -0,0 +1,68 @@
---
title: "Ingest data from webhook"
sidebarTitle: Overview
description: Describes how to ingest data from webhook to RisingWave
---

A webhook is a mechanism that enables real-time data transfer between applications by sending immediate notifications when specific events occur. Instead of continuously polling for updates, applications can receive data automatically, making it an efficient way to integrate with third-party services.

RisingWave can serve as a webhook destination, directly accepting HTTP requests from external services and storing the incoming data in its tables. When a webhook is triggered, RisingWave processes and ingests the data in real-time.


This direct integration eliminates the need for an intermediary message broker like Kafka. Instead of setting up and maintaining an extra Kafka cluster, you can directly send data to RisingWave to process it in real-time, which enables efficient data ingestion and stream processing without extra infrastructure.

<Tip>
**PREMIUM EDITION FEATURE**

This feature is only available in the premium edition of RisingWave. The premium edition offers additional advanced features and capabilities beyond the free and community editions. If you have any questions about upgrading to the premium edition, please contact our sales team at [[email protected]](mailto:[email protected]).
</Tip>

<Note>
**PUBLIC PREVIEW**

This feature is in the public preview stage, meaning it's nearing the final product but is not yet fully stable. If you encounter any issues or have feedback, please contact us through our [Slack channel](https://www.risingwave.com/slack). Your input is valuable in helping us improve the feature. For more information, see our [Public preview feature list](/changelog/product-lifecycle#features-in-the-public-preview-stage).
</Note>

## Creating a webhook table in RisingWave

To utilize webhook sources in RisingWave, you need to create a table configured to accept webhook requests. Below is a basic example of how to set up such a table:

```sql
CREATE SECRET test_secret WITH (backend = 'meta') AS 'secret_value';

CREATE TABLE wbhtable (
data JSONB
) WITH (
connector = 'webhook'
) VALIDATE SECRET test_secret AS secure_compare(
headers->>'{header of signature}',
{signature generation expressions}
);
```

| Parameter or clause | Description |
| :---------- | :------------ |
| `CREATE SECRET` | Securely stores a secret value in RisingWave for request validation. |
| `CREATE TABLE` | Defines a table with a JSONB column to store webhook payload data. |
| `connector` | Configures the table to accept incoming HTTP webhook requests. |
| `VALIDATE SECRET...AS...` | Authenticates requests using the stored secret and signature comparison. |
| `secure_compare()` | Validates requests by matching the header signature against the computed signature, ensuring only authenticated requests are processed. Note that `secure_compare(...)` is the only supported validation function for webhook tables. |
| `{header of signature}` | Specifies which HTTP header contains the incoming signature. |
| `{signature generation expressions}` | Expression to compute the expected signature using the secret and payload. |


## Supported webhook sources and authentication methods
RisingWave has been verified to work with the following webhook sources and authentication methods:

|Webhook source|Authentication methods|
| :-- | :-- |
|GitHub| SHA-1 HMAC, SHA-256 HMAC |
|RudderStack| Bearer Token |
|Segment| SHA-1 HMAC |
|AWS EventBridge| Bearer Token |
|HubSpot| API Key, Signature V2 |

<Note>While only the above sources have been thoroughly tested, RisingWave can support additional webhook sources and authentication methods. You can integrate other services using similar configurations.</Note>

## See also
[Ingest from GitHub webhook](/integrations/sources/github-webhook): Step-by-step guide to help you set up and configure your webhook sources.
7 changes: 7 additions & 0 deletions mint.json
@@ -708,6 +708,13 @@
"integrations/sources/emqx",
"integrations/sources/hivemq"
]
},
{
"group": "Webhook",
"pages": [
"integrations/sources/webhook",
"integrations/sources/github-webhook"
]
}
]
}
10 changes: 10 additions & 0 deletions sql/functions/comparison.mdx
@@ -34,3 +34,13 @@ title: "Comparison functions and operators"
| IS NOT NULL | `value IS NOT NULL` <br/>Whether a value is not null. | 1 IS NOT NULL → t |
| IS UNKNOWN | `boolean IS UNKNOWN` <br/>Whether a boolean expression returns an unknown value (typically represented by a null). | null IS UNKNOWN → t <br/>false IS UNKNOWN → f |
| IS NOT UNKNOWN | `boolean IS NOT UNKNOWN` <br/>Whether a boolean expression returns true or false. | true IS NOT UNKNOWN → t <br/>null IS NOT UNKNOWN → f |

## Special Comparison

### `secure_compare`

Compares two strings in a fixed amount of time, regardless of whether they are equal or not, which makes the comparison resistant to timing attacks.

```sql Syntax
secure_compare (varchar, varchar) -> boolean
```
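
The idea behind a fixed-time comparison can be sketched in Python. This is an illustration of the technique only, not the code behind `secure_compare`; the function name `constant_time_equals` is hypothetical, and pure Python offers no strict timing guarantees, so real Python code should call `hmac.compare_digest` instead.

```python
def constant_time_equals(a: str, b: str) -> bool:
    """Compare two strings without stopping at the first mismatch."""
    x, y = a.encode("utf-8"), b.encode("utf-8")
    if len(x) != len(y):
        return False  # The length is not treated as secret in this sketch.
    diff = 0
    # Accumulate differences over every byte instead of returning early.
    for byte_x, byte_y in zip(x, y):
        diff |= byte_x ^ byte_y
    return diff == 0


print(constant_time_equals("sha256=abc", "sha256=abc"))  # → True
print(constant_time_equals("sha256=abc", "sha256=abd"))  # → False
```

An ordinary `==` may return as soon as it finds a differing byte, which can leak how much of a guessed signature is correct; scanning every byte removes that signal.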
17 changes: 16 additions & 1 deletion sql/functions/cryptographic.mdx
@@ -3,7 +3,7 @@ title: "Cryptographic functions"
description: "Raw encryption functions are basic encryption functions that perform encryption and decryption of data using cryptographic algorithms."
---

### `Raw encryption functions`
## `Raw encryption functions`

Please note they solely apply a cipher to the data and do not provide additional security measures.

@@ -56,3 +56,18 @@ SELECT decrypt('\x9cf6a49f90b3ac816aeeeed286606fdb','my_secret_key111', 'aes-cbc
(1 row)

```

## `hmac`

Returns the HMAC of the payload, computed with the given secret and hash algorithm. For background on the construction, see [`HMAC`](https://en.wikipedia.org/wiki/HMAC). Currently, the supported hash algorithms for `hash_algo` are `sha1` and `sha256`.

```sql Syntax
hmac (secret varchar, payload bytea, hash_algo varchar) -> signature bytea
```

```sql Example
SELECT hmac('secret', 'payload'::bytea, 'sha256');
----RESULT
\xb82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4
(1 row)
```
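
To cross-check a result outside the database, the same inputs can be fed to Python's standard `hmac` module. This is a minimal sketch; it assumes the `varchar` secret and the `bytea` payload are both taken as their UTF-8 bytes.

```python
import hashlib
import hmac

# Same inputs as the SQL example: secret 'secret', payload 'payload', SHA-256.
digest = hmac.new(b"secret", b"payload", hashlib.sha256).digest()

# Print in the bytea hex style used by the SQL result (\x followed by hex).
print("\\x" + digest.hex())
```

The printed value can be compared against the `bytea` result returned by the SQL query.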
