Skip to content

Commit

Permalink
Multisubscription documentation with tests (#89)
Browse files Browse the repository at this point in the history
PR for #85 and #87.

Co-authored-by: Torben Meyer <[email protected]>
Co-authored-by: Ramin Gharib <[email protected]>
Co-authored-by: Christoph Böhm <[email protected]>
  • Loading branch information
4 people authored Oct 13, 2022
1 parent f144720 commit 7832df1
Show file tree
Hide file tree
Showing 10 changed files with 559 additions and 45 deletions.
15 changes: 15 additions & 0 deletions common/src/testFixtures/avro/ClickStats.avsc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{
"namespace": "com.bakdata.quick.avro",
"name": "ClickStatsAvro",
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "amount",
"type": "long"
}
]
}
15 changes: 15 additions & 0 deletions common/src/testFixtures/avro/PurchaseStats.avsc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{
"namespace": "com.bakdata.quick.avro",
"name": "PurchaseStatsAvro",
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "amount",
"type": "long"
}
]
}
10 changes: 10 additions & 0 deletions common/src/testFixtures/proto/test.proto
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,16 @@ message ComplexProtoTestRecord {
ProtoTestRecord protoTestRecord = 2;
}

message ClickStatsProto {
string id = 1;
int64 amount = 2;
}

message PurchaseStatsProto {
string id = 1;
int64 amount = 2;
}

message ProtoRangeQueryTest {
int64 userId = 1;
int32 timestamp = 2;
Expand Down
90 changes: 90 additions & 0 deletions docs/docs/developer/multi-subscription-details.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
# Multi-subscriptions details

The introduction to multi-subscriptions is
[here](../user/getting-started/working-with-quick/multi-subscriptions.md).

We now discuss implementation details of multi-subscriptions.
As a reminder, we followed these steps:

1. Define the multi-subscription.
2. Apply the new schema.
3. Run the multi-subscription.
4. Ingest data.


## Apply the new schema

When applied,
a [`MultiSubscriptionFetcher`](https://github.com/bakdata/quick/blob/0.7.0/gateway/src/main/java/com/bakdata/quick/gateway/fetcher/subscription/MultiSubscriptionFetcher.java)
is created.
During this process,
Quick constructs a [`DataFetcherClient`](https://github.com/bakdata/quick/blob/0.7.0/gateway/src/main/java/com/bakdata/quick/gateway/fetcher/DataFetcherClient.java)
(for REST calls)
and a [`SubscriptionProvider`](https://github.com/bakdata/quick/blob/0.7.0/gateway/src/main/java/com/bakdata/quick/gateway/fetcher/subscription/SubscriptionProvider.java)
(for consuming from a topic)
for each field marked with the `@topic` directive.

## Run the multi-subscription

When a query is executed,
the created Kafka Consumers poll the corresponding topics for events.
When a new event is emitted,
it is sent via a WebSocket to the user.
To get the missing part of complex objects, the `MultiSubscriptionFetcher`
uses either the REST service of the corresponding mirror
or an internal cache to fetch missing data.
This choice depends on the current scenario.
In our [example](../user/getting-started/working-with-quick/multi-subscriptions.md),
there are three scenarios for building complex objects via multi-subscriptions:

1. Purchase event arrives; there were some click events,
but they have not been seen by the `MultiSubscriptionFetcher` yet
(for example, because the subscription started after the events were produced).
2. Purchase event arrives, and we have already seen the click event for the corresponding id.
3. Purchase event arrives, and there has been no click event.
In the above considerations, the first event is `Purchase`.
However, the choice is interchangeable.
The system behaves similarly
if the order of arrival changes,
i.e., `Click` first, then `Purchase`.

## Ingest data

Say you first ingest a single `Purchase`.
Thus, a `Click` is missing.
`MultiSubscriptionFetcher` first checks the internal cache to see
whether there is a `Click` that refers to the particular id
(the same the `Purchase` refers to).
If successful, `MultiSubscriptionFetcher` retrieves the value from the cache,
creates a complex object,
and returns it to the user (Scenario 2).
If there is a cache miss,
`MultiSubscriptionFetcher` uses the REST interface of the corresponding client-mirror
and sends a request for the desired key.
If there were previous click events (Scenario
1), it retrieves `Click` data,
creates the complex object
and inserts the information into the cache for later use.
If it receives nothing from the REST endpoint,
the result depends on the user's decision concerning nullable values.
If a user allows nullable values,
an incomplete object will be returned, e.g.:
```json
{
"data": {
"userStatistics": {
"purchase": {
"purchaseId": "abc"
},
"click": null
}
}
}
```

If nullable values are not allowed,
a user will receive a subscription error:
```console
Error: The field at path '/userStatistics/click' was declared as a non null type,
but the code involved in retrieving data has wrongly returned a null value [...].
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
# Multi-subscriptions

In [Subscriptions](subscriptions.md), you learned
how to get real-time updates using the `Subscription` type.
The example discussed receiving updates from a single topic.

Quick also allows you to create a so-called multi-subscription.
A multi-subscription enables you to
retrieve complex objects where elements
come from more than one Kafka topic.
To integrate a multi-subscription into your application,
we take the following steps:

1. Define the multi-subscription.
2. Apply the new schema.
3. Run the multi-subscription.
4. Ingest data.

---

For the purpose of this tutorial,
consider the following scenario:
You want live statistics
of users' clicks and purchases.
Information about clicks and purchases is stored
in separate Kafka topics.
To retrieve such combined users' statistics,
you create a multi-subscription.
Before you define the multi-subscription,
add the following type to your schema `schema.gql`:
```graphql
type Click {
userId: Int!
timestamp: Int!
}
```
Next, create a topic that holds `Click` entries:
```shell
quick topic create click --key-type string
--value-type schema --schema example.Click
```


## Modify your schema with and define the multi-subscription

As a first step, you extend the schema
(from earlier sections of this documentation)
with the definition of a multi-subscription as follows:
```graphql
type Subscription {
userStatistics: UserStatistics
}

type UserStatistics {
purchases: Purchase @topic(name: "purchase")
clicks: Click @topic(name: "click")
}
```

Note that multi-subscriptions need different modelling than single subscriptions.
In a [single subscription](subscriptions.md), you add a topic directive
(which references a Kafka topic) to the field that describes the entities
you want to receive updates for, i.e.:
```graphql title="schema.gql"
type Subscription {
purchases: Purchase @topic(name: "purchase")
}
```

In a multi-subscription, the field `userStatistics` in `Subscription`
is defined through another type, i.e., `UserStatistics`.
This `UserStatistics` comprises the desired fields, each followed by the `@topic` directive.

## Apply the new schema

You can now apply the modified schema to your gateway:
```console
quick gateway apply example -f schema.gql
```

## Run the multi-subscription

To demo the multi-subscription,
we use a GraphQL client, e.g., [Altair](https://altair.sirmuel.design/).
The setup is described [here](subscriptions.md).

Subscribe to user statistics with the following query:
```graphql title="subscription.gql"
subscription {
userStatistics {
purchase {
purchaseId
}
click {
userId
timestamp
}
}
}
```

In Altair, you would press the `Run subscription` button
to start the subscription.

## Ingest data

You can now ingest data to the different topics
and see how the multi-subscription behaves.

Start with adding a single purchase:
```json title="subscription-purchase.json"
{
"key": "abc",
"value": {
"purchaseId": "abc",
"productId": 123,
"userId": 2,
"amount": 2,
"price": {
"total": 29.99,
"currency": "DOLLAR"
}
}
}
```
```shell
curl --request POST --url "$QUICK_URL/ingest/purchase" \
--header "content-type:application/json" \
--header "X-API-Key:$QUICK_API_KEY"\
--data "@./subscription-purchase.json"
```

As a result, you see the following JSON response in your GraphQL client:
```json
{
"data": {
"userStatistics": {
"purchase": {
"purchaseId": "abc"
},
"click": null
}
}
}
```

There hasn't been any click event yet,
so the response contains only purchase data.

Now, send the following click event via the ingest service.
!!! Note
The key of the click event equals the id of the purchase event.
```json title="click1.json"
{
"key": "abc",
"value": {
"userId": 2,
"timestamp": 123456
}
}
```
```shell
curl --request POST --url "$QUICK_URL/ingest/click" \
--header "content-type:application/json" \
--header "X-API-Key:$QUICK_API_KEY"\
--data "@./click1.json"
```

You should see the following subscription result in your GraphQL client:
```json
{
"data": {
"userStatistics": {
"purchase": {
"purchaseId": "abc"
},
"click": {
"userId": 2,
"timestamp": 123456
}
}
}
}
```

Now consider another click with the same id `abc` but with a different timestamp.
This causes a new response with the latest timestamp.
```json title="click2.json"
{
"key": "abc",
"value": {
"userId": 2,
"timestamp": 234567
}
}
```

As you can see, when a new event of one type, say `Click`, arrives,
you immediately receive the latest state of the other type, here `Purchase`.

Thus, Quick automatically creates an up-to-date response
with elements of the different types stored in different topics.
This mechanism can be generalized to multi-subscriptions
that comprise more than two types (topics).
For example, in a type that consists of three elements
a new event causes a fetch of the corresponding other types
to create a response immediately.

Please see the [developer section on multi-subscriptions](../../../developer/multi-subscription-details.md),
where we discuss the subtleties of that process.
2 changes: 2 additions & 0 deletions docs/mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@ nav:
- user/getting-started/working-with-quick/ingest-data.md
- user/getting-started/working-with-quick/query-data.md
- user/getting-started/working-with-quick/subscriptions.md
- user/getting-started/working-with-quick/multi-subscriptions.md
- user/getting-started/working-with-quick/range-query.md
- user/getting-started/teardown-resources.md
- Examples:
Expand All @@ -87,6 +88,7 @@ nav:
- developer/operations.md
- developer/cli.md
- developer/notice.md
- developer/multi-subscription-details.md
- developer/range-query-details.md
- Changelog: changelog.md
- Roadmap: roadmap.md
Loading

0 comments on commit 7832df1

Please sign in to comment.