Docs update (#308)
* Documentation update

* Updated protobuf dev url
JulienR1 authored Sep 18, 2023
1 parent c3652e0 commit a21236e
Showing 10 changed files with 73 additions and 59 deletions.
38 changes: 19 additions & 19 deletions docs/concepts-and-fundamentals/benefits.md
@@ -6,24 +6,24 @@ description: StreamingFast Substreams benefits and comparisons

## Important Substreams facts include:

* It provides a streaming-first system based on gRPC, protobuf, and the StreamingFast Firehose.
* It supports a highly cacheable and parallelizable remote code execution framework.
* It enables the community to build higher-order modules that are composable down to individual modules.
* Deterministic blockchain data is fed to Substreams, **making it deterministic**.
* It is **not** a relational database.
* It is **not** a REST service.
* It is **not** concerned directly about how data is queried.
* It is **not** a general-purpose non-deterministic event stream processor.
- It provides a streaming-first system based on gRPC, protobuf, and the StreamingFast Firehose.
- It supports a highly cacheable and parallelizable remote code execution framework.
- It enables the community to build higher-order modules that are composable down to individual modules.
- Deterministic blockchain data is fed to Substreams, **making it deterministic**.
- It is **not** a relational database.
- It is **not** a REST service.
- It is **not** concerned directly about how data is queried.
- It is **not** a general-purpose non-deterministic event stream processor.

### Substreams offers several benefits including:

* The ability to store and process blockchain data using advanced parallelization techniques, making the processed data available for various types of data stores or real-time systems.
* A streaming-first approach that inherits low latency extraction from [StreamingFast Firehose](https://firehose.streamingfast.io/).
* The ability to save time and money by horizontally scaling and increasing efficiency by reducing processing time and wait time.
* The ability for communities to [combine Substreams modules](../developers-guide/modules/) to form compounding levels of data richness and refinement.
* The use of [protobufs for data modeling and integration](../developers-guide/creating-protobuf-schemas.md) in a variety of programming languages.
* The use of the Rust programming language and a wide array of third-party libraries compilable with WASM to manipulate blockchain data on-the-fly.
* Inspiration from conventional large-scale data systems fused into the novelties of blockchain technology.
- The ability to store and process blockchain data using advanced parallelization techniques, making the processed data available for various types of data stores or real-time systems.
- A streaming-first approach that inherits low latency extraction from [StreamingFast Firehose](https://firehose.streamingfast.io/).
- The ability to save time and money by horizontally scaling and increasing efficiency by reducing processing time and wait time.
- The ability for communities to [combine Substreams modules](../developers-guide/modules/) to form compounding levels of data richness and refinement.
- The use of [protobufs for data modeling and integration](../developers-guide/creating-protobuf-schemas.md) in a variety of programming languages.
- The use of the Rust programming language and a wide array of third-party libraries compilable with WASM to manipulate blockchain data on-the-fly.
- Inspiration from conventional large-scale data systems fused into the novelties of blockchain technology.

### **Other features**

@@ -49,16 +49,16 @@ Substreams is a streaming engine similar to [Fluvio](https://www.fluvio.io/), [K

#### Substreams & Subgraphs

A lot of questions arise around Substreams and Subgraphs as they are both part of The Graph ecosystem. Substreams has been created by StreamingFast team, the first core developers teams outside of Edge & Node, the founding team of The Graph. It was created in response to different use cases especially around analytics and big data that couldn't be served by Subgraph due to its current programming model. Here some of the key points for which Substreams were created:
A lot of questions arise around Substreams and Subgraphs as they are both part of The Graph ecosystem. Substreams has been created by the StreamingFast team, the first core developers teams outside of Edge & Node, the founding team of The Graph. It was created in response to different use cases especially around analytics and big data that couldn't be served by Subgraph due to its current programming model. Here some of the key points for which Substreams were created:

- Offer a streaming-first approach to consuming/transforming blockchain's data
- Offer a highly parallelizable yet simple model to consume/transform blockchain's data
- Offer a composable system where you can depend on building blocks offered by the community
- Offer rich block model

While they share similar ideas around blockchain's transformation/processing and they are both part of The Graph ecosystem, both can be viewed as independent technology that are unrelated to each other. One cannot take a Subgraph's code and run it on Substreams engine, they are incompatible. Here some of key differences:
While they share similar ideas around blockchain's transformation/processing and they are both part of The Graph ecosystem, both can be viewed as independent technology that are unrelated to each other. One cannot take a Subgraph's code and run it on the Substreams engine, they are incompatible. Here some of key differences:

- You write your Substreams in Rust while Subgraph are written in AssemblyScript
- You write your Substreams in Rust while Subgraphs are written in AssemblyScript
- Substreams are "stateless" request through gRPC while Subgraphs are persistent deployment
- Substreams offers you the chain's specific full block while in Subgraph, you define "triggers" that will invoke your code
- Substreams are consumed through a gRPC connection where you control the actual output message while Subgraphs are consumed through GraphQL
@@ -67,4 +67,4 @@ While they share similar ideas around blockchain's transformation/processing and

Substreams offers quite a different model compared to Subgraphs; Rust alone is a big shift for someone used to writing Subgraphs in AssemblyScript. Substreams also relies heavily on Protobuf models.

One of the benefits of Substreams is that the persistent storage solution is not part of the technology directly, so you are free to use the database of your choice which enable a lot of analytics use cases that was not possible (or harder to implement) today using Subgraphs like persistent your transformed data to BigQuery or Clickhouse, Kafka, etc. Also, the live streaming feature of Substreams enables further use cases and super quick reactivity that will benefits a lot of user.
One of the benefits of Substreams is that the persistent storage solution is not part of the technology directly, so you are free to use the database of your choice. This enables a lot of analytics use cases that were not possible (or harder to implement) today using Subgraphs like persistent your transformed data to BigQuery or Clickhouse, Kafka, etc. Also, the live streaming feature of Substreams enables further use cases and super quick reactivity that will benefits a lot of user.
22 changes: 11 additions & 11 deletions docs/concepts-and-fundamentals/fundamentals.md
@@ -12,13 +12,13 @@ Substreams development involves using several different pieces of technology, in

### The process to use Substreams includes:

* Choose the blockchain to capture and process data.
* Identify interesting smart contract addresses (like DEXs or interesting wallet addresses).
* Identify the data and defining and creating protobufs.
* Find already-built Substreams modules and consume their streams, or:
* Write Rust Substreams module handler functions.
* Update the Substreams manifest to reference the protobufs and module handlers.
* Use the [`substreams` CLI](../reference-and-specs/command-line-interface.md) to send commands and view results.
- Choose the blockchain to capture and process data.
- Identify interesting smart contract addresses (like DEXs or interesting wallet addresses).
- Identify the data and defining and creating protobufs.
- Find already-built Substreams modules and consume their streams, or:
- Write Rust Substreams module handler functions.
- Update the Substreams manifest to reference the protobufs and module handlers.
- Use the [`substreams` CLI](../reference-and-specs/command-line-interface.md) to send commands and view results.

### **The Substreams engine**

@@ -42,7 +42,7 @@ The data flow is [defined in the Substreams manifest](../reference-and-specs/man

### **Substreams DAG**

Substreams modules are composed through a [directed acyclic graph](https://en.wikipedia.org/wiki/Directed\_acyclic\_graph) (DAG).
Substreams modules are composed through a [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) (DAG).
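
As a rough illustration of what composing modules through a DAG implies, the sketch below orders a handful of modules so that each one comes after the modules it reads from. The `execution_order` helper and the module names used with it are hypothetical and not part of the Substreams API; how the engine actually schedules modules is not shown in this document.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Returns the modules in an order where every module comes after its inputs,
/// or `None` if the dependencies contain a cycle (the graph is not a DAG).
/// `deps` maps a module name to the names of the modules it reads from.
fn execution_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    // Count unresolved inputs per module and record who depends on whom.
    let mut pending: BTreeMap<&str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for (&module, inputs) in deps {
        pending.entry(module).or_insert(0);
        for &input in inputs {
            pending.entry(input).or_insert(0);
            *pending.entry(module).or_insert(0) += 1;
            dependents.entry(input).or_default().push(module);
        }
    }

    // Kahn's algorithm: repeatedly emit modules with no unresolved inputs.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(&name, _)| name)
        .collect();
    let mut order = Vec::new();
    while let Some(module) = ready.pop_front() {
        order.push(module.to_string());
        if let Some(nexts) = dependents.get(module) {
            for &next in nexts {
                let count = pending.get_mut(next).expect("every module is registered");
                *count -= 1;
                if *count == 0 {
                    ready.push_back(next);
                }
            }
        }
    }

    // Any module left unemitted is part of a cycle.
    if order.len() == pending.len() {
        Some(order)
    } else {
        None
    }
}
```

With a hypothetical `map_transfers` module feeding `store_transfers`, and both feeding `map_whales`, the helper places `map_transfers` first and `map_whales` last; a pair of modules that depend on each other yields `None`.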

{% hint style="info" %}
**Note**: In DAGs, data flows from one module to another in a one-directional manner, with no cycle, similar to Git's model of commits and branches.
@@ -60,16 +60,16 @@ The Substreams engine creates the "_compute graph_" or "_dependency graph_" at r

[Protocol buffers or protobufs](https://developers.google.com/protocol-buffers) are the data models operated on by the [Rust-based module handler functions](../developers-guide/modules/writing-module-handlers.md). They define and outline the data models in the protobufs.

* View the [`erc721.proto`](https://github.com/streamingfast/substreams-template/blob/develop/proto/erc721.proto) protobuf file in the [Substreams Template repository](https://github.com/streamingfast/substreams-template).
* View the Rust module handlers in the [`lib.rs`](https://github.com/streamingfast/substreams-template/blob/develop/src/lib.rs) file in the [Substreams Template repository](https://github.com/streamingfast/substreams-template).
- View the [`erc721.proto`](https://github.com/streamingfast/substreams-template/blob/develop/proto/erc721.proto) protobuf file in the [Substreams Template repository](https://github.com/streamingfast/substreams-template).
- View the Rust module handlers in the [`lib.rs`](https://github.com/streamingfast/substreams-template/blob/develop/src/lib.rs) file in the [Substreams Template repository](https://github.com/streamingfast/substreams-template).

{% hint style="info" %}
**Note**: Protobufs include the names of the data objects and the fields contained and accessible within them.
{% endhint %}

Many protobuf definitions have already been created, such as [the erc721 token model](https://github.com/streamingfast/substreams-template/blob/develop/proto/erc721.proto), for use by developers creating Substreams data transformation strategies.

Custom smart contracts, [like UniSwap](https://github.com/streamingfast/substreams-uniswap-v3/blob/e4b0fb016210870a385484f29bb5116931ea9a50/proto/uniswap/v1/uniswap.proto), also have protobuf definitions that are referenced in the Substreams manifest and made available to module handler functions. Protobufs provide an API to the data for smart contract addresses.
Custom smart contracts, like [UniSwap](https://github.com/streamingfast/substreams-uniswap-v3/blob/e4b0fb016210870a385484f29bb5116931ea9a50/proto/uniswap/v1/uniswap.proto), also have protobuf definitions that are referenced in the Substreams manifest and made available to module handler functions. Protobufs provide an API to the data for smart contract addresses.

In object-oriented programming terminology, protobufs are the objects or object models. In front-end web development, they are similar to REST or other data APIs.

5 changes: 3 additions & 2 deletions docs/developers-guide/cookbook/advanced-params.md
@@ -63,7 +63,7 @@ pub fn map_whale_transfers(params: String, block: Block) -> Result<Transfers, Er
}
```

You can even pass a vector of addreses to track multiple specific whales in our example:
You can even pass a vector of addresses to track multiple specific whales in our example:

```rust
#[derive(Debug, Deserialize)]
@@ -80,6 +80,7 @@ pub fn map_whale_transfers(params: String, block: Block) -> Result<Transfers, Er
```

Depending on the crate you use to decode the params string, you can pass them to the Substreams CLI like this, for example:

```bash
substreams gui map_whale_transfers -p map_whale_transfers="address[]=aaa..aaa&address[]=bbb..bbb&amount=100"
```
```
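
To make the shape of that params string concrete, here is a minimal, dependency-free sketch of decoding it by hand. The cookbook itself relies on a serde-based crate, so `WhaleParams` and `parse_params` below are hypothetical stand-ins, and the sketch assumes plain `key=value` pairs with no percent-encoding.

```rust
/// Hypothetical parameter struct for illustration; the cookbook's own struct
/// is derived with serde and may differ.
#[derive(Debug, PartialEq)]
struct WhaleParams {
    addresses: Vec<String>,
    amount: u64,
}

/// Parses a query-string style params value such as
/// `address[]=aaa&address[]=bbb&amount=100`.
/// No percent-decoding is performed; unknown keys are rejected.
fn parse_params(params: &str) -> Result<WhaleParams, String> {
    let mut addresses = Vec::new();
    let mut amount = None;
    for pair in params.split('&').filter(|p| !p.is_empty()) {
        let (key, value) = pair
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in `{pair}`"))?;
        match key {
            // Repeated `address[]` keys accumulate into a vector.
            "address[]" => addresses.push(value.to_string()),
            "amount" => {
                let parsed = value
                    .parse::<u64>()
                    .map_err(|e| format!("invalid amount `{value}`: {e}"))?;
                amount = Some(parsed);
            }
            other => return Err(format!("unknown parameter `{other}`")),
        }
    }
    Ok(WhaleParams {
        addresses,
        amount: amount.ok_or_else(|| "missing `amount`".to_string())?,
    })
}
```

Returning a `Result` rather than panicking lets a handler surface a malformed params string as an ordinary error.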
22 changes: 14 additions & 8 deletions docs/developers-guide/creating-protobuf-schemas.md
@@ -16,11 +16,11 @@ Learn more about the details of Google Protocol Buffers in the official document

**Google Protocol Buffer Documentation**

[Learn more about Google Protocol Buffers](https://developers.google.com/protocol-buffers) in the official documentation provided by Google.
[Learn more about Google Protocol Buffers](https://protobuf.dev/) in the official documentation provided by Google.

**Google Protocol Buffer Tutorial**

[Explore examples and additional learning material](https://developers.google.com/protocol-buffers/docs/tutorials) for Google Protocol Buffers provided by Google.
[Explore examples and additional learning material](https://protobuf.dev/programming-guides/proto3/) for Google Protocol Buffers provided by Google.

### Protobuf definition for Substreams

@@ -31,6 +31,7 @@ Define a protobuf model as [`proto:eth.erc721.v1.Transfers`](https://github.com/
{% endhint %}

{% code title="eth/erc721/v1/erc721.proto" lineNumbers="true" %}

```protobuf
syntax = "proto3";
@@ -48,6 +49,7 @@ message Transfer {
uint64 ordinal = 5;
}
```

{% endcode %}

[View the `erc721.proto`](https://github.com/streamingfast/substreams-template/blob/develop/proto/erc721.proto) file in the official Substreams Template example repository.
@@ -61,9 +63,9 @@ The protobuf file serves as the interface between the module handlers and the da
{% hint style="success" %}
**Tip**: Protobufs are platform-independent and are defined and used for various blockchains.

* The ERC721 smart contracts used in the Substreams Template example are generic contracts used across many different Ethereum applications.
* The size and scope of the Substreams module dictates the number of and complexity of protobufs.
{% endhint %}
- The ERC721 smart contracts used in the Substreams Template example are generic contracts used across many different Ethereum applications.
- The size and scope of the Substreams module dictates the number of and complexity of protobufs.
{% endhint %}

The Substreams Template example extracts `Transfer` events from the [Bored Ape Yacht Club smart contract](https://etherscan.io/address/0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d) which is located on the Ethereum blockchain.

@@ -80,21 +82,25 @@ The [`substreams` CLI](../reference-and-specs/command-line-interface.md) is used
Notice the `protogen` command and Substreams manifest passed into the [`substreams` CLI](../reference-and-specs/command-line-interface.md).

{% code overflow="wrap" %}

```bash
substreams protogen ./substreams.yaml --exclude-paths="sf/ethereum,sf/substreams,google"
```

{% endcode %}

The pairing code is generated and saved into the [`src/pb/eth.erc721.v1.rs`](https://github.com/streamingfast/substreams-template/blob/develop/src/pb/eth.erc721.v1.rs) Rust file.

The [`mod.rs`](https://github.com/streamingfast/substreams-template/blob/develop/src/pb/mod.rs) file located in the `src/pb` directory of the Substreams Template example is responsible for exporting the freshly generated Rust code.

{% code title="src/pb/mod.rs" overflow="wrap" lineNumbers="true" %}

```rust
#[path = "eth.erc721.v1.rs"]
#[allow(dead_code)]
pub mod erc721;
```

{% endcode %}

View the [`mod.rs`](https://github.com/streamingfast/substreams-template/blob/develop/src/pb/mod.rs) file in the repository.
@@ -113,7 +119,7 @@ The [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) [`enum
**Note**: The standard approach to represent nullable data in Rust is to wrap optional values in [`Option<T>`](https://doc.rust-lang.org/rust-by-example/std/option.html).
{% endhint %}

The Rust [`match`](https://doc.rust-lang.org/rust-by-example/flow\_control/match.html) keyword is used to compare the value of an [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) to a [`Some`](https://doc.rust-lang.org/std/option/) or [`None`](https://doc.rust-lang.org/std/option/) variant. Handle a type wrapped [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) in Rust by using:
The Rust [`match`](https://doc.rust-lang.org/rust-by-example/flow_control/match.html) keyword is used to compare the value of an [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) to a [`Some`](https://doc.rust-lang.org/std/option/) or [`None`](https://doc.rust-lang.org/std/option/) variant. Handle a type wrapped [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) in Rust by using:

```rust
match person.Location {
@@ -122,15 +128,15 @@ match person.Location {
}
```

If you are only interested in finding the presence of a value, use the [`if let`](https://doc.rust-lang.org/rust-by-example/flow\_control/if\_let.html) statement to handle the [`Some(x)`](https://doc.rust-lang.org/std/option/) arm of the [`match`](https://doc.rust-lang.org/rust-by-example/flow\_control/match.html) code.
If you are only interested in finding the presence of a value, use the [`if let`](https://doc.rust-lang.org/rust-by-example/flow_control/if_let.html) statement to handle the [`Some(x)`](https://doc.rust-lang.org/std/option/) arm of the [`match`](https://doc.rust-lang.org/rust-by-example/flow_control/match.html) code.

```rust
if let Some(location) = person.location {
// Value is present, do something
}
```

If a value is present, use the [`.unwrap()`](https://doc.rust-lang.org/rust-by-example/error/option\_unwrap.html) call on the [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) to obtain the wrapped data. You'll need to account for these types of scenarios if you control the creation of the messages yourself or if the field is documented as always being present.
If a value is present, use the [`.unwrap()`](https://doc.rust-lang.org/rust-by-example/error/option_unwrap.html) call on the [`Option`](https://doc.rust-lang.org/rust-by-example/std/option.html) to obtain the wrapped data. You'll need to account for these types of scenarios if you control the creation of the messages yourself or if the field is documented as always being present.
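
The fragments above can be put together in a self-contained sketch. The `Person` and `Location` structs here are hypothetical stand-ins for protobuf-generated types whose nested message field is wrapped in `Option`; the second helper shows one way to read the field without calling `.unwrap()`, for cases where its presence is not guaranteed.

```rust
// Hypothetical message types standing in for protobuf-generated structs,
// where a nested message field is wrapped in `Option`.
#[derive(Debug, Clone, PartialEq)]
struct Location {
    city: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Person {
    name: String,
    location: Option<Location>,
}

/// Handles both variants explicitly with `match`.
fn describe(person: &Person) -> String {
    match &person.location {
        Some(location) => format!("{} lives in {}", person.name, location.city),
        None => format!("{} has no known location", person.name),
    }
}

/// Returns the city without panicking, falling back to a default when absent.
fn city_or_unknown(person: &Person) -> &str {
    person
        .location
        .as_ref()
        .map(|location| location.city.as_str())
        .unwrap_or("unknown")
}
```

Matching on `&person.location` borrows the value instead of moving it out of the struct, so the `Person` remains usable afterwards.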

{% hint style="info" %}
**Note**: You need to be **absolutely sure** **the field is always defined**, otherwise Substreams panics and never completes, getting stuck on a block indefinitely.
2 changes: 1 addition & 1 deletion docs/developers-guide/parallel-execution.md
@@ -12,7 +12,7 @@ Parallel execution addresses the problem of the slow single linear execution of

The server will define an execution schedule and take the module's dependencies into consideration. The server's execution schedule is a list of pairs of (`module, range`), where range contains `N` blocks. `N` is configurable on the server and is set to 25K blocks.

The single map_transfer module will fulfill a request from 0 - 75,000. The server's execution plan returns the results of `[(map_transfer, 0 -> 24,999), (map_transfer, 25,000 -> 74,999), (map_transfer, 50,000 -> 74,999)]`.
The single map_transfer module will fulfill a request from 0 - 75,000. The server's execution plan returns the results of `[(map_transfer, 0 -> 24,999), (map_transfer, 25,000 -> 49,999), (map_transfer, 50,000 -> 74,999)]`.

The server executes the three pairs simultaneously and handles caching of the store's output. For stores, an additional step will combine the store keys across multiple segments, producing a unified and linear view of the store's state.
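
The split into ranges can be sketched as a small helper. `plan_segments` is a hypothetical name rather than a Substreams function, and the sketch assumes the request's stop block is exclusive, which is what the `74,999` upper bound in the plan above suggests.

```rust
/// Splits the half-open block range `[start, stop)` into inclusive
/// `(first, last)` segments of at most `segment_size` blocks each.
fn plan_segments(start: u64, stop: u64, segment_size: u64) -> Vec<(u64, u64)> {
    assert!(segment_size > 0, "segment_size must be positive");
    let mut segments = Vec::new();
    let mut first = start;
    while first < stop {
        // The last segment may be shorter than `segment_size`.
        let last = (first + segment_size).min(stop) - 1;
        segments.push((first, last));
        first = last + 1;
    }
    segments
}
```

Calling `plan_segments(0, 75_000, 25_000)` yields the three ranges `(0, 24_999)`, `(25_000, 49_999)`, and `(50_000, 74_999)`, each of which could then be paired with the module to execute.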

@@ -2,6 +2,6 @@

Substreams-powered subgraphs are the prime candidate for Substreams output.

See The Graph's documentation to roll out your:
See The Graph's documentation to roll out yours:

[https://thegraph.com/docs/en/cookbook/substreams-powered-subgraphs/](https://thegraph.com/docs/en/cookbook/substreams-powered-subgraphs/)