Custom index Substreams module #322

Open
abourget opened this issue Oct 4, 2023 · 1 comment

abourget commented Oct 4, 2023

WARN: See #410, which takes precedence over this.

Reasons to do it:

  • Block skipping is one of the things that brought a huge performance boost when replacing JSON-RPC with Firehose for graph-node
  • Will be needed for Solana if we want to penetrate that market in any way.

Reasons not to do it:

  • New chains have other issues before performance. They need to exist first.
  • Firehose for Ethereum already has a native version of this, so there's less pressure to roll it out.

Proposition for custom filter modules

that would apply the same principles as the Firehose indexes, but in a generalized fashion.

WARN: we would need to add the blockIndex query and module reference (not the name, remember!) in the module_hash.
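To illustrate the warning above, here is a minimal sketch of a module hash that folds in the block index query and the referenced index module's hash rather than its name. It is not the actual Substreams hashing algorithm: the function name `module_hash`, its parameters, and the field tags are all hypothetical.

```python
import hashlib


def module_hash(code: bytes, input_hashes, index_module_hash="", index_query=""):
    """Hypothetical module hash; NOT the real Substreams algorithm.

    The referenced index module is identified by its own hash, never by its
    name, so renaming it leaves the result unchanged.
    """
    h = hashlib.sha256()
    h.update(code)
    for input_hash in input_hashes:
        h.update(b"input:" + input_hash.encode())
    if index_module_hash:
        # Both the index module reference and the query affect the output,
        # so both must take part in the hash.
        h.update(b"block_index_module:" + index_module_hash.encode())
        h.update(b"block_index_query:" + index_query.encode())
    return h.hexdigest()
```

Under this sketch, changing only the query yields a different hash, so cached outputs produced under one query would not be reused for another.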

Summary of the existing CombinedFilter from Ethereum:

This is to provide context only.

message CombinedFilter {
  repeated LogFilter log_filters = 1;
  repeated CallToFilter call_filters = 2;
...
}
message LogFilter {
  repeated bytes addresses = 1;
  repeated bytes event_signatures = 2; // corresponds to the keccak of the event signature, which is stored in topic 0
}
message CallToFilter {
  repeated bytes addresses = 1;
  repeated bytes signatures = 2;
}

This translates to this language:

(   -- log filter
  (addr:0x123 OR addr:0x234 OR addr:0x345)  -- alternatively TRUE if our list is empty
    AND
  (evsig:0x123 OR evsig:0x234 OR evsig:0x345)  -- alternatively "TRUE" if our list is empty
)
 OR 
(   -- call filter
  (to:0x123 OR to:0x234 OR to:0x345)  -- or TRUE if list is empty
    AND
  (methsig:0x234 OR methsig:0x456 OR methsig:0x678)  -- or TRUE if list is empty
)
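The translation above can be sketched as a small function that turns filter lists into that expression language. The helper names are hypothetical, and OR-ing several log or call filters together is an assumption about how the `repeated` fields combine; the source only shows one of each.

```python
def any_of(prefix, values):
    """OR together prefixed values; an empty list means "match everything"."""
    if not values:
        return "TRUE"
    return "(" + " OR ".join(f"{prefix}:{v}" for v in values) + ")"


def log_filter_expr(addresses, event_signatures):
    return f"({any_of('addr', addresses)} AND {any_of('evsig', event_signatures)})"


def call_filter_expr(addresses, signatures):
    return f"({any_of('to', addresses)} AND {any_of('methsig', signatures)})"


def combined_filter_expr(log_filters, call_filters):
    """Each filter is an (addresses, signatures) pair of lists.

    Assumption: all filters are OR-ed together. With no filters at all this
    returns an empty string; the source does not say how that case behaves.
    """
    parts = [log_filter_expr(*f) for f in log_filters]
    parts += [call_filter_expr(*f) for f in call_filters]
    return " OR ".join(parts)
```

For example, `combined_filter_expr([(["0x123"], ["0x456"])], [])` gives `((addr:0x123) AND (evsig:0x456))`.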

User experience and manifest definitions

Let's say this is a publicly shared filtering package:

package:
  name: eth-filters
  version: v1.0.0
  
modules:
- name: events
  doc: |
      Sifts through logs and indexes keys as: addr:0x123, addr:0x234, evsig:0x456, evsig:0x567
  kind: filter
  inputs:
  - source: sf.ethereum.type.v1.Block
  output:
    type: sf.substreams.filter.v1.Keys
 
- name: logs_reducer
  inputs:
  - source: sf.ethereum.type.v1.Block
  output:
    type: sf.filtered.ethereum.LogsOnly

- name: reduced-events
  doc: |
      Sifts through logs and indexes keys as: addr:0x123, addr:0x234, evsig:0x456, evsig:0x567
  kind: filter
  inputs:
  - map: logs_reducer
  output:
    type: sf.substreams.filter.v1.Keys

 
- name: calls
  doc: |
      Sifts through calls and indexes keys as: to:0x123, to:0x234, methsig:0x456, methsig:0x567
  kind: filter
  inputs:
  - source: sf.ethereum.type.v1.Block
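As a rough sketch of what the `events` filter module described above might compute for each block, the following collects `addr:` and `evsig:` keys from a block's logs. The `Log` type is a hypothetical stand-in for the Ethereum log message, and real modules would be written against the actual block type rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    # Hypothetical, simplified stand-in for an Ethereum log entry.
    address: str
    topics: list = field(default_factory=list)


def event_keys(logs):
    """Return the sorted, de-duplicated index keys for one block's logs."""
    keys = set()
    for log in logs:
        keys.add(f"addr:{log.address}")
        if log.topics:
            # Topic 0 carries the keccak of the event signature.
            keys.add(f"evsig:{log.topics[0]}")
    return sorted(keys)
```

The returned list plays the role of the `sf.substreams.filter.v1.Keys` output in the manifest: one small set of keys per block that a query can later be matched against.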

It would be consumed as:

blockSieve's doc: This instruction lets you receive inputs only for blocks matching certain criteria, which improves efficiency.

- name: my_wallet
  blockSieve:
    name: eth_sieve:contracts_and_events
    match: (addr:USDC || addr:PANCAKE) && (evsig:Transfer || evsig:TransferFrom)
  inputs:
  - map: eth_events

Here ^ I receive only the eth_events of the blocks matching my sieve.
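To make the sieve semantics concrete, here is a minimal sketch of evaluating a `match` expression against the set of keys indexed for one block. The grammar (`||`, `&&`, `!`, parentheses) is inferred from the examples in this issue, and the function names are hypothetical; this is not an existing Substreams API.

```python
import re

# Operators, parentheses, or a bare term such as "addr:USDC".
TOKEN_RE = re.compile(r"\s*(\|\||&&|!|\(|\)|[^\s()!&|]+)")


def tokenize(query):
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def matches(query, keys):
    """Return True if the block's key set satisfies the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "||":
            take()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_unary()
        while peek() == "&&":
            take()
            rhs = parse_unary()
            value = value and rhs
        return value

    def parse_unary():
        tok = take()
        if tok == "!":
            return not parse_unary()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if tok is None or tok in ("||", "&&", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in keys  # a term matches if the block indexed that key

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```

A block whose keys are `{"addr:USDC", "evsig:Transfer"}` satisfies the example query, while a block with only `{"addr:USDC"}` does not and would be skipped.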

- name: my_wallet
  blockSieve:
    name: eth_filters:contracts_and_events
    match: (addr:USDC || addr:PANCAKE) && (evsig:Transfer || evsig:TransferFrom)
  inputs:
  - map: eth_filters:events

params:
  eth_filters:filtered_events: addrUSDC

blockSieve's doc: This instruction lets you process only the blocks matching certain criteria, avoiding the overhead of processing blocks that you know don't contain what you're interested in.

imports:
  eth-filters: https://spkg.io/streamingfast/eth-filters-v1.0.0.spkg

modules:
- name: fastsieve
  blockSieve:
    name: eth_sieve:contracts_and_events
    match: (addr:USDC || addr:PANCAKE) && (evsig:Transfer || evsig:TransferFrom)
  inputs:
  - source: sf.ethereum.type.v2.Block
  - map: uniswapv3:prices


- name: fastcrawl

  blockFilter:
  blockRemover:
  blockSkipper:
    name: eth-filters:events
    keepQuery: (addr:USDC || addr:PANCAKE) && (evsig:Transfer || evsig:TransferFrom)


  # This negative is shoot someone calling to do it yourself
  blockSkipper:
    name: eth-filters:events
    query: (!addr:USDC && !addr:PANCAKE) || (!evsig:Transfer && !evsig:TransferFrom)

  block:
  blockSieve:
  blockPresenceFilter:
    name: eth-filters:events
    query: (addr:123123 || addr:23123) && (evsig:123 || evsig:0x234)   # All-or-nothing filtering here. We don't process any inputs here if the filter says "no".
  inputs:
  - source: sf.ethereum.type.v2.Block
  - map: uniswapv3:prices
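The negative `blockSkipper` query above is the De Morgan negation of the `keepQuery` variant. A small brute-force check, treating each key's presence as a boolean, suggests the two forms select complementary sets of blocks, which is one reason the hand-written negative form is easy to get wrong. The function names are illustrative only.

```python
from itertools import product


def keep(usdc, pancake, transfer, transfer_from):
    # (addr:USDC || addr:PANCAKE) && (evsig:Transfer || evsig:TransferFrom)
    return (usdc or pancake) and (transfer or transfer_from)


def skip(usdc, pancake, transfer, transfer_from):
    # (!addr:USDC && !addr:PANCAKE) || (!evsig:Transfer && !evsig:TransferFrom)
    return (not usdc and not pancake) or (not transfer and not transfer_from)


# For every combination of key presence, exactly one of keep/skip holds.
complementary = all(
    keep(*bits) != skip(*bits) for bits in product([False, True], repeat=4)
)
```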

filterQueries:
  fastcrawl: (addr:123123 || addr:234234) && (evsig:123123 || evsig:0x345)
  fastcrawl: (addr:123123 || addr:234234) && (evsig:123123 || evsig:0x345)

substreams run alex-coo-stuff.spkg -filter-query=fastcrawl.eth-filters:events="(blah||)" -filter-query=fastcrawl.eth-filters:calls="(bloh||)" ....

We'd want someone to be able to express this:

Indexing higher order data with this pattern.

Someone could also index from higher order data, like Uniswap-v3 prices:

modules:
- name: univ3-prices-filter
  kind: filter
  inputs:
  - map: univ3_prices

To consume this, we would write:

- name: univ3-prices-filter
  type: filter
  inputs:
  - map: uniswap_v3_prices
  output: my.Prices
@sduchesneau

Nice work here #403
