subsquid-labs/local-csv-indexing

MATIC data analytics - A squid example for CSV storage

Introduction

In this article I describe how to use Subsquid's indexing framework for data analytics prototyping. I have built an indexer that processes MATIC transactions on Ethereum mainnet and dumps them to local CSV files. I have then developed a simple Python script to demonstrate how to import these CSVs into a Pandas DataFrame and perform aggregation operations on the data.

This project is a squid that indexes the blockchain data generated by transfers of MATIC tokens on Ethereum mainnet. The indexer writes this data to multiple CSV files, split into chunks of configurable size.
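To make the chunking idea concrete, here is a minimal sketch in plain Python. It is not the squid's code (the indexer itself is written in TypeScript and relies on Subsquid's file-store libraries). The function name, file naming pattern, and column names are assumptions made for illustration, and the sketch measures chunk size in rows, which may differ from how the real library measures it.

```python
import csv
import tempfile
from pathlib import Path


def write_in_chunks(rows, header, out_dir, chunk_size):
    """Write rows to numbered CSV files holding at most chunk_size rows each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(rows), chunk_size):
        # Hypothetical naming scheme: transfers-0000.csv, transfers-0001.csv, ...
        path = out_dir / f"transfers-{start // chunk_size:04d}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)  # every chunk carries its own header
            writer.writerows(rows[start:start + chunk_size])
        paths.append(path)
    return paths


# Made-up rows; the column names are assumptions, not the squid's actual schema.
HEADER = ["block_number", "timestamp", "from", "to", "value"]
ROWS = [[i, 1_700_000_000 + i, "0xaaa", "0xbbb", 10 * i] for i in range(5)]

with tempfile.TemporaryDirectory() as tmp:
    chunk_paths = write_in_chunks(ROWS, HEADER, tmp, chunk_size=2)
    chunk_names = [p.name for p in chunk_paths]
    # Count data rows (excluding the header) in each chunk file.
    rows_per_chunk = []
    for p in chunk_paths:
        with p.open(newline="") as handle:
            rows_per_chunk.append(sum(1 for _ in csv.reader(handle)) - 1)
```

Repeating the header in every chunk keeps each file readable on its own, which is convenient when the files are later loaded one by one.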

It also contains a simple Python script (in the data folder), which reads the CSV files, imports the data into a Pandas DataFrame, aggregates the data (albeit rather trivially) and plots a bar chart.

MATIC is the native token of the Polygon project. It is defined by an ERC-20 standard smart contract, and the tokens are transferred via the contract's transfer function, which emits a Transfer event. This event is exactly what the squid ETL indexes and eventually writes to CSV files using Subsquid's file-store and file-store-csv libraries.
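For readers unfamiliar with how a Transfer event appears in raw log data, the following sketch decodes one by hand. In the squid this work is done by the generated TypeScript code in src/abi, so the snippet is only an illustration of the ERC-20 log layout: the first topic is the hash of the event signature Transfer(address,address,uint256), the next two topics hold the indexed sender and recipient, and the data field holds the amount. The function name, the dictionary shape, and the sample log are made up for the example.

```python
# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 Transfer topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(log):
    """Decode a raw ERC-20 Transfer log given as hex strings."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # The non-indexed uint256 amount is the whole data payload.
        "value": int(log["data"], 16),
    }


# Hypothetical log: placeholder addresses and an amount of 5 * 10**18 base units.
sample_log = {
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(5 * 10**18, "064x"),
}

decoded = decode_transfer(sample_log)
```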

The analysis.py Python script uses Pandas to read the data stored in all the CSV files and build a single DataFrame. Then, using the groupby function, it calculates the daily total value of transfers and plots it as a bar chart with the matplotlib library.
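The aggregation itself is simple enough to sketch without any dependencies. The snippet below is not the repository's analysis.py: it uses only the standard library, and the column names (timestamp in Unix seconds, value as an integer) and file names are assumptions. It shows the same logic that a Pandas groupby on the transfer date followed by a sum would perform, without the plotting step.

```python
import csv
import glob
import os
import tempfile
from collections import defaultdict
from datetime import datetime, timezone


def daily_totals(csv_paths):
    """Sum the 'value' column per UTC calendar day across several CSV files."""
    totals = defaultdict(int)
    for path in sorted(csv_paths):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                day = datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc).date()
                totals[day.isoformat()] += int(row["value"])
    return dict(sorted(totals.items()))


with tempfile.TemporaryDirectory() as tmp:
    # Two made-up chunk files: (unix timestamp, transferred value).
    samples = {
        "chunk-0.csv": [(1672531200, 100), (1672534800, 50)],  # 2023-01-01 UTC
        "chunk-1.csv": [(1672617600, 25)],                     # 2023-01-02 UTC
    }
    for name, rows in samples.items():
        with open(os.path.join(tmp, name), "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["timestamp", "value"])
            writer.writerows(rows)

    totals = daily_totals(glob.glob(os.path.join(tmp, "*.csv")))
```

Keeping the values as Python integers avoids precision loss, which matters because token amounts expressed in base units can exceed what a 64-bit float represents exactly.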

The project is relatively simple, because its purpose is purely demonstrative. Its intent is to showcase the capabilities of the Subsquid SDK.

What is a Squid?

A squid is a project that extracts and transforms on-chain data in order to present it as a GraphQL API. Squids are developed using the Subsquid SDK, which provides extensive tooling to define data schemas, data transformation rules, and the shape of the resulting API.

We recommend that you read the Subsquid docs to understand how it works: https://docs.subsquid.io/

Prerequisites

  • Node 16.x
  • Docker
  • NPM

Quick-start local indexing

  1. Clone the repository
  2. Install dependencies (in a console window): npm i
  3. Build the project: sqd build
  4. Launch the database container: sqd up
  5. Launch the processor: sqd process
  6. Launch the GraphQL server (in a separate console window): sqd serve
  7. Access the GraphiQL Playground by running sqd open http://localhost:4350/graphql

Key components

  • The schema.graphql file is used to define the database and API schemas. A command line tool automatically generates code from it, which you can find in src/model/generated.
  • The db/migrations folder contains automatically generated files with SQL statements that modify the database (create, alter, delete tables), similarly to any ORM database interface.
  • The src/abi folder contains facade TypeScript code, automatically generated by a command line tool from one or more smart contract ABIs. This code is used to programmatically interface with the smart contract(s) and decode events and function calls.
  • The main logic of this project is defined in src/processor.ts. The EvmBatchProcessor class is configured and used to send requests to Subsquid's Archive for the Ethereum blockchain and obtain the necessary data. Custom logic then processes this data in batches and saves it to the database with the custom-defined structure.

The Subsquid documentation has dedicated sections and pages describing each of these concepts. It is advisable to consult them before starting to develop your own squid.

Development flow

1. Define database schema

Start development by defining the schema of the target database via schema.graphql. The schema definition consists of regular GraphQL type declarations annotated with custom directives. A full description of the schema.graphql dialect is available here.

2. Generate TypeORM classes

Mapping developers use the TypeORM EntityManager to interact with the target database during data processing. All necessary entity classes are generated by the squid framework from schema.graphql. This is done by running the npx sqd codegen command.

3. Generate database migrations

All database changes are applied through migration files located at db/migrations. The squid-typeorm-migration(1) tool provides several commands to drive the process.

## delete all migrations
rm -rf db/migrations/*.js

## drop and recreate the database
make down
make up

## create a new schema migration from the entities
npx squid-typeorm-migration generate

See docs on schema updates for more details.

4. Import ABI contract and generate interfaces to decode events

It is necessary to import the respective ABI definition to decode EVM logs.

To generate a type-safe facade class to decode EVM logs, place the ABI in the assets folder and use squid-evm-typegen(1), e.g.:

npx squid-evm-typegen src/abi assets/ERC721.json#erc721

For more details about squid-evm-typegen, read the docs page.

Project conventions

Squid tools assume a certain project layout.

  • All compiled js files must reside in lib and all TypeScript sources in src. The layout of lib must reflect src.
  • All TypeORM classes must be exported by src/model/index.ts (lib/model module).
  • Database schema must be defined in schema.graphql.
  • Database migrations must reside in db/migrations and must be plain js files.
  • sqd(1) and squid-*(1) executables consult .env file for a number of environment variables.

GraphQL server extensions

It is possible to extend squid-graphql-server(1) with custom type-graphql resolvers and to add request validation. See the docs for more details.

Learn More

You can learn more in the Create React App documentation.

To learn React, check out the React documentation.
