In this article I describe how to use Subsquid's indexing framework for data analytics prototyping. I have built an indexer that processes MATIC transactions on Ethereum mainnet and dumps them to local CSV files. I have then developed a simple Python script to demonstrate how to import these CSVs into a Pandas DataFrame and perform aggregation operations on this data.
This project is a squid that indexes the transfers of MATIC tokens on Ethereum mainnet. The indexer writes the data to multiple CSV files, split into chunks of configurable size.
It also contains a simple Python script (in the `data` folder), which reads the CSV files, imports the data into a Pandas DataFrame, aggregates it (albeit rather trivially) and plots a bar chart.
MATIC is the native token of the Polygon project. It is defined by an ERC-20 standard smart contract, and the tokens are transferred via the contract's `transfer` function, which emits a `Transfer` event.
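To make the shape of a `Transfer` event concrete, here is a minimal Python sketch of decoding one raw ERC-20 log by hand. It is only an illustration: the squid itself relies on the generated TypeScript facade, and the `decode_transfer` helper and the dictionary layout of the log are assumptions made for this example. The topic hash and the layout (sender and recipient in the indexed topics, amount in the data field) follow the ERC-20 standard.

```python
# keccak256("Transfer(address,address,uint256)"): the standard ERC-20 Transfer topic
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

# MATIC uses 18 decimals, like most ERC-20 tokens
MATIC_DECIMALS = 18


def decode_transfer(log: dict) -> dict:
    """Decode a raw ERC-20 Transfer log (hypothetical helper).

    Assumes `log` has a `topics` list of 0x-prefixed 32-byte hex strings
    and a `data` field holding the 0x-prefixed uint256 amount.
    """
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        raise ValueError("not an ERC-20 Transfer log")
    return {
        # indexed addresses are left-padded to 32 bytes: keep the last 20 bytes
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # raw amount in the token's smallest unit
        "value": int(log["data"], 16),
    }


def to_matic(raw_value: int) -> float:
    """Convert a raw amount to whole MATIC (float, for display only)."""
    return raw_value / 10**MATIC_DECIMALS
```

Keeping the raw integer amount and converting to whole tokens only for display avoids losing precision on large transfers.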
This event is exactly what the squid ETL indexes and eventually writes to CSV files, using Subsquid's `file-store` and `file-store-csv` libraries.
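The idea of splitting the output into chunks can be sketched in a few lines of Python. This is not the `file-store` API: the `write_chunked_csv` function, the file naming scheme, and chunking by row count are all assumptions chosen to keep the example short, and the actual libraries may use a different criterion and layout.

```python
import csv
import os


def write_chunked_csv(rows, header, out_dir, chunk_size):
    """Write `rows` into numbered CSV files holding at most `chunk_size` rows each.

    Hypothetical helper: every file repeats the header so that each chunk
    can be loaded on its own. Returns the list of file paths written.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for start in range(0, len(rows), chunk_size):
        path = os.path.join(out_dir, f"chunk_{start // chunk_size:05d}.csv")
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows[start:start + chunk_size])
        paths.append(path)
    return paths
```

Repeating the header in every chunk costs a few bytes per file, but it means any single file can be inspected or imported without the others.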
The `analysis.py` Python script uses Pandas to read the data stored in all the CSV files and create a DataFrame. Then, using the `groupby` function, it calculates the daily total value of transfers and creates a bar chart with the `matplotlib` Python library.
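For readers who want to see what the daily aggregation amounts to, here is a dependency-free sketch that uses only the Python standard library instead of Pandas. It is not the `analysis.py` script itself: the `daily_totals` function and the column names `timestamp` (an ISO 8601 string) and `value` are assumptions for this example and may not match the CSV files the squid produces.

```python
import csv
import glob
import os
from collections import defaultdict
from datetime import datetime
from decimal import Decimal


def daily_totals(csv_dir):
    """Sum the `value` column per calendar day across all CSV files under `csv_dir`.

    Assumes each file has a header row with `timestamp` and `value` columns
    (hypothetical names). Returns a dict mapping date -> Decimal, ordered by date.
    """
    totals = defaultdict(Decimal)  # Decimal() starts at 0 and keeps big token amounts exact
    pattern = os.path.join(csv_dir, "**", "*.csv")
    for path in sorted(glob.glob(pattern, recursive=True)):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # timestamps with a trailing "Z" need Python 3.11+ for fromisoformat
                day = datetime.fromisoformat(row["timestamp"]).date()
                totals[day] += Decimal(row["value"])
    return dict(sorted(totals.items()))
```

This is the same group-and-sum that a Pandas `groupby` on the date performs; `Decimal` is used because raw token amounts can exceed what a float represents exactly.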
The project is relatively simple, because its purpose is purely demonstrative. Its intent is to showcase the capabilities of Subsquid SDK.
A squid is a project that extracts and transforms on-chain data in order to present it as a GraphQL API. Squids are developed using the Subsquid SDK, which provides extensive tooling to define data schemas, data transformation rules, and the shape of the resulting API.
We recommend that you read Subsquid docs to understand how it works: https://docs.subsquid.io/
- Node 16.x
- Docker
- NPM
- Clone the repository
- Install dependencies (in a console window):
npm i
- Build the project
sqd build
- Launch the database container
sqd up
- Launch the processor
sqd process
- Launch the GraphQL server (in a separate console window)
sqd serve
- Access the GraphiQL Playground, by running
sqd open http://localhost:4350/graphql
- The `schema.graphql` file is used to define the database and API schemas. A command line tool automatically generates code from it, which you can find in `src/model/generated`.
- The `db/migrations` folder contains automatically generated files with SQL statements to modify the database (create, alter, delete tables), similar to any ORM database interface.
- The `src/abi` folder contains facade TypeScript code, automatically generated by a command line tool from one or more smart contract ABIs. This code is used to programmatically interface with the smart contract(s) and decode events and function calls.
- The main logic of this project is defined in `src/processor.ts`. The `EvmBatchProcessor` class is configured and used to send requests to Subsquid's Archive for the Ethereum blockchain and obtain the necessary data. Then some custom logic is implemented to process this data in batches and save it to the database with the custom-defined structure.
The Subsquid documentation has dedicated sections and pages describing each of these concepts. It is advisable to consult them before starting to develop your own squid.
Start development by defining the schema of the target database via `schema.graphql`.
Schema definition consists of regular GraphQL type declarations annotated with custom directives.
Full description of the `schema.graphql` dialect is available here.
Mapping developers use TypeORM `EntityManager` to interact with the target database during data processing. All necessary entity classes are generated by the squid framework from `schema.graphql`. This is done by running the `npx sqd codegen` command.
All database changes are applied through migration files located at `db/migrations`.
The `squid-typeorm-migration(1)` tool provides several commands to drive the process.
## delete all migrations
rm -rf db/migrations/*.js
## drop and re-create the database
make down
make up
## create a new schema migration from the entities
npx squid-typeorm-migration generate
See docs on schema updates for more details.
It is necessary to import the respective ABI definition to decode EVM logs.
To generate a type-safe facade class to decode EVM logs, place the ABI in the `assets` folder and use `squid-evm-typegen(1)`, e.g.:
npx squid-evm-typegen src/abi assets/ERC721.json#erc721
For more details about `squid-evm-typegen`, read the docs page.
Squid tools assume a certain project layout.
- All compiled js files must reside in `lib` and all TypeScript sources in `src`. The layout of `lib` must reflect `src`.
- All TypeORM classes must be exported by `src/model/index.ts` (`lib/model` module).
- Database schema must be defined in `schema.graphql`.
- Database migrations must reside in `db/migrations` and must be plain js files.
- `sqd(1)` and `squid-*(1)` executables consult the `.env` file for a number of environment variables.
It is possible to extend `squid-graphql-server(1)` with custom type-graphql resolvers and to add request validation. See the docs for more details.