Welcome to the Artemis DBT repository!
This is the home of all of the data-munging business logic that ultimately powers much of the data across the Artemis application suite, including the Terminal, Sheets, and the Snowflake integration.
If you're in this repository, you're likely one of the following:
- a researcher who is trying to better understand methodology
- a protocol that would like to self-list on Artemis
- a data scientist who would like to add a new protocol or metric (and possibly be compensated for their contributions via a bounty)
DBT stands for "data build tool" and is an approach to building SQL models in a collaborative and iterative environment. It lets you "modularize and centralize your analytics code" while remaining mostly vanilla SQL with some minor syntactic magic on top. This repository uses Snowflake SQL syntax on top of DBT.
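To give a flavor of that syntactic magic, here is a minimal, hypothetical dbt model; the model and column names are purely illustrative and not taken from this repo:

```sql
-- A hypothetical dbt model: plain Snowflake SQL plus dbt's ref() macro,
-- which resolves to the upstream model's physical table at compile time.
select
    block_timestamp::date as date,
    count(*) as daily_transactions
from {{ ref("stg_transactions") }}
group by 1
```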
We use DBT to transform raw blockchain data (transactions, traces, and decoded event logs) into high-fidelity metrics for our users.
For most SQL wizards, reading DBT models comes intuitively, but below are some relevant resources to learn more:
There are two ways to get help:
- Pop into our Discord and ask us anything in the Methodology channel (fastest)
- Raise an issue on this Github.
- Fork this repository (button towards the top right)
- Run `source syncenv` to download dependencies
- Write SQL model changes
- Open a PR and view the results of your changes directly in the Github Actions - more on this in adding a new metric.
For running on Snowflake, create a `.env.local` file in the root of this repo by running `cp .env.local.template .env.local` and filling out the necessary values. Run `source syncenv` to load these environment variables into your shell and install the necessary Python dependencies.
In terms of system design, Artemis manages a data pipeline that pipes raw data from a series of vendors, including Flipside, Goldsky, and QuickNode, into our Snowflake data warehouse. This repository contains the business logic for turning raw data into fact tables that describe an individual protocol or collection of protocols. Fact tables are then combined into `ez_asset_metrics` tables that are piped into the downstream applications. We use the star schema model to label our tables.
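As a purely illustrative sketch of that fact-to-ez flow (the model names below are hypothetical, not the production models), a downstream ez table could union per-protocol fact tables like so:

```sql
-- Hypothetical rollup: combine per-protocol fact tables into one asset-level table.
select date, chain, protocol, fees
from {{ ref("fact_protocol_a_fees") }}
union all
select date, chain, protocol, fees
from {{ ref("fact_protocol_b_fees") }}
```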
BEFORE adding metrics for a protocol, you must create the asset first.
For example, let's say a user wants to add GEODNET fees to our DePin dashboard. They must first add the GEODNET asset by completing the following steps:
- Fork this repository
- Add GEODNET to the `databases.csv` file
- Request and merge a PR with this change
The Artemis team will then create the necessary permissions and warehouses in order for GEODNET to show up in the Terminal.
Once the asset exists, there are several ways to pull metrics for the protocol in question. Taking the GEODNET fees example, we will break down how to add this protocol's metrics by provider below.
GEODNET is a DePin protocol where 80% of fees are sent to the burn address and counted towards network revenue.
To calculate fees, we can write a query directly in Flipside's studio to count the token transfers towards the burn address, found here.
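A sketch of what such a Flipside Studio query could look like is below; the token contract address, burn address, and column names are assumptions for illustration, not the exact production query:

```sql
-- Illustrative: daily GEODNET fees approximated as GEODNET token transfers
-- sent to the burn address. Addresses are placeholders.
select
    block_timestamp::date as date,
    sum(amount) as fees
from polygon.core.ez_token_transfers
where contract_address = lower('0x...')   -- GEODNET token contract (placeholder)
  and to_address = lower('0x...')         -- burn address (placeholder)
group by 1
order by 1
```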
Productionizing this into the Artemis DBT schemas, fact tables are expected to have the following columns:
- `date` [DATETIME]
- `fees` [NUMBER] - this field name will change based on the metric you are pulling
- `chain` [STRING]
- `protocol` [STRING]
In this example, the chain that GEODNET publishes fees on is `polygon` and the protocol is `geodnet` - both of these fields should match the asset names in the `assets.csv` file.
Given our Flipside query is already in this format, we can mostly copy and paste this directly into a fact table in the correct directory for GEODNET: `models/projects/geodnet/core/fact_geodnet_fees_revenue.sql`.
Note two important distinctions in this file:
- Rather than using the `polygon.core.ez_token_transfers` database and table, we use `polygon_flipside.core.ez_token_transfers`. This tells the Artemis pipeline to pull from the Flipside database.
- The model has the following in the header, which tells DBT to materialize the results as a table: `{{ config(materialized="table") }}`
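Putting those two pieces together, the fact model could look roughly like the sketch below; the placeholder addresses and the exact query shape are assumptions, not the contents of the production file:

```sql
-- models/projects/geodnet/core/fact_geodnet_fees_revenue.sql (illustrative sketch)
{{ config(materialized="table") }}

with burn_transfers as (
    select
        block_timestamp::date as date,
        sum(amount) as fees
    from polygon_flipside.core.ez_token_transfers
    where contract_address = lower('0x...')   -- GEODNET token contract (placeholder)
      and to_address = lower('0x...')         -- burn address (placeholder)
    group by 1
)

select
    date,
    fees,
    'polygon' as chain,
    'geodnet' as protocol
from burn_transfers
```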
Now, we just need to create our `ez_geodnet_metrics` table in the right directory and query the results of the above fact table, shown here: `models/projects/geodnet/core/ez_geodnet_metrics.sql`.
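A minimal sketch of what that ez model could contain, assuming it simply re-selects the fact table via dbt's `ref()` macro (the production model may include additional columns or joins):

```sql
-- models/projects/geodnet/core/ez_geodnet_metrics.sql (illustrative sketch)
{{ config(materialized="table") }}

select
    date,
    chain,
    protocol,
    fees
from {{ ref("fact_geodnet_fees_revenue") }}
```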
We are done! Open up a PR targeting this repository and check to make sure the code compiles in the Github Actions. You can see a demo output of your new query by clicking on the "Details" of the `Show Changed Models` CI step.
Out of scope for now; will be added in future iterations.