
Artemis DBT


Welcome to the Artemis DBT repository!


Introduction

What is this?

This is the home of all of the data-munging business logic that ultimately powers much of the data in the Artemis application suite, including the Terminal, Sheets, and the Snowflake integration.

Who is this for?

If you're in this repository, you're likely one of the following:

  • a researcher who is trying to better understand methodology
  • a protocol that would like to self-list on Artemis
  • a data scientist who would like to add a new protocol or metric (and possibly be compensated for their contributions via a bounty)

wtf is DBT?

DBT stands for "data build tool" and is an approach to building SQL models in a collaborative and iterative environment. It allows you to "modularize and centralize your analytics code" while mostly being vanilla SQL with some minor syntactic magic on top. This repository uses the Snowflake SQL syntax on top of DBT.
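For example, a DBT model is just a SELECT statement saved in a .sql file, and the Jinja templating is the "syntactic magic" that wires models together. A minimal sketch (the model names here are hypothetical):

```sql
-- A DBT model is plain SQL; the {{ ref(...) }} Jinja call declares a
-- dependency on another model, which DBT resolves at compile time.
select
    block_timestamp::date as date,
    sum(fee_amount) as fees
from {{ ref("stg_example_transfers") }}  -- hypothetical upstream model
group by 1
```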

We use DBT to transform raw blockchain data (transactions, traces, and decoded event logs) into high-fidelity metrics for our users.

For most SQL wizards, reading DBT models comes intuitively, but the official DBT documentation is a good place to learn more.

How do I get help?

There are two ways to get help:

  1. Pop into our Discord and ask us anything in the Methodology channel (fastest)
  2. Open an issue on this GitHub repository.

Environment Setup

  1. Fork this repository (button towards the top right)
  2. Run `source syncenv` to download dependencies
  3. Write SQL model changes
  4. Open a PR and view the results of your changes directly in the GitHub Actions output (more on this in Adding a new metric).

To run against Snowflake, create a `.env.local` file in the root of this repo by running `cp .env.local.template .env.local` and filling in the necessary values.

Run `source syncenv` to load these environment variables into your shell and install the necessary Python dependencies.

System Design

Artemis manages a data pipeline that moves raw data from a series of vendors, including Flipside, Goldsky, and QuickNode, into our Snowflake data warehouse. This repository contains the business logic for turning that raw data into fact tables that describe an individual protocol or collection of protocols.


Fact tables are then combined into `ez_asset_metrics` tables that are piped into the downstream applications. We use the star schema model to label our tables.

Adding a new asset

BEFORE adding metrics for a protocol, you must create the asset.

For example, let's say a user wants to add GEODNET fees to our DePin dashboard. They must first add the GEODNET asset by completing the following steps:

  • Fork this repository
  • Add GEODNET to the `databases.csv` file
  • Request and merge a PR with this change

The Artemis team will then create the necessary permissions and warehouses in order for GEODNET to show up in the Terminal.

Example PR

Adding a new metric

Once the asset exists, there are several ways to pull metrics for the protocol in question. Continuing with the GEODNET fees example, we break down below how to add this protocol's metrics by provider.

GEODNET is a DePin protocol where 80% of fees are sent to the burn address and counted towards network revenue.

Using Flipside's Warehouse

To calculate fees, we can write a query directly in Flipside's studio to count the token transfers towards the burn address, found here.
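Such a query might look roughly like the sketch below; the token contract and burn address are placeholders, and the column names follow Flipside's `ez_token_transfers` conventions:

```sql
-- Sketch of a Flipside studio query: sum daily GEODNET token transfers
-- sent to the burn address. Both addresses below are placeholders.
select
    block_timestamp::date as date,
    sum(amount) as fees
from polygon.core.ez_token_transfers
where contract_address = lower('0x...')  -- GEODNET token contract (placeholder)
  and to_address = lower('0x...')        -- burn address (placeholder)
group by 1
order by 1
```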

To productionize this into the Artemis DBT schemas, fact tables are expected to have the following columns:

  • `date` [DATETIME]
  • `fees` [NUMBER] - this field name will change based on the metric you are pulling
  • `chain` [STRING]
  • `protocol` [STRING]

In this example, the chain that GEODNET publishes fees on is `polygon` and the protocol is `geodnet` - both of these fields should match the asset names in the `assets.csv` file.

Given our Flipside query is already in this format, we can mostly copy and paste this directly into a fact table in the correct directory for GEODNET: `models/projects/geodnet/core/fact_geodnet_fees_revenue.sql`

Note two important distinctions in this file (both sketched below):

  1. Rather than using the `polygon.core.ez_token_transfers` database and table, we use `polygon_flipside.core.ez_token_transfers`. This tells the Artemis pipeline to pull from the Flipside database.
  2. The model has the following in the header, which tells DBT to materialize the results as a table: `{{ config(materialized="table") }}`
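Putting both distinctions together, the fact model might look roughly like this (the addresses remain placeholders):

```sql
-- Hypothetical sketch of fact_geodnet_fees_revenue.sql: same query as the
-- Flipside studio version, but sourced from the Flipside database and
-- materialized as a table, with the chain and protocol columns added.
{{ config(materialized="table") }}

select
    block_timestamp::date as date,
    sum(amount) as fees,
    'polygon' as chain,
    'geodnet' as protocol
from polygon_flipside.core.ez_token_transfers
where contract_address = lower('0x...')  -- GEODNET token contract (placeholder)
  and to_address = lower('0x...')        -- burn address (placeholder)
group by 1
```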

Now, we just need to create our `ez_geodnet_metrics` table in the right directory and query the results of the above fact table, shown here: `models/projects/geodnet/core/ez_geodnet_metrics.sql`
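A minimal sketch of what that model could look like, selecting from the fact model via DBT's `ref()` so lineage is tracked (the column list is assumed from the fact table above):

```sql
-- Hypothetical sketch of ez_geodnet_metrics.sql: expose the fact table's
-- results as the user-facing metrics model.
{{ config(materialized="table") }}

select
    date,
    chain,
    protocol,
    fees
from {{ ref("fact_geodnet_fees_revenue") }}
```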

We are done! Open up a PR targeting this repository and check to make sure the code compiles in the GitHub Actions.

You can see a demo output of your new query by clicking on the "Details" of the Show Changed Models CI step.


Using QuickNode's RPCs

Out of scope for now; this will be added in future iterations.

Using Goldsky-ingested Raw Data

Out of scope for now; this will be added in future iterations.