Add raw batch data wrapper query #77

Closed
4 changes: 4 additions & 0 deletions cowprotocol/raw_data/.sqlfluff
@@ -0,0 +1,4 @@
[sqlfluff:templater:jinja:context]
start_time='2024-08-01 12:00'
end_time='2024-08-02 12:00'
blockchain='ethereum'
132 changes: 132 additions & 0 deletions cowprotocol/raw_data/batch_data_4351957.sql
@@ -0,0 +1,132 @@
-- This query provides data related to rewards/payouts on a per batch auction level
-- for all auctions that had at least one winner.
-- Parameters:
-- {{blockchain}}: the chain for which we want to retrieve batch data

-- The output has the following columns:
-- environment: varchar
-- auction_id: integer
-- block_number: integer
-- block_deadline: integer
-- tx_hash: varbinary
-- solver: varbinary
-- execution_cost: decimal(38, 0)
-- surplus: decimal(38, 0)
-- protocol_fee: decimal(38, 0)
-- network_fee: decimal(38, 0)
-- uncapped_payment_native_token: decimal(38, 0)
-- capped_payment: decimal(38, 0)
-- winning_score: decimal(38, 0)
-- reference_score: decimal(38, 0)

with
past_batch_data_ethereum as (
select
s.environment,
null as auction_id,
d.block_number,
d.block_deadline,
from_hex(d.tx_hash) as tx_hash,
from_hex(d.solver) as solver,
cast(d.data.execution_cost as decimal(38, 0)) as execution_cost, --noqa: RF01
cast(d.data.surplus as decimal(38, 0)) as surplus, --noqa: RF01
cast(d.data.protocol_fee as decimal(38, 0)) as protocol_fee, --noqa: RF01
cast(d.data.fee as decimal(38, 0)) as network_fee, --noqa: RF01
cast(d.data.uncapped_payment_eth as decimal(38, 0)) as uncapped_payment_native_token, --noqa: RF01
cast(d.data.capped_payment as decimal(38, 0)) as capped_payment, --noqa: RF01
cast(d.data.winning_score as decimal(38, 0)) as winning_score, --noqa: RF01
cast(d.data.reference_score as decimal(38, 0)) as reference_score --noqa: RF01
from cowswap.raw_batch_rewards as d
inner join cow_protocol_ethereum.solvers as s on d.solver = cast(s.address as varchar)
where d.block_deadline < 20866925
),

past_batch_data_gnosis as ( --noqa: ST03
select
'a' as environment,
0 as auction_id,
0 as block_number,
0 as block_deadline,
0x as tx_hash,
0x as solver,
0 as execution_cost,
0 as surplus,
0 as protocol_fee,
0 as network_fee,
0 as uncapped_payment_native_token,
0 as capped_payment,
0 as winning_score,
0 as reference_score
where false
),

past_batch_data_arbitrum as ( --noqa: ST03
select
'a' as environment,
0 as auction_id,
0 as block_number,
0 as block_deadline,
0x as tx_hash,
0x as solver,
0 as execution_cost,
0 as surplus,
0 as protocol_fee,
0 as network_fee,
0 as uncapped_payment_native_token,
0 as capped_payment,
0 as winning_score,
0 as reference_score
where false
)

select *
from past_batch_data_{{blockchain}}
union all
select
environment,
auction_id,
block_number,
block_deadline,
tx_hash,
solver,
cast(execution_cost as decimal(38, 0)) as execution_cost,
cast(surplus as decimal(38, 0)) as surplus,
cast(protocol_fee as decimal(38, 0)) as protocol_fee,
cast(network_fee as decimal(38, 0)) as network_fee,
cast(uncapped_payment_eth as decimal(38, 0)) as uncapped_payment_native_token,
cast(capped_payment as decimal(38, 0)) as capped_payment,
cast(winning_score as decimal(38, 0)) as winning_score,
cast(reference_score as decimal(38, 0)) as reference_score
from dune.cowprotocol.dataset_batch_data_{{blockchain}}_2024_10
Contributor

Is the plan to add these 15 lines manually every month? Can we not come up with a better solution where this is automatically generated based on the current date?

If we go down this route, I'd at least like to see this query rewritten so that the redundant part becomes just

union all
select * from <new month table>

Otherwise this file will be a horror 6 months from now.
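
As a rough illustration of generating the redundant part from the current date, here is a minimal Python sketch. The table-name pattern is copied from the query in this PR; the helper names `months_between` and `build_union_tail` are hypothetical, and the use of `select *` assumes the monthly tables already carry the right column types.

```python
from datetime import date

# Table-name pattern taken from the query in this PR.
TABLE_PATTERN = "dune.cowprotocol.dataset_batch_data_{blockchain}_{year}_{month:02d}"


def months_between(start: date, end: date):
    """Yield (year, month) for every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def build_union_tail(start: date, today: date, blockchain: str = "{{blockchain}}") -> str:
    """Return one 'union all / select *' block per monthly table (hypothetical helper)."""
    blocks = [
        "union all\nselect * from "
        + TABLE_PATTERN.format(blockchain=blockchain, year=y, month=m)
        for y, m in months_between(start, today)
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    # The first monthly table in this PR is 2024_10.
    print(build_union_tail(date(2024, 10, 1), date.today()))
```

The default `blockchain` value keeps the `{{blockchain}}` placeholder in the output, so the generated text can still be used as a parameterized query.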

Contributor Author

@harisang, Dec 3, 2024

The goal is that no one touches this query and this is updated automatically by the script that uploads data on Dune.

Indeed, ideally we should have a select *. These tables were uploaded using dune-sync-v1; if we are able to do proper type casting with dune-sync-v2, the explicit casts could go away.

Contributor

> and this is updated automatically by the script that uploads data on Dune.

How would this work specifically? Would the script make a pull request to this repository and automatically merge a change to the query on the first of each month?

Let's please think this process through to avoid a bad surprise 4 weeks from now when we think we are "done" with this project.

Contributor Author

Ok good point about how to sync with this repo. I hadn't thought about this. So what I have in mind is basically what Bram has done when testing some versions of the sync in the Prefect repo: https://github.com/cowprotocol/Prefect/blob/a233d2831e936aa05ff4a7984aa1116580402e11/config/tasks/dune_sync.py#L121

Basically, since the dune-sync job takes a timestamp in order to work, if the timestamp says that it is the first day of the month, the job would update the Dune query automatically by pushing a change directly to Dune. This means that the version of the query in this repo would become outdated, unless we allow the script to also push directly to main. I am not sure that is necessary, though, since again, this query is not meant to be actively maintained by anyone.
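
The first-of-the-month check described above could look roughly like the following Python sketch. It only builds the new query text; how the text is pushed to Dune (or to this repo) is deliberately left out. The function names `table_for` and `extend_query_for_month` are hypothetical, and `select *` is used for brevity where the PR currently lists explicit casts.

```python
from datetime import datetime


def table_for(run_time: datetime) -> str:
    """Monthly table name for the month of run_time, keeping the {{blockchain}} placeholder."""
    return "dune.cowprotocol.dataset_batch_data_{{blockchain}}_" + run_time.strftime("%Y_%m")


def extend_query_for_month(query_sql: str, run_time: datetime) -> str:
    """Append the block for run_time's month on the first day of that month.

    Returns the query unchanged on any other day, or if the table is already
    referenced, so re-running the job on the same day is harmless.
    """
    table = table_for(run_time)
    if run_time.day != 1 or table in query_sql:
        return query_sql
    return query_sql.rstrip() + "\nunion all\nselect * from " + table + "\n"
```

Under this sketch, the upload that creates the new month's table would presumably need to run before the updated query is pushed, so that the query never references a table that does not exist yet.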

Contributor Author

@harisang, Dec 4, 2024

An alternative that I tried was to actually prefill the query with all tables for the next few years (!). But it seems that Dune complains, as expected, about non-existent tables. Not sure if there is a workaround for that (we could create dummy tables for the next 48 months, for example, but I am not sure we want that).
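
If the dummy-table route were taken, the list of table names to create could be produced along these lines. This is only a sketch: `upcoming_tables` is a hypothetical helper, the name pattern mirrors the tables referenced in the query, and actually creating the empty tables is not shown since it depends on the upload tooling.

```python
from datetime import date


def upcoming_tables(first: date, months: int, blockchain: str) -> list[str]:
    """Return the monthly table names for `months` consecutive months starting at `first`."""
    names = []
    for offset in range(months):
        # Count months from year 0 so that the year roll-over is plain integer math.
        total = first.year * 12 + (first.month - 1) + offset
        year, month_index = divmod(total, 12)
        names.append(f"dataset_batch_data_{blockchain}_{year}_{month_index + 1:02d}")
    return names


if __name__ == "__main__":
    for name in upcoming_tables(date(2025, 1, 1), 48, "ethereum"):
        print(name)
```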

Contributor

I think having this be an automated job makes sense (though it would likely still benefit from having the only redundant part be select * from table_x rather than all the select statements). Therefore, I believe this query should not live here.

union all
select
environment,
auction_id,
block_number,
block_deadline,
tx_hash,
solver,
cast(execution_cost as decimal(38, 0)) as execution_cost,
cast(surplus as decimal(38, 0)) as surplus,
cast(protocol_fee as decimal(38, 0)) as protocol_fee,
cast(network_fee as decimal(38, 0)) as network_fee,
cast(uncapped_payment_eth as decimal(38, 0)) as uncapped_payment_native_token,
cast(capped_payment as decimal(38, 0)) as capped_payment,
cast(winning_score as decimal(38, 0)) as winning_score,
cast(reference_score as decimal(38, 0)) as reference_score
from dune.cowprotocol.dataset_batch_data_{{blockchain}}_2024_11
union all
select
environment,
auction_id,
block_number,
block_deadline,
tx_hash,
solver,
cast(execution_cost as decimal(38, 0)) as execution_cost,
cast(surplus as decimal(38, 0)) as surplus,
cast(protocol_fee as decimal(38, 0)) as protocol_fee,
cast(network_fee as decimal(38, 0)) as network_fee,
cast(uncapped_payment_eth as decimal(38, 0)) as uncapped_payment_native_token,
cast(capped_payment as decimal(38, 0)) as capped_payment,
cast(winning_score as decimal(38, 0)) as winning_score,
cast(reference_score as decimal(38, 0)) as reference_score
from dune.cowprotocol.dataset_batch_data_{{blockchain}}_2024_12