Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: database structure #80

Merged
merged 6 commits into from
Oct 22, 2024
Merged

Conversation

Vovke
Copy link
Collaborator

@Vovke Vovke commented Oct 10, 2024

Want to discuss database structure here

@Vovke Vovke added documentation Improvements or additions to documentation question Further information is requested labels Oct 10, 2024
docs/DATABASE.md Outdated
### Orders (`orders`)
- order - String: order identifier provided by the frontend
- payment_status - Enum: (pending|paid|timed_out).
- withdrawal_status - Enum: (waiting|failed|completed|none).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need "forced" status as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in c69aa99

@@ -0,0 +1,32 @@
Plan is to update the database scheme in a way that it will support the requirements we have as for the API specs and additional improvements of the deamon.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these things derived from API? Please indicate if something is different from types defined there, ideally these should be as similar as possible, but deviations are ok where they make sense

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in c69aa99

- order - String: order id to link transaction to order
- chain - String: identifier for the chain where transaction occured
- block_nmber - Integer: Block number where the transaction is recorded.
- position_in_block - Integer: Position of the transaction within the block.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might also (later?) want to store the block for which transaction was submitted. It is included into saved transaction bytes though, so maybe not

@Vovke Vovke requested review from Slesarew and fluiderson October 15, 2024 18:34
@fluiderson
Copy link
Collaborator

fluiderson commented Oct 18, 2024

Firstly, tables aren't enough. We also need to define keys for them. Also, remember that our database backend (sled (current) & redb (future)) is a simple key-value storage, so we can not define a table with a composite key, and query the table by different keys.

Another suggestion is to make the database store an absolute basic data that can't be fetched from a blockchain by a simple query.

Now about the scheme itself. It'd be nice to have descriptions for enum variants, too. Like we already have in the API specification:
https://github.com/Alzymologist/kalatori-api/blob/1ccdcaa2958ca2dd8dee47492fcfaea688e01e7b/kalatori.yaml#L410-L413
Just to have all info in one place.

orders

  • amount
    Should have the u128 type. The float type is a mere construct for handling limitations of the shop frontend. Decimals are fetched from a blockchain.
  • currency
    Why not make it as it's in the API specification? On Substrate blockchains, currency isn't defined as a string. There's always a single native token, and sometimes they have multiple assets that're defined by an ID number (u32, precisely). They have a name, but it's just metadata that can be changed anytime. So, a name is a bad value for a key of currency.

TLDR: Should have the type enum { Native, Asset(u32) }. Names are fetched from a blockchain.

  • payment_account
    Should have the type [u8; 32].
  • recipient
    Should have the type [u8; 32].
  • payment_page & redirect_url
    It isn't clear from where the daemon should receive these fields, and hence whether it should store them at all.
  • death
    We should define the representation of custom types like Timestamp. For now, this is a string, but I propose on using the same type as Substrate blockchains are using for timestamps - u32.

As for the key for this table, the order ID is already used for it. Just highlight it.

Also, we may need to link an order to the blockchain that it's on. So, I propose adding the chain field with the type H64.

transactions

The scheme description should also include info regarding whose & what transactions we will record. I suppose these are incoming & outgoing transfers.

The key for this table can't be just an order ID. I propose almost the same scheme as Substrate blockchains are using for transactions. The key should consist of 3 parts: the order ID, the block hash, & the transaction ID in the block. But there's a hitch, we can't know in which block the transaction will end up & what ID it'll have until the block is recorded on a blockchain. So, what the pending status is for? I propose to get rid of it because we already have payment_status & withdrawal_status in the orders table. If our transfer transaction will go wrong and won't be recorded on the blockchain, we can record it in the message field, or simply print it in the log. Or both.

  • status
    Remove the pending variant. See above for a reason.
  • chain
    Remove this field. An order has it.
  • block_number
    Use block hash in the key instead of the block number.
  • position_in_block
    A part of the key.
  • transaction_bytes
    Remove this field. A transaction can be fetched from a blockchain.
  • timestamp
    Remove this field. A timestamp is included in the transaction that can be fetched from a blockchain.
  • sender.
    Remove this field. See above.
  • recipient.
    Remove this field. See above.
  • amount
    Remove this field. See above.
  • currency
    Remove this field. See above.

instance_info

This isn't a table. More like a structure. I've already had it under the name daemon_info before @Slesarew's rewrite.

struct DaemonInfo {
    /// The daemon matches `ChainName` here with chain names from the config, connects to them, and
    /// checks if `genesis` of `chains` equals the genesis hash from connected chains.
    chains: Vec<(ChainName, ChainProperties)>,
    /// The public part of the current signing key. The daemon matches it with the given key and
    /// refuses to work of they are unequal.
    public: PublicKey,
    /// Needed for the key rotation mechanism.
    old_publics_death_timestamps: Vec<(PublicKey, Timestamp)>,
    /// `instance_id`.
    instance: String,
}

struct ChainProperties {
    genesis: BlockHash,
    // Used as key for a chain in the `orders` table.
    hash: H64,
}

I stored all miscellaneous data under the root table. Its key is a string, and a value is opaque bytes with encoded data. It had the following slots:

  • db_version: u32
    Self-explanatory.
  • daemon_info: DaemonInfo
    The struct of the same name above.

You can put it like that in the scheme, or edit it. But first, regarding your structure.

  • version
    Why do you need to store the daemon version in the database? It's a hard-coded value in the daemon binary. Can you elaborate on "consistency with ServerInfo"?
  • debug
    The same question as above. This value is set in the config/CLI, and thus doesn't have to be stored in the database.
  • kalatori_remark
    As above.

orders_per_chain

K: H64 V: u64

A new table to store how many orders does a chain have. This table is for the case when some chain was added in the config, the chain was added to the database on a daemon start, but no orders were created on it. A user can remove unused chain from the config but not from the database, so how can the daemon know which one was unused and delete it by itself? It'll check this table. Another option is to iterate through all orders in the database on each daemon start, so this table is a much better choice.

Of course, it could have bool for a value type, but I think there may be a case in the future where this info will be useful.

docs/DATABASE.md Outdated Show resolved Hide resolved
docs/DATABASE.md Outdated Show resolved Hide resolved
docs/DATABASE.md Outdated Show resolved Hide resolved
docs/DATABASE.md Outdated Show resolved Hide resolved
@Vovke
Copy link
Collaborator Author

Vovke commented Oct 21, 2024

@fluiderson thanks a lot for wrapping it up, I've addressed the remarks below or with code suggestions.

Firstly, tables aren't enough. We also need to define keys for them.

the keys are

  • orders table will be order_id
  • transactions table, transaction_id
  • instance_info table, instance_id

Now about the scheme itself. It'd be nice to have descriptions for enum variants, too.

My goal with this Pull Request is to have a discussion and decide as for database structure, we can describe here enums as well, but thats not the goal atm.

orders

  • currency
    Why not make it as it's in the API specification? On Substrate blockchains, currency isn't defined as a string. There's always a single native token, and sometimes they have multiple assets that're defined by an ID number (u32, precisely). They have a name, but it's just metadata that can be changed anytime. So, a name is a bad value for a key of currency.

thats the way we get it from frontend, even assuming the small probability of ticker name update, we will still get same ticker from there. I agree that this is not the best idea to store it that way, but thats something we will take care of in the future. Created #89 to take care of it later.

  • payment_page & redirect_url
    It isn't clear from where the daemon should receive these fields, and hence whether it should store them at all.

those fields are listed in the API, some integrations expected to require those fields

transactions

The key for this table can't be just an order ID. I propose almost the same scheme as Substrate blockchains are using for transactions. The key should consist of 3 parts: the order ID, the block hash, & the transaction ID in the block. But there's a hitch, we can't know in which block the transaction will end up & what ID it'll have until the block is recorded on a blockchain.

I suggest to use transaction id as a key, and adding array of transactions to order table or introducing "order_id to transaction array table", wdyt?

So, what the pending status is for? I propose to get rid of it because we already have payment_status & withdrawal_status in the orders table. If our transfer transaction will go wrong and won't be recorded on the blockchain, we can record it in the message field, or simply print it in the log. Or both.

pending status is initial transaction status for transactions which are yet to be included in the block. That is listed in the API specs as well.

  • chain
    Remove this field. An order has it.
  • block_number
    Use block hash in the key instead of the block number.
  • position_in_block
    A part of the key.
  • transaction_bytes
    Remove this field. A transaction can be fetched from a blockchain.
  • timestamp
    Remove this field. A timestamp is included in the transaction that can be fetched from a blockchain.
  • sender.
    Remove this field. See above.
  • recipient.
    Remove this field. See above.
  • amount
    Remove this field. See above.
  • currency
    Remove this field. See above.

While this information is indeed available on chain I would prefer to store it in the db for 3 reasons:

  • in case we want to provide some kind of statistics, csv report or any action which require fetching all the transaction data from the database it will make it way harder to populate it
  • as we will eventually migrate to a more convenient data storage having those fields will make it easier for us to migrate
  • its way easier to implement it that way while also minimizing the amount of race condition presented by rpc instability e.g I should always have enough information about transactions especially in cases where rpcs are unstable and doesn't allow me to connect

instance_info

The only field needed there is instance name, all the other fields are stored without specific reason.

orders_per_chain

K: H64 V: u64

Please create an issue to add this table if its needed

@fluiderson
Copy link
Collaborator

I suggest to use transaction id as a key

What's the transaction ID? Do you mean the extrinsic ID from Subscan?

and adding array of transactions to order table or introducing "order_id to transaction array table", wdyt?

We can store transactions in a single table as per my key design. What're the benefits of adding another table or field in the orders table?

pending status is initial transaction status for transactions which are yet to be included in the block. That is listed in the API specs as well.

To store a transaction, we need its ID. Until it isn't included in the block, what ID should we use for it, then? Either this is a fundamental flaw in our API spec or we need to come up with a more complex key to identify both included & pending transactions.

The only field needed there is instance name, all the other fields are stored without specific reason.

So, the current key checking, the key rotation mechanism, the chains genesis checking will be in future versions, too? If yes, then please remove the chain field from the orders table, because it's a part of the chains genesis checking. H64 can't be generated without it.

@Vovke
Copy link
Collaborator Author

Vovke commented Oct 22, 2024

@fluiderson I definitely would prefer having extrinsic but as you said that won't allow us to precreate the transaction.
As discussed above we can have an array of linked transaction id inside the order table or having linking table order_id -> [trx_id_1, trx_id_2, trx_id_3....], both of those approaches will allow us to have our own transaction id as a key for fetching correct transactions for the order.

If yes, then please remove the chain field from the orders table, because it's a part of the chains genesis checking. H64 can't be generated without it.

removed

@Vovke
Copy link
Collaborator Author

Vovke commented Oct 22, 2024

discussed with @fluiderson

  • we will generate our own transaction_id to link transaction and order
  • need to add transaction type field to the transactions, can be payment|withdrawal

docs/DATABASE.md Show resolved Hide resolved
docs/DATABASE.md Show resolved Hide resolved
@Vovke
Copy link
Collaborator Author

Vovke commented Oct 22, 2024

@Slesarew this can be merged

@Slesarew Slesarew merged commit fd562cb into Kalapaja:main Oct 22, 2024
4 of 6 checks passed
@Vovke Vovke deleted the database-discussion branch November 11, 2024 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants