How to understand cloud-oriented design? OLTP? #1

Open
bytesleak opened this issue Feb 11, 2022 · 3 comments

Comments

@bytesleak

No description provided.

@drmingdrmer
Member

Make the best use of cloud services to break the limits normally found in a program designed for a single local machine, for example:

  • The WAL can be replaced with Raft, Kafka, or another queue/log service that provides strong consistency.
  • Data blocks (SSTables) can be stored in S3 for reliability and replication.
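
To make the two points above concrete, here is a minimal sketch of a storage engine that only talks to a log interface and a block-store interface. All names here (`Log`, `BlockStore`, `Engine`, the `sst-` naming scheme) are hypothetical and are not taken from the project. The in-memory classes stand in for what could be a Raft/Kafka client and an S3 client in a real deployment.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Log(ABC):
    """Hypothetical WAL interface; a Raft or Kafka client could implement it."""

    @abstractmethod
    def append(self, record: bytes) -> int:
        """Append a record and return its sequence number."""

    @abstractmethod
    def read_from(self, seq: int) -> List[bytes]:
        """Return all records with sequence number >= seq."""


class BlockStore(ABC):
    """Hypothetical block interface; an S3 client could implement it."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...


class InMemoryLog(Log):
    """Stand-in for a replicated log service, for illustration only."""

    def __init__(self) -> None:
        self._records: List[bytes] = []

    def append(self, record: bytes) -> int:
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, seq: int) -> List[bytes]:
        return list(self._records[seq:])


class InMemoryBlockStore(BlockStore):
    """Stand-in for an object store, for illustration only."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def get(self, name: str) -> bytes:
        return self._objects[name]


class Engine:
    """Writes go to the log first, then to an in-memory table."""

    def __init__(self, log: Log, blocks: BlockStore) -> None:
        self.log = log
        self.blocks = blocks
        self.memtable: Dict[str, str] = {}
        self.flushed = 0

    def put(self, key: str, value: str) -> None:
        self.log.append(f"{key}={value}".encode())
        self.memtable[key] = value

    def flush(self) -> str:
        """Write the memtable as one sorted, immutable block and clear it."""
        name = f"sst-{self.flushed:06d}"
        body = "\n".join(f"{k}={v}" for k, v in sorted(self.memtable.items()))
        self.blocks.put(name, body.encode())
        self.memtable.clear()
        self.flushed += 1
        return name
```

Because `Engine` depends only on the two interfaces, swapping a local file for a cloud service would not change the engine logic in this sketch.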

An application on the cloud has almost unlimited resources available, as long as you can pay for them.

So optimizing for cost becomes a design concern:
Data storage is cheap, while data I/O is expensive. Data structures should be friendly to object stores, and the data index has to be as small as possible.
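
One common way to keep the index small and the I/O count low is a sparse index: keep only the first key of each block in memory and fetch exactly one block per lookup. The sketch below is an illustration of that general technique, not the project's actual layout. `build_blocks`, `lookup`, and the `fetch_block` callback are hypothetical names, and `fetch_block` stands in for a single object-store GET.

```python
import bisect


def build_blocks(sorted_items, block_size):
    """Split sorted (key, value) pairs into blocks.

    Returns (first_keys, blocks): first_keys is the sparse index,
    holding one key per block instead of one entry per record.
    """
    first_keys = []
    blocks = []
    for i in range(0, len(sorted_items), block_size):
        chunk = sorted_items[i:i + block_size]
        first_keys.append(chunk[0][0])
        blocks.append(dict(chunk))
    return first_keys, blocks


def lookup(first_keys, fetch_block, key):
    """Find the only block that can hold `key`, then fetch just that block."""
    i = bisect.bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key sorts before every block: no I/O needed
    return fetch_block(i).get(key)
```

With a block size of a few thousand records, the in-memory index in this scheme is a few thousand times smaller than a per-record index, at the price of reading one whole block per lookup.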

Global synchronization is expensive: locking is difficult to get right in a distributed environment, so snapshot-based designs are preferred.
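
As a rough illustration of the snapshot idea, the sketch below lets readers hold an immutable view while a writer publishes new versions by copy-on-write, so readers never take a lock. `SnapshotStore` is a hypothetical name, and the sketch assumes a single writer; it says nothing about how the project itself implements snapshots.

```python
from types import MappingProxyType


class SnapshotStore:
    """Copy-on-write store: every commit publishes a new immutable version."""

    def __init__(self):
        self._current = MappingProxyType({})
        self.version = 0

    def snapshot(self):
        """Return a read-only view that later commits never modify."""
        return self._current

    def commit(self, updates):
        """Build the next version from the current one and publish it.

        Assumes a single writer; concurrent writers would need coordination.
        """
        new_state = dict(self._current)
        new_state.update(updates)
        self._current = MappingProxyType(new_state)
        self.version += 1
        return self.version
```

A reader that called `snapshot()` before a commit keeps seeing the old state, which is the property that removes the need for read locks.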

Message transmission across nodes becomes more expensive. Data layout has to be taken into account if cross-region deployment is involved.

Algorithms that work well in a distributed system will be preferred, such as those with relaxed consistency requirements (CRDTs, for example) or a non-linear WAL (log) structure.
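
For readers unfamiliar with CRDTs, a grow-only counter (G-Counter) is one of the simplest examples: each node increments only its own slot, and replicas converge by taking the per-node maximum, with no coordination between nodes. This is a textbook sketch for illustration, and the `GCounter` class name is not something from the project.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by per-node max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, n=1):
        # A node only ever touches its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # max() is commutative, associative and idempotent, so replicas
        # converge regardless of merge order or repeated delivery.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)
```

Merging the same state twice changes nothing, which is why this kind of structure tolerates duplicated or reordered messages.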

@bytesleak
Author

Thank you for your reply, I get the idea. Will it consider scalability under multi-core?

@drmingdrmer
Member

> Thank you for your reply, I get the idea. Will it consider scalability under multi-core?

It sounds like a must-have. :)

The internal sharding will make it generally friendly to multi-core environments:
https://datafuselabs.github.io/openkv/arch/sharding.html
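
As a generic illustration of why sharding helps on multi-core machines, the sketch below uses plain hash partitioning so that each key belongs to exactly one shard, and each shard could be owned by one core without cross-shard locking. This is an assumption made for illustration and may differ from the scheme described in the linked document; `shard_of` and `ShardedStore` are hypothetical names.

```python
import zlib


def shard_of(key: bytes, num_shards: int) -> int:
    """Map a key to a shard with a hash that is stable across processes."""
    return zlib.crc32(key) % num_shards


class ShardedStore:
    """Each shard is an independent map that one worker could own."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: bytes, value: str) -> None:
        self.shards[shard_of(key, len(self.shards))][key] = value

    def get(self, key: bytes):
        return self.shards[shard_of(key, len(self.shards))].get(key)
```

`zlib.crc32` is used instead of the built-in `hash()` because Python randomizes string and bytes hashes per process, which would make shard assignment unstable across restarts.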

Also, most of the data in an LSM tree lives in static SSTables (on disk or cached in memory), which makes sharing data across cores more efficient: immutable data needs no synchronization and causes no cache-line invalidation.
