Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async Client #102

Open
jonas32 opened this issue Apr 20, 2021 · 24 comments
Open

Async Client #102

jonas32 opened this issue Apr 20, 2021 · 24 comments

Comments

@jonas32
Copy link
Contributor

jonas32 commented Apr 20, 2021

I would like to start a discussion about making this async.
@khaf already mentioned that this is planned anyways.
Before i will start to test around, i would like to get a two questions cleared.
If you have any additional ideas or feedback on that, please share it!

1: Project layout
There are three ways to look at that. First one would be splitting in two crates (one repo with sub crates), 1 sync and 1 async. Would result in maintaining 2 codebases (some parts could be shared) but would probably be cleaner and with less overhead. Way two would be building async native and implementing sync via manual blocking via feature flag. Third one would be dropping the sync support. Either way, this will be a breaking change because it will require users to use a runtime.

2: Tokio or async-std? Tokio is more mature and way more adopted in the ecosystem. Based on that, i would decide for Tokio.

@khaf
Copy link
Collaborator

khaf commented Jun 16, 2021

Sorry for responding so late. I postponed chipping in to study the rust ecosystem a bit, but it seems that it's been months and hasn't materialized!

  1. I believe we should have an async implementation, and then provide a thin compatibility layer for the old sync API on top of it, while deprecating it. I see no point to having sync data API in this day and age.
  2. Here is where I'm not sure. I'd rather have a design with pluggable async backends, but I don't know how viable that is. Any ideas?

@jonas32
Copy link
Contributor Author

jonas32 commented Jun 16, 2021

I support deprecating the sync API. That makes sense.
I'm also not sure about point 2. Making the client async is already a lot of work, supporting 2 different async platforms means feature flagging the code or splitting it into different crates. Thats a lot of maintaining overhead.
There is probably not much benefit from building this fully modular since async-std and Tokio are the only two really spread runtimes.
I would be fine with both ways, but i would only take care of the base and the Tokio part.
When Tokio is done, we probably got most use-cases covered.
Having the sync and async code base mixed up will end in a mess, so how would you like to separate that?
I would think of either forking to build it clean and maintain both repos for some time or moving one of the APIs to a sub package. "aerospike::sync::Client or aerospike::async::Client" for example.
What do you think about that?

@databasedav
Copy link

shouldn't the async client use the abstractions in futures so it's runtime agnostic?

@jonas32
Copy link
Contributor Author

jonas32 commented Jun 17, 2021

In theory yes, but there is some more into this, for example the TCP sockets. I don't know if there is any abstraction for stuff like that. I only ever saw this in relation to the Tokio or async-std API.
You can also see that in most other database related libs. Most of them explicitly reference Tokio and/or async-std.
I like the way sqlx solved that problem:
https://github.com/launchbadge/sqlx/blob/master/sqlx-rt/src/lib.rs

@jhecking
Copy link
Contributor

The sqlx wrapper for Tokio/async-std reminds me of the small abstraction layer we have in the Aerospike C client to support libuv, libev, and libevent for async operations. If it's not clear whether either Tokio or async-std will be the dominant API for async Rust, then it might be worthwhile to support both.

@khaf
Copy link
Collaborator

khaf commented Jun 17, 2021

The way rust async design and ecosystem has evolved leads me to believe there may not even be just one dominant async library in the future, so it is important for us to design something in the spirit of the original philosophy. At the first glance and on the surface, sqlx implementation seems like a good direction to me as well.

@jonas32
Copy link
Contributor Author

jonas32 commented Jun 19, 2021

I think we probably can take a lot of stuff from the sqlx abstraction and actix-rt. That means the abstraction will not be a big part of the work.
But the client will need a lot of restructuring.
I would go for this way:
Project structure:

  • Crate aerospike-core-> Everything thats shared between the codebases, like the encoding functions.
  • Crate aerospike-rt -> The internal async abstraction lib
  • Crate aerospike-async -> The async codebase
  • Crate aerospike-sync -> The sync codebase
  • Crate aerospike -> Reexporter for the 2 above, depending on the feature flags
    The async/sync crate could also be a single one, but i would split that for better overview and structure.
    Essentially they will be the same, just that the sync one will be deprecated and not use the rt lib.

Feature flags for actix-rt, tokio and async-std. actix-rt is just another wrapper lib for async-std and Tokio, but its common to only have this one and no explicit Tokio runtime. So most projects also support that one. Its nearly no overhead since the imports are nearly identical to Tokio.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 4, 2021

I started to work on the async version...
Porting the current client is a huge pain and causes a lot of overhead. I spent >25 hours, just trying to fix all the problems caused by references and lifetimes and I'm far from getting this to work.
Async rust really does not like the way variables are stored in the client. There are two ways to solve the problem. First one would be by moving the variables instead of referencing, the second one would be smartpointers.
I will go for the first one since this is the way cleaner and more usable way.

This will cause a lot of refactoring in the lower levels of the client and change the API a lot.
I'm going to do this changes anyway, so now the question is should i do that here or should i fork out?

The changes will partly abandon the low memory profile idea of the client. I will have an eye on good memory usage, but this wont be the priority anymore. The memory usage will probably increase, but i don't see the benefit compared to all the trade-offs in adding a lot of boilerplate to keep it as low as possible. Overall performance is probably the better target (the smartpointer way would probably affect this).

In addition, the sync client will also not be a fully independent one, it will basically just wrap the async one and block. This is a common strategy in rust. It will still require a runtime like Tokio, but that is no problem since there is one in place in most of the projects. I underestimated how many basic functions need to be changed to make them run async. There is nearly no function besides msgpack and buffer writing that works without changes.
I managed to get cluster and net more or less running on async, but as soon as i start with commands, it becomes ugly and hacky.

Short overview of what i will do:

  • Crate aerospike-rt -> Runtime abstraction
  • Crate aerospike-core -> Async library
  • Crate aerospike-sync -> Sync wrapper for the core lib
  • Crate aerospike -> The exported API, depending on feature flags

Flags:

  • rt-tokio (default on)
  • rt-async-std (will follow later)
  • rt-actix (will follow later)
  • serialization (default on)

Major changes:

  • Abandon low memory profile focus
  • Timeouts will be removed because they are not supported by async sockets and also not needed in an async env
  • Operations and expressions need a lot of refactoring for async and will switch to moving variables instead of referencing
  • The client API will have a lot of changes to make it more usable (focus on batch reading and operations)
  • The client will probably have buffer caches in a later stage like the golang one has to improve performance when operations are executed multiple times

Since it is closely related to the async runtime, ill try to also implement encryption while building the updated core.

Sorry for skipping the discussion and bringing in facts, but i need progress on this. My last PR is marked ready for review for over 4 months now. If you are ok with this changes, i will do them here with a PR. If you have any comments on this, please share!

@khaf
Copy link
Collaborator

khaf commented Aug 6, 2021

I've been working to support the new server v5.6 features into the client, and as part of that - since it would be a breaking change anyway - I was going to remove a few lifetimes. I thought I'd be done in couple of weeks, so it would make sense for you to wait for my changes to get in. In the mean time, maybe we should talk about the changes first so we are on the same page. My changes are focused on removing many of the &str in the library and turning them into String. This will not have much, if any impact on the memory usage.

I did not know you had a PR open, I thought you just had a branch. If it passes the tests, we can get it merged ASAP.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 6, 2021

Thanks for the response! Can you give a more exact list on what you are implementing?
While clicking through the base of the Java client, i found many missing parts.
My refactoring is currently mainly in the net and cluster crates. Im far from getting any kind of command to work.
Async rust requires all structs to be Send + Sync + Sized. Now we could do this like its done for the Client, but that unsafe block is a bad practice. The only way to resolve this is removing everything that is not thread safe.

@khaf
Copy link
Collaborator

khaf commented Aug 6, 2021

I'm trying to implement the partition scans so that the tests pass. The old scan protocol is deprecated, so the client can not run any scan/queries. I also have to implement the new auth protocol.
I have nothing against changing the client to get it working with async, I'd just prefer to do it in phases, so that the async job becomes just about the async instead of a huge changeset that interlocks the whole span of changes.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 6, 2021

Totally. I did not know that you started to work on the client again.
I wanted to do a load of side features in the same move, because i would otherwise have to port "deprecated" and outdated code to async, so i can remove it again in a later stage to update it.
New Auth protocol and Rack aware are probably the "deepst" ones in the client.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 10, 2021

Please have a look at my #108 PR. If you see anything i missed, please tell me.
Its a stupid 1:1 port from old sync to ne async. No functionality changes.

The only part i dont really understand yet is the threadpool. I would like to get rid of that, since it makes not much sense in async. The only part that prevents me is this "scoped" pools. Can you explain more why the client needs them instead of simple run and die threads?

@jhecking
Copy link
Contributor

The only part that prevents me is this "scoped" pools. Can you explain more why the client needs them instead of simple run and die threads?

ca9940b: Does this help?

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 12, 2021

Unscoped threads are a good use case for executing queries accross
multiple nodes because the query threads continue running even after the
result set has been passed back to the caller.

Can you explain this further? Why does the client keep them? Is there any technical reason or is it just to skip the overhead of spawning new threads everytime?

However in the case of
the batch read operation, all threads used to fetch records from
different nodes need to complete before the results are passed back to
the caller.

So the point of the scoped ones is basically just to await until all of them are finished before returning anything?

@jhecking
Copy link
Contributor

The main difference between the scoped and unscoped threads is in the lifetimes of the data that you can pass to the thread to process. For unscoped threads, you either have to pass ownership to the thread or you can pass references to data with 'static lifetimes. You can't pass references to any data with non-static lifetimes, since you don't know how long the thread is going to run, so it might outlive any non-static lifetimes. For scoped threads, you can pass references to data with non-static lifetimes, as long as its guaranteed that the lifetime of the data outlives the lifetime of the scope.

We use that for batch reads. All the batch records are held in a single vector. All the batch jobs (one per cluster node) hold a shared reference to this structure, but each has a unique set of offsets its responsible for. (Since each record is owned by exactly one node.) All the batch jobs are executed in parallel. Once all jobs are complete, the same batch records vector is passed back to the caller.

Probably could be done without the scoped threads. You'd have to slice up the batch records into smaller vectors which can be passed to each batch job. Once a job is done it passes ownership back. Then you just have to reassemble all the individual pieces back into a single result vector in the correct order.

I haven't used async/await in Rust yet, so I don't know how this would work in that case.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 13, 2021

Im trying to implement that splitting right now.
There is one point that brings in a little complexity. How important is it to keep the order in the return vec?
Right now, the order of batch records is the same that the user gave in. Does the returned one have to keep this order or can it be "unordered"?
Right now, the client will first get the nodes for each batch read and then add the index of the element to the node vec.
Instead, i would like to directly add the batch record instead of its index. That would do the slicing part but then i give up the original ordering.

@jhecking
Copy link
Contributor

I think all Aerospike clients have always kept the order of the results the same, so for consistency the Rust client should probably do the same.

If we decide to break compatibility with other clients, then the batch function should probably just take IntoIter instead of a Vec and return some collection that doesn’t imply order like a HashSet.

@jonas32
Copy link
Contributor Author

jonas32 commented Aug 14, 2021

We can extend the BatchRead struct with an index field. Then we can rebuild a new vec at the end. That would probably work.
Accepting IntoIter might be a problem because iters are mutable references. I have not yet managed to send such a reference over multiple threads without a lot of overhead and cloning.
However, this will add overhead and slow down the client, since i will have to move around mutable vectors to do that.
If that bloat is fine, ill implement it. Right now its more or less ordered by node.

@jhecking
Copy link
Contributor

I think we should keep the current interface that takes a Vec and returns a Vec with records in the same order. @khaf thoughts?

@jonas32
Copy link
Contributor Author

jonas32 commented Sep 15, 2021

Did you already have time to look into the async PR?
Whats the status on partition scan and the new auth? We would like to update our infrastructure but the client is preventing that right now.

@jonas32
Copy link
Contributor Author

jonas32 commented Sep 30, 2021

Sorry for pinging again, but i need at least an answer on the second question because otherwise i will have to do it myself.
I waited for over a month now. This is starting to destroy our time plan. @jhecking @khaf

@khaf
Copy link
Collaborator

khaf commented Sep 30, 2021

@jonas32 Sorry I was distracted on other projects and @jhecking has very limited time to invest on this project. I'm back at the Rust client full time for the next few weeks. Let me finish a few minor fixes this week and I'll get back to you early next week.

@bmuddha
Copy link

bmuddha commented Feb 22, 2022

Hello guys, can we expect async client to be released anytime soon? I can see there's a PR open for it, and it seems to be at testing stage

@jonas32 jonas32 mentioned this issue Mar 4, 2022
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants