Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: removing or combining ports used in the standalone mode #12086

Closed
lmatz opened this issue Sep 5, 2023 · 9 comments
Closed

Discussion: removing or combining ports used in the standalone mode #12086

lmatz opened this issue Sep 5, 2023 · 9 comments

Comments

@lmatz
Copy link
Contributor

lmatz commented Sep 5, 2023

Just to move some discussion in the Slack to Github

SCR-20230905-maa

Right now, the standalone mode opens dozens of ports. And for cloud deployment, we'd like to reduce the number of ports. cc @arkbriar could you elaborate more on the benefits of having fewer ports in the standalone mode?

For Prometheus, I suppose three ports can be combined into one only.

And for gRPC communication among meta, frontend, and compute, we probably can do the same as they used to be different processes but now they are all in the same process.

How much work does it take?
Which set of ports are to be combined/removed most easily first?

@github-actions github-actions bot added this to the release-1.2 milestone Sep 5, 2023
@arkbriar
Copy link
Contributor

arkbriar commented Sep 5, 2023

cc @arkbriar could you elaborate more on the benefits of having fewer ports in the standalone mode?

I just think we don't have to expose such many ports. More or less, it will ease the pain to deploy the standalone mode.

@kwannoel
Copy link
Contributor

kwannoel commented Sep 5, 2023

Add some comments from @BugenZhao :

@kwannoel:

Not right now for the standalone binary.
It is just a wrapper around CN, Meta and Frontend entrypoints. So it will require the ports to be supplied.
Eventually they can get removed / combined.
Currently I think it still uses gRPC to communicate between the nodes, even though it is a single process. That can be improved.
cc @BugenZhao
do you have any ideas for it? Did risingwave-playground explore this before?

@BugenZhao:

This is a good point. Generally, we have done a partial of this optimization:
For exchanges between actors, we will prefer local channels and directly move the chunks into them if we find two actors on the same worker node. For standalone deployment, all exchanges will naturally take benefit from this optimization.
For other gRPC requests, we don’t have this optimization and always communicate on http endpoint. But I guess it’s not a big deal in most cases, as the data amount is small and the performance is not critical. However, it may matter for serving and we may improve this if necessary.

it may matter for serving and we may improve this if necessary.
I guess it’s not only simply using a local channel as the underlying medium for tonic gRPCs. We may also consider reducing the encode/decode for each call. This may require us to do some refactor and design a fast path for this kind of deployment.

@kwannoel
Copy link
Contributor

kwannoel commented Sep 5, 2023

Something else I thought about: Even though meta, CN and frontend share the same process, is it possible that other nodes will need to communicate directly to meta / frontend via gRPC? In that case we need to handle routing logic, and the RPC messages need to be tagged.

@kwannoel
Copy link
Contributor

kwannoel commented Sep 5, 2023

As for refactoring work, I think the main hurdle is gRPC, I'm not sure how tightly coupled our system is with it, whether it's as simple as not starting the gRPC service. It would require some investigation on my part.

Other refactoring work would be just adding a flag to each rw node, to indicate they are executing in standalone mode, and do not need to start gRPC service.

@BugenZhao
Copy link
Member

cc @arkbriar could you elaborate more on the benefits of having fewer ports in the standalone mode?

I just think we don't have to expose such many ports. More or less, it will ease the pain to deploy the standalone mode.

I guess avoiding exposing unnecessary ports should be enough. There's not much benefit for thoroughly removing them and switching to an in-memory channel for most RPC calls, as explained above.

@lmatz
Copy link
Contributor Author

lmatz commented Sep 7, 2023

As the connector node is going to be embedded into CN, will it still rely on gRPC to communicate with CN? Or will it be totally replace with JNI?

@kwannoel
Copy link
Contributor

kwannoel commented Sep 7, 2023

As the connector node is going to be embedded into CN, will it still rely on gRPC to communicate with CN? Or will it be totally replace with JNI?

Probably replace with JNI.
Cc @chenzl25

@chenzl25
Copy link
Contributor

chenzl25 commented Sep 7, 2023

There would be 2 phases to replace the connector node with the embedded one. In the first phase, only part of RPCs is replaced. But in the end, all of them would be JNI.

@fuyufjh fuyufjh modified the milestones: release-1.2, release-1.3 Sep 11, 2023
@lmatz lmatz removed this from the release-1.3 milestone Sep 11, 2023
Copy link
Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants