unify hummock s3 retry and timeout interface #13843
Comments
Are the socket connection time and TTFB time included in or excluded from the operation timeout? If they are included, do we expect the socket connection timeout and TTFB timeout to be significantly smaller than the operation timeout? For simplicity, could we use the operation timeout only, without introducing socket connection and TTFB timeouts?
They are included. But if we handle the socket connection and TTFB timeouts separately, we can retry a request that behaves unexpectedly much earlier than when it reaches the operation timeout.
IIUC, the socket connection timeout bounds the duration between sending the request and the server accepting the connection, and the TTFB timeout bounds the duration between sending the request and the first byte of returned data, which includes the object store agent handling the request and serving the first chunk.
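To make the nesting of the three limits concrete, here is a minimal sketch that only models timestamps and performs no real I/O. The `Timeouts` type, the `first_timeout` helper, and the example values of 1 s / 5 s / 60 s are hypothetical and not taken from the hummock code or the AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Timeouts:
    # All limits are measured from the moment the request is issued.
    connect: float     # request sent -> server accepts the connection
    first_byte: float  # request sent -> first byte of the response (TTFB)
    operation: float   # request sent -> whole operation finished


def first_timeout(
    limits: Timeouts,
    connected_at: Optional[float],
    first_byte_at: Optional[float],
    done_at: Optional[float],
) -> Optional[Tuple[str, float]]:
    """Return (limit_name, time_it_fires) for the earliest exceeded limit.

    Each *_at argument is seconds since the request was issued; None means
    the event never happened (the request hung before reaching it).
    Returns None when every phase finished within its limit.
    """
    inf = float("inf")
    phases = [
        ("connect", limits.connect, connected_at),
        ("first_byte", limits.first_byte, first_byte_at),
        ("operation", limits.operation, done_at),
    ]
    fired = [
        (limit, name)
        for name, limit, at in phases
        if (inf if at is None else at) > limit
    ]
    if not fired:
        return None
    limit, name = min(fired)  # the smallest limit is the one that trips first
    return name, limit


# Hypothetical example values: 1 s connect, 5 s TTFB, 60 s operation.
LIMITS = Timeouts(connect=1.0, first_byte=5.0, operation=60.0)
```

With these example values, a request that never connects is detected after 1 s instead of 60 s, which is what allows the retry to start earlier.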
IMO, the connection timeout and TTFB timeout may be useful for most reads, except for the compactor.
True. IIUC, setting the socket connection timeout and TTFB timeout is simple with the AWS SDK, so I think there is no overhead in doing so. I am not sure whether OpenDAL provides interfaces to set them, though.
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
Agreed. Since there is no plan to further refactor the AWS SDK, let me close the issue now.
Currently, the object store only retries on errors or after a long overall timeout. But an object store request can also hang for other reasons, so that the operation never finishes and blocks streaming. The object store needs a mechanism to set low-level timeout and retry configurations.
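As a rough illustration of the idea, the sketch below bounds each attempt with its own timeout and retries when an attempt hangs. It uses Python's `asyncio.wait_for` purely for demonstration. `call_with_retry`, `flaky_read`, and the numeric values are hypothetical names and settings, not the hummock or AWS SDK interface.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retry(
    op: Callable[[int], Awaitable[T]],
    attempt_timeout: float,
    max_attempts: int,
) -> T:
    """Run `op(attempt)` with a per-attempt timeout, retrying on timeout.

    A hung attempt is cancelled by `asyncio.wait_for` once `attempt_timeout`
    seconds have passed, and the next attempt starts immediately.
    """
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(op(attempt), timeout=attempt_timeout)
        except asyncio.TimeoutError as exc:
            last_exc = exc  # this attempt hung; fall through and retry
    raise TimeoutError(f"gave up after {max_attempts} attempts") from last_exc


async def flaky_read(attempt: int) -> bytes:
    """Hypothetical request: the first attempt hangs, later ones succeed."""
    if attempt == 1:
        await asyncio.sleep(3600)  # simulate a request that never returns
    return b"payload"


if __name__ == "__main__":
    # The first attempt is abandoned after 50 ms; the second one succeeds.
    print(asyncio.run(call_with_retry(flaky_read, 0.05, 3)))
```

Without the per-attempt limit, the first call in this example would block for the full simulated hang before any retry could happen.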
Because operations vary widely in type, a single timeout value cannot cover all of them. Each operation should have its own timeout, which includes:
In addition, lower-level timeout configurations are needed to guard against other exceptional cases: