Docker Compose fails to resume from suspend #17078

Closed
kwannoel opened this issue Jun 3, 2024 · 0 comments

kwannoel commented Jun 3, 2024

Describe the bug

After timing out, the container fails to resume, even after stopping and starting it again. There is currently no known way to recover the container other than recreating it from scratch, which loses the cluster data.

Error message/log

There's no clear error log, just the following messages repeating over and over until ~8:24 am:

2024-05-29 08:23:22 2024-05-29T12:23:22.026025381Z  WARN rw-main risingwave_stream::task::barrier_manager: control stream reset with error error=gRPC request failed: Internal error: end of stream
2024-05-29 08:23:22 2024-05-29T12:23:22.026041422Z  WARN rw-main risingwave_stream::task::barrier_manager: failed to notify finish of control stream
2024-05-29 08:23:24 2024-05-29T12:23:24.706507882Z  INFO rw-main bootstrap_recovery{prev_epoch=6536473939345408}:recovery_attempt: risingwave_meta::barrier::recovery: recovering mview progress
2024-05-29 08:23:24 2024-05-29T12:23:24.710513173Z  INFO rw-main bootstrap_recovery{prev_epoch=6536473939345408}:recovery_attempt: risingwave_meta::barrier::recovery: recovered mview progress
2024-05-29 08:23:24 2024-05-29T12:23:24.948064757Z  WARN rw-main 

The user has tried to restart the docker service. But it still fails.

The compute node was started at 8:22:

2024-05-29 08:22:42 2024-05-29T12:22:42.743734918Z  INFO rw-main risingwave_compute::server:
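
Since the report describes the same few messages repeating, a small tally over the captured log can make the loop easier to see. The following is a minimal sketch, assuming each line has the shape shown in the excerpts above (a local timestamp, a UTC timestamp, a level, the thread name, then the message). The names `LINE_RE`, `tally`, and `SAMPLE` are hypothetical helpers for illustration and are not part of RisingWave or Docker.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts in this report:
# "<local date> <local time> <UTC timestamp>Z  <LEVEL> <thread> <message>"
LINE_RE = re.compile(
    r"^(?P<local>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<utc>\S+Z)\s+"
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR)\s+"
    r"(?P<thread>\S+)\s*"
    r"(?P<rest>.*)$"
)


def tally(lines):
    """Count how often each (level, message) pair occurs, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Skip lines that do not fit the assumed shape, and lines whose
        # message is empty (for example, a truncated excerpt).
        if m is None or not m["rest"]:
            continue
        counts[(m["level"], m["rest"])] += 1
    return counts


# Lines copied from the report; the last one is truncated in the original.
SAMPLE = """
2024-05-29 08:23:22 2024-05-29T12:23:22.026025381Z  WARN rw-main risingwave_stream::task::barrier_manager: control stream reset with error error=gRPC request failed: Internal error: end of stream
2024-05-29 08:23:22 2024-05-29T12:23:22.026041422Z  WARN rw-main risingwave_stream::task::barrier_manager: failed to notify finish of control stream
2024-05-29 08:23:24 2024-05-29T12:23:24.706507882Z  INFO rw-main bootstrap_recovery{prev_epoch=6536473939345408}:recovery_attempt: risingwave_meta::barrier::recovery: recovering mview progress
2024-05-29 08:23:24 2024-05-29T12:23:24.710513173Z  INFO rw-main bootstrap_recovery{prev_epoch=6536473939345408}:recovery_attempt: risingwave_meta::barrier::recovery: recovered mview progress
2024-05-29 08:23:24 2024-05-29T12:23:24.948064757Z  WARN rw-main 
"""

if __name__ == "__main__":
    for (level, message), n in tally(SAMPLE.splitlines()).most_common():
        print(f"{n:>4}  {level:<5} {message}")
```

Run against the full container log rather than this excerpt, the most frequent entries would show which messages make up the repeating cycle.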

To Reproduce

  1. Run docker compose up to start the Docker container.
  2. Suspend the Docker process, e.g. by letting your computer sleep.
  3. After 8 hours, press Ctrl-C and then run docker-compose up to restart the container.

(I was unable to reproduce the issue with these steps; they were provided by the user.)

Expected behavior

At least allow the service to resume.

How did you deploy RisingWave?

No response

The version of RisingWave

v1.9.1-rc2

Additional context

docker-compose.yml is the docker compose file used.

@kwannoel added the type/bug label Jun 3, 2024
@kwannoel self-assigned this Jun 3, 2024
@github-actions bot added this to the release-1.10 milestone Jun 3, 2024
@kwannoel assigned yezizp2012 and unassigned kwannoel Jun 12, 2024
@yezizp2012 modified the milestones: release-2.0, release-2.1 Aug 19, 2024
@yezizp2012 modified the milestones: release-2.1, release-2.2 Oct 16, 2024
@yezizp2012 closed this as not planned Oct 18, 2024