Bug: cannot create table with cdc backfill #12657

Closed
StrikeW opened this issue Oct 7, 2023 · 3 comments
Labels
component/connector type/bug Something isn't working

StrikeW commented Oct 7, 2023

Describe the bug

A user submits the following query and gets the error: internal error: Rpc error: gRPC error (Unknown error): transport error

set cdc_backfill='true';

CREATE TABLE business_table (
...
)

The table is small.
The compute node (CN) log shows nothing unusual.
The most relevant part of the meta log is the following:

2023-10-07T08:33:48.606245716Z  INFO risingwave_meta::stream::source_manager: spawning new watcher for source 29053
2023-10-07T08:33:48.795103872Z  INFO risingwave_meta::stream::source_manager: new discovered splits fragment_id=28022 new_discovered_splits={"29053"}
2023-10-07T08:33:50.029924504Z  WARN risingwave_meta::barrier: Failed to complete epoch 5205683641581568: Rpc error: gRPC error (Unknown error): transport error
  backtrace of `MetaError`:
   0: <risingwave_meta::error::MetaError as core::convert::From<risingwave_meta::error::MetaErrorInner>>::from
             at ./risingwave/src/meta/src/error.rs:88:33
   1: <T as core::convert::Into<U>>::into
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/convert/mod.rs:716:9
   2: <risingwave_meta::error::MetaError as core::convert::From<risingwave_rpc_client::error::RpcError>>::from
             at ./risingwave/src/meta/src/error.rs:181:37
   3: <T as core::convert::Into<U>>::into
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/convert/mod.rs:716:9
   4: core::ops::function::FnOnce::call_once
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/ops/function.rs:250:5
   5: core::result::Result<T,E>::map_err
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/result.rs:829:27
   6: risingwave_meta::barrier::GlobalBarrierManager::collect_barrier::{{closure}}
             at ./risingwave/src/meta/src/barrier/mod.rs:855:22
   7: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
   8: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:334:17
   9: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/loom/std/unsafe_cell.rs:16:9
  10: tokio::runtime::task::core::Core<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:323:13
  11: tokio::runtime::task::harness::poll_future::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:485:19
  12: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/panic/unwind_safe.rs:271:9
  13: std::panicking::try::do_call
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/panicking.rs:526:40
  14: std::panicking::try
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/panicking.rs:490:19
  15: std::panic::catch_unwind
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/panic.rs:142:14
  16: tokio::runtime::task::harness::poll_future
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:473:18
  17: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:208:27
  18: tokio::runtime::task::harness::Harness<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:153:15
  19: tokio::runtime::task::raw::RawTask::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/raw.rs:200:18
  20: tokio::runtime::task::LocalNotified<S>::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/mod.rs:400:9
  21: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:639:22
  22: tokio::runtime::coop::with_budget
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/coop.rs:107:5
  23: tokio::runtime::coop::budget
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/coop.rs:73:5
  24: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:575:9
  25: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:538:24
  26: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:491:21
  27: tokio::runtime::context::scoped::Scoped<T>::set
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/context/scoped.rs:40:9
  28: tokio::runtime::context::set_scheduler::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/context.rs:176:26
  29: std::thread::local::LocalKey<T>::try_with
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/thread/local.rs:270:16
  30: std::thread::local::LocalKey<T>::with
             at ./rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/thread/local.rs:246:9
  31: tokio::runtime::context::set_scheduler
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/context.rs:176:17

It seems that the compute node crashes when the cluster handles the CREATE TABLE request.
However, we don't have the cluster's memory and CPU usage data right now.

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

K8S

The version of RisingWave

RisingWave-1.3.0-alpha (5ab1f7a)

Additional context

MySQL version:5.7.33-log

StrikeW added the type/bug (Something isn't working) label Oct 7, 2023
StrikeW self-assigned this Oct 7, 2023
github-actions bot added this to the release-1.3 milestone Oct 7, 2023

StrikeW commented Oct 9, 2023

I checked the customer's logs: the CN doesn't get launched while the CREATE TABLE request is being processed.

fuyufjh modified the milestones: release-1.3, release-1.4 Oct 10, 2023

StrikeW commented Oct 10, 2023

Closing, since the CN doesn't get launched while the CREATE TABLE request is being processed. There is no evidence of a bug in the connector.

StrikeW closed this as not planned Oct 10, 2023

StrikeW commented Oct 17, 2023

Update: the customer probably didn't configure a memory limit for the compute node in K8s, so the compute node ran out of memory (OOM).
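
For anyone hitting a similar "transport error", here is a minimal sketch of how one might check for this condition. It assumes the compute node's pod description has been exported as JSON (for example with `kubectl get pod <pod-name> -o json`) and loaded into a Python dict. The function names and the sample pod below are hypothetical and made up for illustration; they are not taken from the customer's cluster or from RisingWave itself. Only the standard Kubernetes pod fields `spec.containers[].resources.limits.memory` and `status.containerStatuses[].lastState.terminated.reason` are relied on.

```python
from typing import List


def containers_without_memory_limit(pod: dict) -> List[str]:
    """Return the names of containers that declare no resources.limits.memory."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        limits = (container.get("resources") or {}).get("limits") or {}
        if "memory" not in limits:
            missing.append(container.get("name", "<unnamed>"))
    return missing


def oom_killed_containers(pod: dict) -> List[str]:
    """Return the names of containers whose last termination reason was OOMKilled."""
    killed = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = (status.get("lastState") or {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            killed.append(status.get("name", "<unnamed>"))
    return killed


# Hypothetical pod description, hand-written for illustration only.
SAMPLE_POD = {
    "spec": {
        "containers": [
            # No "limits" entry: this container has no memory limit.
            {"name": "compute-node", "resources": {"requests": {"memory": "4Gi"}}},
            {"name": "sidecar", "resources": {"limits": {"memory": "256Mi"}}},
        ]
    },
    "status": {
        "containerStatuses": [
            {"name": "compute-node",
             "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
            {"name": "sidecar", "lastState": {}},
        ]
    },
}

print("no memory limit:", containers_without_memory_limit(SAMPLE_POD))
print("last OOMKilled: ", oom_killed_containers(SAMPLE_POD))
```

If the compute node's container shows up in both lists, that is consistent with the explanation above, although it does not prove it. An OOM kill by the node itself may not always be recorded as `OOMKilled` on the container status.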
