sink to BigQuery error: Request 'AppendRows' from role 'cloud-dataengine-globalrouting' throttled: Task is overloaded (memory-protection) go/tr-t. #17214

Closed
BugenZhao opened this issue Jun 12, 2024 · 3 comments
BugenZhao (Member) commented Jun 12, 2024

A user frequently encountered the following error when backfilling historical data at high throughput and sinking it into downstream BigQuery:

Actor 114514 exited unexpectedly: Executor error: Sink error:
  BigQuery error:
    status: Unavailable,
    message: "Request 'AppendRows' from role 'cloud-dataengine-globalrouting' throttled:
      Task is overloaded (memory-protection) go/tr-t.",
    details: [],
    metadata: MetadataMap { headers: {} };

The error indicates that the request is being throttled by the external system. Instead of throwing an error and causing the actor to fail, we should retry the request.
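For illustration, here is a minimal sketch (not RisingWave's actual sink code) of classifying such a failure as retryable, based on the gRPC status code and the throttling hint in the message above; the helper name is hypothetical:

```rust
use tonic::{Code, Status};

/// Hypothetical helper: decide whether a failed `AppendRows` call hit a
/// transient condition (like the memory-protection throttling above) and
/// should be retried instead of failing the actor.
fn is_retryable_append_error(status: &Status) -> bool {
    match status.code() {
        // Transient server-side conditions: retry with backoff.
        Code::Unavailable | Code::ResourceExhausted | Code::Aborted => true,
        // Some throttling errors surface as `Internal`; check the message.
        Code::Internal => status.message().contains("throttled"),
        _ => false,
    }
}
```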

After enabling sink_decouple, the error is retried by the log store writer and no longer triggers cluster recovery. However, the effective write throughput turns out to be quite limited: the user has to frequently rescale, or pause and resume the cluster, to increase the throughput, which is unexpected.


Questions:

  1. Shall we have connector-specific retry logic for such throttling errors? (See the sketch after this list for one possible shape.)
  2. Why does pausing and resuming the cluster increase the throughput?
  3. There seems to be no stress test that reflects the production workload. Shall we improve this? The same applies to other connectors.
  4. Based on the documentation of the BigQuery Write API:
    • Shall we adopt the "pending type" stream instead of the default stream (to potentially improve performance)?
    • What can we do to minimize the possibility of hitting the quota / rate limit?
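For question 1, a minimal sketch of what connector-specific retrying could look like, assuming a hypothetical `append_rows` closure that performs one AppendRows RPC and reusing the `is_retryable_append_error` helper sketched above; the backoff parameters are illustrative, not tuned values:

```rust
use std::time::Duration;
use tonic::Status;

const MAX_RETRIES: u32 = 5;
const BASE_DELAY: Duration = Duration::from_millis(500);

/// Retry a single AppendRows request with exponential backoff on transient errors.
async fn append_rows_with_retry<F, Fut>(mut append_rows: F) -> Result<(), Status>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<(), Status>>,
{
    let mut attempt = 0;
    loop {
        match append_rows().await {
            Ok(()) => return Ok(()),
            Err(status) if attempt < MAX_RETRIES && is_retryable_append_error(&status) => {
                // Exponential backoff: 0.5s, 1s, 2s, 4s, 8s.
                tokio::time::sleep(BASE_DELAY * 2u32.pow(attempt)).await;
                attempt += 1;
            }
            Err(status) => return Err(status),
        }
    }
}
```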
fuyufjh (Member) commented Jun 12, 2024

Perhaps our write throughput/frequency is too high for these OLAP systems. I think we may want to always use sink_decouple and batch writes to reduce the load on the target system.

Also +1 for retrying on these transient errors.
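To make the batching idea above concrete, a minimal sketch (the structure and thresholds are illustrative assumptions, not the actual sink implementation): buffer serialized rows and only issue an AppendRows request once a row-count or byte-size threshold is reached, which reduces request frequency to BigQuery.

```rust
/// Illustrative row batcher: accumulate serialized rows and flush them as one
/// AppendRows request once a size threshold is reached.
struct AppendBatcher {
    rows: Vec<Vec<u8>>, // serialized rows waiting to be sent
    bytes: usize,
    max_rows: usize,
    max_bytes: usize,
}

impl AppendBatcher {
    fn new(max_rows: usize, max_bytes: usize) -> Self {
        Self { rows: Vec::new(), bytes: 0, max_rows, max_bytes }
    }

    /// Buffer one row; returns a full batch to send when a threshold is hit.
    fn push(&mut self, row: Vec<u8>) -> Option<Vec<Vec<u8>>> {
        self.bytes += row.len();
        self.rows.push(row);
        if self.rows.len() >= self.max_rows || self.bytes >= self.max_bytes {
            self.bytes = 0;
            Some(std::mem::take(&mut self.rows))
        } else {
            None
        }
    }
}
```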

xxhZs (Contributor) commented Jun 12, 2024

So it seems that implementing decoupled commit for BigQuery can solve this problem?

xxhZs (Contributor) commented Jun 19, 2024

Retry added in #17237.

xxhZs closed this as completed Jun 19, 2024