bug(CREATE SINK INTO): set "STREAMING_PARALLELISM" cause error. #18668

MrTaozui · 2024-09-24T08:48:10Z

Describe the bug

Setting "STREAMING_PARALLELISM" causes an error.
Simple case:
CREATE TABLE product (
    id INT,
    product_name VARCHAR,
    product_color VARCHAR,
    PRIMARY KEY (id)
);
INSERT INTO product VALUES (1, 'p1', 'blue');
INSERT INTO product VALUES (2, 'p2', 'blue');
INSERT INTO product VALUES (3, 'p3', 'blue');

CREATE TABLE product_sink_target (
    id INT,
    product_name VARCHAR,
    product_color VARCHAR,
    PRIMARY KEY (id)
);

SET STREAMING_PARALLELISM = 2;

CREATE SINK product_sink INTO product_sink_target AS SELECT * FROM product;

error:
ERROR: Failed to run the query

Caused by these errors (recent errors listed first):
1: gRPC request to meta service failed: Internal error
2: merged RPC Error, in worker node 2
3: gRPC request to stream service failed: Internal error
4: actor 13 not found in info table

Then I restarted the meta nodes and retried without setting STREAMING_PARALLELISM. This time the statement succeeded.
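
For anyone reproducing this, the retry described above might look roughly like the sketch below. It assumes a fresh session opened after the meta nodes were restarted, so that STREAMING_PARALLELISM is still at its default. The SHOW and SELECT statements are additions for checking state and were not part of the original report.

```sql
-- Assumption: new session, STREAMING_PARALLELISM has not been set here.
-- Check the current value before creating the sink.
SHOW STREAMING_PARALLELISM;

-- Same statement as in the failing case, without the preceding SET.
CREATE SINK product_sink INTO product_sink_target AS SELECT * FROM product;

-- Check that the three rows from product reached the target table.
SELECT * FROM product_sink_target ORDER BY id;
```

If the three rows appear in product_sink_target, the sink was created and is delivering data, which matches the successful retry reported here.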

Error message/log

To Reproduce

apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave-datalab-cluster
  namespace: risingwave-operator-system
spec:
  metaStore:
    postgresql:
      enabled: true
      host: risingwave-mysql-oss-postgresql
      port: 5432
      database: risingwave
      credentials:
        secretName: risingwave-mysql-oss-postgresql
        usernameKeyRef: postgres-username
        passwordKeyRef: postgres-password

  stateStore:
    dataDirectory: datalab-hummock02
    minio:
      endpoint: risingwave-mysql-oss-minio:9000
      bucket: risingwave
      credentials:
        secretName: risingwave-mysql-oss-minio
        usernameKeyRef: root-user
        passwordKeyRef: root-password
  image: xxxxx/risingwave:v2.0.0
  components:
    meta:
      nodeGroups:
        - replicas: 2
          name: ""
          template:
            spec:
              volumes:
                - ephemeral:
                    volumeClaimTemplate:
                      metadata:
                        labels:
                          type: scratch-volume
                      spec:
                        accessModes:
                          - ReadWriteOnce
                        resources:
                          requests:
                            storage: 100Gi
                        storageClassName: datalab-alicloud-local-quota
                        volumeMode: Filesystem
                  name: heap
              volumeMounts:
                - mountPath: /heap
                  name: heap
                  subPathExpr: $(SERVER_POD_IP)/$(SERVER_POD_NAME)
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval:-1,lg_prof_sample:20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
                - name: SERVER_POD_IP
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: status.podIP
                - name: SERVER_POD_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.name
              resources:
                limits:
                  cpu: 1
                  memory: 4Gi
                requests:
                  cpu: 1
                  memory: 4Gi
              nodeSelector:
                batch-node-group: batch
    frontend:
      nodeGroups:
        - replicas: 2
          name: ""
          template:
            spec:
              resources:
                limits:
                  cpu: 1
                  memory: 8Gi
                requests:
                  cpu: 1
                  memory: 8Gi
              #nodeSelector:
              #  batch-node-group: batch
    compute:
      nodeGroups:
        - replicas: 3
          name: "streaming"
          template:
            spec:
              volumes:
                - ephemeral:
                    volumeClaimTemplate:
                      metadata:
                        labels:
                          type: scratch-volume
                      spec:
                        accessModes:
                          - ReadWriteOnce
                        resources:
                          requests:
                            storage: 100Gi
                        storageClassName: datalab-alicloud-local-quota
                        volumeMode: Filesystem
                  name: heap
              volumeMounts:
                - mountPath: /heap
                  name: heap
                  subPathExpr: $(SERVER_POD_IP)/$(SERVER_POD_NAME)
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval:-1,lg_prof_sample:20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
                - name: RW_COMPUTE_NODE_ROLE
                  value: streaming
                - name: SERVER_POD_IP
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: status.podIP
                - name: SERVER_POD_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.name
              resources:
                limits:
                  cpu: 1
                  memory: 8Gi # Memory limit will be set to RW_TOTAL_MEMORY_BYTES
                requests:
                  cpu: 1
                  memory: 8Gi
              #nodeSelector:
              #  batch-node-group: batch
        - replicas: 3
          name: "serving"
          template:
            spec:
              volumes:
                - ephemeral:
                    volumeClaimTemplate:
                      metadata:
                        labels:
                          type: scratch-volume
                      spec:
                        accessModes:
                          - ReadWriteOnce
                        resources:
                          requests:
                            storage: 100Gi
                        storageClassName: datalab-alicloud-local-quota
                        volumeMode: Filesystem
                  name: heap
              volumeMounts:
                - mountPath: /heap
                  name: heap
                  subPathExpr: $(SERVER_POD_IP)/$(SERVER_POD_NAME)
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval:-1,lg_prof_sample:20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
                - name: RW_COMPUTE_NODE_ROLE
                  value: serving
                - name: SERVER_POD_IP
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: status.podIP
                - name: SERVER_POD_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.name
              resources:
                limits:
                  cpu: 1
                  memory: 8Gi # Memory limit will be set to RW_TOTAL_MEMORY_BYTES
                requests:
                  cpu: 1
                  memory: 8Gi
              #nodeSelector:
              #  batch-node-group: batch
    compactor:
      nodeGroups:
        - replicas: 3
          name: ""
          template:
            spec:
              volumes:
                - ephemeral:
                    volumeClaimTemplate:
                      metadata:
                        labels:
                          type: scratch-volume
                      spec:
                        accessModes:
                          - ReadWriteOnce
                        resources:
                          requests:
                            storage: 100Gi
                        storageClassName: datalab-alicloud-local-quota
                        volumeMode: Filesystem
                  name: heap
              volumeMounts:
                - mountPath: /heap
                  name: heap
                  subPathExpr: $(SERVER_POD_IP)/$(SERVER_POD_NAME)
              env:
                - name: MALLOC_CONF
                  value: prof:true,lg_prof_interval:-1,lg_prof_sample:20,prof_prefix:/heap/
                - name: RW_HEAP_PROFILING_DIR
                  value: /heap
                - name: SERVER_POD_IP
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: status.podIP
                - name: SERVER_POD_NAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.name
              resources:
                limits:
                  cpu: 1
                  memory: 8Gi
                requests:
                  cpu: 1
                  memory: 8Gi
              #nodeSelector:
              #  batch-node-group: batch

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response


github-actions · 2024-12-16T02:11:39Z

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄

MrTaozui added the type/bug Something isn't working label Sep 24, 2024

github-actions bot added this to the release-2.1 milestone Sep 24, 2024

BugenZhao assigned shanicky Oct 9, 2024

shanicky modified the milestones: release-2.1, release-2.2 Oct 16, 2024

github-actions bot added the no-issue-activity label Dec 16, 2024
