Task fails during private set intersection because the master cannot schedule the container #463

Open
lyh251543 opened this issue Nov 28, 2024 · 9 comments

@lyh251543

Issue Type

Others

Search for existing issues similar to yours

Yes

Kuscia Version

0.12.0b0

Link to Relevant Documentation

No response

Question Details

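Self-hosted virtual machine on ESXi
OS: CentOS 7.9
Specs: 8C16G, 300G data disk
Software version: secretpad-all-in-one/v1.10.0b1
Installation method: centralized (./install master, default installation)
Problem:
When creating a training flow in the "Joint Audience Selection" (联合圈人) project, the private set intersection step fails with "2024-11-27 16:27:59 INFO the jobId=ifuy, taskId=ifuy-vmdjadws-node-3 start ...
2024-11-27 16:28:01 INFO the jobId=ifuy, taskId=ifuy-vmdjadws-node-3 failed: party alice failed msg: "
Log investigation:
Master container: inspecting the container scheduled for the task shows CreatecontainerError. The reported reason is "Warning  FailedScheduling   51s                kuscia-scheduler  0/3 nodes are available: waiting for task resource. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling., can not find related task resource." I read this as the nodes being short of resources, yet the system still shows 7G of free memory.
Question:
Resources look sufficient, so why can the container not be scheduled? In addition, kubectl get svc -A shows that the svc entries for bob and alice have no CLUSTER-IP, and fetching the task container log fails: bash-5.2# kubectl logs -f -n alice       ifuy-vmdjadws-node-3-0
Error from server: Get "https://172.18.0.3:10250/containerLogs/alice/ifuy-vmdjadws-node-3-0/secretflow?follow=true": proxy error from 0.0.0.0:6443 while dialing 172.18.0.3:10250, code 502: 502 Bad Gateway
This also points to a communication failure.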
@lyh251543
Author

Full logs
[root@kube02 secretflow-allinone-package]# free -g
total used free shared buff/cache available
Mem: 125 57 32 4 35 62
Swap: 0 0 0
[root@kube02 secretflow-allinone-package]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 100G 9.5G 91G 10% /
devtmpfs 63G 0 63G 0% /dev
tmpfs 63G 0 63G 0% /dev/shm
tmpfs 63G 4.1G 59G 7% /run
tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sda1 1014M 179M 836M 18% /boot
/dev/mapper/centos-home 50G 974M 50G 2% /home
/dev/mapper/centos-data 962G 35G 928G 4% /data
tmpfs 13G 4.0K 13G 1% /run/user/42
tmpfs 13G 64K 13G 1% /run/user/0
overlay 962G 35G 928G 4% /data/docker/overlay2/56c2e8cae685c8566556c795d2acad4c76c26f733932feb2fcbea24cbb084ed9/merged
overlay 962G 35G 928G 4% /data/docker/overlay2/aa389a73d721a9da56c7dc2dea899a806d9b1a233f226aa5071b472808879d8b/merged
overlay 962G 35G 928G 4% /data/docker/overlay2/4707f014efeb5745c67e579e461ce4df20897e802e029eb62d54ab1304c0ff41/merged
overlay 962G 35G 928G 4% /data/docker/overlay2/caab4856276e04f7ed9ad79aaff267f607df59d7bfbdb246835b774c4ffcb973/merged
overlay 962G 35G 928G 4% /data/docker/overlay2/0a2b893697be57f1d18e800d1ea3330df23efd34e094072ce3b4e4adc0ac80d0/merged
[root@kube02 secretflow-allinone-package]#
[root@kube02 secretflow-allinone-package]#
[root@kube02 secretflow-allinone-package]#
[root@kube02 secretflow-allinone-package]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b112b9681963 secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretpad:0.11.0b0 "/bin/sh -c 'java ${…" 4 minutes ago Up 4 minutes 80/tcp, 9001/tcp, 0.0.0.0:8080->8080/tcp root-kuscia-master-secretpad
fd2d57ea0ad1 secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.12.0b0 "tini -- bin/kuscia …" 12 minutes ago Up 12 minutes 0.0.0.0:43081->80/tcp, 0.0.0.0:48080->1080/tcp, 0.0.0.0:48082->8082/tcp, 0.0.0.0:48083->8083/tcp, 0.0.0.0:43084->9091/tcp root-kuscia-lite-tee
a23b5b03f07f secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.12.0b0 "tini -- bin/kuscia …" 16 minutes ago Up 16 minutes 0.0.0.0:33081->80/tcp, 0.0.0.0:38080->1080/tcp, 0.0.0.0:38082->8082/tcp, 0.0.0.0:38083->8083/tcp, 0.0.0.0:33084->9091/tcp root-kuscia-lite-bob
0bac030ed226 secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.12.0b0 "tini -- bin/kuscia …" 20 minutes ago Up 20 minutes 0.0.0.0:23081->80/tcp, 0.0.0.0:28080->1080/tcp, 0.0.0.0:28082->8082/tcp, 0.0.0.0:28083->8083/tcp, 0.0.0.0:23084->9091/tcp root-kuscia-lite-alice
2e3f5dbfe87b secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia:0.12.0b0 "tini -- bin/kuscia …" 21 minutes ago Up 21 minutes 0.0.0.0:13081->80/tcp, 0.0.0.0:18080->1080/tcp, 0.0.0.0:18082->8082/tcp, 0.0.0.0:18083->8083/tcp, 0.0.0.0:13084->9091/tcp root-kuscia-master
[root@kube02 secretflow-allinone-package]# docker exec -ti 2e3f5dbfe87b bash
bash-5.2# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
alice dataproxy-alice-597968f9cf-jdfrr 1/1 Running 0 17m
bob dataproxy-bob-6c65dc5966-pbgfh 1/1 Running 0 13m
tee dataproxy-tee-5fffbb6d9-c7gsb 1/1 Running 0 9m47s
tee capsule-manager-sim-66bdfb5587-2xbpd 1/1 Running 0 7m26s
alice lnkl-ngvmfuhy-node-3-0 0/1 ContainerCreating 0 39s
bob lnkl-ngvmfuhy-node-3-0 0/1 ContainerCreating 0 39s
bash-5.2# kubectl describe po -n alice lnkl-ngvmfuhy-node-3-0
Name: lnkl-ngvmfuhy-node-3-0
Namespace: alice
Priority: 0
Service Account: default
Node: root-kuscia-lite-alice-kube02/172.18.0.3
Start Time: Thu, 28 Nov 2024 09:45:23 +0800
Labels: kuscia.secretflow/communication-role-client=true
kuscia.secretflow/communication-role-server=true
kuscia.secretflow/controller=kusciatask
kuscia.secretflow/pod-identity=6762b6d8-fe60-4a64-ab3c-495587be03ce-0
kuscia.secretflow/pod-role=
kuscia.secretflow/task-resource-group-uid=61a5b0bb-2dc5-4250-beca-250c0596eac8
kuscia.secretflow/task-resource-uid=d5233c68-1290-4f57-b9b6-bd40db7ef4de
kuscia.secretflow/task-uid=6762b6d8-fe60-4a64-ab3c-495587be03ce
Annotations: kuscia.secretflow/config-template-value-cm-name: lnkl-ngvmfuhy-node-3-kuscia-gen-conf
kuscia.secretflow/config-template-volumes: config-template
kuscia.secretflow/initiator: bob
kuscia.secretflow/task-id: lnkl-ngvmfuhy-node-3
kuscia.secretflow/task-resource: lnkl-ngvmfuhy-node-3-66257f7797d6
kuscia.secretflow/task-resource-group: lnkl-ngvmfuhy-node-3
kuscia.secretflow/taskresource-reserving-timestamp: 2024-11-28T09:45:23+08:00
Status: Pending
IP:
IPs:
Containers:
secretflow:
Container ID:
Image: secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.10.0b1
Image ID:
Ports: 22645/TCP, 22646/TCP, 22647/TCP, 22641/TCP, 22642/TCP, 22643/TCP, 22644/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
sh
Args:
-c
python -m secretflow.kuscia.entry ./kuscia/task-config.conf
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
KUSCIA_PORT_SPU_NUMBER: 22645
KUSCIA_PORT_FED_NUMBER: 22646
KUSCIA_PORT_GLOBAL_NUMBER: 22647
KUSCIA_PORT_NODE_MANAGER_NUMBER: 22641
KUSCIA_PORT_OBJECT_MANAGER_NUMBER: 22642
KUSCIA_PORT_CLIENT_SERVER_NUMBER: 22643
KUSCIA_PORT_INFERENCE_NUMBER: 22644
Mounts:
./kuscia/task-config.conf from config-template (rw,path="task-config.conf")
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-template:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: lnkl-ngvmfuhy-node-3-configtemplate
Optional: false
QoS Class: BestEffort
Node-Selectors: kuscia.secretflow/namespace=alice
Tolerations: kuscia.secretflow/agent:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning FailedScheduling 62s kuscia-scheduler 0/3 nodes are available: waiting for task resource. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling., can not find related task resource.
Normal Scheduled 60s kuscia-scheduler Successfully assigned alice/lnkl-ngvmfuhy-node-3-0 to root-kuscia-lite-alice-kube02
Warning MissingClusterDNS 60s (x2 over 61s) Agent pod: "lnkl-ngvmfuhy-node-3-0_alice(a90b8693-924a-4c6b-bacf-09de99233e30)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.
Normal Pulled 60s Agent Container image "secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.10.0b1" already present on machine
bash-5.2# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
alice dataproxy-alice-597968f9cf-jdfrr 1/1 Running 0 19m
bob dataproxy-bob-6c65dc5966-pbgfh 1/1 Running 0 15m
tee dataproxy-tee-5fffbb6d9-c7gsb 1/1 Running 0 11m
tee capsule-manager-sim-66bdfb5587-2xbpd 1/1 Running 0 8m44s
bob lnkl-ngvmfuhy-node-3-0 0/1 Error 0 117s
bash-5.2# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.43.0.1 443/TCP 22m
alice dataproxy-grpc ClusterIP None 31675/TCP 19m
bob dataproxy-grpc ClusterIP None 21449/TCP 15m
tee dataproxy-grpc ClusterIP None 23076/TCP 11m
tee capsule-manager ClusterIP 10.43.3.173 8888/TCP 8m58s
kuscia-system secretpad ExternalName root-kuscia-master-secretpad 9001/TCP 6m6s
bob lnkl-ngvmfuhy-node-3-0-inference ClusterIP None 25426/TCP 2m11s
bob lnkl-ngvmfuhy-node-3-0-spu ClusterIP None 25427/TCP 2m11s
bob lnkl-ngvmfuhy-node-3-0-global ClusterIP None 25429/TCP 2m11s
bob lnkl-ngvmfuhy-node-3-0-fed ClusterIP None 25428/TCP 2m11s
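
A CLUSTER-IP of None in a listing like the one above marks a headless Service in Kubernetes, so a missing cluster IP on its own does not show that anything failed. The sketch below is only an illustration: it assumes the flattened `kubectl get svc -A` text is pasted as-is, and the function name `headless_services` is hypothetical. It picks out the headless rows:

```python
def headless_services(table: str) -> list[tuple[str, str]]:
    """Return (namespace, name) for rows whose CLUSTER-IP column is 'None'."""
    rows = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: NAMESPACE NAME TYPE CLUSTER-IP ... (EXTERNAL-IP may be blank)
        if len(fields) >= 4 and fields[2] == "ClusterIP" and fields[3] == "None":
            rows.append((fields[0], fields[1]))
    return rows


# Rows copied from the output above.
SVC_OUTPUT = """\
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.43.0.1 443/TCP 22m
alice dataproxy-grpc ClusterIP None 31675/TCP 19m
bob dataproxy-grpc ClusterIP None 21449/TCP 15m
tee capsule-manager ClusterIP 10.43.3.173 8888/TCP 8m58s
"""

for namespace, name in headless_services(SVC_OUTPUT):
    print(f"{namespace}/{name} is headless (no cluster IP allocated)")
```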

@lyh251543
Author

bash-5.2# kubectl logs -f -n alice lttb-ymgvyxns-node-3-0
Error from server: Get "https://172.18.0.3:10250/containerLogs/alice/lttb-ymgvyxns-node-3-0/secretflow?follow=true": proxy error from 0.0.0.0:6443 while dialing 172.18.0.3:10250, code 502: 502 Bad Gatewa

@zimu-yuxi

Please run kubectl get kj <job-name> -n cross-domain -oyaml and check the output.

@lyh251543
Author

bash-5.2# kubectl get kj -A
NAMESPACE NAME STARTTIME COMPLETIONTIME LASTRECONCILETIME PHASE
cross-domain lnkl 5h46m 5h44m 5h44m Failed
cross-domain ntdf 5h12m 5h12m 5h12m Failed
cross-domain lttb 4h52m 4h52m 4h52m Failed
cross-domain kbzt 156m 155m 155m Failed
bash-5.2# kubectl get kj kbzt -n cross-domain -oyaml
apiVersion: kuscia.secretflow/v1alpha1
kind: KusciaJob
metadata:
  annotations:
    kuscia.secretflow/initiator: hmxibzri
    kuscia.secretflow/interconn-self-parties: hmxibzri_reuhujft
    kuscia.secretflow/self-cluster-as-initiator: "true"
  creationTimestamp: "2024-11-28T04:55:06Z"
  generation: 1
  name: kbzt
  namespace: cross-domain
  resourceVersion: "49633"
  uid: b912cc93-12a5-4608-9b1c-9cfc0abb6a2a
spec:
  initiator: hmxibzri
  maxParallelism: 1
  scheduleMode: BestEffort
  tasks:
  - alias: kbzt-jgynegvt-node-3
    appImage: secretflow-image
    parties:
    - domainID: hmxibzri
    - domainID: reuhujft
    taskID: kbzt-jgynegvt-node-3
    taskInputConfig: |-
      {
      "sf_datasource_config": {
      "hmxibzri": {
      "id": "default-data-source"
      },
      "reuhujft": {
      "id": "default-data-source"
      }
      },
      "sf_cluster_desc": {
      "parties": ["hmxibzri", "reuhujft"],
      "devices": [{
      "name": "spu",
      "type": "spu",
      "parties": ["hmxibzri", "reuhujft"],
      "config": "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"
      }, {
      "name": "heu",
      "type": "heu",
      "parties": ["hmxibzri", "reuhujft"],
      "config": "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}"
      }],
      "ray_fed_config": {
      "cross_silo_comm_backend": "brpc_link"
      }
      },
      "sf_node_eval_param": {
      "domain": "data_prep",
      "name": "psi",
      "version": "0.0.8",
      "attr_paths": ["input/input_table_1/key", "input/input_table_2/key", "protocol", "sort_result", "allow_empty_result", "allow_duplicate_keys", "allow_duplicate_keys/no/skip_duplicates_check", "allow_duplicate_keys/no/receiver_parties", "ecdh_curve"],
      "attrs": [{
      "is_na": false,
      "ss": ["id2"]
      }, {
      "is_na": false,
      "ss": ["id1"]
      }, {
      "is_na": false,
      "s": "PROTOCOL_RR22"
      }, {
      "b": true,
      "is_na": false
      }, {
      "is_na": true
      }, {
      "is_na": false,
      "s": "no"
      }, {
      "is_na": true
      }, {
      "is_na": false,
      "ss": ["hmxibzri", "reuhujft"]
      }, {
      "is_na": false,
      "s": "CURVE_FOURQ"
      }],
      "inputs": [{
      "type": "sf.table.individual",
      "meta": {
      "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable",
      "line_count": "-1"
      },
      "data_refs": [{
      "uri": "bob_1417007189.csv",
      "party": "hmxibzri",
      "format": "csv"
      }]
      }, {
      "type": "sf.table.individual",
      "meta": {
      "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable",
      "line_count": "-1"
      },
      "data_refs": [{
      "uri": "alice_1278350067.csv",
      "party": "reuhujft",
      "format": "csv"
      }]
      }],
      "checkpoint_uri": "ckkbzt-jgynegvt-node-3-output-0"
      },
      "sf_output_uris": ["kbzt_jgynegvt_node_3_output_0"],
      "sf_input_ids": ["dzlgvyli", "eybbajfo"],
      "sf_input_partitions_spec": ["", ""],
      "sf_output_ids": ["kbzt-jgynegvt-node-3-output-0"],
      "table_attrs": [{
      "table_id": "dzlgvyli",
      "column_attrs": [{
      "col_name": "id2",
      "col_type": "label"
      }, {
      "col_name": "contact_cellular",
      "col_type": "feature"
      }, {
      "col_name": "contact_telephone",
      "col_type": "feature"
      }, {
      "col_name": "contact_unknown",
      "col_type": "feature"
      }, {
      "col_name": "month_apr",
      "col_type": "feature"
      }, {
      "col_name": "month_aug",
      "col_type": "feature"
      }, {
      "col_name": "month_dec",
      "col_type": "feature"
      }, {
      "col_name": "month_feb",
      "col_type": "feature"
      }, {
      "col_name": "month_jan",
      "col_type": "feature"
      }, {
      "col_name": "month_jul",
      "col_type": "feature"
      }, {
      "col_name": "month_jun",
      "col_type": "feature"
      }, {
      "col_name": "month_mar",
      "col_type": "feature"
      }, {
      "col_name": "month_may",
      "col_type": "feature"
      }, {
      "col_name": "month_nov",
      "col_type": "feature"
      }, {
      "col_name": "month_oct",
      "col_type": "feature"
      }, {
      "col_name": "month_sep",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_failure",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_other",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_success",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_unknown",
      "col_type": "feature"
      }, {
      "col_name": "y",
      "col_type": "feature"
      }]
      }, {
      "table_id": "eybbajfo",
      "column_attrs": [{
      "col_name": "id1",
      "col_type": "label"
      }, {
      "col_name": "age",
      "col_type": "feature"
      }, {
      "col_name": "education",
      "col_type": "feature"
      }, {
      "col_name": "default",
      "col_type": "feature"
      }, {
      "col_name": "balance",
      "col_type": "feature"
      }, {
      "col_name": "housing",
      "col_type": "feature"
      }, {
      "col_name": "loan",
      "col_type": "feature"
      }, {
      "col_name": "day",
      "col_type": "feature"
      }, {
      "col_name": "duration",
      "col_type": "feature"
      }, {
      "col_name": "campaign",
      "col_type": "feature"
      }, {
      "col_name": "pdays",
      "col_type": "feature"
      }, {
      "col_name": "previous",
      "col_type": "feature"
      }, {
      "col_name": "job_blue-collar",
      "col_type": "feature"
      }, {
      "col_name": "job_entrepreneur",
      "col_type": "feature"
      }, {
      "col_name": "job_housemaid",
      "col_type": "feature"
      }, {
      "col_name": "job_management",
      "col_type": "feature"
      }, {
      "col_name": "job_retired",
      "col_type": "feature"
      }, {
      "col_name": "job_self-employed",
      "col_type": "feature"
      }, {
      "col_name": "job_services",
      "col_type": "feature"
      }, {
      "col_name": "job_student",
      "col_type": "feature"
      }, {
      "col_name": "job_technician",
      "col_type": "feature"
      }, {
      "col_name": "job_unemployed",
      "col_type": "feature"
      }, {
      "col_name": "marital_divorced",
      "col_type": "feature"
      }, {
      "col_name": "marital_married",
      "col_type": "feature"
      }, {
      "col_name": "marital_single",
      "col_type": "feature"
      }]
      }]
      }
    tolerable: false
  - alias: kbzt-jgynegvt-node-4
    appImage: secretflow-image
    dependencies:
    - kbzt-jgynegvt-node-3
    parties:
    - domainID: hmxibzri
    - domainID: reuhujft
    taskID: kbzt-jgynegvt-node-4
    taskInputConfig: |-
      {
      "sf_datasource_config": {
      "hmxibzri": {
      "id": "default-data-source"
      },
      "reuhujft": {
      "id": "default-data-source"
      }
      },
      "sf_cluster_desc": {
      "parties": ["hmxibzri", "reuhujft"],
      "devices": [{
      "name": "spu",
      "type": "spu",
      "parties": ["hmxibzri", "reuhujft"],
      "config": "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"
      }, {
      "name": "heu",
      "type": "heu",
      "parties": ["hmxibzri", "reuhujft"],
      "config": "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}"
      }],
      "ray_fed_config": {
      "cross_silo_comm_backend": "brpc_link"
      }
      },
      "sf_node_eval_param": {
      "domain": "stats",
      "name": "table_statistics",
      "version": "1.0.0",
      "attr_paths": ["input/input_ds/features"],
      "attrs": [{
      "is_na": false,
      "ss": ["age", "education", "default"]
      }],
      "checkpoint_uri": "ckkbzt-jgynegvt-node-4-output-0"
      },
      "sf_output_uris": ["kbzt_jgynegvt_node_4_output_0"],
      "sf_input_ids": ["kbzt-jgynegvt-node-3-output-0"],
      "sf_input_partitions_spec": [""],
      "sf_output_ids": ["kbzt-jgynegvt-node-4-output-0"],
      "table_attrs": [{
      "table_id": "dzlgvyli",
      "column_attrs": [{
      "col_name": "id2",
      "col_type": "label"
      }, {
      "col_name": "contact_cellular",
      "col_type": "feature"
      }, {
      "col_name": "contact_telephone",
      "col_type": "feature"
      }, {
      "col_name": "contact_unknown",
      "col_type": "feature"
      }, {
      "col_name": "month_apr",
      "col_type": "feature"
      }, {
      "col_name": "month_aug",
      "col_type": "feature"
      }, {
      "col_name": "month_dec",
      "col_type": "feature"
      }, {
      "col_name": "month_feb",
      "col_type": "feature"
      }, {
      "col_name": "month_jan",
      "col_type": "feature"
      }, {
      "col_name": "month_jul",
      "col_type": "feature"
      }, {
      "col_name": "month_jun",
      "col_type": "feature"
      }, {
      "col_name": "month_mar",
      "col_type": "feature"
      }, {
      "col_name": "month_may",
      "col_type": "feature"
      }, {
      "col_name": "month_nov",
      "col_type": "feature"
      }, {
      "col_name": "month_oct",
      "col_type": "feature"
      }, {
      "col_name": "month_sep",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_failure",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_other",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_success",
      "col_type": "feature"
      }, {
      "col_name": "poutcome_unknown",
      "col_type": "feature"
      }, {
      "col_name": "y",
      "col_type": "feature"
      }]
      }, {
      "table_id": "eybbajfo",
      "column_attrs": [{
      "col_name": "id1",
      "col_type": "label"
      }, {
      "col_name": "age",
      "col_type": "feature"
      }, {
      "col_name": "education",
      "col_type": "feature"
      }, {
      "col_name": "default",
      "col_type": "feature"
      }, {
      "col_name": "balance",
      "col_type": "feature"
      }, {
      "col_name": "housing",
      "col_type": "feature"
      }, {
      "col_name": "loan",
      "col_type": "feature"
      }, {
      "col_name": "day",
      "col_type": "feature"
      }, {
      "col_name": "duration",
      "col_type": "feature"
      }, {
      "col_name": "campaign",
      "col_type": "feature"
      }, {
      "col_name": "pdays",
      "col_type": "feature"
      }, {
      "col_name": "previous",
      "col_type": "feature"
      }, {
      "col_name": "job_blue-collar",
      "col_type": "feature"
      }, {
      "col_name": "job_entrepreneur",
      "col_type": "feature"
      }, {
      "col_name": "job_housemaid",
      "col_type": "feature"
      }, {
      "col_name": "job_management",
      "col_type": "feature"
      }, {
      "col_name": "job_retired",
      "col_type": "feature"
      }, {
      "col_name": "job_self-employed",
      "col_type": "feature"
      }, {
      "col_name": "job_services",
      "col_type": "feature"
      }, {
      "col_name": "job_student",
      "col_type": "feature"
      }, {
      "col_name": "job_technician",
      "col_type": "feature"
      }, {
      "col_name": "job_unemployed",
      "col_type": "feature"
      }, {
      "col_name": "marital_divorced",
      "col_type": "feature"
      }, {
      "col_name": "marital_married",
      "col_type": "feature"
      }, {
      "col_name": "marital_single",
      "col_type": "feature"
      }]
      }]
      }
    tolerable: false
status:
  approveStatus:
    hmxibzri: JobAccepted
    reuhujft: JobAccepted
  completionTime: "2024-11-28T04:56:03Z"
  conditions:
  - lastTransitionTime: "2024-11-28T04:55:06Z"
    status: "True"
    type: JobValidated
  lastReconcileTime: "2024-11-28T04:56:03Z"
  phase: Failed
  stageStatus:
    hmxibzri: JobCreateStageSucceeded
    reuhujft: JobCreateStageSucceeded
  startTime: "2024-11-28T04:55:06Z"
  taskStatus:
    kbzt-jgynegvt-node-3: Failed

@lyh251543
Author

bash-5.2# kubectl describe node root-kuscia-lite-bob-kube02
Name: root-kuscia-lite-bob-kube02
Roles: agent
Labels: beta.kubernetes.io/arch=x86_64
beta.kubernetes.io/os=linux
domain=bob
kubernetes.io/apiVersion=0.26.6
kubernetes.io/arch=x86_64
kubernetes.io/hostname=root-kuscia-lite-bob-kube02
kubernetes.io/os=linux
kubernetes.io/role=agent
kuscia.secretflow/namespace=bob
kuscia.secretflow/runtime=runc
Annotations: node.alpha.kubernetes.io/ttl: 0
CreationTimestamp: Thu, 28 Nov 2024 09:29:20 +0800
Taints: kuscia.secretflow/agent=v1:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: root-kuscia-lite-bob-kube02
AcquireTime:
RenewTime: Thu, 28 Nov 2024 15:41:45 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message


NetworkUnavailable False Thu, 28 Nov 2024 09:29:20 +0800 Thu, 28 Nov 2024 09:29:20 +0800 RouteCreated RouteController created a route
PIDPressure False Thu, 28 Nov 2024 09:29:20 +0800 Thu, 28 Nov 2024 09:29:20 +0800 AgentHasSufficientPID Agent has sufficient PID available
MemoryPressure False Thu, 28 Nov 2024 15:41:22 +0800 Thu, 28 Nov 2024 09:29:20 +0800 AgentHasSufficientMemory Agent has sufficient memory available, total=125.7GB, available=55.6GB
DiskPressure False Thu, 28 Nov 2024 15:41:22 +0800 Thu, 28 Nov 2024 09:29:20 +0800 AgentHasNoDiskPressure Agent has no disk pressure. @agent_volume(/home/kuscia/var/storage/data): space=9.5GB/100.0GB(9.5%) inode=143.8k/52.4M(0.3%)
OutOfDisk False Thu, 28 Nov 2024 15:41:22 +0800 Thu, 28 Nov 2024 09:29:20 +0800 AgentHasSufficientDisk Agent has sufficient disk space available. @agent_volume: free_space=90.5GB, free_inode=52.3M
Kernel-Params False Thu, 28 Nov 2024 15:41:22 +0800 Thu, 28 Nov 2024 09:29:20 +0800 Kernel parameters not satisfy kuscia recommended requirements tcp_max_syn_backlog=Unknown[ERR];somaxconn=128[ERR];tcp_retries2=Unknown[ERR];tcp_slow_start_after_idle=Unknown[ERR];tcp_tw_reuse=Unknown[ERR];file-max=13056068[OK]
Ready True Thu, 28 Nov 2024 15:41:22 +0800 Thu, 28 Nov 2024 09:29:21 +0800 AgentReady Agent is ready
Addresses:
InternalIP: 172.18.0.4
Capacity:
cpu: 96
memory: 4Gi
pods: 500
storage: 984407Mi
Allocatable:
cpu: 96
memory: 3596Mi
pods: 500
storage: 985324384Ki
System Info:
Machine ID: ae4c23c0-2177-11eb-abf6-3c7c3ff05c52
System UUID:
Boot ID: 1692060101-1732757360928742178
Kernel Version: 3.10.0-957.el7.x86_64
OS Image: docker://linux/anolis:23 (guest)
Operating System: linux
Architecture: x86_64
Container Runtime Version:
Kubelet Version: v0.12.0b0
Kube-Proxy Version:
PodCIDR: 10.42.1.0/24
PodCIDRs: 10.42.1.0/24
Non-terminated Pods: (1 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age


bob dataproxy-bob-6c65dc5966-pbgfh 0 (0%) 0 (0%) 0 (0%) 0 (0%) 6h9m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits


cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
storage 0 0
Events:
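
In the node description above, Capacity lists memory as 4Gi and Allocatable lists 3596Mi, while the MemoryPressure condition reports total=125.7GB for the host. To compare such figures, the binary-suffix quantities can be converted to bytes. This is a minimal sketch that handles only the Ki, Mi and Gi suffixes seen in this output; `parse_quantity` is a hypothetical helper, not part of kubectl or Kuscia:

```python
# Binary suffixes that appear in the node description above.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '4Gi' or '3596Mi' to a byte count."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


capacity = parse_quantity("4Gi")        # Capacity.memory
allocatable = parse_quantity("3596Mi")  # Allocatable.memory
difference_mib = (capacity - allocatable) // 1024**2
print(f"capacity={capacity} B, allocatable={allocatable} B, difference={difference_mib} MiB")
```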

@BrainWH

BrainWH commented Nov 28, 2024

@lyh251543
Author

The command I used was docker update ${container_name} --memory=16g --memory-swap=16g
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
7ec5de59fed4 root-kuscia-lite-secretpad-reuhujft 1.43% 2.12GiB / 16GiB 13.25% 368kB / 377kB 0B / 14.2MB 264
cdb6b6a55d6a root-kuscia-lite-reuhujft 2.81% 533MiB / 16GiB 3.25% 2.33MB / 1.44MB 5.32MB / 27.4MB 388
1a8b3bfde50c root-kuscia-lite-secretpad-hmxibzri 1.74% 2.012GiB / 16GiB 12.57% 73.8kB / 57.1kB 0B / 10.9MB 257
0915ee27403f root-kuscia-lite-hmxibzri 3.40% 541.8MiB / 16GiB 3.31% 2.28MB / 1.14MB 221kB / 30.5MB 389
b112b9681963 root-kuscia-master-secretpad 1.16% 2.161GiB / 16GiB 13.51% 218kB / 263kB 0B / 31.7MB 226
fd2d57ea0ad1 root-kuscia-lite-tee 6.12% 562.9MiB / 16GiB 3.44% 1.64MB / 969kB 57.3kB / 30.6MB 422
a23b5b03f07f root-kuscia-lite-bob 6.66% 501.5MiB / 16GiB 3.06% 1.47MB / 879kB 16.4kB / 20.5MB 383
0bac030ed226 root-kuscia-lite-alice 8.09% 505.8MiB / 16GiB 3.09% 1.92MB / 1.12MB 8.19kB / 21.9MB 390
2e3f5dbfe87b root-kuscia-master 93.75% 591.5MiB / 16GiB 3.61% 4.17MB / 7.44MB 0B / 313MB 397
The error persists after increasing the memory limit.

@yushiqie
Collaborator

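According to the logs you posted, the task was scheduled successfully. Kuscia does not support viewing pod logs with kubectl logs; to investigate further, check the task logs as described at https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.12.0b0/deployment/logdescription#id3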
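
The scheduling point can be checked mechanically: in the pasted Events, the FailedScheduling warning is followed by a Normal Scheduled event for the same pod. The sketch below assumes event lines in the flattened "Type Reason Age From Message" form shown in this thread; `scheduling_outcome` is a hypothetical helper name:

```python
def scheduling_outcome(events: str) -> str:
    """Classify a pod's Events text as 'scheduled', 'pending' or 'unknown'."""
    reasons = []
    for line in events.strip().splitlines():
        fields = line.split()
        # Event rows start with the type, followed by the reason.
        if len(fields) >= 2 and fields[0] in ("Normal", "Warning"):
            reasons.append(fields[1])
    if "Scheduled" in reasons:
        return "scheduled"  # the scheduler eventually bound the pod to a node
    if "FailedScheduling" in reasons:
        return "pending"    # only failed attempts so far
    return "unknown"


# Two event rows copied from the kubectl describe output earlier in the thread.
EVENTS = """\
Warning FailedScheduling 62s kuscia-scheduler 0/3 nodes are available: waiting for task resource.
Normal Scheduled 60s kuscia-scheduler Successfully assigned alice/lnkl-ngvmfuhy-node-3-0 to root-kuscia-lite-alice-kube02
"""

print(scheduling_outcome(EVENTS))
```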

@lyh251543
Author

Platform log (pipeline task)
2024-12-02 11:06:27 INFO the jobId=jyyu, taskId=jyyu-vvvvmzkd-node-3 start ...
2024-12-02 11:06:30 INFO the jobId=jyyu, taskId=jyyu-vvvvmzkd-node-3 failed: party alice failed msg:

Container scheduling log
bash-5.2# kubectl logs -f -n alice jyyu-vvvvmzkd-node-3-0
Error from server: Get "https://172.18.0.3:10250/containerLogs/alice/jyyu-vvvvmzkd-node-3-0/secretflow?follow=true": proxy error from 0.0.0.0:6443 while dialing 172.18.0.3:10250, code 502: 502 Bad Gateway

Container events

Events:
Type Reason Age From Message


Warning FailedScheduling 15s kuscia-scheduler 0/5 nodes are available: waiting for task resource. preemption: 0/5 nodes are available: 5 Preemption is not helpful for scheduling., can not find related task resource.
Normal Scheduled 12s kuscia-scheduler Successfully assigned alice/spok-jyyu-vvvvmzkd-node-3-0 to root-kuscia-lite-alice-kube02
Normal Pulled 13s Agent Container image "secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/secretflow-lite-anolis8:1.10.0b1" already present on machine
Normal Created 12s Agent Created container secretflow
Normal Started 12s Agent Started container secretflow
Warning MissingClusterDNS 11s (x4 over 13s) Agent pod: "spok-jyyu-vvvvmzkd-node-3-0_alice(cf227b15-58df-4f0d-8c11-bdc21e34c4ab)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.

cat /home/kuscia/var/logs/kuscia.log |grep "2024-12-02 11:06"
2024-12-02 11:06:27.007 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:27.010 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (2.862471ms)
2024-12-02 11:06:27.129 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:27.129 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (134.709894ms)
2024-12-02 11:06:27.129 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:27.138 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:27.138 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (190.891µs)
2024-12-02 11:06:27.144 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (15.320838ms)
2024-12-02 11:06:27.145 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:27.160 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:27.160 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (15.771797ms)
2024-12-02 11:06:27.163 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (125.79µs)
2024-12-02 11:06:27.163 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:27.163 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (172.37µs)
2024-12-02 11:06:29.492 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:29.533 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:29.533 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (41.398919ms)
2024-12-02 11:06:29.543 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil},{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running}}, kusciaJobId=jyyu
2024-12-02 11:06:29.543 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (263.89µs)
2024-12-02 11:06:29.544 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (228.57µs)
2024-12-02 11:06:29.870 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:29.928 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:29.928 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (428.57µs)
2024-12-02 11:06:30.122 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.122 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (252.57491ms)
2024-12-02 11:06:30.123 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.171 INFO handler/scheduler.go:898 JobStatusPhaseFrom readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Failed},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:30.171 INFO handler/scheduler.go:923 JobStatusPhaseFrom failed readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Failed},{taskId=jyyu-vvvvmzkd-node-4, dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu
2024-12-02 11:06:30.172 INFO resources/kusciajob.go:121 Start updating kuscia job "jyyu" status
2024-12-02 11:06:30.192 INFO resources/kusciajob.go:125 Finish updating kuscia job "jyyu" status
2024-12-02 11:06:30.192 INFO kusciajob/controller.go:304 Finished syncing KusciaJob "cross-domain/jyyu" (21.23097ms)
2024-12-02 11:06:30.192 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (21.49589ms)
2024-12-02 11:06:30.200 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.200 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (77.315406ms)
2024-12-02 11:06:30.200 INFO resources/kusciajob.go:121 Start updating kuscia job "jyyu" status
2024-12-02 11:06:30.304 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.306 INFO resources/kusciajob.go:125 Finish updating kuscia job "jyyu" status
2024-12-02 11:06:30.306 INFO kusciajob/controller.go:304 Finished syncing KusciaJob "cross-domain/jyyu" (106.121671ms)
2024-12-02 11:06:30.306 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (106.235371ms)
2024-12-02 11:06:30.316 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (43.36µs)
2024-12-02 11:06:30.415 INFO resources/kusciatask.go:70 Finish updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.415 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (214.922893ms)
2024-12-02 11:06:30.420 INFO queue/queue.go:126 Finish processing item: queue id[kuscia-job-controller], key[cross-domain/jyyu] (41.77µs)
2024-12-02 11:06:30.423 INFO resources/kusciatask.go:66 Start updating kuscia task "jyyu-vvvvmzkd-node-3" status
2024-12-02 11:06:30.438 INFO kusciatask/controller.go:584 Finish syncing KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" (22.972771ms)
2024-12-02 11:06:30.438 INFO kusciatask/controller.go:552 KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" was finished, skipping
2024-12-02 11:06:30.918 INFO kusciatask/controller.go:552 KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" was finished, skipping
2024-12-02 11:06:30.968 INFO kusciatask/controller.go:552 KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" was finished, skipping
2024-12-02 11:06:30.977 INFO kusciatask/controller.go:552 KusciaTask "cross-domain/jyyu-vvvvmzkd-node-3" was finished, skipping
2024-12-02 11:06:30.978 INFO port/port_provider.go:172 Delete ports indeed, owner=pod:jyyu-vvvvmzkd-node-3-0, ports=[23337 23338 23332 23333 23334 23335 23336], namespace=bob
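
The `JobStatusPhaseFrom` lines above are the only place in this log where the task phase is visible, and the switch from `Running` to `Failed` is easy to miss among the status-update noise. Below is a minimal sketch, not part of Kuscia, that pulls the phase changes out of such lines. The function name `phase_transitions` and the regular expression are assumptions based only on the line format shown in this log, so they may need adjusting for other Kuscia versions.

```python
import re

# Matches one task entry as printed in the log above, for example:
# {taskId=jyyu-vvvvmzkd-node-3, dependencies=[], tolerable=false, phase=Running}
TASK_RE = re.compile(
    r"\{taskId=([^,]+), dependencies=\[[^\]]*\], tolerable=\w+, phase=(\w+)\}"
)


def phase_transitions(lines):
    """Return (timestamp, task_id, old_phase, new_phase) for every phase change."""
    last_phase = {}  # task_id -> most recently seen phase
    changes = []
    for line in lines:
        if "JobStatusPhaseFrom" not in line:
            continue
        # The first two whitespace-separated fields are the date and the time.
        timestamp = " ".join(line.split()[:2])
        for task_id, phase in TASK_RE.findall(line):
            previous = last_phase.get(task_id)
            if previous is not None and previous != phase:
                changes.append((timestamp, task_id, previous, phase))
            last_phase[task_id] = phase
    return changes


# Two lines copied from the log above (11:06:29.928 and 11:06:30.171).
SAMPLE = [
    "2024-12-02 11:06:29.928 INFO handler/scheduler.go:898 JobStatusPhaseFrom "
    "readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], "
    "tolerable=false, phase=Running},{taskId=jyyu-vvvvmzkd-node-4, "
    "dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu",
    "2024-12-02 11:06:30.171 INFO handler/scheduler.go:898 JobStatusPhaseFrom "
    "readyTasks={}, tasks={{taskId=jyyu-vvvvmzkd-node-3, dependencies=[], "
    "tolerable=false, phase=Failed},{taskId=jyyu-vvvvmzkd-node-4, "
    "dependencies=[jyyu-vvvvmzkd-node-3], tolerable=false, phase=nil}}, kusciaJobId=jyyu",
]

if __name__ == "__main__":
    for change in phase_transitions(SAMPLE):
        print(change)
```

In this log the only change it would report is `jyyu-vvvvmzkd-node-3` going from `Running` to `Failed` at 11:06:30.171, about seven seconds after the job was created at 11:06:23.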

cat /home/kuscia/var/logs/k3s.log |grep "2024-12-02 11:06"
I1202 11:06:26.943745 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-low" queue=45 time="2024-12-02 11:06:26.943532570" requestDescr1=&{IsResourceRequest:true Path:/api/v1/namespaces/bob/services/jyyu-vvvvmzkd-node-3-0-inference Verb:update APIPrefix:api APIGroup: APIVersion:v1 Namespace:bob Resource:services Subresource: Name:jyyu-vvvvmzkd-node-3-0-inference Parts:[services jyyu-vvvvmzkd-node-3-0-inference]} requestDescr2=&{Name:system:serviceaccount:bob:bob UID:29d1f2ff-b026-48df-a8e1-914916b33e71 Groups:[system:serviceaccounts system:serviceaccounts:bob system:authenticated] Extra:map[]} newVirtualStart="6578.81332019ss" deltaVirtualStart="0.06802678ss"
I1202 11:06:30.989248 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=91 time="2024-12-02 11:06:30.989182120" requestDescr1=&{IsResourceRequest:false Path:/apis/apps/v1 Verb:get APIPrefix:apis APIGroup: APIVersion: Namespace: Resource: Subresource: Name: Parts:[]} requestDescr2=&{Name:system:kube-controller-manager UID: Groups:[system:authenticated] Extra:map[]} newVirtualStart="456.33325174ss" deltaVirtualStart="0.00027229ss"
I1202 11:06:31.028529 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=111 time="2024-12-02 11:06:31.028425178" requestDescr1=&{IsResourceRequest:true Path:/apis/discovery.k8s.io/v1/namespaces/bob/endpointslices/jyyu-vvvvmzkd-node-3-0-global-5wv4w Verb:get APIPrefix:apis APIGroup:discovery.k8s.io APIVersion:v1 Namespace:bob Resource:endpointslices Subresource: Name:jyyu-vvvvmzkd-node-3-0-global-5wv4w Parts:[endpointslices jyyu-vvvvmzkd-node-3-0-global-5wv4w]} requestDescr2=&{Name:system:serviceaccount:kube-system:generic-garbage-collector UID:ae53322c-741e-4c47-a07f-96bf1f58db43 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.36922134ss" deltaVirtualStart="0.01870035ss"
I1202 11:06:31.029394 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=94 time="2024-12-02 11:06:31.029229089" requestDescr1=&{IsResourceRequest:true Path:/api/v1/namespaces/bob/endpoints/jyyu-vvvvmzkd-node-3-0-global Verb:delete APIPrefix:api APIGroup: APIVersion:v1 Namespace:bob Resource:endpoints Subresource: Name:jyyu-vvvvmzkd-node-3-0-global Parts:[endpoints jyyu-vvvvmzkd-node-3-0-global]} requestDescr2=&{Name:system:serviceaccount:kube-system:endpoint-controller UID:0425731a-0ea5-41ce-bb25-faa745ca88c8 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.37010546ss" deltaVirtualStart="0.01698327ss"
I1202 11:06:31.033803 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=79 time="2024-12-02 11:06:31.033645791" requestDescr1=&{IsResourceRequest:true Path:/api/v1/namespaces/bob/endpoints/jyyu-vvvvmzkd-node-3-0-fed Verb:delete APIPrefix:api APIGroup: APIVersion:v1 Namespace:bob Resource:endpoints Subresource: Name:jyyu-vvvvmzkd-node-3-0-fed Parts:[endpoints jyyu-vvvvmzkd-node-3-0-fed]} requestDescr2=&{Name:system:serviceaccount:kube-system:endpoint-controller UID:0425731a-0ea5-41ce-bb25-faa745ca88c8 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.37604920ss" deltaVirtualStart="0.02250776ss"
I1202 11:06:31.034302 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=116 time="2024-12-02 11:06:31.034112051" requestDescr1=&{IsResourceRequest:true Path:/apis/discovery.k8s.io/v1/namespaces/bob/endpointslices/jyyu-vvvvmzkd-node-3-0-spu-qns9k Verb:get APIPrefix:apis APIGroup:discovery.k8s.io APIVersion:v1 Namespace:bob Resource:endpointslices Subresource: Name:jyyu-vvvvmzkd-node-3-0-spu-qns9k Parts:[endpointslices jyyu-vvvvmzkd-node-3-0-spu-qns9k]} requestDescr2=&{Name:system:serviceaccount:kube-system:generic-garbage-collector UID:ae53322c-741e-4c47-a07f-96bf1f58db43 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.37671212ss" deltaVirtualStart="0.02254844ss"
I1202 11:06:31.034705 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=54 time="2024-12-02 11:06:31.034569551" requestDescr1=&{IsResourceRequest:true Path:/api/v1/namespaces/bob/endpoints/jyyu-vvvvmzkd-node-3-0-spu Verb:delete APIPrefix:api APIGroup: APIVersion:v1 Namespace:bob Resource:endpoints Subresource: Name:jyyu-vvvvmzkd-node-3-0-spu Parts:[endpoints jyyu-vvvvmzkd-node-3-0-spu]} requestDescr2=&{Name:system:serviceaccount:kube-system:endpoint-controller UID:0425731a-0ea5-41ce-bb25-faa745ca88c8 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.37753318ss" deltaVirtualStart="0.00603183ss"
I1202 11:06:31.035113 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=1 time="2024-12-02 11:06:31.034949411" requestDescr1=&{IsResourceRequest:true Path:/apis/discovery.k8s.io/v1/namespaces/bob/endpointslices/jyyu-vvvvmzkd-node-3-0-fed-8zpn9 Verb:get APIPrefix:apis APIGroup:discovery.k8s.io APIVersion:v1 Namespace:bob Resource:endpointslices Subresource: Name:jyyu-vvvvmzkd-node-3-0-fed-8zpn9 Parts:[endpointslices jyyu-vvvvmzkd-node-3-0-fed-8zpn9]} requestDescr2=&{Name:system:serviceaccount:kube-system:generic-garbage-collector UID:ae53322c-741e-4c47-a07f-96bf1f58db43 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.37818090ss" deltaVirtualStart="0.00613108ss"
I1202 11:06:31.055517 31 queueset.go:996] "AntiWindup tweaked queue" QS="workload-high" queue=111 time="2024-12-02 11:06:31.055375781" requestDescr1=&{IsResourceRequest:true Path:/apis/discovery.k8s.io/v1/namespaces/bob/endpointslices/jyyu-vvvvmzkd-node-3-0-global-5wv4w Verb:delete APIPrefix:apis APIGroup:discovery.k8s.io APIVersion:v1 Namespace:bob Resource:endpointslices Subresource: Name:jyyu-vvvvmzkd-node-3-0-global-5wv4w Parts:[endpointslices jyyu-vvvvmzkd-node-3-0-global-5wv4w]} requestDescr2=&{Name:system:serviceaccount:kube-system:generic-garbage-collector UID:ae53322c-741e-4c47-a07f-96bf1f58db43 Groups:[system:serviceaccounts system:serviceaccounts:kube-system system:authenticated] Extra:map[]} newVirtualStart="456.41874436ss" deltaVirtualStart="0.02472664ss"

bash-5.2# /home/kuscia/var/logs/containerd.log
bash: /home/kuscia/var/logs/containerd.log: No such file or directory

cat /home/kuscia/var/logs/kusciaapi.log |grep "2024-12-02 11:06"
2024-12-02 11:06:23.152 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/QueryDomainData] Duration: 9.846495ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":{"domain_id":"alice","domaindata_id":"alice-table"}}, Response: {"data":{"attributes":{"description":"alice demo data"},"author":"alice","columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","type":"float"},{"name":"marital_divorced","type":"float"},{"name":"marital_married","type":"float"},{"name":"marital_single","type":"float"}],"datasource_
2024-12-02 11:06:23.166 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/QueryDomainData] Duration: 9.181565ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":{"domain_id":"bob","domaindata_id":"bob-table"}}, Response: {"data":{"attributes":{"description":"bob demo data"},"author":"bob","columns":[{"name":"id2","type":"str"},{"name":"contact_cellular","type":"float"},{"name":"contact_telephone","type":"float"},{"name":"contact_unknown","type":"float"},{"name":"month_apr","type":"float"},{"name":"month_aug","type":"float"},{"name":"month_dec","type":"float"},{"name":"month_feb","type":"float"},{"name":"month_jan","type":"float"},{"name":"month_jul","type":"float"},{"name":"month_jun","type":"float"},{"name":"month_mar","type":"float"},{"name":"month_may","type":"float"},{"name":"month_nov","type":"float"},{"name":"month_oct","type":"float"},{"name":"month_sep","type":"float"},{"name":"poutcome_failure","type":"float"},{"name":"poutcome_other","type":"float"},{"name":"poutcome_success","type":"float"},{"name":"poutcome_unknown","type":"float"},{"name":"y","type":"int"}],"datasource_id":"default-data-source","domain_id":"bob","domaindata_id":"bob-table","file_format":0,"name":"bob.csv","relative_uri":"bob.csv","status":"Availa
2024-12-02 11:06:23.213 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.JobService/CreateJob] Duration: 23.080331ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"custom_fields":{},"initiator":"bob","job_id":"jyyu","max_parallelism":1,"tasks":[{"app_image":"secretflow-image","parties":[{"domain_id":"bob"},{"domain_id":"alice"}],"alias":"jyyu-vvvvmzkd-node-3","task_id":"jyyu-vvvvmzkd-node-3","task_input_config":"{\n "sf_datasource_config": {\n "bob": {\n "id": "default-data-source"\n },\n "alice": {\n "id": "default-data-source"\n }\n },\n "sf_cluster_desc": {\n "parties": ["bob", "alice"],\n "devices": [{\n "name": "spu",\n "type": "spu",\n "parties": ["bob", "alice"],\n "config": "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"\n }, {\n "name": "heu",\n "type": "heu",\n , Response: {"data":{"job_id":"jyyu"},"status":{"code":0,"details":null,"message":"success"}}
2024-12-02 11:06:34.244 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 25.579143ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty
2024-12-02 11:06:34.246 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 28.050964ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty
2024-12-02 11:06:34.249 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 30.646544ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty

cat /home/kuscia/logs/envoy/internal.log |grep "2024-12-02 11:06"
2024-12-02 11:06:23.213 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.JobService/CreateJob] Duration: 23.080331ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"custom_fields":{},"initiator":"bob","job_id":"jyyu","max_parallelism":1,"tasks":[{"app_image":"secretflow-image","parties":[{"domain_id":"bob"},{"domain_id":"alice"}],"alias":"jyyu-vvvvmzkd-node-3","task_id":"jyyu-vvvvmzkd-node-3","task_input_config":"{\n "sf_datasource_config": {\n "bob": {\n "id": "default-data-source"\n },\n "alice": {\n "id": "default-data-source"\n }\n },\n "sf_cluster_desc": {\n "parties": ["bob", "alice"],\n "devices": [{\n "name": "spu",\n "type": "spu",\n "parties": ["bob", "alice"],\n "config": "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"\n }, {\n "name": "heu",\n "type": "heu",\n , Response: {"data":{"job_id":"jyyu"},"status":{"code":0,"details":null,"message":"success"}}
2024-12-02 11:06:34.244 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 25.579143ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty
2024-12-02 11:06:34.246 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 28.050964ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty
2024-12-02 11:06:34.249 INFO interceptor/common.go:121 [GRPC] [GRPC /kuscia.proto.api.v1alpha1.kusciaapi.DomainDataService/BatchQueryDomainData] Duration: 30.646544ms, StatusCode: 0, ForwardHost: [], ContextType: [], Request: {"data":[{"domain_id":"alice","domaindata_id":"alice-table"},{"domain_id":"bob","domaindata_id":"bob-table"}]}, Response: {"data":{"domaindata_list":[{"domaindata_id":"alice-table","name":"alice.csv","type":"table","relative_uri":"alice.csv","domain_id":"alice","datasource_id":"default-data-source","attributes":{"description":"alice demo data"},"columns":[{"name":"id1","type":"str"},{"name":"age","type":"float"},{"name":"education","type":"float"},{"name":"default","type":"float"},{"name":"balance","type":"float"},{"name":"housing","type":"float"},{"name":"loan","type":"float"},{"name":"day","type":"float"},{"name":"duration","type":"float"},{"name":"campaign","type":"float"},{"name":"pdays","type":"float"},{"name":"previous","type":"float"},{"name":"job_blue-collar","type":"float"},{"name":"job_entrepreneur","type":"float"},{"name":"job_housemaid","type":"float"},{"name":"job_management","type":"float"},{"name":"job_retired","type":"float"},{"name":"job_self-employed","type":"float"},{"name":"job_services","type":"float"},{"name":"job_student","type":"float"},{"name":"job_technician","type":"float"},{"name":"job_unemployed","ty

bash-5.2# cat external.log |grep "2024-12-02 11:06"

bash-5.2# cat envoy.log |grep "2024-12-02 11:06"
[2024-12-02 11:06:25.984][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:25.985][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:25.987][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:25.988][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:32.070][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:32.071][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:32.076][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:32.076][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:32.592][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:32.593][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:32.595][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:32.596][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:37.726][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:37.727][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:37.733][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:37.734][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:38.394][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:38.395][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:38.405][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:38.406][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:40.945][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:40.946][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:40.953][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:40.953][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:47.090][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)
[2024-12-02 11:06:47.091][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:70] cds: added/updated 0 cluster(s), skipped 5 unmodified cluster(s)
[2024-12-02 11:06:47.100][890][info][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:31] cds: add 5 cluster(s), remove 4 cluster(s)

The alice pod log is empty:
drwxr-xr-x. 3 root root 24 Dec 2 11:06 alice_jyyu-vvvvmzkd-node-3-0_cd8b379f-9088-4135-9313-1fd0aa60e10d
drwxr-xr-x. 3 root root 24 Nov 28 17:58 alice_yspj-fhbkfntu-node-3-0_1942d1c8-fae8-45fc-bbbc-e869e936a8e1
bash-5.2# cat alice_jyyu-vvvvmzkd-node-3-0_cd8b379f-9088-4135-9313-1fd0aa60e10d/secretflow/0.log
bash-5.2#
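
Checking each pod directory by hand with `cat` gets tedious when several tasks have failed. The following is a minimal sketch that lists every zero-byte container log under a pod log directory. It assumes only the layout visible in the listing above (`<namespace>_<pod>_<uid>/<container>/0.log`); the root directory is not shown in this thread, so it is passed in as an argument, and the function name `empty_container_logs` is hypothetical.

```python
import sys
from pathlib import Path


def empty_container_logs(pod_log_root):
    """Return (pod_dir, container, file_name) for every zero-byte *.log file.

    Expects the layout <pod_log_root>/<namespace>_<pod>_<uid>/<container>/<n>.log,
    which is an assumption taken from the directory listing above.
    """
    root = Path(pod_log_root)
    empty = []
    for log_file in sorted(root.glob("*/*/*.log")):
        if log_file.is_file() and log_file.stat().st_size == 0:
            pod_dir = log_file.parent.parent.name
            container = log_file.parent.name
            empty.append((pod_dir, container, log_file.name))
    return empty


if __name__ == "__main__":
    # Usage: python empty_logs.py <pod log root>
    if len(sys.argv) > 1:
        for pod_dir, container, name in empty_container_logs(sys.argv[1]):
            print(f"{pod_dir}/{container}/{name} is empty")
```

An empty `secretflow/0.log` would be consistent with the container never having started its entrypoint, though the script only reports file sizes and cannot confirm the cause.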
