feat(observability): a command check cluster information & status #12826

fuyufjh · 2023-10-13T03:47:49Z

As requested by users, we need a command that quickly reports the status of the cluster, including:

- Nodes and their heartbeats, e.g. the number, IP, resources, etc. of all registered nodes
- Brief runtime info, e.g. current epoch, latency, etc.
- Brief catalog info, e.g. how many MVs or actors are running
- Misc., e.g. RW version, uptime, etc.

zwang28 · 2023-11-24T06:39:30Z

> - Nodes and their heartbeats, e.g. the number, IP, resources, etc. of all registered nodes
> - Brief runtime info, e.g. current epoch, latency, etc.
> - Brief catalog info, e.g. how many MVs or actors are running
> - Misc., e.g. RW version, uptime, etc.

The information listed above is now available through several system tables.
Do we still need to add a new command that summarizes it?

fuyufjh · 2023-12-06T02:53:53Z

Related: #13764

shanicky · 2023-12-07T08:36:30Z

After #10886, #10905, and #13764:

We can connect directly to etcd and dump the current table fragments, workers, users, election members, and the various types of catalogs, similar to the kubectl get command. The output format (yaml or json) is selected with -o. Note that the output can be very large.

This provides a simple way to inspect the data stored in etcd before the SQL backend officially goes live.

risectl debug dump table,worker,table-catalog -o yaml


- kind: worker
  item:
    id: 1
    type: WORKER_TYPE_COMPUTE_NODE
    host:
      host: 127.0.0.1
      port: 5688
    state: RUNNING
    parallelUnits:
    - workerNodeId: 1
    - id: 1
      workerNodeId: 1
    - id: 2
      workerNodeId: 1
    - id: 3
      workerNodeId: 1
    property:
      isStreaming: true
      isServing: true
    transactionalId: 0
- kind: table
  item:
    tableId: 2001
    state: CREATED
    fragments:
      1001:
        fragmentId: 1001
        fragmentTypeMask: 2
        distributionType: SINGLE
        actors:
        - actorId: 1001
          fragmentId: 1001
          upstreamActorId:
          - 1002
          - 1003
          - 1004
          - 1005
          mviewDefinition: CREATE MATERIALIZED VIEW m2 AS SELECT max(v) FROM t
......
- kind: table_catalog
  item:
    id: 1002
    name: m
    columns:
    - columnDesc:
        columnType:
          typeName: INT32
          isNullable: true
        columnId: 1
        name: v
    - columnDesc:
        columnType:
......
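
For a quick overview of such a dump, a short script can count the entries per kind and list the workers. The following is only a sketch: it assumes that `-o json` produces a list of `{kind, item}` objects with the same field names as the YAML excerpt above, which has not been confirmed here. `SAMPLE_DUMP` and `summarize_dump` are hypothetical names used for illustration, and a missing `id` is treated as 0 on the assumption that default values are omitted from the output.

```python
import json
from collections import Counter

# Hypothetical sample mirroring the YAML excerpt above. The real layout of
# the `-o json` output is assumed here, not confirmed.
SAMPLE_DUMP = json.dumps([
    {
        "kind": "worker",
        "item": {
            "id": 1,
            "type": "WORKER_TYPE_COMPUTE_NODE",
            "host": {"host": "127.0.0.1", "port": 5688},
            "state": "RUNNING",
            "parallelUnits": [
                {"workerNodeId": 1},
                {"id": 1, "workerNodeId": 1},
                {"id": 2, "workerNodeId": 1},
                {"id": 3, "workerNodeId": 1},
            ],
        },
    },
    {"kind": "table", "item": {"tableId": 2001, "state": "CREATED"}},
    {"kind": "table_catalog", "item": {"id": 1002, "name": "m"}},
])


def summarize_dump(dump_text):
    """Count entries per kind and extract a short record for each worker."""
    entries = json.loads(dump_text)
    kinds = Counter(entry["kind"] for entry in entries)
    workers = []
    for entry in entries:
        if entry["kind"] != "worker":
            continue
        item = entry["item"]
        host = item.get("host", {})
        workers.append({
            # Assumption: an omitted id means the default value 0.
            "id": item.get("id", 0),
            "type": item.get("type"),
            "address": f"{host.get('host')}:{host.get('port')}",
            "state": item.get("state"),
            "parallel_units": len(item.get("parallelUnits", [])),
        })
    return {"kinds": dict(kinds), "workers": workers}


if __name__ == "__main__":
    print(json.dumps(summarize_dump(SAMPLE_DUMP), indent=2))
```

In practice the text passed to `summarize_dump` would come from the dump command's output rather than from the embedded sample. Because the dump can be very large, a summary like this is mainly useful as a first look before reading individual entries.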

zwang28 · 2024-03-06T02:34:15Z

TODO: add table catalog after redaction is supported.

github-actions · 2024-06-12T09:00:12Z

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

fuyufjh added the type/feature label Oct 13, 2023

github-actions bot added this to the release-1.4 milestone Oct 13, 2023

fuyufjh assigned zwang28 Nov 8, 2023

fuyufjh modified the milestones: release-1.4, release-1.5 Nov 8, 2023

fuyufjh mentioned this issue Nov 10, 2023

Discussion: persist system events and query via SQL #13267

Closed

This was referenced Nov 17, 2023

refactor(rw_catalog): make rw_worker_nodes list all nodes and resource #13487

Merged

feat(meta): record barrier latency in event table #13633

Merged

zwang28 modified the milestones: release-1.5, release-1.6 Dec 4, 2023

zwang28 mentioned this issue Dec 18, 2023

feat(dashboard): add diagnose command #14025

Merged

9 tasks

zwang28 modified the milestones: release-1.6, release-1.7 Jan 9, 2024

zwang28 modified the milestones: release-1.7, release-1.8 Mar 6, 2024

zwang28 removed this from the release-1.8 milestone Apr 8, 2024

github-actions bot added the no-issue-activity label Jun 12, 2024

zwang28 mentioned this issue Jul 29, 2024

feat(dashboard): include table definition in diagnose report #17842

Merged

9 tasks

zwang28 closed this as completed in #17842 Jul 29, 2024
