feat(observability): a command check cluster information & status #12826

Closed
fuyufjh opened this issue Oct 13, 2023 · 6 comments · Fixed by #17842

Comments

@fuyufjh
Member

fuyufjh commented Oct 13, 2023

As requested by users, we need a command that quickly reports the status of the cluster, including:

  • Nodes and their heartbeats, e.g. the number, IP, resources, etc. of all registered nodes
  • Brief runtime info, e.g. current epoch, latency, etc.
  • Brief catalog info, e.g. how many MVs or actors are running
  • Misc.: RW version, uptime, etc.
@zwang28
Contributor

zwang28 commented Nov 24, 2023

Nodes and their heartbeats, e.g. the number, IP, resources, etc. of all registered nodes
Brief runtime info, e.g. current epoch, latency, etc.
Brief catalog info, e.g. how many MVs or actors are running
Misc.: RW version, uptime, etc.

The information listed above is now available through several system tables.
Do we still need to add a new command that summarizes it?
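
To make the question concrete, here is a minimal sketch of what such a summary could look like if it were built on top of the system tables. It is an illustration only: the table names (rw_catalog.rw_worker_nodes, rw_catalog.rw_materialized_views, rw_catalog.rw_actors) are assumptions to verify against your RisingWave version, and summarize_cluster is a hypothetical helper, not an existing command. It accepts any DB-API 2.0 style cursor.

```python
# Assumed system-table names; check them against your RisingWave version.
SUMMARY_TABLES = {
    "worker_nodes": "rw_catalog.rw_worker_nodes",
    "materialized_views": "rw_catalog.rw_materialized_views",
    "actors": "rw_catalog.rw_actors",
}


def summarize_cluster(cursor, tables=SUMMARY_TABLES):
    """Return {label: row_count} for each table in `tables`.

    `cursor` is any DB-API 2.0 cursor (hypothetical usage: a cursor from a
    PostgreSQL-compatible driver connected to the frontend node).
    Table names come from the trusted mapping above, not from user input.
    """
    summary = {}
    for label, table in tables.items():
        cursor.execute(f"SELECT count(*) FROM {table}")
        summary[label] = cursor.fetchone()[0]
    return summary
```

A dedicated command would mostly be a convenience wrapper around a handful of queries like these; whether that is worth adding is the open question.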

@zwang28

This comment was marked as outdated.

@zwang28 zwang28 modified the milestones: release-1.5, release-1.6 Dec 4, 2023
@fuyufjh
Member Author

fuyufjh commented Dec 6, 2023

Related: #13764

@shanicky
Contributor

shanicky commented Dec 7, 2023

After #10886, #10905, and #13764:

We can connect directly to etcd and dump the current table fragments, workers, users, election members, and the various types of catalogs, similar to the kubectl get command. The output format (yaml or json) is selected with -o. Note that the output can be very large.

This provides a simple way to inspect the data stored in etcd until the SQL backend officially goes live.

risectl debug dump table,worker,table-catalog -o yaml


- kind: worker
  item:
    id: 1
    type: WORKER_TYPE_COMPUTE_NODE
    host:
      host: 127.0.0.1
      port: 5688
    state: RUNNING
    parallelUnits:
    - workerNodeId: 1
    - id: 1
      workerNodeId: 1
    - id: 2
      workerNodeId: 1
    - id: 3
      workerNodeId: 1
    property:
      isStreaming: true
      isServing: true
    transactionalId: 0
- kind: table
  item:
    tableId: 2001
    state: CREATED
    fragments:
      1001:
        fragmentId: 1001
        fragmentTypeMask: 2
        distributionType: SINGLE
        actors:
        - actorId: 1001
          fragmentId: 1001
          upstreamActorId:
          - 1002
          - 1003
          - 1004
          - 1005
          mviewDefinition: CREATE MATERIALIZED VIEW m2 AS SELECT max(v) FROM t
......
- kind: table_catalog
  item:
    id: 1002
    name: m
    columns:
    - columnDesc:
        columnType:
          typeName: INT32
          isNullable: true
        columnId: 1
        name: v
    - columnDesc:
        columnType:
......
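
Because the dump can be very large, it may help to post-process it rather than read it by eye. The following is a rough sketch, assuming that the json output (-o json) is a list of objects with the same kind/item layout and field names as the yaml shown above; that layout is an assumption and has not been checked against real json output. count_kinds and running_workers are hypothetical helper names.

```python
import json
from collections import Counter


def count_kinds(dump_text):
    """Count dumped entries per `kind` (e.g. worker, table, table_catalog)."""
    entries = json.loads(dump_text)
    return Counter(entry["kind"] for entry in entries)


def running_workers(dump_text):
    """Return "host:port" for every worker entry whose state is RUNNING."""
    entries = json.loads(dump_text)
    return [
        f'{e["item"]["host"]["host"]}:{e["item"]["host"]["port"]}'
        for e in entries
        if e["kind"] == "worker" and e["item"].get("state") == "RUNNING"
    ]
```

For example, the dump could be captured to a file and passed to these helpers as a string, which gives a per-kind count and a list of live workers without scrolling through the full output.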

@zwang28
Contributor

zwang28 commented Mar 6, 2024

TODO: add table catalog after redaction is supported.

@zwang28 zwang28 removed this from the release-1.8 milestone Apr 8, 2024
Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
