support bundles #2235

Open
davepacheco opened this issue Jan 25, 2023 · 5 comments
Labels
customer: For any bug reports or feature requests tied to customer requests
Debugging: For when you want better data in debugging an issue (log messages, post mortem debugging, and more)

@davepacheco
Collaborator

There's wide room to scope this up or down for MVP (including possibly nothing for MVP). Many of us have had good past experiences with a built-in facility to collect information from deployed systems and make it directly available to our support and engineering teams. Much needs to be considered around customer consent and security (see RFDs 94 and 354).

Examples of stuff that's pretty easy and valuable to collect:

  • zpool get all for all zpools
  • zfs get all for all zfs datasets
  • svcs -Zap for all sleds (all SMF service states in all zones, plus running processes)
  • ptree for all sleds (all processes)
  • for processes we care about, maybe: pargs, pargs -e, pstack, pfiles (but see below)
  • log files (e.g., SMF log files, syslog, FMA ereport and fault logs, CockroachDB logs)
  • existing core files (assumes we've established some place to put these)

Slightly more invasive but probably safe enough would also be:

  • gcore for any processes we care particularly about (e.g., Nexus, Sled Agent)
  • cockroach debug zip (their own support bundles)

This sounds like a lot, but I think it's fairly straightforward to collect most of this. More of the work seems to be in figuring out the security and privacy issues, where to stage data temporarily while we're assembling the bundles, and then putting them somewhere that we can access.
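
To make the "fairly straightforward" part concrete, here is a minimal sketch of the collection step: run a handful of the commands listed above, stage their output in a scratch directory, and archive the result. Everything named here (`DEFAULT_COMMANDS`, `run_command`, `collect_bundle`, the output file names, the 60-second timeout) is hypothetical and only for illustration. It is not how Nexus or Sled Agent does this, and it leaves out per-zone iteration, consent, and where the archive ends up.

```python
import subprocess
import tarfile
import tempfile
from pathlib import Path

# Hypothetical mapping of output file name -> command, drawn from the list
# earlier in this comment. Per-sled and per-zone iteration is left out.
DEFAULT_COMMANDS = {
    "zpool-get-all.txt": ["zpool", "get", "all"],
    "zfs-get-all.txt": ["zfs", "get", "all"],
    "svcs-Zap.txt": ["svcs", "-Zap"],
    "ptree.txt": ["ptree"],
}


def run_command(argv, timeout=60):
    """Run one command; return its output, or a note describing the failure."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as err:
        # A missing or hung tool should not abort the whole bundle.
        return f"failed to run {argv!r}: {err}\n"
    return result.stdout + result.stderr


def collect_bundle(commands, bundle_path, runner=run_command):
    """Write each command's output to a scratch directory, then archive it."""
    with tempfile.TemporaryDirectory() as scratch:
        for name, argv in commands.items():
            Path(scratch, name).write_text(runner(argv))
        with tarfile.open(bundle_path, "w:gz") as tar:
            tar.add(scratch, arcname="bundle")
    return bundle_path
```

Calling `collect_bundle(DEFAULT_COMMANDS, "bundle.tar.gz")` on a sled would produce one text file per command inside the archive. The `runner` parameter exists so the collection logic can be exercised without the real tools installed.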

We can also start with very little and augment the collection facility with software updates. In past systems I've used, we tagged different kinds of data. A standard service bundle would collect a default set of tags. More specific bundles could be requested that would collect more data that was either too invasive or too expensive to do by default.
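
The tagging idea could look something like the sketch below. The tag names (`default`, `storage`, `services`, `database`, `expensive`, `invasive`) and the `Collector` type are assumptions made up for this example; the systems I'm describing are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collector:
    """One kind of data a bundle can include, labeled with tags."""
    name: str
    tags: frozenset


# Hypothetical registry; names and tags are illustrative only.
COLLECTORS = [
    Collector("zpool-get-all", frozenset({"default", "storage"})),
    Collector("svcs-Zap", frozenset({"default", "services"})),
    Collector("cockroach-debug-zip", frozenset({"database", "expensive"})),
    Collector("gcore-nexus", frozenset({"invasive"})),
]

# A standard bundle collects only what is tagged "default".
DEFAULT_TAGS = frozenset({"default"})


def select_collectors(collectors, requested_tags=DEFAULT_TAGS):
    """Return the collectors that carry at least one requested tag."""
    requested = frozenset(requested_tags)
    return [c for c in collectors if c.tags & requested]
```

A standard bundle would call `select_collectors(COLLECTORS)`, while a more specific request might pass `{"default", "invasive"}` to add the core dumps. New collectors can then ship in a software update by appending to the registry without changing what the default bundle gathers.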

It could be we do none of this for MVP and move this to MVP+1. I think that's basically what RFD 354 proposes.

@davepacheco davepacheco added this to the MVP milestone Jan 25, 2023
@ahl
Contributor

ahl commented Jan 25, 2023

  1. I'm all for it
  2. Do we have a mechanism (or mechanisms) in mind for how to extract these from the product (either us or customers)?

@smklein
Collaborator

smklein commented Jan 25, 2023

See also: #1600

@davepacheco
Collaborator Author

  • clickhouse dump
  • cockroach dump

@davepacheco
Collaborator Author

  • zoneadm list -cv
  • svccfg archive for all zones

@jordanhendricks jordanhendricks added the Debugging label Aug 11, 2023
@morlandi7

morlandi7 commented Oct 3, 2023

Host team would like to collect the things in here: https://github.com/oxidecomputer/customer-support/issues/39

See also: #2478
