[sled agent] Consider setting uniform coreadm values to extract info from terminating processes? #1597

Closed
Tracked by #1600
smklein opened this issue Aug 16, 2022 · 12 comments
Assignees
Labels
Sled Agent Related to the Per-Sled Configuration and Management

Comments

@smklein
Collaborator

smklein commented Aug 16, 2022

In the 8/16 control plane sync, we discussed the possibility of using https://illumos.org/man/8/coreadm to set a filter to extract core files from crashing non-global zones into the global zone.

Currently, when non-global zone services terminate, Sled Agent stops and deletes the underlying zone. This helps avoid leakage of that resource - we have no further execution-time usage for it - but limits visibility.

By dumping core files into the global zone, we'd be able to inspect errors, even after the zone is destroyed.

@smklein smklein added the Sled Agent Related to the Per-Sled Configuration and Management label Aug 16, 2022
@rmustacc

In particular, we want to enable global cores and use the %z token to include the zone name for disambiguation. In the past we've done things like /var/xxx/%z/core.%f.%p.
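To make the disambiguation concrete, here is a minimal sketch of how a pattern such as /var/xxx/%z/core.%f.%p resolves to a different path for each zone. It handles only the %z, %f, %p, and %% tokens; the system performs the real expansion and supports more tokens (see coreadm(8)). The function name and the zone name in the usage note are hypothetical.

```rust
/// Expands a small subset of coreadm(8) pattern tokens:
/// %z (zone name), %f (executable file name), %p (process ID), %% (literal %).
/// Illustration only; this is not how the system itself implements expansion.
fn expand_core_pattern(pattern: &str, zone: &str, exe: &str, pid: u32) -> String {
    let mut out = String::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('z') => out.push_str(zone),
            Some('f') => out.push_str(exe),
            Some('p') => out.push_str(&pid.to_string()),
            Some('%') => out.push('%'),
            // Tokens this sketch does not know about are passed through unchanged.
            Some(other) => {
                out.push('%');
                out.push(other);
            }
            // A trailing '%' with nothing after it is kept as-is.
            None => out.push('%'),
        }
    }
    out
}
```

For example, calling `expand_core_pattern("/var/xxx/%z/core.%f.%p", "oxz_example", "nexus", 1234)` yields `/var/xxx/oxz_example/core.nexus.1234`, so cores from two zones running the same binary land in different directories.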

@leftwo leftwo self-assigned this Aug 16, 2022
@jclulow
Collaborator

jclulow commented Aug 16, 2022

A request: probably please don't hard code /var paths (or use of rpool specifically) for any more large files. It seems fine as a default but in the ramdisk environment we're going to want to direct those files to specific tmpfs or other mounted pools, etc. The current use of /var/oxide for writing a bunch of larger files is something we'll probably have to unwind so it'll be good to avoid adding more things like that.
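One way to honor this request is to treat the core directory as configuration and derive the coreadm arguments from it. The sketch below assumes the `-g` (global core file pattern) and `-e global` (enable global core dumps) options described in coreadm(8); the function names and the example directory are hypothetical, and nothing here is taken from the sled agent's actual code.

```rust
use std::path::Path;

/// Builds the global core file pattern under a caller-supplied directory,
/// rather than a hard-coded /var path.
fn global_core_pattern(core_dir: &Path) -> String {
    core_dir
        .join("%z")
        .join("core.%f.%p")
        .display()
        .to_string()
}

/// Returns the arguments one might pass to coreadm(8) to set the global
/// pattern and enable global core dumps. This only builds the argument list;
/// it does not run the command.
fn coreadm_args(core_dir: &Path) -> Vec<String> {
    vec![
        "-g".to_string(),
        global_core_pattern(core_dir),
        "-e".to_string(),
        "global".to_string(),
    ]
}
```

With a hypothetical tmpfs mount at /mnt/cores, `coreadm_args(Path::new("/mnt/cores"))` produces `-g /mnt/cores/%z/core.%f.%p -e global`; pointing the same code at a different pool only requires changing the configured directory.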

@smklein smklein changed the title [sled agent] Consider setting uniform coreadm files to extract info from terminating processes? [sled agent] Consider setting uniform coreadm values to extract info from terminating processes? Aug 16, 2022
@leftwo
Contributor

leftwo commented Aug 29, 2022

> In the past we've done things like /var/xxx/%z/core.%f.%p.

coreadm wants the directory to exist before it will create a core. If we use /%z/ as part of the path, then I believe something outside coreadm will need to create that directory. In previous use cases, was there another subsystem that created the %z directory, or did I just miss the coreadm option that would create on demand?

@davepacheco
Collaborator

I believe that in the case being referenced (Joyent's SmartOS), the path was really /zones/%z/cores/core.%f.%p, and yes, that directory was created by the machinery that created the zone (vmadm(1M)).

@leftwo
Contributor

leftwo commented Sep 8, 2022

> I believe that in the case being referenced (Joyent's SmartOS), the path was really /zones/%z/cores/core.%f.%p, and yes, that directory was created by the machinery that created the zone (vmadm(1M)).

https://www.illumos.org/issues/2123
That vmadm?

@jclulow
Collaborator

jclulow commented Sep 8, 2022

If we need to create a cores directory we can do that in the brand code. It has hooks for installing and for booting and so on.

@leftwo
Contributor

leftwo commented Sep 8, 2022

If we do want a .../%z/... in the path, then something will need to both create that directory on zone creation, and remove it (if empty) when the zone goes away. Otherwise we are left with a record of every zone ever created.
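A minimal sketch of that create-and-clean-up pairing, assuming whatever manages the zone lifecycle (the brand hooks or the sled agent) calls one function at zone creation and the other at teardown. The function names are hypothetical, and the emptiness check is not atomic with the removal, so a real implementation would need to consider a core arriving in between.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Creates the per-zone core directory (and any missing parents) so that a
/// pattern containing %z has somewhere to write. Safe to call repeatedly.
fn ensure_zone_core_dir(base: &Path, zone: &str) -> io::Result<PathBuf> {
    let dir = base.join(zone);
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Removes the per-zone core directory only if it holds no files.
/// Returns Ok(true) if it was removed, Ok(false) if it was kept because it
/// still contains entries or did not exist in the first place.
fn remove_zone_core_dir_if_empty(base: &Path, zone: &str) -> io::Result<bool> {
    let dir = base.join(zone);
    if !dir.exists() {
        return Ok(false);
    }
    // Keep the directory if any core file (or anything else) is present.
    if fs::read_dir(&dir)?.next().is_some() {
        return Ok(false);
    }
    fs::remove_dir(&dir)?;
    Ok(true)
}
```

Directories that still hold cores survive zone teardown, which preserves the crash evidence, while zones that never crashed leave nothing behind.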

@davepacheco
Collaborator

> > I believe that in the case being referenced (Joyent's SmartOS), the path was really /zones/%z/cores/core.%f.%p, and yes, that directory was created by the machinery that created the zone (vmadm(1M)).
>
> https://www.illumos.org/issues/2123 That vmadm?

I expect so. I'm not sure where the path was actually managed, though. Maybe as Josh suggested it was done in the brand code.

@davepacheco
Collaborator

In #2088, @smklein asked:

> Is this ultimately the sled agent's job?

I'm not sure. We've got to decide first where the core files will go. That'll presumably be some directory on a ZFS dataset on some zpool. Who creates the pool? The dataset? The directory? My first thought is that we put all of the core files into one directory per Sled (i.e., don't create a per-zone dataset or even directory). That's because I'm not sure what we'd gain from separate datasets or directories per zone, and this way we don't have to do anything here when zones come and go.

Still, I'm not sure what storage we want to put these on, so I don't know what pool or dataset we want to put these on, so I don't know who's responsible for it.

@jclulow
Collaborator

jclulow commented Dec 24, 2022

Separate datasets per zone would allow us to have a separate quota for core files per zone, which I suspect would be valuable. It would be good to avoid a run-away core generator in zone A from preventing a subsequent single core file being generated by zone B. We'll also want an overall quota that inhibits cores from exhausting the space in the pool they're in.
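As a rough sketch of the two-level quota idea, the helpers below build argument lists for `zfs create` with a per-zone `quota` property and for `zfs set` with an overall quota on the parent dataset. The dataset names and sizes are hypothetical placeholders, and the helpers only construct arguments; they do not invoke zfs.

```rust
/// Arguments for creating a per-zone cores dataset with its own quota, e.g.
/// `zfs create -p -o quota=1G <parent>/<zone>`. The -p flag creates any
/// missing parent datasets; the quota applies to the per-zone dataset.
fn zfs_create_cores_args(parent: &str, zone: &str, quota: &str) -> Vec<String> {
    vec![
        "create".to_string(),
        "-p".to_string(),
        "-o".to_string(),
        format!("quota={quota}"),
        format!("{parent}/{zone}"),
    ]
}

/// Arguments for capping the total space used by all core datasets, e.g.
/// `zfs set quota=10G <parent>`, so cores cannot exhaust the pool.
fn zfs_set_overall_quota_args(parent: &str, quota: &str) -> Vec<String> {
    vec![
        "set".to_string(),
        format!("quota={quota}"),
        parent.to_string(),
    ]
}
```

Because ZFS quotas are enforced per dataset and also limit descendants, a runaway core generator in one zone would fill only its own dataset, and the parent quota would bound the total.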

I think we'll want to put this stuff on a dataset we create in some U.2 device or devices. A few thoughts:

  • we'll need to account for the space we're setting aside for core files in the same way that we'll need to account for Cockroach DB and Clickhouse data files and any other internal data files (RFD 118)
  • we'll eventually want to hoover these files up and put them somewhere other than where they're generated; this could be a sled-agent responsibility, but it might also be valuable as a separate and simpler process that would then not be in the same fault domain as sled-agent itself (e.g., what if it's the sled agent that keeps dumping core)
  • if we put all the core files on one U.2 device, that might ease management, but it would also mean that if that device fails we would lose all of the core files in the system
  • similarly, if we put the core files dataset for a zone on the same U.2 device as the rest of the storage for the zone, then if the crashes occur because of some underlying fault that also upsets the U.2 device or its ZFS pool, we might also not be able to write those core files
  • if we put the core files for a zone on another SSD, then we'd have to be careful responding to the removal of a device other than the device on which the zone is resident and repoint its cores dataset, etc; if they go on the same device as the zone storage, then at least it can all be torn down at once on device failure or removal

There is not, I suspect, a single best answer to this problem.

@leftwo
Contributor

leftwo commented Nov 11, 2023

Many/much/all of the work described here was completed in other PRs/Issues:

  • sled-agent performs archival of rotated logs for all zones onto U.2 debug dataset
  • Put process core dumps onto the U.2 debug zvol

I think if there are follow on issues, they should go here: #2478

@leftwo
Contributor

leftwo commented Nov 11, 2023

#2478

@leftwo leftwo closed this as completed Nov 11, 2023
5 participants