Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

smklein · 2024-03-13T00:27:36Z

This may be an issue related to the dropshot repo too, as Omicron isn't the sole user of this pattern.

This issue describes a process which is documented in our README, but which is still painful: https://github.com/oxidecomputer/omicron?tab=readme-ov-file#generated-service-clients-and-updating

Background

We use dropshot to generate a JSON OpenAPI spec, that is used to generate client crates
When using dropshot, we write servers (see: http_entrypoints.rs throughout the codebase) which utilize dropshot macros and ingest types to define the specification.
These endpoints are capable of emitting an openapi specification, if and only if they can compile. This often takes the from of an EXPECTORATE test, which spits out that aforementioned JSON file.

When is this a problem

Mostly: When things don't compile, it can be exceptionally difficult to "regenerate the interface we want".

In the cleanest scenario, you'd make changes to your implementation AND interface together, then run the "openapi spec generation tests", then apply those changes to all clients which exist in the codebase. However, there are scenarios where this breaks:

Circular dependencies. For example, Nexus exposes an internal API to sled agent, and sled agent exposes an API to Nexus. This isn't necessarily bad! They both have reasons to communicate with each other. However, updating interactions between the two servers often involves re-running the JSON generation several times, then dealing with breakages, and then re-building. At an interface level, this is fine, but it's painful that "the rest of the crate (nexus or sled agent)'s implementation must also compile first, before the openapi test can run". This isn't too bad for adding new endpoints, but for altering interfaces that exist, it often means filling up a codebase with todo!() and commented-out code to get to a compiling state, then undoing all those changes immediately afterwards.
Merge conflicts. This forces you to deal with an openapi spec that is out-of-sync from main, but also potentially out-of-sync from your current tree. Again, this forces the same type of solution: find everyone using types that are generated by the openapi spec, and "force them to do the wrong thing, as long as it compiles".

A better world?

It would be really nice to decouple the implementation of these interfaces from the interface declarations themselves.
For example, if the Sled Agent API was not actually part of the sled-agent crate, but rather, a separate "sled-agent-interface" crate, we could do the following:

EXPECTORATE=overwrite cargo nt -p sled-agent-interface could re-generate the client bindings, regardless of whether or not "the rest of sled agent compiles"
The sled-agent crate could depend on the sled-agent-interface crate, and use it as a normal part of exposing an HTTP server
Any other code attempting to implement the interface could share the server code. For example, the "simulated sled agent" -- which currently copies the implementation from sled-agent's http_entrypoints.rs -- could actually rely on the same interface.

The main advantage here would be "when you want to generate the new interface, do that first, before necessarily touching the implementation" and that would work. This should remove the need for "commented-out" code to get things to compile, since it would no longer be necessary to make the rest of the crate arbitrarily compile before generating the new interface.

The text was updated successfully, but these errors were encountered:

sunshowers · 2024-03-13T00:29:29Z

We already have a small example of this pattern in installinator-artifactd, for reference.

edit: here's the trait

davepacheco · 2024-03-13T00:44:45Z

I'm not totally sure what you're proposing. To generate the OpenAPI spec, you'd need the entire dropshot server in sled-agent-interface, right? So is the idea that the types and endpoints would live in sled-agent-interface? Is the idea that SledAgent itself would impl some SledAgentApi trait, with one method for each endpoint, and the endpoint definitions would all be one liners that find the trait object and call into trait methods? I feel like there's cost (mostly in terms of developer complexity/experience) in the extra level of indirection that is worse for me than the problem being described here. But this is definitely subjective -- I hear that others are experiencing more pain with this than I am.

smklein · 2024-03-13T01:20:45Z

I'm admittedly being loose with what I'm proposing to avoid being too prescriptive -- I wanted the issue to mostly describe the problem space, not the solution space, beyond "decoupling would be nice".

Personally I'd really like the following:

Dropshot currently wraps an impl block of a concrete implementation. It could wrap a trait instead:

#[dropshot_server]
pub trait SledAgentService {
  #[endpoint { ... }]
  async fn xyz(ctx: RequestContext<Self>, query_params, path_params);
}

This could exist in a sled-agent-interface crate. In sled-agent proper, we could implement this trait with the following:

impl SledAgentService for SledAgent {
  async fn xyz(ctx: RequestContext<Self>, query_params, path_params) {
    ...
  }
}

Granted, it's also possible to do this separation "below the dropshot layer" by putting a Box<dyn Trait> within the RequestContext, and plugging whichever implementation (real or stubbed) we want to use.

But this would:

Make it possible to compile the interface crate in isolation, and "fix the interface first"
Make it pretty easy to share interfaces where there are multiple implementations (e.g., for "simulated servers")
Completely remove the need to "comment out implementations to rebuild the interface"

andrewjstone · 2024-03-13T16:44:42Z

I'm pretty strongly in favor of this decoupling regardless of implementation. I think the problems Sean has run into are real, and it is often frustrating to me to get these OpenAPI specs to generate even with just my own local changes. I don't think it's too much boilerplate either. We have much worse elsewhere. Some of it rhymes with weasel.

ahl · 2024-03-20T17:36:18Z

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

andrewjstone · 2024-03-20T17:46:17Z

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

I don't think anyone wants to change the autogeneration of specs from dropshot. That's the best part of dropshot. My complaint was mainly about when I create my own circular dependency, not how the spec generation happens. Although I know @ahl reasonably figured I was just being an ahole 😄

ahl · 2024-03-20T18:09:52Z

I don't think anyone wants to change the autogeneration of specs from dropshot. That's the best part of dropshot. My complaint was mainly about when I create my own circular dependency, not how the spec generation happens. Although I know @ahl reasonably figured I was just being an ahole 😄

To be clear: I think it's reasonable to consider an alternate, additional interface.

sunshowers · 2024-03-20T20:40:24Z

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

Yes. I think that should be relatively easy.

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

I don't want to write OpenAPI directly, but TypeSpec looks quite interesting! My worry is the loss of fidelity inherent in generating Rust from OpenAPI -- we've run into issues with this within clients which have required lots of From impls and occasional replace directives within progenitor.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

Definitely an alternative, yeah. At the very least we'll want to convert one service over at a time. In the longer term, it would be interesting to see if the alternative is better enough that it should become the only way to write Dropshot servers.

ahl · 2024-03-21T21:01:51Z

I spoke with @smklein about this a bit. These seem like good goals for an alternate interface to Dropshot. I'm inclined to prioritize this behind two other dropshot-related concepts I've been considering: multi-version support and multi-API per server support. But it's helpful to consider this alongside these other bodies of work.

I've assigned this to myself, but if someone else comes along with cycles and interest by all means feel free to take it over--I'm not trying to exclude others from investigating.

sunshowers · 2024-09-06T23:25:48Z

As of #6486, this is now done.

ahl self-assigned this Mar 21, 2024

This was referenced Apr 28, 2024

[9/n] [dropshot] add API traits with support for endpoints oxidecomputer/dropshot#984

Merged

[WIP] example of trait-based Dropshot server #5653

Closed

sunshowers self-assigned this Sep 6, 2024

sunshowers closed this as completed Sep 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

smklein commented Mar 13, 2024

sunshowers commented Mar 13, 2024 •

edited

Loading

davepacheco commented Mar 13, 2024

smklein commented Mar 13, 2024

andrewjstone commented Mar 13, 2024

ahl commented Mar 20, 2024

andrewjstone commented Mar 20, 2024

ahl commented Mar 20, 2024

sunshowers commented Mar 20, 2024 •

edited

Loading

ahl commented Mar 21, 2024

sunshowers commented Sep 6, 2024

Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

Comments

smklein commented Mar 13, 2024

Background

When is this a problem

A better world?

sunshowers commented Mar 13, 2024 • edited Loading

davepacheco commented Mar 13, 2024

smklein commented Mar 13, 2024

andrewjstone commented Mar 13, 2024

ahl commented Mar 20, 2024

andrewjstone commented Mar 20, 2024

ahl commented Mar 20, 2024

sunshowers commented Mar 20, 2024 • edited Loading

ahl commented Mar 21, 2024

sunshowers commented Sep 6, 2024

sunshowers commented Mar 13, 2024 •

edited

Loading

sunshowers commented Mar 20, 2024 •

edited

Loading