Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tight coupling of "HTTP servers, as interface" to "Implementations" is painful #5247

Closed
smklein opened this issue Mar 13, 2024 · 10 comments
Closed
Assignees

Comments

@smklein
Copy link
Collaborator

smklein commented Mar 13, 2024

This may be an issue related to the dropshot repo too, as Omicron isn't the sole user of this pattern.

This issue describes a process which is documented in our README, but which is still painful: https://github.com/oxidecomputer/omicron?tab=readme-ov-file#generated-service-clients-and-updating

Background

  • We use dropshot to generate a JSON OpenAPI spec, that is used to generate client crates
  • When using dropshot, we write servers (see: http_entrypoints.rs throughout the codebase) which utilize dropshot macros and ingest types to define the specification.
  • These endpoints are capable of emitting an openapi specification, if and only if they can compile. This often takes the from of an EXPECTORATE test, which spits out that aforementioned JSON file.

When is this a problem

Mostly: When things don't compile, it can be exceptionally difficult to "regenerate the interface we want".

In the cleanest scenario, you'd make changes to your implementation AND interface together, then run the "openapi spec generation tests", then apply those changes to all clients which exist in the codebase. However, there are scenarios where this breaks:

  1. Circular dependencies. For example, Nexus exposes an internal API to sled agent, and sled agent exposes an API to Nexus. This isn't necessarily bad! They both have reasons to communicate with each other. However, updating interactions between the two servers often involves re-running the JSON generation several times, then dealing with breakages, and then re-building. At an interface level, this is fine, but it's painful that "the rest of the crate (nexus or sled agent)'s implementation must also compile first, before the openapi test can run". This isn't too bad for adding new endpoints, but for altering interfaces that exist, it often means filling up a codebase with todo!() and commented-out code to get to a compiling state, then undoing all those changes immediately afterwards.
  2. Merge conflicts. This forces you to deal with an openapi spec that is out-of-sync from main, but also potentially out-of-sync from your current tree. Again, this forces the same type of solution: find everyone using types that are generated by the openapi spec, and "force them to do the wrong thing, as long as it compiles".

A better world?

It would be really nice to decouple the implementation of these interfaces from the interface declarations themselves.
For example, if the Sled Agent API was not actually part of the sled-agent crate, but rather, a separate "sled-agent-interface" crate, we could do the following:

  • EXPECTORATE=overwrite cargo nt -p sled-agent-interface could re-generate the client bindings, regardless of whether or not "the rest of sled agent compiles"
  • The sled-agent crate could depend on the sled-agent-interface crate, and use it as a normal part of exposing an HTTP server
  • Any other code attempting to implement the interface could share the server code. For example, the "simulated sled agent" -- which currently copies the implementation from sled-agent's http_entrypoints.rs -- could actually rely on the same interface.

The main advantage here would be "when you want to generate the new interface, do that first, before necessarily touching the implementation" and that would work. This should remove the need for "commented-out" code to get things to compile, since it would no longer be necessary to make the rest of the crate arbitrarily compile before generating the new interface.

@sunshowers
Copy link
Contributor

sunshowers commented Mar 13, 2024

We already have a small example of this pattern in installinator-artifactd, for reference.

edit: here's the trait

@davepacheco
Copy link
Collaborator

I'm not totally sure what you're proposing. To generate the OpenAPI spec, you'd need the entire dropshot server in sled-agent-interface, right? So is the idea that the types and endpoints would live in sled-agent-interface? Is the idea that SledAgent itself would impl some SledAgentApi trait, with one method for each endpoint, and the endpoint definitions would all be one liners that find the trait object and call into trait methods? I feel like there's cost (mostly in terms of developer complexity/experience) in the extra level of indirection that is worse for me than the problem being described here. But this is definitely subjective -- I hear that others are experiencing more pain with this than I am.

@smklein
Copy link
Collaborator Author

smklein commented Mar 13, 2024

I'm admittedly being loose with what I'm proposing to avoid being too prescriptive -- I wanted the issue to mostly describe the problem space, not the solution space, beyond "decoupling would be nice".

Personally I'd really like the following:

Dropshot currently wraps an impl block of a concrete implementation. It could wrap a trait instead:

#[dropshot_server]
pub trait SledAgentService {
  #[endpoint { ... }]
  async fn xyz(ctx: RequestContext<Self>, query_params, path_params);
}

This could exist in a sled-agent-interface crate. In sled-agent proper, we could implement this trait with the following:

impl SledAgentService for SledAgent {
  async fn xyz(ctx: RequestContext<Self>, query_params, path_params) {
    ...
  }
}

Granted, it's also possible to do this separation "below the dropshot layer" by putting a Box<dyn Trait> within the RequestContext, and plugging whichever implementation (real or stubbed) we want to use.

But this would:

  • Make it possible to compile the interface crate in isolation, and "fix the interface first"
  • Make it pretty easy to share interfaces where there are multiple implementations (e.g., for "simulated servers")
  • Completely remove the need to "comment out implementations to rebuild the interface"

@andrewjstone
Copy link
Contributor

I'm pretty strongly in favor of this decoupling regardless of implementation. I think the problems Sean has run into are real, and it is often frustrating to me to get these OpenAPI specs to generate even with just my own local changes. I don't think it's too much boilerplate either. We have much worse elsewhere. Some of it rhymes with weasel.

@ahl
Copy link
Contributor

ahl commented Mar 20, 2024

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

@andrewjstone
Copy link
Contributor

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

I don't think anyone wants to change the autogeneration of specs from dropshot. That's the best part of dropshot. My complaint was mainly about when I create my own circular dependency, not how the spec generation happens. Although I know @ahl reasonably figured I was just being an ahole 😄

@ahl
Copy link
Contributor

ahl commented Mar 20, 2024

I don't think anyone wants to change the autogeneration of specs from dropshot. That's the best part of dropshot. My complaint was mainly about when I create my own circular dependency, not how the spec generation happens. Although I know @ahl reasonably figured I was just being an ahole 😄

To be clear: I think it's reasonable to consider an alternate, additional interface.

@sunshowers
Copy link
Contributor

sunshowers commented Mar 20, 2024

So, we'd define the interface first as a trait. Would we need some way to emit the OpenAPI document from that trait (indirectly, I'd expect)?

Yes. I think that should be relatively easy.

An alternative (more traditional) approach would be to define the interface e.g. in OpenAPI directly or in something like TypeSpec and then generate a trait (or stubs or both) from that.

I don't want to write OpenAPI directly, but TypeSpec looks quite interesting! My worry is the loss of fidelity inherent in generating Rust from OpenAPI -- we've run into issues with this within clients which have required lots of From impls and occasional replace directives within progenitor.

Presumably we'd imagine this as an alternative mechanism to the current approach in Dropshot rather than as a replacement?

Definitely an alternative, yeah. At the very least we'll want to convert one service over at a time. In the longer term, it would be interesting to see if the alternative is better enough that it should become the only way to write Dropshot servers.

@ahl ahl self-assigned this Mar 21, 2024
@ahl
Copy link
Contributor

ahl commented Mar 21, 2024

I spoke with @smklein about this a bit. These seem like good goals for an alternate interface to Dropshot. I'm inclined to prioritize this behind two other dropshot-related concepts I've been considering: multi-version support and multi-API per server support. But it's helpful to consider this alongside these other bodies of work.

I've assigned this to myself, but if someone else comes along with cycles and interest by all means feel free to take it over--I'm not trying to exclude others from investigating.

@sunshowers
Copy link
Contributor

As of #6486, this is now done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants