Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

Closed
nhtruong opened this issue Feb 8, 2024 · 18 comments
Closed

[PROPOSAL] Replace Smithy with a native OpenAPI spec #189

nhtruong opened this issue Feb 8, 2024 · 18 comments
Assignees
Labels
enhancement New feature or request

Comments

@nhtruong
Copy link
Collaborator

nhtruong commented Feb 8, 2024

In the past discussions on Smithy vs OpenAPI [1] [2], we decided to have OpenAPI as the official spec of OpenSearch but use Smithy to describe the API as it's easier to organize:

  • We won’t use Smithy features that are not translatable to OpenAPI.
  • Any update to the Smithy spec must always be reflected in the OpenAPI spec.
  • Smithy can be dropped in the future if necessary, allowing contributors to update to the OpenAPI spec directly rather than through the Smithy spec.

After a year of going with this strategy, as we've learned a lot more about the quirks of OpenSearch API that are not compatible with Smithy's many guardrails, writing the spec in Smithy is becoming more challenging:


1. Smithy does not support certain data types for path and query-string parameters

Many OpenSearch operations have URL parameters that accepts either a single value as a string, an array of values separated by commas. That is, these parameters have string | string[] as data type. This can be described in Smithy as a union. However, Smithy doesn't allow unions in its URL parameters (It also doesn't allow arrays/lists in path parameters specifically).

WORKAROUND: represent the data type as a string in Smithy but mark it with xDataType('string | string[]'). If the member of the list is not generic, we must also include xArrayItems('MemberName'), and xEnum([...]) when applied.

smithy-lang/smithy#2131

2. Smithy does not support untagged unions.

Only tagged unions are supported, which enforce a wrapping structure, while we have fields that are for example int | boolean. This would map to a oneOf in OpenAPI.

smithy-lang/smithy#2143

3. OpenSearch operations cause URI Conflict errors in Smithy

For example, since operations regarding an index or a list of indices have URI starting with /{index}/..., other operations, say GET _cat/indices, cause URL Conflict for Smithy as it sees _cat as an {index} parameter.

WORKAROUND: Mark every operations with @suppress(["HttpUriConflict"]) and every path parameter with @pattern("^(?!_|template|query|field|point|clear|usage|stats|hot|reload|painless).+$") and @xPattern("")

smithy-lang/smithy#1539

4. Unable to customise inlining and $ref naming strategy in OpenAPI conversion

If we are able to customise this strategy we can stop certain “simple” types from always being inlined in the OpenAPI conversion, and we can implement a custom naming scheme for the $refs such that we can name them to match the ElasticSearch spec we inherit. (e.g. #/components/schemas/_type:Duration)

smithy-lang/smithy#2144

5. Smithy mixins are always flattened/erased.

We’re unable to easily represent an inheritance hierarchy within Smithy that is translated to OpenAPI. Potential solution by introducing flag in translator that somehow retains mixins and puts them into allOf in OpenAPI.

smithy-lang/smithy#2145


In short, Smithy, with its many guardrails, is unfit to describe OpenSearch API:

  • We can't correctly describe OpenSearch API with Smithy
  • Translation from Smithy to OpenAPI is not lossless
  • There are too many duck-tape workarounds to circumvent Smithy guardrails

The many gotchas, and conventions we have to establish (as seen in the DEVELOPER_GUIDE) also deter people from contributing to this project. People maintaining this repo, on top of understanding OpenAPI, must also learn Smithy and the workarounds to make it work with OpenSearch API.

Given the difficulties in mapping OpenSearch's API to Smithy, I propose we instead maintain our API definitions directly in OpenAPI. Smithy a great tool for creating new APIs thanks to its many guardrails, but OpenSearch API has many unconventional traits that are simply incompatible with it.

@dblock
Copy link
Member

dblock commented Feb 8, 2024

  1. Are there things we lose by switching to writing OpenAPI directly? For example, we will likely lose some of the nice features of Smithy such as traits. What about the folder structure? Will developers be editing a giant file?
  2. Are there related issues for the ones described. I believe URI Pattern Validation and Conflict Resolution design doc smithy-lang/smithy#1539 is (3). Maybe we should open issues for others and see what the smithy folks say about it?

@dblock dblock added the enhancement New feature or request label Feb 8, 2024
@nhtruong
Copy link
Collaborator Author

nhtruong commented Feb 8, 2024

  1. We will definitely lose all the features that are unique to Smithy. While its many guardrails are not compatible with existing OpenSearch API, they are actually great for new endpoints and new applications. Many of Smithy traits do have counterparts in OpenAPI and we can leverage OpenAPI extensions to cover the missing ones.

  2. That might solve Bullet point number 2. But looks like it's still a draft for a design from Dec 2022? Though I see that restriction in Smithy as a necessary guardrail to prevent conflicting endpoints, so I didn't think of asking Smithy to "correct" it as it's not a bug to me. It's just not compatible with OpenSearch existing endpoints.

@reta
Copy link
Contributor

reta commented Feb 9, 2024

👍 to go OpenAPI full speed, Smithy gets its credits but the "recent" updates to OpenAPI spec (3.1) are buying us much more (specifically, JSON Schema support)

@Tokesh
Copy link
Collaborator

Tokesh commented Feb 10, 2024

Its sounds logical that we simply and write directly in OpenAPI.

Not all Smithy models are translated to named OpenAPI schema

OpenSearch operations cause URI Conflict errors in Smithy

I have encountered the two problems described above. It would be interesting to try writing directly into OpenAPI without the extra layer of Smithy.

@VachaShah
Copy link
Collaborator

  1. Are we planning to write plain json for API specs? Or is there a more developer friendly way we are thinking to adopt to write OpenAPI effectively?
  2. If we go the direct OpenAPI route, are there any other issues we would face when writing OpenSearch APIs in OpenAPI?

Just trying to understand that the issues we are facing currently with Smithy are entirely related to how Smithy works or are they because OpenSearch APIs are written in a certain way and we end up writing workarounds with OpenAPI as well?

@nhtruong
Copy link
Collaborator Author

nhtruong commented Feb 14, 2024

@VachaShah

  1. We definitely want to avoid editing a giant JSON/YAML file by hand. OpenAPI uses JSON Pointer in its reference objects that supports reusing and defining components in different files. We can leverage this and break our api into smaller files by namespaces, for example. And write a script to merge those into a single file like we have today.
  2. OpenAPI is very flexible and forgiving (it's both its strength and weakness comparing to Smithy), which is suitable for a peculiar API like OpenSearch.

Of course we will lose out on the biggest perk that comes out of the box with Smithy: Organization of files. Like mentioned in the first bullet, OpenAPI does support that but we will have to be smart about it.

I'm working on the Proof of Concept on how an editable pure OpenSearch API Spec would look like right now.

@nhtruong
Copy link
Collaborator Author

nhtruong commented Feb 20, 2024

Here's the POC Branch: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi
Check out the DEVELOPER_GUIDE to have a better understanding of the new layout.

Note that the POC spec are not the 1:1 translation from the current spec:

  • The POC spec also includes the missing request and response bodies
  • Many shemas and references have be renamed and/or moved to heed the new guidelines stated in the Dev Guide, and to keep things tidy and consistent.

@dblock
Copy link
Member

dblock commented Feb 27, 2024

Looks good. Lowercase tools ;) @Xtansia thoughts?

@Xtansia
Copy link
Collaborator

Xtansia commented Feb 27, 2024

It looks good, I've had good success updating my Java codegen POC to operate off the split spec: opensearch-project/opensearch-java#366

Only issue to note is that at least in the case of the multi-file spec, Java/swagger-parser is unhappy with the $refs that include extra #s like below, because it attempts to parse the $ref as a URI (because it's a relative reference calculated from the current document base URI) and # is a reserved character and should be percent encoded unless it's the fragment delimiter:

 - $ref: '#/components/parameters/cluster.health#query:expand_wildcards'

@dblock
Copy link
Member

dblock commented Feb 28, 2024

Ideas.

  1. As part of this change I'd like to propose we stop publishing the combined spec as a PR back into the repo. Can we have a "build" for that, and publish something versioned that can be a "release" that ends up matching the version of the product? I think it will become important when we have caught up with the spec, or have major versions.
  2. Can/should OpenSearch.openapi.yaml be broken up where all the API-related contents are in subfolders?

@nhtruong
Copy link
Collaborator Author

nhtruong commented Feb 28, 2024

@Xtansia RE $ref in split spec. Java Swagger-Parser doesn't have the same complain in single-file spec? It matters because if we can limit programmatic consumption to single-file spec, this is no longer an issue. If it's still an issue, do you think replacing # with : would work?

@dblock

  1. I'm thinking having the single-file updated as the contributor lint their changes (The linter will need to perform OpenAPI validation on the single-file version as part of its process anyway). It's nice to see how changes to the component files affect the resulted single-file spec. It gives a different angle to the changes when reviewing the PR. That is, the PR must also include the update to the single-file spec, and a workflow will check if the changes to the split spec match that of the single-file spec.
    We would still have a build process like you proposed of course.
  2. Aren't all contents in the spec API related contents? Can you elaborate on this idea?

@Xtansia
Copy link
Collaborator

Xtansia commented Feb 29, 2024

@nhtruong for the $refs it's not blocking and I wasn't relying on the libraries built in resolving anyway. However it is something to keep in mind even in the single file spec if we want to be fully correct according to the OpenAPI specification. Valid characters for a URI fragment are: A-Za-z, 0-9, -._~!$&'()*+,;=:@/? or otherwise percent-encoded characters

@nhtruong
Copy link
Collaborator Author

@Xtansia I see. I'll convert all # to @ in $ref

@dblock
Copy link
Member

dblock commented Mar 1, 2024

  1. I'm thinking having the single-file updated as the contributor lint their changes

I understand. I think the question is really whether we check in the combined result or not. My initial thought is not because it's a generated file, but I don't feel that strongly about it.

We would still have a build process like you proposed of course.

I'm thinking a release process is more important, something we don't have today in this repo. For example, creating a semver-versioned tag would trigger a proper release like here.

Aren't all contents in the spec API related contents? Can you elaborate on this idea?

My issue is simpler, https://github.com/opensearch-project/opensearch-api-specification/blob/feature/native_openapi/OpenSearch.openapi.yaml is huge and difficult to edit. Maybe it can be broken up into a file per path.

@nhtruong
Copy link
Collaborator Author

nhtruong commented Mar 3, 2024

That file you linked is not meant to be edited. The spec is split into smaller files in the spec folder: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi/spec

The OpenSearch.openapi.yaml in the root folder is generated from the files in the folder above. It's meant to be consumed programmatically and compare the diff of different versions of the spec.

@dblock dblock changed the title [PROPOSAL] Dropping Smithy in favor of Native OpenAPI [PROPOSAL] Replace Smithy with a native OpenAPI spec Mar 4, 2024
@dblock
Copy link
Member

dblock commented Mar 4, 2024

That file you linked is not meant to be edited. The spec is split into smaller files in the spec folder: https://github.com/opensearch-project/opensearch-api-specification/tree/feature/native_openapi/spec

The OpenSearch.openapi.yaml in the root folder is generated from the files in the folder above. It's meant to be consumed programmatically and compare the diff of different versions of the spec.

I see. That goes back to my other comment about checking in generated files. We should either remove those from the repo, or at least have a build folder or something like that for them and add markers/comments into the file(s) to say they should not be edited.

@nhtruong
Copy link
Collaborator Author

nhtruong commented Mar 26, 2024

Here's the PR for the switch from Smithy to OpenAPI: #208. It's easier to review this PR like reviewing a new repo by inspecting the feature branch.

We'll need to disable the coverage workflow to get this merged. I'm working on 2 follow-up PRs:

  • Make coverage workflow work for yaml specs
  • Add Validator to make sure that new changes into the spec folder follow guidelines in the developer guide.

@nhtruong
Copy link
Collaborator Author

nhtruong commented Apr 2, 2024

Closing this since #208 has already been merged.

@nhtruong nhtruong closed this as completed Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants