Standardizing bundle format across clients #8

Closed
haydentherapper opened this issue Jul 20, 2023 · 25 comments
Labels
enhancement New feature or request

Comments

@haydentherapper
Contributor

I would like to propose a standard bundle output format for Sigstore clients. The user journey we're targeting is "I can sign an artifact in client X and verify it in client Y".

This has come up a few times across different communities:

Before we go down this path, I'd like to also know how clients are currently outputting the bundle format. @woodruffw, @kommendorkapten, @bdehamer for python and JS, what are you doing currently? Note that we don't have bundle output for Cosign/sigstore-go yet.

Here's a few options:

cc @trishankatdatadog @mnm678 @TomHennen @znewman01 @bobcallaway for thoughts too

@haydentherapper
Contributor Author

cc @steiza @di too

@woodruffw
Member

I'm strongly in favor of this, and the sooner the better!

> what are you doing currently?

For sigstore-python, we currently emit the Sigstore bundle format, i.e. the one defined here:

https://github.com/sigstore/protobuf-specs/blob/85dce20afb5e8ad9e170328abb7ff2e61b758958/protos/sigstore_bundle.proto#L70-L92

We emit that bundle format in JSON form, using proto3's defined JSON mapping:

https://protobuf.dev/programming-guides/proto3/#json


Generally speaking, I'm in favor of having the Sigstore bundle format (emitted as JSON) be the standard bundle format across clients: my understanding is that it's what it was intended to be, and both the Python and JS (AFAIK) clients have downstream users who produce and consume Sigstore bundles (e.g., the CPython release process).

I'm still a little hazy on the pros/cons here, but on the client development side I think it makes sense to have DSSE signatures embedded within the Sigstore bundle (i.e. as Bundle.content.dsse_envelope). My rationale there is that it's already designed into the Sigstore bundle format and would be relatively easy to add client support for, whereas inverting the envelope structure (i.e. DSSE-Envelope(Sigstore-Bundle)) results in a double-signature scenario.

@TomHennen

Can you say more about what you mean by the double signature scenario in DSSE-Envelope(Sigstore-Bundle)? I believe there would just be one signature in that case?

One reason to prefer the DSSE approach is that throughout the supply chain it may be that not everyone is using Sigstore. Putting the Sigstore signature in the DSSE makes it easier to transmit all sorts of attestations together even if they were signed using different methods. Certainly in any given ecosystem folks might just be using Sigstore, but I suspect most e2e supply chains are heterogeneous, and a solution that can handle multiple methods of signing would be easier to work with?

@woodruffw
Member

> Can you say more about what you mean by the double signature scenario in DSSE-Envelope(Sigstore-Bundle). I believe there would just be one signature in that case?

A Sigstore bundle contains one or more signatures and so does a DSSE envelope (I think? source), so unless I'm misunderstanding I believe this would mean that the envelope would be signing over the bundle, which itself contains signatures (which themselves may be formatted as DSSE envelopes).

@woodruffw
Member

woodruffw commented Jul 20, 2023

Specifically, here's the Sigstore Bundle format:

message Bundle {
        // MUST be application/vnd.dev.sigstore.bundle+json;version=0.1
        // or application/vnd.dev.sigstore.bundle+json;version=0.2
        // when encoded as JSON.
        string media_type = 1;
        // When a signer is identified by a X.509 certificate, a verifier MUST
        // verify that the signature was computed at the time the certificate
        // was valid as described in the Sigstore client spec: "Verification
        // using a Bundle".
        // <https://docs.google.com/document/d/1kbhK2qyPPk8SLavHzYSDM8-Ueul9_oxIMVFuWMWKz0E/edit#heading=h.x8bduppe89ln>
        VerificationMaterial verification_material = 2 [(google.api.field_behavior) = REQUIRED];
        oneof content {
                dev.sigstore.common.v1.MessageSignature message_signature = 3 [(google.api.field_behavior) = REQUIRED];
                // A DSSE envelope can contain arbitrary payloads.
                // Verifiers must verify that the payload type is a
                // supported and expected type. This is part of the DSSE
                // protocol which is defined here:
                // <https://github.com/secure-systems-lab/dsse/blob/master/protocol.md>
                io.intoto.Envelope dsse_envelope = 4 [(google.api.field_behavior) = REQUIRED];
        }
        // Reserved for future additions of artifact types.
        reserved 5 to 50;
}

so either a message_signature or a dsse_envelope is required. This means that Sigstore bundle signatures can't be detached a la PKCS#7/CMS (which IMO is a good thing; lots of silly bugs happen because of detached PKCS#7 signatures!)
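
A minimal sketch of the "exactly one of" constraint, as a consumer of the JSON form might check it. It assumes the lowerCamelCase key names that proto3's JSON mapping derives from the message above (`messageSignature`, `dsseEnvelope`); the function name `bundle_content_kind` is made up for illustration and is not part of any Sigstore client.

```python
import json

# Keys as they would appear under proto3's lowerCamelCase JSON mapping of the
# `content` oneof in the Bundle message quoted above.
CONTENT_FIELDS = ("messageSignature", "dsseEnvelope")


def bundle_content_kind(bundle_json: str) -> str:
    """Return which `content` variant a JSON-encoded bundle carries.

    Hypothetical helper: raises ValueError unless exactly one variant is present.
    """
    bundle = json.loads(bundle_json)
    present = [name for name in CONTENT_FIELDS if name in bundle]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {CONTENT_FIELDS}, found {present}"
        )
    return present[0]
```

A bundle with neither field (a "detached" bundle) or with both is rejected up front, before any cryptographic verification is attempted.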

@codysoyland
Member

> We emit that bundle format in JSON form, using proto3's defined JSON mapping:
>
> https://protobuf.dev/programming-guides/proto3/#json

This is also how we encode bundles in sigstore-js (which is used for npm provenance).

@adityasaky
Member

> A Sigstore bundle contains one or more signatures and so does a DSSE envelope (I think? source), so unless I'm misunderstanding I believe this would mean that the envelope would be signing over the bundle, which itself contains signatures (which themselves may be formatted as DSSE envelopes).

@woodruffw the proposal for adding sigstore specific information to a DSSE signature uses VerificationMaterial rather than the whole sigstore bundle. That way, we don't have the Envelope(Bundle(Envelope)) nesting of signatures.

@woodruffw
Member

> the proposal for adding sigstore specific information to a DSSE signature uses VerificationMaterial rather than the whole sigstore bundle. That way, we don't have the Envelope(Bundle(Envelope)) nesting of signatures.

Aha! That's what I was missing. That makes sense then.

Could you link to that proposal?

@adityasaky
Member

The general discussion on signature extensions is here: secure-systems-lab/dsse#59. That thread also includes some examples about sigstore; the confusion between using Bundle and VerificationMaterial has come up there as well.

I opened secure-systems-lab/dsse#61 to add extensions to DSSE. That includes a tentative pass at the sigstore extension, though writing it made me wonder if the DSSE spec should be the place for it. Either way, it also points to the VerificationMaterial definition in the sigstore proto.

@woodruffw
Member

Thanks. Given that both of these container formats basically end up being unauthenticated wrappers for certificates and transparency entries, is there a strict advantage to any particular ecosystem emitting a tweaked DSSE envelope vs. a Sigstore bundle?

In other words: the signatures in question aren't over the other parts of the bundle (because they can't be) so anybody who wants to turn e.g. a Sigstore bundle into a DSSE envelope can do so via a transparent proxy, client-side tooling, etc.

My intuition is that that might be the best route forward in situations like this: every change to output formats causes a decent amount of pain for downstream users of Sigstore, and having clients that emit Sigstore bundles doesn't actually block an equivalent DSSE envelope representation (since all the relevant bits can be re-packed without compromising the underlying signature).

@adityasaky
Member

Yes, a sigstore bundle can be turned into a DSSE envelope with the proposed extension and vice versa.

We started looking at embedding verification material in DSSE instead of using the sigstore bundle format to simplify situations where we have multiple envelopes of which only a subset use sigstore. Specifically, @mnm678 and I discussed this in the context of TUF's* TAP-18 that adds support for TUF roles to be signed using sigstore. in-toto has a generalized proposal for using X.509 which would be used for sigstore signing.

In these mixed signing method cases, either:

  1. sigstore-specific envelopes must include the verification material and clients receive every piece of metadata as an envelope
  2. clients must be able to identify from just the envelope which ones must actually be swapped out for a sigstore bundle
  3. producers must ship a mixture of regular envelopes and serialized sigstore bundles

The simplest / cleanest option from that perspective seemed to be 1. There's the option of using sigstore bundles for all envelopes but that'll likely collide with some other signing ecosystem with its own set of requirements.

* I'm not a TUF maintainer.

@bdehamer

Much like sigstore-python, the sigstore-js library emits the JSON-serialized form of the Sigstore Bundle. For the provenance statements we're generating for npm packages, we're using these JSON-serialized bundles as the way this information is transmitted to the npm registry -- changing the format at this point is NOT impossible, but it would be a big lift.

@woodruffw
Member

Yeah, the same is true of sigstore-python's current downstream use: switching from separate .crt and .sig inputs was already decently painful and the bundle itself is changing a bit between 0.1 and 0.2, so I'm a little change-shy about throwing another new format at users 🙂

IMO, given that these are fundamentally both unauthenticated containers, retaining the existing Sigstore bundle protobuf + JSON-serialized form as the "standard" client output makes the most sense. Clients could additionally support converting to different-but-equivalent outputs, but we've already made some conformance suite progress with the current bundle format (and it sounds like the JS client's use is even more baked in than Python's is).

@znewman01
Contributor

Given that we can translate between these formats, I'm less concerned about any specific decision here. The real thing to avoid is churn in terms of changing the default behavior. Maybe something like:

  1. Default output format from signers SHOULD be Sigstore-bundle-as-JSON (for backwards compat). Signers MAY support DSSE output too.
  2. Verifiers MUST be able to parse Sigstore-bundle-as-JSON, SHOULD be able to parse DSSE.

This doesn't feel like a huge lift from an implementation complexity standpoint—you'd be able to implement VerifyDsse() as VerifyBundle()+TranslateDsseToBundle().
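
The composition described above might look roughly like the following sketch. Everything about the DSSE side is an assumption: the extension proposal was still under discussion, so the per-signature key `verificationMaterial` is a hypothetical placement, and `translate_dsse_to_bundle`, `verify_dsse`, and the injected `verify_bundle` callable are illustrative names rather than any client's real API. Only the media type string is taken from the Bundle definition quoted earlier in the thread.

```python
from typing import Any, Callable, Dict

Bundle = Dict[str, Any]
Envelope = Dict[str, Any]

# One of the two media types listed in the Bundle proto comment.
BUNDLE_MEDIA_TYPE = "application/vnd.dev.sigstore.bundle+json;version=0.2"


def translate_dsse_to_bundle(envelope: Envelope) -> Bundle:
    """Re-pack an extended DSSE envelope as a Sigstore-bundle-shaped dict.

    ASSUMPTION: the Sigstore verification material rides along inside the
    single signature object under the hypothetical key "verificationMaterial".
    """
    signatures = envelope["signatures"]
    if len(signatures) != 1:
        raise ValueError("this sketch handles single-signature envelopes only")
    signature = dict(signatures[0])  # copy so the caller's envelope is untouched
    material = signature.pop("verificationMaterial")
    return {
        "mediaType": BUNDLE_MEDIA_TYPE,
        "verificationMaterial": material,
        "dsseEnvelope": {
            "payload": envelope["payload"],
            "payloadType": envelope["payloadType"],
            "signatures": [signature],
        },
    }


def verify_dsse(envelope: Envelope, verify_bundle: Callable[[Bundle], bool]) -> bool:
    """VerifyDsse() expressed as VerifyBundle() + TranslateDsseToBundle()."""
    return verify_bundle(translate_dsse_to_bundle(envelope))
```

Because neither container is covered by the signature, the re-packing does not touch the signed bytes; the existing bundle verifier does all the real work.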

@kommendorkapten
Member

Also, I would like to point out that even though the "Sigstore bundle" is called "Sigstore", it's not strictly limited to Sigstore clients/contents.
It's a container that captures signatures, certificates and related verification material. So it can group an X.509 certificate chain, RFC3161 signed timestamps and a DSSE envelope. The only part which is Sigstore specific is the transparency log entry (Rekor).
So for a signer/verifier only using X.509 certificates and RFC3161 signed timestamps, the bundle has you covered.

I'm in favour of having Sigstore clients default to producing Sigstore bundles as JSON (as the JavaScript and Python clients do today), but as mentioned they can easily be translated to the other format.

What I like about the current bundle format is that we build a complex message via layering/composition. DSSE is a very simple format (which is a good thing IMO), which we can bundle up with other (simple) messages to get a more complex message, where each component in the bundle has a specific responsibility. This simplifies both composition and decomposition.

(Currently the Rekor entry breaks this a bit as it includes the canonicalized_body, but we are aware of that and will try to fix it.)

@woodruffw
Member

I'm not sure what our final approval process looks like here, but it sounds like we have consensus among the current Sigstore client maintainers w/r/t using the Sigstore bundle format (encoded as JSON using the Protobuf-defined mapping) as the "standard" bundle format.

@znewman01 @haydentherapper what would it look like to formalize this? A vote in the clients WG/a section in the client spec somewhere?

@znewman01
Contributor

Yeah, sounds great on both fronts. I'll stick this on the agenda for the next clients meeting (Aug 1). As soon as that's official we'll stick it in the clients spec. But I don't anticipate any contention so clients should feel free to act as though this is already policy.

@TomHennen

TomHennen commented Jul 21, 2023

> I'm not sure what our final approval process looks like here, but it sounds like we have consensus among the current Sigstore client maintainers w/r/t using the Sigstore bundle format (encoded as JSON using the Protobuf-defined mapping) as the "standard" bundle format.

If we go this route can we specify the output JSON be JSON Lines compatible?

[edit] I think all we'd need to do is say that the output should not include newlines?
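
As a rough illustration of the "no newlines" idea, here is a sketch using only Python's standard `json` module. The helper names (`to_json_line`, `dump_jsonl`, `load_jsonl`) are invented for this example, and the bundle dicts stand in for whatever a client would actually serialize.

```python
import json
from typing import Any, Dict, Iterable, List


def to_json_line(bundle: Dict[str, Any]) -> str:
    """Serialize one bundle as a single line of JSON.

    Without `indent`, json.dumps emits no raw newlines; a newline inside a
    string value is escaped as the two characters backslash + n.
    """
    return json.dumps(bundle, separators=(",", ":"))


def dump_jsonl(bundles: Iterable[Dict[str, Any]]) -> str:
    """Join bundles into JSON Lines text: one bundle per newline-terminated line."""
    return "".join(to_json_line(bundle) + "\n" for bundle in bundles)


def load_jsonl(text: str) -> List[Dict[str, Any]]:
    """Parse JSON Lines text back into a list of bundle dicts."""
    return [json.loads(line) for line in text.split("\n") if line]
```

Pretty-printed output (for example `json.dumps(bundle, indent=2)`) would still be valid JSON but would no longer fit on one line, which is the case this suggestion rules out.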

@adityasaky
Member

adityasaky commented Jul 21, 2023

> Yes, a sigstore bundle can be turned into a DSSE envelope with the proposed extension and vice versa.

One more thing to note: in my original message I was considering situations where a DSSE envelope has a single signature (which is likely to be the case with in-toto metadata). But in the TUF case I mentioned above, metadata files can and do have more than a single signature in a single envelope. So these would have to be split into multiple sigstore bundles with the envelope duplicated for each signature. The conversion is still possible but IMO it perhaps makes a lot of sense in the long run to build support for DSSE signature extensions (where each signature is associated with its own verification materials rather than the verification materials being associated with an entire envelope) into clients.
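
To make the cost of that conversion concrete, here is a sketch of splitting a multi-signature envelope into one bundle per signature. It assumes the DSSE JSON field names (`payload`, `payloadType`, `signatures`, `keyid`) and the bundle's lowerCamelCase JSON keys; the `materials_by_keyid` lookup is a hypothetical way of associating verification material with each signature, not something defined by either spec.

```python
from typing import Any, Dict, List

# One of the two media types listed in the Bundle proto comment.
BUNDLE_MEDIA_TYPE = "application/vnd.dev.sigstore.bundle+json;version=0.2"


def split_envelope_into_bundles(
    envelope: Dict[str, Any],
    materials_by_keyid: Dict[str, Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Produce one bundle-shaped dict per signature in a DSSE envelope.

    ASSUMPTION: verification material is looked up by the signature's keyid.
    The payload and payloadType are duplicated into every resulting bundle.
    """
    bundles = []
    for signature in envelope["signatures"]:
        bundles.append(
            {
                "mediaType": BUNDLE_MEDIA_TYPE,
                "verificationMaterial": materials_by_keyid[signature["keyid"]],
                "dsseEnvelope": {
                    "payload": envelope["payload"],
                    "payloadType": envelope["payloadType"],
                    "signatures": [signature],
                },
            }
        )
    return bundles
```

An envelope with N signatures yields N bundles that each repeat the full payload, which is the duplication that per-signature extensions would avoid.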

@trishankatdatadog

Yes, my vote is for Option 2: Supporting ecosystem specific extensions in secure-systems-lab/dsse#59, for which secure-systems-lab/dsse#61 looks like an implementation

@woodruffw
Member

Paraphrasing from chat:

From trying to do protobuf -> JSON Schema -> Rust, it turns out that protobuf's JSON mapping is flexible in all kinds of ways that we probably don't want: enums can be strings or integers, keys can be lowerCamelCase or snake_case, etc.

As part of maximizing client interoperability, we should probably identify a stable subset of the protobuf JSON mapping that we expect clients to both consume and generate.
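
One way to picture such a subset is a small lint pass over decoded bundle JSON. The sketch below assumes, purely for illustration, that the stable subset means lowerCamelCase keys and string-encoded enums; the thread does not spell out the final rules. `ENUM_FIELDS` is a hypothetical list of enum-typed field names, and `mapping_problems` is an invented helper.

```python
import re
from typing import Any, Iterator

# Matches keys like "log_index" that use snake_case rather than lowerCamelCase.
SNAKE_CASE_KEY = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)+$")

# HYPOTHETICAL: field names this sketch treats as enum-typed.
ENUM_FIELDS = frozenset({"algorithm"})


def mapping_problems(value: Any, path: str = "$") -> Iterator[str]:
    """Yield a description of every key or enum that falls outside the assumed subset."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if SNAKE_CASE_KEY.match(key):
                yield f"{child_path}: snake_case key"
            if key in ENUM_FIELDS and not isinstance(child, str):
                yield f"{child_path}: enum not encoded as a string"
            yield from mapping_problems(child, child_path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from mapping_problems(child, f"{path}[{index}]")
```

A producer could run this before emitting a bundle, and a strict consumer could reject input for which it yields anything; a real implementation would derive the enum fields from the protobuf descriptors rather than a hard-coded set.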

@znewman01
Contributor

I have added a section "Serialization and wire format" to the client spec; it covers a couple of these ambiguities. Happy to add more.

@adityasaky
Member

Opened a separate issue to discuss the DSSE option in greater detail: #9

@woodruffw
Member

Following up: I believe this can now be closed: The Sigstore standard documents the Sigstore bundle as the standard inter-client format, as well as constraints on that format (JSON serialized, key conventions, etc.).

Anything else to do here?

@kommendorkapten
Member

Agree with @woodruffw 👍


9 participants