Add in-toto format with hashes of shards as subjects #267

Merged


mihaimaruseac
Collaborator

@mihaimaruseac mihaimaruseac commented Jul 27, 2024

Summary

Note: This is an experimental serialization format, one of four in a series of PRs (#264, #265, #266, #267). Before a stable release of the library, we would standardize on an ergonomic format, with as few corner cases / dangerous corners as possible.

This converts model serialization manifests that record every model file shard hash into an in-toto payload that can then be passed to Sigstore's sign_intoto for signing to generate a Sigstore Bundle (if using Sigstore).
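
As a rough illustration of the conversion described above (not the code in this PR), the sketch below hashes fixed-size shards of each file and places one subject per shard into a dictionary shaped like an in-toto Statement v1. The shard size, the `path:start:end` subject naming, the predicate type URL, and the placeholder predicate contents are all assumptions made for the example; only the top-level Statement field names come from the in-toto specification.

```python
import hashlib

# Assumption: a tiny shard size so the example is easy to follow.
SHARD_SIZE = 4


def shard_subjects(files: dict[str, bytes], shard_size: int = SHARD_SIZE) -> list[dict]:
    """Return one in-toto subject per shard, in a deterministic order."""
    subjects = []
    for path in sorted(files):
        data = files[path]
        for start in range(0, len(data), shard_size):
            end = min(start + shard_size, len(data))
            digest = hashlib.sha256(data[start:end]).hexdigest()
            # Assumption: shards are named "path:start:end" for illustration.
            subjects.append({"name": f"{path}:{start}:{end}", "digest": {"sha256": digest}})
    return subjects


def build_statement(files: dict[str, bytes]) -> dict:
    """Build a dict shaped like an in-toto Statement v1 with shard digests as subjects."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": shard_subjects(files),
        # Hypothetical predicate type, not one registered by this project.
        "predicateType": "https://example.com/hypothetical-model-signing/v0",
        # Placeholder key-value pair that verifiers are expected to ignore.
        "predicate": {"unused": "unused"},
    }


statement = build_statement({"a.bin": b"abcdefgh", "b.bin": b"xy"})
```

Serializing `statement` to JSON would give the payload that is handed to the signer; how that hand-off happens depends on the signing backend and is not shown here.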

This time, we record every hash as part of the subject instead of in the payload. We require verifiers to be aware of this, and we acknowledge that a verifier that only checks subject by subject (that is, it checks whether the hash of a passed-in argument is in the list of subjects, without checking that all the hashes are present) can fail to detect that model integrity has been compromised by renaming a file in the model, interchanging two file names, deleting a file, or reordering two shards. The signing library will have additional checks for this, but verifying the signature with other tools might produce incorrect results.
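
To make the risk concrete, here is a minimal sketch contrasting the two verification styles. It is not the library's verifier: the function names, the `path:start:end` subject names, and the choice to compare the full ordered list of (name, digest) pairs are assumptions for illustration only.

```python
import hashlib


def subject(name: str, shard: bytes) -> dict:
    """Build an in-toto style subject entry for one shard."""
    return {"name": name, "digest": {"sha256": hashlib.sha256(shard).hexdigest()}}


def naive_verify(signed_subjects: list[dict], shard: bytes) -> bool:
    """Subject-by-subject check: is this shard's digest listed anywhere?"""
    digests = {s["digest"]["sha256"] for s in signed_subjects}
    return hashlib.sha256(shard).hexdigest() in digests


def strict_verify(signed_subjects: list[dict], observed_subjects: list[dict]) -> bool:
    """Whole-model check: names, digests, count, and order must all match."""
    def pairs(subjects: list[dict]) -> list[tuple[str, str]]:
        return [(s["name"], s["digest"]["sha256"]) for s in subjects]
    return pairs(signed_subjects) == pairs(observed_subjects)


# Subjects as they were signed (hypothetical shard names).
signed = [subject("weights.bin:0:4", b"aaaa"), subject("weights.bin:4:8", b"bbbb")]

# Tampered model: the two shards were swapped on disk.
reordered = [subject("weights.bin:0:4", b"bbbb"), subject("weights.bin:4:8", b"aaaa")]

# Tampered model: the second shard was deleted.
truncated = signed[:1]

# Each swapped shard still passes the naive check on its own.
naive_ok = all(naive_verify(signed, shard) for shard in (b"bbbb", b"aaaa"))
# The strict check rejects both tampered layouts.
strict_ok = strict_verify(signed, reordered) or strict_verify(signed, truncated)
```

Here `naive_ok` ends up true even though the shards were swapped, while `strict_ok` is false, which is the gap the paragraph above warns about.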

CC @susperius for converting manifest to in-toto. This should cover #111, #224, and #248 (first part of the machinery). CC @laurentsimon and (optionally) @TomHennen to make sure I did not mishandle in-toto.

Note: I still had to pass some payload to the in-toto predicate due to in-toto/attestation#374. Right now, it is a key-value pair that should be ignored. Another possibility is to register a single predicate type for all model signing in-toto formats and then record a subtype inside the predicate, which we can control as much as we need.

Note: This is the equivalent of #264, but with digests added as subjects.

Note: This is the equivalent of #266, but for file shard hashes instead of file hashes.

Note: This builds on #266. I decided to split every feature into its own PR to make it easier to review what changes (should be only the last commit) and to be able to merge partial work and continue from there.

Finally, merging this will:

Release Note

NONE

Documentation

NONE

@mihaimaruseac mihaimaruseac requested review from a team as code owners July 27, 2024 16:44
@mihaimaruseac mihaimaruseac force-pushed the in-toto-shard-digests-as-subjects branch from 6543a2d to 994fb41 Compare July 27, 2024 17:57
@mihaimaruseac mihaimaruseac added this to the V1 release milestone Jul 27, 2024
@mihaimaruseac mihaimaruseac force-pushed the in-toto-shard-digests-as-subjects branch from 994fb41 to 5385320 Compare July 29, 2024 23:28
@mihaimaruseac mihaimaruseac force-pushed the in-toto-shard-digests-as-subjects branch 5 times, most recently from cb05df9 to 5bdd5ad Compare August 1, 2024 17:49
laurentsimon
laurentsimon previously approved these changes Aug 1, 2024
This converts model serialization manifests that record every model file
shard hash into an in-toto payload that can then be passed to Sigstore's
`sign_intoto` for signing to generate a Sigstore `Bundle` (if using
Sigstore).

This time, we record every hash as part of the subject instead of in the
payload. We require verifiers to be aware of this and acknowledge that
verifiers that only check subject by subject (that is, they check if the
hash of a passed in argument is in the list of subjects and don't check
if all the hashes are present), can fail to fully detect if the model
integrity is compromised by renaming one file in the model,
interchanging two file names, deleting a file, or reordering two shards.
The signing library will have additional checks for this, but verifying
the signature with other tools might result in invalid results.

Signed-off-by: Mihai Maruseac <[email protected]>
@mihaimaruseac mihaimaruseac dismissed laurentsimon’s stale review August 2, 2024 01:03

The merge-base changed after approval.

@mihaimaruseac mihaimaruseac force-pushed the in-toto-shard-digests-as-subjects branch from 5bdd5ad to 0464e61 Compare August 2, 2024 01:03
Contributor

@spencerschrock spencerschrock left a comment

Deferring to Laurent's review. I did confirm that Mihai's latest force-push was a rebase on top of main.

Note: In the PR description I'm assuming

Note: This is the equivalent of #265, but for file shard hashes instead of file hashes.

Should be

Note: This is the equivalent of #266, but for file shard hashes instead of file hashes.

@mihaimaruseac
Collaborator Author

Thank you for the review!

Note: In the PR description I'm assuming

Note: This is the equivalent of #265, but for file shard hashes instead of file hashes.

Should be

Note: This is the equivalent of #266, but for file shard hashes instead of file hashes.

It should have been 2 notes, one describing the diff from #264 (subjects instead of payload) and one from #266. Fixed now.

@mihaimaruseac mihaimaruseac merged commit b4a4f6a into sigstore:main Aug 2, 2024
20 checks passed
@mihaimaruseac mihaimaruseac deleted the in-toto-shard-digests-as-subjects branch August 2, 2024 16:33
Development

Successfully merging this pull request may close these issues.

Support for converting manifests to in-toto statements Manifest file format Manifest file
3 participants