-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate serialize_v0
to new API (as part of serialization layer)
#190
Conversation
ea63a4a
to
b89a6cc
Compare
3da46a9
to
a7d78ea
Compare
a7d78ea
to
6ad6254
Compare
Missed this in sigstore#188, but found out I need it when working on sigstore#190. The `serialize_v0`/`serialize_v1` methods all had headers in front of the files, so we need to do that too. Will update usage of header on sigstore#190 shortly. As a benefit, we can simulate hashing a file with a header for the first portion of the file and a sharded hasher for the remainder of the file. Signed-off-by: Mihai Maruseac <[email protected]>
ac1f654
to
4cae6b3
Compare
This is the middle layer of the API design work (sigstore#172). We add a manifest abstract class to represent various manifests (sigstore#111 sigstore#112) and also ways to serialize a model directory into manifests and ways to verify the manifests. For now, this only does what was formerly known as `serialize_v0`. The v1 and the manifest versions will come soon. Note: This has a lot of inspiration from sigstore#112, but makes the API work with all the usecases we need to consider right now. Signed-off-by: Mihai Maruseac <[email protected]>
4cae6b3
to
5b75d08
Compare
serialize_v0
to new API (as part of serialization layer)
Opening this one for review now while I'm working on v1 and manifest (v2) |
Signed-off-by: Mihai Maruseac <[email protected]>
|
||
assert manifest == new_manifest | ||
|
||
def test_file_model_hash_changes_if_content_changes( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may be a good idea to be more thorough how we update models: iterate over each character and each file name to make sure we have everything covered. Too easy to make mistakes :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. It will slow down testing and the only thing it can catch is if any hashing engine has collisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may uncover bugs for more complicated serializations, eg with sharding and different parameters. Here you're right it's simple enough that we're probably OK. Note that we could add just a couple test function and small models to minimize performance during testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do it for end-to-end testing (#5), added a comment there to refer to this
Signed-off-by: Mihai Maruseac <[email protected]>
Signed-off-by: Mihai Maruseac <[email protected]>
Currently blocked by DCO (dcoapp/app#211). Will make new PRs on top of this one and rebase when this gets merged. |
See dcoapp/app#211 (comment) Signed-off-by: Mihai Maruseac <[email protected]>
|
||
|
||
@dataclass | ||
class DigestManifest(Manifest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a model look like as a manifest?
As the digest manifest is defined now, it contains a single digest. Shouldn't it contain a list of digests to represent the model?
Where would we place the signature in this manifest?
Perhaps we could align a bit more with the sigstore bundle type and least provide a model manifest that contains the digests, the signature and verification material.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on that now, hope to have a PR later today/early tomorrow. This PR and the one after it only migrated the old serialize_v0
/serialize_v1
versions
Summary
Migrate
serialize_v0
to new API.This is the middle layer of the API design work (#172). We add a manifest abstract class to represent various manifests (#111 #112) and also ways to serialize a model directory into manifests and ways to verify the manifests.
For now, this only does what was formerly known as
serialize_v0
. The v1 and the manifest versions will come soon.Note: This has a lot of inspiration from #112, but makes the API work with all the usecases we need to consider right now.
Release Note
NONE
Documentation
NONE