Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the ContentHash implementations for std::Option and MergedTreeId and RemoteRefState #3061

Merged
merged 1 commit into from
Feb 16, 2024

Conversation

emesterhazy
Copy link
Contributor

The ContentHash documentation specifies that implementations for enums should hash the ordinal number of the variant contained in the enum as a 32-bit little-endian number and then hash the contents of the variant, if any.

The current implementations for std::Option and MergedTreeId are non-conformant since they hash the ordinal number as a u8 with platform specific endianness.

Fixes #3051

@emesterhazy
Copy link
Contributor Author

Of course I forgot to update all of the tests which are now failing since we changed the hash function... I'll mail a new commit once I've worked through the dozens of failures.

@yuja
Copy link
Contributor

yuja commented Feb 16, 2024

We have one more enum: RemoteRefState. Well, I've basically copied it from the other enum impls, so most enums use u8 ordinal (with the exception of MergedTreeId which uses ASCII b'0'.)

I've worked through the dozens of failures.

Just fyi, what I usually do is:

cargo insta test --test-runner nextest
cargo insta review # or accept

@emesterhazy
Copy link
Contributor Author

We have one more enum: RemoteRefState. Well, I've basically copied it from the other enum impls, so most enums use u8 ordinal (with the exception of MergedTreeId which uses ASCII b'0'.)

I didn't see this before. Thanks for pointing it out. I'm not sure how to read the implementation for this one but it seems like it's non-conformant (even ignoring the u8) since I don't think Rust defines what tag values are used for the enum variants.

https://github.com/martinvonz/jj/blob/903f18acfd2365d4488eeea3425ab18615e23aa1/lib/src/op_store.rs#L207-L219

I'll fix it in the morning and push a new commit.

@yuja
Copy link
Contributor

yuja commented Feb 16, 2024

I'm not sure how to read the implementation for this one but it seems like it's non-conformant (even ignoring the u8) since I don't think Rust defines what tag values are used for the enum variants.

It's repr(u8) enum, so the resulting value should be stable (across platforms). It doesn't conform to the documented spec, though.

…d RemoteRefState

The `ContentHash` documentation specifies that implementations for enums should
hash the ordinal number of the variant contained in the enum as a 32-bit
little-endian number and then hash the contents of the variant, if any.

The current implementations for `std::Option`, `MergedTreeId`, and
`RemoteRefState` are non-conformant since they hash the ordinal number as a u8
with platform specific endianness.


Fixes #3051
lib/src/op_store.rs Show resolved Hide resolved
@Ralith
Copy link
Contributor

Ralith commented Feb 16, 2024

u8 with platform specific endianness

For reference, note that u8 has no endianness, as it's represented by a single byte. Endianness describes the layout of the bytes of an integer's representation, not individual bits.

I don't think Rust defines what tag values are used for the enum variants.

Implicit discriminant values are specified in the reference.

@emesterhazy
Copy link
Contributor Author

For reference, note that u8 has no endianness

Right, of course... brain fart.

Implicit discriminant values are specified in the reference

Thanks, I was having trouble finding this for some reason.

@emesterhazy
Copy link
Contributor Author

emesterhazy commented Feb 16, 2024

All the tests are passing, but when I ran jj with the new binary on an existing repo, I got an error message:

➜  ~/o/g/m/jj ((500358b0)) jjx
Failed to load commit index file '3fdaff954f214d8d04fc21cd305961cc85cb4dcabb6c9d1c492f091b37c197afda06abf584900d24db8e24d98aeb7e516a6a
bba41a80fae16cb947e660bcc383': No such file or directory (os error 2). Reindexing...

After that it seemed to run fine. When I ran my 0.13.0 stable binary again on the same repo I got a different error:

➜  ~/o/g/m/jj ((500358b0)) jj
The index was corrupt (maybe the format has changed). Reindexing...

The difference in the message seems to be due to a change in the message itself in 1f6d1de6, which is very recent and not included in my stable binary.

Again it seemed fine after re-indexing. I took a look at the code and I think everything will be ok, but would someone please confirm that this should be benign before I submit this change? @yuja I think you changed this index code most recently. What do you think?

We should probably mention in the patch notes for the next release that users will see this error after upgrading. FYI @martinvonz

Edit: This is in the patch notes for the next release:

The on-disk index format changed.

Maybe we can change the migration code to just quietly rebuild the index with the new hash values so that users don't get confused by seeing an error?

@yuja
Copy link
Contributor

yuja commented Feb 16, 2024

Failed to load commit index file '3fdaff954f214d8d04fc21cd305961cc85cb4dcabb6c9d1c492f091b37c197afda06abf584900d24db8e24d98aeb7e516a6a
bba41a80fae16cb947e660bcc383': No such file or directory (os error 2). Reindexing...

It's not a problem of this PR. I'm doing some index format changes in this release cycle.

Maybe we can change the migration code to just quietly rebuild the index

That's doable, but I'm not sure which is better. Reindexing can take few tens of seconds in mid-size repos.

@emesterhazy
Copy link
Contributor Author

That checks out. I did some more testing without this commit and I see the same error.

Maybe we can just tweak the error message? The current message is:

Failed to load commit index file '3fdaff954f214d8d04fc21cd305961cc85cb4dcabb6c9d1c492f091b37c197afda06abf584900d24db8e24d98aeb7e516a6a
bba41a80fae16cb947e660bcc383': No such file or directory (os error 2). Reindexing...

Maybe we can just add (maybe the format has changed) to clarify that this might be ok.

Failed to load commit index file (maybe the format has changed) '3fdaff954f214d8d04fc21cd305961cc85cb4dcabb6c9d1c492f091b37c197afda06abf584900d24db8e24d98aeb7e516a6a
bba41a80fae16cb947e660bcc383': No such file or directory (os error 2). Reindexing...

It's an issue for a separate PR though.

@emesterhazy emesterhazy changed the title Fix the ContentHash implementations for std::Option and MergedTreeId Fix the ContentHash implementations for std::Option and MergedTreeId and RemoteRefState Feb 16, 2024
@emesterhazy emesterhazy merged commit e1fd402 into main Feb 16, 2024
15 checks passed
@emesterhazy emesterhazy deleted the push-rxvxkpzkqsqo branch February 16, 2024 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The ContentHash implementation for std::Option is inconsistent with the documentation
3 participants