-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
working_copy: Only serialise clock if only clock changed #3928
Conversation
When the working copy is identical to the last snapshot - which is quite often - only the `watchman_clock` changes. When this happens, the _entirety_ of the working copy proto is reserialised, including the O(n) `file_states` field, which is slow. As non-repeated proto fields always take the last field in the serialised message ([1]), we can skip the serialisation of the whole working copy by only serialising the fields that have changed and appending it to the last serialised proto. This speeds up most `jj` commands by ~50ms on a large repository: ``` $ hyperfine --sort command --warmup 3 --runs 20 -L bin jj-before,jj-after \ "target/release/{bin} -R ~/chromiumjj/src show -s @" Benchmark 1: target/release/jj-before -R ~/chromiumjj/src show -s @ Time (mean ± σ): 401.5 ms ± 4.5 ms [User: 227.2 ms, System: 172.9 ms] Range (min … max): 394.3 ms … 414.2 ms 20 runs Benchmark 2: target/release/jj-after -R ~/chromiumjj/src show -s @ Time (mean ± σ): 347.3 ms ± 4.6 ms [User: 179.3 ms, System: 166.4 ms] Range (min … max): 340.7 ms … 358.0 ms 20 runs Relative speed comparison 1.16 ± 0.02 target/release/jj-before -R ~/chromiumjj/src show -s @ 1.00 target/release/jj-after -R ~/chromiumjj/src show -s @ ``` [1]: https://protobuf.dev/programming-guides/encoding/#last-one-wins
// https://protobuf.dev/programming-guides/encoding/#last-one-wins | ||
// Reuse the existing serialisation of the proto in the current state to avoid | ||
// serialising non-changed fields. | ||
let bytes_written = std::fs::copy(&target_path, temp_file.path()).map_err(|err| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably better to use std::io::copy()
to read/write contents over open fds. It might be a bit slower, but can avoid possible problems (like mandatory file locking on Windows.)
@@ -671,6 +684,56 @@ impl TreeState { | |||
Ok(()) | |||
} | |||
|
|||
fn save_only_clock(&mut self) -> Result<(), TreeStateError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we somehow add test exercising this code path?
// https://protobuf.dev/programming-guides/encoding/#last-one-wins | ||
// Reuse the existing serialisation of the proto in the current state to avoid | ||
// serialising non-changed fields. | ||
let bytes_written = std::fs::copy(&target_path, temp_file.path()).map_err(|err| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Just rubber ducking)
- this never copies inconsistent tree state because the in-memory version is loaded after lock was acquired.
- even if it could, it's safe to rewind the watchman clock.
BTW, it might be simpler to extract watchman_clock
to separate proto file if the second bullet point holds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
even if it could, it's safe to rewind the watchman clock.
Yes, that is correct. From @arxanas:
We should still save the watchman clock due to this "incremental processing work" - for example, consider a large build directory that is jj-ignored but not .watchmanconfig ignored.
it might be simpler to extract
watchman_clock
to separate proto file if the second bullet point holds.
This matches @arxanas' idea of splitting the states:
I think this is quite reasonable, but I was a bit concerned about the atomicity of having two working copy protos. However, this might not be a problem?
What do you think about splitting the working state in general? I think that going forward, this might be easier to maintain than special-casing the clock in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think atomic update in general is required here (because we'll take a lock and reload on mutation, and actual working-copy files can't be updated atomically anyway), but I'm not pretty sure. @martinvonz, any concerns?
From @arxanas' comment, I think it's safe to split watchman_clock
to a separate file. The combination of (new_tree_state, old_watchman_clock)
won't cause consistency problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - as long as we can't end up writing (old_tree_state, new_watchman_clock)
, there shouldn't be any problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was slightly worried about the following situation:
- Time 0:
foo.txt
has contents1
. (old_watchman_clock
) - Time 1:
foo.txt
has contents2
. (new_tree_state
) - Time 2:
foo.txt
has contents1
.
At time 2, querying watchman with the time 0 clock successfully shows foo.txt
being updated, even though its contents technically have not been updated.
I'll abandon this PR and will work on splitting out the proto. Just a few questions before I get started:
- Splitting - what part of
TreeState
should we "split out"? I think there's two reasonable options:- Move only
watchman_clock
into a new message, as discussed. I think this is most reasonable as an incremental update toTreeState
. - Move only
file_states
into a new message. This notably allows us to updatetree_ids
andsparse_patterns
without writingfile_states
, but I think this is probably not needed.
- Move only
- Naming - what should we name the new message / the on-disk file? Thankfully, protobuf makes it easy to rename messages, so we don't need to think about this too much.
- If moving
watchman_clock
: Should we keep it tied with the idea of a file system watcher, or keep it more general? I thinkTreeStateSecondary
is reasonable - it emphasises the fact that a "primary" update can be done without updating the "secondary". - If moving
file_states
: I think it's reasonable for it to be calledFileStates
.
- If moving
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At time 2, querying watchman with the time 0 clock successfully shows foo.txt being updated, even though its contents technically have not been updated.
That should just be a performance problem, not a correctness problem. It would mean that we read the file contents an extra time before we determine that it hasn't changed.
- Move only
file_states
into a new message. This notably allows us to updatetree_ids
andsparse_patterns
without writingfile_states
, but I think this is probably not needed.
More importantly, those needs to be written atomically.
Naming - what should we name the new message / the on-disk file?
I would go with something more specific since we don't yet know what else might go into that file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was dealing with the same problem at work recently, specifically concerned with the case that the user creates a new file at time 0, we process it at time 1 and then abort, and then the user deletes the file at time 2, so that the working copy state compared to the last saved clock looks identical to us, but we created extra state somewhere that now needs to be cleaned up. But Watchman will still report the file change so that we can notice and delete the extra state.
let proto = crate::protos::working_copy::TreeState { | ||
watchman_clock: self.watchman_clock.clone(), | ||
..Default::default() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I assume this never serializes fields other than watchman_clock
because Default
values can be omitted in protobuf, and the serialization should be skipped by prost.
Can you add a short comment about that?
@@ -1768,7 +1840,7 @@ impl LockedWorkingCopy for LockedLocalWorkingCopy { | |||
message: "Failed to read the working copy state".to_string(), | |||
err: err.into(), | |||
})?; | |||
self.tree_state_dirty |= tree_state.snapshot(options)?; | |||
self.tree_state_dirty = self.tree_state_dirty.max(tree_state.snapshot(options)?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you add inline comment to TreeStateChanged
type saying the order matters.
Alternatively, maybe we can use bitflags
.
Closing this PR as discussed in #3928 (comment) - we will move |
When the working copy is identical to the last snapshot - which is quite often - only the
watchman_clock
changes. When this happens, the entirety of the working copy proto is reserialised, including the O(n)file_states
field, which is slow.As non-repeated proto fields always take the last field in the serialised message (1), we can skip the serialisation of the whole working copy by only serialising the fields that have changed and appending it to the last serialised proto.
This speeds up most
jj
commands by ~50ms on a large repository:Checklist
If applicable:
CHANGELOG.md