sparse-v2: design doc proposition for Sparse Patterns refactoring
Kicks off work on issue #1896
1 parent
c2da984
commit 8598f31
Showing 2 changed files with 221 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,220 @@ | ||
# Sparse Patterns v2 redesign | ||
|
||
Authors: [Daniel Ploch](mailto:[email protected]) | ||
|
||
**Summary:** This document describes a redesign of the sparse patterns list | ||
stored in the JJ working copy, in order to facilitate several desirable | ||
improvements. It covers both the migration path and the planned end state. | ||
|
||
## Objective | ||
|
||
Redesign Sparse Patterns to accommodate more advanced features for native | ||
and custom implementations. This includes three main goals: | ||
|
||
1. Sparse Patterns should be versioned with the working copy | ||
1. Sparse Patterns should support more flexible matching rules | ||
1. Sparse Patterns should support client path remapping | ||
|
||
## Current State (as of jj 0.13.0) | ||
|
||
Sparse patterns are an effectively unordered list of prefix strings: | ||
|
||
```txt | ||
path/one | ||
path/to/dir/two | ||
``` | ||
|
||
The _set_ of files identified by the Sparse Patterns is all paths which | ||
match any provided prefix. This governs what gets materialized in the | ||
working copy on checkout, and what the user can modify. The set is stored | ||
in working copy state files, which are not versioned in the Op Store. | ||
|
||
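As a rough illustration of the current behavior, the sketch below treats each pattern as a prefix and unions the results. It assumes matching is done per path component and uses plain `&str` paths instead of jj's own path types; `is_under` and `matches_sparse_v1` are hypothetical names for this example, not jj functions.

```rust
/// Returns true if `path` equals `prefix` or lies underneath it.
/// Matching is per path component, so "path/one" does not match "path/onetwo".
fn is_under(path: &str, prefix: &str) -> bool {
    if prefix.is_empty() {
        return true; // The empty prefix stands for the repo root.
    }
    match path.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

/// Hypothetical v1 check: a path is in the sparse set if any prefix matches.
/// The order of `patterns` is irrelevant because the matches are only unioned.
fn matches_sparse_v1(patterns: &[&str], path: &str) -> bool {
    patterns.iter().any(|prefix| is_under(path, prefix))
}
```

With the two patterns above, `path/one/file.txt` would be materialized while `path/to/dir/file.txt` would not.
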
## Proposed State (Sparse Patterns v2) | ||
|
||
Sparse Patterns v2 will be stored as objects in the Op Store, referenced | ||
by a `SparsePatternsId` from the active `View`. They will have a new, | ||
ordered structure which can fully represent previous patterns. | ||
|
||
```rust | ||
pub enum SparsePatternsPathType { | ||
    DirInclusion, // Includes <path>/... | ||
    DirExclusion, // Excludes <path>/... | ||
    Files,        // Includes <path>/* | ||
    File,         // Includes <path> exactly | ||
} | ||
|
||
pub struct SparsePatternsPath { | ||
    // `type` is a reserved word in Rust, so the field is named `path_type`. | ||
    path_type: SparsePatternsPathType, | ||
    path: RepoPathBuf, | ||
} | ||
|
||
pub struct WorkingCopyMap { | ||
    src_path: RepoPathBuf, | ||
    dst_path: RepoPathBuf, | ||
} | ||
|
||
pub struct SparsePatterns { | ||
    paths: Vec<SparsePatternsPath>, | ||
    mappings: Vec<WorkingCopyMap>, | ||
} | ||
|
||
pub trait OpStore { | ||
    ... | ||
    fn read_sparse_patterns(&self, id: &SparsePatternsId) -> OpStoreResult<SparsePatterns>; | ||
    fn write_sparse_patterns(&self, sparse_patterns: &SparsePatterns) -> OpStoreResult<SparsePatternsId>; | ||
} | ||
``` | ||
|
||
To support these more complex behaviors, a new `SparsePatterns` type will be | ||
introduced, mainly to facilitate migration to the new data type at all the | ||
places where sparse patterns are currently used. | ||
|
||
```rust | ||
impl SparsePatterns { | ||
    pub fn to_matcher(&self) -> Box<dyn Matcher> { | ||
        ... | ||
    } | ||
|
||
    ... | ||
} | ||
``` | ||
|
||
## Command Syntax | ||
|
||
Path rules can be specified on the CLI and in an editor via a `<type>:<path>` | ||
syntax. The editor will only support the new explicit syntax, whereas the CLI | ||
arguments will continue to support bare `<path>` arguments in a | ||
backwards-compatible way at least for a migration period. | ||
|
||
```rust | ||
pub fn type_string(path_type: SparsePatternsPathType) -> &'static str { | ||
    match path_type { | ||
        SparsePatternsPathType::DirInclusion => "include", | ||
        SparsePatternsPathType::DirExclusion => "exclude", | ||
        SparsePatternsPathType::Files => "files", | ||
        SparsePatternsPathType::File => "file", | ||
    } | ||
} | ||
``` | ||
|
||
- `jj sparse set --add foo/bar` is equivalent to `jj sparse set --add include:foo/bar` | ||
- `jj sparse set --add exclude:foo/bar` adds a new `DirExclusion` type rule | ||
- `jj sparse set --exclude foo/bar` as a possible shorthand for the above | ||
- `jj sparse list` will print the explicit rules | ||
|
||
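The `<type>:<path>` syntax could be parsed roughly as follows. This is only a sketch: `PathType` and `parse_rule` are hypothetical names, paths are plain strings, and the choice to reject unknown prefixes (rather than treat the whole argument as a bare path containing a colon) is an assumption that the design does not settle.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PathType {
    DirInclusion,
    DirExclusion,
    Files,
    File,
}

/// Parses a CLI argument of the form `<type>:<path>`.
/// A bare `<path>` falls back to `include`, mirroring the backwards-compatible
/// CLI behavior. Unknown type prefixes are reported as errors (an assumption).
fn parse_rule(arg: &str) -> Result<(PathType, String), String> {
    match arg.split_once(':') {
        None => Ok((PathType::DirInclusion, arg.to_string())),
        Some((prefix, path)) => {
            let path_type = match prefix {
                "include" => PathType::DirInclusion,
                "exclude" => PathType::DirExclusion,
                "files" => PathType::Files,
                "file" => PathType::File,
                other => return Err(format!("unknown rule type: {other}")),
            };
            Ok((path_type, path.to_string()))
        }
    }
}
```

An editor-only parser would differ in one place: the `None` arm would return an error instead of defaulting to `include`.
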
Paths will be stored in an ordered, canonical form which unambiguously describes | ||
the set of files to be included. Every `--add` command will append to the end of | ||
this list before the patterns are canonicalized. Whether a file is included is | ||
determined by the first matching rule in reverse order. | ||
|
||
For example: | ||
|
||
```txt | ||
include:foo | ||
exclude:foo/bar | ||
include:foo/bar/baz | ||
exclude:foo/bar/baz/qux | ||
``` | ||
|
||
This produces a rule set which includes "foo/file.txt", excludes "foo/bar/file.txt", | ||
includes "foo/bar/baz/file.txt", and excludes "foo/bar/baz/qux/file.txt". If the | ||
rules are subtly re-ordered, they become canonicalized to a smaller, but | ||
functionally equivalent form: | ||
|
||
```txt | ||
# Before | ||
include:foo | ||
exclude:foo/bar/baz/qux | ||
include:foo/bar/baz | ||
exclude:foo/bar | ||
# Canonicalized | ||
include:foo | ||
exclude:foo/bar | ||
``` | ||
|
||
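The ordering semantics can be sketched as below, restricted to directory include/exclude rules. `Rule`, `is_included`, and `drop_shadowed` are hypothetical names. `drop_shadowed` implements only one canonicalization step (removing rules that a later rule fully covers) and is not meant as the complete canonical form.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Rule {
    Include(String), // include:<dir>
    Exclude(String), // exclude:<dir>
}

impl Rule {
    fn dir(&self) -> &str {
        match self {
            Rule::Include(d) | Rule::Exclude(d) => d,
        }
    }
}

/// True if `path` equals `dir` or lies underneath it (per path component).
fn is_under(path: &str, dir: &str) -> bool {
    match path.strip_prefix(dir) {
        Some(rest) => dir.is_empty() || rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

/// The last rule in the list that matches `path` decides.
/// A path matched by no rule is treated as excluded.
fn is_included(rules: &[Rule], path: &str) -> bool {
    rules
        .iter()
        .rev()
        .find(|rule| is_under(path, rule.dir()))
        .map_or(false, |rule| matches!(rule, Rule::Include(_)))
}

/// Drops every rule whose directory lies under the directory of a later rule.
/// Such a rule can never be the deciding match, so removing it is safe.
fn drop_shadowed(rules: &[Rule]) -> Vec<Rule> {
    rules
        .iter()
        .enumerate()
        .filter(|&(i, rule)| {
            !rules[i + 1..]
                .iter()
                .any(|later| is_under(rule.dir(), later.dir()))
        })
        .map(|(_, rule)| rule.clone())
        .collect()
}
```

Applied to the "Before" list above, `drop_shadowed` removes the `foo/bar/baz/qux` and `foo/bar/baz` rules because the trailing `exclude:foo/bar` covers both.
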
## Working Copy Map | ||
|
||
WARNING: This section is intentionally incomplete; more research is needed. | ||
|
||
All Sparse Patterns v2 will come equipped with a default no-op mapping. | ||
|
||
```rust | ||
vec![WorkingCopyMap { | ||
    src_path: RepoPathBuf::root(), | ||
    dst_path: RepoPathBuf::root(), | ||
}] | ||
``` | ||
|
||
`SparsePatterns` will provide an interface to map client paths into repo paths, | ||
and vice versa. The `WorkingCopy` trait will apply this mapping to all snapshot | ||
and checkout operations, and jj commands which accept relative paths will | ||
need to be updated to perform client path -> repo path translations as needed. | ||
It's not clear at this time _which_ commands will need changing, as some are | ||
more likely to refer to repo paths rather than client paths. | ||
|
||
TODO: Expand this section. | ||
|
||
In particular, the path rules for sparse patterns will _always_ be repo paths, | ||
not client paths. Thus, if the client wants to track "foo" and rename it to | ||
"subdir/bar", they must run `jj sparse set --add foo` and | ||
`jj client-map set --from foo --to subdir/bar`. In other words, the mapping operation | ||
can be thought of as always being applied _after_ the sparse operation. | ||
|
||
New commands will enable editing of the WorkingCopyMap: | ||
|
||
- `jj client-map list` will print all mapping pairs. | ||
- `jj client-map set --from foo --to bar` will add a new mapping to the end of the list. | ||
- `jj client-map remove --from foo` will remove a specific mapping rule. | ||
- `jj client-map edit` will pull up a text editor for manual editing. | ||
|
||
Like sparse paths, mapping rules apply in _order_, and any save operation | ||
rewrites them into a minimal canonical form. Thus, | ||
`jj client-map set --from "" --to ""` will always completely wipe the map, | ||
because a root-to-root rule shadows every rule before it. | ||
The first matching rule in reverse list order determines how a particular | ||
folder/file should be named in the client. | ||
|
||
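Since this section is still exploratory, the following is only a sketch of how a repo path might be translated into a client path under the "last matching rule wins" ordering. `Mapping`, `strip_dir`, `join`, and `repo_to_client` are hypothetical names, paths are plain strings, and no invertibility check is attempted.

```rust
struct Mapping {
    src: String, // repo path prefix ("" is the repo root)
    dst: String, // client path prefix ("" is the client root)
}

/// If `path` equals `dir` or lies under it, returns the remainder relative to `dir`.
fn strip_dir<'a>(path: &'a str, dir: &str) -> Option<&'a str> {
    if dir.is_empty() {
        return Some(path);
    }
    let rest = path.strip_prefix(dir)?;
    if rest.is_empty() {
        Some(rest)
    } else {
        rest.strip_prefix('/')
    }
}

/// Joins two relative path fragments, either of which may be empty.
fn join(dir: &str, rest: &str) -> String {
    match (dir.is_empty(), rest.is_empty()) {
        (true, _) => rest.to_string(),
        (_, true) => dir.to_string(),
        _ => format!("{dir}/{rest}"),
    }
}

/// Translates a repo path into a client path using the last matching rule.
/// Returns `None` only if no rule matches at all.
fn repo_to_client(map: &[Mapping], repo_path: &str) -> Option<String> {
    map.iter()
        .rev()
        .find_map(|m| strip_dir(repo_path, &m.src).map(|rest| join(&m.dst, rest)))
}
```

With the default root-to-root rule followed by a rule from `foo` to `subdir/bar`, `foo/a.txt` would map to `subdir/bar/a.txt`, while paths outside `foo` fall through to the root rule and keep their names.
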
For simplicity, we require that all client-maps are _invertible_, that is: | ||
|
||
- For any repo path R, there is at most one associated client path C | ||
  - Having no associated client path is fine (that part of the repo will never be materialized) | ||
- For any client path C, there is at most one associated repo path R | ||
  - Having no associated repo path is fine (all files in such paths will be ignored) | ||
|
||
## Versioning and Storage | ||
|
||
Updating the active Sparse Patterns for a particular working copy will now | ||
take place in two separate steps, one which updates the Op Store, and | ||
another which updates the actual working copy. | ||
|
||
This gives the user the ability to update the active sparse patterns whilst | ||
not interacting with the local working copy, which is useful for custom | ||
integrations which may not be _able_ to check out particular sparse patterns due | ||
to problems with the backend (encoding, permission errors, etc.). A bad | ||
`jj sparse set --add oops` command can thus be undone, even via `jj op undo` | ||
if desired. | ||
|
||
### View Updates | ||
|
||
The View object will be migrated to store sparse patterns via id. The | ||
indirection will save on storage since sparse patterns are not expected to | ||
change very frequently. | ||
|
||
```rust | ||
// Before: | ||
pub wc_commit_ids: HashMap<WorkspaceId, CommitId>, | ||
|
||
// After: | ||
pub struct WorkingCopyInfo { | ||
    pub commit_id: CommitId, | ||
    pub sparse_patterns_id: SparsePatternsId, | ||
} | ||
... | ||
pub wc_info: HashMap<WorkspaceId, WorkingCopyInfo>, | ||
``` | ||
|
||
A repo with no sparse patterns in View storage will be actively migrated upon | ||
running the first jj command with the new release, filling the currently active | ||
`SparsePatternsId` into all historical Views. This migration will be maintained | ||
for ~4 releases (roughly 4 months) for backwards compatibility. |
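The per-View part of that migration might look like the sketch below. The id types are stand-in `String` aliases, `migrate_wc_info` is a hypothetical name, and the `active_id_for` callback is an assumption about how the currently active patterns for a workspace would be looked up; none of this reflects an existing jj API.

```rust
use std::collections::HashMap;

// Stand-in id types for illustration only.
type WorkspaceId = String;
type CommitId = String;
type SparsePatternsId = String;

#[derive(Debug, Clone, PartialEq, Eq)]
struct WorkingCopyInfo {
    commit_id: CommitId,
    sparse_patterns_id: SparsePatternsId,
}

/// Converts the legacy `wc_commit_ids` map into the new `wc_info` map,
/// attaching the currently active sparse patterns id to every workspace.
fn migrate_wc_info(
    wc_commit_ids: &HashMap<WorkspaceId, CommitId>,
    active_id_for: impl Fn(&WorkspaceId) -> SparsePatternsId,
) -> HashMap<WorkspaceId, WorkingCopyInfo> {
    wc_commit_ids
        .iter()
        .map(|(workspace_id, commit_id)| {
            let info = WorkingCopyInfo {
                commit_id: commit_id.clone(),
                sparse_patterns_id: active_id_for(workspace_id),
            };
            (workspace_id.clone(), info)
        })
        .collect()
}
```

Because every historical View of a workspace receives the same id, the patterns object itself only needs to be written to the Op Store once per workspace.
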