sparse-v2: design doc proposition for Sparse Patterns refactoring
Kicks off work on issue #1896
torquestomp committed Jan 25, 2024
1 parent c2da984 commit 8598f31
Showing 2 changed files with 221 additions and 0 deletions.
220 changes: 220 additions & 0 deletions docs/design/sparse-v2.md
@@ -0,0 +1,220 @@
# Sparse Patterns v2 redesign

Authors: [Daniel Ploch](mailto:[email protected])

**Summary:** This document describes a redesign of the sparse patterns list
stored in the JJ working copy, in order to facilitate several desirable
improvements. It covers both the migration path and the planned end state.

## Objective

Redesign Sparse Patterns to accommodate more advanced features for native
and custom implementations. This includes three main goals:

1. Sparse Patterns should be versioned with the working copy
1. Sparse Patterns should support more flexible matching rules
1. Sparse Patterns should support client path remapping

## Current State (as of jj 0.13.0)

Sparse patterns are an effectively unordered list of prefix strings:

```txt
path/one
path/to/dir/two
```

The _set_ of files identified by the Sparse Patterns is all paths which
match any provided prefix. This governs what gets materialized in the
working copy on checkout, and what the user can modify. The set is stored
in working copy state files which are not versioned with the rest of the
Op Store.
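
As a rough illustration of the current behavior, the following sketch models prefix matching over plain strings. The function names are hypothetical, `&str` stands in for jj's repo path types, and the component-wise comparison (so that `path/one` does not match `path/one-more`) is an assumption of this sketch rather than something stated above.

```rust
/// Returns true if `path` equals `prefix` or lies underneath it.
/// The comparison is component-wise; an empty prefix matches every path.
fn matches_prefix(prefix: &str, path: &str) -> bool {
    let mut path_parts = path.split('/').filter(|c| !c.is_empty());
    prefix
        .split('/')
        .filter(|c| !c.is_empty())
        .all(|p| path_parts.next() == Some(p))
}

/// A path belongs to the sparse set if any prefix matches it.
/// The order of `prefixes` does not affect the result.
fn is_included(prefixes: &[&str], path: &str) -> bool {
    prefixes.iter().any(|prefix| matches_prefix(prefix, path))
}
```

Because the result is a plain "any prefix matches" test, there is no way to carve an exclusion out of an included directory, which is one of the limitations the v2 rules address.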

## Proposed State (Sparse Patterns v2)

Sparse Patterns v2 will be stored as objects in the Op Store, referenced
by a `SparsePatternsId` from the active `View`. They will have a new,
ordered structure which can fully represent previous patterns.

```rust
pub enum SparsePatternsPathType {
    DirInclusion, // Includes <path>/...
    DirExclusion, // Excludes <path>/...
    Files,        // Includes <path>/*
    File,         // Includes <path> exactly
}

pub struct SparsePatternsPath {
    // `type` is a reserved keyword in Rust, so the field is named `path_type`.
    path_type: SparsePatternsPathType,
    path: RepoPathBuf,
}

pub struct SparsePatternsMapping {
    src_path: RepoPathBuf,
    dst_path: RepoPathBuf,
}

pub struct SparsePatterns {
    paths: Vec<SparsePatternsPath>,
    mappings: Vec<SparsePatternsMapping>,
}

pub trait OpStore {
    // ...
    fn read_sparse_patterns(&self, id: &SparsePatternsId) -> OpStoreResult<SparsePatterns>;
    fn write_sparse_patterns(&self, sparse_patterns: &SparsePatterns) -> OpStoreResult<SparsePatternsId>;
}
```

To support these more complex behaviors, the new `SparsePatterns` type will
expose a matcher-based interface, mainly to facilitate migration to the new
data type at all the places where sparse patterns are currently used.

```rust
impl SparsePatterns {
    pub fn to_matcher(&self) -> Box<dyn Matcher> {
        // ...
    }

    // ...
}
```

## Command Syntax

Path rules can be specified on the CLI and in an editor via a `<type>:<path>`
syntax. The editor will only support the new explicit syntax, whereas the CLI
arguments will continue to support bare `<path>` arguments in a
backwards-compatible way at least for a migration period.

```rust
// `type` is a reserved keyword in Rust, so the parameter is named `path_type`.
pub fn type_string(path_type: SparsePatternsPathType) -> &'static str {
    match path_type {
        SparsePatternsPathType::DirInclusion => "include",
        SparsePatternsPathType::DirExclusion => "exclude",
        SparsePatternsPathType::Files => "files",
        SparsePatternsPathType::File => "file",
    }
}
```

- `jj sparse set --add foo/bar` is equivalent to `jj sparse set --add include:foo/bar`
- `jj sparse set --add exclude:foo/bar` adds a new `DirExclusion` type rule
- `jj sparse set --exclude foo/bar` is a possible shorthand for the above
- `jj sparse list` will print the explicit rules

Paths will be stored in an ordered, canonical form which unambiguously describes
the set of files to be included. Every `--add` command will append to the end of
this list before the patterns are canonicalized. Whether a file is included is
determined by the first matching rule in reverse order.

For example:

```txt
include:foo
exclude:foo/bar
include:foo/bar/baz
exclude:foo/bar/baz/qux
```

This produces a rule set which includes "foo/file.txt", excludes
"foo/bar/file.txt", includes "foo/bar/baz/file.txt", and excludes
"foo/bar/baz/qux/file.txt". If the rules are subtly re-ordered, they are
canonicalized to a smaller but functionally equivalent form:

```txt
# Before
include:foo
exclude:foo/bar/baz/qux
include:foo/bar/baz
exclude:foo/bar
# Canonicalized
include:foo
exclude:foo/bar
```
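
The "first matching rule in reverse order" evaluation can be sketched as below. This is a minimal model, not the proposed implementation: paths are `&'static str` rather than `RepoPathBuf`, the field is called `path_type`, the helper names are hypothetical, and treating a file matched by no rule as excluded is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SparsePatternsPathType {
    DirInclusion,
    DirExclusion,
    Files,
    File,
}

struct SparsePatternsPath {
    path_type: SparsePatternsPathType,
    path: &'static str,
}

/// Splits a slash-separated path into its non-empty components.
fn components(path: &str) -> Vec<&str> {
    path.split('/').filter(|c| !c.is_empty()).collect()
}

/// Returns true if `rule` applies to `file`, regardless of include/exclude.
fn rule_matches(rule: &SparsePatternsPath, file: &str) -> bool {
    let (dir, file) = (components(rule.path), components(file));
    match rule.path_type {
        SparsePatternsPathType::DirInclusion | SparsePatternsPathType::DirExclusion => {
            file.starts_with(&dir)
        }
        // Only direct children of the directory.
        SparsePatternsPathType::Files => {
            file.len() == dir.len() + 1 && file.starts_with(&dir)
        }
        SparsePatternsPathType::File => file == dir,
    }
}

/// The last rule in the list that matches `file` decides the outcome.
fn is_included(rules: &[SparsePatternsPath], file: &str) -> bool {
    rules
        .iter()
        .rev()
        .find(|rule| rule_matches(rule, file))
        .map_or(false, |rule| rule.path_type != SparsePatternsPathType::DirExclusion)
}
```

Under this model, the re-ordered list and its canonicalized form give the same answer for every file, because the trailing `exclude:foo/bar` rule shadows both earlier rules beneath `foo/bar`.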

## Working Copy Map

WARNING: This section is intentionally incomplete; more research is needed.

All Sparse Patterns v2 will come equipped with a default no-op mapping.

```rust
vec![WorkingCopyMap {
    src_path: RepoPathBuf::root(),
    dst_path: RepoPathBuf::root(),
}]
```

`SparsePatterns` will provide an interface to map client paths into repo paths,
and vice versa. The `WorkingCopy` trait will apply this mapping to all snapshot
and checkout operations, and jj commands which accept relative paths will
need to be updated to perform client path -> repo path translations as needed.
It's not clear at this time _which_ commands will need changing, as some are
more likely to refer to repo paths rather than client paths.

TODO: Expand this section.

In particular, the path rules for sparse patterns will _always_ be repo paths,
not client paths. Thus, if the client wants to track "foo" and rename it to
"subdir/bar", they must `jj sparse set --add foo` and
`jj client-map set --from foo --to subdir/bar`. In other words, the mapping
operation can be thought of as always applying _after_ the sparse operation.

New commands will enable editing of the WorkingCopyMap:

- `jj client-map list` will print all mapping pairs.
- `jj client-map set --from foo --to bar` will add a new mapping to the end of the list.
- `jj client-map remove --from foo` will remove a specific mapping rule.
- `jj client-map edit` will pull up a text editor for manual editing.

Like sparse paths, mapping rules are defined to apply in _order_, and on every
save operation they are reduced to a minimal canonical form. The first matching
rule in reverse list order determines how a particular folder/file should be
named in the client. Thus, `jj client-map set --from "" --to ""` will always
completely wipe the map, since the new root mapping matches every path and
therefore shadows all earlier rules.

For simplicity, we require that all client-maps are _invertible_, that is:

- For any repo path R, there is at most one associated client path C
    - Having no associated client path is fine (that part of the repo will never be materialized)
- For any client path C, there is at most one associated repo path R
    - Having no associated repo path is fine (all files in such paths will be ignored)
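
Since this part of the design is still open, the following is only one possible reading of the mapping rules, sketched over plain strings. `Mapping`, `rebase`, `to_client`, `to_repo`, and `round_trips` are hypothetical names; the sketch assumes a rule rewrites a path by swapping its `src_path` prefix for `dst_path`, and that the reverse direction uses the same "last matching rule wins" order.

```rust
struct Mapping {
    src_path: &'static str,
    dst_path: &'static str,
}

/// Splits a slash-separated path into its non-empty components.
fn components(path: &str) -> Vec<&str> {
    path.split('/').filter(|c| !c.is_empty()).collect()
}

/// Replaces the `from` prefix of `path` with `to`.
/// Returns `None` if `from` is not a component-wise prefix of `path`.
fn rebase(path: &str, from: &str, to: &str) -> Option<String> {
    let (path, from) = (components(path), components(from));
    if !path.starts_with(&from) {
        return None;
    }
    let mut out = components(to);
    out.extend_from_slice(&path[from.len()..]);
    Some(out.join("/"))
}

/// Repo path -> client path, using the last rule whose source matches.
fn to_client(map: &[Mapping], repo_path: &str) -> Option<String> {
    map.iter()
        .rev()
        .find_map(|m| rebase(repo_path, m.src_path, m.dst_path))
}

/// Client path -> repo path, using the last rule whose destination matches.
fn to_repo(map: &[Mapping], client_path: &str) -> Option<String> {
    map.iter()
        .rev()
        .find_map(|m| rebase(client_path, m.dst_path, m.src_path))
}

/// Spot check for one repo path: mapping it to the client and back must
/// return the same path. Unmapped paths pass trivially.
fn round_trips(map: &[Mapping], repo_path: &str) -> bool {
    match to_client(map, repo_path) {
        Some(client) => to_repo(map, &client).as_deref() == Some(repo_path),
        None => true,
    }
}
```

With the default root mapping followed by `foo -> subdir/bar`, this sketch maps `foo/a.txt` to `subdir/bar/a.txt` and back, while a repo path that already lives under `subdir/bar` collides with the renamed directory and fails the round trip. A check like this only probes individual paths; how overlapping rules should be validated as a whole is left open by the text.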

## Versioning and Storage

Updating the active Sparse Patterns for a particular working copy will now
take place in two separate steps: one which updates the Op Store, and
another which updates the actual working copy.

This gives the user the ability to update the active sparse patterns whilst
not interacting with the local working copy, which is useful for custom
integrations which may not be _able_ to check out particular sparse patterns due
to problems with the backend (encoding, permission errors, etc.). A bad
`jj sparse set --add oops` command can thus be undone, even via `jj op undo`
if desired.

### View Updates

The View object will be migrated to store sparse patterns via id. The
indirection will save on storage since sparse patterns are not expected to
change very frequently.

```rust
// Before:
pub wc_commit_ids: HashMap<WorkspaceId, CommitId>,

// After:
pub struct WorkingCopyInfo {
    pub commit_id: CommitId,
    pub sparse_patterns_id: SparsePatternsId,
}
// ...
pub wc_info: HashMap<WorkspaceId, WorkingCopyInfo>,
```

A repo with no sparse patterns in View storage will be actively migrated upon
running the first jj command with the new release, filling the currently active
`SparsePatternsId` into all historical Views. This migration will be maintained
for ~4 releases (and ~4 months) for backwards compatibility.
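
The per-View conversion could look roughly like the sketch below. The ID types are modelled as `String` aliases for illustration, and `migrate_view`, `active_patterns`, and the `fallback_id` used for workspaces without a known active pattern set are all hypothetical; the text above does not say how such workspaces would be handled.

```rust
use std::collections::HashMap;

// Stand-ins for the real ID types, which are not plain strings.
pub type WorkspaceId = String;
pub type CommitId = String;
pub type SparsePatternsId = String;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorkingCopyInfo {
    pub commit_id: CommitId,
    pub sparse_patterns_id: SparsePatternsId,
}

/// Converts an old-style `wc_commit_ids` map into the new `wc_info` map,
/// attaching each workspace's currently active sparse patterns id.
pub fn migrate_view(
    wc_commit_ids: HashMap<WorkspaceId, CommitId>,
    active_patterns: &HashMap<WorkspaceId, SparsePatternsId>,
    fallback_id: &SparsePatternsId,
) -> HashMap<WorkspaceId, WorkingCopyInfo> {
    wc_commit_ids
        .into_iter()
        .map(|(workspace_id, commit_id)| {
            // Assumption: workspaces with no recorded patterns get the fallback.
            let sparse_patterns_id = active_patterns
                .get(&workspace_id)
                .unwrap_or(fallback_id)
                .clone();
            (workspace_id, WorkingCopyInfo { commit_id, sparse_patterns_id })
        })
        .collect()
}
```

Because each View only stores an id, applying the same id to every historical View adds a small, fixed amount of data per workspace rather than a copy of the pattern list.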
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -112,6 +112,7 @@ nav:
- 'git-submodules': 'design/git-submodules.md'
- 'git-submodule-storage': 'design/git-submodule-storage.md'
- 'JJ run': 'design/run.md'
- 'Sparse Patterns v2': 'design/sparse-v2.md'
- 'Tracking branches': 'design/tracking-branches.md'

