Commit

Document how difftastic handles unordered syntax
Closes #723
Wilfred committed Jul 9, 2024
1 parent 79af24a commit f1bd870
Showing 2 changed files with 49 additions and 0 deletions.
16 changes: 16 additions & 0 deletions README.md
@@ -117,6 +117,22 @@ AST diffing is a lossy process from the perspective of a text
diff. Difftastic will ignore whitespace that isn't syntactically
significant, but merging requires tracking whitespace.

### Can difftastic ignore reordering?

No. Difftastic always considers order to be important, so diffing
e.g. `set(1, 2)` and `set(2, 1)` will show changes.

If you're diffing JSON, consider sorting the keys in each file before
passing the files to difftastic.

```
$ difft <(jq --sort-keys < file_1.json) <(jq --sort-keys < file_2.json)
```
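
Where `jq` is not available, a similar normalization can be sketched in Python. This is only an illustration, not part of difftastic: the `normalize` helper and the sample strings are hypothetical, and the sketch assumes that sorting object keys (while leaving array order alone) is the normalization you want.

```python
import json


def normalize(text: str) -> str:
    """Re-serialize JSON with object keys sorted at every nesting level."""
    # sort_keys=True sorts keys of nested objects too; array order is kept.
    return json.dumps(json.loads(text), sort_keys=True, indent=2)


# Hypothetical inputs: same data, different key order.
a = '{"b": {"y": 1, "x": 2}, "a": [3, 1]}'
b = '{"a": [3, 1], "b": {"x": 2, "y": 1}}'

same = normalize(a) == normalize(b)
print(same)
```

Writing the normalized text to temporary files and diffing those would then hide key reordering but still show reordered array elements.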

See also [Tricky Cases: Unordered Data
Types](https://difftastic.wilfred.me.uk/tricky_cases.html#unordered-data-types)
in the manual.

### Can I use difftastic to check for syntactic changes without diffing?

Yes. Difftastic can check if the two files have the same AST, without
33 changes: 33 additions & 0 deletions manual/src/tricky_cases.md
@@ -371,6 +371,39 @@ Syntactic diffing can ignore whitespace changes, but it has to assume
punctuation is meaningful. This can lead to punctuation changes being
highlighted, which may be quite far from the relevant content change.

## Unordered Data Types

```
// Before
set(1, 2)
// After
set(2, 1)
```

Users may expect difftastic to find no changes here. This is difficult
for several reasons.

For programming languages, side effects might make the order
relevant. `set(foo(), bar())` might behave differently to `set(bar(),
foo())`.
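
A minimal Python sketch of this point follows. The names `set_of`, `foo`, `bar` and `log` are hypothetical stand-ins for the example above. The two resulting sets compare equal, yet the side effects happen in a different order.

```python
# Hypothetical illustration: argument order matters when arguments
# have side effects, even if the resulting collection is unordered.
log = []


def foo():
    log.append("foo")
    return 1


def bar():
    log.append("bar")
    return 2


def set_of(*items):
    return set(items)


first = set_of(foo(), bar())
order_first = list(log)

log.clear()
second = set_of(bar(), foo())
order_second = list(log)

print(first == second)             # the sets themselves are equal
print(order_first, order_second)   # the call order is not
```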

For configuration languages like JSON or YAML, some parser
implementations do actually expose ordering information
(e.g. `object_pairs_hook=OrderedDict` in Python, or serde_json's
`preserve_order` feature in Rust).
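
As a small sketch of the Python case, the standard `json` module can be asked to build `OrderedDict` values, whose equality check is order-sensitive. The sample strings below are made up for illustration.

```python
import json
from collections import OrderedDict

# Hypothetical inputs: same keys and values, different key order.
before = '{"a": 1, "b": 2}'
after = '{"b": 2, "a": 1}'

# Plain dicts compare equal regardless of key order.
plain_equal = json.loads(before) == json.loads(after)

# OrderedDict-to-OrderedDict comparison takes key order into account.
ordered_equal = (
    json.loads(before, object_pairs_hook=OrderedDict)
    == json.loads(after, object_pairs_hook=OrderedDict)
)

print(plain_equal, ordered_equal)
```

A program that loads its configuration this way can observe a reordering that looks harmless in the file, so a diff tool cannot safely assume key order is irrelevant.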

To make matters worse, unordered tree diffing is NP-hard.

> For the unordered case, it turns out that all of the problems in
> general are NP-hard. Indeed, the tree edit distance and alignment
> distance problems are even MAX SNP-hard.
>
> -- [A survey on tree edit distance and related problems](https://doi.org/10.1016/j.tcs.2004.12.030)

**Difftastic**: Difftastic considers ordering to be meaningful
everywhere, so it will always report ordering changes.

## Novel Blank Lines

Blank lines are challenging for syntactic diffs. We are comparing