conflicts: insert trailing newline when materializing conflicts #3977

ilyagr · 2024-06-26T23:59:57Z

This happens for non-empty files only; the materialization code can
handle files and hunks with 0 or more lines, but each line should
terminate in a newline.

This is an imperfect fix to #3968, a better fix is TODO; see #3975.

Checklist

If applicable:

I have updated CHANGELOG.md
(n/a) I have updated the documentation (README.md, docs/, demos/)
(n/a) I have updated the config schema (cli/src/config-schema.json)
I have added tests to cover my changes

lib/src/conflicts.rs

yuja · 2024-06-27T14:47:18Z

lib/src/merge.rs

@@ -415,6 +415,12 @@ impl<T> FromIterator<T> for MergeBuilder<T> {
    }
 }

+impl<T> FromIterator<T> for Merge<T> {


iirc, .collect() into Merge is not implemented by design, and the caller should use MergeBuilder instead.

I guess the point was to avoid changing the length of the Merge. That is significant, thanks. [[Update: Actually, on second though, I'm not sure .collect() is any worse in this regard than the .collect::<MergeBuilder<_>>().build() it expands to, but having an iterator in a situation where the length is dangerous to change seems ugly regardless]]. Merge::from_vec still exists, but I guess we should eventually make it private.

I think it would be better to add a owned_map method to Merge (Or into_map? I'm not sure about the naming or how to make it look idiomatic. In principle, the current map could become something like .as_ref().map(), but I'm not sure that's nice). Using MergeBuilder when the length of the merge doesn't change feels ugly and unintuitive.

I went with into_map. PTAL.

I think I tried to explain the reason here:
https://github.com/martinvonz/jj/blob/eba9ac0b1b6cc8a564749598dd8101d60bceffe5/lib/src/merge.rs#L374-L379

However, I don't understand it myself now. Feel free to try to remove MergeBuilder and see how it goes. Maybe I was just confused (or maybe I'm missing something now).

I also don't understand why Option matters, but I think the goal was to make .collect() not panic.

yuja · 2024-06-27T14:50:51Z

lib/src/conflicts.rs

+            };
+            ContentHunk(side)
+        })
+        .collect();


Can't we do padding when writing hunks or added/removed contents? It will help implement "a better solution" later.

It might render eol-only change as "different", but I think that's correct.

What kind of padding do you mean?

something like

write_all(data); if !data.ends_with(b'\n') { // TODO: better solution ... write_all(b"\n"); }

Edit: I assume it's a presentation problem rather than a bug in diff/merge logic.

First of all, what we are trying to avoid is:

<<<<<<< Conflict 1 of 1 %%%%%%% Changes from base to side #1 -base+++++++ Contents of side #2 right>>>>>>> Conflict 1 of 1 ends

The most naive interpretation of your suggestion (putting the padding at the end of the function) wouldn't help with that AFAICT.

What makes more sense (but I'm leaning against from) is to put the padding inside the conflict processing, around https://github.com/martinvonz/jj/pull/3977/files#diff-15b48dd183d7f89959f15b80cee2dcae0bb7099e1e934497f74c73384315fa6bR254 (this is what I tried doing first, the new line is a remnant of this). Note that each side of the hunk must be padded, and this will only matter if there is a conflict hunk at the very end of the file.

I think that approach is doable, and it has the advantage that, if there is no conflict at the end of the file, the file would not be mangled.

However, the reason I moved it to the beginning is that my plan was to replace padding with the more complicated transformation I described in #3795 (Update: I meant #3975 here and elsewhere). That transformation would need to be applied both inside and outside conflicts, so the beginning of the function seems to be the right place for it.

Is moving the padding inside conflict processing what you meant?

Is moving the padding inside conflict processing what you meant?

Yes. If I understand the problem, the logic is correct to the point of let merge_result = files::merge(&slices), and the rendering code is broken. We'll need to fix output.write_all(&left.0)?;, write_diff_hunks(), etc. to be compatible with line-oriented conflict format, by inserting "\n" (or "\n\\No newline at end of file?)

Since the rendering code is internal, I was thinking that we can simply declare that there is a precondition for it to work correctly: the input has to be 0 or more newline-terminated lines.

In my mind, this is similar to the fact that our rendering code is broken (produces results that cannot be parsed correctly) when the input contains conflict markers.

To me, the rendering code is broken, so we should fix it rather than introducing lossy transformation. Perhaps, nested conflict markers can be escaped somehow (e.g. by using hunk syntax?)

I'm now thinking of just preparing the full solution from #3795, since it treats both issues.

nit: wrong issue number?

I'm worried that you'll oppose it since it's similar in spirit to this PR: each side is encoded to satisfy the 2 conditions (newlines & not having and conflict markers) before the file conflict is split into hunks.

I'm not sure why you want to preprocess the raw inputs. It might be slightly easier, but is wrong in that "foo" and "foo\n" no longer produce a conflict? That might be okay in practice, but would be weird if conflicted file had no conflicts inside.

If I remember correctly, one of the proposed solutions was to insert something like \ No newline at end of file and decode it by parse_conflict(). I think parse_conflict_hunk() will roughly do:

split by conflict markers

parse contents and hunks (where "No EOL" will be processed)

collect them into Merge<ContentHunk>

And it makes sense that materialize_merge_result() does the reverse.

nit: wrong issue number?

The correct issue number is #3975. Sorry for the confusion!

That approach would make the transformation lossless. Generally, I hope that my comments will make more sense if you look at that.

My current iteration for the plan for #3975 is to start special lines with \JJ, e.g. \JJ: No newline at the end of file or \JJ Verbatim line: <<<<<<<.

Once that is done, my preliminary plan for #3223 is to materialize deleted sides as \JJ There is no file at this path on this side of the conflict. It may have been deleted.

That seems good (though I don't have any particular preference regarding syntax.) Anyway, my point is that the source hunks shouldn't be padded, which is lossy.

is wrong in that "foo" and "foo\n" no longer produce a conflict? That might be okay in practice, but would be weird if conflicted file had no conflicts inside.

By the way, this is a good point, I hadn't initially realized that there could be conflicts that are no longer conflicts after adding the newline at the end, potentially resulting in occasional problems like #3223.

I'll close this in favor in hopefully upcoming comprehensive solution.

…wline jj-vcs#3968

… reference We are about to do some pre-processing of that value, and we will do more in the near future.

This happens for non-empty files only; the materialization code can handle files and hunks with 0 or more lines, but each line should terminate in a newline. This is an imperfect fix to jj-vcs#3968, a better fix is TODO; see jj-vcs#3975. This also implements `Merge::into_map` to avoid distracting iterations and MergeBuilders. I could split it into a separate commit, but then it'd have no uses.

ilyagr · 2024-07-15T03:43:54Z

Replaced by #4088

ilyagr marked this pull request as ready for review June 27, 2024 00:01

ilyagr commented Jun 27, 2024

View reviewed changes

lib/src/conflicts.rs Show resolved Hide resolved

ilyagr mentioned this pull request Jun 27, 2024

tests: add word-level hunks and capture some colorized diff outputs #3978

Merged

4 tasks

ilyagr force-pushed the conflictnewlines branch 4 times, most recently from a3f28aa to c33166f Compare June 27, 2024 03:55

yuja reviewed Jun 27, 2024

View reviewed changes

ilyagr force-pushed the conflictnewlines branch 5 times, most recently from dba45bb to 0e065e8 Compare June 29, 2024 18:56

ilyagr force-pushed the conflictnewlines branch 2 times, most recently from 97373a3 to 2ed234e Compare July 7, 2024 07:37

ilyagr added 3 commits July 10, 2024 09:53

conflicts: demo failure to materialize if conflicts don't end in a ne…

dc03612

…wline jj-vcs#3968

conflicts: Have materialize_merge_result take the object instead of a…

d5b1eec

… reference We are about to do some pre-processing of that value, and we will do more in the near future.

ilyagr force-pushed the conflictnewlines branch from 2ed234e to a6f8d2a Compare July 10, 2024 16:59

ilyagr closed this Jul 10, 2024

ilyagr mentioned this pull request Jul 14, 2024

diff: leverage BStr for debug printing #4082

Merged

4 tasks

ilyagr mentioned this pull request Jul 18, 2024

lib conflicts: Materialize (sides with (conflict markers) or (no newline at EOF)) in a way that can be parsed #4088

Draft

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

conflicts: insert trailing newline when materializing conflicts #3977

conflicts: insert trailing newline when materializing conflicts #3977

ilyagr commented Jun 26, 2024

yuja Jun 27, 2024

ilyagr Jun 27, 2024 •

edited

Loading

ilyagr Jun 27, 2024

martinvonz Jun 28, 2024

yuja Jun 28, 2024

yuja Jun 27, 2024

ilyagr Jun 27, 2024 •

edited

Loading

yuja Jun 28, 2024 •

edited

Loading

ilyagr Jun 28, 2024 •

edited

Loading

yuja Jun 28, 2024

yuja Jul 9, 2024

ilyagr Jul 9, 2024 •

edited

Loading

ilyagr Jul 9, 2024

yuja Jul 10, 2024

ilyagr Jul 10, 2024 •

edited

Loading

ilyagr commented Jul 15, 2024

conflicts: insert trailing newline when materializing conflicts #3977

conflicts: insert trailing newline when materializing conflicts #3977

Conversation

ilyagr commented Jun 26, 2024

Checklist

Choose a reason for hiding this comment

ilyagr Jun 27, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ilyagr Jun 27, 2024 • edited Loading

Choose a reason for hiding this comment

yuja Jun 28, 2024 • edited Loading

Choose a reason for hiding this comment

ilyagr Jun 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ilyagr Jul 9, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ilyagr Jul 10, 2024 • edited Loading

Choose a reason for hiding this comment

ilyagr commented Jul 15, 2024

ilyagr Jun 27, 2024 •

edited

Loading

ilyagr Jun 27, 2024 •

edited

Loading

yuja Jun 28, 2024 •

edited

Loading

ilyagr Jun 28, 2024 •

edited

Loading

ilyagr Jul 9, 2024 •

edited

Loading

ilyagr Jul 10, 2024 •

edited

Loading