-
Notifications
You must be signed in to change notification settings - Fork 343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
conflicts: insert trailing newline when materializing conflicts #3977
Conversation
a3f28aa
to
c33166f
Compare
lib/src/merge.rs
Outdated
@@ -415,6 +415,12 @@ impl<T> FromIterator<T> for MergeBuilder<T> { | |||
} | |||
} | |||
|
|||
impl<T> FromIterator<T> for Merge<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc, .collect()
into Merge
is not implemented by design, and the caller should use MergeBuilder
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the point was to avoid changing the length of the Merge. That is significant, thanks. [[Update: Actually, on second though, I'm not sure .collect()
is any worse in this regard than the .collect::<MergeBuilder<_>>().build()
it expands to, but having an iterator in a situation where the length is dangerous to change seems ugly regardless]]. Merge::from_vec still exists, but I guess we should eventually make it private.
I think it would be better to add a owned_map
method to Merge (Or into_map
? I'm not sure about the naming or how to make it look idiomatic. In principle, the current map
could become something like .as_ref().map()
, but I'm not sure that's nice). Using MergeBuilder when the length of the merge doesn't change feels ugly and unintuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with into_map
. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I tried to explain the reason here:
https://github.com/martinvonz/jj/blob/eba9ac0b1b6cc8a564749598dd8101d60bceffe5/lib/src/merge.rs#L374-L379
However, I don't understand it myself now. Feel free to try to remove MergeBuilder
and see how it goes. Maybe I was just confused (or maybe I'm missing something now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't understand why Option
matters, but I think the goal was to make .collect()
not panic.
lib/src/conflicts.rs
Outdated
}; | ||
ContentHunk(side) | ||
}) | ||
.collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we do padding when writing hunks or added/removed contents? It will help implement "a better solution" later.
It might render eol-only change as "different", but I think that's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of padding do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something like
write_all(data);
if !data.ends_with(b'\n') {
// TODO: better solution ...
write_all(b"\n");
}
Edit: I assume it's a presentation problem rather than a bug in diff/merge logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, what we are trying to avoid is:
<<<<<<< Conflict 1 of 1
%%%%%%% Changes from base to side #1
-base+++++++ Contents of side #2
right>>>>>>> Conflict 1 of 1 ends
The most naive interpretation of your suggestion (putting the padding at the end of the function) wouldn't help with that AFAICT.
What makes more sense (but I'm leaning against from) is to put the padding inside the conflict processing, around https://github.com/martinvonz/jj/pull/3977/files#diff-15b48dd183d7f89959f15b80cee2dcae0bb7099e1e934497f74c73384315fa6bR254 (this is what I tried doing first, the new line is a remnant of this). Note that each side of the hunk must be padded, and this will only matter if there is a conflict hunk at the very end of the file.
I think that approach is doable, and it has the advantage that, if there is no conflict at the end of the file, the file would not be mangled.
However, the reason I moved it to the beginning is that my plan was to replace padding with the more complicated transformation I described in #3795 (Update: I meant #3975 here and elsewhere). That transformation would need to be applied both inside and outside conflicts, so the beginning of the function seems to be the right place for it.
Is moving the padding inside conflict processing what you meant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is moving the padding inside conflict processing what you meant?
Yes. If I understand the problem, the logic is correct to the point of let merge_result = files::merge(&slices)
, and the rendering code is broken. We'll need to fix output.write_all(&left.0)?;
, write_diff_hunks()
, etc. to be compatible with line-oriented conflict format, by inserting "\n"
(or "\n\\No newline at end of file
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the rendering code is internal, I was thinking that we can simply declare that there is a precondition for it to work correctly: the input has to be 0 or more newline-terminated lines.
In my mind, this is similar to the fact that our rendering code is broken (produces results that cannot be parsed correctly) when the input contains conflict markers.
To me, the rendering code is broken, so we should fix it rather than introducing lossy transformation. Perhaps, nested conflict markers can be escaped somehow (e.g. by using hunk syntax?)
I'm now thinking of just preparing the full solution from #3795, since it treats both issues.
nit: wrong issue number?
I'm worried that you'll oppose it since it's similar in spirit to this PR: each side is encoded to satisfy the 2 conditions (newlines & not having and conflict markers) before the file conflict is split into hunks.
I'm not sure why you want to preprocess the raw inputs. It might be slightly easier, but is wrong in that "foo"
and "foo\n"
no longer produce a conflict? That might be okay in practice, but would be weird if conflicted file had no conflicts inside.
If I remember correctly, one of the proposed solutions was to insert something like \ No newline at end of file
and decode it by parse_conflict()
. I think parse_conflict_hunk()
will roughly do:
- split by conflict markers
- parse contents and hunks (where "No EOL" will be processed)
- collect them into
Merge<ContentHunk>
And it makes sense that materialize_merge_result()
does the reverse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: wrong issue number?
The correct issue number is #3975. Sorry for the confusion!
That approach would make the transformation lossless. Generally, I hope that my comments will make more sense if you look at that.
My current iteration for the plan for #3975 is to start special lines with \JJ
, e.g. \JJ: No newline at the end of file
or \JJ Verbatim line: <<<<<<<
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once that is done, my preliminary plan for #3223 is to materialize deleted sides as \JJ There is no file at this path on this side of the conflict. It may have been deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems good (though I don't have any particular preference regarding syntax.) Anyway, my point is that the source hunks shouldn't be padded, which is lossy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is wrong in that "foo" and "foo\n" no longer produce a conflict? That might be okay in practice, but would be weird if conflicted file had no conflicts inside.
By the way, this is a good point, I hadn't initially realized that there could be conflicts that are no longer conflicts after adding the newline at the end, potentially resulting in occasional problems like #3223.
I'll close this in favor in hopefully upcoming comprehensive solution.
dba45bb
to
0e065e8
Compare
97373a3
to
2ed234e
Compare
… reference We are about to do some pre-processing of that value, and we will do more in the near future.
This happens for non-empty files only; the materialization code can handle files and hunks with 0 or more lines, but each line should terminate in a newline. This is an imperfect fix to jj-vcs#3968, a better fix is TODO; see jj-vcs#3975. This also implements `Merge::into_map` to avoid distracting iterations and MergeBuilders. I could split it into a separate commit, but then it'd have no uses.
Replaced by #4088 |
This happens for non-empty files only; the materialization code can
handle files and hunks with 0 or more lines, but each line should
terminate in a newline.
This is an imperfect fix to #3968, a better fix is TODO; see #3975.
Checklist
If applicable:
CHANGELOG.md