Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: implement jj gerrit send for sending changes to Gerrit #2845

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

thoughtpolice
Copy link
Member

@thoughtpolice thoughtpolice commented Jan 18, 2024

Warning

This PR is very rough, and needs significant polish to handle edge cases and other small nits, but functions for basic uses just fine, I think. Please only use it on copies of your existing repositories, or throwaway repositories that you can jj undo on. This warning will be removed once I feel it's more stable.

Please remember: Do not taunt Happy Fun Ball.

This implements the most basic workflow for submitting changes to Gerrit, through a verb called 'send'. Send was chosen to avoid conflicting terminology with terms like "submit" (which merges a change in Gerrit) or "review" which is ambiguous.

Given a list of revsets (specified by multiple -r options), this will parse the footers of every commit, collect them, insert a Change-Id (if one doesn't already exist), and then push them into the given remote.

Because the argument is a revset, you may submit entire trees of changes at once, including multiple trees of independent changes, e.g.

jj gerrit send -r foo:: -r baz::

The remote to push into is, by default, main@origin, and will translate to refs/for/main at that remote. I.e. it is expected that you are using Gerrit as the source of truth, of course. You can specify another remote using the argument --for however, e.g. --for main@gerrit which will change the remote. This syntax is intended to resemble existing syntax for remote references in jj for the sake of UX consistency, rather than blindly exposing the Git refs/ syntax to users, which isn't needed, practically speaking.

There are many other improvements that can be applied on top of this, including a ton of consistency and "does this make sense?" checks. However, it is flexible and a good starting point, and you can in fact both submit and cycle reviews with this interface.

@thoughtpolice thoughtpolice self-assigned this Jan 18, 2024
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from ae28e5f to 1fb8e16 Compare January 18, 2024 20:53
@thoughtpolice thoughtpolice changed the title cli: implement jj gerrit mail for submitting changes to Gerrit cli: implement jj gerrit mail for sending changes to Gerrit Jan 18, 2024
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from 1fb8e16 to 0d6be43 Compare January 19, 2024 21:00
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from 0d6be43 to 65e7135 Compare January 19, 2024 21:16
Copy link
Member

@martinvonz martinvonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still haven't reviewed the last commit, but it's getting late, so here's what I have for now.

cli/src/cli_util.rs Outdated Show resolved Hide resolved
lib/src/footer.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
lib/src/footer.rs Outdated Show resolved Hide resolved
// to ensure we parse the footer in an unambiguous manner; this avoids cases
// where a colon in the body of the message is mistaken for a footer line
let mut footer = IndexMap::new();
let lines: Vec<&str> = body.trim().lines().rev().collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the first line could be skipped by .next() and could do .rev().map().take_while() or something.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
/// the form `branch@remote`, where `branch` is the target branch name that
/// the change will be submitted to, and `remote` is the remote to push to.
/// This remote MUST be a Git push URL for your Gerrit instance.
#[arg(long = "for", short = 'f', default_value = "main@origin")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be better to use trunk() as the default. We can't determine the remote if the trunk revision had multiple remote branches, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while trunk() probably does make some sense since it tracks what the upstream would be implicitly (without hardcoding pre-existing branch/remote names), I think it's rather confusing because revsets are already used to specify the trees you want to push, and can be specified multiple times, and trunk() looks quite like one at a glance. For example,

jj gerrit mail -r 'open() ~ empty()' -f 'trunk()'

reads quite strangely to me, especially because trunk() would be rather special in this context where it actually specifies a pair of (branch, remote) to send the change to; it's not clear at a glance what that means. I think that the branch@remote syntax is more familiar in this case since it's what shows up in the default log output, it's used for all other @git refs, etc. At the same time, it's also reasonably easy to translate to Gerrit's git push origin HEAD:refs/for/main so it's not too unfamiliar. It's half-way between us and them.

Related, but what I would actually like to do is have this default be over-rideable in the repository configuration, so that people can also do things like set the URL to the Gerrit instance. e.g.

jj config set --repo gerrit.default-remote canon@upstream

or something of this manner to override main@origin as the default --for. We could also have a nested TOML list in the same manner so that remotes can have individually specified default branch targets, e.g. set gerrit.<remote>.default-for "master" where <remote> can be any Git remote.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jj gerrit mail -r 'open() ~ empty()' -f 'trunk()'

Yeah, it's a bit odd. I don't think --for should support a revset in general, but I mean the default could be derived from the trunk(). If the user configured per-repository trunk(), it would be something like trunk() = "foo@origin", so the purpose is quite similar to gerrit.default-remote. Maybe we should have a parsable configuration for that, and both trunk() and the gerrit --for should default to it?

BTW, is this origin in --for foo@origin the gerrit endpoint? The project I worked before had a gerrit-specific SSH remote, and we usually don't pull from it. In that setup, the user-visible remote branch is something like master@origin, but the remote to push to is gerrit.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
// mark the old commit as rewritten to itself, but only because it
// was a --dry-run, so we can give better error messages later
old_to_new.insert(original_commit.clone(), (original_commit.clone(), true));
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just rewrite the commits, and discard the transaction? iirc, git push --dry-run -c does a similar thing.

} else {
write!(ui.stderr(), "Skipped Change-Id (it already exists) for ")?;
}
tx.write_commit_summary(ui.stderr_formatter().as_mut(), new)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, this loop can be moved to the previous "rewriting" loop, and we wouldn't have to maintain the mapping between old and new commits.

I think the commit summary can be printed by tx.base_workspace_helper() to suppress scary markers.

cli/src/commands/gerrit.rs Outdated Show resolved Hide resolved
@necauqua
Copy link
Contributor

From an outside perspective of a non-gerrit user, the word 'mail' in something seemingly unrelated to email (it is, right?) feels.. really weird

Just my input, I've no real arguments for or against

@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 2 times, most recently from e41fca1 to 4fb99d4 Compare January 21, 2024 04:15
@martinvonz
Copy link
Member

martinvonz commented Jan 23, 2024

From an outside perspective of a non-gerrit user, the word 'mail' in something seemingly unrelated to email (it is, right?) feels.. really weird

What do you think we should call it? @thoughtpolice's first suggestion was review. I think the main problem with that is that it sounds more like "perform review" than "send for review". send-for-review seems too long. Or is it okay given tab completion? Any other options? get-review? (Maybe just review is still the best?)

@thoughtpolice
Copy link
Member Author

I think send as Yuya mentioned in his review wasn't bad, either. I admit, I find mail to be very "fun", so I quite like it. (Maybe that's the inner Haskell programmer coming out...) I'd prefer we avoid hyphenated-words but, that's mostly aesthetic, I admit.

I also plan on adding some other commands to fetch changes from Gerrit too (though sending them is the most important since fetching them has some workarounds.) Not sure how that factors in to the vocabulary, but something to consider...

@joyously
Copy link

joyously commented Jan 23, 2024

I suggest send also, since that's what Breezy uses. See Breezy doc.

Mail or create a merge-directive for submitting changes.

Oh, and Git has send-email and send-pack,

git-send-email - Send a collection of patches as emails
git-send-pack - Push objects over Git protocol to another repository

and Darcs has send.

Prepare a bundle of patches to be applied to some target repository.

@benbrittain
Copy link
Member

I too am very supportive of send over mail. There might be a better word but I appreciate not overloading the gerrit terminology.

@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from 4fb99d4 to 462f966 Compare February 9, 2024 21:08
@thoughtpolice thoughtpolice changed the title cli: implement jj gerrit mail for sending changes to Gerrit cli: implement jj gerrit send for sending changes to Gerrit Feb 9, 2024
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 2 times, most recently from bcf8a9c to 699346f Compare February 14, 2024 22:53
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 5 times, most recently from 8a40482 to 08e210b Compare February 23, 2024 03:16
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch 2 times, most recently from c16fc43 to 3afc3e4 Compare May 7, 2024 17:00
@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from 3afc3e4 to 3c2e0b0 Compare May 14, 2024 00:06
@matts1
Copy link
Contributor

matts1 commented May 14, 2024

This PR appears to be maintained, constantly rebasing on top of main, but it's been two months since the last comment.

Could we either get this merged, or come up with a list of blockers that we need resolved before it can be merged? I see the tests aren't passing on CI - is that the only blocker?

From what I could tell after looking at the PR, there are some minor improvements that could be made, but as @thoughtpolice said, it works, and I think it'd be nice to submit now and iterate on this later.

@martinvonz
Copy link
Member

As it happens, there was a thread about this on Discord today. @benbrittain also wanted to this merged. IIUC, the main reason it's not merged is that @thoughtpolice has run into some libgit2 (?) bug once in a while which manifests like this: https://stackoverflow.com/questions/16586642/git-unpack-error-on-push-to-gerrit. It still seems useful even if that happens sometimes.

It looks like there are some unresolved comments too, and maybe we'll want another review of the code since it's been a while. I basically don't remember anything from my previous rounds of review. Do you have time to review it, @matts1?

@matts1
Copy link
Contributor

matts1 commented May 14, 2024

Sure, I can review - I'll take a look once CI is passing.

Copy link
Contributor

@mlcui-corp mlcui-corp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random drive-by comments.

lib/src/footer.rs Show resolved Hide resolved
///
/// In this case, there are four footer lines: two `Co-authored-by` lines, one
/// `Reviewed-by` line, and one `Change-Id` line.
pub fn get_footer_lines(body: &str) -> Vec<FooterEntry> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth handling this case from git-interpret-trailers' man page?

The may be split over multiple lines with each subsequent line starting with at least one whitespace, like the "folding" in RFC 822. Example:

key: This is a very long value, with spaces and
  newlines in it.

Gerrit code uses RevCommit#getFooterLines which seems to handle this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Yes, I'll look into fixing this.

@thoughtpolice thoughtpolice force-pushed the aseipp/push-uytvkxyqyspn branch from 3c2e0b0 to e71a4ee Compare June 12, 2024 20:49
cli/src/commands/gerrit/send.rs Show resolved Hide resolved
cli/src/commands/gerrit/send.rs Show resolved Hide resolved
}

// case 2
if let Ok(remote) = config.get_string("gerrit.default_remote") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should case 3 and case 4 come before case 2? If the user sets gerrit.default_remote to foo, and there's only one remote bar, should we just upload to bar? I don't think there's a right or wrong answer here, but whatever we decide, I think we should add a comment to the code to show that the behaviour is intentional.

Following on from that, should case 4 be combined with case 2? We could just make it let remote = config.get_string("gerrit.default_remote").unwrap_or("gerrit").

cli/src/commands/gerrit/send.rs Show resolved Hide resolved
if footer.iter().filter(|entry| entry.0 == "Change-Id").count() > 1 {
writeln!(
ui.warning_default(),
"warning: multiple Change-Id footers in commit {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You called this an error in the comment above, but this isn't an error - it looks like it'll specifically just skip uploading this one commit, but upload all others.

I'd personally prefer we made this into an actual error. I think that if, for example, we have a parent commit which has duplicate change IDs, and a child commit which is valid, uploading the child without the parent seems like a bad idea.


// check the change-id format is correct in any case
if cid.len() != 41 || !cid.starts_with('I') {
writeln!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - I'd prefer an error

// them, and I realized my old signoff.sh script created invalid
// ones, so this is a helpful barrier.

continue; // fallthrough
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of these continues make it very difficult to understand the code path. I'd prefer we use if/else instead.

If it's too nested that way or something like that, putting it into functions may make it easier

Err(user_error(err.to_string()))
}
// XXX (aseipp): more cases to handle here?
_ => Err(user_error(err.to_string())),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a user error?

"warning: internal git error while pushing to gerrit: {}",
err
)?;
Err(user_error(err.to_string()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a user error

///
/// This command modifies each commit in the revset to include a `Change-Id`
/// footer in its commit message if one does not already exist. Note that
/// this ID is NOT compatible with jj IDs, and is Gerrit-specific.
Copy link
Contributor

@matts1 matts1 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't remember if this was discussed earlier, but have we considered making this a transient hash of the JJ change ID that isn't actually stored in the committee description?

I've implemented something roughly similar to what you've done, and one problem I had was that when I run JJ split you end up with, by default, two commits with the same gerrit change ID.

That being said, I consider this a very good improvement over the current behaviour, so I'd recommend we submit as is even if we were to change it to that in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like using jj change hex as Change-Id (see #4477 (comment)) but I'm not sure if we can get away with not storing it explicitly -- splits could get very confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even git interactions could get weird (git commit --amend --no-edit, for example).

Copy link
Contributor

@matts1 matts1 Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a hybrid approach where we don't store it explicitly automatically, but if you have manually stored it explicitly then we use that Change-ID would work relatively well. It'd also work well with splits by providing reasonable defaults, but could be overridden if required.

Re splits, it should work just fine as long as the user knows exactly what's gonna happen with the split (specifically, jj split to associate the change-ID with the second one, and jj commit to associate the Change-ID with the first one). Maybe we need something like jj change-id swap foo bar or jj change-id set -r foo <change-id>.

lilyball added a commit to lilyball/jj that referenced this pull request Oct 19, 2024
This merges PR jj-vcs#3191 (OpenSSH) and PR jj-vcs#2845 (Gerrit) in order to build
from the combination.
lilyball added a commit to lilyball/jj that referenced this pull request Oct 19, 2024
This merges PR jj-vcs#3191 (OpenSSH) and PR jj-vcs#2845 (Gerrit) in order to build
from the combination.
Natural extension of the existing `[T]` instance.

Signed-off-by: Austin Seipp <[email protected]>
Change-Id: Ib7f6fd829096b2cac8e3d7b9471a92ddb76a621b
To be used for parsing `Change-Id`s from commits, in service of Gerrit
support.

Signed-off-by: Austin Seipp <[email protected]>
Change-Id: I434d76b1229b36b815622ad7409ced3a405cbe22
This implements the most basic workflow for submitting changes to Gerrit,
through a verb called 'send'. This verb is intended to be distinct from the word
'submit', which for Gerrit means 'merge a change into the repository.'

Given a list of revsets (specified by multiple `-r` options), this will parse
the footers of every commit, collect them, insert a `Change-Id` (if one doesn't
already exist), and then push them into the given remote.

Because the argument is a revset, you may submit entire trees of changes at
once, including multiple trees of independent changes, e.g.

    jj gerrit send -r foo:: -r baz::

There are many other improvements that can be applied on top of this, including
a ton of consistency and "does this make sense?" checks. However, it is flexible
and a good starting point, and you can in fact both submit and cycle reviews
with this interface.

Signed-off-by: Austin Seipp <[email protected]>
@melutovich
Copy link

I understand that this PR is working to send a special changeID for gerrit, I would assume that ideally the changeID for jj would similarly come from the gerrit changeID when commits come with a gerrit changeID, implementing at least for gerrit issue #4706

@jtolio
Copy link

jtolio commented Dec 14, 2024

Oh man, with all of the recent smattering of new jj buzz, I decided to take the plunge. I am so happy to have found this PR, which is the last remaining need for me (I really don't want to go do the EDITOR Change-Id hack).

What is left to get this merged? Sure seems like this branch is stable, well maintained, in use, etc.

@thoughtpolice
Copy link
Member Author

Oh man, with all of the recent smattering of new jj buzz, I decided to take the plunge. I am so happy to have found this PR, which is the last remaining need for me (I really don't want to go do the EDITOR Change-Id hack).

What is left to get this merged? Sure seems like this branch is stable, well maintained, in use, etc.

My attention has been split elsewhere — among other JJ stuff, and a new job that does not use Gerrit — so I haven't had the time to push this over the finish line. There are many people using it, so I agree it's not ideal.

It mostly just needs some tests and I would consider it in a reviewable state (modulo typical review cycles), it's just low on my stack of TODO items.

@martinvonz
Copy link
Member

t's just low on my stack of TODO items.

Would you like someone else to help finish it? I'm not volunteering, but it maybe @jtolio or someone else is interested.

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Dec 14, 2024

I understand that this PR is working to send a special changeID for gerrit, I would assume that ideally the changeID for jj would similarly come from the gerrit changeID when commits come with a gerrit changeID, implementing at least for gerrit issue #4706

That would be nice, but it is kind of annoying for a number of reasons. A common one (expressed elsewhere) is that you might want to abandon a change on Gerrit, then try to re-push a new version, but it will reject that if the Change-Id footer is the same (e.g. imagine you split out some changes into a new commit, then jj duplicate'd the original change, which keeps the original message.) So then you need to go re-generate the Change-Id footer, duplicate, etc etc.

I don't think there's a clear "best way" to make it work when the Change-Id trailer is not part of the actual commit object. That's part of the reason I changed the logic from using the stable jj change id (in an earlier version of this PR) to using a fully random one, since it avoided things like that. I was also thinking that maybe having a jj gerrit refresh-cid utility function would be useful for cases like that.

Honestly, I still think the best long-term strategy might be to just cooperate with the Git developers and for the two of us (Git and Jujutsu) to standardize on a Change-Id header to be added to commit objects, and then use that. Gerrit could then be patched to recognize it as well.

@thoughtpolice
Copy link
Member Author

t's just low on my stack of TODO items.

Would you like someone else to help finish it? I'm not volunteering, but it maybe @jtolio or someone else is interested.

Yes, I'd be happy to get some help; someone else could even re-file this PR (assuming they kept my name as a coauthor) and I'd be happy to review it and suggest some things I wanted. I think some people have even offered previously; but of course life finds a way to make us busy...

@avamsi
Copy link
Member

avamsi commented Dec 15, 2024

In case it helps anyone, I set Change-Id using templates.draft_commit_description and just git push:

[template-aliases]
	'gerrit_change_id(change_id)' = '"Id0000000" ++ change_id.normal_hex()'

[templates]
	draft_commit_description = '''
		separate("\n",
			description.remove_suffix("\n"),
			if(!description.contains(change_id.normal_hex()),
				"\nChange-Id: " ++ gerrit_change_id(change_id)
			),
			"\n",
			surround("JJ: Changes:\n", "", indent("JJ: \t", diff.summary()))
		)
	'''

(https://github.com/avamsi/dotfiles/blob/28a0e40439de56fe179a1b0e705727c85e075464/.jjconfig.toml#L146-L156)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.