Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: mark locally created changes as such and add local() revset (like mine()) #3529

Open
dpc opened this issue Apr 18, 2024 · 24 comments
Open
Labels
polish🪒🐃 Make existing features more convenient and more consistent

Comments

@dpc
Copy link

dpc commented Apr 18, 2024

Is your feature request related to a problem? Please describe.

The longer description of my motivation is already in #3528

But in a nutshell: it's hard to comprehensively track own work in progress without branches.

Describe the solution you'd like

I wish jj allowed easier revset selection of changes that were created locally, as these are the changes users usually would care about. Seems easy to understand and reason about.

mine(), author()/committer() are not quite like it as they match tons of remote stuff.

Most people suggest some cobinations of @ and trunk(), but these don't handle release (or other branches, one might work with).

local() seems trivial to implement. Just an additional set in the db, where any jj new-created change_id gets added.

To support cases where user would like to "import/consider" stuff as local some kind of CLI command to add and remove changes to local() would be needed as well.

Describe alternatives you've considered

I wasted hours trying to come up with a revset formula that would show me only stuff I care about in a repository with lots of commits, release branches. Random branches in repos have my commits cherry-picked soemwhere insiede, and there are tons of random branches made by me in the past, that I can't confidently clean all. I tried a combination of picking latest of committer/author, etc., finding common ancestors, and so and so.

It always either undershoots or overshoots. It might be doable, but why go through such pain, where in practice a simple "all changes I created locally that are not yet part of these few major branches" would do.

@dpc dpc changed the title FR: mark locally created changes as such and give local() revset (like mine()) FR: mark locally created changes as such and add local() revset (like mine()) Apr 18, 2024
@yuja
Copy link
Collaborator

yuja commented Apr 18, 2024

local() seems trivial to implement. Just an additional set in the db, where any jj new-created change_id gets added.

I don't think this particular model will work out (because growing set can be a problem in small repos), but some commit metadata (similar to #3402) might be useful to determine whether commits were originally created locally. Currently, the "locally-created" state could deduced from change id, but it's implementation detail of the Git backend.

@dpc
Copy link
Author

dpc commented Apr 18, 2024

I don't think this particular model will work out (because growing set can be a problem in small repos), but some commit metadata (similar to #3402) might be useful to determine whether commits were originally created locally.

I don't understand this part. Storing sets of things efficiently is all that databases do. By a "set" here I meant a list of keys. In an SQL db: a table with primary key and no other columns.

Edit: Seems like jj stores everything as files on disk. Which is fine too. A set here could be a directory with files.

@yuja
Copy link
Collaborator

yuja commented Apr 18, 2024

Storing sets of things efficiently is all that databases do. By a "set" here I meant a list of keys.

I just mean the "local change ids" set will grow indefinitely, but the size of the active local changes is quite small compared to the historical "local change ids" set.

Perhaps, "can be a problem in small repos" is wrong. I said that Mercurial in mind, and Mercurial has a "negative" set which has to be always loaded.

@dpc
Copy link
Author

dpc commented Apr 18, 2024

I just mean the "local change ids" set will grow indefinitely

It definitely is not going to grow as fast as the whole history and jj deals with that and many other sets just fine. Eg. mine() or author() can potentially refer to even bigger set. I don't really have preferences how to implemented it BTW, just wanted to underline that logically it's just a plain set without much extra functionality/requirements.

BTW. The local() set could possibly be prunned. As soon as a local change hits the trunk(), etc. it could be considered no longer local and deleted. I don't know if that's a good idea and should be autoatic, but in theory a set of local changes that were not landed is generally small - of the size that a human can manage.

@yuja
Copy link
Collaborator

yuja commented Apr 18, 2024

I just mean the "local change ids" set will grow indefinitely

It definitely is not going to grow as fast as the whole history and jj deals with that and many other sets just fine. Eg. mine() or author() can potentially refer to even bigger set.

Ah, okay. These revset predicates aren't implemented as sets, but they are conceptually sets. Anyway, my point is that "local" can be a commit metadata (like topic), not a separate set of change ids which would require extra maintenance code.

@PhilipMetzger PhilipMetzger added the polish🪒🐃 Make existing features more convenient and more consistent label Apr 18, 2024
@dpc
Copy link
Author

dpc commented Apr 20, 2024

@yuja If this is stored in commit metadata, then it will propagate with the commit itself. So I don't think it can be a simple flag on metadata.

It could be some kind of an extra header JJ_LOCAL_REPO_ID that gets compared with with the local id, possibly randomly generated on jj init or something. But then the commit id changes when users toggles the local flag, etc. which I don't think is desirable.

@yuja
Copy link
Collaborator

yuja commented Apr 20, 2024

@yuja If this is stored in commit metadata, then it will propagate with the commit itself. So I don't think it can be a simple flag on metadata.

There's local-only commit metadata storage where change ids are stored, so we can use it for this feature and topics. I don't know if we need both, though. If topic can solve the problem, we might not want to add two separate concepts.

But then the commit id changes when users toggles the local flag, etc. which I don't think is desirable.

I don't think user needs to toggle the flag off. Once the local commit is merged to the main branch, it can be excluded by ~::main.

(Actually, local commit metadata doesn't affect the commit id, but mutating it without changing the commit id would lead to consistency problem.)

@scott2000
Copy link
Collaborator

If there is no way to toggle it off except for being merged to trunk(), then it seems like it would be mostly equivalent to immutable_heads().. & author(<USER_EMAIL>) & committer(<USER_EMAIL>), because generally (ignoring commits imported from Git when first initializing the repo) shouldn't all local commits have the author and committer be the current user? And shouldn't all non-local commits either have a different author or a different committer?

@joyously
Copy link

The original description says

mine(), author()/committer() are not quite like it as they match tons of remote stuff.

@martinvonz
Copy link
Member

How do we want "created locally" to be defined in a useful way when we support pushing between repos? If I'm working on some project from two different machines (maybe desktop and laptop) and I occasionally push between them, what's a useful definition of "created locally"?

@dpc
Copy link
Author

dpc commented Apr 20, 2024

Can you point me to the local only commit metadata in the code?

@martinvonz
Copy link
Member

Can you point me to the local only commit metadata in the code?

The Git backend stores it here: https://github.com/martinvonz/jj/blob/0fb582ed8f8c45b183874bf479bb9086b95bfcdf/lib/src/git_backend.rs#L123

As Yuya said, it's specific to the Git backend, which means we cannot rely on it being local above that layer, so the UI cannot consider it local.

@dpc
Copy link
Author

dpc commented Apr 20, 2024

How do we want "created locally" to be defined in a useful way when we support pushing between repos? If I'm working on some project from two different machines (maybe desktop and laptop) and I occasionally push between them, what's a useful definition of "created locally"?

You would be able to mark stuff local manually. If it doesn't work for your use case it's ok. It's specifically meant for local stuff, doesn't need to replace all other ways to match revsets.

If you ask feel strongly that it should be generalize to topics, I could try to implement it as such, but I feel like the requirements might be different for both.

@martinvonz
Copy link
Member

If you ask feel strongly that it should be generalize to topics, I could try to implement it as such, but I feel like the requirements might be different for both.

I haven't thought very much about it but my gut feeling is that this feature is too specific to deserve its own place in the data model. OTOH, I'm not sure topics would work for your use case either, since there's probably only going to be a single topic per commit.

Also, the semantics of this feature are very unclear to me. Is it basically a bit you set on a commit? Does it stay there forever once set (until you explicitly clear it)?

I think it would help to better understand the problem you're running into with revsets. Maybe you can share some examples of where it undershoots or overshoots?

@dpc
Copy link
Author

dpc commented Apr 20, 2024

Also, the semantics of this feature are very unclear to me. Is it basically a bit you set on a commit? Does it stay there forever once set (until you explicitly clear it)?

Yes. A single bit of a mark on a change signifying that it was created locally (added by new and alikes), that does not propagate outside.

I think it would help to better understand the problem you're running into with revsets. Maybe you can share some examples of where it undershoots or overshoots?

I bring jj to a long running project where both in upstream remote repo and my forked origin remote there are plenty of one-off branches of mine. So stuff like immutable_heads().. & author("me") & committer("me") shows a lot of stuff, that I can't really clean up, but I don't really care about anymore.

I made some commits against both master and release branches and wanted jj to list all the outstanding work I care about. I pushed them for all sorts of reasons, and can't really clean them up en masse. Even stuff like trunk()..mine() seems to show some upstream remote branches from dependabot and merge queue that I clicked etc because they are also descendants of trunk(), though maybe author("me") & committer("me") would work better.

In my frustration I thought "Instead of having to get a PhD in revsets, I wish jj just let me match all the changes I created on this local repo, because that's what I really care about. How hard can that be?" Some variation of immutable_heads().. local() would be exactly what I want.

@martinvonz

Edit: Isn't it something that would appeal to more users? How do jj users eg. check if there weren't some changes somewhere that they forgot about, while waiting for feedback, etc. Being able to narrow whatever revset to "stuff I was working on here" with & local() seems broadly useful to me.

@yuja
Copy link
Collaborator

yuja commented Apr 21, 2024

Edit: Isn't it something that would appeal to more users? How do jj users eg. check if there weren't some changes somewhere that they forgot about, while waiting for feedback, etc. Being able to narrow whatever revset to "stuff I was working on here" with & local() seems broadly useful to me.

As @scott2000 mentioned in the other thread, something like untracked_remote_branches() might help exclude old uninteresting remote branches.

I assume topic aims to improve handling of multiple ongoing changes, but I don't follow the discussion.

@scott2000
Copy link
Collaborator

scott2000 commented Jul 13, 2024

#4068 added a new revset untracked_remote_branches() which I think could be a good workaround for this issue. Specifically, if you add untracked_remote_branches() to immutable_heads(), then non-local commits would be hidden from the log, and you could use mutable() like local(). I think this would meet most of your requirements:

  • All commits created locally are initially not in a remote branch at all, so they are considered local.
  • Any branches you push to a remote will be tracked by default, so they are still considered local.
  • When a branch gets merged into trunk() (or some other release branch in immutable_heads()), it will no longer be considered local.
  • Branches from the remote are untracked by default, so they are not considered local. This includes old branches created by the current user and any new branches created by coworkers.
  • If you want to mark a branch from a remote as local, you can just run jj branch track <branch@remote>, and all of its commits will be considered local.

@joyously
Copy link

This confuses me:

Any branches you push to a remote will be tracked by default, so they are still considered local.

If the commits are available on a remote, why would they be considered local?
And then tracking a remote branch makes it local???

And the combination of these two is hurting my head. The first seems to say local has to be in trunk, while the second says merging to trunk means it's not local.

All commits created locally are initially not in a branch at all
When a branch gets merged into trunk(), it will no longer be considered local.

@scott2000
Copy link
Collaborator

scott2000 commented Jul 13, 2024

If the commits are available on a remote, why would they be considered local? And then tracking a remote branch makes it local???

I was using "local" to mean "commits that I'm currently working on locally". So that would include any commits which I've created locally that haven't been merged to trunk() yet, or commits from remote branches I'm currently working on. So I'm using tracking a remote branch as a way to indicate that it's a branch I'm currently working on, meaning that the commits should be considered "local". So pushing a commit to a remote doesn't make it stop being "local", because it might still be a commit I'm working on.

Perhaps "local" isn't the best word for this, but I thought the behavior matched the behavior described for local() pretty well, so I'm just using that terminology.

And the combination of these two is hurting my head. The first seems to say local has to be in trunk, while the second says merging to trunk means it's not local.

The first one is saying that commits you haven't pushed yet are initially "local", and the second is saying that commits which get merged into trunk() stop being "local". Since commits that haven't been pushed yet can't be in trunk(), there shouldn't be any contradiction between these.

@joyously
Copy link

I was using "local" to mean "commits that I'm currently working on locally".

Typically in version control, if you are working on it, it's not a commit yet, however with jj it would only be @ that you are working on.

Since commits that haven't been pushed yet can't be in trunk()

This seems to contradict what you said before: "All commits created locally are initially not in a branch at all". Not in a branch means they are in trunk.

@scott2000
Copy link
Collaborator

scott2000 commented Jul 13, 2024

Typically in version control, if you are working on it, it's not a commit yet, however with jj it would only be @ that you are working on.

By "commits I'm working on", I don't mean commits which are currently checked out. I mean it more broadly as to include "commits which I might still want to change in the future" (e.g. due to comments on a PR). So basically including any of my commits which haven't been merged into trunk() yet.

This seems to contradict what you said before: "All commits created locally are initially not in a branch at all". Not in a branch means they are in trunk.

trunk() is generally a remote branch, so I'm a bit confused what you mean here. By "trunk" I mean "the main branch of the remote where commits generally get merged after a PR". And in jj, new commits are very often not in any branch due to the branchless workflow.

@joyously
Copy link

I mean it more broadly as to include "commits which I might still want to change in the future"

This is too vague and can include all that are not immutable.

trunk() is generally a remote branch

In my mind, trunk is whatever has a direct line back to the root commit. And that exists even if there is no other copy of the repo on a remote.

And in jj, new commits are very often not in any branch due to the branchless workflow.

Since all commits are in the graph, it doesn't matter if new ones are on the main line (which leads to root) or on a branch. I was simply trying to understand your comment about "all commits created locally are not in a branch"...

@scott2000
Copy link
Collaborator

scott2000 commented Jul 13, 2024

I was simply trying to understand your comment about "all commits created locally are not in a branch"...

Maybe my wording wasn't clear. I meant that when you create a new commit, it will initially not be in ::remote_branches() since you haven't pushed it yet, so therefore it definitely isn't in ::untracked_remote_branches() either.

With the workflow I use, it also won't be in ::branches() initially, so that's why I originally said "not in a branch" rather than "not in a remote branch", but that's technically not true for everyone so I edited that comment to update the wording.

@Zoybean
Copy link

Zoybean commented Aug 22, 2024

How do jj users eg. check if there weren't some changes somewhere that they forgot about, while waiting for feedback, etc. Being able to narrow whatever revset to "stuff I was working on here" with & local() seems broadly useful to me.

personally, I have my own alias defined as "local":

[revset-aliases]
'local()' = "~::remote_branches()"

This is different to your definition in a few key ways, but generally for me, a lot of my forgotten work-in-progress is not on a remote. Once I've had a look at mine() & local(), there's a broader set of stuff that I remembered to push, but haven't merged yet - that's all the mine() & mutable() commits (now that untracked_remote_branches() is in immutable_heads()). If I want to consider some work finished locally I can just jj branch untrack foo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
polish🪒🐃 Make existing features more convenient and more consistent
Projects
None yet
Development

No branches or pull requests

7 participants