Replies: 17 comments
-
There's definitely perceived overlap, which may be confusing to new users accustomed to [...] I played with both commands and came up with this comparative summary:
So clearly their motivations and purposes are different right now. Still, the two are somewhat confusing given the possible overlap and similar output. My suggestion would be to keep both but consider redesigning [...]
Conclusion
Perhaps reordering the output of [...]
I think the real solution will come when/if #770 is addressed (some sort of semantic data format plugins).
-
Agree.
💯 Both will be needed with content diff.
-
Related: #1406
-
Another distinction between status and diff is that the former checks untracked workspace changes (same as the Git working tree), while the latter checks tracked but uncommitted changes in the workspace (including a Git-staged DVC-file and .gitignore, for example). So that's another reason to close this one, I think? We're working on improving the docs to clarify all this. See iterative/dvc.org#953 (review).
-
I don't quite understand this, but if my intuition is correct, this hasn't been decided yet. My 2c regarding status vs diff: they are different from a different perspective (as just committed/uncommitted stuff).
-
Yeah, it's confusing. I think your approach of looking at it from the PoV of DVC outputs + dependencies (#3385) is easier to understand. But from an implementation standpoint, it's also worth mentioning the difference between Git-unstaged changes (not yet tracked by DVC) vs. staged changes (uncommitted, but already tracked by DVC): which should be shown by status, and which by diff (#3385, #3386).
OK, so due to ^ and also per #3386 (comment), this issue will be mostly to decide what the behavior of [...]
-
I don't understand the meaning of this, sorry :( At least, I'm not 100% sure that we are on the same page here.
Same here. Maybe we can start w/o using some specific examples?
-
OK,
$ mkdir test-dvc-diff
$ cd test-dvc-diff
$ git init
$ dvc init
$ git commit -m 'dvc init'
# 1.0
$ echo foo > data # create data file (not yet tracked)
$ git status # -> sees new data as *unstaged*
$ git diff # doesn't see anything yet
$ dvc status # doesn't see anything yet
$ dvc diff # doesn't see anything yet
# 1.1
$ dvc add data # track data file with DVC
$ git status # -> sees new data.dvc as *unstaged*
# git diff and dvc status don't see anything yet
$ dvc diff # -> sees new data (tracked by DVC but uncommitted)
Added:
data
files summary: 1 added, 0 deleted, 0 modified
# 1.2
$ git add --all # track new data.dvc with Git
$ git status # -> sees data.dvc as *staged*
# git diff and dvc status don't see anything yet
$ dvc diff # -> sees data (tracked by DVC but uncommitted)
# 1.3
$ git commit -m 'echo foo > data' # commit data version to Git
# blank slate: all status and diff commands see nothing
Summary: only [...]
# 2.0
$ echo bar > data # change data
# git status and git diff see nothing
$ dvc status # -> sees change in tracked data
data.dvc:
changed outs:
modified: data
$ dvc diff # doesn't see the change
# 2.1
$ dvc add data # update the tracked data with DVC
$ git status # -> sees data.dvc as *unstaged*
$ git diff # -> displays diff in data.dvc from HEAD to working tree
$ dvc status # Up to date again.
$ dvc diff # -> sees data change (tracked by DVC but uncommitted)
Modified:
data
files summary: 0 added, 0 deleted, 1 modified
# 2.2
$ git add --all # track change in data.dvc with Git
$ git status # -> sees data.dvc as a staged file
# git diff and dvc status don't care
$ dvc diff # -> still sees data change (tracked by DVC but uncommitted)
# 2.3
$ git commit -m 'echo bar > data' # blank slate again
(Repeat.)
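The Git half of steps 1.0 to 1.2 above can be reproduced without DVC installed. This is only a sketch under assumptions: data.dvc here is a hand-written placeholder file, not the real output of dvc add, and the DVC commands are left out because their output depends on the DVC version in use. The script captures what git status and git diff report for a file that is first unknown to Git and then staged.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so nothing existing is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# ASSUMPTION: placeholder standing in for the file that `dvc add data` would write.
echo "placeholder" > data.dvc

# Before `git add`: status lists the new file, plain diff does not.
untracked_status=$(git status --porcelain)
untracked_diff=$(git diff --name-only)

# After `git add`: status reports the file as staged; plain diff is still
# empty because the index now matches the working tree.
git add data.dvc
staged_status=$(git status --porcelain)
staged_diff=$(git diff --name-only)
staged_cached=$(git diff --cached --name-only)   # index vs. last commit

echo "before add: status='$untracked_status' diff='$untracked_diff'"
echo "after add:  status='$staged_status' diff='$staged_diff' cached='$staged_cached'"
```

Only git diff --cached picks up the staged file, which is the Git-side analogue of the "tracked but uncommitted" state discussed in this thread.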
-
it's not exactly right I think, right?
That's what I would expect from [...]
-
🤦‍♂️ Sorry, I meant "DVC-tracked changes (whether Git staged or not)"! (Original comment updated.)
Depends what you mean by "tracked":
-
This is confusing or not true either. To my mind, if I add a file into a dir and this dir is DVC-tracked, I'm supposed to see the difference, right? But this does not happen. (like [...]
I mean that it just reads DVC-files from the working tree; it probably does not analyze anything related to Git about them.
-
I think I agree with you about [...] So is the main takeaway so far that [...]
@shcheklein Not if you haven't run [...]
-
@shcheklein To me it feels like you could ask the same question about [...]
-
what question?
@efiop yep, I think I was not arguing about this - #3255 (comment) ... at least, maybe I was not clear enough? Or missed something?
-
For the record: we had a discussion about it during the call. It looks like I mistook it for "throw out diff because it is the same as status", which is now resolved.
-
git status and git diff have different syntax (diff is based on checksums, status is not) and semantics (diff is based on history, status on workspace and index). dvc diff and dvc status have different syntax with similar semantics because dvc diff is not doing an actual content diff. So these commands could potentially be "merged" into a single one.
Even if we decide to keep both, I'd still prefer to see a status-like output for dvc diff when a file was not added to the repo by dvc add.
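To make the "diff is based on history, status on workspace and index" point concrete on the Git side, here is a small sketch. The file name data.txt, the commit messages, and the throwaway identity passed via -c are all assumptions made for illustration. After two commits the working tree is clean, so git status has nothing to report, yet git diff can still compare the two commits.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q

# ASSUMPTION: a demo identity so commits work without any global Git config.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

# Two committed versions of a hypothetical file.
echo foo > data.txt
git add data.txt
commit "first version"

echo bar > data.txt
git add data.txt
commit "second version"

# status looks at workspace and index only; both match HEAD here.
clean_status=$(git status --porcelain)

# diff can be pointed at history: which files differ between the two commits?
history_diff=$(git diff --name-only HEAD~1 HEAD)

echo "status: '$clean_status'"
echo "diff HEAD~1..HEAD: '$history_diff'"
```

Whether dvc status and dvc diff should mirror this split is exactly what the thread is debating; the sketch only shows the Git behavior being used as the reference point.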