Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: treat all non-symlink reparse points as ModeIrregular on Windows #61893

Closed
qmuntal opened this issue Aug 9, 2023 · 21 comments
Closed

os: treat all non-symlink reparse points as ModeIrregular on Windows #61893

qmuntal opened this issue Aug 9, 2023 · 21 comments

Comments

@qmuntal
Copy link
Contributor

qmuntal commented Aug 9, 2023

fs.ModeIrregular is useful to know whether a file is a reparse point or not. It is a signal that it needs more processing to fully understand how to handle it. Reparse points started to be treated as fs.ModeIrregular in CL 460595 (submitted by @bcmills), landed in go1.21.

Unfortunately, it did not mark all reparse points as irregular, only those that are not identified with ModeSymlink | ModeDir | ModeNamedPipe | ModeDevice | ModeCharDevice. This limits the usefulness of fs.ModeIrregular, as it is not possible to know, for example, if a directory is a reparse point or not. One can currently infer that all fs.ModeSymlink are reparse points, which currently includes IO_REPARSE_TAG_MOUNT_POINT, the most popular directory reparse point, but that's not a good reason to also properly support less common directory reparse ponts, such as IO_REPARSE_TAG_LX_SYMLINK or vendors defined reparse points.

I propose to treat all reparse points as fs.ModeIrregular, and to modify its definition from non-regular file; nothing else is known about this file to non-regular file; may need special handling.

Worth noting that this will help identifying mounted directories (aka IO_REPARSE_TAG_MOUNT_POINT) as reparse points if we manage to not treat them as symbolic links, that is fixing this TODO.

Edit: I'm opening this issue to discuss potential concerns.

@golang/windows

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 9, 2023
@mknyszek mknyszek added this to the Backlog milestone Aug 9, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/517815 mentions this issue: os: treat all non-symlink reparse points as ModeIrregular on Windows

@bcmills
Copy link
Contributor

bcmills commented Aug 9, 2023

Hmm. Lstat never follows reparse points, right? Can that be used to determine whether an arbitrary path is a reparse point?

Or does Lstat report, say, ModeDir for reparse points that are also directory-like?

@qmuntal
Copy link
Contributor Author

qmuntal commented Aug 9, 2023

Hmm. Lstat never follows reparse points, right? Can that be used to determine whether an arbitrary path is a reparse point?

Or does Lstat report, say, ModeDir for reparse points that are also directory-like?

Lstat reports fs.ModeDir for directory-like non-symlink reparse points. Unfortunately it doesn't currently set fs.ModeIrregular on that case. That's why I submitted this proposal.

@bcmills
Copy link
Contributor

bcmills commented Aug 9, 2023

Hmm. I do think it makes sense to mark directories as irregular if they may require additional special handling, although I also think it makes sense to continue to omit the irregular bit for reparse points that are “just” symlinks, since the os package already does the special handling needed for those.

@qmuntal
Copy link
Contributor Author

qmuntal commented Aug 9, 2023

I gave this another thought, and now I think I've given the wrong solution to the issue of consistently identifing directory reparse points.

IMO we shouldn't be treating any reparse points as a directory, trying to follow them while recursively walking directories is a recipe for problems. In fact, we almost have it right now: symlinks and mount points never have the directory bit set thanks to CL 41830. If we do the same for all reparse points, then the current logic will already mark them as irregular files.

require additional special handling, although I also think it makes sense to continue to omit the irregular bit for reparse points that are “just” symlinks, since the os package already does the special handling needed for those.

Good point, agree.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/536655 mentions this issue: os,path/filepath: treat all non-symlink reparse points as irregular files

@jakebailey
Copy link

jakebailey commented Oct 20, 2023

Is it correct to say that the above change would start causing junctions to stop showing up as symlink'd directories, and start showing up as irregular files?

Given real symlinks are locked behind Developer Mode without elevation, loads of tools use junctions to emulate symlinks (e.g. yarn, pnpm in the JS ecosystem). I'm a little worried that some tooling out there will stop working properly.

Right now, WSL sees junctions as symlinks (when observed via /mnt/c or something). I know the WSL FS works via 9p; given the name/protocol I had kinda assumed it was in Go but maybe not. git-bash also treats junctions as symlinked dirs on Windows (though curiously git itself thinks they are regular directories, something I had tried to change in git-for-windows/git#4383 as it totally breaks recursive symlinks, unfortunately with the justification that Go treated them like symlinks.... 🙃 ).

@qmuntal
Copy link
Contributor Author

qmuntal commented Oct 21, 2023

Is it correct to say that the above change would start causing junctions to stop showing up as symlink'd directories, and start showing up as irregular files?

Junctions are currently only identified as symlinks, not directories. This prevents the infinite recursion trap for users blindly walking directories. If CL 536655 is approved, then junctions won't be identified as symlinks, but as irregular files.

I'm planning to submit a companion proposal to make query the file reparse tag easier , so users can take informed decisions when implementing different FS operations. Something like this: #63429 (comment).

@bcmills bcmills changed the title os: treat all non-symlink reparse points as ModeIrregular on Windows proposal: os: treat all non-symlink reparse points as ModeIrregular on Windows Oct 24, 2023
@bcmills bcmills modified the milestones: Backlog, Proposal Oct 24, 2023
@bcmills bcmills moved this to Incoming in Proposals Oct 24, 2023
@rsc
Copy link
Contributor

rsc commented Oct 27, 2023

Per discussion on #63429, even if we adopt a general rule that all reparse points are Irregular, we should also add an exception for dedup points, which are (1) regular enough, and (2) common enough to be worth special casing.

@rsc
Copy link
Contributor

rsc commented Oct 27, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Oct 27, 2023
@bcmills
Copy link
Contributor

bcmills commented Oct 31, 2023

Just to clarify, for a MOUNT_POINT reparse point:

  • os.Lstat would return a FileInfo for the reparse point itself, with the ModeIrregular bit set and the ModeDir bit unset. (Before this proposal, it would instead be reported as ModeSymlink and not ModeDir.)

  • Because IO_REPARSE_TAG_MOUNT_POINT has the surrogate bit set, os.Stat would return a FileInfo for the underlying directory joined at the mount point, with the ModeDir bit set and the ModeIrregular bit unset. (No change in behavior from today.)

@rsc
Copy link
Contributor

rsc commented Dec 6, 2023

What is the current proposal and is there general agreement for it yet?

@rsc
Copy link
Contributor

rsc commented Jan 10, 2024

Waiting here for the specific proposal. I think we agreed that dedup reparse points will not count as irregular too.

@qmuntal
Copy link
Contributor Author

qmuntal commented Jan 11, 2024

Here comes the specific proposal, affecting os.Lstat and os.Stat:

  • Don't set the fs.ModeSymlink type bit for IO_REPARSE_TAG_MOUNT_POINT reparse points.
  • Set the fs.ModeIrregular bit for all reparse points, except for IO_REPARSE_TAG_SYMLINK and IO_REPARSE_TAG_DEDUP.
  • For files with the fs.ModeIrregular bit set, don't set any other type bit (e.g. fs.ModeDir).
  • Add a GODEBUG setting (winreparsepoint?) that changes the behavior of os.Lstat and os.Stat to restore the old behavior.

This proposal makes two breaking changes:

  • Mount points will no longer be identified as symbolic links, which seems like the right thing to do in the general case.
  • Reparse points pointing to directories won't have the fs.ModeDir set anymore (this was already the case for mount points). This is a safety measure to avoid infinite recursion in directory traversal, as discussed before.

@rsc
Copy link
Contributor

rsc commented Jan 18, 2024

Probably the GODEBUG should seem like something that can be set to true or false. "winreparsepoint" doesn't say what happens to the reparse points. What about winreparseirregular=1 for the new behavior?

@rsc
Copy link
Contributor

rsc commented Jan 31, 2024

Assuming we use winreparseirregular as the GODEBUG, have all remaining concerns been addressed?

@qmuntal
Copy link
Contributor Author

qmuntal commented Feb 1, 2024

@rsc I was rereading #63703 and realized that @bcmills proposed to use the same debug flag, winsymlink, both for #63703 and for this proposal:

At new Go releases (winsymlink=1):

os.Lstat would only report ModeSymlink for IO_REPARSE_TAG_SYMLINK. All other reparse tags — including mount points — would be reported as either ModeIrregular or regular files (for DEDUP reparse points in particular) per #61893 and #63429.

Using a single GODEBUG is fine with me, as these two proposals are very related.

@rsc
Copy link
Contributor

rsc commented Feb 8, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The proposal is detailed in #61893 (comment).

@rsc rsc moved this from Active to Likely Accept in Proposals Feb 8, 2024
@rsc
Copy link
Contributor

rsc commented Feb 14, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

The proposal is detailed in #61893 (comment).

@rsc rsc moved this from Likely Accept to Accepted in Proposals Feb 14, 2024
@rsc rsc changed the title proposal: os: treat all non-symlink reparse points as ModeIrregular on Windows os: treat all non-symlink reparse points as ModeIrregular on Windows Feb 14, 2024
@rsc rsc modified the milestones: Proposal, Backlog Feb 14, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/565136 mentions this issue: os: don't treat mount points as symbolic links

@qmuntal
Copy link
Contributor Author

qmuntal commented Feb 23, 2024

I want to clarify this point of the proposal, which might lead to confusion:

  • Reparse points pointing to directories won't have the fs.ModeDir set anymore (this was already the case for mount points).

By Reparse points pointing to directories I meant surrogate reparse points. If a reparse point is not a surrogate, then it doesn't reference a directory, it is a directory, and therefore should be reported as such by os.Lstat. Symlinks and mount points are surrogate reparse points, so this confusion doesn't affect them.

@dmitshur dmitshur modified the milestones: Backlog, Go1.23 Mar 5, 2024
@dmitshur dmitshur removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

7 participants