Discuss returning 404 for privacy reasons #14

RubenVerborgh · 2019-07-18T22:23:11Z

No description provided.

csarven · 2019-07-19T07:52:23Z

Some thoughts:

I like the 404 here.

Would it be useful to require that the 404 response MUST be accompanied by an expiration time for caches?

If so, a related concern: if implementations only include freshness information when they are hiding a resource, then the evilApp could infer that a resource is being hidden even though it can't access it. So, MUST all 404s then always be accompanied by an expiration time, so as not to leak that information?

Note that if the expiration time on a 404 is only a SHOULD or MAY, it increases the evilApp's uncertainty about whether a resource exists.
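
One way to picture the "all 404s look the same" option is to build every 404 through a single helper, so that a hidden resource and a missing one carry identical freshness information. This is only a minimal sketch: the `Response` type, the `respond` function and the `max-age=60` value are hypothetical and not taken from any Solid server.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical freshness lifetime; the thread does not propose a concrete value.
NOT_FOUND_CACHE_CONTROL = "max-age=60"


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)


def not_found() -> Response:
    """Build every 404 the same way, whether the resource is hidden or missing."""
    return Response(404, {"Cache-Control": NOT_FOUND_CACHE_CONTROL})


def respond(exists: bool, can_read: bool) -> Response:
    """Return 200 only when the resource exists and the agent may read it."""
    if not exists or not can_read:
        return not_found()
    return Response(200)
```

With this shape, `respond(True, False)` and `respond(False, False)` compare equal, so the headers give the evilApp nothing to distinguish the two cases.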

humont · 2019-07-29T14:48:14Z

Hi, new to this community but have some thoughts here:

My impression is that the active use case for 404 in Solid is to check whether a username is taken - which can be done at root level (and return a 404).

Any requests in deeper directories could return a 403 Forbidden if the ACLs don't allow public read. The 403 can also be made the default regardless of whether or not the resource/path is valid.

This would prevent inference of resource existence, but avoids using 404 when a resource may in actual fact exist (obfuscation).

humont · 2019-07-30T09:07:01Z

But such obfuscation is explicitly allowed by the spec (see above); and a constant 403 is no less an obfuscation. So at the moment, I am still leaning toward 404 in case of privacy reasons, if the pod user so desires. On the other hand, a 403 makes it easier, since no such preference should be stated then, and it hints at authenticating (whereas a 404 does not).

There is ambiguity as to which should be used, that's for sure. After browsing around on Stack Overflow and various blog posts, the common wisdom seems to be that people assume a 403 implies existence but no permission (despite this not being in line with the actual definition of 403). So there's that...

My preference (for what it's worth) is a semantic one:
403 = "you're not allowed to ask this question"
whereas:
404 = "your question is allowed, but I won't tell you the answer".

As far as obfuscation goes, 404 seems more fit for purpose, since it leaves open the question: "did I not find the resource in general, or did I not find it just for you?"

dmitrizagidulin · 2019-07-30T17:03:41Z

I would also vote for the '404 by default, if unauthorized or not found' option. (All social media platforms, from LiveJournal to GitHub (private repos) to Facebook, take this approach.)

That said, I do think that it would be useful to add a Web Access Control term (something like wac:allowRequestPermission) that signals a 403 instead of a 404 and allows users to ask for permission.

csarven · 2019-07-31T11:17:53Z

@dmitrizagidulin Isn't it a given that the requesting agent can come forward with credentials on any resource? If I'm interpreting correctly, I think signalling allowRequestPermission along with a 404 works contrary to obfuscating whether a resource actually exists or is reachable.

csarven · 2019-07-31T11:23:08Z

I would only ask how much the 404 approach will cause applications to lose out on considering the option to authenticate (if not already authenticated) and re-request.

So, re 404, I think it'd be useful to cover both ends in the Solid spec. The 404 in the RFC:

"or is not willing to disclose that one exists."

can be supported/clarified by adding the following:

"The client MAY repeat the request with new or different credentials." -- repurposing the text from 403.

Aside: I wonder if this was already considered in RFC7231 and what the rationale was for omitting it.

TallTed · 2019-07-31T21:54:12Z

Speaking as a user, I hate 404 when I'm unauthorized because my login has expired or the like, because it often enough means I go in loops of "where'd that thing I know was here go?" Yes, the RFC says it's OK to do this, but the 403 response seems better to me -- because whether or not the thing does exist, it shows that there might be a change with different authentication, while 404 suggests that authentication doesn't matter.

pmcb55 · 2019-08-12T22:07:42Z

I agree with @ajs6f in the Trellis issue referenced by @acoburn above (trellis-ldp/trellis#454). Basically any conversion of 403 to 404 should be done at the outermost layer of the architecture (and not in the internals, to @ajs6f's point). But I'd like to see that conversion be controllable/overridable on a per-resource basis (even though an administrator may also provide a default 'conversion setting' for all resources served by an entire Solid server), and controllable by the user themselves (i.e. it's just another piece of resource meta-data that users can set explicitly).

csarven · 2019-08-16T11:55:17Z

Punting the responsibility/decision making to the "outermost layer of the architecture" certainly sounds reasonable, if and only if there is an "outermost layer" to speak of. Certainly we expect a Solid server to be self-contained (i.e., working without any dependency on or knowledge of the outer layer) to the point that it has some opinion on the primary UC, whether that's realised via 403 or 404 out of the box, or even configurable. Put differently, if we do acknowledge the UC, the Solid spec(s) should probably say something about it, at the very least for the "internal architecture", so that it is prescribed and has tests. Anything pertaining to the "outermost layer" may be out of the Solid spec's scope or at most be only descriptive (as opposed to prescriptive) in the end.

pmcb55 · 2019-08-17T17:01:36Z

@csarven I'm not sure I follow really. For me the 'outermost' layer of a Web server is pretty easy to define and configure - it's just a JAX-RS filter, and/or the last (or first) processor in a Camel route. In fact I'd see it as a classic example of the filter pattern, i.e. filtering a response to set its status code to either 403 or 404 based on the context of the request itself and the configuration of the server.
If by UC you mean Use-Case, I certainly wouldn't see that being defined by the Solid server or spec, instead it's defined by the context of the request (i.e. the preferences of the particular user) and the configuration of the server (i.e. the preferences of the Pod provider). So from a spec perspective I'd say the server CAN set a 403 or a 404, but that that SHOULD be override-able by user preferences set per resource.
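
As a rough illustration of the filter idea (not JAX-RS or Camel code), the conversion can be written as one small function applied to the status the internals produced. The function name, the server-wide default and the per-resource override parameter are all hypothetical; the thread does not define how such preferences would be stored.

```python
from typing import Optional


def outward_status(
    internal_status: int,
    server_hides_by_default: bool,
    resource_hides: Optional[bool] = None,
) -> int:
    """Map the internal status to the one sent to the client.

    Only 403 is ever rewritten. A per-resource preference (resource_hides),
    when set, overrides the Pod provider's default (server_hides_by_default).
    """
    if internal_status != 403:
        return internal_status
    hide = server_hides_by_default if resource_hides is None else resource_hides
    return 404 if hide else 403
```

For example, `outward_status(403, True)` yields 404, while `outward_status(403, True, resource_hides=False)` keeps the 403 because the user has overridden the server default for that resource.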

TallTed · 2019-11-10T03:47:24Z

Having let this gel for a while, it occurs to me that (depending on the usage scenario) different responses may be appropriate for the same resource depending on whether the user is known or unknown, and on specifics of a known user. That is, it may be appropriate to return 404 to unauthenticated users, and 403 to (some or all) authenticated users.
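
A minimal sketch of that idea, assuming the server keeps some set of agents who may be told "forbidden" (the `told_forbidden` parameter and the function name are hypothetical):

```python
from typing import Optional, Set


def denial_status(agent: Optional[str], told_forbidden: Set[str]) -> int:
    """Pick the status for a denied request.

    agent is None for an unauthenticated request; told_forbidden is the
    (hypothetical) set of authenticated agents who may learn about the denial.
    """
    if agent is None:
        return 404  # unknown users learn nothing
    return 403 if agent in told_forbidden else 404
```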

kjetilk · 2019-11-11T11:24:04Z

To me, this sounds like an implementation issue that the Solid spec does not need to address, or might address in a best practices documentation. The "404 for privacy" is already there in RFC7231, and implementors may want to heed that advice if they are concerned with privacy, which most will want to be, and if so, they have many ways to achieve it, as @pmcb55 says, it is a filter pattern thing.

Even if many will want it so, I fail to see the value of turning it into a stronger normative feature of Solid.

dmitrizagidulin · 2019-11-11T16:04:07Z

Not sure I agree. Privacy is a core issue to Solid. And different servers handling this behavior differently might become a source of confusion.

acoburn · 2022-04-04T12:36:57Z

CORS preflight requests are a vitally important consideration.

What I might suggest is this:

- servers need to be able to distinguish between CORS preflight OPTIONS requests and all other OPTIONS requests. A server can do this by looking for the presence of three headers, which constitute CORS preflight requests:
- all such CORS preflight requests always return a 2xx response (e.g., 200 or 204)
- the response of a CORS preflight request gives away no information about the presence, absence or any distinguishing type of the target resource. In a word: all responses to CORS preflight requests are the same
- all other OPTIONS requests consider authorization, will include resource-specific status codes (e.g., 200, 404, 403), and will include resource-specific headers (e.g., Link)
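
The distinction can be sketched as follows. The comment does not list the headers it has in mind, so this sketch assumes the check described by the Fetch standard: an OPTIONS request that carries `Origin` and `Access-Control-Request-Method` (`Access-Control-Request-Headers` is optional there and is not required here). The function names and the choice of 204 are illustrative only, and the CORS response headers are left out.

```python
from typing import Mapping

PREFLIGHT_STATUS = 204  # the same status for every preflight, whatever the target


def is_cors_preflight(method: str, headers: Mapping[str, str]) -> bool:
    """Return True for an OPTIONS request carrying the assumed preflight headers."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    return (
        method == "OPTIONS"
        and "origin" in names
        and "access-control-request-method" in names
    )


def options_status(method: str, headers: Mapping[str, str], resource_status: int) -> int:
    """Pick the status for a request.

    resource_status is whatever authorization and lookup would yield
    (e.g. 200, 403, 404); it is ignored for preflight requests.
    """
    if is_cors_preflight(method, headers):
        return PREFLIGHT_STATUS
    return resource_status
```

A preflight aimed at a missing resource and one aimed at an existing resource both get 204 here, while a plain OPTIONS request still receives the resource-specific status.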

edwardsph · 2022-09-01T15:52:13Z

This issue has been quiet for a while but I have a question as a result of working on the tests for read access controls. In the table for POST C/ Slug: R there are 2 cases that don't make sense to me. You have read access to the container and that may be inherited. You attempt to POST a new resource to a target child container. The table suggests that if the target exists you would get a 403 as you are not permitted to write to the target. However, it suggests that you would get a 404 if the target does not exist, since you have read access to the parent container.

# edited to clarify
Read, -, 403, 404
-, Read, 403, 404

I think this is a problem for a few reasons:

- You attempted to write and that is forbidden, whether or not the target exists. The response should be a 403. If you then read the parent container you would indeed have permission to see that the target didn't exist, but why expose that information when it was not asked for? The agent is authorized to know about its existence but it isn't asking that.
- Authorization should take precedence over the 404. That would align with the HTTP decision trees referenced in Precedence of response codes #146
- It appears to conflict with the earlier statement:

"When an agent is forbidden to allocate a URI to a resource, 403 is used."

csarven · 2022-09-01T16:30:53Z

The request semantics of POST (including Slug: R in this case, but not particularly important here) is to "perform resource-specific processing on the request payload" targeting a resource (i.e., a container in this case). The server does not have a current representation for the target resource, which is what the 404 indicates so that the client can try again by changing the request (if it wants to).

The content in issue 146 is not fully worked out and overlaps with the work in this issue, specifically the tables.

As mentioned elsewhere, 403 would be a valid (acceptable) response, but 404 is both accurate as per request semantics and more useful for the client.

edwardsph · 2022-09-01T17:08:57Z

Ok, whilst the discussion is ongoing, I will at least allow 403,404 in the tests.

ericprud · 2022-09-21T09:57:13Z

I feel like we arrived at the conclusion that sometimes a user wants to hide the existence of stuff (404) and sometimes they don't (403, or even 401 if you shortcut someone's misimpression that they have a prayer of re-authing to get access). Earlier, I mentioned NetWare's control on the directory controlling who gets to ls it. A more fine-grained approach would be to stick the you-cant-see-me control on the resource itself. This seems consistent with the notion that in App-Interop, there are meta controls to save you some of the grief of fiddling with detailed, resource-level ACLs.

edwardsph · 2022-10-07T10:24:39Z

Can I query a row for DELETE?

C/     C/R  C/R exists	C/R doesn't exist
Write  -    204         403

When the resource doesn't exist, why couldn't the response also be 204? If the resource existed, the user would have been able to delete it. Although it didn't exist, the delete operation can be deemed to have succeeded as the resource doesn't exist.

csarven · 2022-10-07T23:52:15Z

No access is granted to know about the existence of the target resource. Access is granted to remove the target resource. The request semantics can be successfully applied to the target resource when the resource exists and access is granted. Request to remove a non-existing resource is forbidden.

The difference in 204 and 403 is useful to the client in that the user can be informed about whether the target resource is removed or cannot be removed (for any reason, including non-existence, no access, non-owner, permanence, or something else).

As for whether read permission on the target resource is required to reveal its past state, you may find the following to be a good formulation of the problem or an invitation to philosophical ramblings: #311 .

woutermont · 2023-09-10T10:12:41Z

I noticed @csarven included this in the proposed milestone of v0.11.0, but I never had the feeling this has been properly resolved, and issues with regard to the status quo 'solution' keep popping up.

A very good (i.m.o. better) alternative to the status quo has already been proposed by @humont, yet has i.m.o. not been given enough attention; at least, no concrete arguments against it have been made, except perhaps @RubenVerborgh's question "Why we would prefer a 403 over a 404?" Let's revisit that proposal and answer that question.

[@humont:] The 403 can also be made default regardless of whether or not the resource/path is valid.

[@RubenVerborgh:] Good point, that does not seem to be disallowed by RFC7231 ... However, the question is why we would prefer a 403 over a 404. You write:

[@humont:] avoids using 404 when a resource may in actual fact exist (obfuscation).

[@RubenVerborgh:] But ... a constant 403 is no less an obfuscation.

The idea is thus to use 403s instead of 404s when permission is lacking, regardless of resource existence.

This approach has a number of advantages.

- Clarity: Using 403 instead of 404 directs the obscurity entirely towards the agent lacking permission, rather than sharing the burden between that agent and the permissioned agent that happens to access a nonexistent resource (cf. the concern voiced by @TallTed). Moreover, as @RubenVerborgh himself already acknowledged, towards the unpermissioned, "[403] hints at authenticating (whereas a 404 does not)": it clearly communicates the cause of the problem, and specifically allows the server to include more detailed information.
- Simplicity: Using 403 instead of 404 is less complex for implementers. Servers do not need to provide possibly confusing configuration options for hiding resource existence (which was also acknowledged by @RubenVerborgh). Moreover, since 403 is by default not cacheable (while 404 is), no additional cache controls need to be implemented (which leads to additional questions).
- Separation of concerns: Using 403 instead of 404 adheres to a more typical processing order that allows for a cleaner layering of the authorization mechanism (cf. @edwardsph's comment and the HTTP decision trees provided by @csarven in Precedence of response codes #146). This is in line with HTTP Semantics, which specifies that "normal" request checks (including authorization) take place before precondition checks, which take place before any content processing or action. The issue "Any resource server implementation is forced to couple authorization and storage" (#379) is the perfect example of the kind of implementation impact that not adhering to this order has: we can no longer adequately separate the mechanisms of authorization and storage.

Note that a 403 response reveals nothing more about the state of the resource than a 404 response. That seems to be an implicit assumption in some comments throughout this discussion. Just like 404, its meaning relates to a target resource, which does not necessarily exist.
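
The processing order argued for here can be sketched in a few lines: the authorization check runs first and never consults storage, and existence is only looked at for agents who hold the required access mode. This is a simplified sketch for operations that need an existing target; the function name, the mode strings and the `success_status` parameter are hypothetical.

```python
from typing import Set


def status_for(
    required_mode: str,
    granted_modes: Set[str],
    exists: bool,
    success_status: int = 200,
) -> int:
    """Decide the status with authorization checked before existence."""
    if required_mode not in granted_modes:
        return 403  # same answer whether or not the resource exists
    if not exists:
        return 404  # only permissioned agents learn about non-existence
    return success_status
```

Note that an agent holding only Read who attempts a Write gets 403 under this ordering even when the target does not exist, because the request was not a read.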

csarven · 2023-09-10T12:43:49Z

To be clear, the comment in #14 (comment) is the running rough consensus. It is taken to be consistent with HTTP and the constraints that the Solid Protocol sets. The information prior to the tables sets the axioms that are intended to hold the whole thing together. Deviations from that have shown inconsistencies in implementations and, if anything, misunderstanding of the specs, and possibly security concerns. There may be incorrect/invalid data in the tables given the understanding, and as mentioned, that's something we can correct easily. The table is not intended to be complete or show all combinations, e.g., some of it can be derived from HTTP specs. PATCH may need an update since the Solid Protocol currently uses N3.

The milestone is set so that we go through the Solid Protocol (and possibly other specs) and update where necessary based on the understanding that we seem to have arrived at. (Originally I tried to make the case for when both 403 and 404 would be meaningful... and there is a lot of mention of this everywhere/in other issues...)

woutermont · 2023-09-10T13:26:58Z

If you mean to say my comment is already in line with the general consensus, I believe the following rows of the tables would need to be corrected.

GET/HEAD/OPTIONS C/R

C/       C/R     C/R exists   C/R doesn't exist
Read     Write   403          404 => 403

POST C/ with Slug: R

/        C/      C/ exists    C/ doesn't exist
-        Read    403          404 => 403
Read     -       403          404 => 403

PATCH C/R

C/       C/R     Payload   Match   C/R exists   C/R doesn't exist
-        Read                      403          404 => 403

DELETE C/R

C/       C/R     C/R exists   C/R doesn't exist
-        Read    403          404 => 403
Read     -       403          404 => 403
Append   Read    403          404 => 403
Write    Read    403          404 => 403

DELETE C/

C/       C empty   C/ exists    C/ doesn't exist
Read               403          404 => 403

These are all cases where an unauthorized request is NOT a Read operation. I argue above that in those cases, a 403 response should always be returned, EVEN IF the requesting agent has Read permission on the target resource, since the request was not to Read the resource, but to do something else, which was unauthorized. This improves implementation simplicity and separation of concerns.

EDIT: Deleted subsequent message that maybe too eagerly added some other changes.

RubenVerborgh mentioned this issue Jul 18, 2019

Add CORS section #13

Merged

3 tasks

RubenVerborgh added the discussion label Jul 18, 2019

acoburn mentioned this issue Aug 1, 2019

Response types for forbidden resources trellis-ldp/trellis#454

Closed

csarven mentioned this issue Aug 9, 2019

Add HTTP section #26

Merged

Mitzi-Laszlo added the topic: security privacy cryptography label Oct 1, 2019

csarven added this to the December 19th milestone Oct 2, 2019

csarven mentioned this issue Oct 6, 2019

Add the semantics of slashes, which is shared by client and server #35

Closed

Mitzi-Laszlo assigned justinwb Oct 11, 2019

csarven mentioned this issue Oct 19, 2019

Resource creation behaviour when using the Slug header #96

Open

RubenVerborgh removed the discussion label Oct 29, 2019

csarven mentioned this issue Oct 29, 2019

Use 410 Gone for deleted documents #103

Closed

csarven unassigned justinwb Nov 10, 2019

csarven closed this as completed Nov 12, 2019

csarven reopened this Nov 12, 2019

joachimvh mentioned this issue Mar 21, 2022

Implicit assumptions about status codes #384

Open

csarven added this to <https://csarven.ca/#i> foaf:interest Sep 21, 2022

csarven moved this to Todo in <https://csarven.ca/#i> foaf:interest Sep 21, 2022

edwardsph mentioned this issue Oct 7, 2022

Test for non-existing resources solid-contrib/specification-tests#95

Merged

csarven moved this from Todo to In Progress in <https://csarven.ca/#i> foaf:interest Oct 18, 2022

This was referenced Feb 14, 2023

Discoverability of supported update methods on resources without read permission #497

Open

Community Solid Server Access Modes differ from WAC spec CommunitySolidServer/CommunitySolidServer#1576

Closed

csarven modified the milestones: ~Proposed Recommendation, Release 0.11.0 May 3, 2023

csarven mentioned this issue Jun 6, 2023

PATCH do not allow to create new file with APPEND nodeSolidServer/node-solid-server#1731

Closed

RubenVerborgh closed this as completed Nov 10, 2023

github-project-automation bot moved this from In Progress to Done in <https://csarven.ca/#i> foaf:interest Nov 10, 2023

csarven reopened this Nov 10, 2023

RubenVerborgh changed the title ~~Discuss returning 404 for privacy reasons~~ [Obsolete] Discuss returning 404 for privacy reasons Nov 10, 2023

csarven changed the title ~~[Obsolete] Discuss returning 404 for privacy reasons~~ Discuss returning 404 for privacy reasons Nov 10, 2023

edwardsph mentioned this issue Jan 11, 2024

404 on DELETE fictive Resource solid-contrib/specification-tests#112

Closed

bourgeoa mentioned this issue Feb 11, 2024

Try to run the latest versions of webid, crud and wac tests nodeSolidServer/node-solid-server#1756

Open

3 tasks

bourgeoa mentioned this issue Jun 3, 2024

fix: Return 201 when PATCH creates a new resource nodeSolidServer/node-solid-server#1786

Merged

csarven modified the milestones: Release 0.11.0, Release 0.12.0 Jun 7, 2024
