Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

$id updates #1537

Merged
merged 19 commits into from
Nov 2, 2024
Merged

$id updates #1537

merged 19 commits into from
Nov 2, 2024

Conversation

gregsdennis
Copy link
Member

@gregsdennis gregsdennis commented Oct 1, 2024

What kind of change does this PR introduce?

Cleans up some language around $id.

Issue & Discussion References

Summary

Just some clean-up and clarification that we've previously identified.

Schema document roots SHOULD have an $id (maybe relax this? ...) - from #1444 (comment)

I decided to leave this in for now. We can discuss and change separately if needed.

Does this PR introduce a breaking change?

No

@gregsdennis gregsdennis requested a review from a team October 1, 2024 00:33
Copy link
Member

@jviotti jviotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Most of the changes are simple one-character edits, and the rest read very well to me

Due to the potential break in functionality described above, the behavior for
using JSON Pointer fragments that point to or cross a resource boundary is
undefined. Schema authors SHOULD NOT rely on such IRIs, as using them may
reduce interoperability.
Copy link
Contributor

@notEthan notEthan Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stating here that this behavior is undefined seems to me to contradict the above "an embedded schema resource and its subschemas can be identified by JSON Pointer fragments relative to either its own canonical IRI, or relative to any containing resource's IRI".

This also leaves [^8] with no reference to it.

And, minor, but two spaces after a period isn't consistent with the rest of the document.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also leaves [^8] with no reference to it.

I'll add that back in.

Copy link
Member Author

@gregsdennis gregsdennis Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was discussed that we need to make pointers that cross boundaries an undefined behavior rather than explicitly allowing implementations to support it. Of course, implementations can still support it if they want, but the spec should push schema authors a bit more forcefully toward using the proper $id instead.

I think the critical point here is "an embedded schema resource and its subschemas". It's arguable that an embedded schema resource isn't strictly a subschema since it can be refactored out without any change in behavior. If we need to clarify that, I'm happy to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I get making this undefined is the main point of this PR (I edited my comment a bit to clarify that, had it slightly wrong initially ... I'm not sure this should be undefined but if that is the way it is to be, okay).

To me "an embedded schema resource ... can be identified by JSON Pointer fragments ... relative to any containing resource's IRI" reads as defined behavior and not consistent with making it undefined some paragraphs later, and needs some change. I see your point on ambiguity whether "subschema" includes embedded schemas but I don't see language present that actually calls embedded schemas subschemas.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read that paragraph as informative, not prescriptive. It's setting up the reasoning behind the requirement.

Embedded schemas can be referenced this way, but because reasons it's a bad idea to actually do it, so we're gonna say probably don't. We're also letting implementations decide whether they want to support it, no pressure either way.

jsonschema-core.md Outdated Show resolved Hide resolved
@gregsdennis gregsdennis added this to the stable-release milestone Oct 2, 2024
Copy link
Member

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling these issues!

jsonschema-core.md Outdated Show resolved Hide resolved
jsonschema-validation.md Outdated Show resolved Hide resolved
jsonschema-validation.md Outdated Show resolved Hide resolved
jsonschema-core.md Show resolved Hide resolved
jsonschema-core.md Outdated Show resolved Hide resolved
jsonschema-core.md Outdated Show resolved Hide resolved
@gregsdennis
Copy link
Member Author

@notEthan @jdesrosiers I've almost completely reworked the $id section. There were some redundancies in it that didn't sit right with me.

I moved the syntactic definition of the keyword up to the top sentence next to its purpose, and I reworked how it talks about defining a resource, needing to be resolved against the current base IRI, and then providing a base IRI for child resources.

This relationship is necessarily recursive (the $id value needs a base IRI but then also serves as a base IRI), so it's quite difficult to state simply.

Copy link
Member

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, for the delay in reviewing.

jsonschema-core.md Outdated Show resolved Hide resolved
jsonschema-core.md Outdated Show resolved Hide resolved
jsonschema-core.md Outdated Show resolved Hide resolved
jsonschema-core.md Show resolved Hide resolved
Comment on lines 1378 to 1382
Note that single-level custom keywords with identical syntax and semantics to
`$defs` do not allow for any intervening `$id` keywords, and therefore will
behave correctly under implementations that attempt to use any reference target
as a schema. However, this behavior is implementation-specific and MUST NOT be
relied upon for interoperability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this paragraph confusing. The word "intervening" doesn't make sense in this context. I also found it awkward that it starts by saying it will behave correctly, then says it must not be relied upon. I think it would be more clear if it was the other way around.

I think this was originally about unknown keywords. Since unknown keywords aren't allowed anymore, it becomes about custom vocabulary defined keywords. Therefore, I think this is saying that you can't use $id in a schema in a custom keyword. I don't think that's what we want. $id should work in any known schema, even in custom keywords.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph was just moved up. It was previously at line 1404 (which you can see just below).

I also found it awkward that it starts by saying it will behave correctly, then says it must not be relied upon.

The text is "will behave correctly under implementations that attempt to use any reference target as a schema" which is correct.

I think this was originally about unknown keywords.

It's not just about unknown keywords. It's also about keywords that can define structure but not subschemas, like default or const (or custom keywords with structured data). The point of this paragraph is to say that $id in any place where a schema is not expected is to treated as data, not an identifier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I noticed that it was moved. I'm not arguing about its correctness. I can't even say anything about its correctness at this point because I don't understand what it's trying to say.

It's not just about unknown keywords. It's also about keywords that can define structure but not subschemas, like default or const (or custom keywords with structured data).

I don't see how that paragraph can be understood to be about that. It appears this only applies to custom keywords with identical semantics to $defs.

The point of this paragraph is to say that $id in any place where a schema is not expected is to treated as data, not an identifier.

That's well stated and clear. I suggest replacing the paragraph with some variation of that sentence.

Copy link
Member Author

@gregsdennis gregsdennis Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, reading it again, I don't think it's necessary. It looks like it's just trying to explain that you can't have an $id between the keyword and a potential subschema that isn't a subschema because of the structure of the keyword. I'm not sure that needs explaining, so I'll just remove it.

I'm not sure if we need to keep the "requirement" about "MUST NOT be relied upon". It's definitely a requirement on schema authors, but it's a weird one at that.

The next paragraph actually covers exactly what I was saying:

A reference target under a keyword for which the value is known not to be a schema results in undefined behavior in order to avoid burdening implementations with the need to detect such targets.

I still think this can be cleaned up, though.

@gregsdennis gregsdennis force-pushed the gregsdennis/id-updates branch from de4ba41 to 8407415 Compare October 21, 2024 19:54
jsonschema-core.md Outdated Show resolved Hide resolved
Copy link
Member

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes look great. I think, https://github.com/json-schema-org/json-schema-spec/pull/1537/files#r1819699325 is the only thing left.

@gregsdennis
Copy link
Member Author

@jdesrosiers this should be done now.

Copy link
Member

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working with me through the details. This is a huge improvement and I'm excited to see this merged. 🎉

@gregsdennis gregsdennis merged commit 11cb49d into main Nov 2, 2024
6 checks passed
@gregsdennis gregsdennis deleted the gregsdennis/id-updates branch November 2, 2024 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants