clarify that contentSchema holds a subschema and when/how it applies #1564

Open
wants to merge 5 commits into
base: main

@gregsdennis gregsdennis commented Nov 21, 2024

What kind of change does this PR introduce?

clarification

Issue & Discussion References

Summary

Updates the text for contentSchema to indicate that its value is indeed a subschema (and therefore should be treated as such when scanning for identifiers). Also cleans up language around its dependency on contentMediaType.

I didn't include any explicit text about it containing identifiers. Instead I declare that the value is a subschema and removed the "SHOULD ignore" text discussed in the issue.

Also pertinent to the issue discussion are the final couple of sentences (already present), which I broke out into a new paragraph and rewrote to make it more apparent that they are a note of guidance rather than a requirement.

Accessing the schema through the schema location IRI included as part of the
annotation will ensure that it is correctly processed as a subschema. Using the
extracted annotation value directly is only safe if the subschema is an embedded
resource with both $schema and an absolute IRI $id.
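
As a rough illustration of the condition in the quoted paragraph, the following Python sketch checks whether a `contentSchema` value declares both `$schema` and an absolute-IRI `$id`. The helper name `is_standalone_resource` and the example schemas and IRIs are hypothetical, and the "absolute" test (a scheme and no fragment) is a simplifying assumption rather than a full IRI validation.

```python
from urllib.parse import urlsplit


def is_standalone_resource(subschema):
    """Return True if the subschema declares both $schema and an absolute-IRI $id."""
    if not isinstance(subschema, dict):
        return False  # boolean schemas carry no identifiers
    schema_id = subschema.get("$id")
    if "$schema" not in subschema or not isinstance(schema_id, str):
        return False
    parts = urlsplit(schema_id)
    # Simplified check: an absolute IRI has a scheme and no fragment.
    return bool(parts.scheme) and not parts.fragment


# Hypothetical embedded resource: carries its own dialect and identifier.
embedded = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schemas/payload",
    "type": "object",
}

# Hypothetical bare subschema: depends on its parent for dialect and base IRI.
bare = {"type": "object"}

standalone_embedded = is_standalone_resource(embedded)
standalone_bare = is_standalone_resource(bare)
```

Under these assumptions only `embedded` could be used directly from the extracted annotation value; `bare` would need to be reached through its schema location IRI.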

Does this PR introduce a breaking change?

No.

@gregsdennis gregsdennis requested a review from a team November 21, 2024 21:30
@gregsdennis gregsdennis self-assigned this Nov 21, 2024
@gregsdennis gregsdennis added this to the stable-release milestone Nov 21, 2024
Comment on lines 545 to 547
Since `contentMediaType` is required to provide instruction on how to interpret
string content, the annotation schema produced by this keyword has no meaning if
`contentMediaType` is not present.
Member

I think I would prefer that no annotation is produced at all if contentMediaType is missing -- in order to discourage structuring schemas in this way.

Member Author

Does that then also mean that identifiers are not to be processed in that case?

Member

Does that then also mean that identifiers are not to be processed in that case?

IMO, it should always be treated as a normal schema location and therefore always respect identifiers. But, I agree that it shouldn't produce an annotation if it isn't valid.

Member Author

Are we happy with it saying that an annotation SHOULD NOT be produced (etc.)?

Member

I think it should be a "MUST".
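
One reading of the position taken in this thread can be sketched in Python as follows: the `contentSchema` value is always walked when scanning for identifiers, but no annotation is produced unless `contentMediaType` is also present. This reflects the discussion above, not final specification text. The function names and the example schema are hypothetical, and the identifier scan is deliberately naive (it visits every nested value instead of only known schema locations).

```python
def content_schema_annotation(schema):
    """Return the contentSchema annotation value, or None when none is produced."""
    if "contentSchema" not in schema:
        return None
    if "contentMediaType" not in schema:
        # Position discussed in the thread: no annotation without contentMediaType.
        return None
    return schema["contentSchema"]


def collect_ids(node, found=None):
    """Collect every string $id, descending into all nested values (naive walk)."""
    found = [] if found is None else found
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            found.append(node["$id"])
        for value in node.values():
            collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found


# Hypothetical schema that omits contentMediaType.
schema = {
    "type": "string",
    "contentSchema": {"$id": "https://example.com/inner", "type": "object"},
}

annotation_without_media_type = content_schema_annotation(schema)
ids_without_media_type = collect_ids(schema)

# The same schema with contentMediaType added.
with_media_type = dict(schema, contentMediaType="application/json")
annotation_with_media_type = content_schema_annotation(with_media_type)
```

In this sketch the embedded `$id` is found in both cases, while the annotation is only returned for `with_media_type`.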


Accessing the schema through the schema location IRI included as part of the
annotation will ensure that it is correctly processed as a subschema. Using the
extracted annotation value directly is only safe if the subschema is an embedded
Member

"safe" and "correctly processed as a subschema" are vague -- can we say something else? I think this is trying to say that the evaluation behaviour won't be reproducible because the schema is evaluated in isolation, rather than in the context of the surrounding dialect and location identifier (from the containing schema's $schema and $id keywords). So how about instead saying something like:

Because this subschema is intended to be processed in isolation, outside of the context of its containing schema, usage of both the $schema and $id keywords is recommended to ensure predictable and reproducible results.

Member Author

This text was already present. I just put it in a new paragraph. I did think it was a bit convoluted. Happy to update it.

Member Author

I do think it could be okay to use the subschema in its original context, so that should still be addressed. This is what the original discussion was about.


Accessing the schema through the schema location IRI included as part of the
annotation will ensure that it is correctly processed as a subschema. Using the
extracted annotation value directly is only safe if the subschema is an embedded
Member

Maybe we should make this a SHOULD?

Member Author

I'm not sure what you're suggesting. This is informative, not a requirement.

Member

So from what I'm reading here, there are edge cases in contentSchema if the schema doesn't have $id and $schema. If that's the case, shouldn't we highlight it in more of a "SHOULD" manner, which, from what I understand, is something you should do rather than a MUST (something you HAVE to do)? Or maybe "RECOMMENDED" is the right one, similar to this case: https://json-schema.org/draft/2020-12/json-schema-validation#section-7.2.2-5?

Member Author

It's not an edge case. It just means that if you intend to use the contentSchema subschema solely in its own context, as you would if you received it as an annotation, then any relative $refs will only be resolvable if the subschema has both $schema and $id.

This goes back to my comment here where I show a contentSchema subschema attempting to reference a definition in its parent schema. If you extract the subschema (again, because you've received it as an annotation), then that reference fails.

There's not a best practice here. Both approaches have valid use cases, and schema authors are free to do what makes sense for them. This is merely a caution to schema authors to understand the implications of their approach.
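
The failure described in this comment can be sketched in Python as follows. The schema below is a hypothetical reconstruction of the kind of case being described, not the one from the linked comment, and `resolve_pointer` is a minimal illustrative helper that only handles same-document references into object members.

```python
def resolve_pointer(document, ref):
    """Resolve a same-document reference such as '#/$defs/name' against document."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    node = document
    for token in ref[1:].split("/")[1:]:
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]  # raises KeyError if the target is missing
    return node


# Hypothetical parent schema whose contentSchema refers to a parent definition.
parent = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schemas/message",
    "type": "string",
    "contentMediaType": "application/json",
    "contentSchema": {"$ref": "#/$defs/payload"},
    "$defs": {"payload": {"type": "object"}},
}

# What a consumer holds after receiving the subschema as an annotation value.
extracted = parent["contentSchema"]

# Resolved in the context of the parent document, the reference is found.
in_context = resolve_pointer(parent, extracted["$ref"])

# Resolved against the extracted value alone, the target does not exist.
try:
    resolve_pointer(extracted, extracted["$ref"])
    resolves_in_isolation = True
except KeyError:
    resolves_in_isolation = False
```

Here the reference resolves only while the subschema stays in its parent document, which is the implication schema authors are being cautioned about.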

Member Author

It's obvious to me from @karenetheridge's and your comments that this paragraph isn't clear, so I'll reword it.

@gregsdennis
Member Author

@karenetheridge / @jviotti I've rewritten the last paragraph note and added another "editor" footnote that points back to the subject issue linked above. Let me know what you think.

@jviotti
Member

jviotti commented Nov 25, 2024

Reads much better now!

specs/jsonschema-validation.md (outdated, resolved)

@gregsdennis
Member Author

@karenetheridge @jdesrosiers @jviotti I believe I've addressed your concerns here.

Member

@jviotti jviotti left a comment

Looks good to me

Status: In Progress

Successfully merging this pull request may close these issues.

contentSchema has implementation-defined referencing behavior when contentMediaType is not present
4 participants