Feedback from Feb 7 meeting (plus fixing indent issues).
otherdaniel committed Feb 21, 2024
1 parent 6cf1f81 commit bee5caa
Showing 1 changed file with 108 additions and 80 deletions.
188 changes: 108 additions & 80 deletions index.bs
@@ -175,6 +175,7 @@ The <dfn for="DOM/Document">parseHTMLUnsafe</dfn>(|html|, |options|?) method ste
1. If |options| is set:
1. Let |config| be the result of calling [=canonicalize a configuration=] on
|options|[`"sanitizer"`] and `false`.
1. If |config| exists:
1. Call [=sanitize=] on |document|'s [=tree/root|root node=] with |config|.
1. Return |document|.

@@ -241,13 +242,18 @@ To <dfn>set and filter HTML</dfn>, given an {{Element}} or {{DocumentFragment}}
|target|, an {{Element}} |contextElement|, a [=string=] |html|, and a
[=dictionary=] |options|, and a [=boolean=] flag |safe|, run these steps:

1. If |safe| and |contextElement|'s [=Element/local name=] is `"script"` and
|contextElement|'s [=Element/namespace=] is the [=HTML namespace=] or the
[=SVG namespace=]:
1. Return.
1. Let |config| be the result of calling [=canonicalize a configuration=] on
|options|[`"sanitizer"`] and |safe|.
1. Let |newChildren| be the result of the HTML [=fragment parsing algorithm=]
given |contextElement|, |html|, and `true`.
1. Let |fragment| be a new {{DocumentFragment}} whose [=node document=] is |contextElement|'s [=node document=].
1. [=list/iterate|For each=] |node| in |newChildren|, [=list/append=] |node| to |fragment|.
1. Run [=sanitize=] on |fragment| using |config|.
1. If |config| exists:
1. Run [=sanitize=] on |fragment| using |config|.
1. [=Replace all=] with |fragment| within |target|.

</div>
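
Note: The early return at the top of [=set and filter HTML=] can be illustrated with a minimal, non-normative sketch. The names `ContextElementInfo` and `skipsScriptContext` are hypothetical and only model the two properties of |contextElement| that the guard reads.

```typescript
// Namespace URIs as defined by the HTML and SVG specifications.
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";
const SVG_NAMESPACE = "http://www.w3.org/2000/svg";

// Hypothetical stand-in for the parts of |contextElement| the guard inspects.
interface ContextElementInfo {
  localName: string;
  namespace: string | null;
}

// Returns true when the algorithm would return before parsing |html|:
// only in safe mode, and only for HTML or SVG script elements.
function skipsScriptContext(safe: boolean, context: ContextElementInfo): boolean {
  return (
    safe &&
    context.localName === "script" &&
    (context.namespace === HTML_NAMESPACE || context.namespace === SVG_NAMESPACE)
  );
}
```

In this reading, the unsafe variants never take the early return, so a `script` context element is still parsed and filtered there.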
@@ -264,25 +270,26 @@ For the main <dfn>sanitize</dfn> operation, using a {{ParentNode}} |node|, a
1. [=Assert=]: |child| [=implements=] {{Text}}, {{Comment}}, or {{Element}}.

Note: Currently, this algorithm is only called on output of the HTML
parser, for which this assertion should hold. If this is to be
generalized, this algorithm needs to be re-examined.
parser for which this assertion should hold. If in the future
this algorithm will be used in different contexts, this assumption
needs to be re-examined.
1. If |child| [=implements=] {{Text}}:
1. Do nothing.
1. else if |child| [=implements=] {{Comment}}:
1. If |config|'s {{SanitizerConfig/comments}} is not `true`:
1. [=/remove=] |child|.
1. else if |child| [=implements=] {{Element}}:
1. else:
1. Let |elementName| be a {{SanitizerElementNamespace}} with |child|'s
[=Element/local name=] and [=Element/namespace=].
1. If |config|["{{SanitizerConfig/elements}}"] exists and
|config|["{{SanitizerConfig/elements}}"] does not [=list/contain=]
|config|["{{SanitizerConfig/elements}}"] does not [=SanitizerConfig/contain=]
[|elementName|]:
1. [=/remove=] |child|.
1. else if |config|["{{SanitizerConfig/removeElements}}"] exists and
|config|["{{SanitizerConfig/removeElements}}"] [=list/contains=]
|config|["{{SanitizerConfig/removeElements}}"] [=SanitizerConfig/contains=]
[|elementName|]:
1. [=/remove=] |child|.
1. If |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] exists and |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] [=list/contains=] |elementName|:
1. If |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] exists and |config|["{{SanitizerConfig/replaceWithChildrenElements}}"] [=SanitizerConfig/contains=] |elementName|:
1. Call [=sanitize=] on |child| with |config|.
1. Call [=replace all=] with |child|'s [=tree/children=] within |child|.
1. If |elementName| [=equals=] &laquo;[ `"name"` &rightarrow; `"template"`,
@@ -294,32 +301,32 @@ For the main <dfn>sanitize</dfn> operation, using a {{ParentNode}} |node|, a
1. Let |attrName| be a {{SanitizerAttributeNamespace}} with |attr|'s
[=Attr/local name=] and [=Attr/namespace=].
1. If |config|["{{SanitizerConfig/attributes}}"] exists and
|config|["{{SanitizerConfig/attributes}}"] does not [=list/contain=]
|config|["{{SanitizerConfig/attributes}}"] does not [=SanitizerConfig/contain=]
|attrName|:
1. If "data-" is a [=code unit prefix=] of [=Attr/local name=] and
if [=Attr/namespace=] is "" and
if [=Attr/namespace=] is `null` and
if |config|["{{SanitizerConfig/attributes}}"] exists and
if |config|["{{SanitizerConfig/dataAttributes}}"] exists and is `true`:
1. Do nothing.
1. Else:
1. Remove |attr| from |child|.
1. else if |config|["{{SanitizerConfig/removeAttributes}}"] exists and
|config|["{{SanitizerConfig/removeAttributes}}"] [=list/contains=]
|config|["{{SanitizerConfig/removeAttributes}}"] [=SanitizerConfig/contains=]
|attrName|:
1. Remove |attr| from |child|.
1. If |config|["{{SanitizerConfig/elements}}"][|elementName|] exists,
and if
|config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/attributes}}"]
exists, and if
|config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/attributes}}"]
does not [=list/contain=] |attrName|:
does not [=SanitizerConfig/contain=] |attrName|:
1. Remove |attr| from |child|.
1. If |config|["{{SanitizerConfig/elements}}"][|elementName|] exists,
and if
|config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"]
exists, and if
|config|["{{SanitizerConfig/elements}}"][|elementName|]["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"]
[=list/contains=] |attrName|:
[=SanitizerConfig/contains=] |attrName|:
1. Remove |attr| from |child|.
1. If &laquo;[|elementName|, |attrName|]&raquo; matches an entry in the
[=navigating URL attributes list=], and if |attr|'s [=protocol=] is
@@ -328,8 +335,6 @@ For the main <dfn>sanitize</dfn> operation, using a {{ParentNode}} |node|, a
1. Call [=sanitize=] on |child|'s [=Element/shadow root=] with |config|.
1. else:
1. [=/remove=] |child|.
1. else:
1. [=Assert=]: We shouldn't reach this branch.

</div>
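
Note: The element-level checks in [=sanitize=] can be approximated by the following non-normative sketch. It assumes the name lists are already canonical, applies the checks in the order they appear, and returns at the first match, which is a simplification. The type and function names are hypothetical, and the template, attribute, and shadow root steps are not modeled.

```typescript
// A canonical element or attribute name: both entries are always present.
interface SanitizerName {
  name: string;
  namespace: string | null;
}

// Hypothetical subset of a canonical configuration, element lists only.
interface CanonicalElementLists {
  elements?: SanitizerName[];
  removeElements?: SanitizerName[];
  replaceWithChildrenElements?: SanitizerName[];
}

type ElementAction = "remove" | "replaceWithChildren" | "keep";

// Membership requires both "name" and "namespace" to match.
function listContains(list: SanitizerName[], item: SanitizerName): boolean {
  return list.some((entry) => entry.name === item.name && entry.namespace === item.namespace);
}

function decideElementAction(config: CanonicalElementLists, elementName: SanitizerName): ElementAction {
  // An allow-list exists and does not mention the element.
  if (config.elements !== undefined && !listContains(config.elements, elementName)) {
    return "remove";
  }
  // A block-list exists and mentions the element.
  if (config.removeElements !== undefined && listContains(config.removeElements, elementName)) {
    return "remove";
  }
  // The element is dropped but its (sanitized) children are kept.
  if (
    config.replaceWithChildrenElements !== undefined &&
    listContains(config.replaceWithChildrenElements, elementName)
  ) {
    return "replaceWithChildren";
  }
  return "keep";
}
```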

@@ -339,9 +344,9 @@ For the main <dfn>sanitize</dfn> operation, using a {{ParentNode}} |node|, a
A |config| is <dfn for="SanitizerConfig">valid</dfn> if all these conditions are met:

1. |config| is a [=dictionary=]
1. |config|'s [=map/keys|key set=] does not contain both
1. |config|'s [=map/keys|key set=] does not [=list/contain=] both
"{{SanitizerConfig/elements}}" and "{{SanitizerConfig/removeElements}}"
1. |config|'s [=map/keys|key set=] does not contain both
1. |config|'s [=map/keys|key set=] does not [=list/contain=] both
"{{SanitizerConfig/removeAttributes}}" and "{{SanitizerConfig/attributes}}".
1. [=list/iterate|For any=] |key| of &laquo;[
"{{SanitizerConfig/elements}}",
@@ -353,46 +358,55 @@ A |config| is <dfn for="SanitizerConfig">valid</dfn> if all these conditions are
1. |config|[|key|] is [=SanitizerNameList/valid=].
1. If |config|["{{SanitizerConfig/elements}}"] exists, then
[=list/iterate|for any=] |element| in |config|[|key|] that is a [=dictionary=]:
1. |element| does not contain both
1. |element| does not [=list/contain=] both
"{{SanitizerElementNamespaceWithAttributes/attributes}}" and
"{{SanitizerElementNamespaceWithAttributes/removeAttributes}}".
1. If either |element|["{{SanitizerElementNamespaceWithAttributes/attributes}}"]
or |element|["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"]
[=map/exists=], then it is [=SanitizerNameList/valid=].
1. Let |tmp| be a [=dictionary=], and for any |key| &laquo;[
"{{SanitizerConfig/elements}}",
"{{SanitizerConfig/removeElements}}",
"{{SanitizerConfig/replaceWithChildrenElements}}",
"{{SanitizerConfig/attributes}}",
"{{SanitizerConfig/removeAttributes}}"
]&raquo; |tmp|[|key|] is set to the result of [=canonicalize a sanitizer
element list=] called on |config|[|key|], and [=HTML namespace=] as default
namespace for the element lists, and "" as default namespace for the
attributes lists.

Given these canonicalized name lists, all of the following conditions hold:

1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/elements}}"] and
|tmp|["{{SanitizerConfig/removeElements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/removeElements}}"] and
|tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"] and
|tmp|["{{SanitizerConfig/elements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/attributes}}"] and
|tmp|["{{SanitizerConfig/removeAttributes}}"]
is empty.

Note: The intent here is to detect duplicates, but without regard of
whether the string shortcut syntax or the explicit dictionary
syntax is used. An implementation might well do this without
explicitly canonicalizing the lists at this point.
1. Let |tmp| be a [=dictionary=], and for any |key| &laquo;[
"{{SanitizerConfig/elements}}",
"{{SanitizerConfig/removeElements}}",
"{{SanitizerConfig/replaceWithChildrenElements}}",
"{{SanitizerConfig/attributes}}",
"{{SanitizerConfig/removeAttributes}}"
]&raquo; |tmp|[|key|] is set to the result of [=canonicalize a sanitizer
element list=] called on |config|[|key|], and [=HTML namespace=] as default
namespace for the element lists, and `null` as default namespace for the
attributes lists.

Note: The intent here is to make assertions about list elements, without regard
to whether the string shortcut syntax or the explicit dictionary
syntax is used. For example, "img" in `elements` and
`{ name: "img" }` in `removeElements` name the same element. An implementation might well
do this without explicitly canonicalizing the lists at this point.

1. Given these canonicalized name lists, all of the following conditions hold:

1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/elements}}"] and
|tmp|["{{SanitizerConfig/removeElements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/removeElements}}"] and
|tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/replaceWithChildrenElements}}"] and
|tmp|["{{SanitizerConfig/elements}}"]
is empty.
1. The [=set/intersection=] between
|tmp|["{{SanitizerConfig/attributes}}"] and
|tmp|["{{SanitizerConfig/removeAttributes}}"]
is empty.

1. Let |tmpattrs| be |tmp|["{{SanitizerConfig/attributes}}"] if it exists,
and otherwise [=built-in default config=]["{{SanitizerConfig/attributes}}"].
1. [=list/iterate|For any=] |item| in |tmp|["{{SanitizerConfig/elements}}"]:
1. If either |item|["{{SanitizerElementNamespaceWithAttributes/attributes}}"]
or |item|["{{SanitizerElementNamespaceWithAttributes/removeAttributes}}"]
exists:
1. Then the [=set/difference=] between it and |tmpattrs| is empty.

</div>
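
Note: The pairwise "intersection is empty" conditions can be checked as in the following non-normative sketch. It assumes that canonicalizing a name turns a string shortcut into a dictionary with the default namespace, and fills in the default namespace when a dictionary omits it. The helper names `canonicalizeName` and `overlaps` are hypothetical.

```typescript
// A name as an author may write it: string shortcut or explicit dictionary.
type NameInput = string | { name: string; namespace?: string | null };

interface CanonicalName {
  name: string;
  namespace: string | null;
}

const HTML_NS = "http://www.w3.org/1999/xhtml";

// Assumed canonical form: always a dictionary with both entries present.
function canonicalizeName(input: NameInput, defaultNamespace: string | null): CanonicalName {
  if (typeof input === "string") {
    return { name: input, namespace: defaultNamespace };
  }
  return {
    name: input.name,
    namespace: input.namespace === undefined ? defaultNamespace : input.namespace,
  };
}

// True if the two lists share at least one name after canonicalization.
// A missing list cannot overlap with anything.
function overlaps(
  a: NameInput[] | undefined,
  b: NameInput[] | undefined,
  defaultNamespace: string | null
): boolean {
  if (a === undefined || b === undefined) {
    return false;
  }
  const canonicalB = b.map((n) => canonicalizeName(n, defaultNamespace));
  return a
    .map((n) => canonicalizeName(n, defaultNamespace))
    .some((x) => canonicalB.some((y) => x.name === y.name && x.namespace === y.namespace));
}
```

Under these assumptions, a validity check would call `overlaps` once per pair of lists, with `HTML_NS` as the default for element lists and `null` for attribute lists.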

@@ -420,10 +434,9 @@ A |config| is <dfn for="SanitizerConfig">canonical</dfn> if all these conditions
"{{SanitizerConfig/attributes}}",
"{{SanitizerConfig/removeAttributes}}",
"{{SanitizerConfig/comments}}",
"{{SanitizerConfig/dataAttributes}}",
"safe"
"{{SanitizerConfig/dataAttributes}}"
]&raquo;
1. |config|'s [=map/keys|key set=] contains either:
1. |config|'s [=map/keys|key set=] [=list/contains=] either:
1. both "{{SanitizerConfig/elements}}" and "{{SanitizerConfig/attributes}}",
but neither of
"{{SanitizerConfig/removeElements}}" or "{{SanitizerConfig/removeAttributes}}".
@@ -442,8 +455,7 @@ A |config| is <dfn for="SanitizerConfig">canonical</dfn> if all these conditions
1. |config|["{{SanitizerConfig/elements}}"] is [=SanitizerNameWithAttributesList/canonical=].
1. For any |key| of &laquo;[
"{{SanitizerConfig/comments}}",
"{{SanitizerConfig/dataAttributes}}",
"safe"
"{{SanitizerConfig/dataAttributes}}"
]&raquo;:
1. if |config|[|key|] [=map/exists=], |config|[|key|] is a {{boolean}}.

@@ -498,8 +510,20 @@ if all these conditions are met:
In order to <dfn>canonicalize a configuration</dfn> |config| with a boolean
parameter |safe|, run the following steps:

TODO: Handle empty |config|.
Note: The initial set of [=assert=]s assert properties of the built-in
constants, like the [=built-in default config|defaults=] and
the lists of known [=known elements|elements=] and
[=known attributes|attributes=].

1. [=Assert=]: [=built-in default config=] is [=SanitizerConfig/canonical=].
1. [=Assert=]: [=built-in default config=]["elements"] is a [=subset=] of [=known elements=].
1. [=Assert=]: [=built-in default config=]["attributes"] is a [=subset=] of [=known attributes=].
1. [=Assert=]: &laquo;[
"elements" &rightarrow; [=known elements=],
"attributes" &rightarrow; [=known attributes=],
]&raquo; is [=SanitizerConfig/canonical=].
1. If |config| is empty and not |safe|:
1. Return.
1. If |config| is not [=SanitizerConfig/valid=], then [=throw=] a {{TypeError}}.
1. Let |result| be a new [=dictionary=].
1. For each |key| of &laquo;[
@@ -513,7 +537,7 @@ TODO: Handle empty |config|.
"{{SanitizerConfig/attributes}}",
"{{SanitizerConfig/removeAttributes}}" ]&raquo;:
1. If |config|[|key|] exists, set |result|[|key|] to the result of running
[=canonicalize a sanitizer element list=] on |config|[|key|] with `""` as
[=canonicalize a sanitizer element list=] on |config|[|key|] with `null` as
the default namespace.
1. Set |result|["{{SanitizerConfig/comments}}"] to
|config|["{{SanitizerConfig/comments}}"].
@@ -565,7 +589,6 @@ TODO: Handle empty |config|.
|config|["{{SanitizerConfig/removeAttributes}}"] [=map/exist=]:
1. Set |result|["{{SanitizerConfig/attributes}}"] to
|default|["{{SanitizerConfig/attributes}}"].
1. Set |result|["safe"] to |safe|.
1. [=Assert=]: |result| is [=SanitizerConfig/valid=].
1. [=Assert=]: |result| is [=SanitizerConfig/canonical=].
1. Return |result|.
@@ -601,10 +624,21 @@ namespace |defaultNamespace|, run the following steps:

## Supporting Algorithms ## {#alg-support}

Set difference (or set subtraction) is a clone of a set A, but with all members
removed that occur in a set B.
<div algorithm>
For the [=canonicalize a sanitizer name|canonicalized=]
{{SanitizerElementNamespace|element}} and {{SanitizerAttributeNamespace|attribute name}} lists
used in this spec, list membership is based on matching both `"name"` and `"namespace"`
entries:
A Sanitizer name |list| <dfn for="SanitizerConfig">contains</dfn> an |item|
if there exists an |entry| of |list| that is an [=ordered map=], and where
|item|["name"] [=equals=] |entry|["name"] and
|item|["namespace"] [=equals=] |entry|["namespace"].

</div>
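
Note: The membership rule above can be sketched as follows (non-normative). The sketch assumes a list may still hold string shortcuts, which are never matched because only [=ordered map=] entries are considered. The names `ListEntry` and `sanitizerListContains` are hypothetical.

```typescript
// A canonical name: an ordered map with "name" and "namespace" entries.
interface SanitizerName {
  name: string;
  namespace: string | null;
}

// Entries that are not ordered maps (modeled here as plain strings) are skipped.
type ListEntry = string | SanitizerName;

function sanitizerListContains(list: ListEntry[], item: SanitizerName): boolean {
  return list.some(
    (entry) =>
      typeof entry !== "string" &&
      entry.name === item.name &&
      entry.namespace === item.namespace
  );
}
```

Because both entries are compared, an attribute recorded with namespace `""` would not match one recorded with `null`, which is one reason to use a single representation for "no namespace" throughout.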

<div algorithm>
Set difference (or set subtraction) is a clone of a set A, but with all members
removed that occur in a set B:
To compute the <dfn for="set">difference</dfn> of two [=ordered sets=] |A| and |B|:

1. Let |set| be a new [=ordered set=].
@@ -615,16 +649,19 @@ To compute the <dfn for="set">difference</dfn> of two [=ordered sets=] |A| and |

</div>
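
Note: A minimal, non-normative sketch of set difference over ordered sets, modeled here as arrays. The optional `equals` parameter is an assumption of this sketch, added so that name dictionaries can be compared by value. It is not part of the definition above.

```typescript
// Returns the members of |a|, in order, that do not occur in |b|.
// Neither input is modified.
function setDifference<T>(
  a: readonly T[],
  b: readonly T[],
  equals: (x: T, y: T) => boolean = (x, y) => x === y
): T[] {
  return a.filter((item) => !b.some((other) => equals(item, other)));
}
```

An empty result means every member of the first set also occurs in the second.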

Equality for [=ordered sets=] is equality of its members, but without
regard to order.

<div algorithm>
Equality for [=ordered sets=] is equality of its members, but without
regard to order:
[=Ordered sets=] |A| and |B| are <dfn for=set>equal</dfn> if both |A| is a
[=superset=] of |B| and |B| is a [=superset=] of |A|.

</div>
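
Note: Order-insensitive equality can be sketched as two superset checks (non-normative). Sets are modeled as arrays without duplicates and members are compared with strict equality. Both are assumptions of this sketch.

```typescript
// True if every member of |b| also occurs in |a|.
function isSuperset<T>(a: readonly T[], b: readonly T[]): boolean {
  return b.every((item) => a.indexOf(item) !== -1);
}

// Two sets are equal when each is a superset of the other.
function setsEqual<T>(a: readonly T[], b: readonly T[]): boolean {
  return isSuperset(a, b) && isSuperset(b, a);
}
```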

## Defaults ## {#sanitization-defaults}

Note: The defaults should follow a certain form, which is checked for at the
beginning of [=canonicalize a configuration=].

The <dfn>built-in default config</dfn> is as follows:
```
{
@@ -638,28 +675,19 @@ The <dfn>built-in default config</dfn> is as follows:
The <dfn>known elements</dfn> are as follows:
```
[
{ name: "div", namespace: "http://www.w3.org/1999/xhtml"" },
{ name: "div", namespace: "http://www.w3.org/1999/xhtml" },
...
]
```

The <dfn>known attributes</dfn> are as follows:
```
[
{ name: "class", namespace: "" },
{ name: "class", namespace: null },
...
]
```

1. [=Assert=]: [=built-in default config=] is [=SanitizerConfig/canonical=]
1. [=Assert=]: [=built-in default config=]["elements"] is a [=subset=] of [=known elements=].
1. [=Assert=]: [=built-in default config=]["attributes"] is a [=subset=] of [=known attributes=].
1. [=Assert=]: &laquo;[
"elements" &rightarrow; [=known elements=],
"attributes" &rightarrow; [=known attributes=],
"safe" &rightarrow; `false`,
]&raquo; is [=SanitizerConfig/canonical=].

Note: The [=known elements=] and [=known attributes=] should be derived from the
HTML5 specification, rather than being explicitly listed here. Currently,
there are no mechanics to do so.
@@ -672,27 +700,27 @@ navigations are unsafe, are as follows:
<br>
[
{ `"name"` &rightarrow; `"a"`, `"namespace"` &rightarrow; "[=HTML namespace=]" },
{ `"name"` &rightarrow; `"href"`, `"namespace"` &rightarrow; "" }
{ `"name"` &rightarrow; `"href"`, `"namespace"` &rightarrow; `null` }
],
<br>
[
{ `"name"` &rightarrow; `"area"`, `"namespace"` &rightarrow; "[=HTML namespace=]" },
{ `"name"` &rightarrow; `"href"`, `"namespace"` &rightarrow; "" }
{ `"name"` &rightarrow; `"href"`, `"namespace"` &rightarrow; `null` }
],
<br>
[
{ `"name"` &rightarrow; `"form"`, `"namespace"` &rightarrow; "[=HTML namespace=]" },
{ `"name"` &rightarrow; `"action"`, `"namespace"` &rightarrow; "" }
{ `"name"` &rightarrow; `"action"`, `"namespace"` &rightarrow; `null` }
],
<br>
[
{ `"name"` &rightarrow; `"input"`, `"namespace"` &rightarrow; "[=HTML namespace=]" },
{ `"name"` &rightarrow; `"formaction"`, `"namespace"` &rightarrow; "" }
{ `"name"` &rightarrow; `"formaction"`, `"namespace"` &rightarrow; `null` }
],
<br>
[
{ `"name"` &rightarrow; `"button"`, `"namespace"` &rightarrow; "[=HTML namespace=]" },
{ `"name"` &rightarrow; `"formaction"`, `"namespace"` &rightarrow; "" }
{ `"name"` &rightarrow; `"formaction"`, `"namespace"` &rightarrow; `null` }
],
<br>
]&raquo;