Fix tag balancing #23

adon-at-work · 2015-08-17T09:58:40Z

Changes

fix: optional tags are ignored during balancing, i.e., well-balanced input will not be broken by attempting to close optional tags.
tag balancing logics simplified (accept closing tag as long as it exists in the stack)
ignore self-closing solidus, which is required only for foreign elements
added a cap to limit the max levels of unclosed/nested tags

before the fix
default x 0.64 ops/sec ±5.90% (6 runs sampled)
disabled tag balancing x 0.60 ops/sec ±3.82% (6 runs sampled)
Fastest is [ 'default' ]
Speed/Time is 0.6423016849725829MB/s 1.5567777936666667s

after the fix
default x 0.72 ops/sec ±5.01% (6 runs sampled)
disabled tag balancing x 0.60 ops/sec ±3.91% (6 runs sampled)
Fastest is [ 'default' ]
Speed/Time is 0.7190234217142741MB/s 1.3906654078333334s

it actually runs faster than without it using the given 1m file. so tag balancing doesn't seem to incur any significant performance hit.

adon-at-work · 2015-08-17T10:05:19Z

@maditya, please check. thanks :)

maditya · 2015-08-17T20:23:39Z

can we track updates to benchmark suite in a separate PR ?

adon-at-work · 2015-08-18T03:33:15Z

ok. rebased to separate it to PR #25

adon-at-work · 2015-08-18T10:27:46Z

@maditya , as discussed over phone, i optimized the code a bit, and imposed a cap on stack size.

yukinying · 2015-08-19T18:23:33Z

src/html-purify.js

-                        if (openedTag) {
-                            this.output += '</' + openedTag + '>';
+                    if (tagBalance.enabled && !optionalElements[tagName]) {
+                        // relaxed tag balancing, accept it as long as the tag exists in the stack


This seems to be a heuristic approach. We should have specific examples, probably in a separate document, on what cases we are handling and what cases we are not.

A document was prepared and shared internally on Aug 13. In case you missed it, please search "Untrusted input should be self-contained by tag balancing" in your mailbox.

Updated the README to explain tag balancing.

@maditya

maditya · 2015-09-02T07:14:39Z

src/html-purify.js

@@ -37,28 +48,29 @@ See the accompanying LICENSE file for terms.
            enableCanonicalization: config.enableCanonicalization,
            enableVoidingIEConditionalComments: config.enableVoidingIEConditionalComments
        }).on('postWalk', function (lastState, state, i, endsWithEOF) {
-                processTransition.call(that, lastState, state, i);
+            !tagBalance.stackOverflow && processTransition.call(that, lastState, state, i);


another approach on handling stackOverflow is to :
a. clear the stack
b. disregard the check "if tag exists in the stack".
(in other words, behave as if tag balancing was disabled)
That would prevent truncating valid/balanced input just because it has nesting > stack size(as currently seen in the test case at https://github.com/yahoo/html-purify/pull/23/files#diff-cfdb6a0bb85a9c9dcb6708d2225134cfR42). Also, our assertion that "we won't do tag balacing when nesting > stack size" still holds true. Thoughts?

really want to let an untrusted input to disable tag balancing?

an attacker can first pad <b> n+1 times, where n equals the stack size limit, optionally close all of them, and then start crafting "dangerous" tags. Examples:

<b>...<b><textarea>

<b>...<b><div style="display:none;visibility:hidden;font-size:0;color:#000"> (accepting/allowing either of those CSS will result in a consequence literally like <!--. It's common to allow an untrusted email to specify font size/color)

<div style="display:none;visibility:hidden;font-size:0;color:#000"> n+1 times

Let me come up with a simple use case, as for email threads/conversations:

<div> To: inHTMLData(email_recipient1)<br> From: inHTMLData(email_sender1)<br> Subject: inHTMLData(email_subj1)<br> purify(email_content1) <br><button>Reply</button><button>Forward</button><button>Delete</button> </div> <div> To: inHTMLData(email_recipient2)<br> From: inHTMLData(email_sender2)<br> Subject: inHTMLData(email_subj2)<br> purify(email_content2) <br><button>Reply</button><button>Forward</button><button>Delete</button> </div> <div> To: inHTMLData(email_recipient3)<br> From: inHTMLData(email_sender3)<br> Subject: inHTMLData(email_subj3)<br> purify(email_content3) <br><button>Reply</button><button>Forward</button><button>Delete</button> </div>

I hope this use case can explain why we do tag balancing. No one would accept:
(1) the purify(email_content1) can hide the buttons like Reply, Forward, and Delete from the trusted Mail app
(2) even worst, purify(email_content1) can collude with purify(email_content3) to hide email 2, when email_content1 is </div><div style="display:none;..."><div> and email_content3 is </div></div><div>

thoughts? this pretty much explained the need of self-containing the untrusted data.

I think we should make sure the output is safe, or we just return empty output with an error.

yukinying · 2015-09-03T19:42:12Z

@maditya, @adon-at-work

Please see http://jsperf.com/array-push-int-and-string for performance with various array push technique. Since we are having a whitelist of tags, I think we would prefer using integer (enum based) assignment to the tag list stack (array).

There may be other optimization techniques, please feel free to edit the jsperf test for such.

adon-at-work · 2015-09-10T01:45:49Z

@yukinying
I see you're suggesting to deal with integers instead of tagName (string), and thanks for the detailed performance tests. :)

Right now, the tagName whitelist is like ['a', 'div', ...], and I'm aware of making use of the array indices as the integers we want.

@yukinying @maditya
Looking farther, it boils down to whether we'll support tag-specific attribute whitelist, e.g., href is whitelisted for <a> but not <div>, as in the current Mail or YIV use case. This will for sure affect the list structure. As a reference, secure-handlebars attempted to support similar tag-specific attribute matching by: https://github.com/yahoo/secure-handlebars/blob/master/src/parser-utils.js#L68-L70. Here, we may need a very different structure.

Hence, should we now keep the current implementation as is before settling the use case regarding tag-specific attribute whitelist and its required the list structure? And perhaps, (if the list structure really requires a change,) let's do it in separate PRs.

yukinying · 2015-09-11T07:24:52Z

The suggestion there is to avoid using Array.prototype.push as it is not optimized for the use case. Instead, keeping a counter of the stack size and use it to mimic the push and pop operation will give you a huge performance boost.

A further boost can be possible if we use enum based approach for identifying the tag name, as the space required for building the stack will be much smaller.

Please take the suggestion on avoiding use of array push as a basic requirement for the tag balancing stack. Once that's done, we should be able to relax the requirement of stack size limitation (e.g. grow it to 100,000) to some value that we would consider the input as totally malicious / invalid, and then just return error for the case.

adon-at-work · 2015-09-11T07:49:04Z

I appreciate your suggestions, and indeed, thank you for a midnight comment. 👍

But there has no more .push since 24 days ago. Such an optimization is completed in fa24fa2#diff-9c568acf92dc1a69fdd058c41195c719R121. You may like to confirm by searching for .push after clicking "Files Changed".

I personally think legitimate HTML won't nest more than a maximum of 100 levels. 100,000 sounds way too relax perhaps. Will you consider/accept 1,000 or 10,000?

Regarding "enum based approach for identifying the tag name", I also see its potential as you do. But could we not to do that in this PR before we finalize the enum/whitelist structure?

yukinying · 2015-09-11T08:21:47Z

I would rather use some measurable metrics to determine the level. Could you provide a simple benchmark with nested tags with different levels, e.g. 100, 1000, 10000?

We will also need a doc (or MD) file that describes the actual tag balancing algorithm. The doc you sent me previously has too many open questions there. My preference is to have a document that clearly explains the algorithm, and we use the algorithm spec to validate if any future pull requests or bug fixes from any contributors would align with the design principles and security consideration of such.

It'd actually be much nicer if the commit fa24fa2 has come along with a description of the changes.

yukinying · 2015-09-11T08:24:36Z

Looking farther, it boils down to whether we'll support tag-specific attribute whitelist, e.g., href is whitelisted for but not
, as in the current Mail or YIV use case. This will for sure affect the list structure. As a reference, secure-handlebars attempted to support similar tag-specific attribute matching by: https://github.com/yahoo/secure-handlebars/blob/master/src/parser-utils.js#L68-L70. Here, we may need a very different structure.

Hence, should we now keep the current implementation as is before settling the use case regarding tag-specific attribute whitelist and its required the list structure? And perhaps, (if the list structure really requires a change,) let's do it in separate PRs.

Agree that the matrix of allowable tags and attributes combination should come as a separate PR.

adon-at-work · 2015-09-11T09:24:53Z

I would rather use some measurable metrics to determine the level. Could you provide a simple benchmark with nested tags with different levels, e.g. 100, 1000, 10000?

I'm not sure if I misunderstood.

I perceive that the cap is more related to what the norm is, for which no legit inputs will get truncated because of exceeding the cap. As long as the cap is not hit, 100 and 100,000 won't make a difference in terms of performance (stackPtr < 100 vs. stackPtr < 100000).

In fact, I'm confused about the relationship between the cap and anything related to benchmark, including the worst-case performance you sort of mentioned a while ago. Assume we've a 1m file. Say, we encountered 300KB (e.g., <b> x 100) of tags that are so nested to hit the cap, then the remaining 9700KB of data will be ignored. One can obviously tell the performance is even better once hitting the cap, as no further inputs will be consumed.

Determining that big enough number may sound magical, but more accurately it's very use-case specific. The best effort is to set a sufficiently big number. 100, 1000, 10000 all look reasonable to me. The best approach is to have it configurable as already did, and let users judge based on their needs. May be Mail never want to truncate people's emails to avoid complaints, then the number could be MAX of what a Number can hold.

adon-at-work · 2015-09-11T09:55:42Z

We will also need a doc (or MD) file that describes the actual tag balancing algorithm. The doc you sent me previously has too many open questions there. My preference is to have a document that clearly explains the algorithm, and we use the algorithm spec to validate if any future pull requests or bug fixes from any contributors would align with the design principles and security consideration of such.

MD (or even comments inside the code) is good, since it can be included in this open-sourced project. The description on tag balancing in already included in the README.md. I'd very much appreciate if you could directly edit/improve/polish it over there. Thanks.

yukinying · 2015-09-11T16:13:59Z

The description on tag balancing in already included in the README.md. I'd very much appreciate if you could directly edit/improve/polish it over there. Thanks.

We need more than a paragraph, and we need the algorithm rather than an overview. Without such, this pull request would not be accepted.

yukinying · 2015-09-11T16:20:01Z

MD (or even comments inside the code) is good

Also, we would not accept source code comment as a proper documentation unless they are gathered in one big comment block. The reason is that it is very easy to overlook any potential new pull requests that may wipe a single line comment or moved certain code logic and mess up all the remaining comments. The requirement may not be necessary for other libraries, yet we try to set a higher bar to make this security library be more easier manageable.

- logics simplified - optional tags are ignored during balancing - ignore self-closing solidus, which is required only for foreign elements

- imposed a max stack size, i.e., nested-levels - stack pointer is used instead of push()

- also improved the arrayLastIndexOf to use the native Array.prototype.lastIndexOf if exixts

adon-at-work · 2015-10-05T09:37:16Z

please find the changes rebased to the latest master. comments added too. @maditya @yukinying

adon-at-work force-pushed the fix-tag-balancing branch from 65129b3 to 9925fd4 Compare August 17, 2015 10:19

adon-at-work mentioned this pull request Aug 17, 2015

Add proper quotes #22

Open

adon-at-work force-pushed the fix-tag-balancing branch from 9925fd4 to 8243352 Compare August 18, 2015 03:34

yukinying reviewed Aug 19, 2015
View reviewed changes

adon-at-work force-pushed the fix-tag-balancing branch from 4f3f134 to bcd6047 Compare August 30, 2015 17:00

maditya reviewed Sep 2, 2015
View reviewed changes

adon added 2 commits October 5, 2015 15:59

refactored a misspelled var name

8e598d7

fix tag balancing logics

be7e501

- logics simplified - optional tags are ignored during balancing - ignore self-closing solidus, which is required only for foreign elements

adon-at-work force-pushed the fix-tag-balancing branch 4 times, most recently from d1ce433 to 12d5fd2 Compare October 5, 2015 09:32

adon added 2 commits October 5, 2015 17:35

updated after reviewer's comments

4c0436d

- imposed a max stack size, i.e., nested-levels - stack pointer is used instead of push()

updated comments and README

3701cb7

- also improved the arrayLastIndexOf to use the native Array.prototype.lastIndexOf if exixts

adon-at-work force-pushed the fix-tag-balancing branch from 12d5fd2 to 3701cb7 Compare October 5, 2015 09:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix tag balancing #23

Fix tag balancing #23

adon-at-work commented Aug 17, 2015

adon-at-work commented Aug 17, 2015

maditya commented Aug 17, 2015

adon-at-work commented Aug 18, 2015

adon-at-work commented Aug 18, 2015

yukinying Aug 19, 2015

adon-at-work Aug 30, 2015

adon-at-work Aug 30, 2015

maditya Sep 2, 2015

adon-at-work Sep 2, 2015

adon-at-work Sep 2, 2015

adon-at-work Sep 10, 2015

yukinying Sep 11, 2015

yukinying commented Sep 3, 2015

adon-at-work commented Sep 10, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

yukinying commented Sep 11, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

yukinying commented Sep 11, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Oct 5, 2015

Fix tag balancing #23

Are you sure you want to change the base?

Fix tag balancing #23

Conversation

adon-at-work commented Aug 17, 2015

adon-at-work commented Aug 17, 2015

maditya commented Aug 17, 2015

adon-at-work commented Aug 18, 2015

adon-at-work commented Aug 18, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yukinying commented Sep 3, 2015

adon-at-work commented Sep 10, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

yukinying commented Sep 11, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

adon-at-work commented Sep 11, 2015

yukinying commented Sep 11, 2015

yukinying commented Sep 11, 2015

adon-at-work commented Oct 5, 2015