Encoding malicious code instead of removing it #298

Open
bmscodespace opened this issue Jan 12, 2024 · 4 comments

Comments

@bmscodespace

bmscodespace commented Jan 12, 2024

Hi,

is it possible to build a policy that, instead of removing problematic parts of an HTML string, just encodes those parts so that they can do no harm when the string is used in an HTML page?
So
<script>alert`1`</script>
would then be replaced by something like
&lt;script&gt;alert`1`&lt;/script&gt;.

Thank you for any answer ;)

P.S. The idea behind my question is that I would like to use a policy that does not need to know whether it is dealing with a string that will be used as inner HTML or as an "ordinary text field" with no HTML, where the text might simply talk about the "<script>" tag. If malicious code is removed by the sanitizer, this could destroy "ordinary text". On the other hand, if I used output encoding on my string, I would lose text formatting in the inner HTML case.

@csware
Contributor

csware commented Jan 23, 2024

Please provide an example where it is not working as expected.

@bmscodespace
Author

bmscodespace commented Jan 24, 2024

Hi @csware ,

suppose a string is imported into an application, and suppose we can't know whether it will be used as inner HTML (e.g. as formatted text) or as a data string.

Suppose first that, to secure the text before it is displayed, we always sanitize it. But if that text is, e.g.,

"A script tag begins with <script> and ends like </script>" ,

then with no appropriate policy, the string

"A script tag begins with"

might reach the view, and text that we want displayed is missing.

On the other hand, if I just encode every string that is imported, and one such string is formatted text (with some p tags, list tags, b tags, etc.) that is used as inner HTML, then I lose the formatting.

My question is whether it is possible to secure a text when it is not clear if it will be used as a data string or as inner HTML. I hope this makes it a little bit clearer ;)

@csware
Contributor

csware commented Jan 28, 2024

I suppose this could be achieved using a preprocessor.

However, the input is not correctly encoded. If <script> is meant to be shown on the screen as text, and the same string also contains HTML tags for formatting, then it needs to be properly encoded (as &lt;script&gt;) in the first place.
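
The preprocessing idea could be sketched as a plain string pre-pass that runs before the sanitizer. This is only an illustration and not the sanitizer's own preprocessor API: the class name `UnknownTagEscaper`, the method `escapeUnknownTags`, and the allowlist of formatting tags are all assumptions made up for this example. A regex pass like this does not make input safe by itself (tags that are kept still carry their attributes), so the result would still have to go through the sanitizer policy.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: escapes tags that are not on a small formatting
// allowlist so they show up as text instead of being dropped later.
final class UnknownTagEscaper {
    // Assumed allowlist; adjust to whatever the real policy permits.
    private static final Set<String> FORMATTING_TAGS =
            Set.of("p", "b", "i", "em", "strong", "ul", "ol", "li", "br");

    // Matches an opening or closing tag and captures the element name.
    private static final Pattern TAG =
            Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)[^>]*>");

    private UnknownTagEscaper() {}

    static String escapeUnknownTags(String input) {
        Matcher m = TAG.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String tag = m.group();
            String name = m.group(1).toLowerCase(Locale.ROOT);
            // Keep allowlisted tags verbatim, turn everything else into text.
            String replacement = FORMATTING_TAGS.contains(name)
                    ? tag
                    : tag.replace("<", "&lt;").replace(">", "&gt;");
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, the example string "A script tag begins with <script> and ends like </script>" becomes "A script tag begins with &lt;script&gt; and ends like &lt;/script&gt;", while "<p>Hello <b>world</b></p>" passes through unchanged.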

@Dashlet26

Hi, @bmscodespace
To answer your question: yes, it is possible to build a policy that encodes problematic parts of an HTML string instead of removing them. This approach is known as HTML encoding or output encoding, and it helps prevent XSS attacks by converting potentially harmful characters into their HTML entity equivalents.
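
For reference, plain output encoding can be sketched in a few lines of Java. The class name `HtmlText` and the method `encode` are hypothetical names chosen for this example and are not part of any sanitizer library. Note that this encodes every tag, including formatting tags such as b or p, which is exactly the loss of formatting described earlier in the thread.

```java
// Hypothetical helper showing plain HTML output encoding.
final class HtmlText {
    private HtmlText() {}

    // Replaces the characters that are significant in HTML with entities.
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling `HtmlText.encode` on the string <script>alert`1`</script> yields &lt;script&gt;alert`1`&lt;/script&gt;, matching the output asked for in the original question.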
