Fix possible crash in HTML CSS image extraction
When processing UTF-8 HTML code, the image extraction logic may panic if the string contains a multi-byte grapheme that includes a '(', ')', whitespace, or one of the other characters used to split the text when searching for the base64 image content. The panic occurs because the `split_at()` method panics when the split position is not on a UTF-8 character boundary, which is what happens if you try to split in the middle of a Unicode grapheme.

This commit fixes the issue by processing the HTML string one grapheme at a time instead of one character (byte) at a time. The `grapheme_indices()` method is used to get the byte offset of the start of each grapheme, which is a safe position for splitting the string.
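The idea behind the fix can be sketched with the standard library alone. This is not the code from this commit: `split_at_first_delim` is a hypothetical helper, and it uses `char_indices()` from `std` in place of `grapheme_indices()` (presumably from the `unicode-segmentation` crate) so that the sketch has no dependencies. Both iterators yield byte offsets that fall on a character boundary, which is what `split_at()` requires.

```rust
/// Hypothetical helper (not from the commit): split `text` at the first
/// character found in `delims`, returning the part before the delimiter
/// and the part that starts with it.
fn split_at_first_delim<'a>(text: &'a str, delims: &[char]) -> Option<(&'a str, &'a str)> {
    // char_indices() yields (byte_offset, char) pairs. Every offset sits on
    // a UTF-8 character boundary, so split_at() cannot panic on it.
    text.char_indices()
        .find(|(_, c)| delims.contains(c))
        .map(|(byte_pos, _)| text.split_at(byte_pos))
}
```

For example, calling `split_at_first_delim("/* café */ url(data:image/png;base64,AAAA)", &['('])` splits just before the `(`, even though the multi-byte `é` earlier in the string makes the byte offset differ from the character count. Iterating by grapheme, as the commit does, additionally keeps a base character together with any combining marks attached to it.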
1 parent 7a6fe78, commit 2a21451
Showing 4 changed files with 36 additions and 27 deletions.