Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep html entities like é and € escaped #138

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jeroenvdheuvel
Copy link

Keep the escaped html entities as is. Instead of changing them to their ISO 8859-1 or UTF-8 counterparts é and .

Html entities should not be touched, since not all everybody understands UTF-8 characters. By not escaping them, the user of this library stays in control.

…. Before dumping the html put the replaced `&` back again.
@jeroenvdheuvel jeroenvdheuvel force-pushed the keep_html_entities_escaped branch from 94bdb45 to 79d875b Compare May 10, 2016 11:11
@jeroenvdheuvel
Copy link
Author

jeroenvdheuvel commented May 10, 2016

@tijsverkoyen merge conflict is resolved

@jeroenvdheuvel
Copy link
Author

@tijsverkoyen could you provide comment on this issue? I'm fine with changing it if needed. When you don't like to merge it (because you don't like to feature) is okay too (but unfortunate for me).

voku added a commit to voku/CssToInlineStyles that referenced this pull request May 20, 2016
@voku
Copy link

voku commented May 20, 2016

I have a small problem with this change ...

1.) if we have a html-template with "\r" (Mac) carriage return
2.) the xml-parser will convert it into "&#13"
3.) we convert it into "!AMP!#13"

... as a hack I replaced this chars, but I don't know if there are more magic from the xml-parser that isn't covert by our unit-tests?!

@w3guy
Copy link

w3guy commented Jun 14, 2016

I am of the opinion that this should be merged. cc @tijsverkoyen

@voku
Copy link

voku commented Jun 29, 2016

@jeroenvdheuvel I moved this logic into "https://github.com/voku/simple_html_dom", so now the Dom-Parser is separated from the CssToInline class. It looks much more cleaner and we have some extra unit-tests in the "simple_html_dom"-repository. What do you think?

-> voku@b0da918

@techi602
Copy link
Contributor

I guess the issue could be solved with html_entity_decode() to decode DomDocument::saveHtml() output
Since some e-mail clients such as thunderbird or android have issues with displaying entities. Well they truly display the entities in their hex format which is not suitable for many people 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants