Fix #274 - Views Incorrectly Encode UTF Characters as HTML Entities #403
base: hotfix
Hello @GoryMoon, would you be so kind as to check the replication steps with me:
Is this something you could have a look at if you have time (characters ', '', <, >)? Thanks in advance!
I don't know if anybody read my comment in the issue... I'd really like to understand this better. The question "how is this stored in the database?" is an important one. If there's HTML encoding in the database, it's wrong. If we don't put it there in the first place, it's better. I also don't think that we should be re-implementing basic things that PHP is supposed to handle correctly, like string substitutions to encode HTML. Otherwise, you have now fixed four cases, then @serhiisamko091184 asks you for four more, and we can go on indefinitely. We should just use the complete functions from the existing libraries.
I have done some debugging to get a better understanding of when the encoding issues occur and why it works in some places. This is just a collection of stuff I found related to this issue.
From testing, it looks like the value is saved as normal and not HTML-encoded; when saving, it cleans up the input and removes some content if it's not allowed. When trying to insert
The encoding error happens on pages where GraphQL returns the data HTML-encoded and Angular handles the data. For pages where it works, either the data is not encoded when Angular receives it, or it's a legacy iframe that contains the HTML-encoded values directly. As an example: on the
Most issues where you have a As an example this is from the
Another place would be the timeline where it could be called to not encode the data from the backend code like the A
SuiteCRM-Core/core/backend/Data/LegacyHandler/PresetDataHandlers/HistoryTimelineDataHandler.php, line 283 in 117dd81
I agree with using existing functions rather than reimplementing them.
I've looked over this once again after discovering that a change in version 8.6.1 caused descriptions and other fields to have the wrong encoding. I've tested adding the string that @yunusyerli1 was using in #501 ( Compared to the previous change, this one is a bit bigger. There are more
Even after the previous changes quotes In some specific cases it double-encoded some characters, so that's why there are two This mapper is used for the API and This change resolves #510
If there's double encoding, shouldn't it be fixed by preventing the second encoding, instead of adding a second decode? I'm very concerned with the way that the craziness of v7's misguided/excessive purification is creeping into v8 too... we really, really, really must not just patch "undo" code on top of "let's mess things up" code. Twice. This is not a criticism of the current PR; it is a wider issue. Still, we shouldn't merge just because it solves the current problem. We should really check that we're doing things the right way.
The code paths could be traced, and places where double encoding happens should be eliminated, which could be difficult. Or, the PHP
Just documenting the reason for the double decoding: it's needed for the specific cases below. Without the double decoding the description looks like this: With one decode added it looks like this; the part that is double-encoded looks to be a non-HTML/XML block. And with both decodes it looks like this: This issue was introduced in f483bec for the v8.6.1 release to fix what looks like a bunch of CVEs.
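To make the double-encoding discussion above concrete, here is a minimal, language-neutral sketch in Python. It is not SuiteCRM code: it uses Python's standard `html.escape` and `html.unescape` as stand-ins for the PHP encode/decode calls, and the sample strings are invented for illustration. It shows why one decode is not enough once a value has been encoded twice, and also why a blanket second decode can corrupt text that was only encoded once.

```python
import html

# A value as a user might type it (invented sample).
original = 'Tom & "Jerry" <b>'

# Encoding twice: the second pass re-encodes the "&" of every entity.
once = html.escape(original)
twice = html.escape(once)

# One decode only undoes the outer layer; a second decode is needed.
one_decode = html.unescape(twice)
two_decodes = html.unescape(one_decode)

# The risk of always decoding twice: text that legitimately contains
# an entity-looking sequence and was encoded only once gets altered.
literal = "Use &lt; for less-than"
stored = html.escape(literal)  # encoded a single time
over_decoded = html.unescape(html.unescape(stored))

print(twice)
print(one_decode)
print(two_decodes)
print(over_decoded)
```

In this sketch the second decode restores `original`, but `over_decoded` no longer matches `literal`, which is the concern behind preventing the second encoding rather than adding a second decode.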
Description
See issue #274 for more details.
I identified that when the incorrectly encoded characters showed up, the to_html function didn't have the $toHTML global array and so used the htmlentities function instead, causing the HTML encodings.
SuiteCRM-Core/public/legacy/include/utils/db_utils.php, lines 99 to 102 in 117dd81
With the small change of defining the variable in the $GLOBALS array, the function will use that and encode it correctly.
There might be more encoding-related issues, but this fixes my issues and, as far as I can tell, the one described in the issue.
See this comment for an update on the changes to this: #403 (comment)
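The difference between the two code paths described above can be sketched as follows. This is a Python illustration, not the SuiteCRM implementation: `SPECIAL_CHARS`, `encode_with_table`, and `encode_all_entities` are hypothetical names, and the table contents are an assumption made for the example. The first function mimics encoding through a small substitution table; the second mimics an "encode everything that has a named entity" fallback, which is what turns letters such as åäö into entities.

```python
from html.entities import codepoint2name

# Hypothetical substitution table covering only characters that are
# unsafe in HTML (an assumption for this sketch, not the real $toHTML).
SPECIAL_CHARS = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;",
}


def encode_with_table(text: str) -> str:
    """Encode only the characters listed in SPECIAL_CHARS."""
    return "".join(SPECIAL_CHARS.get(ch, ch) for ch in text)


def encode_all_entities(text: str) -> str:
    """Encode every character that has a named HTML entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        out.append(f"&{name};" if name else ch)
    return "".join(out)


company = "Åäö & Co"  # invented sample value
table_encoded = encode_with_table(company)
fully_encoded = encode_all_entities(company)

print(table_encoded)
print(fully_encoded)
```

With the table, the non-ASCII letters pass through untouched; with the fallback they become named entities, which a consumer that does not decode them would display literally.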
Motivation and Context
How To Test This
a. For example, change the company name that is referenced in a Call. (Like adding åäö or other letters)
Types of changes
Final checklist