LEAF-4581 - extended character sets #2612
Merged
This change is mainly for emojis and any other data that falls in the 4-byte UTF-8 range. We were notified of the problem when a form grid had issues saving the smiley character. There are database changes as well as a small code change.
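To make "the 4 byte space" concrete, here is a minimal Python sketch (not part of this PR's code; the helper names `utf8_width` and `needs_utf8mb4` are made up for illustration). It assumes the usual UTF-8 rule that any code point above U+FFFF takes four bytes, which is the range a 3-byte column character set cannot store.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """True if any character in text lies above U+FFFF (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Hypothetical sample inputs: plain ASCII, an accented letter, and an emoji.
samples = {
    "plain": "cafe",
    "accented": "caf\u00e9",      # e with acute accent, 2 bytes in UTF-8
    "emoji": "ok \U0001F600",     # grinning face, 4 bytes in UTF-8
}

for label, text in samples.items():
    widths = [utf8_width(ch) for ch in text]
    print(label, widths, needs_utf8mb4(text))
```

Only the emoji sample reports a 4-byte character; the accented letter fits in two bytes, so it is not affected by the 3-byte limit.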
Set Names
The SET NAMES command tells the server what character encoding the client (your PHP script) uses when sending queries. This setting also influences how data is returned from the server. https://dev.mysql.com/doc/refman/8.4/en/set-names.html
With utf8mb3 and latin1 columns and this setting, any data the column cannot store, such as emojis, is replaced with ?. A latin1 column replaces 😀 with a single ?, while a utf8mb3 column replaces 😀 with ????. I think this is better than having things like serialized data break on characters that fail to save. It also serves as a stopgap while we switch all of the data over.
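The replacement behavior described above can be modeled roughly in Python. This is only a model of what this description reports, not MySQL's conversion code: the function name `model_replacement` is hypothetical, and latin1 is approximated with Python's cp1252 codec as an assumption.

```python
def model_replacement(text: str, column_charset: str) -> str:
    """Approximate what a column keeps, per the behavior described in this PR."""
    out = []
    for ch in text:
        if column_charset == "latin1":
            # Assumption: approximate MySQL latin1 with Python's cp1252 codec.
            try:
                ch.encode("cp1252")
                out.append(ch)
            except UnicodeEncodeError:
                out.append("?")       # one ? per unsupported character
        elif column_charset == "utf8mb3":
            # Characters above U+FFFF need 4 bytes; modeled as four ? marks.
            out.append("????" if ord(ch) > 0xFFFF else ch)
        else:
            # utf8mb4 (or anything else in this sketch) stores the text as-is.
            out.append(ch)
    return "".join(out)


for charset in ("latin1", "utf8mb3", "utf8mb4"):
    print(charset, model_replacement("caf\u00e9 \U0001F600", charset))
```

In this model the accented letter survives in all three character sets, and only the emoji is replaced on latin1 and utf8mb3 columns.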
Why not convert everything now?
Right now a few conversions are blocked by foreign key constraints, and we will need to handle those tables differently. We also have latin1 columns instead of utf8mb3/utf8mb4; from what I can tell the data should convert properly, but extra testing will be needed.
Testing
There is a database change, so you will need to run the database updater if testing manually.
When testing, use special characters such as ALT+0233 (é) as well as the emoji keyboard (https://support.microsoft.com/en-us/windows/windows-keyboard-tips-and-tricks-588e0b72-0fff-6d3f-aeee-6e5116097942).
The original issue was with form grids and emojis. The smiley face would break the serialized data, so any data after the emoji would not be saved.
I will need to work with the testing team on tests, since I am not sure of the best approach at the moment.
Automated Test
department-of-veterans-affairs/LEAF-Automated-Tests#22