While chasing a Unicode-related bug, I realized that our stored JSON (on GitHub) has ugly escaped Unicode characters, e.g. in this study and this tree collection.

These Unicode characters are handled gracefully in our indexing and web apps, but the escape sequences aren't strictly needed, since we store all JSON as UTF-8. Meanwhile, they're hideous and make the stored files hard to read and search on GitHub.
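The escapes described above are Python's default JSON serialization behavior. A minimal sketch of the problem (the key and value here are made-up illustrations, not taken from a real study file):

```python
import json

# A made-up record with non-ASCII characters, for illustration only.
record = {"reference": "Müller & García, 2015"}

# json.dumps escapes every non-ASCII character by default (ensure_ascii=True),
# so the stored text contains \uXXXX sequences instead of readable characters.
print(json.dumps(record))
# → {"reference": "M\u00fcller & Garc\u00eda, 2015"}
```

This is what makes the files on GitHub hard to read and, worse, impossible to find with a plain-text search for the original characters.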
Is this something we want or need to fix?
Would this fix apply to all document types (studies, tree collections, tax. amendments)?
Are there other clients or use cases that would be broken by this change?
If we want to restore pretty Unicode for data saved in the future, it all seems to boil down to a single call to json.dump in peyotl that's used for all JSON docs. If we add ensure_ascii=False to this call as shown here, it should save Unicode characters directly (sans escapes) in phylesystem.
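As a sketch of the proposed change (not the actual peyotl call; the record and indent are assumptions), passing ensure_ascii=False to json.dump writes the characters through literally, provided the destination stream accepts UTF-8 text:

```python
import io
import json

# Made-up record standing in for a study document.
record = {"reference": "Müller & García, 2015"}

# With ensure_ascii=False, json.dump emits the characters themselves rather
# than ASCII-safe \uXXXX escapes, so the stream must handle non-ASCII text.
out = io.StringIO()
json.dump(record, out, indent=2, ensure_ascii=False)
print(out.getvalue())
```

When writing to disk, the file should be opened with an explicit encoding, e.g. open(path, "w", encoding="utf-8"), so the change behaves the same on platforms whose default encoding is not UTF-8.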