Replace escaped Unicode chars (\u20ac) in stored JSON? #173

Open
jimallman opened this issue Feb 22, 2017 · 2 comments

Comments

@jimallman
Member

While chasing a Unicode-related bug, I realized that our stored JSON (on GitHub) has ugly escaped Unicode characters, e.g. in this study and this tree collection.

These Unicode characters are handled gracefully in our indexing and web apps, but the escape sequences aren't strictly needed, since we store all JSON as UTF-8. Meanwhile, they're hideous and make it hard to read and search the stored files on GitHub.

  • Is this something we want or need to fix?
  • Would this fix apply to all document types (studies, tree collections, taxonomic amendments)?
  • Are there other clients or use cases that would be broken by this change?

If we want to restore pretty Unicode for data saved in the future, it seems to all boil down to a single call to json.dump in peyotl that's used for all JSON docs. If we add ensure_ascii=False to this call as shown here, it should save Unicode characters directly (sans escape) in phylesystem.
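
To make the difference concrete, here is a minimal Python 3 sketch of the proposed change. It is not peyotl's actual code: `write_json`, the `indent` and `sort_keys` settings, and the sample document are assumptions made up for illustration. Only `json.dump`/`json.dumps` and the `ensure_ascii` parameter come from the standard library.

```python
import json
import os
import tempfile


def write_json(doc, path):
    """Hypothetical helper (not peyotl's real function) that saves a
    document with Unicode characters written directly as UTF-8."""
    # Open the file as UTF-8 text so non-ASCII characters can be written.
    with open(path, "w", encoding="utf-8") as out:
        # ensure_ascii=False keeps characters such as the euro sign as-is
        # instead of emitting \uXXXX escape sequences.
        json.dump(doc, out, ensure_ascii=False, indent=2, sort_keys=True)


# Made-up sample document containing the euro sign (U+20AC).
doc = {"currency": "\u20ac"}

escaped = json.dumps(doc)                       # default: ensure_ascii=True
readable = json.dumps(doc, ensure_ascii=False)  # proposed setting

# Both forms parse back to the same data, so readers are unaffected.
assert json.loads(escaped) == json.loads(readable) == doc

# Round-trip through a real file to confirm the stored bytes are UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.json")
    write_json(doc, path)
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)
```

Because any conforming JSON parser decodes both forms to the same string, this should only affect how the stored files look, not what clients read from them. Files already stored would keep their escapes until they are next saved.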

@jimallman changed the title from "Replace encoded Unicode chars (\u20ac) in stored JSON?" to "Replace escaped Unicode chars (\u20ac) in stored JSON?" on Feb 22, 2017
@jimallman
Member Author

See related Python docs for json.dump here.

@jar398
Member

jar398 commented Feb 22, 2017

  • Yes, 'we' want to fix it (I have always urged the project to be UTF-8 only)
  • Apply everywhere
  • I doubt it, and if there are, we'll find out and can fix them
