Is your feature request related to a problem? Please describe.
In Python 2, it's non-obvious how to handle strings correctly, such that plugin code doesn't have problems when encountering non-ASCII text. Code like this can cause problems:
result = libs.run_bash(connection, "cat settings.txt | grep FIRST_NAME | awk '{print $2}'")
first_name = result.stdout
message = "The first name is {}".format(first_name)
The problem happens when the remote-side output contains non-ASCII characters. When we create message, we are calling format on a str object, not a unicode object. That means that Python needs to convert those non-ASCII characters into bytes. But we've never specified which encoding to use, so Python 2 falls back to its default ASCII codec and raises a UnicodeEncodeError.
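To make the failure concrete, here is a minimal sketch that runs on both Python 2 and Python 3. The name value is made up and stands in for result.stdout. The explicit encode("ascii") call is an assumption-labelled stand-in for what Python 2 does implicitly when a unicode value is formatted into a str template.

```python
# Hypothetical stand-in for result.stdout: a unicode value with one
# non-ASCII character (U+00EB, "e" with diaeresis).
first_name = u"Zo\u00eb"

# Python 2 performs the equivalent of this encode implicitly when it
# formats a unicode value into a str template. The explicit call below
# reproduces that step so the failure is visible on any Python version.
try:
    first_name.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

print(failed)
```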
Describe the solution you'd like
A general best practice is the so-called "Unicode Sandwich". This says to always use unicode objects, not str objects.
The only exception is when directly interacting with other code that really expects/produces sequences of bytes (not characters). Even then, you should immediately decode any received bytes before the rest of your code sees them, and you should encode characters to bytes at the last possible second before sending them out. For plugins, any strings passed to/from Delphix code already support unicode objects, so this exception does not apply to plugins.
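For code that does sit on a byte boundary, the sandwich might look roughly like the sketch below. The helper names from_wire and to_wire are hypothetical, and UTF-8 is an assumed encoding. The real encoding depends on whatever produces or consumes the bytes. As noted above, plugin code talking to Delphix should not need either helper.

```python
def from_wire(raw_bytes):
    # Hypothetical helper: decode as soon as bytes arrive.
    # UTF-8 is an assumption made for this sketch.
    return raw_bytes.decode("utf-8")


def to_wire(text):
    # Hypothetical helper: encode only at the moment of sending.
    return text.encode("utf-8")


raw = b"Zo\xc3\xab"        # bytes received from some byte-oriented source
name = from_wire(raw)      # unicode from here on
message = u"The first name is {}".format(name)
out = to_wire(message)     # bytes again, only at the outer edge
```

Everything between the two helper calls handles text only, so no implicit conversion can occur there.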
So, we want to encourage plugin authors to:
Always use unicode objects (u"Hello, World", not "Hello, World").
Never call encode or decode.
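Applied to the problematic example, the change may be as small as adding a u prefix to the template. The sketch below uses a made-up value in place of result.stdout so that it runs on its own.

```python
# Hypothetical stand-in for result.stdout, which is already a unicode object.
first_name = u"Zo\u00eb"

# The template is a unicode literal, so format() stays in unicode and
# never has to pick an encoding.
message = u"The first name is {}".format(first_name)
```

No encode or decode call appears anywhere, which matches the second guideline.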
We should:
Document this as a best practice. This includes giving examples of problematic code, as above.
Change our documentation examples so that they actually follow this best practice.
Change dvp init so that the code it generates also follows this best practice.