Add best practices doc section for string manipulation #385

mothslaw · 2021-06-17T19:21:40Z

Is your feature request related to a problem? Please describe.
In Python 2, it's non-obvious how to handle strings correctly, such that plugin code doesn't have problems when encountering non-ASCII text. Code like this can cause problems:

result = libs.run_bash(connection, "cat settings.txt | grep FIRST_NAME | awk '{print $2}'")
first_name = result.stdout
message = "The first name is {}".format(first_name)

The problem appears when the remote-side output contains non-ASCII characters. When we create message, we are calling format on an str object, not a unicode object, so Python has to convert those non-ASCII characters into bytes. But we've never specified which encoding to use, so Python 2 falls back to its default codec (ASCII, unless it has been reconfigured) and raises a UnicodeEncodeError.
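To make the failure concrete, here is a minimal sketch that spells out the conversion Python 2 performs implicitly when a unicode value is formatted into a str. The sample name and the helper implicit_str_conversion are hypothetical, chosen only for illustration; the explicit encode("ascii") call is an assumption standing in for the hidden default-codec step, not something the original snippet contains.

```python
# Hypothetical value standing in for result.stdout; not taken from a real run.
first_name = u"Jos\u00e9"


def implicit_str_conversion(text):
    """Mimic the hidden step: encode unicode text with the ASCII codec.

    Python 2 does the equivalent of this when a unicode value is
    substituted into a plain str via str.format.
    """
    return text.encode("ascii")


failed = False
try:
    implicit_str_conversion(first_name)
except UnicodeEncodeError:
    # The non-ASCII character cannot be represented in ASCII.
    failed = True
```

A purely ASCII name would pass through this step unnoticed, which is why the bug tends to surface only once real-world data arrives.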

Describe the solution you'd like

A general best practice is the so-called "Unicode Sandwich". This says to always use unicode objects, not str objects.

The only exception is when directly interacting with other code that really expects/produces sequences of bytes (not characters). Even so, you should immediately decode any received bytes before the rest of your code sees them, and you should encode characters to bytes at the last possible second before sending them out. For plugins, any strings passed to/from Delphix code already support unicode objects, so this exception does not apply to plugins.
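For code that does sit on a byte boundary (which, as noted, plugin code normally does not), the sandwich might look like the sketch below. The function names, the sample bytes, and the choice of UTF-8 are all assumptions made for illustration.

```python
import io


def read_first_name(raw_bytes, encoding="utf-8"):
    # Input boundary: decode bytes to text immediately.
    return raw_bytes.decode(encoding).strip()


def build_message(first_name):
    # Middle of the sandwich: only unicode objects are handled here.
    return u"The first name is {}".format(first_name)


def write_message(stream, message, encoding="utf-8"):
    # Output boundary: encode at the last possible moment.
    stream.write(message.encode(encoding))


# Hypothetical UTF-8 input, as it might arrive from a byte-oriented source.
raw = b"Jos\xc3\xa9\n"
sink = io.BytesIO()
write_message(sink, build_message(read_first_name(raw)))
```

The encoding is named explicitly at both edges, so nothing in between ever depends on an implicit default codec.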

So, we want to encourage plugin authors to:

Always use unicode objects (u"Hello, World", not "Hello, World").
Never call encode or decode.
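Applied to the problematic snippet from the description, the two rules above amount to a one-character change: the format string becomes a unicode literal. The sketch below uses a hypothetical FakeResult stand-in so that it runs without the plugin libraries; it is not the real return type of libs.run_bash, and the sample name is invented.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by libs.run_bash;
# only the stdout attribute used in the original snippet is modelled.
FakeResult = namedtuple("FakeResult", ["stdout"])
result = FakeResult(stdout=u"Jos\u00e9")

first_name = result.stdout
# The u prefix keeps the whole expression in unicode, so no implicit
# conversion to bytes takes place and no encoding is ever needed.
message = u"The first name is {}".format(first_name)
```

No encode or decode call appears anywhere, in line with the second rule.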

We should:

Document this as a best practice. This includes giving examples of problematic code, as above.
Change our documentation examples so that they actually follow this best practice.
Change dvp init so that the code it generates also follows this best practice.

mothslaw self-assigned this Jun 17, 2021

mothslaw changed the title ~~Add best practices doc section for string manipuation~~ Add best practices doc section for string manipulation Jun 17, 2021

mothslaw mentioned this issue Oct 4, 2021

Fixes #401 Networking capabilities should be documented #402

Merged