Refrain from transcoding SBE field names in snake_case #79

Open
salsferrazza opened this issue Dec 20, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@salsferrazza
Collaborator

I believe there is an option in the SBE decoding library that snake-cases all of the decoded field names. This should be suppressed, and the default should be verbatim transcoding of each field name as specified in the schema.

@mservidio
Collaborator

mservidio commented Dec 22, 2022

@salsferrazza Yes, field names are converted using this:

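import re

def convert_to_underscore(name):
    # Strip any leading/trailing '@' and '#' characters from the name.
    name = name.strip('@').strip('#')
    # Insert '_' before each capitalized word, then at lower/digit-to-upper boundaries.
    sub_str = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', sub_str).lower()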
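
To make the effect concrete, here is a minimal sketch that runs the function above on a few SBE-style names. The sample names are illustrative only and are not taken from any particular schema. It also shows one reason verbatim transcoding may be preferable: the conversion is lossy, so two distinct schema names can collapse into the same output name.

```python
import re

def convert_to_underscore(name):
    # Same logic as the function quoted above.
    name = name.strip('@').strip('#')
    sub_str = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', sub_str).lower()

# Illustrative field names (assumed, not from a specific schema).
samples = ["MDEntryPx", "SecurityID", "securityId", "TransactTime"]
converted = {s: convert_to_underscore(s) for s in samples}

for original, snake in converted.items():
    print(f"{original} -> {snake}")
# MDEntryPx -> md_entry_px
# SecurityID -> security_id
# securityId -> security_id   (collides with SecurityID)
# TransactTime -> transact_time
```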

However, naming requirements differ per output type. For example, BigQuery won't accept a dash '-' in a column name. So if we did something like this, we would still need some way to sanitize field names based on each output type's requirements.

See: https://cloud.google.com/bigquery/docs/schemas
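
One possible shape for this, sketched under assumptions: keep the schema's field name verbatim by default, and apply a sanitizer only where the output type requires one. The function names, the `SANITIZERS` registry, and the output type keys (`bigquery`, `json`) are hypothetical and do not exist in this project. The BigQuery rule assumed here is the standard one for column names: letters, digits, and underscores only, starting with a letter or underscore.

```python
import re

def _sanitize_bigquery(name: str) -> str:
    # Assumed rule: only letters, digits, and underscores are allowed,
    # and the first character must be a letter or underscore.
    cleaned = re.sub(r'[^A-Za-z0-9_]', '_', name)
    if not re.match(r'[A-Za-z_]', cleaned):
        cleaned = '_' + cleaned
    return cleaned

# Hypothetical registry mapping an output type to its sanitizer.
SANITIZERS = {
    'bigquery': _sanitize_bigquery,
}

def sanitize_field_name(name: str, output_type: str) -> str:
    # Output types without a registered sanitizer get the name verbatim.
    sanitizer = SANITIZERS.get(output_type)
    return sanitizer(name) if sanitizer else name
```

With this approach a name such as `MDEntryPx` passes through unchanged for every output, while a name containing a dash is altered only for the outputs that cannot accept it.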

mservidio added the enhancement (New feature or request) label Dec 22, 2022