urlencode nested serializer temp file names so they don't explode stuff #15068

Merged


@clintropolis clintropolis commented Oct 3, 2023

Description

Fixes a bug introduced by #14919, which used the raw column name as part of a temp file name (not great, my bad). Temp file names are now built with StringUtils.urlEncode, so column names containing characters that are awkward in file names no longer break things. The modified test fails without the changes in this PR.

This PR has:

  • been self-reviewed.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.
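
For readers skimming the thread, here is a minimal sketch of the idea in the description: encode the column name before it becomes part of a temp file name. This is not the code from this PR. The `urlEncode` helper is a stand-in for Druid's `StringUtils.urlEncode` (assumed here to percent-encode as UTF-8), and `tempFileName`, `makeTempFile`, and the `.long_dict` suffix are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Path;

class TempFileNameSketch
{
  // Stand-in for Druid's StringUtils.urlEncode (assumption: UTF-8 percent-encoding,
  // with spaces written as %20 instead of '+').
  static String urlEncode(String s)
  {
    try {
      return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(e);
    }
  }

  // Hypothetical helper: the column name is encoded, the suffix is used as-is.
  static String tempFileName(String columnName, String suffix)
  {
    return urlEncode(columnName) + suffix + ".tmp";
  }

  // Hypothetical helper: creates the temp file under the given directory.
  static Path makeTempFile(Path dir, String columnName, String suffix) throws IOException
  {
    return Files.createFile(dir.resolve(tempFileName(columnName, suffix)));
  }
}
```

With this sketch, a column named `a/b c` maps to a single file name instead of being read as a nested path, because `/` and the space are percent-encoded.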

@@ -135,7 +135,9 @@ public int lookupString(@Nullable String value)
   public int lookupLong(@Nullable Long value)
   {
     if (longDictionary == null) {
-      Path longFile = makeTempFile(name + NestedCommonFormatColumnSerializer.LONG_DICTIONARY_FILE_NAME);
+      final Path longFile = makeTempFile(

note: it seems makeTempFile is a method of this class, so the StringUtils.urlEncode(name) call could be pushed into it; call sites would then look like:

makeTempFile(NestedCommonFormatColumnSerializer.LONG_DICTIONARY_FILE_NAME)

The method already appends .tmp, so it could take care of the encoding as well.
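
A rough sketch of what the suggestion above might look like, assuming the class holds the column name in a `name` field. This does not reflect the merged change. The class name `ColumnTempFilesSketch`, the `tempFileName` helper, the `dir` parameter, and the local `urlEncode` (a stand-in for `StringUtils.urlEncode`) are all hypothetical.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Path;

class ColumnTempFilesSketch
{
  private final String name;

  ColumnTempFilesSketch(String name)
  {
    this.name = name;
  }

  // Stand-in for StringUtils.urlEncode (assumption: UTF-8 percent-encoding).
  static String urlEncode(String s)
  {
    try {
      return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Callers pass only the suffix; the encoded column name and ".tmp" are added here.
  String tempFileName(String suffix)
  {
    return urlEncode(name) + suffix + ".tmp";
  }

  // Hypothetical variant of makeTempFile that owns the encoding step.
  Path makeTempFile(Path dir, String suffix) throws IOException
  {
    return Files.createFile(dir.resolve(tempFileName(suffix)));
  }
}
```

The appeal of this shape is that no call site can forget to encode; the cost is that the method's contract changes, since it now expects a suffix and not a full name.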


I was thinking of that too, but if we do that, we should rename the method so that callers don't end up encoding twice.
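
To illustrate the double-encoding concern: percent-encoding is not idempotent, so a caller that encodes a name and then passes it to a method that encodes again gets a different file name. A small sketch, with `urlEncode` as a stand-in for `StringUtils.urlEncode` (assumed to be UTF-8 percent-encoding) and `DoubleEncodingSketch` as a hypothetical class name:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class DoubleEncodingSketch
{
  // Stand-in for StringUtils.urlEncode (assumption: UTF-8 percent-encoding).
  static String urlEncode(String s)
  {
    try {
      return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args)
  {
    String once = urlEncode("a/b");   // '/' becomes %2F
    String twice = urlEncode(once);   // '%' itself becomes %25
    System.out.println(once);         // prints a%2Fb
    System.out.println(twice);        // prints a%252Fb
  }
}
```

Both results are valid file names, but they differ, so any code that later looks the file up by the singly encoded name would miss it.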

@abhishekagarwal87 abhishekagarwal87 merged commit 3afe09a into apache:master Oct 5, 2023
74 checks passed
@clintropolis clintropolis deleted the fix-nested-column-temp-files branch October 5, 2023 04:46
ektravel pushed a commit to ektravel/druid that referenced this pull request Oct 16, 2023
apache#15068)

CaseyPan pushed a commit to CaseyPan/druid that referenced this pull request Nov 17, 2023
apache#15068)
