Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check num_children() == 0 in Column.from_column_view #17193

Merged

Conversation

cwharris
Copy link
Contributor

@cwharris cwharris commented Oct 28, 2024

Description

This fixes a bug where Column.from_column_view is not verifying the existence of a string column's offsets child column prior to accessing it, resulting in a segmentation fault when passing a column_view from Column.view() to Column.from_column_view(...).

The issue can be reproduced with:

import cudf
from cudf.core.column.column import as_column
df = cudf.DataFrame({'a': cudf.Series([[]], dtype=cudf.core.dtypes.ListDtype('string'))})
s = df['a']
col = as_column(s)
col2 = cudf._lib.column.Column.back_and_forth(col)
print(col)
print(col2)

where back_and_forth is defined as:

    @staticmethod
    def back_and_forth(Column input_column):
        cdef column_view input_column_view = input_column.view()
        return Column.from_column_view(input_column_view, input_column)

I don't have the expertise to write the appropriate tests for this without introducing the back_and_forth function as an API, which seems undesirable.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@cwharris cwharris added bug Something isn't working non-breaking Non-breaking change labels Oct 28, 2024
@cwharris cwharris requested a review from a team as a code owner October 28, 2024 19:05
@cwharris cwharris requested review from vyasr and mroeschke October 28, 2024 19:05
@github-actions github-actions bot added the Python Affects Python cuDF API. label Oct 28, 2024
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. We don't really have a way to test Cython-only methods at this time, as far as I am aware.

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks correct to me, thanks! I agree with Bradley, there really isn't a great way to test this right now.

@rapids-bot rapids-bot bot merged commit bf5b778 into rapidsai:branch-24.12 Oct 29, 2024
99 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants