Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Concatenate an empty and non-empty series #6157

Conversation

marlenezw
Copy link
Contributor

Hi! I updated the concat function to allow concatenating an empty series with a non-empty series. I also changed the test for concat_empty_series so dtypes for empty series are ignored. Looks like its passing tests and working on my end.

@marlenezw marlenezw requested a review from a team as a code owner September 4, 2020 15:51
@GPUtester
Copy link
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@shwina
Copy link
Contributor

shwina commented Sep 4, 2020

@marlenezw to trigger CI, you'll need to update CHANGELOG.md and push that change. Add an entry with the PR number and title to it under cuDF 0.16 (bug fixes).

@rgsl888prabhu
Copy link
Contributor

rgsl888prabhu commented Sep 4, 2020

I am not sure if this scenario has been taken care

>>> import cudf
>>> sr1 = cudf.Series([1, 2, 3, 4])
>>> sr2 = cudf.Series()
>>> cudf.concat([sr1, sr2], axis=1, ignore_index=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/reshape.py", line 249, in concat
    df.index = None
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 572, in __setattr__
    object.__setattr__(self, key, col)
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 2522, in index
    new_length = len(value)
TypeError: object of type 'NoneType' has no len()

@marlenezw
Copy link
Contributor Author

I am not sure if this scenario has been taken care

>>> import cudf
>>> sr1 = cudf.Series([1, 2, 3, 4])
>>> sr2 = cudf.Series()
>>> cudf.concat([sr1, sr2], axis=1, ignore_index=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/reshape.py", line 249, in concat
    df.index = None
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 572, in __setattr__
    object.__setattr__(self, key, col)
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 2522, in index
    new_length = len(value)
TypeError: object of type 'NoneType' has no len()

Oof, I'll make some changes to this! Thanks for the catch @rgsl888prabhu

@marlenezw marlenezw changed the title [REVIEW] Concatenate an empty and non-empty series [WIP] Concatenate an empty and non-empty series Sep 4, 2020
@marlenezw
Copy link
Contributor Author

Changing it to WIP while I make changes.

@codecov
Copy link

codecov bot commented Sep 4, 2020

Codecov Report

Merging #6157 into branch-0.16 will increase coverage by 0.32%.
The diff coverage is 85.71%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.16    #6157      +/-   ##
===============================================
+ Coverage        84.30%   84.63%   +0.32%     
===============================================
  Files               82       82              
  Lines            13745    14092     +347     
===============================================
+ Hits             11588    11927     +339     
- Misses            2157     2165       +8     
Impacted Files Coverage Δ
python/cudf/cudf/core/reshape.py 89.06% <85.71%> (+0.21%) ⬆️
python/cudf/cudf/io/feather.py 100.00% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py 0.00% <0.00%> (ø)
python/custreamz/custreamz/_version.py 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/_version.py 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_orc.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_json.py 100.00% <0.00%> (ø)
...ython/custreamz/custreamz/tests/test_dataframes.py 100.00% <0.00%> (ø)
...ython/dask_cudf/dask_cudf/io/tests/test_parquet.py 100.00% <0.00%> (ø)
... and 32 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 338968e...fe43053. Read the comment docs.

@marlenezw
Copy link
Contributor Author

I added some changes that should hopefully have fixed this issue.

I am not sure if this scenario has been taken care

>>> import cudf
>>> sr1 = cudf.Series([1, 2, 3, 4])
>>> sr2 = cudf.Series()
>>> cudf.concat([sr1, sr2], axis=1, ignore_index=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/reshape.py", line 249, in concat
    df.index = None
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 572, in __setattr__
    object.__setattr__(self, key, col)
  File "/home/rgsl888/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/dataframe.py", line 2522, in index
    new_length = len(value)
TypeError: object of type 'NoneType' has no len()

@marlenezw marlenezw added the 2 - In Progress Currently a work in progress label Sep 8, 2020
@marlenezw marlenezw self-assigned this Sep 8, 2020
docs/cudf/source/conf.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_concat.py Show resolved Hide resolved
python/cudf/cudf/core/reshape.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@marlenezw marlenezw requested a review from cwharris September 15, 2020 17:28
Copy link
Contributor

@cwharris cwharris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 🎉

@kkraus14 kkraus14 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs cuDF (Python) Reviewer labels Sep 15, 2020
@kkraus14 kkraus14 merged commit ca45852 into rapidsai:branch-0.16 Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] No error message when concating an empty Series with a non-empty Series
9 participants