decrease memory footprint of drop_duplicates
VibhuJawa committed Nov 21, 2023
1 parent 6069f3c commit 64ec881
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion python/cugraph/cugraph/structure/symmetrize.py
@@ -300,5 +300,5 @@ def _memory_efficient_drop_duplicates(ddf, vertex_col_name, num_workers):
"""
# drop duplicates has a 5x+ overhead
ddf = ddf.reset_index(drop=True).repartition(npartitions=num_workers * 2)
ddf = ddf.drop_duplicates(subset=[*vertex_col_name], ignore_index=True)
ddf = ddf.drop_duplicates(subset=[*vertex_col_name], ignore_index=True, split_out=num_workers*2)
return ddf
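
The change passes `split_out=num_workers*2` so that the deduplicated result is spread over several output partitions instead of being concentrated in a few. The sketch below is a pure-Python illustration of that idea only, not Dask's or cuGraph's implementation: the function `drop_duplicates_split_out`, the list-of-dicts row format, and the sample edges are all hypothetical. It assumes rows are routed to an output bucket by hashing the key columns, so identical keys always land in the same bucket and each bucket can be deduplicated independently.

```python
from collections import defaultdict


def drop_duplicates_split_out(partitions, subset, split_out):
    """Deduplicate rows spread over several input partitions.

    Hypothetical helper for illustration. Each row is sent to one of
    `split_out` buckets chosen by hashing its key columns, so equal keys
    always meet in the same bucket and no single bucket has to hold the
    entire deduplicated result.
    """
    buckets = defaultdict(dict)  # bucket id -> {key: first row seen}
    for part in partitions:
        for row in part:
            key = tuple(row[col] for col in subset)
            bucket_id = hash(key) % split_out
            # Keep only the first row seen for each key.
            buckets[bucket_id].setdefault(key, row)
    # Always return exactly `split_out` partitions (some may be empty).
    return [list(buckets[i].values()) for i in range(split_out)]


# Three input partitions of edges with duplicates across partitions.
parts = [
    [{"src": 0, "dst": 1}, {"src": 1, "dst": 2}],
    [{"src": 0, "dst": 1}, {"src": 2, "dst": 3}],
    [{"src": 1, "dst": 2}, {"src": 3, "dst": 0}],
]
num_workers = 2
out = drop_duplicates_split_out(parts, ["src", "dst"], split_out=num_workers * 2)
unique_edges = sorted((r["src"], r["dst"]) for p in out for r in p)
```

Here `out` holds `num_workers * 2` partitions and `unique_edges` contains each edge once. The peak size of any one bucket is bounded by its share of the unique keys, which is the intuition behind the smaller memory footprint named in the commit title.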
