Refactor ORC dictionary encoding to migrate to the new cuco::static_map #17049

Merged 10 commits on Oct 12, 2024

Member

@mhaseeb123 mhaseeb123 commented Oct 10, 2024

Description

Part of #12261. This PR refactors the ORC writer's dictionary encoding to migrate from cuco::legacy::static_map to the new cuco::static_map. No performance impact was measured; see the benchmark results in the comment below.
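For readers unfamiliar with the technique, dictionary encoding in general replaces each value in a column with an index into a table of distinct values, and a hash map tracks which values have already been seen. The following is a minimal host-side sketch of that idea, using `std::unordered_map` as a single-threaded stand-in for the GPU hash map. It is not the libcudf or cuco code touched by this PR; `DictEncoded`, `dictionary_encode`, and `dictionary_decode` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Result of dictionary-encoding a column: the distinct values in
// first-seen order, plus one dictionary index per input row.
struct DictEncoded {
  std::vector<std::int64_t> dictionary;
  std::vector<std::uint32_t> indices;
};

// Illustrative CPU version only (assumption: not how the ORC writer does it
// on the GPU). The map stores value -> position in the dictionary.
inline DictEncoded dictionary_encode(const std::vector<std::int64_t>& column) {
  DictEncoded out;
  out.indices.reserve(column.size());
  std::unordered_map<std::int64_t, std::uint32_t> value_to_index;
  for (std::int64_t value : column) {
    // try_emplace inserts only if the value is new; the candidate index is
    // the current dictionary size.
    auto [it, inserted] = value_to_index.try_emplace(
        value, static_cast<std::uint32_t>(out.dictionary.size()));
    if (inserted) { out.dictionary.push_back(value); }
    out.indices.push_back(it->second);
  }
  return out;
}

// Inverse operation: look every index up in the dictionary.
inline std::vector<std::int64_t> dictionary_decode(const DictEncoded& enc) {
  std::vector<std::int64_t> column;
  column.reserve(enc.indices.size());
  for (std::uint32_t idx : enc.indices) {
    assert(idx < enc.dictionary.size());
    column.push_back(enc.dictionary[idx]);
  }
  return column;
}
```

Encoding pays off when the column has few distinct values relative to its length, which is consistent with the smaller encoded file sizes at low, non-zero cardinality in the benchmark table further down.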

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions bot added the libcudf label Oct 10, 2024
@mhaseeb123 self-assigned this Oct 10, 2024
@mhaseeb123 added the 2 - In Progress, improvement, non-breaking, cuIO, and cuco labels Oct 10, 2024
@mhaseeb123
Member Author

mhaseeb123 commented Oct 10, 2024

Performance Results

Overall: No measurable improvement or regression

Benchmark:

./ORC_WRITER_NVBENCH -d 0 -b orc_write_encode

Hardware

GPU: NVIDIA RTX 5880 Ada Generation
SM Version: 890 (PTX Version: 860)
Number of SMs: 110
SM Default Clock Rate: 18446744071874 MHz
Global Memory: 23879 MiB Free / 48632 MiB Total
Global Memory Bus Peak: 960 GB/sec (384-bit DDR @10001MHz)
Max Shared Memory: 100 KiB/SM, 48 KiB/Block
L2 Cache Size: 98304 KiB
Maximum Active Blocks: 24/SM
Maximum Active Threads: 1536/SM, 1024/Block
Available Registers: 65536/SM, 65536/Block
ECC Enabled: No

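Improvement (pct_change is the relative reduction in GPU time, in percent; positive values mean the new code is faster)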

|    data_type    | cardinality | run_length |  GPUTime_old  |  GPUTime_new  |   pct_change   | bytes_per_sec_old | bytes_per_sec_new | peak_mem_use | enc_file_size |
|-----------------|-------------|------------|---------------|---------------|----------------|-------------------|-------------------|--------------|---------------|
| INTEGRAL_SIGNED |           0 |          1 |     31.192 ms |     31.204 ms |   -0.038471403 |       17211608525 |       17205466243 |    1.391 GiB |   511.692 MiB |
| INTEGRAL_SIGNED |        1000 |          1 |     90.620 ms |     90.665 ms |   -0.049657912 |        5924414468 |        5921487685 |    1.391 GiB |   328.830 MiB |
| INTEGRAL_SIGNED |       10000 |          1 |     38.486 ms |     38.437 ms |    0.127319025 |       13949680271 |       13967488869 |    1.391 GiB |   494.221 MiB |
| INTEGRAL_SIGNED |      100000 |          1 |     31.849 ms |     31.862 ms |   -0.040817608 |       16856899480 |       16849943530 |    1.391 GiB |   511.688 MiB |
| INTEGRAL_SIGNED |           0 |          8 |     55.732 ms |     55.771 ms |   -0.069977751 |        9633006704 |        9626342330 |  779.204 MiB |    92.533 MiB |
| INTEGRAL_SIGNED |        1000 |          8 |     68.316 ms |     68.463 ms |   -0.215176533 |        7858676998 |        7841749312 |  779.168 MiB |    62.252 MiB |
| INTEGRAL_SIGNED |       10000 |          8 |     56.560 ms |     56.617 ms |   -0.100777935 |        9492055828 |        9482522344 |  779.205 MiB |    89.232 MiB |
| INTEGRAL_SIGNED |      100000 |          8 |     55.648 ms |     55.776 ms |   -0.230017251 |        9647690656 |        9625531384 |  779.209 MiB |    92.217 MiB |
| INTEGRAL_SIGNED |           0 |         32 |     25.001 ms |     24.970 ms |     0.12399504 |       21474119196 |       21500766486 |  713.603 MiB |    28.156 MiB |
| INTEGRAL_SIGNED |        1000 |         32 |     26.835 ms |     26.850 ms |   -0.055897149 |       20006619053 |       19995487566 |  713.584 MiB |    18.608 MiB |
| INTEGRAL_SIGNED |       10000 |         32 |     25.145 ms |     25.182 ms |    -0.14714655 |       21351343072 |       21319669972 |  713.604 MiB |    27.052 MiB |
| INTEGRAL_SIGNED |      100000 |         32 |     25.133 ms |     25.097 ms |    0.143237974 |       21361473439 |       21392031225 |  713.604 MiB |    28.052 MiB |
|           FLOAT |           0 |          1 |     26.247 ms |     26.202 ms |    0.171448166 |       20454264766 |       20489336549 |    1.347 GiB |   509.351 MiB |
|           FLOAT |        1000 |          1 |     93.993 ms |     94.000 ms |   -0.007447363 |        5711831402 |        5711376505 |    1.347 GiB |   276.772 MiB |
|           FLOAT |       10000 |          1 |     33.781 ms |     33.763 ms |    0.053284391 |       15892704603 |       15901147685 |    1.347 GiB |   485.771 MiB |
|           FLOAT |      100000 |          1 |     27.065 ms |     26.805 ms |    0.960650286 |       19836131969 |       20029095601 |    1.347 GiB |   509.302 MiB |
|           FLOAT |           0 |          8 |     22.633 ms |     22.682 ms |   -0.216498034 |       23720310896 |       23669682214 |    1.344 GiB |   102.881 MiB |
|           FLOAT |        1000 |          8 |     27.088 ms |     27.134 ms |   -0.169816893 |       19819872127 |       19785894582 |    1.345 GiB |    81.608 MiB |
|           FLOAT |       10000 |          8 |     23.274 ms |     23.293 ms |   -0.081636161 |       23067207030 |       23048750858 |    1.344 GiB |    99.858 MiB |
|           FLOAT |      100000 |          8 |     22.712 ms |     22.744 ms |   -0.140894681 |       23637869026 |       23604817225 |    1.344 GiB |   102.590 MiB |
|           FLOAT |           0 |         32 |     19.310 ms |     19.307 ms |    0.015535992 |       27802834381 |       27806376847 |    1.344 GiB |    45.046 MiB |
|           FLOAT |        1000 |         32 |     19.426 ms |     19.455 ms |   -0.149284464 |       27636187275 |       27594993783 |    1.344 GiB |    42.911 MiB |
|           FLOAT |       10000 |         32 |     19.125 ms |     19.112 ms |    0.067973856 |       28071582028 |       28090157752 |    1.344 GiB |    44.800 MiB |
|           FLOAT |      100000 |         32 |     19.055 ms |     19.093 ms |   -0.199422724 |       28174395662 |       28119440093 |    1.344 GiB |    45.023 MiB |
|         DECIMAL |           0 |          1 |     41.985 ms |     42.053 ms |   -0.161962606 |       12787313840 |       12766583172 |    1.808 GiB |   446.462 MiB |
|         DECIMAL |        1000 |          1 |     88.130 ms |     88.487 ms |     -0.4050834 |        6091805078 |        6067211903 |    1.809 GiB |   242.871 MiB |
|         DECIMAL |       10000 |          1 |     47.113 ms |     47.019 ms |    0.199520302 |       11395269502 |       11418156292 |    1.808 GiB |   421.609 MiB |
|         DECIMAL |      100000 |          1 |     42.156 ms |     42.330 ms |   -0.412752633 |       12735266298 |       12683026162 |    1.808 GiB |   445.146 MiB |
|         DECIMAL |           0 |          8 |     31.597 ms |     31.722 ms |   -0.395607178 |       16991262373 |       16924071562 |    1.807 GiB |    96.594 MiB |
|         DECIMAL |        1000 |          8 |     32.908 ms |     32.983 ms |   -0.227908107 |       16314147103 |       16277412275 |    1.808 GiB |    83.015 MiB |
|         DECIMAL |       10000 |          8 |     31.751 ms |     31.765 ms |   -0.044093099 |       16908726723 |       16901513042 |    1.807 GiB |    94.850 MiB |
|         DECIMAL |      100000 |          8 |     31.656 ms |     31.790 ms |    -0.42330048 |       16959433869 |       16888118218 |    1.807 GiB |    96.439 MiB |
|         DECIMAL |           0 |         32 |     29.086 ms |     29.215 ms |   -0.443512343 |       18458115527 |       18376484779 |    1.807 GiB |    50.813 MiB |
|         DECIMAL |        1000 |         32 |     29.220 ms |     29.342 ms |   -0.417522245 |       18373716410 |       18296953961 |    1.807 GiB |    49.546 MiB |
|         DECIMAL |       10000 |         32 |     29.172 ms |     29.236 ms |   -0.219388455 |       18403759838 |       18363247788 |    1.807 GiB |    50.665 MiB |
|         DECIMAL |      100000 |         32 |     29.142 ms |     29.237 ms |    -0.32598998 |       18422683025 |       18362914881 |    1.807 GiB |    50.811 MiB |
|       TIMESTAMP |           0 |          1 |     25.843 ms |     25.888 ms |   -0.174128391 |       20774499895 |       20738158767 |    1.604 GiB |   393.833 MiB |
|       TIMESTAMP |        1000 |          1 |     81.826 ms |     81.892 ms |   -0.080658959 |        6561093247 |        6555846603 |    1.604 GiB |   312.233 MiB |
|       TIMESTAMP |       10000 |          1 |     33.452 ms |     33.428 ms |    0.071744589 |       16048968338 |       16060652769 |    1.604 GiB |   387.870 MiB |
|       TIMESTAMP |      100000 |          1 |     26.527 ms |     26.496 ms |    0.116862065 |       20238766154 |       20262487566 |    1.604 GiB |   393.808 MiB |
|       TIMESTAMP |           0 |          8 |     47.515 ms |     47.583 ms |   -0.143112701 |       11299063410 |       11282798716 |    1.293 GiB |    74.980 MiB |
|       TIMESTAMP |        1000 |          8 |     55.172 ms |     55.204 ms |   -0.058000435 |        9730876944 |        9725268382 |    1.293 GiB |    52.224 MiB |
|       TIMESTAMP |       10000 |          8 |     48.267 ms |     48.288 ms |   -0.043507987 |       11123053432 |       11118191738 |    1.293 GiB |    72.421 MiB |
|       TIMESTAMP |      100000 |          8 |     47.482 ms |     47.546 ms |    -0.13478792 |       11306914895 |       11291718526 |    1.293 GiB |    74.744 MiB |
|       TIMESTAMP |           0 |         32 |     20.390 ms |     20.420 ms |   -0.147130947 |       26330411831 |       26291094626 |    1.243 GiB |    23.648 MiB |
|       TIMESTAMP |        1000 |         32 |     21.141 ms |     21.193 ms |   -0.245967551 |       25394775245 |       25332749392 |    1.243 GiB |    16.678 MiB |
|       TIMESTAMP |       10000 |         32 |     20.500 ms |     20.533 ms |    -0.16097561 |       26188656750 |       26146621402 |    1.243 GiB |    22.766 MiB |
|       TIMESTAMP |      100000 |         32 |     20.379 ms |     20.443 ms |   -0.314048776 |       26344869867 |       26261435539 |    1.243 GiB |    23.562 MiB |
|          STRING |           0 |          1 |     41.877 ms |     41.607 ms |    0.644745326 |       12820281300 |       12903506687 |    1.281 GiB |   451.937 MiB |
|          STRING |        1000 |          1 |     31.094 ms |     31.251 ms |   -0.504920563 |       17265977145 |       17179228069 |  660.517 MiB |    62.248 MiB |
|          STRING |       10000 |          1 |     64.846 ms |     64.210 ms |    0.980785245 |        8279189582 |        8361141167 |  732.672 MiB |   134.189 MiB |
|          STRING |      100000 |          1 |     91.161 ms |     88.399 ms |    3.029804412 |        5889277939 |        6073288812 |    1.281 GiB |   451.644 MiB |
|          STRING |           0 |          8 |     42.183 ms |     41.698 ms |     1.14975227 |       12727238981 |       12875097140 |    1.281 GiB |   451.937 MiB |
|          STRING |        1000 |          8 |     36.364 ms |     36.110 ms |    0.698493015 |       14763967700 |       14867703644 |  618.153 MiB |    18.543 MiB |
|          STRING |       10000 |          8 |     54.531 ms |     53.980 ms |    1.010434432 |        9845208603 |        9945692212 |  648.568 MiB |    50.278 MiB |
|          STRING |      100000 |          8 |     56.165 ms |     55.164 ms |    1.782248731 |        9558777092 |        9732256472 |  660.968 MiB |    62.981 MiB |
|          STRING |           0 |         32 |     42.164 ms |     41.845 ms |    0.756569585 |       12732907180 |       12829864303 |    1.281 GiB |   451.937 MiB |
|          STRING |        1000 |         32 |     29.114 ms |     28.826 ms |    0.989214811 |       18440050674 |       18624651147 |  608.503 MiB |     9.820 MiB |
|          STRING |       10000 |         32 |     33.053 ms |     32.733 ms |    0.968142075 |       16242625502 |       16401275227 |  614.532 MiB |    16.461 MiB |
|          STRING |      100000 |         32 |     33.477 ms |     33.409 ms |    0.203124533 |       16036958311 |       16069885154 |  615.540 MiB |    17.537 MiB |
|            LIST |           0 |          1 |     43.004 ms |     41.064 ms |     4.51120826 |       12484312445 |       13074066703 |    1.211 GiB |   452.167 MiB |
|            LIST |        1000 |          1 |    111.790 ms |    111.790 ms |              0 |        4802489223 |        4802485837 |    1.211 GiB |   337.461 MiB |
|            LIST |       10000 |          1 |     47.000 ms |     47.077 ms |   -0.163829787 |       11422669560 |       11404208990 |    1.211 GiB |   446.113 MiB |
|            LIST |      100000 |          1 |     41.618 ms |     41.828 ms |    -0.50458936 |       12899961875 |       12835085883 |    1.211 GiB |   452.148 MiB |
|            LIST |           0 |          8 |    108.269 ms |    108.226 ms |    0.039715893 |        4958655266 |        4960643731 |  829.180 MiB |    90.197 MiB |
|            LIST |        1000 |          8 |    127.245 ms |    127.225 ms |     0.01571771 |        4219196979 |        4219861813 |  829.180 MiB |    60.308 MiB |
|            LIST |       10000 |          8 |    110.501 ms |    110.565 ms |   -0.057918028 |        4858525654 |        4855690880 |  829.180 MiB |    86.856 MiB |
|            LIST |      100000 |          8 |    108.406 ms |    108.387 ms |    0.017526705 |        4952404516 |        4953256805 |  829.180 MiB |    89.872 MiB |
|            LIST |           0 |         32 |     53.603 ms |     53.599 ms |    0.007462269 |       10015659873 |       10016425767 |  754.483 MiB |    33.441 MiB |
|            LIST |        1000 |         32 |     64.849 ms |     64.883 ms |    -0.05242949 |        8278725137 |        8274443227 |  754.483 MiB |    23.436 MiB |
|            LIST |       10000 |         32 |     55.156 ms |     55.178 ms |   -0.039886866 |        9733621363 |        9729843860 |  754.483 MiB |    32.156 MiB |
|            LIST |      100000 |         32 |     53.808 ms |     53.786 ms |    0.040886114 |        9977594948 |        9981530169 |  754.483 MiB |    33.317 MiB |
|          STRUCT |           0 |          1 |     53.002 ms |     52.687 ms |    0.594317196 |       10129334007 |       10189909788 |    2.196 GiB |   468.936 MiB |
|          STRUCT |        1000 |          1 |     78.450 ms |     78.840 ms |   -0.497131931 |        6843512059 |        6809616526 |    1.803 GiB |   157.032 MiB |
|          STRUCT |       10000 |          1 |     88.336 ms |     85.419 ms |    3.302164463 |        6077613570 |        6285133663 |    1.872 GiB |   264.807 MiB |
|          STRUCT |      100000 |          1 |     99.057 ms |     94.564 ms |    4.535772333 |        5419822847 |        5677309886 |    2.194 GiB |   465.584 MiB |
|          STRUCT |           0 |          8 |     58.834 ms |     58.532 ms |    0.513308631 |        9125216252 |        9172241955 |    2.136 GiB |   349.458 MiB |
|          STRUCT |        1000 |          8 |     63.701 ms |     63.182 ms |    0.814743882 |        8427963904 |        8497179122 |    1.714 GiB |    40.006 MiB |
|          STRUCT |       10000 |          8 |     76.923 ms |     74.700 ms |     2.88990289 |        6979292858 |        7187000488 |    1.736 GiB |    69.027 MiB |
|          STRUCT |      100000 |          8 |     81.695 ms |     77.851 ms |    4.705306322 |        6571622069 |        6896130598 |    1.742 GiB |    76.481 MiB |
|          STRUCT |           0 |         32 |     54.180 ms |     52.601 ms |    2.914359542 |        9909012139 |       10206534688 |    2.127 GiB |   330.027 MiB |
|          STRUCT |        1000 |         32 |     52.633 ms |     50.972 ms |    3.155814793 |       10200275483 |       10532687834 |    1.697 GiB |    19.226 MiB |
|          STRUCT |       10000 |         32 |     54.909 ms |     53.265 ms |    2.994044692 |        9777516495 |       10079291605 |    1.700 GiB |    24.090 MiB |
|          STRUCT |      100000 |         32 |     54.446 ms |     53.554 ms |    1.638320538 |        9860669998 |       10024763213 |    1.701 GiB |    24.750 MiB |
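The `pct_change` column is consistent with the relative reduction in GPU time, `(GPUTime_old - GPUTime_new) / GPUTime_old * 100`. This formula is inferred from the table rows rather than stated in the PR, so treat the sketch below as an assumption; `pct_change` here is a hypothetical helper, not part of NVBench or libcudf.

```cpp
#include <cassert>
#include <cmath>

// Percent reduction in GPU time between two runs.
// Positive: the new run is faster. Negative: the new run is slower.
inline double pct_change(double old_ms, double new_ms) {
  assert(old_ms > 0.0);  // avoid dividing by zero
  return (old_ms - new_ms) / old_ms * 100.0;
}
```

For example, the STRUCT row with cardinality 100000 and run length 8 gives `pct_change(81.695, 77.851)`, which is about 4.7053 and matches the table, while the first INTEGRAL_SIGNED row gives `pct_change(31.192, 31.204)`, about -0.0385.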

@mhaseeb123 marked this pull request as ready for review October 10, 2024 20:28
@mhaseeb123 requested a review from a team as a code owner October 10, 2024 20:28
@mhaseeb123 added the 3 - Ready for Review label and removed the 2 - In Progress label Oct 10, 2024
cpp/src/io/orc/dict_enc.cu (outdated, resolved)
cpp/src/io/orc/dict_enc.cu (outdated, resolved)
Co-authored-by: Yunsong Wang
Member

@PointKernel PointKernel left a comment

Looks great!

cpp/src/io/orc/dict_enc.cu (outdated, resolved)
cpp/src/io/orc/dict_enc.cu (outdated, resolved)
cpp/src/io/orc/dict_enc.cu (outdated, resolved)
cpp/src/io/orc/writer_impl.cu (resolved)
@mhaseeb123
Member Author

/merge

@rapids-bot bot merged commit 4dbb8a3 into rapidsai:branch-24.12 Oct 12, 2024
101 checks passed
@mhaseeb123 deleted the migrate-orc-dict-encoding branch October 14, 2024 18:29
@mhaseeb123 added the 5 - Ready to Merge label and removed the 3 - Ready for Review label Oct 15, 2024
Labels

  • 5 - Ready to Merge: Testing and reviews complete, ready to merge
  • cuco: cuCollections related issue
  • cuIO: cuIO issue
  • improvement: Improvement / enhancement to an existing function
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change

3 participants