[RELEASE] cudf v24.12 #17406

Merged Dec 11, 2024 (337 commits)
Changes from 250 commits
Commits
f926a61
Add release tracking to project automation scripts (#17001)
jarmak-nv Oct 7, 2024
7e1e475
Address all remaining clang-tidy errors (#16956)
vyasr Oct 7, 2024
2d02bdc
Implement `extract_datetime_component` in `libcudf`/`pylibcudf` (#16776)
brandon-b-miller Oct 7, 2024
09ed210
Migrate nvtext generate_ngrams APIs to pylibcudf (#17006)
Matt711 Oct 8, 2024
219ec0e
Expunge NamedColumn (#16962)
wence- Oct 8, 2024
bcf9425
Compute whole column variance using numerically stable approach (#16448)
wence- Oct 8, 2024
cc23474
Turn on `xfail_strict = true` for all python packages (#16977)
wence- Oct 8, 2024
553d8ec
Performance optimization of JSON validation (#16996)
karthikeyann Oct 8, 2024
618a93f
Migrate nvtext jaccard API to pylibcudf (#17007)
Matt711 Oct 8, 2024
349ba5d
make conda installs in CI stricter (#17013)
jameslamb Oct 8, 2024
5b931ac
Add string.convert.convert_urls APIs to pylibcudf (#17003)
mroeschke Oct 9, 2024
ded4dd2
Add pinning for pyarrow in wheels (#17018)
vyasr Oct 9, 2024
a6853f4
Refactor `histogram` reduction using `cuco::static_set::insert_and_fi…
srinivasyadav18 Oct 9, 2024
bfac5e5
Disable kvikio remote I/O to avoid openssl dependencies in JNI build …
pxLi Oct 9, 2024
9c37e1e
Merge pull request #17027 from rapidsai/branch-24.10
GPUtester Oct 9, 2024
dfdae59
Use std::optional for host types (#17015)
robertmaynard Oct 9, 2024
bd51a25
[DOC] Document limitation using `cudf.pandas` proxy arrays (#16955)
Matt711 Oct 9, 2024
c7b5119
Fix `host_span` constructor to correctly copy `is_device_accessible` …
vuule Oct 9, 2024
3791c8a
Add string.convert_floats APIs to pylibcudf (#16990)
mroeschke Oct 9, 2024
31423d0
Update all rmm imports to use pylibrmm/librmm (#16913)
Matt711 Oct 10, 2024
7173b52
Fix regex parsing logic handling of nested quantifiers (#16798)
davidwendt Oct 10, 2024
69b0f66
Add string.convert.convert_lists APIs to pylibcudf (#16997)
mroeschke Oct 10, 2024
7d49df7
Add json APIs to pylibcudf (#17025)
mroeschke Oct 10, 2024
097778e
Move pylibcudf/libcudf/wrappers/decimals to pylibcudf/libcudf/fixed_p…
mroeschke Oct 11, 2024
1436cac
Remove unneeded pylibcudf.libcudf.wrappers.duration usage in cudf (#1…
mroeschke Oct 11, 2024
89a6fe5
make conda installs in CI stricter (part 2) (#17042)
jameslamb Oct 11, 2024
7cf0a1b
Pylibcudf: pack and unpack (#17012)
madsbk Oct 11, 2024
66a94c3
Replace deprecated cuco APIs with updated versions (#17052)
PointKernel Oct 11, 2024
349010e
Remove unused hash helper functions (#17056)
PointKernel Oct 11, 2024
891e5aa
Organize parquet reader mukernel non-nullable code, introduce manual …
pmattione-nvidia Oct 11, 2024
0b840bb
docs: change 'CSV' to 'csv' in python/custreamz/README.md to match ka…
a-hirota Oct 11, 2024
b8f3e21
Reorganize `cudf_polars` expression code (#17014)
brandon-b-miller Oct 11, 2024
fea87cb
Move `flatten_single_pass_aggs` to its own TU (#17053)
PointKernel Oct 11, 2024
c8a56a5
Migrate Min Hashing APIs to pylibcudf (#17021)
Matt711 Oct 11, 2024
be1dd32
Add an example to demonstrate multithreaded `read_parquet` pipelines …
mhaseeb123 Oct 11, 2024
4dbb8a3
Refactor ORC dictionary encoding to migrate to the new `cuco::static_…
mhaseeb123 Oct 12, 2024
3bee678
Made cudftestutil header-only and removed GTest dependency (#16839)
lamarrr Oct 14, 2024
e41dea9
Add profilers to CUDA 12 conda devcontainers (#17066)
vyasr Oct 14, 2024
768fbaa
Fix ORC reader when using `device_read_async` while the destination d…
ttnghia Oct 14, 2024
44afc51
Add clang-tidy to CI (#16958)
vyasr Oct 14, 2024
86db980
Clean up hash-groupby `var_hash_functor` (#17034)
PointKernel Oct 14, 2024
319ec3b
Adding assertion to check for regular JSON inputs of size greater tha…
shrshi Oct 14, 2024
c141ca5
Add string.convert.convert_integers APIs to pylibcudf (#16991)
mroeschke Oct 15, 2024
7bcfc87
Fix regex handling of fixed quantifier with 0 range (#17067)
davidwendt Oct 15, 2024
3420c71
Migrate remaining nvtext NGrams APIs to pylibcudf (#17070)
Matt711 Oct 16, 2024
95df62a
Remove unnecessary `std::move`'s in pylibcudf (#16983)
Matt711 Oct 16, 2024
f1cbbcc
Reenable huge pages for arrow host copying (#17097)
vyasr Oct 16, 2024
b513df8
Include timezone file path in error message (#17102)
bdice Oct 16, 2024
c9202a0
bug fix: use `self.ck_consumer` in `poll` method of kafka.py to align…
a-hirota Oct 16, 2024
5f863a5
Implement batch construction for strings columns (#17035)
ttnghia Oct 17, 2024
3683e46
Add strings.combine APIs to pylibcudf (#16790)
mroeschke Oct 17, 2024
e493340
Make tests more deterministic (#17008)
galipremsagar Oct 17, 2024
9980997
Add conda recipe for cudf-polars (#17037)
bdice Oct 17, 2024
6eeb7d6
Fix `DataFrame._from_arrays` and introduce validations (#17112)
galipremsagar Oct 17, 2024
14209c1
Correctly set `is_device_accesible` when creating `host_span`s from o…
vuule Oct 17, 2024
920a5f6
Remove the additional host register calls initially intended for perf…
kingcrimsontianyu Oct 17, 2024
00feb82
Limit the number of keys to calculate column sizes and page starts in…
mhaseeb123 Oct 17, 2024
ce93c36
Migrate NVText Normalizing APIs to Pylibcudf (#17072)
Matt711 Oct 17, 2024
8ebf0d4
Add device aggregators used by shared memory groupby (#17031)
PointKernel Oct 18, 2024
b891722
Control whether a file data source memory-maps the file with an envir…
vuule Oct 18, 2024
6ca721c
Fix the GDS read/write segfault/bus error when the cuFile policy is s…
kingcrimsontianyu Oct 18, 2024
e1c9a5a
Fix clang-tidy violations for span.hpp and hostdevice_vector.hpp (#17…
davidwendt Oct 18, 2024
e242dce
Disable the Parquet reader's wide lists tables GTest by default (#17120)
mhaseeb123 Oct 18, 2024
6ad9074
Add custom "fused" groupby aggregation to Dask cuDF (#17009)
rjzamora Oct 18, 2024
98eef67
Extend `device_scalar` to optionally use pinned bounce buffer (#16947)
vuule Oct 18, 2024
fdd2b26
Changing developer guide int_64_t to int64_t (#17130)
hyperbolic2346 Oct 19, 2024
1ce2526
Replace old host tree algorithm with new algorithm in JSON reader (#1…
karthikeyann Oct 19, 2024
074ab74
Split hash-based groupby into multiple smaller files to reduce build …
PointKernel Oct 19, 2024
69ca387
Ignore loud dask warnings about legacy dataframe implementation (#17137)
galipremsagar Oct 21, 2024
13de3c1
Add compile time check to ensure the `counting_iterator` type in `cou…
mhaseeb123 Oct 22, 2024
637e320
Unify treatment of `Expr` and `IR` nodes in cudf-polars DSL (#17016)
wence- Oct 22, 2024
4fe338c
Add string.replace_re APIs to pylibcudf (#17023)
mroeschke Oct 22, 2024
14cdf53
Migrate NVText Replacing APIs to pylibcudf (#17084)
Matt711 Oct 22, 2024
27c0c9d
Set the default number of threads in KvikIO thread pool to 8 (#17126)
kingcrimsontianyu Oct 22, 2024
cff1296
JSON tokenizer memory optimizations (#16978)
shrshi Oct 23, 2024
3126f77
[Bug] Fix Arrow-FS parquet reader for larger files (#17099)
rjzamora Oct 23, 2024
f0c6a04
Add JNI Support for Multi-line Delimiters and Include Test (#17139)
SurajAralihalli Oct 23, 2024
02ee819
Use async execution policy for true_if (#17146)
PointKernel Oct 23, 2024
deb9af4
Replace direct `cudaMemcpyAsync` calls with utility functions (limite…
vuule Oct 23, 2024
e7653a7
Use managed memory for NDSH benchmarks (#17039)
karthikeyann Oct 23, 2024
0287972
Use the full ref name of `rmm.DeviceBuffer` in the sphinx config file…
Matt711 Oct 23, 2024
d7cdf44
Migrate NVText Stemming APIs to pylibcudf (#17085)
Matt711 Oct 24, 2024
3a62314
Upgrade to polars 1.11 in cudf-polars (#17154)
wence- Oct 24, 2024
b75036b
Remove unused variable in internal merge_tdigests utility (#17151)
davidwendt Oct 24, 2024
7115f20
Move `segmented_gather` function from the copying module to the lists…
Matt711 Oct 24, 2024
03777f6
Fix host-to-device copy missing sync in strings/duration convert (#17…
davidwendt Oct 25, 2024
e98e6b9
Deprecate current libcudf nvtext minhash functions (#17152)
davidwendt Oct 25, 2024
0bb699e
Move nvtext ngrams benchmarks to nvbench (#17173)
davidwendt Oct 25, 2024
2113bd6
devcontainer: replace `VAULT_HOST` with `AWS_ROLE_ARN` (#17134)
jjacobelli Oct 25, 2024
5cba4fb
lint: replace `isort` with Ruff's rule I (#16685)
Borda Oct 25, 2024
8bc9f19
Add to_dlpack/from_dlpack APIs to pylibcudf (#17055)
mroeschke Oct 25, 2024
8c4d1f2
Use make_device_uvector instead of cudaMemcpyAsync in inplace_bitmask…
davidwendt Oct 28, 2024
ef28cdd
Add compute_mapping_indices used by shared memory groupby (#17147)
PointKernel Oct 28, 2024
a83e1a3
Add 2-cpp approvers text to contributing guide [no ci] (#17182)
davidwendt Oct 28, 2024
7b17fbe
Remove java reservation (#17189)
revans2 Oct 28, 2024
abecd0b
build wheels without build isolation (#17088)
jameslamb Oct 28, 2024
4c04b7c
Added strings AST vs BINARY_OP benchmarks (#17128)
lamarrr Oct 28, 2024
1ad9fc1
Remove includes suggested by include-what-you-use (#17170)
vyasr Oct 28, 2024
bf5b778
Check `num_children() == 0` in `Column.from_column_view` (#17193)
cwharris Oct 29, 2024
4b0a634
Auto assign PR to author (#16969)
Matt711 Oct 29, 2024
3775f7b
Fixed unused attribute compilation error for GCC 13 (#17188)
lamarrr Oct 29, 2024
ddfb284
Support storing `precision` of decimal types in `Schema` class (#17176)
ttnghia Oct 29, 2024
63b773e
Add in new java API for raw host memory allocation (#17197)
revans2 Oct 29, 2024
52d7e63
Unified binary_ops and ast benchmarks parameter names (#17200)
lamarrr Oct 29, 2024
8d7b0d8
[BUG] Replace `repo_token` with `github_token` in Auto Assign PR GHA …
Matt711 Oct 29, 2024
eeb4d27
Parquet reader list microkernel (#16538)
pmattione-nvidia Oct 29, 2024
6328ad6
Make ai.rapids.cudf.HostMemoryBuffer#copyFromStream public. (#17179)
liurenjie1024 Oct 30, 2024
5ee7d7c
[no ci] Add empty-columns section to the libcudf developer guide (#17…
davidwendt Oct 30, 2024
6c2eb4e
Upgrade nvcomp to 4.1.0.6 (#17201)
bdice Oct 30, 2024
0b9277b
Fix bug in recovering invalid lines in JSONL inputs (#17098)
shrshi Oct 30, 2024
7157de7
Add conversion from cudf-polars expressions to libcudf ast for parque…
wence- Oct 30, 2024
5a6d177
Fix ``to_parquet`` append behavior with global metadata file (#17198)
rjzamora Oct 30, 2024
3cf186c
Add remaining datetime APIs to pylibcudf (#17143)
Matt711 Oct 31, 2024
0e294b1
Add compute_shared_memory_aggs used by shared memory groupby (#17162)
PointKernel Oct 31, 2024
893d0fd
Migrate NVText Tokenizing APIs to pylibcudf (#17100)
Matt711 Oct 31, 2024
3f66087
Fix some documentation rendering for pylibcudf (#17217)
mroeschke Oct 31, 2024
0db2463
Migrate NVText Byte Pair Encoding APIs to pylibcudf (#17101)
Matt711 Oct 31, 2024
a69de57
Migrate hashing operations to `pylibcudf` (#15418)
brandon-b-miller Oct 31, 2024
a0711d0
Migrate NVtext subword tokenizing APIs to pylibcudf (#17096)
Matt711 Oct 31, 2024
01cfcff
Remove unsanitized nulls from input strings columns in reduction gtes…
davidwendt Oct 31, 2024
cafcf6a
Add jaccard_index to generated cuDF docs (#17199)
davidwendt Oct 31, 2024
e512258
Move strings::concatenate benchmark to nvbench (#17211)
davidwendt Oct 31, 2024
9657c9a
Fix `Schema.Builder` does not propagate precision value to `Builder` …
ttnghia Oct 31, 2024
3db6a0e
Add TokenizeVocabulary to api docs (#17208)
davidwendt Oct 31, 2024
f99ef41
Move detail header floating_conversion.hpp to detail subdirectory (#1…
davidwendt Oct 31, 2024
f7020f1
Expose stream-ordering in partitioning API (#17213)
shrshi Oct 31, 2024
02a50e8
Remove `nvtext::load_vocabulary` from pylibcudf (#17220)
Matt711 Oct 31, 2024
a83debb
Fix groupby.get_group with length-1 tuple with list-like grouper (#17…
mroeschke Oct 31, 2024
6055393
Fix binop with LHS numpy datetimelike scalar (#17226)
mroeschke Oct 31, 2024
0929115
Support for polars 1.12 in cudf-polars (#17227)
wence- Oct 31, 2024
b5b47fe
use rapids-generate-pip-constraints to pin to oldest dependencies in …
jameslamb Oct 31, 2024
0a87284
Expose streams in public round APIs (#16925)
Matt711 Nov 1, 2024
8219d28
Minor I/O code quality improvements (#17105)
kingcrimsontianyu Nov 1, 2024
6ce9ea4
Change default KvikIO parameters in cuDF: set the thread pool size to…
kingcrimsontianyu Nov 1, 2024
3d07509
Add `num_iterations` axis to the multi-threaded Parquet benchmarks (#…
vuule Nov 2, 2024
0d37506
Expose stream-ordering in subword tokenizer API (#17206)
shrshi Nov 4, 2024
e6f5c0e
Make HostMemoryBuffer call into the DefaultHostMemoryAllocator (#17204)
revans2 Nov 4, 2024
076ad58
Expose mixed and conditional joins in pylibcudf (#17235)
wence- Nov 4, 2024
a2001dd
Use more pylibcudf.io.types enums in cudf._libs (#17237)
mroeschke Nov 4, 2024
1d25d14
Fix discoverability of submodules inside `pd.util` (#17215)
galipremsagar Nov 4, 2024
45563b3
Refactor Dask cuDF legacy code (#17205)
rjzamora Nov 4, 2024
9d5041c
Separate evaluation logic from `IR` objects in cudf-polars (#17175)
rjzamora Nov 5, 2024
ac5b3ed
Deprecate single component extraction methods in libcudf (#17221)
Matt711 Nov 5, 2024
adf3269
Search for kvikio with lowercase (#17243)
vyasr Nov 6, 2024
06b3f83
Disallow cuda-python 12.6.1 and 11.8.4 (#17253)
bdice Nov 6, 2024
57900de
KvikIO shared library (#17239)
madsbk Nov 7, 2024
29484cb
Put a ceiling on cuda-python (#17264)
jameslamb Nov 7, 2024
bbd3b43
Fix the example in documentation for `get_dremel_data()` (#17242)
mhaseeb123 Nov 7, 2024
e29e0ab
Move strings/numeric convert benchmarks to nvbench (#17255)
davidwendt Nov 7, 2024
4cbc15a
Added ast tree to simplify expression lifetime management (#17156)
lamarrr Nov 7, 2024
e4c52dd
`cudf-polars` string/numeric casting (#17076)
brandon-b-miller Nov 7, 2024
1981445
Fix extract-datetime deprecation warning in ndsh benchmark (#17254)
davidwendt Nov 7, 2024
67c71e2
Refactor gather/scatter benchmarks for strings (#17223)
davidwendt Nov 7, 2024
08e4853
AWS S3 IO through KvikIO (#16499)
madsbk Nov 7, 2024
c209dae
Add io.text APIs to pylibcudf (#17232)
mroeschke Nov 7, 2024
2db58d5
Add support for `pyarrow-18` (#17256)
galipremsagar Nov 7, 2024
5147882
Process parquet bools with microkernels (#17157)
pmattione-nvidia Nov 7, 2024
64c72fc
Move strings to date/time types benchmarks to nvbench (#17229)
davidwendt Nov 7, 2024
773aefc
Use `pylibcudf.strings.convert.convert_integers.is_integer` in cudf p…
Matt711 Nov 7, 2024
c73defd
Use pylibcudf.search APIs in cudf python (#17271)
Matt711 Nov 7, 2024
e52ce85
Mark column chunks in a PQ reader `pass` as large strings when the cu…
mhaseeb123 Nov 7, 2024
b3b5ce9
Add optional column_order in JSON reader (#17029)
karthikeyann Nov 8, 2024
1777c29
Allow generating large strings in benchmarks (#17224)
davidwendt Nov 8, 2024
3c5f787
Fix data_type ctor call in JSON_TEST (#17273)
davidwendt Nov 8, 2024
18041b5
Plumb pylibcudf datetime APIs through cudf python (#17275)
Matt711 Nov 8, 2024
7b80a44
Add IWYU to CI (#17078)
vyasr Nov 8, 2024
e8935b9
Rewrite Java API `Table.readJSON` to return the output from libcudf `…
ttnghia Nov 8, 2024
150d8d8
Implement inequality joins by translation to conditional joins (#17000)
wence- Nov 8, 2024
0f1ae26
Wrap custom iterator result (#17251)
galipremsagar Nov 8, 2024
263a7ff
Make constructor of DeviceMemoryBufferView public (#17265)
liurenjie1024 Nov 8, 2024
c46cf76
remove WheelHelpers.cmake (#17276)
jameslamb Nov 8, 2024
990734f
Switch to using `TaskSpec` (#17285)
galipremsagar Nov 8, 2024
2e0d2d6
Improve the performance of low cardinality groupby (#16619)
PointKernel Nov 8, 2024
d295f17
Add `cudf::calendrical_month_sequence` to pylibcudf (#17277)
Matt711 Nov 8, 2024
fea46cd
Add read_parquet_metadata to pylibcudf (#17245)
mroeschke Nov 8, 2024
db69c52
Follow up making Python tests more deterministic (#17272)
mroeschke Nov 8, 2024
0fc5fab
Use numba-cuda<0.0.18 (#17280)
gmarkall Nov 9, 2024
e399e95
Use pylibcudf enums in cudf Python quantile (#17287)
mroeschke Nov 9, 2024
7a499f6
Use more pylibcudf Python enums in cudf._lib (#17288)
mroeschke Nov 9, 2024
5cbdcd0
Expose delimiter character in JSON reader options to JSON reader APIs…
shrshi Nov 9, 2024
84743c3
Fix `Dataframe.__setitem__` slow-downs (#17222)
galipremsagar Nov 12, 2024
61031cc
Expose streams in public quantile APIs (#17257)
shrshi Nov 12, 2024
bdddab3
cmake option: `CUDF_KVIKIO_REMOTE_IO` (#17291)
madsbk Nov 12, 2024
202c231
Replace workaround of JNI build with CUDF_KVIKIO_REMOTE_IO=OFF (#17293)
pxLi Nov 12, 2024
043bcbd
[FEA] Report all unsupported operations for a query in cudf.polars (#…
Matt711 Nov 12, 2024
ccfc95a
Add new nvtext minhash_permuted API (#16756)
davidwendt Nov 12, 2024
7682edb
Add type stubs for pylibcudf (#17258)
wence- Nov 12, 2024
796de4b
Add cudf::strings::contains_multiple (#16900)
davidwendt Nov 12, 2024
1f9ad2f
enforce wheel size limits, README formatting in CI (#17284)
jameslamb Nov 12, 2024
bbaa1ab
Support polars 1.13 (#17299)
wence- Nov 12, 2024
487f97c
Always prefer `device_read`s and `device_write`s when kvikIO is enabl…
vuule Nov 12, 2024
76a5e32
Raise errors on specific types of fallback in `cudf.pandas` (#17268)
Matt711 Nov 13, 2024
f5c0e5c
Expose stream-ordering in public transpose API (#17294)
shrshi Nov 13, 2024
918266a
Exclude nanoarrow and flatbuffers from installation (#17308)
vyasr Nov 13, 2024
1b045dd
Add `catboost` to the third-party integration tests (#17267)
Matt711 Nov 13, 2024
c4a4a91
Fixed lifetime issue in ast transform tests (#17292)
lamarrr Nov 13, 2024
6acd33d
Replace FindcuFile with upstream FindCUDAToolkit support (#17298)
KyleFromNVIDIA Nov 13, 2024
5e40691
Fix synchronization bug in bool parquet mukernels (#17302)
pmattione-nvidia Nov 13, 2024
8294953
Update CI jobs to include Polars in nightlies and improve IWYU (#17306)
vyasr Nov 13, 2024
13c7115
Move strings filter benchmarks to nvbench (#17269)
davidwendt Nov 13, 2024
353d2de
Clean up misc, unneeded pylibcudf.libcudf in cudf._lib (#17309)
mroeschke Nov 13, 2024
9da8eb2
Add documentation for low memory readers (#17314)
btepera Nov 14, 2024
5d5b35d
Polars: DataFrame Serialization (#17062)
madsbk Nov 14, 2024
4cd40ee
Java JNI for Multiple contains (#17281)
res-life Nov 14, 2024
d93c3fc
Add version config (#17312)
vyasr Nov 14, 2024
a7194f6
Fix reading of single-row unterminated CSV files (#17305)
vuule Nov 14, 2024
66c5a2d
prefer wheel-provided libcudf.so in load_library(), use RTLD_LOCAL (#…
jameslamb Nov 14, 2024
927ae9c
Do not exclude nanoarrow and flatbuffers from installation if statica…
hyperbolic2346 Nov 15, 2024
8a9131a
Update java datetime APIs to match CUDF. (#17329)
revans2 Nov 15, 2024
d67d017
Remove cudf._lib.avro in favor of inlining pylicudf (#17319)
mroeschke Nov 15, 2024
d475dca
Fix various issues with `replace` API and add support in `datetime` a…
galipremsagar Nov 15, 2024
aa8c0c4
Implement `cudf-polars` chunked parquet reading (#16944)
brandon-b-miller Nov 15, 2024
81cd4a0
Remove another reference to `FindcuFile` (#17315)
KyleFromNVIDIA Nov 15, 2024
8664fad
add telemetry setup to test (#16924)
msarahan Nov 15, 2024
e683647
Update cmake to 3.28.6 in JNI Dockerfile (#17342)
jlowe Nov 15, 2024
9cc9071
Use pylibcudf contiguous split APIs in cudf python (#17246)
Matt711 Nov 16, 2024
e4de8e4
Move strings translate benchmarks to nvbench (#17325)
davidwendt Nov 18, 2024
aeb6a30
Move cudf._lib.unary to cudf.core._internals (#17318)
mroeschke Nov 18, 2024
03ac845
Reading multi-source compressed JSONL files (#17161)
shrshi Nov 18, 2024
d514517
Test the full matrix for polars and dask wheels on nightlies (#17320)
vyasr Nov 18, 2024
43f2f68
Fix reading Parquet string cols when `nrows` and `input_pass_limit` >…
mhaseeb123 Nov 18, 2024
18b40dc
Remove cudf._lib.hash in favor of inlining pylibcudf (#17345)
mroeschke Nov 18, 2024
ba21673
Remove cudf._lib.concat in favor of inlining pylibcudf (#17344)
mroeschke Nov 18, 2024
02c35bf
Remove cudf._lib.quantiles in favor of inlining pylibcudf (#17347)
mroeschke Nov 18, 2024
302e625
Remove cudf._lib.labeling in favor of inlining pylibcudf (#17346)
mroeschke Nov 18, 2024
5f9a97f
Support polars 1.14 (#17355)
wence- Nov 19, 2024
384abae
Writing compressed output using JSON writer (#17323)
shrshi Nov 19, 2024
9c5cd81
fix library-loading issues in editable installs (#17338)
jameslamb Nov 19, 2024
c7bfa77
Fix integer overflow in compiled binaryop (#17354)
wence- Nov 19, 2024
03c055f
Move strings replace benchmarks to nvbench (#17301)
davidwendt Nov 19, 2024
56061bd
Optimize distinct inner join to use set `find` instead of `retrieve` …
PointKernel Nov 19, 2024
7158ee0
Add compute_column_expression to pylibcudf for transform.compute_colu…
mroeschke Nov 20, 2024
05365af
Bug fix: restrict lines=True to JSON format in Kafka read_gdf method …
a-hirota Nov 20, 2024
6f83b58
Adapt to KvikIO API change in the compatibility mode (#17377)
kingcrimsontianyu Nov 20, 2024
fc08fe8
Benchmarking JSON reader for compressed inputs (#17219)
shrshi Nov 20, 2024
a2a62a1
Deselect failing polars tests (#17362)
pentschev Nov 20, 2024
3111aa4
Add new ``dask_cudf.read_parquet`` API (#17250)
rjzamora Nov 20, 2024
be9ba6c
Added Arrow Interop Benchmarks (#17194)
lamarrr Nov 20, 2024
2e88835
Use `libcudf_exception_handler` throughout `pylibcudf.libcudf` (#17109)
brandon-b-miller Nov 20, 2024
f550ccc
Extract ``GPUEngine`` config options at translation time (#17339)
rjzamora Nov 20, 2024
04502c8
Move strings url_decode benchmarks to nvbench (#17328)
davidwendt Nov 20, 2024
332cc06
Support pivot with index or column arguments as lists (#17373)
mroeschke Nov 20, 2024
d927992
Move strings repeat benchmarks to nvbench (#17304)
davidwendt Nov 20, 2024
68c4285
Add `pynvml` as a dependency for `dask-cudf` (#17386)
pentschev Nov 21, 2024
0d9e577
Ignore errors when testing glibc versions (#17389)
vyasr Nov 21, 2024
f54c1a5
Migrate CSV writer to pylibcudf (#17163)
Matt711 Nov 21, 2024
305182e
Enable unified memory by default in `cudf_polars` (#17375)
galipremsagar Nov 22, 2024
439321e
Turn off cudf.pandas 3rd party integrations tests for 24.12 (#17500)
Matt711 Dec 4, 2024
2f5bf76
Simplify serialization protocols (#17552)
vyasr Dec 10, 2024
5836d08
Update Changelog [skip ci]
raydouglass Dec 11, 2024
2 changes: 1 addition & 1 deletion .devcontainer/Dockerfile
@@ -31,6 +31,6 @@ ENV PYTHONDONTWRITEBYTECODE="1"

ENV SCCACHE_REGION="us-east-2"
ENV SCCACHE_BUCKET="rapids-sccache-devs"
-ENV VAULT_HOST="https://vault.ops.k8s.rapids.ai"
+ENV AWS_ROLE_ARN="arn:aws:iam::279114543810:role/nv-gha-token-sccache-devs"
ENV HISTFILE="/home/coder/.cache/._bash_history"
ENV LIBCUDF_KERNEL_CACHE_PATH="/home/coder/cudf/cpp/build/${PYTHON_PACKAGE_MANAGER}/cuda-${CUDA_VERSION}/latest/jitify_cache"
6 changes: 3 additions & 3 deletions .devcontainer/cuda11.8-conda/devcontainer.json
@@ -5,17 +5,17 @@
"args": {
"CUDA": "11.8",
"PYTHON_PACKAGE_MANAGER": "conda",
-"BASE": "rapidsai/devcontainers:24.10-cpp-cuda11.8-mambaforge-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.12-cpp-cuda11.8-mambaforge-ubuntu22.04"
}
},
"runArgs": [
"--rm",
"--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda11.8-conda"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.12-cuda11.8-conda"
],
"hostRequirements": {"gpu": "optional"},
"features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.12": {}
},
"overrideFeatureInstallOrder": [
"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
6 changes: 3 additions & 3 deletions .devcontainer/cuda11.8-pip/devcontainer.json
@@ -5,17 +5,17 @@
"args": {
"CUDA": "11.8",
"PYTHON_PACKAGE_MANAGER": "pip",
-"BASE": "rapidsai/devcontainers:24.10-cpp-cuda11.8-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.12-cpp-cuda11.8-ubuntu22.04"
}
},
"runArgs": [
"--rm",
"--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda11.8-pip"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.12-cuda11.8-pip"
],
"hostRequirements": {"gpu": "optional"},
"features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.12": {}
},
"overrideFeatureInstallOrder": [
"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
28 changes: 25 additions & 3 deletions .devcontainer/cuda12.5-conda/devcontainer.json
@@ -5,19 +5,41 @@
"args": {
"CUDA": "12.5",
"PYTHON_PACKAGE_MANAGER": "conda",
-"BASE": "rapidsai/devcontainers:24.10-cpp-mambaforge-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.12-cpp-mambaforge-ubuntu22.04"
}
},
"runArgs": [
"--rm",
"--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda12.5-conda"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.12-cuda12.5-conda"
],
"hostRequirements": {"gpu": "optional"},
"features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
+"ghcr.io/rapidsai/devcontainers/features/cuda:24.12": {
+"version": "12.5",
+"installCompilers": false,
+"installProfilers": true,
+"installDevPackages": false,
+"installcuDNN": false,
+"installcuTensor": false,
+"installNCCL": false,
+"installCUDARuntime": false,
+"installNVRTC": false,
+"installOpenCL": false,
+"installcuBLAS": false,
+"installcuSPARSE": false,
+"installcuFFT": false,
+"installcuFile": false,
+"installcuRAND": false,
+"installcuSOLVER": false,
+"installNPP": false,
+"installnvJPEG": false,
+"pruneStaticLibs": true
+},
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.12": {}
},
"overrideFeatureInstallOrder": [
+"ghcr.io/rapidsai/devcontainers/features/cuda",
"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
],
"initializeCommand": ["/bin/bash", "-c", "mkdir -m 0755 -p ${localWorkspaceFolder}/../.{aws,cache,config,conda/pkgs,conda/${localWorkspaceFolderBasename}-cuda12.5-envs}"],
6 changes: 3 additions & 3 deletions .devcontainer/cuda12.5-pip/devcontainer.json
@@ -5,17 +5,17 @@
"args": {
"CUDA": "12.5",
"PYTHON_PACKAGE_MANAGER": "pip",
-"BASE": "rapidsai/devcontainers:24.10-cpp-cuda12.5-ubuntu22.04"
+"BASE": "rapidsai/devcontainers:24.12-cpp-cuda12.5-ubuntu22.04"
}
},
"runArgs": [
"--rm",
"--name",
-"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.10-cuda12.5-pip"
+"${localEnv:USER:anon}-rapids-${localWorkspaceFolderBasename}-24.12-cuda12.5-pip"
],
"hostRequirements": {"gpu": "optional"},
"features": {
-"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.10": {}
+"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils:24.12": {}
},
"overrideFeatureInstallOrder": [
"ghcr.io/rapidsai/devcontainers/features/rapids-build-utils"
17 changes: 17 additions & 0 deletions .github/workflows/auto-assign.yml
@@ -0,0 +1,17 @@
name: "Auto Assign PR"

on:
pull_request_target:
types:
- opened
- reopened
- synchronize

jobs:
add_assignees:
runs-on: ubuntu-latest
steps:
- uses: actions-ecosystem/action-add-assignees@v1
with:
github_token: "${{ secrets.GITHUB_TOKEN }}"
assignees: ${{ github.actor }}
28 changes: 14 additions & 14 deletions .github/workflows/build.yaml
@@ -28,7 +28,7 @@ concurrency:
jobs:
cpp-build:
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -37,7 +37,7 @@ jobs:
python-build:
needs: [cpp-build]
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -46,7 +46,7 @@ jobs:
upload-conda:
needs: [cpp-build, python-build]
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -57,7 +57,7 @@ jobs:
if: github.ref_type == 'branch'
needs: python-build
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.12
with:
arch: "amd64"
branch: ${{ inputs.branch }}
@@ -69,7 +69,7 @@ jobs:
sha: ${{ inputs.sha }}
wheel-build-libcudf:
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.12
with:
# build for every combination of arch and CUDA version, but only for the latest Python
matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
@@ -81,7 +81,7 @@ jobs:
wheel-publish-libcudf:
needs: wheel-build-libcudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -92,7 +92,7 @@ jobs:
wheel-build-pylibcudf:
needs: [wheel-publish-libcudf]
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -102,7 +102,7 @@ jobs:
wheel-publish-pylibcudf:
needs: wheel-build-pylibcudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -113,7 +113,7 @@ jobs:
wheel-build-cudf:
needs: wheel-publish-pylibcudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -123,7 +123,7 @@ jobs:
wheel-publish-cudf:
needs: wheel-build-cudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -134,7 +134,7 @@ jobs:
wheel-build-dask-cudf:
needs: wheel-publish-cudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.12
with:
# This selects "ARCH=amd64 + the latest supported Python + CUDA".
matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
@@ -146,7 +146,7 @@ jobs:
wheel-publish-dask-cudf:
needs: wheel-build-dask-cudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
@@ -157,7 +157,7 @@ jobs:
wheel-build-cudf-polars:
needs: wheel-publish-pylibcudf
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.12
with:
# This selects "ARCH=amd64 + the latest supported Python + CUDA".
matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
@@ -169,7 +169,7 @@ jobs:
wheel-publish-cudf-polars:
needs: wheel-build-cudf-polars
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.12
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
1 change: 1 addition & 0 deletions .github/workflows/labeler.yml
@@ -1,4 +1,5 @@
name: "Pull Request Labeler"

on:
- pull_request_target

2 changes: 1 addition & 1 deletion .github/workflows/pandas-tests.yaml
@@ -17,7 +17,7 @@ jobs:
pandas-tests:
# run the Pandas unit tests
secrets: inherit
-uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.10
+uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.12
with:
# This selects "ARCH=amd64 + the latest supported Python + CUDA".
matrix_filter: map(select(.ARCH == "amd64")) | group_by(.CUDA_VER|split(".")|map(tonumber)|.[0]) | map(max_by([(.PY_VER|split(".")|map(tonumber)), (.CUDA_VER|split(".")|map(tonumber))]))
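
The `matrix_filter` above is a jq expression, and its comment says it selects "ARCH=amd64 + the latest supported Python + CUDA". A rough Python sketch of the same selection may make the logic easier to follow. This is only an illustration: the function names and the sample matrix are hypothetical, and the only assumption carried over from the workflow is that each matrix entry has `ARCH`, `PY_VER` and `CUDA_VER` string fields.

```python
from itertools import groupby


def version_key(version):
    """Turn a dotted version such as "3.10" into (3, 10) so it compares numerically."""
    return tuple(int(part) for part in version.split("."))


def select_latest_amd64(matrix):
    """Keep amd64 entries, group them by CUDA major version, and pick from each
    group the entry with the highest (Python version, CUDA version) pair."""
    amd64 = [entry for entry in matrix if entry["ARCH"] == "amd64"]

    def cuda_major(entry):
        return version_key(entry["CUDA_VER"])[0]

    # itertools.groupby only groups adjacent items, so sort by the group key first.
    amd64.sort(key=cuda_major)
    return [
        max(group, key=lambda e: (version_key(e["PY_VER"]), version_key(e["CUDA_VER"])))
        for _, group in groupby(amd64, key=cuda_major)
    ]


# Hypothetical build matrix, for illustration only.
sample_matrix = [
    {"ARCH": "amd64", "PY_VER": "3.10", "CUDA_VER": "11.8.0"},
    {"ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "11.8.0"},
    {"ARCH": "arm64", "PY_VER": "3.12", "CUDA_VER": "12.5.1"},
    {"ARCH": "amd64", "PY_VER": "3.11", "CUDA_VER": "12.0.1"},
    {"ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.5.1"},
    {"ARCH": "amd64", "PY_VER": "3.12", "CUDA_VER": "12.2.2"},
]

for entry in select_latest_amd64(sample_matrix):
    print(entry)
```

Splitting the version strings and converting each part to a number matters here: compared as plain strings, "3.9" would sort above "3.10".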