feat(batch): support spill hash agg for the batch query #16771
Merged
Task list completed / task-list-completed
Started
2024-05-29 14:36:35
0 / 8 tasks completed
8 tasks still to be completed
Details
Required Tasks
Task | Status |
---|---|
Related RFC: risingwavelabs/rfcs#89 | Incomplete |
Tracking issue #16615 | Incomplete |
Support spill hash agg for the batch query. | Incomplete |
When HashAggExecutor is told that memory is insufficient, AggSpillManager starts to partition the hash table and spill it to disk. After spilling the hash table, AggSpillManager consumes all chunks from the input executor, then partitions and spills them to disk with the same hash function used for the hash table spilling. This yields a set of partitions (e.g. 20), each containing a portion of the original hash table and input data. A sub HashAggExecutor then consumes the partitions one by one. If memory is still insufficient in a sub HashAggExecutor, it partitions its hash table and input recursively. | Incomplete |
SpillOp manages the spill directory of the spilling executor and removes the directory in RAII style when it is dropped. | Incomplete |
The environment variable RW_BATCH_SPILL_DIR configures the spill path; the default is /tmp/. | Incomplete |
I have written necessary rustdoc comments | Incomplete |
I have added necessary unit tests and integration tests | Incomplete |
I have added test labels as necessary. See details. | Incomplete |
I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features #7934). | Incomplete |
My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future). | Incomplete |
All checks passed in ./risedev check (or alias, ./risedev c) | Incomplete |
My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details) | Incomplete |
My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users) | Incomplete |
If the file doesn't exist, it will be created, just like calling write. | Incomplete |
If the file exists, data will be appended to the end of the file. | Incomplete |
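
The spilling flow described in the task table can be illustrated with a minimal in-memory sketch. This is not the PR's `AggSpillManager`: the names `partition_of`, `spill`, `aggregate_partition` and `Partition` are hypothetical, a `Vec` stands in for files on disk, and a simple sum is assumed as the aggregate. The `level` parameter is also an assumption: it is mixed into the hash so that a recursive re-partition of one partition does not send every key back to the same slot.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Illustrative partition count; the PR description mentions "e.g. 20".
const NUM_PARTITIONS: usize = 20;

/// One spilled partition: partial aggregate states plus raw input rows.
/// In a real executor these would be files on disk, not vectors.
#[derive(Default, Debug)]
struct Partition {
    agg_states: Vec<(String, i64)>,
    input_rows: Vec<(String, i64)>,
}

/// Maps a group key to a partition. `level` (an assumption of this sketch)
/// varies the hash per recursion depth.
fn partition_of(key: &str, level: u64) -> usize {
    let mut hasher = DefaultHasher::new();
    level.hash(&mut hasher);
    key.hash(&mut hasher);
    (hasher.finish() % NUM_PARTITIONS as u64) as usize
}

/// Spills the in-memory hash table first, then the remaining input,
/// using the same partitioning function for both.
fn spill(
    hash_table: HashMap<String, i64>,
    remaining_input: Vec<(String, i64)>,
    level: u64,
) -> Vec<Partition> {
    let mut partitions: Vec<Partition> =
        (0..NUM_PARTITIONS).map(|_| Partition::default()).collect();
    for (key, state) in hash_table {
        let p = partition_of(&key, level);
        partitions[p].agg_states.push((key, state));
    }
    for (key, value) in remaining_input {
        let p = partition_of(&key, level);
        partitions[p].input_rows.push((key, value));
    }
    partitions
}

/// Plays the role of a sub executor: merges the spilled partial states
/// with the spilled input rows of a single partition (sum aggregate).
fn aggregate_partition(partition: Partition) -> HashMap<String, i64> {
    let mut table = HashMap::new();
    for (key, v) in partition.agg_states.into_iter().chain(partition.input_rows) {
        *table.entry(key).or_insert(0) += v;
    }
    table
}
```

Because the hash table and the input share one partitioning function, every row for a given key lands in the same partition as that key's partial state, so each partition can be aggregated independently and the results simply concatenated.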
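
The RAII cleanup and the `RW_BATCH_SPILL_DIR` setting mentioned in the task table can be sketched with the standard library alone. `SpillDirGuard`, `resolve_spill_root` and `spill_root` are hypothetical names for illustration, not the PR's `SpillOp` API; only the environment variable name and the `/tmp/` default come from the PR description.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Environment variable and default taken from the PR description.
const SPILL_DIR_ENV: &str = "RW_BATCH_SPILL_DIR";
const DEFAULT_SPILL_DIR: &str = "/tmp/";

/// Picks the configured directory, falling back to the default.
fn resolve_spill_root(configured: Option<String>) -> PathBuf {
    PathBuf::from(configured.unwrap_or_else(|| DEFAULT_SPILL_DIR.to_string()))
}

/// Reads the spill root from the environment.
fn spill_root() -> PathBuf {
    resolve_spill_root(env::var(SPILL_DIR_ENV).ok())
}

/// Hypothetical guard: owns one spill directory and removes it on drop.
struct SpillDirGuard {
    path: PathBuf,
}

impl SpillDirGuard {
    /// Creates `<root>/<name>` (and any missing parents).
    fn create(root: &Path, name: &str) -> io::Result<Self> {
        let path = root.join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for SpillDirGuard {
    fn drop(&mut self) {
        // Best-effort cleanup: `drop` cannot return an error, so it is ignored.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Tying the directory's lifetime to a value means the spill files disappear when the executor finishes or bails out early with an error, without an explicit cleanup call on every exit path.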
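
The create-or-append behavior described in the last two rows matches what the Rust standard library offers through `OpenOptions`. The sketch below uses `std::fs` rather than whatever storage layer the PR writes through, and `append_bytes` is a hypothetical helper name.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends `data` to the file at `path`, creating the file first if it
/// does not exist yet.
fn append_bytes(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true) // create the file when it is missing
        .append(true) // otherwise write at the current end of the file
        .open(path)?;
    file.write_all(data)
}
```

Calling the helper twice on a fresh path leaves the second payload directly after the first, which is the pattern a spilling executor needs when it writes a partition file chunk by chunk.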