Introduce HybridSpillBuf #655

zuston · 2024-11-23T06:25:36Z

Is your feature request related to a problem? Please describe.

While digging into OnHeapSpillManager, I found two spill buf implementations: file-based and memory-based. In my view, we could introduce a HybridSpillBuf that combines the memory-based and file-based bufs in a single spill buf, so that memory is used as fully as possible on a best-effort basis.
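
To make the idea concrete, here is a minimal sketch of such a buffer, written in Rust for illustration only. It is not taken from the existing code base: the type `HybridSpillBuf`, the `mem_limit` parameter, and every method name are assumptions. Bytes are kept in memory until the limit is reached, and only the overflow is appended to a file.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::PathBuf;

/// Hypothetical hybrid spill buffer (illustrative, not the project's API):
/// bytes go to memory until `mem_limit` is reached, the rest goes to a file.
struct HybridSpillBuf {
    mem: Vec<u8>,
    mem_limit: usize,
    path: PathBuf,
    file: Option<File>, // opened lazily on the first overflow
}

impl HybridSpillBuf {
    fn new(mem_limit: usize, path: PathBuf) -> Self {
        Self { mem: Vec::new(), mem_limit, path, file: None }
    }

    /// Writes `data`, filling the remaining memory budget first.
    fn write(&mut self, data: &[u8]) -> io::Result<()> {
        let room = self.mem_limit.saturating_sub(self.mem.len());
        let (head, tail) = data.split_at(room.min(data.len()));
        self.mem.extend_from_slice(head);
        if !tail.is_empty() {
            if self.file.is_none() {
                let f = OpenOptions::new()
                    .read(true)
                    .write(true)
                    .create(true)
                    .truncate(true)
                    .open(&self.path)?;
                self.file = Some(f);
            }
            // The file is only ever appended to, so write order is preserved.
            self.file.as_mut().unwrap().write_all(tail)?;
        }
        Ok(())
    }

    /// Reads everything back in write order: memory part, then file part.
    fn read_all(&mut self) -> io::Result<Vec<u8>> {
        let mut out = self.mem.clone();
        if let Some(f) = self.file.as_mut() {
            f.seek(SeekFrom::Start(0))?;
            f.read_to_end(&mut out)?; // leaves the cursor at the end of the file
        }
        Ok(out)
    }

    /// Number of bytes currently held in memory.
    fn mem_len(&self) -> usize {
        self.mem.len()
    }
}
```

With a `mem_limit` of 4, writing `abc` and then `defgh` would keep `abcd` in memory and put `efgh` in the file, and `read_all` would return the bytes in their original order.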

Describe the solution you'd like

Describe alternatives you've considered

Additional context

richox · 2024-11-24T10:42:57Z

There is another path: when on-heap memory is almost full, spill.rs creates a FileSpill instead of an OnHeapSpill, which spills data directly to a disk file. FileBasedSpillBuf is only used when the data is spilled by other non-native operators.
So I suggest we keep OnHeapSpillManager simple and easy to maintain.
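
The selection described above can be sketched roughly as follows. This is an illustration only, not the actual code in spill.rs: the `SpillKind` enum, the `choose_spill` function, and the `almost_full_ratio` threshold are all hypothetical names and parameters.

```rust
/// Hypothetical spill kinds, named after the ones mentioned in this thread.
#[derive(Debug, PartialEq)]
enum SpillKind {
    OnHeap,
    File,
}

/// Picks a spill kind from the current on-heap usage.
/// `almost_full_ratio` is an assumed threshold, not a real configuration key.
fn choose_spill(used: u64, capacity: u64, almost_full_ratio: f64) -> SpillKind {
    // Treat a zero-capacity heap as already full.
    let usage = if capacity == 0 {
        1.0
    } else {
        used as f64 / capacity as f64
    };
    if usage >= almost_full_ratio {
        SpillKind::File // spill directly to a disk file
    } else {
        SpillKind::OnHeap // keep the spill in on-heap memory
    }
}
```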

zuston · 2024-11-24T13:28:09Z

> There is another path: when on-heap memory is almost full, spill.rs creates a FileSpill instead of an OnHeapSpill, which spills data directly to a disk file. FileBasedSpillBuf is only used when the data is spilled by other non-native operators. So I suggest we keep OnHeapSpillManager simple and easy to maintain.

Yes, got it. But if an OnHeapSpill is created at the native side's request while on-heap memory is sufficient, the file spill is still used if on-heap memory turns out to be insufficient by the time the spill is actually used. Anyway, I think this is only a corner-case improvement and may not yield a significant performance gain.
