[VL] Distinct aggregation OOM when getOutput #8025
Comments
Looks like it's the same issue as shuffle spill. All spill merges should have the same issue; we should solve it in a similar way.
What's vanilla Spark's spill buffer size? Is it configurable? In theory, vanilla Spark has the same issue as Gluten. @jinchengchenghh do you know?
I can only find the configuration spark.shuffle.spill.diskWriteBufferSize. There is no spill-merge one.
Thank you, @ccat3z. I encountered the same issue in the orderby operator and debugged it for several days!
kSpillReadBufferSize controls the size used to read each file: the ordered reader will create a FileInputStream for each file and allocate kSpillReadBufferSize (default 1 MB) per file. Can you try to adjust this value?
kSpillWriteBufferSize controls the serialization buffer: once it reaches this threshold, the buffer is flushed and compressed.
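For concreteness, here is a minimal sketch of how these two knobs could be expressed as Velox session properties; the lowercase key names follow Velox's QueryConfig naming and are assumptions here, not a confirmed Gluten API:

```scala
// Sketch only: the two Velox spill buffers discussed above, expressed as
// session properties. The key names are assumed to match Velox's
// QueryConfig ("spill_read_buffer_size", "spill_write_buffer_size");
// verify against the Velox source before relying on them.
val spillTuning: Map[String, String] = Map(
  // Per-spill-file read buffer: with N files open at once during the
  // ordered merge, read buffers alone cost roughly N * this value.
  "spill_read_buffer_size" -> (512 * 1024).toString,
  // Serialization buffer: flushed and compressed once it reaches this size.
  "spill_write_buffer_size" -> (1024 * 1024).toString
)
```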
Spark also opens all the spill files to read. It uses the UnsafeSorterSpillReader class for this, which has a config to control the read buffer size (default 1 MB): spark.unsafe.sorter.spill.reader.buffer.size. The bufferSize is loaded in the reader's constructor.
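Paraphrased, the sizing logic amounts to something like the following sketch (the config key and the 1 MB default are real; the surrounding code is simplified, not verbatim Spark source):

```scala
import org.apache.spark.SparkEnv

// Sketch of Spark's per-spill-file read-buffer sizing (paraphrased, not
// verbatim source). One buffered stream of this size is allocated per
// spill file, so N files opened together cost roughly N * bufferSizeBytes.
val defaultBufferSize: Long = 1024L * 1024L // 1 MB

val bufferSizeBytes: Long = SparkEnv.get.conf.getSizeAsBytes(
  "spark.unsafe.sorter.spill.reader.buffer.size", defaultBufferSize)
```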
In that case, we need to respect UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE.
Thank you, @jinchengchenghh. With the tuning of kMaxSpillRunRows and kSpillWriteBufferSize, one of my tasks succeeded, but the other one still fails. Looks like there is still some large memory allocation in getOutput.
Can you add it as a config in Gluten?
Should we propose the approach of #7861?
So the worst case for vanilla Spark is also a 1 MB buffer per file, right? Let's honor the value of spark.unsafe.sorter.spill.reader.buffer.size then. It may be set in queries.
#7861 releases the buffer after the read, while the Velox FileInputStream reuses the buffer.
Yes, I will draft a PR to respect this config.
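A minimal sketch of what respecting this config could look like on the Gluten side (the Velox-side key name and the helper function are illustrative assumptions, not the actual PR):

```scala
import org.apache.spark.SparkEnv

// Hypothetical glue code: read Spark's spill-reader buffer size and
// forward it to the Velox backend as a session property. The Velox key
// name below is an assumption for illustration, not the actual PR.
def spillReadBufferConf(): (String, String) = {
  val bytes = SparkEnv.get.conf.getSizeAsBytes(
    "spark.unsafe.sorter.spill.reader.buffer.size", 1024L * 1024L)
  "spill_read_buffer_size" -> bytes.toString
}
```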
Maybe it's because the streams hold all the buffers, which are only released after all the files have been read. I don't see compression in Spark's spill, so it doesn't need to request memory for compression. I will add a new config to control the Velox spill codec. Is it still OOM, or is it killed by YARN?
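As a sketch of what such a codec config might look like (the config key and the default below are hypothetical, not an existing Gluten option):

```scala
import org.apache.spark.SparkEnv

// Hypothetical shape of the proposed codec config; the key name and the
// "none" option are illustrative assumptions, not an existing Gluten option.
val spillCodec: String = SparkEnv.get.conf.get(
  "spark.gluten.sql.columnar.backend.velox.spillCompressionCodec", "lz4")
// "none" would skip allocating compression buffers during spill entirely.
```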
Spark closes the reader once it has read all of its records, and it also closes the serializer in the serde.
No, #7861 uses mmap, so memory is mapped into user space directly. Velox uses file read/write, so data is copied into a buffer.
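To illustrate the distinction with a generic JVM sketch (not Gluten or Velox code): mmap maps file pages into the address space, while a read copies bytes into an application-owned buffer.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

// Generic JVM illustration of the two I/O styles being contrasted; this
// is not Gluten/Velox code, and the file path is a placeholder.
val ch = FileChannel.open(Paths.get("/tmp/spill.bin"), StandardOpenOption.READ)

// mmap style (#7861): file pages are mapped into user space directly,
// with no explicit copy into an application-owned buffer.
val mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size())

// read style (Velox FileInputStream): data is copied from the file into
// a buffer the application allocates and owns.
val buf = ByteBuffer.allocate(1024 * 1024)
ch.read(buf)
```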
I'm adding it to #8026.
We should already have the spill codec and the spill buffer configurable.
Currently it uses the Spark codec in SortBuffer.
Backend
VL (Velox)
Bug description
Distinct aggregation will merge all sorted spill files in getOutput() (SpillPartition::createOrderedReader). If there are too many spill files, reading the first batch of each file into memory consumes a significant amount of memory. In one of our internal cases, one task generated 300 spill files, which required close to 3 GB of memory (roughly 10 MB per file).

Possible workarounds:

- Increase kMaxSpillRunRows; the default of 1M will generate too many spill files for hundreds of millions of rows of input. See [GLUTEN-7249][VL] Lower default overhead memory ratio and spill run size #7531.
- Reduce kSpillWriteBufferSize to 1M or lower. Why is it set to 4M by default? Is there any experience in performance tuning?

Spark version
None
Spark configurations
No response
System information
No response
Relevant logs
No response