-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dispatcher decompressing data when not needed #31
Comments
Another issue is that even if we are sending uncompressed bytes of a shallow copy the bool DispatcherServer::sendBytes has a hardcoded value to true for the indicator compressedOrNot, thus it will probably not work. |
Thanks Dimitrije. (1) The reason that it decompress data before forwarding the compressed data is to give dispatcher a chance to look into the content to do size checking or other things such as from line 98 to line 108. (2) I once did write an example to successfully use shallow copy and send uncompressed bytes directly, please check it here: https://github.com/riceplinygroup/plinycompute/blob/master/applications/TPCHBench/tpchDataGenerator.cc Let me know if you have further questions. |
Hi Jia, (1) The only check we do is whether the vector is empty or not, can be done on the other side as well. The code for type validation is commented out 267-292 that is called on 110 and it will always pass. (2) If you look at the sendBytes method 349 - 352 you will see that the second last argument is set to true. Which corresponds to the parameter compressedOrNot in StorageAddData. Meaning it will always be set as compressed regardless if the flag ENABLE_COMPRESSION is enabled or not. |
Hey Dimitrije, Not very sure what exactly your concerns are and how important it is. Just to clarify it a bit and share information/ideas from my side: For your first point, we can easily change code to cancel the size checking and directly forward the compressed bytes, but I'm afraid for some complex dispatching features such as partitioning or building indexing while dispatching data, decompression is unavoidable. For your second point, I guess it also only requires small change of code to make the flag in sendBytes to be configurable and I suspect it once worked like that. Hope this helps. |
Hi Jia, |
Sure. |
Hi I've been adding stuff to the Dispatcher to get my new algorithm running.
I noticed that the dispatcher is decompressing the data even though it does not use the decompressed data.
In the file
https://github.com/riceplinygroup/plinycompute/blob/master/pdb/src/serverFunctionalities/source/DispatcherServer.cc
line 88 to 95 we are decompressing the data with snappy stored in tempPage. The decompressed result is stored in readToHere. But if we look at the 130 to 136 only the tempPage (the compressed data is sent), thus we are just doing useless work on the dispatcher.
The text was updated successfully, but these errors were encountered: