Add example reading data from an mmaped IPC file
#6986
@@ -87,6 +87,9 @@ criterion = { version = "0.5", default-features = false }
half = { version = "2.1", default-features = false }
rand = { version = "0.8", default-features = false, features = ["std", "std_rng"] }
serde = { version = "1.0", default-features = false, features = ["derive"] }
# used in examples
memmap2 = "0.9.3"
bytes = "1.9"
We could avoid this bytes dependency if we added Buffer::from_owned
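To make the suggestion concrete, here is a minimal std-only sketch of what an owner-backed buffer could look like. It is not the arrow-rs API: the type `OwnedBytes` and its methods are hypothetical names chosen for illustration, and the thread does not show the signature Arrow would use. The point is only that a buffer can keep any `AsRef<[u8]>` owner (for example a memory map) alive behind an `Arc` and hand out views into it without copying.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical owner-backed byte buffer (illustration only, not arrow-rs `Buffer`).
#[derive(Clone)]
pub struct OwnedBytes {
    /// Keeps the underlying allocation (e.g. a memory map) alive.
    owner: Arc<dyn AsRef<[u8]> + Send + Sync>,
    /// The window of the owner's bytes that this handle exposes.
    range: Range<usize>,
}

impl OwnedBytes {
    /// Wrap any byte container without copying its contents.
    pub fn from_owner<T>(owner: T) -> Self
    where
        T: AsRef<[u8]> + Send + Sync + 'static,
    {
        let len = owner.as_ref().len();
        Self {
            owner: Arc::new(owner),
            range: 0..len,
        }
    }

    /// Return a sub-view sharing the same owner. Panics if out of bounds.
    pub fn slice(&self, offset: usize, len: usize) -> Self {
        let start = self.range.start + offset;
        let end = start + len;
        assert!(end <= self.range.end, "slice out of bounds");
        Self {
            owner: Arc::clone(&self.owner),
            range: start..end,
        }
    }

    /// Borrow the bytes of this view directly from the owner.
    pub fn as_slice(&self) -> &[u8] {
        let bytes: &[u8] = (*self.owner).as_ref();
        &bytes[self.range.clone()]
    }
}
```

With a type like this, the example would not need to go through `bytes::Bytes` to hold the memory map. Every view created by `slice` points into the original allocation, which is the property the IPC example relies on.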
use std::path::PathBuf;
use std::sync::Arc;

/// This example shows how to read data from an Arrow IPC file without copying
I re-read the code in StreamDecoder again and I am pretty sure it will also be zero-copy array creation if fed via StreamDecoder::decode:
arrow-rs/arrow-ipc/src/reader/stream.rs
Line 131 in fc6936a
pub fn decode(&mut self, buffer: &mut Buffer) -> Result<Option<RecordBatch>, ArrowError> {
Perhaps I can extend this example to show that as well 🤔
LGTM
Thank you for the review @tustvold
Which issue does this PR close?
Rationale for this change
Reading Arrow IPC files without copying is a key feature of the format, but today it is hard to work out how to do it.
What changes are included in this PR?
Add an example of using mmap with FileDecoder
Are there any user-facing changes?
Example
Potential follow ons (TODO):
Add Buffer::from_owner, as suggested by @tustvold in "MMap support for IPC files" #6709 (comment)
Move IPCBufferDecoder or something similar into the actual crate API