Initial support for tokio #121
base: sync-async
@@ -4,4 +4,9 @@ fn main() {
    if var("CARGO_FEATURE_DEFLATE_MINIZ").is_ok() {
        println!("cargo:warning=Feature `deflate-miniz` is deprecated; replace it with `deflate`");
    }
    #[cfg(not(any(feature = "sync", feature = "tokio")))]
    compile_error!("Missing Required feature");
Update `ci.yml` to enable `sync` wherever it has `--no-default-features`, or else convert it to a `no-sync` feature (which would be unidiomatic but have the advantage of being backward-compatible for more users).
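For illustration, the feature requirement discussed here could also be checked in the build script through the `CARGO_FEATURE_*` environment variables, the same mechanism the hunk above already uses for `deflate-miniz`. This is only a minimal sketch: the helper names `feature_enabled` and `check_io_features` are hypothetical, and the policy shown (reject only the case where neither feature is enabled) is an assumption, not what the PR implements.

```rust
use std::env;

/// Returns true when Cargo enabled `feature` for the package being built.
/// Cargo exposes each enabled feature to build scripts as
/// `CARGO_FEATURE_<NAME>`, upper-cased and with `-` replaced by `_`.
fn feature_enabled(feature: &str) -> bool {
    let key = format!(
        "CARGO_FEATURE_{}",
        feature.to_uppercase().replace('-', "_")
    );
    env::var_os(key).is_some()
}

/// Hypothetical policy check: at least one I/O flavour must be selected,
/// and selecting both is allowed.
fn check_io_features(sync: bool, tokio: bool) -> Result<(), &'static str> {
    if sync || tokio {
        Ok(())
    } else {
        Err("enable at least one of the `sync` or `tokio` features")
    }
}
```

In a `build.rs`, the result of `check_io_features(feature_enabled("sync"), feature_enabled("tokio"))` could be turned into a panic with the error message, which fails the build with a readable explanation.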
Once the tokio feature is working properly in tests I will update the CI. Currently that is broken.
Thanks for letting me know; but if it's going to take longer than about another 3 days, then I'd also greatly appreciate a CI-able update once or twice per week so we could get an idea of how the PR was progressing toward a releasable state where all further work could be deferred to follow-up PRs.
Then I will try doing it this week after I finish some of my college exams.
@SapryWenInera Have you finished your exams? If not, when do you expect to be able to address the comments on this PR? An ETA would be helpful not only for me, but also for the authors of the other major PRs, given the likelihood of merge conflicts.
    compile_error!("Missing Required feature");

    #[cfg(all(feature = "sync", feature = "tokio"))]
    compile_error!("The features sync and tokio cannot be used together")
This restriction may turn out to be a problem for some users. It's possible to import two configurations of a crate by renaming one, but then they won't recognize each other's struct types or traits. I can think of two solutions:
- (my preference) Give the async methods different names (e.g. `parse_async`) instead. If they're used in the same calling code, move that code to a macro and use method-scoped type names (e.g. `type Read = AsyncRead;`) and macro parameters to differentiate them. See the example in my separate comment.
- Make a separate crate for the Tokio features, and another for the shared core.
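A minimal sketch of the first option, using only the standard library so that it runs without tokio. `AsyncBytes`, `Locator` and `block_on` are hypothetical stand-ins: in the real crate the async side would take a tokio `AsyncRead`, and a runtime would drive the future. The point is only that `parse` and `parse_async` can live on the same type with no feature exclusion.

```rust
use std::future::Future;
use std::io::{self, Read};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for an async reader (hypothetical).
pub struct AsyncBytes<'a>(pub &'a [u8]);

impl AsyncBytes<'_> {
    pub async fn read_u32_le(&mut self) -> io::Result<u32> {
        let mut buf = [0u8; 4];
        self.0.read_exact(&mut buf)?;
        Ok(u32::from_le_bytes(buf))
    }
}

#[derive(Debug, PartialEq)]
pub struct Locator {
    pub number_of_disks: u32,
}

impl Locator {
    /// The sync entry point keeps the existing name.
    pub fn parse<R: Read>(reader: &mut R) -> io::Result<Self> {
        let mut buf = [0u8; 4];
        reader.read_exact(&mut buf)?;
        Ok(Self { number_of_disks: u32::from_le_bytes(buf) })
    }

    /// The async entry point gets its own name, so both can be compiled in.
    pub async fn parse_async(reader: &mut AsyncBytes<'_>) -> io::Result<Self> {
        Ok(Self { number_of_disks: reader.read_u32_le().await? })
    }
}

/// Tiny busy-polling executor, enough for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```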
I'm thinking of moving the old synchronous code into a module of `read` named `sync`, with the async code in a `tokio` module; this would avoid conflicts between the two codebases. I'll probably push a PR for that by the end of the week, since it doesn't require any feature or CI changes.
That sounds good, but will there be a shared-core module as well?
The idea is that the structs and enums can be shared but the actual logic can be separate. This should avoid import conflicts, and code changes in the sync part should not conflict with the async part. Do you think this or the macro approach is better?
I favor sharing as much code as possible, based on the Don't Repeat Yourself principle. But the modular design should be compatible with the macro approach: define the macros in the shared core module, and invoke them in the sync and tokio modules.
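One way this layout might look, as a sketch under several assumptions: the module names `shared`, `sync` and `async_io`, the `Reader` wrappers, and `block_on` are all hypothetical, and `async_io` stands in for the proposed `tokio` module so the example runs on the standard library alone. The shared macro receives `.await` as a token list from the async caller and an empty list from the sync caller.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[macro_use]
pub mod shared {
    /// ZIP64 end-of-central-directory locator signature.
    pub const SIGNATURE: u32 = 0x0706_4b50;

    #[derive(Debug, PartialEq)]
    pub struct Locator {
        pub disk_with_central_directory: u32,
        pub end_of_central_directory_offset: u64,
        pub number_of_disks: u32,
    }

    // Shared parsing body. The bracketed tokens are `.await` for async
    // callers and empty for sync callers.
    macro_rules! parse_locator {
        ($reader:ident, [$($aw:tt)*]) => {{
            if ($reader.read_u32_le() $($aw)*)? != SIGNATURE {
                return Err(io::Error::new(io::ErrorKind::InvalidData, "bad signature"));
            }
            Ok(Locator {
                disk_with_central_directory: ($reader.read_u32_le() $($aw)*)?,
                end_of_central_directory_offset: ($reader.read_u64_le() $($aw)*)?,
                number_of_disks: ($reader.read_u32_le() $($aw)*)?,
            })
        }};
    }
}

pub mod sync {
    use super::shared::{Locator, SIGNATURE};
    use std::io::{self, Read};

    /// Little-endian helpers over any `Read` (hypothetical wrapper).
    pub struct Reader<R>(pub R);

    impl<R: Read> Reader<R> {
        pub fn read_u32_le(&mut self) -> io::Result<u32> {
            let mut buf = [0u8; 4];
            self.0.read_exact(&mut buf)?;
            Ok(u32::from_le_bytes(buf))
        }
        pub fn read_u64_le(&mut self) -> io::Result<u64> {
            let mut buf = [0u8; 8];
            self.0.read_exact(&mut buf)?;
            Ok(u64::from_le_bytes(buf))
        }
    }

    pub fn parse<R: Read>(reader: &mut Reader<R>) -> io::Result<Locator> {
        parse_locator!(reader, [])
    }
}

pub mod async_io {
    use super::shared::{Locator, SIGNATURE};
    use std::io;

    /// Async facade over the sync helper; a stand-in for a tokio reader.
    pub struct Reader<'a>(pub super::sync::Reader<&'a [u8]>);

    impl Reader<'_> {
        pub async fn read_u32_le(&mut self) -> io::Result<u32> {
            self.0.read_u32_le()
        }
        pub async fn read_u64_le(&mut self) -> io::Result<u64> {
            self.0.read_u64_le()
        }
    }

    pub async fn parse(reader: &mut Reader<'_>) -> io::Result<Locator> {
        parse_locator!(reader, [.await])
    }
}

/// Tiny busy-polling executor, enough for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Because `macro_rules!` resolves item names such as `Locator`, `SIGNATURE` and `io` at the call site, each module has to import them itself; that is a trade-off of this approach rather than a requirement of the design.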
@SapryWenInera Could you please prioritize this issue for a fix, so that I can run CI on your work in progress and use the results to estimate how much longer this PR will take?
#[cfg(feature = "tokio")]
impl Zip64CentralDirectoryEndLocator {
    pub async fn parse<T>(reader: &mut T) -> ZipResult<Self>
    where
        T: AsyncRead + Unpin,
    {
        let magic = reader.read_u32_le().await?;
        if magic != ZIP64_CENTRAL_DIRECTORY_END_LOCATOR_SIGNATURE {
            return Err(ZipError::InvalidArchive(
                "Invalid zip64 locator digital signature header",
            ));
        }
        let disk_with_central_directory = reader.read_u32_le().await?;
        let end_of_central_directory_offset = reader.read_u64_le().await?;
        let number_of_disks = reader.read_u32_le().await?;

        Ok(Self {
            disk_with_central_directory,
            end_of_central_directory_offset,
            number_of_disks,
        })
    }
}
macro_rules! parse {
    // `$maybe_await` is `.await` for the async method and empty for the sync one.
    ($reader:ident, [$($maybe_await:tt)*]) => {{
        let magic = ($reader.read_u32_le() $($maybe_await)*)?;
        if magic != ZIP64_CENTRAL_DIRECTORY_END_LOCATOR_SIGNATURE {
            return Err(ZipError::InvalidArchive(
                "Invalid zip64 locator digital signature header",
            ));
        }
        let disk_with_central_directory = ($reader.read_u32_le() $($maybe_await)*)?;
        let end_of_central_directory_offset = ($reader.read_u64_le() $($maybe_await)*)?;
        let number_of_disks = ($reader.read_u32_le() $($maybe_await)*)?;
        Ok(Self {
            disk_with_central_directory,
            end_of_central_directory_offset,
            number_of_disks,
        })
    }};
}
impl Zip64CentralDirectoryEndLocator {
    #[cfg(feature = "tokio")]
    pub async fn parse<T>(reader: &mut T) -> ZipResult<Self>
    where
        T: AsyncRead + Unpin,
    {
        parse!(reader, [.await])
    }
    #[cfg(feature = "sync")]
    pub fn parse<T>(reader: &mut T) -> ZipResult<Self>
    where
        T: Read,
    {
        parse!(reader, [])
    }
}
I've been brainstorming ways to use macros to share code between the sync and async versions, but I can't find a way to write a macro that handles both calls with `.await` and calls without it without breaking one or the other, as in the code you just showed. Is the goal of supporting async to offer a separate code path for people to use, or just a wrapper around the current sync functions?
Maybe we need an extension trait that's implemented for both `Read` and `AsyncRead`, and another for `Write` and `AsyncWrite`. For the compression and decompression themselves, I think the wrapper approach is probably adequate, since they're CPU-heavy.
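The sync half of such an extension trait might look like the sketch below. `ReadLeExt` is a hypothetical name; the method names mirror the `read_u32_le` and `read_u64_le` helpers that tokio's `AsyncReadExt` already offers for async readers, so sync and async call sites would differ only in the `.await`.

```rust
use std::io::{self, Read};

/// Hypothetical extension trait: little-endian helpers for every `Read`.
pub trait ReadLeExt: Read {
    fn read_u32_le(&mut self) -> io::Result<u32> {
        let mut buf = [0u8; 4];
        self.read_exact(&mut buf)?;
        Ok(u32::from_le_bytes(buf))
    }

    fn read_u64_le(&mut self) -> io::Result<u64> {
        let mut buf = [0u8; 8];
        self.read_exact(&mut buf)?;
        Ok(u64::from_le_bytes(buf))
    }
}

// Blanket implementation: any reader gains the helpers automatically.
impl<R: Read + ?Sized> ReadLeExt for R {}
```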
I can make a demo in another branch of using async as just a wrapper around sync, but my preferred approach would actually be to make all I/O calls async and to try to create shareable sync functions for non-I/O operations.
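A sketch of what a shareable non-I/O function could look like for the locator, assuming the record is read into a fixed 20-byte buffer first (4-byte signature, 4-byte disk number, 8-byte offset, 4-byte disk count). The names `decode_locator`, `Locator` and `LOCATOR_LEN` are hypothetical. Only the single `read_exact` call would need a sync and an async variant; the decoding itself is shared.

```rust
use std::io::{self, Read};

/// ZIP64 end-of-central-directory locator signature.
pub const LOCATOR_SIGNATURE: u32 = 0x0706_4b50;
/// Size of the locator record in bytes.
pub const LOCATOR_LEN: usize = 20;

#[derive(Debug, PartialEq)]
pub struct Locator {
    pub disk_with_central_directory: u32,
    pub end_of_central_directory_offset: u64,
    pub number_of_disks: u32,
}

fn u32_at(raw: &[u8], at: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&raw[at..at + 4]);
    u32::from_le_bytes(buf)
}

fn u64_at(raw: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&raw[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Pure decoding step: no I/O, so sync and async callers can share it.
pub fn decode_locator(raw: &[u8; LOCATOR_LEN]) -> io::Result<Locator> {
    if u32_at(raw, 0) != LOCATOR_SIGNATURE {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "invalid zip64 locator signature",
        ));
    }
    Ok(Locator {
        disk_with_central_directory: u32_at(raw, 4),
        end_of_central_directory_offset: u64_at(raw, 8),
        number_of_disks: u32_at(raw, 16),
    })
}

/// Sync wrapper: the only I/O is a single `read_exact`.
pub fn parse<R: Read>(reader: &mut R) -> io::Result<Locator> {
    let mut raw = [0u8; LOCATOR_LEN];
    reader.read_exact(&mut raw)?;
    decode_locator(&raw)
}
```

An async wrapper would have the same shape, awaiting an async `read_exact` into the buffer and then calling `decode_locator` unchanged.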
I still don't understand why we can't use macros whose parameters have sync and async definitions, to share code between sync and async versions of each function (e.g. `Zip64CentralDirectoryEndLocator` would have its sync and async `parse` method bodies generated by separate calls to a `parse!` macro that took as arguments `$reader_read_u32_le_maybe_async:expr` and `$reader_read_u64_le_maybe_async:expr`, both of which could be shared with other methods by having other macros define them at `impl`-block or wider scope). Could you please explain why you don't think that's feasible?
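A reduced sketch of that proposal, with hypothetical names throughout (`parse_body!`, `SyncReader`, `AsyncReader`, `block_on`) and the signature check left out for brevity. The macro takes the two read expressions as `expr` parameters; each use of a parameter in the body expands to a fresh evaluation of the expression, so the sync caller passes plain calls and the async caller passes the same calls with `.await` attached.

```rust
use std::future::Future;
use std::io::{self, Read};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

pub struct SyncReader<'a>(pub &'a [u8]);

impl SyncReader<'_> {
    pub fn read_u32_le(&mut self) -> io::Result<u32> {
        let mut buf = [0u8; 4];
        self.0.read_exact(&mut buf)?;
        Ok(u32::from_le_bytes(buf))
    }
    pub fn read_u64_le(&mut self) -> io::Result<u64> {
        let mut buf = [0u8; 8];
        self.0.read_exact(&mut buf)?;
        Ok(u64::from_le_bytes(buf))
    }
}

/// Async facade over the sync reader; a stand-in for a tokio reader.
pub struct AsyncReader<'a>(pub SyncReader<'a>);

impl AsyncReader<'_> {
    pub async fn read_u32_le(&mut self) -> io::Result<u32> {
        self.0.read_u32_le()
    }
    pub async fn read_u64_le(&mut self) -> io::Result<u64> {
        self.0.read_u64_le()
    }
}

// Shared body: every use of `$read_u32` or `$read_u64` re-evaluates the
// expression the caller passed in.
macro_rules! parse_body {
    ($read_u32:expr, $read_u64:expr) => {{
        let disk = $read_u32?;
        let offset = $read_u64?;
        let disks = $read_u32?;
        Ok((disk, offset, disks))
    }};
}

pub fn parse(reader: &mut SyncReader<'_>) -> io::Result<(u32, u64, u32)> {
    parse_body!(reader.read_u32_le(), reader.read_u64_le())
}

pub async fn parse_async(reader: &mut AsyncReader<'_>) -> io::Result<(u32, u64, u32)> {
    parse_body!(reader.read_u32_le().await, reader.read_u64_le().await)
}

/// Tiny busy-polling executor, enough for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The `.await` tokens are written at the call site, inside an `async fn`, which is why the shared body itself never has to mention them.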
Sorry for the long break. There are two issues in trying to share code between async and sync functions using macros. As long as you're using async/await syntax, calling `.await` inside the macro breaks sync code, while not calling `.await` breaks async code; and using `.await` outside an async function or block is not allowed.
macro_rules! maybe_await {
    ($maybe_async:expr) => {
        $maybe_async.await.unwrap()
    };
}
fn main() {
    #[cfg(feature = "async")]
    let string = maybe_await!(async { read_to_string("/tmp/foo") }); // Code breaks due to main not being async
    #[cfg(feature = "sync")]
    let string = maybe_await!(read_to_string("/tmp/foo")); // Code breaks due to .await being called on a sync function
    println!("{}", string);
}
I don't see a way to solve both problems while sharing any code whatsoever; if you have any idea how to do this, I'm all ears.
For tests, one fix might be to use a `maybe_block_on!` macro as well. But maybe this is more trouble than it's worth; let's wait and see how much duplicated code is left after rebasing against #93 and factoring out as much as possible.
It occurs to me that one solution might involve 0th-order macros called `sync_defs!` and `async_defs!` that would provide different definitions of 1st-order macros such as `maybe_await!`, `maybe_block_on!` and `read_or_async_read!`, and then definitions of sync and async versions of a function would invoke their different 0th-order macros and then the same 2nd-order macro (which would contain the shared code). Could that possibly work?
FYI: Be aware that some other major PRs are currently open.
Signed-off-by: Chris Hennick <[email protected]>
This is to enable `doc_auto_cfg` feature with Docs.rs.
deflate-zlib was an omission; deflate64 is a different, backward-incompatible algorithm. Signed-off-by: Chris Hennick <[email protected]>
I've retargeted this PR to the same branch as #134, but we're getting complicated merge conflicts. Please address them.
docs: Enable `doc_auto_cfg` feature with Docs.rs
Please revert the |
@@ -316,16 +318,4 @@ mod test {
        let mut file = reader.by_index(0).unwrap();
        assert_eq!(file.read(&mut decompressed).unwrap(), 12);
    }

    #[test]
Why are you deleting this?
@@ -25,6 +25,7 @@
//! | ZipCrypto deprecated encryption | ✅ | ✅ |
//!
//!
#![cfg_attr(docsrs, feature(doc_auto_cfg))]
Why this change?
.await?;

let results: Vec<Result<CentralDirectoryInfo, ZipError>> = search_results.iter().map(|(footer64, archive_offset)| {
    let directory_start_result = footer64.central_directory_offset.checked_add(*archive_offset).ok_or(ZipError::InvalidArchive("Invalid central directory size or effect"));
    let directory_start_result = footer64.central_directory_offset.checked_add(*archive_offset).ok_or(ZipError::InvalidArchive("Invalid central directory size or effect"));
    let directory_start_result = footer64.central_directory_offset.checked_add(*archive_offset).ok_or(ZipError::InvalidArchive("Invalid central directory size or offset"));
)
.await?;

let results: Vec<Result<CentralDirectoryInfo, ZipError>> = search_results.iter().map(|(footer64, archive_offset)| {
Factor out lines 109-130 and their copies in the sync version to a shared method. It can be called `validate_zip64_footers`.
#[allow(deprecated)]
let compression_method =
    CompressionMethod::from_u16(reader.read_u16_le().await?);
Factor out lines 286-313 into a shared method; it can be called `decode_aes_extra_data`.
@@ -21,7 +21,7 @@ jobs:
matrix:
  os: [ubuntu-latest, macOS-latest, windows-latest]
  rustalias: [stable, nightly, msrv]
  feature_flag: ["--all-features", "--no-default-features", ""]
  feature_flag: ["--no-default-features --features sync_all", "--no-default-features --features tokio_all", "--no-default-features --features sync", "--no-default-features --features tokio", ""]
Could you please make it possible to build with both `sync` and `tokio`? To do this, you can invoke the traits with an explicit self parameter, e.g. `Read::read_exact(self, buf)` and `AsyncRead::read_exact(self, buf)` instead of `self.read_exact(buf)`. Since methods with the same name don't conflict when they're specified in different traits, this should be the only change needed.
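A small sketch of the disambiguation being suggested, runnable on the standard library alone. `AsyncishRead` is a hypothetical stand-in for the async trait (in tokio the `read_exact` method is provided by `AsyncReadExt`), and `block_on` is a toy executor. With both traits in scope, the method-call form `cursor.read_exact(..)` would be ambiguous, while the trait-qualified calls compile side by side.

```rust
use std::future::{ready, Future, Ready};
use std::io::{self, Cursor, Read};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for an async read trait (hypothetical). It deliberately reuses
/// the method name `read_exact` so that it collides with `std::io::Read`.
pub trait AsyncishRead {
    fn read_exact(&mut self, buf: &mut [u8]) -> Ready<io::Result<()>>;
}

impl<T: AsRef<[u8]>> AsyncishRead for Cursor<T> {
    fn read_exact(&mut self, buf: &mut [u8]) -> Ready<io::Result<()>> {
        // Inside the impl the sync method must be named through its trait too.
        ready(Read::read_exact(self, buf))
    }
}

pub fn read_both(cursor: &mut Cursor<Vec<u8>>) -> io::Result<([u8; 2], [u8; 2])> {
    let mut first = [0u8; 2];
    let mut second = [0u8; 2];
    // Trait-qualified calls pick one implementation each, so the shared
    // method name causes no conflict.
    Read::read_exact(&mut *cursor, &mut first)?;
    block_on(AsyncishRead::read_exact(&mut *cursor, &mut second))?;
    Ok((first, second))
}

/// Tiny busy-polling executor, enough for futures that never really wait.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```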
FYI, I found a crate that does something similar to this, but with proc macros: https://crates.io/crates/async-generic. Looks like it may still be worth layering a few regular macros on top, for when async-generic functions call other async-generic functions.
#207 refactors readers to pass through the parameterized reader type rather than flatten it to |
Start implementing tokio support for reading zip files, separating the I/O operations of sync and async code using the features `sync` and `tokio`. Starting work to close #108.