not always searching files in zip stored with the "Store" method #455

Open
swg0101 opened this issue Jan 14, 2025 · 1 comment
Labels
discuss Feedback requested for possible enhancements question A question that has or needs further clarification


swg0101 commented Jan 14, 2025

On the latest version, 7.13, searching a zip file whose entries use method 0 (Store) fails with the following message:
ugrep: cannot decompress crowdstrike-1.49.0.zip: unsupported zip compression method 0

The file in question can be found at the link below, but any zip file using the Store method should produce the same error.

https://epr.elastic.co/epr/crowdstrike/crowdstrike-1.49.0.zip

Any thoughts on adding support for this method? Most ZIP files use either DEFLATE (supported) or Store (not supported).

Member

genivia-inc commented Jan 14, 2025

The STORE method is supported in ugrep (src/zstream.cpp:530):

else if (method != Compression::STORE || (flag & 8) != 0)
{
        std::string message("unsupported zip compression method ");

However, if bit 3 of the "general purpose bit flag" is set in the zip local file header, then the CRC-32 and file sizes are not known when the header is written; they are recorded in a data descriptor that follows the file data. With the STORE method this prevents the entry from being searchable right now, because we don't know the size when we start reading the data.
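To make the condition above concrete, here is a minimal sketch of reading the fixed 30-byte local file header and testing for "STORE with bit 3 set". It is not ugrep's code: the names `LocalHeader`, `parse_local_header` and `stored_size_unknown` are hypothetical, and the field offsets are taken from the PKWARE APPNOTE layout of the local file header.

```cpp
#include <cstddef>
#include <cstdint>

// Read little-endian integers, as used throughout the zip format.
static uint16_t le16(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
static uint32_t le32(const uint8_t* p) {
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical struct with the local file header fields that matter here.
struct LocalHeader {
  uint16_t flag;               // general purpose bit flag
  uint16_t method;             // 0 = STORE, 8 = DEFLATE
  uint32_t compressed_size;    // zero in the header when bit 3 is set
  uint32_t uncompressed_size;  // zero in the header when bit 3 is set
};

// Parse the fixed 30-byte part of a local file header.
// Returns false if the buffer is too short or the signature does not match.
bool parse_local_header(const uint8_t* buf, size_t len, LocalHeader& out) {
  if (len < 30 || le32(buf) != 0x04034b50u) return false;
  out.flag = le16(buf + 6);
  out.method = le16(buf + 8);
  out.compressed_size = le32(buf + 18);
  out.uncompressed_size = le32(buf + 22);
  return true;
}

// True when the entry is stored uncompressed and its size is deferred to the
// data descriptor, so a streaming reader cannot tell where the data ends.
bool stored_size_unknown(const LocalHeader& h) {
  return h.method == 0 && (h.flag & 8) != 0;
}
```

A DEFLATE entry with bit 3 set is not a problem in this sense, because the compressed stream marks its own end; only the STORE case leaves the reader without a length.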

EDIT: perhaps when bit 3 is set and a very large file is stored in Zip64 format then the size is known, because the size is in the Zip64 extended header? I need more information on this!

EDIT-2: before someone comments about the "zip central directory": yes, I am aware of the central directory located at the end of the zip file. However, we don't necessarily have to use the central directory to get the (de)compression sizes, because the compression formats are all self-terminating. Only with the STORE method may we lack the size: when general purpose flag bit 3 is set, the size is not specified in the local header and is instead specified in the central directory. We can't just search from the end of the zip for the central directory when streaming zip file data is being piped through. But we can use it when the zip file is seekable, i.e. a physical file, not a stream.

<CAUTION:RANT> It sucks to be a zip decompressor when a zip compressor tool puts all this extra burden on the decompressor. If the STORE size is eventually known by the zip compressor, it should back-patch it in the zip file and not leave it to the decompressor to find it at the end of the zip in the central directory. </CAUTION:RANT>

In order to read STORE data whose size is specified in the central directory, I will add some code to read the central directory when the zip file is seekable. If the zip file is not seekable, such as a pipe or a nested zip file (--zmax=2), then we can't do this.
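As a rough illustration of the seekable case, the sketch below scans backward for the end-of-central-directory record and then walks the central directory to recover each entry's sizes. It is only a sketch under stated assumptions, not the code planned for ugrep: it works on an in-memory buffer as a stand-in for a seekable file, it ignores Zip64 and multi-disk archives, and the names `CentralEntry`, `find_eocd` and `read_central_directory` are hypothetical. Field offsets follow the PKWARE APPNOTE record layouts.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

static uint16_t rd16(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t* p) {
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Hypothetical record: the central directory fields needed to read a STORE entry.
struct CentralEntry {
  uint16_t method;
  uint32_t compressed_size;
  uint32_t uncompressed_size;
  uint32_t local_header_offset;
};

// Scan backward for the end-of-central-directory record (22 bytes plus an
// optional comment of at most 65535 bytes). Returns its offset, or -1.
long find_eocd(const std::vector<uint8_t>& zip) {
  const size_t fixed = 22;
  if (zip.size() < fixed) return -1;
  const size_t lowest = zip.size() > fixed + 0xFFFF ? zip.size() - fixed - 0xFFFF : 0;
  for (size_t i = zip.size() - fixed + 1; i-- > lowest; ) {
    if (rd32(&zip[i]) == 0x06054b50u &&
        i + fixed + rd16(&zip[i + 20]) <= zip.size())
      return static_cast<long>(i);
  }
  return -1;
}

// Walk the central directory and collect the sizes of every entry.
// Returns false on a missing or malformed directory. No Zip64 handling.
bool read_central_directory(const std::vector<uint8_t>& zip,
                            std::vector<CentralEntry>& out) {
  const long eocd = find_eocd(zip);
  if (eocd < 0) return false;
  const uint8_t* e = &zip[static_cast<size_t>(eocd)];
  const size_t count = rd16(e + 10);  // total number of entries
  size_t pos = rd32(e + 16);          // offset of the first central header
  for (size_t n = 0; n < count; ++n) {
    if (pos > zip.size() || zip.size() - pos < 46) return false;
    const uint8_t* h = &zip[pos];
    if (rd32(h) != 0x02014b50u) return false;
    CentralEntry ce;
    ce.method = rd16(h + 10);
    ce.compressed_size = rd32(h + 20);
    ce.uncompressed_size = rd32(h + 24);
    ce.local_header_offset = rd32(h + 42);
    out.push_back(ce);
    // Skip the fixed part plus file name, extra field and comment.
    pos += 46u + rd16(h + 28) + rd16(h + 30) + rd16(h + 32);
  }
  return true;
}
```

With the entries in hand, a reader could match `local_header_offset` against the position of each local header it encounters and use `compressed_size` as the length of the stored data. None of this is possible for a pipe or a nested zip, where the end of the archive cannot be reached before the data has been consumed.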

@genivia-inc genivia-inc changed the title Ugrep does not support searching files in zip stored with the "Store" method not always searching files in zip stored with the "Store" method Jan 14, 2025
@genivia-inc genivia-inc added question A question that has or needs further clarification discuss Feedback requested for possible enhancements labels Jan 14, 2025