corrupted octal value in tar header #475

Open
daleiliu opened this issue Nov 8, 2024 · 4 comments

Comments

@daleiliu

daleiliu commented Nov 8, 2024

I was trying to create a tarball for each subdirectory of a Rust vendor directory with this command:
tar jcf ../tarballs/$CRATE.tar.bz2 $CRATE

The command ran without errors for all of the crates (more than 200). But when I later tried to extract them, two failed:
tar xf tarballs/windows_x86_64_gnu-0.48.5.tar.bz2
tar xf tarballs/windows_x86_64_gnu.tar.bz2
with this error:
tar: corrupted octal value in tar header

The original crate file can be downloaded from:
curl -L https://crates.io/api/v1/crates/windows_x86_64_gnu/0.52.6/download | tar -zxf -

I found that the issue is caused by a single file, lib/libwindows.0.52.0.a

@rmyorston
Owner

I'm unable to reproduce the problem.

This is with the latest busybox-w32 release on 64-bit Windows 10:

~ $ busybox | head -2
BusyBox v1.37.0-FRP-5467-g9376eebd8 (2024-09-15 08:56:36 UTC)
(mingw64-gcc 14.1.1-3.fc40; mingw64-crt 11.0.1-3.fc40; glob; Unicode)
~ $ wget -O - https://crates.io/api/v1/crates/windows_x86_64_gnu/0.52.6/download | tar xfz -
Connecting to crates.io (52.84.90.74:443)
Connecting to static.crates.io (151.101.190.137:443)
writing to stdout
-                    100% |**********************************************************|  816k  0:00:00 ETA
written to stdout
~ $ tar cfj windows_x86_64_gnu-0.52.6.tar.bz2 windows_x86_64_gnu-0.52.6/
~ $ tar tvfj windows_x86_64_gnu-0.52.6.tar.bz2
drwxrwxr-x 4095/4095         0 2024-11-08 07:28:41 windows_x86_64_gnu-0.52.6/
-rw-rw-r-- 4095/4095       119 1970-01-01 00:00:01 windows_x86_64_gnu-0.52.6/.cargo_vcs_info.json
-rw-rw-r-- 4095/4095       198 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/build.rs
-rw-rw-r-- 4095/4095       959 1970-01-01 00:00:01 windows_x86_64_gnu-0.52.6/Cargo.toml
-rw-rw-r-- 4095/4095       327 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/Cargo.toml.orig
drwxrwxr-x 4095/4095         0 2024-11-08 07:28:40 windows_x86_64_gnu-0.52.6/lib/
-rw-rw-r-- 4095/4095  12695688 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/lib/libwindows.0.52.0.a
-rw-rw-r-- 4095/4095     11351 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/license-apache-2.0
-rw-rw-r-- 4095/4095      1141 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/license-mit
drwxrwxr-x 4095/4095         0 2024-11-08 07:28:41 windows_x86_64_gnu-0.52.6/src/
-rwxrwxr-x 4095/4095        11 2006-07-24 02:21:28 windows_x86_64_gnu-0.52.6/src/lib.rs
~ $

@daleiliu
Author

daleiliu commented Nov 8, 2024

Thanks for double-checking. I ran some more tests, and the issue seems to be in tar's autodetection of the compression type. The first command works, but the second one reports the error:
$ tar jxf windows_x86_64_gnu-0.52.6.tar.bz2
$ tar xf windows_x86_64_gnu-0.52.6.tar.bz2

Also, if I create the tar.bz2 file on an Ubuntu PC, busybox-w32's "tar xf" reports the same error when extracting it. So the problem should be somewhere in the decompression path.
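
One way to narrow this down is to confirm that a failing archive really begins with the bzip2 signature, which would point at tar's detection step rather than at the file itself. The following is a minimal Python sketch, not BusyBox code; the helper name `is_bzip2` is hypothetical, and the sample data is generated in memory rather than read from the archives in this report.

```python
import bz2

BZIP2_MAGIC = b"BZh"  # signature at the start of every bzip2 stream


def is_bzip2(data: bytes) -> bool:
    """Return True if data starts with a bzip2 stream header."""
    # The fourth byte is the block-size level, an ASCII digit from 1 to 9.
    return (
        len(data) >= 4
        and data[:3] == BZIP2_MAGIC
        and data[3:4] in b"123456789"
    )


# In-memory stand-ins (assumptions for illustration, not the real tarballs).
sample = bz2.compress(b"example payload")
print(is_bzip2(sample))       # True
print(is_bzip2(b"\0" * 512))  # False
```

To apply it to a real file, pass the first few bytes, for example `is_bzip2(open(path, "rb").read(4))`. If this returns True while "tar xf" still fails and "tar jxf" succeeds, the archive is fine and the autodetection is the likely culprit.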

@rmyorston
Owner

I'm still unable to reproduce the problem on my Windows system. However, I have been able to create tar files on Linux which display the issue. Both GNU tar and BusyBox tar generate files which fail to extract with BusyBox tar on both Windows and Linux.

Now that I have a file that reliably reproduces the problem I'll be able to investigate.

rmyorston added a commit that referenced this issue Nov 8, 2024
The code to autodetect compressed tar files failed to detect a
bunzip2-compressed archive.  When tar was invoked with the 'j'
option it worked fine.

The autodetection code looks for the magic string 'ustar' or a
series of five NULs to determine that an archive is uncompressed.
The failing archives had more than five NULs in the header and
were taken to be uncompressed.

Look for a longer run of NULs: 16 is certainly sufficient for the
archives in question.

Adds 8-16 bytes.

(GitHub issue #475)
@rmyorston
Owner

The code to autodetect compression was looking for the magic string 'ustar' or a run of five NULs to determine if an archive is uncompressed. In my sample file the bzip2 header had seven NULs, so it was considered to be uncompressed.

Looking for a longer run of NULs should avoid the problem. Really old tar files can have 127 NULs, but checking for 16 should be plenty.
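
The effect of the run length can be illustrated with a small Python sketch. This is not the BusyBox implementation: the function name `looks_uncompressed` is hypothetical, and it assumes the NUL run is checked at the position of the ustar magic field (offset 257 in a 512-byte header block), which the discussion above does not state. The "compressed" block is synthetic data with seven NULs placed at that offset.

```python
MAGIC_OFFSET = 257  # position of the magic field in a ustar header block
BLOCK_SIZE = 512    # size of one tar header block


def looks_uncompressed(block: bytes, nul_run: int) -> bool:
    """Heuristic: 'ustar' magic, or nul_run NULs where the magic would be."""
    if block[MAGIC_OFFSET:MAGIC_OFFSET + 5] == b"ustar":
        return True
    # Assumption: the NUL run is tested at the magic field position.
    return block[MAGIC_OFFSET:MAGIC_OFFSET + nul_run] == b"\0" * nul_run


# Synthetic stand-in for compressed data containing seven NULs in a row.
buf = bytearray(b"\x5a" * BLOCK_SIZE)
buf[MAGIC_OFFSET:MAGIC_OFFSET + 7] = b"\0" * 7
compressed = bytes(buf)

# A header block with the 'ustar' magic, and an old-style all-NUL block.
buf = bytearray(BLOCK_SIZE)
buf[MAGIC_OFFSET:MAGIC_OFFSET + 5] = b"ustar"
ustar = bytes(buf)
old_style = bytes(BLOCK_SIZE)

print(looks_uncompressed(compressed, 5))   # True: mistaken for plain tar
print(looks_uncompressed(compressed, 16))  # False: seven NULs is too short
print(looks_uncompressed(ustar, 16))       # True
print(looks_uncompressed(old_style, 16))   # True
```

With a run of five, the seven NULs in the synthetic block are enough to be mistaken for an uncompressed header. Requiring sixteen rejects it while still accepting both the 'ustar' block and the old-style block whose magic area is entirely NUL.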

New prerelease binaries have been issued with this fix (PRE-5531 or above).
