UDF format ISO Large file support #65

Open
MarkBaggett opened this issue Apr 27, 2021 · 4 comments
Comments

@MarkBaggett

Hello,
First, thank you Chris for this wonderful module. I am having trouble with UDF ISOs breaking large files up into smaller ones. I imagine I am not enabling some feature needed to support large files, but I am not sure what I am doing wrong.

I am running version 1.11.0:

C:\Users\User\Desktop\test>pip show pycdlib
Name: pycdlib
Version: 1.11.0
Summary: Pure python ISO manipulation library
Home-page: http://github.com/clalancette/pycdlib
Author: Chris Lalancette
Author-email: [email protected]
License: LGPLv2
Location: c:\users\user\venv\mpmod\lib\site-packages
Requires:
Required-by: media-processor

Here I have a directory with a large file in it:

 C:\Users\User\Desktop\test>dir bigzip
 Volume in drive C has no label.
 Volume Serial Number is 6000-8F6B

 Directory of C:\Users\User\Desktop\test\bigzip

04/27/2021  07:33 AM    <DIR>          .
04/27/2021  07:33 AM    <DIR>          ..
07/17/2020  08:51 AM     8,367,733,776 Windows-10vm.zip
               1 File(s)  8,367,733,776 bytes
               2 Dir(s)  45,585,305,600 bytes free

Here is my code:

import argparse
import pathlib

import pycdlib

def dir2iso(source, destination, filter=None):
    "Create an ISO from a given source directory."
    if filter is None:
        # Default: include every file and directory.
        filter = lambda x: True
    new_iso = pycdlib.PyCdlib()
    new_iso.new(udf="2.60")
    for eachitem in pathlib.Path(source).rglob("*"):
        # Paths inside the ISO are absolute, POSIX-style, relative to source.
        udf_path = "/" + eachitem.relative_to(source).as_posix()
        if eachitem.is_dir() and filter(eachitem):
            new_iso.add_directory(udf_path=udf_path)
        elif eachitem.is_file() and filter(eachitem):
            new_iso.add_file(str(eachitem), udf_path=udf_path)
    new_iso.write(destination)
    new_iso.close()
    return "Created", []

def dir2iso_cli():
    parser = argparse.ArgumentParser()
    parser.add_argument("source", help = "The path to the directory to turn into an ISO")
    parser.add_argument("destination", help="The destination ISO file to create (including path).")
    args = parser.parse_args()
    dir2iso(args.source, args.destination)

if __name__ == "__main__":
    dir2iso_cli()

I execute that program, passing it the directory containing the one large zip file, and here is the resulting ISO:

E:\>dir 
 Volume in drive E is CDROM
 Volume Serial Number is 5957-8578

 Directory of E:\

04/04/2021  01:02 AM     4,294,965,248 Windows-10vm.zip
04/04/2021  01:02 AM     4,072,768,528 Windows-10vm.zip
               2 File(s)  8,367,733,776 bytes
               0 Dir(s)               0 bytes free

I would appreciate your help.

@clalancette
Owner

Ah, interesting.

The issue here is a limitation of ISOs. Regular ISO9660 can only create files of up to 4GB. However, it allows "splitting" files into smaller files, so you can effectively get larger file sizes.

UDF does not have the 4GB file limitation. However, pycdlib treats all ISOs as ISO9660 compatible, with optional UDF support. So it still splits up all files into smaller chunks so that they are still viable from the ISO9660 perspective.
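The sizes in the directory listing above are consistent with this explanation. As a rough sketch, the snippet below assumes each piece is capped at the largest multiple of the 2048-byte sector size that fits in a 32-bit length field. That cap is inferred from the listing (4,294,965,248 bytes), not taken from the pycdlib source, and `split_sizes` is a hypothetical helper used only for illustration.

```python
SECTOR = 2048
# Assumed cap per piece: largest multiple of the sector size below 2**32.
MAX_EXTENT = (2**32 - 1) // SECTOR * SECTOR  # 4294965248


def split_sizes(total):
    """Return the piece sizes a file of `total` bytes would be split into."""
    sizes = []
    while total > MAX_EXTENT:
        sizes.append(MAX_EXTENT)
        total -= MAX_EXTENT
    sizes.append(total)
    return sizes


# Size of Windows-10vm.zip from the original report.
print(split_sizes(8_367_733_776))  # [4294965248, 4072768528]
```

The two values match the two `Windows-10vm.zip` entries shown on the mounted ISO, and they sum to the original file size.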

I'm not sure how to resolve this, to be honest. We could add a "UDF-only" mode, but it's actually quite a lot of work and I've been stuck trying to do that for years now (see #19, for instance). Otherwise, in order to maintain compatibility with older ISO9660, we kind of have to keep doing this splitting.

I'm open to other ideas, but I can't think of how to fix this right now.

@MarkBaggett
Author

MarkBaggett commented Apr 29, 2021 via email

@MarkBaggett
Author

Joliet format didn't solve my issue after all. With Joliet, the large file is no longer split into multiple files. However, the files inside the ISO appear to be limited to 8GB. I tried modifying the following lines of the code above:

    new_iso = pycdlib.PyCdlib()
    #new_iso.new(udf="2.60")
    new_iso.new(joliet=3)

Now files are truncated (see below).

How do I use this library to create an ISO that contains 20GB files? Is it possible?

File lengths in ISO are truncated. File hashes don't match (for obvious reasons).

PS C:\Users\User\Desktop\source> ls *.ova

    Directory: C:\Users\User\Desktop\source
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         5/24/2021   2:35 PM    12123749376 VirtualMachine.ova

PS C:\Users\User\Desktop\source> Get-FileHash -Algorithm md5 *.ova
Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
MD5             511F9BCF8863BF4FD319212A62D95836                                       .\VirtualMachine.ova

PS E:\> dir *.ova
    Directory: E:\
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
--r---          6/8/2021  10:27 AM     7828784128 VirtualMachine.ova

PS E:\> Get-FileHash -Algorithm md5 *.ova
Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
MD5             C3B0F7272D50C7115A7E31C206A5BC11                                       E:\VirtualMachine.ova
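
A quick check of the two lengths above, as a sketch only: the number of missing bytes equals 4,294,965,248, the same size as the first piece in the earlier UDF listing. This may suggest that exactly one full-size piece is being dropped, but that is a guess from the numbers and has not been confirmed against the pycdlib source. The `MAX_EXTENT` constant is the same assumed cap (largest multiple of 2048 below 2**32).

```python
SECTOR = 2048
# Assumed cap per piece, inferred from the earlier UDF listing.
MAX_EXTENT = (2**32 - 1) // SECTOR * SECTOR

SOURCE_SIZE = 12_123_749_376  # VirtualMachine.ova on disk
ISO_SIZE = 7_828_784_128      # VirtualMachine.ova as seen inside the ISO

missing = SOURCE_SIZE - ISO_SIZE
print(missing)                # 4294965248
print(missing == MAX_EXTENT)  # True
```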

@MarkBaggett
Author

If anyone else finds that they need to create ISOs on Windows containing files larger than 8GB, here is the nasty, dirty, traitorous solution I came up with. If there is a way to do this with this or any other native Python module, I'd appreciate the heads-up.

https://github.com/MarkBaggett/pxpowershell

Specifically the dir2iso function in https://github.com/MarkBaggett/pxpowershell/blob/main/pxpowershell/example_dir2iso.py
