Report carve dir #1017

Open
wants to merge 12 commits into
base: main

Commits on Dec 3, 2024

  1. feat(devenv): allow disabling devenv locally with .envrc.user

    The current devenv config has problems on at least Ubuntu 22.04.
    
    An .envrc.user with the content below makes it possible to work without devenv:
    
    layout_poetry() {
     # create venv if it doesn't exist
     poetry run true
    
     export VIRTUAL_ENV=$(poetry env info --path)
     export POETRY_ACTIVE=1
     PATH_add "$VIRTUAL_ENV/bin"
    }
    
    layout_poetry
    export SKIP=nixpkgs-fmt
    export UNBLOB_USE_DEVENV=false
    e3krisztian committed Dec 3, 2024
    a6d8cbe
  2. feat(processing): Support configurable carve suffix

    Andrew Fasano authored and e3krisztian committed Dec 3, 2024
    183c6ea
  3. 6bbdeb3
  4. 1ad22f5
  5. 81b452b
  6. refact: Introduce _carve_then_extract_chunks

    This removes the burden of carving from the already complex function
    _extract_chunks and also allows for some better variable names.
    e3krisztian committed Dec 3, 2024
    a7a6f96
  7. feat(reports): report carved directories - CarveDirectoryReport

    Carve directories were hard to explain, as they look like extraction
    directories and there was no public information to tell them apart.
    
    Adding this report makes the purpose of the directory visible.
    e3krisztian committed Dec 3, 2024
    6b70e8f
  8. chore: eliminate _FileTask.carve_dir

    `_FileTask.carve_dir` was initially used for both extraction and carving.
    The directory names can now differ, so it is no longer used apart from
    an existence check that would terminate this branch of the extraction.
    This output directory existence check is now present in both the carving
    and extraction paths, and the corresponding report is renamed to
    accommodate both types of output directories.
    
    `ExtractDirectoryExistsReport` was generalized to
    `OutputDirectoryExistsReport` instead of introducing yet another
    `Report` type - `CarveDirectoryExistsReport`.
    e3krisztian committed Dec 3, 2024
    ff02ba3
  9. fix: ensure summary and error report output when no chunks are processed

    Chunk statistics require dividing by the total chunk size, which can be 0
    in certain rare cases. This makes chunk-related output conditional,
    and not part of the summary.
    
    An example command line sequence which leads to a silent failure:
    
        (echo a; gzip < README.md ; echo b) > fw
        unblob fw
        # the next command would silently fail:
        unblob fw
    e3krisztian committed Dec 3, 2024
    3642e64
  10. feat: print output directory path as part of Summary output

    With the separation of carve and extract directories, the output
    directory becomes dependent on the *content* of the input file:
    if it has multiple chunks because it is not covered by a single handler,
    the output directory will be generated as a *carve* directory,
    otherwise as an *extract* directory.
    e3krisztian committed Dec 3, 2024
    4cf5ff4
  11. fix: Do not create extra output directory, if nothing was extracted

    The output path is printed as of the previous commit, so callers
    no longer need to rely on looking at well-known paths.
    e3krisztian committed Dec 3, 2024
    05f9c77
  12. test: behavior with different suffix combinations

    The test files were created with this script:
    
        # cd tests/files/suffixes
    
        # clean
        rm -rf chunks_carve/ extractions/ collisions.zip __input__ __outputs__
    
        # reproduce output
        mkdir __input__ __outputs__
        seq 100 | gzip > 0-160.gzip
        seq 128 | gzip > 160-375.gzip
        dd if=/dev/zero of=375-512.padding bs=1 count=137
        cat 0-160.gzip 160-375.gzip 375-512.padding > __input__/chunks
    
        unblob --carve-suffix _carve chunks
        cp 0-160.gzip chunks_carve/
        echo something else > chunks_carve/0-160.gzip_extract/gzip.uncompressed
    
        zip __input__/collisions.zip chunks chunks_carve/0-160.gzip chunks_carve/0-160.gzip_extract/gzip.uncompressed
    
        rm 0-160.gzip 160-375.gzip 375-512.padding
        rm -rf chunks_carve
    
        for input in collisions.zip chunks
        do
          unblob \
                -e __outputs__/$input/defaults/ __input__/$input
          unblob --carve-suffix _carve \
                -e __outputs__/$input/_carve_extract/ __input__/$input
          unblob --carve-suffix _c --extract-suffix _e \
                -e __outputs__/$input/_c_e/ __input__/$input
        done
    e3krisztian committed Dec 3, 2024
    4f1bd85
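
Commit 9 in the list above attributes the silent failure to a division by a total chunk size of 0. A minimal sketch of that kind of guard in Python follows. It is not unblob's actual code: `summarize_chunks`, `print_summary`, and the statistic shown (each chunk type's share of the total size) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def summarize_chunks(sizes_by_type: Dict[str, int]) -> Optional[Dict[str, float]]:
    """Return each chunk type's share of the total size, or None without chunks.

    Hypothetical helper: names and the statistic are illustrative assumptions.
    """
    total = sum(sizes_by_type.values())
    if total == 0:
        # No chunks were processed, so skip the statistics
        # instead of dividing by zero.
        return None
    return {name: size / total for name, size in sizes_by_type.items()}


def print_summary(sizes_by_type: Dict[str, int]) -> None:
    # The summary line is always printed; chunk statistics are conditional.
    print("Summary: done")
    shares = summarize_chunks(sizes_by_type)
    if shares is not None:
        for name, share in sorted(shares.items()):
            print(f"  {name}: {share:.1%}")
```

Keeping the statistics in a separate, optional step means the summary and any error reports are still emitted when the total is 0.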
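
Commit 10 in the list above states that the output directory is a *carve* directory when the input is not covered by a single handler, and an *extract* directory otherwise. The rule can be sketched as below. This is an assumption-laden illustration, not unblob's implementation: `output_dir_name` and its parameters are hypothetical, and forming the name by appending the suffix to the file name is inferred from paths such as `chunks_carve` and `0-160.gzip_extract` in commit 12.

```python
def output_dir_name(
    file_name: str,
    covered_by_single_handler: bool,
    carve_suffix: str,
    extract_suffix: str,
) -> str:
    """Pick the output directory name for an input file (hypothetical helper)."""
    # A file fully covered by one handler is extracted directly;
    # otherwise its chunks are carved out first.
    suffix = extract_suffix if covered_by_single_handler else carve_suffix
    # Assumption: the directory name is the file name plus the suffix.
    return file_name + suffix


# Multi-chunk input from commit 12's script, carved with --carve-suffix _carve
carve_dir = output_dir_name("chunks", False, "_carve", "_extract")
# A single gzip chunk, handled by one handler
extract_dir = output_dir_name("0-160.gzip", True, "_carve", "_extract")
```

Because the resulting name depends on the file's content, a caller cannot predict it from the input path alone, which is why the commit prints the path in the summary.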