Files
gzip2
Folders and files
Name | Name | Last commit date | ||
---|---|---|---|---|
parent directory.. | ||||
--------------------------------------------------------------------------------------------------------------- To test out the suite, run ./test_suite.sh. This will get you all the binaries and test it on a bunch of files. The output times and other info will go to tso.txt. The test suite only does file sizes of 20, 100 and 250 MB, though if you wish you can also do them for 500 MB, 750, 1000, 2000 (or any size you desire) by commenting out/adding the desired lines in test_suite.sh. --------------------------------------------------------------------------------------------------------------- This is a set of various implementations, both sequential and parallel of the gzip compression program. There are four folders: gzip, pigz, quickzip, and pgzip. The folder gzip is the standard implementation of gzip found at www.gzip.org. There is a script called init_script.sh which can be run and will download the file, build it, and copy the executable to the directory. The folder pigz is the standard implementation of parallel gzip found at www.zlib.net/pigz/. There is a script called init_script.sh which can be run and will download the file, build it, and copy the executable to the directory. These are the standard two implementations most people use. The next folder quickzip, implements pseudo gzip parallelism, by simply splitting a file into chunks (unix split command) and then compresses each in a background process, we could join them back by removing the last two bytes in each chunk and then cat the compressed files and add a crc byte and a length byte. Furthermore we would need to remove the end block codes of each seperate chunk except the last one to comply with gzip file format. However we do not join in this manner because we wanted a clean implemenation and did not want to modify end chunk codes, which would need to be done in the gzip src (see LINE 653 in deflate.c source from gzip). That is why we call it pseudo gzip, since it is technically not RFC Gzip, but does the same job. The next folder pgzip, implements true gzip parallelism. It follows a similar approach as pigz, but absolutely no source code is borrowed or taken as inspiration from that project. It is developed purely from the gzip source files, and proceeds to create a threadpool and then send chunks of data to free threads, and then takes finished work and flushes to a file appropriately. For the time being it only supports compression. ------------------------------------------------------------------------------------------------------------------ Build Details: gzip: simply go to directory gzip and run init_script.sh pigz: simply go to directory pigz and run init_script.sh pgzip: simply go to directory pgzip and run make quickzip: depends on having gzip and python, but otherwise you're set ------------------------------------------------------------------------------------------------------------------ Contributors: https://github.com/a-wild-tigger https://github.com/SeanHogan https://github.com/norseboar -------------------------------------------------------------------------------------------------------------------