merge with development
sahib committed Apr 25, 2018
2 parents eafd032 + 403eb4c commit 1a6148c
Showing 72 changed files with 4,538 additions and 3,224 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -7,6 +7,7 @@
*.mo
rmlint
rmlint.sh
rmlint.json
src/config.h
docs/rmlint.1.gz
docs/rmlint.1
2 changes: 2 additions & 0 deletions .travis.yml
@@ -1,8 +1,10 @@
language: c

install:
- sudo apt-get update
- sudo apt-get install python3-sphinx gettext python3-setuptools
- sudo apt-get install libblkid-dev libelf-dev libglib2.0-dev libjson-glib-dev
- sudo apt-get install clang
- sudo easy_install3 $(cat test-requirements.txt)

compiler:
2 changes: 1 addition & 1 deletion .version
@@ -1 +1 @@
2.6.1 Penetrating Pineapple
2.6.2 Penetrating Pineapple
35 changes: 33 additions & 2 deletions CHANGELOG.md
@@ -4,6 +4,37 @@ All notable changes to this project will be documented in this file.

The format follows [keepachangelog.com]. Please stick to it.

## [2.7.0 Toothless Taipan] -- unreleased

### Added

* New checksum types metro and highway
* New option --keep-hardlinked
* --dedupe option can deduplicate twins on any reflink-capable filesystem
* --dedupe-readonly option can dedupe files on read-only btrfs snapshots

### Changed

* Checksum types for -P... options (see https://github.com/sahib/rmlint/issues/261)

### Deprecated

* Option --btrfs-clone (use --dedupe)
* Paranoia option -pp (use -p)

### Removed

* Checksum types bastard, spooky, city & farmhash
* Multihash output option

### Fixed

* Fix scons 3 compatibility issue (https://github.com/sahib/rmlint/issues/258)
* Fix compile error on systems with no FIEMAP (https://github.com/sahib/rmlint/issues/252)
* Fix handling of bad uids/gids in python output formatter (https://github.com/sahib/rmlint/issues/239)
* Fix escaping of dirnames in rmlint.sh test for new emptydirs (https://github.com/sahib/rmlint/issues/241)


## [2.6.1 Penetrating Pineapple] -- 2017-06-13

### Fixed
@@ -178,7 +209,7 @@ The format follows [keepachangelog.com]. Please stick to it.
### Added

- A fully working graphical user interface which is installed as a python module
by default (can be disabled via compile option ie ``scons --without-gui``).
by default (can be disabled via compile option ie ``scons --without-gui``).
It can be started via ``rmlint --gui``.
- Support for automatic deduplication on btrfs using ``BTRFS_IOC_FILE_EXTENT_SAME``.
The Shellscript now will contain calls to ``rmlint --btrfs $source $dest``
@@ -220,7 +251,7 @@ The format follows [keepachangelog.com]. Please stick to it.

### Added

- ``--replay``: Re-output a previously written json file. Allow filtering
- ``--replay``: Re-output a previously written json file. Allow filtering
by using all other standard options (like size or directory filtering).
- ``--sort-by``: Similar to ``-S``, but sorts groups of files. So showing
the group with the biggest size sucker is as easy as ``-y s``.
108 changes: 92 additions & 16 deletions SConstruct
@@ -11,6 +11,8 @@ import SCons
import SCons.Conftest as tests
from SCons.Script.SConscript import SConsEnvironment

DEFAULT_OPTIMISATION='s' # compile with -Os

pkg_config = os.getenv('PKG_CONFIG') or 'pkg-config'

def read_version():
@@ -325,6 +327,15 @@ def check_btrfs_h(context):
context.Result(rc)
return rc

def check_linux_fs_h(context):
    rc = 1
    if tests.CheckHeader(context, 'linux/fs.h'):
        rc = 0

    conf.env['HAVE_LINUX_FS_H'] = rc
    context.did_show_result = True
    context.Result(rc)
    return rc

def check_linux_limits(context):
rc = 1
@@ -352,6 +363,30 @@ def check_cygwin(context):
context.Result(rc)
return rc

def check_mm_crc32_u64(context):

    rc = 0 if tests.CheckDeclaration(
        context,
        symbol='_mm_crc32_u64',
        includes='#include <nmmintrin.h>\n'
    ) else 1

    conf.env['HAVE_MM_CRC32_U64'] = rc
    context.did_show_result = True
    context.Result(rc)
    return rc

def check_builtin_cpu_supports(context):
    rc = 0 if tests.CheckDeclaration(
        context,
        symbol='__builtin_cpu_supports'
    ) else 1

    conf.env['HAVE_BUILTIN_CPU_SUPPORTS'] = rc
    context.did_show_result = True
    context.Result(rc)
    return rc


def create_uninstall_target(env, path):
env.Command("uninstall-" + path, path, [
@@ -469,14 +504,6 @@ for suffix in ['libelf', 'gettext', 'fiemap', 'blkid', 'json-glib', 'gui']:
dest='with_' + suffix
)

AddOption(
'--with-sse', action='store_const', default=False, const=False, dest='with_sse'
)

AddOption(
'--without-sse', action='store_const', default=False, const=False, dest='with_sse'
)

# General Environment
options = dict(
CXXCOMSTR=compile_source_message,
@@ -524,8 +551,11 @@ conf = Configure(env, custom_tests={
'check_gettext': check_gettext,
'check_linux_limits': check_linux_limits,
'check_btrfs_h': check_btrfs_h,
'check_linux_fs_h': check_linux_fs_h,
'check_uname': check_uname,
'check_cygwin': check_cygwin,
'check_mm_crc32_u64': check_mm_crc32_u64,
'check_builtin_cpu_supports': check_builtin_cpu_supports,
'check_sysmacro_h': check_sysmacro_h
})

@@ -599,13 +629,10 @@ if conf.env['IS_CYGWIN']:
else:
conf.env.Append(CCFLAGS=['-fPIC'])


if ARGUMENTS.get('DEBUG') == "1":
conf.env.Append(CCFLAGS=['-ggdb3'])
else:
# Generic compiler:
conf.env.Append(CCFLAGS=['-Os'])
conf.env.Append(LINKFLAGS=['-s'])
# check _mm_crc32_u64 (SSE4.2) support:
conf.check_mm_crc32_u64()
if conf.env['HAVE_MM_CRC32_U64']:
    conf.env.Append(CCFLAGS=['-msse4.2'])

if 'clang' in os.path.basename(conf.env['CC']):
conf.env.Append(CCFLAGS=['-fcolor-diagnostics']) # Colored warnings
@@ -619,14 +646,15 @@ conf.env.Append(CFLAGS=[
'-Wmissing-include-dirs',
'-Wuninitialized',
'-Wstrict-prototypes',
'-Wno-implicit-fallthrough'
'-Wno-implicit-fallthrough',
])

env.ParseConfig(pkg_config + ' --cflags --libs ' + ' '.join(packages))


conf.env.Append(_LIBFLAGS=['-lm'])

conf.check_builtin_cpu_supports()
conf.check_blkid()
conf.check_sys_block()
conf.check_libelf()
@@ -639,12 +667,27 @@ conf.check_linux_limits()
conf.check_posix_fadvise()
conf.check_faccessat()
conf.check_btrfs_h()
conf.check_linux_fs_h()
conf.check_uname()
conf.check_sysmacro_h()

if conf.env['HAVE_LIBELF']:
    conf.env.Append(_LIBFLAGS=['-lelf'])

# compiler optimisation and debug symbols:
cc_O_option = '-O'
if ARGUMENTS.get('DEBUG') == "1":
    print("Compiling with gdb extra debug symbols")
    conf.env.Append(CCFLAGS=['-ggdb3', '-fno-inline'])
    cc_O_option += (ARGUMENTS.get('O') or '0')
else:
    conf.env.Append(LINKFLAGS=['-s'])
    cc_O_option += (ARGUMENTS.get('O') or DEFAULT_OPTIMISATION)

print("Using compiler optimisation {} (to change, run scons with O=[0|1|2|3|s|fast])".format(cc_O_option))
conf.env.Append(CCFLAGS=[cc_O_option])
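
The flag selection in this hunk can be sketched as a standalone function, which makes the interaction between `DEBUG=1` and `O=<level>` easier to check. This is only a minimal sketch: `choose_optimisation` is a hypothetical name that does not appear in the SConstruct, and the plain dict passed to it stands in for SCons' `ARGUMENTS`.

```python
DEFAULT_OPTIMISATION = 's'  # same default as the SConstruct: compile with -Os


def choose_optimisation(arguments):
    """Return (ccflags, linkflags) following the selection logic of the hunk.

    `arguments` is a plain dict standing in for SCons' ARGUMENTS (assumption).
    """
    ccflags, linkflags = [], []
    if arguments.get('DEBUG') == "1":
        # Debug build: keep symbols, disable inlining, default to -O0.
        ccflags += ['-ggdb3', '-fno-inline']
        level = arguments.get('O') or '0'
    else:
        # Release build: strip the binary, default to -Os.
        linkflags.append('-s')
        level = arguments.get('O') or DEFAULT_OPTIMISATION
    ccflags.append('-O' + level)
    return ccflags, linkflags
```

Under these assumptions, `scons O=2` maps to `choose_optimisation({'O': '2'})`, and an explicit `O=` value takes precedence over both defaults.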


SConsEnvironment.Chmod = SCons.Action.ActionFactory(
os.chmod,
lambda dest, mode: 'Chmod("%s", 0%o)' % (dest, mode)
@@ -663,6 +706,39 @@ SConsEnvironment.InstallPerm = InstallPerm
# Your extra checks here
env = conf.Finish()

def get_cpu_count():
    # priority: environ('NUM_CPU'), else try to read actual cpu count, else fallback
    fallback = 4

    if 'NUM_CPU' in os.environ:
        return int(os.environ.get('NUM_CPU'))

    # try multiprocessing.cpu_count() (Python 2.6+)
    try:
        import multiprocessing
        return multiprocessing.cpu_count()
    except (ImportError, NotImplementedError):
        pass

    # try psutil.cpu_count()
    try:
        import psutil
        return psutil.cpu_count()
    except (ImportError, AttributeError):
        pass

    # default value
    return fallback


# set number of parallel jobs during build
# note: while not particularly intuitive or obvious from the documentation,
# SetOption() will *not* over-ride commandline option passed by `scons -j<n>`
# or `scons --jobs=<n>`
SetOption('num_jobs', get_cpu_count())

print ("Running with --jobs=" + repr(GetOption('num_jobs')))

library = SConscript('lib/SConscript')
programs = SConscript('src/SConscript', exports='library')
env.Default(library)
6 changes: 5 additions & 1 deletion docs/SConscript
@@ -67,7 +67,11 @@ env.Alias('man', env.Depends(manpage, sphinx))


if 'install' in COMMAND_LINE_TARGETS:
man_install = env.InstallPerm('$PREFIX/share/man/man1', [manpage], 0644)
man_install = env.InstallPerm(
'$PREFIX/share/man/man1',
[manpage],
int("644", 8),
)
target = env.Alias('install', [manpage, man_install])
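
The change from `0644` to `int("644", 8)` is presumably related to the scons 3 compatibility fix mentioned in the changelog: a bare `0644` literal is a syntax error on Python 3, while parsing the digits from a string works on both Python 2 and 3. A minimal sketch of the equivalence, not part of the commit:

```python
import stat

# Parse the permission bits from a base-8 string; valid on Python 2 and 3.
mode = int("644", 8)

# The same value as an explicit octal literal (accepted by Python 2.6+ and 3).
assert mode == 0o644

# rw-r--r--: owner may read and write, group and others may only read.
assert mode == stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH
```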


25 changes: 13 additions & 12 deletions docs/cautions.rst
@@ -62,22 +62,23 @@ follows to move the files to */tmp*:
fi
}
Another safe alternative, if your files are on a ``btrfs`` filesystem and you have linux
kernel 4.2 or higher, is to reflink the duplicate to the original. You can do this via
``cp --reflink`` or using ``rmlint --btrfs-clone``:
Another safe alternative, if your files are on a copy-on-write filesystem such
as ``btrfs``, and you have linux kernel 4.2 or higher, is to use a deduplication
utility such as ``duperemove`` or ``rmlint --dedupe``:

.. code-block:: bash
$ cp --reflink=always original duplicate # deletes duplicate and replaces it with reflink copy of original
$ rmlint --btrfs-clone original duplicate # does an in-place clone
$ duperemove -dh original duplicate
$ rmlint --dedupe original duplicate
Both of the above first verify (via the kernel) that ``original`` and
``duplicate`` are identical, then modify ``duplicate`` to reference
``original``'s data extents. Note they do not change the mtime or other
metadata of the duplicate (unlike hardlinks).
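
The "verify first, then replace" idea can be illustrated in user space. The sketch below is not what ``rmlint --dedupe`` or ``duperemove`` do (they rely on the kernel to compare the data); it only shows a byte-for-byte pre-check using Python's standard library. The function name ``safe_to_dedupe`` is hypothetical.

```python
import filecmp
import os


def safe_to_dedupe(original, duplicate):
    """Return True only if the paths are distinct files with identical bytes."""
    if os.path.samefile(original, duplicate):
        # Same inode (same path or an existing hardlink): nothing to dedupe.
        return False
    # shallow=False forces a byte-by-byte comparison instead of trusting os.stat().
    return filecmp.cmp(original, duplicate, shallow=False)
```

A check like this still leaves a window between the comparison and the replacement during which a file could change, which is one reason a kernel-side comparison is the safer choice.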

If you pass ``-c sh:link`` to ``rmlint``, it will even check for you if your
filesystem is capable of reflinks and emit the correct command conveniently.

The second option is actually safer because it verifies (via the kernel) that the files
are identical before creating the reflink. Also it does not change the mtime or other
metadata of the duplicate.

You might think of hardlinking as a safe alternative to deletion, but in fact hardlinking
first deletes the duplicate and then creates a hardlink to the original in its place.
If your duplicate finder has found a false positive, it is possible that you may lose
@@ -139,7 +140,7 @@ Dupe finders ``rdfind`` and ``dupd`` can also be tricked with the right combinat
Deleted 1 files.
$ ls -l dir/
total 0
$ dupd scan --path /home/foo/a --path /home/foo/a
Files scanned: 2
Total duplicates: 2
@@ -210,8 +211,8 @@ Symlinks can make a real mess out of filesystem traversal:
dir/link/link/file
[snip]
dir/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/link/file
Set 1 of 1, preserve files [1 - 41, all]:
Set 1 of 1, preserve files [1 - 41, all]:
*Solution:*

4 changes: 2 additions & 2 deletions docs/install.rst
@@ -152,9 +152,9 @@ build the software from the potentially unstable ``develop`` branch:
$ git clone -b develop https://github.com/sahib/rmlint.git
$ cd rmlint/
$ scons config # Look what features scons would compile
$ scons DEBUG=1 -j4 # Optional, build locally.
$ scons DEBUG=1 # Optional, build locally.
# Install (and build if necessary). For releases you can omit DEBUG=1
$ sudo scons DEBUG=1 -j4 --prefix=/usr install
$ sudo scons DEBUG=1 --prefix=/usr install
Done!
