I know of 4 command-line solutions for Linux. My preferred one is the last one listed here,
rdfind, because of all the options available.
fdupes
- This appears to be the most recommended/most well-known one.
- It’s the simplest to use, but its only action is to delete duplicates.
- To ensure duplicates are actually duplicates (while not taking forever to run), comparisons between files are done first by file size, then md5 hash, then byte-by-byte comparison.
Sample output (with options “show size”, “recursive”):
    $ fdupes -Sr .
    17 bytes each:
    ./Dir1/Some File
    ./Dir2/SomeFile
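To make the size, then md5, then byte-by-byte idea concrete, here is a rough Python sketch of that kind of cascade. It is not fdupes' actual code; the function names (`md5_of`, `find_duplicates`) and the chunk size are my own assumptions, and it only uses the standard library.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of duplicate groups, each a sorted list of paths."""
    # Stage 1: only files of the same size can be identical.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Stage 2: hash only the files that survived the size check.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[md5_of(path)].append(path)
        for candidates in by_hash.values():
            # Stage 3: a byte-by-byte comparison rules out hash collisions.
            remaining = sorted(candidates)
            while len(remaining) > 1:
                first, rest = remaining[0], remaining[1:]
                same = [first] + [
                    p for p in rest if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    groups.append(same)
                remaining = [p for p in rest if p not in same]
    return groups
```

The point of the ordering is cost: the size check needs no file reads at all, hashing reads each candidate once, and the expensive pairwise comparison only runs on files that already agree on both.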
hardlink
- Designed to, as the name indicates, replace found files with hardlinks.
- Has a dry-run mode (--dry-run).
- Does not indicate how contents are compared, but unlike all other options, does take into account file mode, owner, and modified time.
Sample output (note how my two files have slightly different modified times, so in the second run I tell it to ignore that):
    $ stat Dir*/* | grep Modify
    Modify: 2015-09-06 23:51:38.784637949 -0500
    Modify: 2015-09-06 23:51:47.488638188 -0500
    $ hardlink --dry-run -v .
    Mode: dry-run
    Files: 5
    Linked: 0 files
    Compared: 0 files
    Saved: 0 bytes
    Duration: 0.00 seconds
    $ hardlink --dry-run -v -t .
    [DryRun] Linking ./Dir2/SomeFile to ./Dir1/Some File (-17 bytes)
    Mode: dry-run
    Files: 5
    Linked: 1 files
    Compared: 1 files
    Saved: 17 bytes
    Duration: 0.00 seconds
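The reason the first run linked nothing is the metadata check. As a rough illustration only (this is my guess at the kind of test involved, not hardlink's real implementation), a Python version of "same mode, owner, and modified time, with an option to ignore the time" could look like this. The name `same_metadata` and its `ignore_time` parameter are hypothetical.

```python
import os


def same_metadata(path_a, path_b, ignore_time=False):
    """Return True if two files agree on mode, owner, and (optionally) mtime."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    # File type and permission bits must match.
    if a.st_mode != b.st_mode:
        return False
    # Owner and group must match.
    if (a.st_uid, a.st_gid) != (b.st_uid, b.st_gid):
        return False
    # Modification time must match unless the caller opts out,
    # which is what I assume the -t switch above does.
    if not ignore_time and a.st_mtime_ns != b.st_mtime_ns:
        return False
    return True
```

This matters because hard-linked names share a single inode, so after linking both names necessarily end up with the same mode, owner, and timestamps.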
duff
- Made to find files that the user then acts upon; has no actions available.
- Comparisons are done by file size, then sha1 hash.
- Hash can be changed to sha256, sha384, or sha512.
- Hash can be disabled to do a byte-by-byte comparison.
Sample output (with option “recursive”):
    $ duff -r .
    2 files in cluster 1 (17 bytes, digest 34e744e5268c613316756c679143890df3675cbb)
    ./Dir2/SomeFile
    ./Dir1/Some File
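If you want to act on clusters like these from your own script, a small Python sketch of "size, then a selectable hash" might look like the following. This is not how duff is implemented; `file_digest` and `cluster` are names I made up, and the algorithm names are simply whatever Python's `hashlib.new` accepts.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file using the named hash algorithm."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cluster(paths, algorithm="sha1"):
    """Group paths by (size, digest); keep only groups with 2+ files."""
    clusters = defaultdict(list)
    for path in paths:
        key = (os.path.getsize(path), file_digest(path, algorithm))
        clusters[key].append(path)
    return {
        key: sorted(group)
        for key, group in clusters.items()
        if len(group) > 1
    }
```

Swapping `algorithm="sha1"` for `"sha256"` or `"sha512"` trades speed for a lower collision risk, which is the same trade-off the options above expose.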
rdfind
- Options have an unusual syntax (apparently meant to mimic find), e.g. -dryrun true.
- Several options for actions to take on duplicate files (delete, make symlinks, make hardlinks).
- Has a dry-run mode.
- Comparisons are done by file size, then first-bytes, then last-bytes, then either md5 (default) or sha1.
- Ranking of files found makes it predictable which file is considered the original.
    $ rdfind -dryrun true -makehardlinks true .
    (DRYRUN MODE) Now scanning ".", found 5 files.
    (DRYRUN MODE) Now have 5 files in total.
    (DRYRUN MODE) Removed 0 files due to nonunique device and inode.
    (DRYRUN MODE) Now removing files with zero size from list...removed 0 files
    (DRYRUN MODE) Total size is 13341 bytes or 13 kib
    (DRYRUN MODE) Now sorting on size:removed 3 files due to unique sizes from list.2 files left.
    (DRYRUN MODE) Now eliminating candidates based on first bytes:removed 0 files from list.2 files left.
    (DRYRUN MODE) Now eliminating candidates based on last bytes:removed 0 files from list.2 files left.
    (DRYRUN MODE) Now eliminating candidates based on md5 checksum:removed 0 files from list.2 files left.
    (DRYRUN MODE) It seems like you have 2 files that are not unique
    (DRYRUN MODE) Totally, 17 b can be reduced.
    (DRYRUN MODE) Now making results file results.txt
    (DRYRUN MODE) Now making hard links.
    hardlink ./Dir1/Some File to ./Dir2/SomeFile
    Making 1 links.
    $ cat results.txt
    # Automatically generated
    # duptype id depth size device inode priority name
    DUPTYPE_FIRST_OCCURRENCE 1 1 17 2055 24916405 1 ./Dir2/SomeFile
    DUPTYPE_WITHIN_SAME_TREE -1 1 17 2055 24916406 1 ./Dir1/Some File
    # end of file
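The "first bytes" and "last bytes" stages in that log are a cheap pre-filter that runs before any checksum. Here is a minimal Python sketch of the idea, not rdfind's actual code. The names `edge_bytes` and `prefilter` are mine, and the 64-byte window is an assumption; I have not checked how many bytes rdfind really reads.

```python
import os
from collections import defaultdict


def edge_bytes(path, count=64):
    """Return the first and last `count` bytes of a file as a tuple."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        first = handle.read(count)
        # Clamp at 0 so files shorter than `count` are read whole.
        handle.seek(max(size - count, 0))
        last = handle.read(count)
    return first, last


def prefilter(paths, count=64):
    """Drop files whose size, first bytes, or last bytes are unique."""
    buckets = defaultdict(list)
    for path in paths:
        key = (os.path.getsize(path),) + edge_bytes(path, count)
        buckets[key].append(path)
    # Survivors are only candidates; they still need a checksum or a
    # full comparison, since they may differ somewhere in the middle.
    return [group for group in buckets.values() if len(group) > 1]
```

Reading a few dozen bytes from each end is far cheaper than hashing a large file, so this step pays off mostly when many big files happen to share a size.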
jdupes, an enhanced fork of fdupes, offers:
- a bunch of new command-line options, including -L, which replaces duplicates with hard links
- native support for all major OS platforms
- speed said to be over 7 times faster than fdupes on average
For your question, you can just execute:
    $ jdupes -L /path/to/your/files
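For anyone wondering what "replace a duplicate with a hard link" involves under the hood, here is a small Python sketch. It is an illustration under my own assumptions, not what jdupes actually does internally; the function name and the `.hardlink-tmp` suffix are made up.

```python
import os


def replace_with_hardlink(original, duplicate):
    """Replace `duplicate` with a hard link to `original`.

    Returns False if the two paths already refer to the same file.
    """
    if os.path.samefile(original, duplicate):
        return False  # already linked, nothing to do
    # Hypothetical temporary name next to the duplicate, so the link
    # is created on the same filesystem as the file it replaces.
    temp = duplicate + ".hardlink-tmp"
    os.link(original, temp)
    # os.replace swaps the new link into place in one step, so the
    # duplicate's name never disappears if something fails midway.
    os.replace(temp, duplicate)
    return True
```

Creating the link under a temporary name first avoids the window where the duplicate has been deleted but the link does not exist yet. Hard links only work within one filesystem, so `os.link` will raise an `OSError` if the two paths are on different devices.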
You may want to clone and build the latest source from its GitHub repo since the project is still under active development.
Windows binaries are also provided here. Packaged binaries are available in some Linux / BSD distros; actually, I first found it through apt search.
Duplicate Commander is a possible solution on Windows:
Duplicate Commander is a freeware application that allows you to find
and manage duplicate files on your PC. Duplicate Commander comes with
many features and tools that allow you to recover your disk space from
- Replacing files with hard links
- Replacing files with soft links
- ... (and many more) ...
For Linux you can find a Bash script here.
Duplicate & Same Files Searcher is yet another solution on Windows:
Duplicate & Same Files Searcher (Duplicate Searcher) is an application
for searching duplicate files (clones) and NTFS hard links to the same
file. It searches duplicate file contents regardless of file name
(true byte-to-byte comparison is used). This application allows not
only to delete duplicate files or to move them to another location,
but to replace duplicates with NTFS hard links as well (unique!)
I had a nifty free tool on my computer called Link Shell Extension; not only was it great for creating Hard Links and Symbolic Links, but Junctions too! In addition, it added custom icons that allow you to easily identify different types of links, even ones that already existed prior to installation; Red Arrows represent Hard Links for instance, while Green represent Symbolic Links… and chains represent Junctions.
I unfortunately uninstalled the software a while back (in a mass-uninstallation of various programs), so I can’t create any more links manually, but the icons still show up automatically whenever Windows detects a Hard, Symbolic or Junction link.