Let’s say you have a folder with 5000 MP3 files you want to check for duplicates, or a directory containing thousands of EPUB files, all with different names, and you have a hunch that some of them are duplicates. Here are three ways to find them:

  1. You can install and run fslint, which is the most user-friendly approach (if you like GUIs). In my experience, this can be lengthy… read: “not fast”.
  2. Alternatively, you can cd to the folder in a console (terminal) and enter the following command (very fast):
    find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
    This outputs groups of duplicate files, separated by blank lines. Only files that share a size with another file are hashed, and files count as duplicates when their MD5 checksums match.
  3. Another way is to install fdupes or jdupes (faster), open a terminal and enter the following command, replacing /folder with the folder you want to scan:
    fdupes -r /folder > duplicates_list.txt
    The -r option makes fdupes recurse into subfolders. Afterwards, open duplicates_list.txt in a text editor to see the list of duplicate files. For this to work you need fdupes or jdupes installed (on Debian: sudo apt install fdupes).
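
The one-liner from option 2 is easier to follow when split into stages. Here is a rough sketch of the same pipeline wrapped in a shell function. The name find_dupes is made up for illustration, and the sketch assumes the GNU versions of find, sort, uniq, xargs and md5sum that ship with most Linux distributions.

```shell
#!/usr/bin/env bash
# Sketch only: find_dupes is a hypothetical helper name, not an existing command.
# Assumes GNU find, sort, uniq, xargs and md5sum.
find_dupes() {
    local dir="${1:-.}"   # folder to scan, defaults to the current one

    # Stage 1: print the size in bytes of every non-empty regular file.
    find "$dir" -not -empty -type f -printf '%s\n' |
        # Stage 2: keep each size that occurs more than once.
        sort -rn | uniq -d |
        # Stage 3: list every file with one of those sizes, NUL-separated
        # so that names containing spaces survive.
        xargs -I{} find "$dir" -type f -size {}c -print0 |
        # Stage 4: hash only those candidates (-r: do nothing if there are none).
        xargs -0 -r md5sum |
        # Stage 5: sort by hash and print the groups whose first 32 characters
        # (the MD5 digest) repeat, with a blank line between groups.
        sort | uniq -w32 --all-repeated=separate
}
```

With the function loaded in your shell, something like find_dupes ~/Music > duplicates_list.txt should give you the same kind of list as option 3. Comparing sizes first is what makes this fast: a file with a unique size cannot have a duplicate, so it is never hashed.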