How to Find and Remove Duplicate Files in Ubuntu


Duplicate files clutter your system and waste storage space. Ubuntu’s package repositories include several command-line tools for finding and deleting them. In this guide, we’ll explore three popular utilities: fdupes, rdfind, and rmlint. Each compares files by content, so duplicates are found even when their names differ.

Why Remove Duplicate Files?

Over time, your system can accumulate redundant files, especially when downloading, backing up, or working with large data sets. Duplicate files not only take up unnecessary space but also make it harder to organize your system. By using specialized tools like fdupes, rdfind, and rmlint, you can keep your system clean and efficient.

How to Find Duplicate Files with fdupes

The fdupes utility is a simple, user-friendly tool for finding duplicate files based on content, not just filenames.
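To make "based on content, not just filenames" concrete, here is a minimal sketch that uses only standard tools (find, sha256sum, sort, awk) on a throwaway directory. It illustrates the idea of content-based matching and is not a description of how fdupes works internally; the file names are made up for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway directory; removed automatically when the script exits.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT

# Two files with identical content but different names, plus one unique file.
printf 'quarterly numbers\n' > "$demo_dir/report.txt"
printf 'quarterly numbers\n' > "$demo_dir/report-final.txt"
printf 'meeting notes\n'     > "$demo_dir/notes.txt"

# Hash every file, group the paths by hash, and keep groups with 2+ members.
# (Assumes paths without spaces, which holds for this demo.)
dup_files=$(find "$demo_dir" -type f -exec sha256sum {} + | sort |
    awk '{ count[$1]++; files[$1] = files[$1] "\n" $2 }
         END { for (h in count) if (count[h] > 1) printf "%s\n", substr(files[h], 2) }')

# Prints the two report files; notes.txt is not listed.
printf '%s\n' "$dup_files"
```

The two report files are matched even though their names differ, which is the behaviour the tools in this guide provide at much larger scale.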

Installing fdupes

To install fdupes on Ubuntu, open your terminal and enter:

sudo apt install fdupes

Finding Duplicate Files with fdupes

To find duplicate files in a specific directory, run:

fdupes /path/to/directory

For a recursive search that includes subdirectories, use the -r option:

fdupes -r /path/to/directory
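Before deleting anything, it can be useful to see how much space the duplicates take up. The sketch below assumes fdupes is installed and combines -r with -S (show file sizes) and -m (print a summary); the directory and file names are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway directory with one duplicate pair spread across a subdirectory.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
mkdir "$demo_dir/sub"

printf 'draft chapter\n' > "$demo_dir/chapter.txt"
printf 'draft chapter\n' > "$demo_dir/sub/chapter-backup.txt"
printf 'outline\n'       > "$demo_dir/outline.txt"

if command -v fdupes >/dev/null 2>&1; then
    # -S prints the size of the files in each duplicate set.
    fdupes -rS "$demo_dir"

    # -m prints a short summary instead of the full listing.
    fdupes -rm "$demo_dir"
else
    echo "fdupes is not installed; skipping the demo" >&2
fi

# Listing never deletes anything, so all three files are still present.
remaining=$(find "$demo_dir" -type f | wc -l)
echo "files present: $remaining"
```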

Deleting Duplicates with fdupes

To remove duplicates interactively, where you’ll be prompted to decide which files to keep, use:

fdupes -d /path/to/directory

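If you prefer a non-interactive deletion process, add -N. fdupes then keeps the first file in each set of duplicates and deletes the rest without asking, so check the list of duplicates before you run it: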

fdupes -dN /path/to/directory
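Because -dN deletes without asking, it helps to list what would be removed first. The sketch below assumes fdupes is installed and works in a throwaway directory with made-up file names. The -f option (--omitfirst) hides the first file of each set, so the listing should match what -dN removes, assuming fdupes reports the files in the same order on both runs.

```shell
#!/bin/sh
set -eu

# Throwaway directory; removed automatically when the script exits.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT

printf 'holiday photo\n' > "$demo_dir/photo.jpg"
printf 'holiday photo\n' > "$demo_dir/photo-copy.jpg"
printf 'shopping list\n' > "$demo_dir/list.txt"

if command -v fdupes >/dev/null 2>&1; then
    # Preview: list every duplicate except the first file of each set.
    fdupes -rf "$demo_dir"

    # Delete for real: keep the first file of each set, remove the rest.
    fdupes -rdN "$demo_dir"
else
    echo "fdupes is not installed; skipping the demo" >&2
fi

# With fdupes available, one of the two photo files is gone.
remaining=$(find "$demo_dir" -type f | wc -l)
echo "files left: $remaining"
```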

How to Use rdfind for Duplicate File Detection

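rdfind (Redundant Data Finder) finds duplicate files by comparing their content. Within each group of identical files it ranks one as the original, mainly by the order of the directories given on the command line, and treats the rest as duplicates.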

Installing rdfind

Install it via terminal:

sudo apt install rdfind

Finding Duplicate Files with rdfind

To locate duplicates, run:

rdfind /path/to/directory

The results are saved in a file named results.txt in the current working directory, not in the directory that was scanned.

Deleting Duplicates with rdfind

You can automatically delete duplicates by running:

rdfind -deleteduplicates true /path/to/directory

If you want to preview which files would be deleted without actually removing them, add the -dryrun option alongside -deleteduplicates:

rdfind -dryrun true -deleteduplicates true /path/to/directory
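When several directories are scanned at once, rdfind ranks files under the first directory above files under later ones, so the order of the arguments decides which copy is kept. A dry run makes that visible before anything is deleted. The sketch below assumes rdfind is installed; the originals and backup directories are hypothetical, and -outputname is used only to keep the report inside the throwaway directory.

```shell
#!/bin/sh
set -eu

# Throwaway directory with one file duplicated across two subdirectories.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
mkdir "$demo_dir/originals" "$demo_dir/backup"

printf 'tax return 2023\n' > "$demo_dir/originals/taxes.pdf"
printf 'tax return 2023\n' > "$demo_dir/backup/taxes-old.pdf"

if command -v rdfind >/dev/null 2>&1; then
    # Dry run: report what -deleteduplicates would remove, but touch nothing.
    # originals/ is listed first, so its files are treated as the ones to keep.
    rdfind -dryrun true -deleteduplicates true \
           -outputname "$demo_dir/report.txt" \
           "$demo_dir/originals" "$demo_dir/backup"
else
    echo "rdfind is not installed; skipping the demo" >&2
fi

# A dry run leaves both copies in place.
remaining=$(find "$demo_dir/originals" "$demo_dir/backup" -type f | wc -l)
echo "files left after the dry run: $remaining"
```

Swapping the two directory arguments would make the copy in backup the one that is kept, so it is worth putting the directory you trust first.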

Speed Up Duplicate Detection with rmlint

rmlint is a comprehensive tool that not only detects duplicate files but also identifies other forms of “lint” like empty files and broken symlinks.

Installing rmlint

To install it, enter:

sudo apt install rmlint

Finding Duplicates with rmlint

Use the following command to scan for duplicate files:

rmlint /path/to/directory

After running the scan, rmlint writes a shell script named rmlint.sh, along with a JSON report named rmlint.json, to the current directory. The scan itself does not delete anything.

Executing the Removal Script

Once you’ve reviewed the findings, run the removal script to delete duplicates. By default it asks for confirmation before it starts:

bash rmlint.sh
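The generated script accepts a few options of its own. In recent rmlint versions, -n prints what the script would do without changing anything, which makes a convenient last check. The sketch below assumes rmlint is installed and runs in a throwaway directory with made-up file names.

```shell
#!/bin/sh
set -eu

# Throwaway working directory; removed automatically when the script exits.
work_dir=$(mktemp -d)
trap 'rm -rf "$work_dir"' EXIT
mkdir "$work_dir/data"

printf 'song\n'  > "$work_dir/data/track.mp3"
printf 'song\n'  > "$work_dir/data/track-copy.mp3"
printf 'cover\n' > "$work_dir/data/cover.png"

# rmlint writes rmlint.sh and rmlint.json into the current directory.
cd "$work_dir"

if command -v rmlint >/dev/null 2>&1; then
    rmlint data

    # Dry run of the generated script: prints the planned actions only.
    sh rmlint.sh -n
else
    echo "rmlint is not installed; skipping the demo" >&2
fi

# The dry run removes nothing, so all three files are still present.
remaining=$(find data -type f | wc -l)
echo "files left after the dry run: $remaining"
```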

Conclusion

Removing duplicate files is an essential part of system maintenance, especially if you’re managing large data sets or older backups. fdupes, rdfind, and rmlint are excellent options for efficiently managing file duplication. Depending on your needs—whether it’s simplicity, speed, or comprehensive lint removal—these tools have you covered.

For more tips on managing your Ubuntu system, check out our other posts on codeallow.com.