Wednesday, March 5, 2008

Discovering duplicate files

A while ago I had a problem. I had to replace my file server's hard drives, and I didn't have a spare drive large enough to hold a copy of everything. This led to a situation where copies of my data were spread across various computers, and after I got my server up and running again, I realized there were quite a few duplicate files on it.

Luckily I found a nice small application called fdupes. This program goes through directories, compares file sizes, and creates an MD5 sum for the files that could be duplicates. It then compares the MD5 sums (followed by a byte-by-byte check) and lets the user know which files are duplicates.
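To make the idea concrete, here is a rough sketch of the same checksum approach using standard tools. This is not how fdupes is implemented, just an illustration: the function name find_dupes is my own invention, and it assumes GNU coreutils (md5sum, plus the -w and --all-repeated options of uniq). Unlike fdupes, it hashes every file and does no byte-by-byte check.

```shell
# Hypothetical helper (not part of fdupes): list files under a directory
# that share an MD5 sum with at least one other file.
# Assumes GNU find, md5sum and uniq.
find_dupes() {
    # 1. Compute an MD5 sum for every regular file below "$1".
    # 2. Sort so that identical sums end up on adjacent lines.
    # 3. Compare only the first 32 characters (the hex digest) and print
    #    every line whose digest repeats, one blank line between groups.
    find "$1" -type f -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}
```

Calling find_dupes /share prints one line per duplicate file, with the checksum first and the path after it. Files that have no twin are left out.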

Fdupes can be found in the regular Fedora Core yum repository, and using the program is quite easy:

fdupes -r /share    # /share is the directory you want to scan; -r scans it recursively

The program gives you a nice list of the duplicate files, printed in groups with a blank line between each set of identical files.
