bolapara/duped
Simple tool for identifying duplicate files and providing a list of files to delete based on your specifications. This tool does not remove any files on its own unless you specifically direct it to by using the 'delete' command.

Let's say that at some point you made multiple copies of a directory as a backup, and you now wish to delete any files in those backups that are duplicates, without touching the original directory at all:

    ~/dups/original
    ~/dups/copy1
    ~/dups/copy2
    ~/dups/copy3

Identifying duplicates is a multi-stage process. First you build the hash database.

    j@c:~/duped$ python3 duped.py build --skip .git ~/dups
    calculating file hashes
    Working directory is /home/j/duped/duped_results_11738

Now you have a working directory that contains a database of file hashes. Next, let's parse that database and generate lists of files to keep or delete based on the directories we specify. The arguments to the 'process' command are the working directory location and the directories you are willing to delete files from.

    j@c:~/duped$ python3 duped.py process --work_dir /home/j/duped/duped_results_11738 ~/dups/copy1 ~/dups/copy2 ~/dups/copy3
    processing files
    writing results into /home/j/duped/duped_results_11738
    keep
    delete
    error
    j@c:~/duped$

In the working directory you now have a list of files that are proposed for deletion and a list of files to keep. Only files in the copy[1-3] directories that you specified in the 'process' command will be added to the delete list.

You can now review these file lists and see if they make sense. Then you can either use your own tools to delete the files on the delete list, or run the 'delete' command with the exact same arguments as the 'process' command and the tool will delete the files for you.

This tool was made for my specific use but is provided in case it's useful for anyone else.
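To make the build and process stages described above more concrete, here is a minimal Python sketch of the general idea. It is not the code in duped.py. The function names (`hash_file`, `build_hash_db`, `split_keep_delete`), the choice of SHA-256, and the rule of always keeping one copy when every duplicate lives in a deletable directory are all assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks (hash choice is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_db(root, skip=()):
    """Hypothetical 'build' stage: map each content hash to the files under root that have it."""
    db = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directory names in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                db[hash_file(path)].append(path)
    return {h: sorted(paths) for h, paths in db.items()}


def split_keep_delete(db, deletable_dirs):
    """Hypothetical 'process' stage: propose deletions only inside deletable_dirs."""
    roots = [os.path.abspath(d) for d in deletable_dirs]

    def is_deletable(path):
        full = os.path.abspath(path)
        return any(os.path.commonpath([full, r]) == r for r in roots)

    keep, delete = [], []
    for paths in db.values():
        if len(paths) < 2:
            # Not a duplicate: never proposed for deletion.
            keep.extend(paths)
            continue
        protected = [p for p in paths if not is_deletable(p)]
        candidates = [p for p in paths if is_deletable(p)]
        if not protected:
            # Assumed policy: if every copy is deletable, keep the first one.
            protected, candidates = candidates[:1], candidates[1:]
        keep.extend(protected)
        delete.extend(candidates)
    return sorted(keep), sorted(delete)
```

With the example layout, `split_keep_delete(build_hash_db("~/dups", skip={".git"}), [...copy directories...])` would place duplicates found under the copy directories on the delete list, while everything under the original directory stays on the keep list. Note that the sketch does not expand `~`; pass paths through `os.path.expanduser` first.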
About
Find duplicate files and produce deletion list based on user specifications