rclone Dedupe

rclone dedupe finds duplicate files (same name in the same directory) on a remote and provides strategies to resolve them. This is especially important for Google Drive, which allows multiple files with the same name in the same folder — a situation that breaks most sync tools.

Backend-Specific

Only some backends allow duplicate filenames (Google Drive is the most common). On backends that enforce unique names (S3, local filesystem), dedupe typically has nothing to do.
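Before reaching for dedupe, you can check whether a remote has duplicate names at all: on backends that allow them, `rclone lsf` lists each copy on its own line, so repeated lines indicate duplicates. A minimal sketch, where the `printf` stands in for real `rclone lsf gdrive:` output so the pipeline runs without a configured remote:

```shell
# Repeated names in a listing indicate duplicates.
# printf simulates `rclone lsf gdrive:` output here, purely for illustration.
printf 'report.pdf\nnotes.txt\nreport.pdf\n' | sort | uniq -d
# prints: report.pdf
```

Against a real remote, the equivalent is `rclone lsf gdrive: | sort | uniq -d`.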

Basic Syntax

rclone dedupe [MODE] REMOTE:PATH [flags]

Dedupe Modes

| Mode | Behavior |
| --- | --- |
| interactive | Ask what to do for each duplicate (default) |
| skip | Don't delete anything; just report duplicates |
| first | Keep the first file, delete the rest |
| newest | Keep the newest file, delete older copies |
| oldest | Keep the oldest file, delete newer copies |
| largest | Keep the largest file, delete smaller copies |
| smallest | Keep the smallest file, delete larger copies |
| rename | Rename duplicates to unique names (no deletion) |
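To make the strategies concrete, here is a pure-shell sketch of the selection logic behind `newest`: group files by name and keep only the most recent copy. The `mtime name` sample lines are hypothetical, not real rclone output:

```shell
# Each input line: "epoch-mtime filename" (hypothetical sample data).
# sort groups by name with the newest mtime first; awk keeps the first
# (i.e. newest) line per name, mirroring what the "newest" mode retains.
printf '1700000000 report.pdf\n1700000500 report.pdf\n1699999000 notes.txt\n' |
  sort -k2,2 -k1,1nr |
  awk '!seen[$2]++'
```

Everything awk drops here corresponds to what `rclone dedupe newest` would delete.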

Practical Examples

Find Duplicates Without Deleting

# Report only — don't change anything
rclone dedupe skip gdrive: -v

Keep the Newest Version

# Keep the most recent copy of each duplicate
rclone dedupe newest gdrive:Documents -v

Interactive Resolution

# Ask for each duplicate
rclone dedupe interactive gdrive:

Rename Instead of Delete

# Give each duplicate a unique name (safe, no data loss)
rclone dedupe rename gdrive:shared-folder -v

Dry Run

# Preview what would happen with "newest" strategy
rclone dedupe newest gdrive: --dry-run -v

Dedupe a Specific Folder

rclone dedupe newest gdrive:Work/Projects -v

Key Flags

| Flag | Description |
| --- | --- |
| --dry-run / -n | Preview without making changes |
| --verbose / -v | Show what's happening |
| --by-hash | Find duplicates by identical hash rather than by name |
| --dedupe-mode MODE | Alternative way to specify the mode |

When Duplicates Happen

| Scenario | Risk |
| --- | --- |
| Google Drive web upload | Drive allows same-name files in one folder |
| Failed sync/copy retry | Some backends create duplicates on retry |
| Moving Google Drive files | Can create phantom duplicates |
| Shared folders with multiple editors | Same filename uploaded by different users |

Common Pitfalls

| Pitfall | Consequence | Prevention |
| --- | --- | --- |
| Running newest/oldest without --dry-run | Wrong versions deleted permanently | Always preview first |
| Expecting dedupe on S3 | Nothing happens; S3 prevents duplicates | Only relevant for Google Drive and similar |
| Deduplicating shared folders | May delete other users' files | Coordinate with the team; use skip first to assess |
| Not using --by-hash | Files with the same name but different content are treated as duplicates | Add --by-hash for content-aware dedupe |
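The pitfalls above suggest a staged workflow: assess first, preview next, apply last. A minimal sketch as a shell function; the function name, the `RCLONE` override, and the placeholder target are ours, not part of rclone:

```shell
# Hypothetical helper: assess, preview, then (manually) apply a dedupe.
# RCLONE can point to a stub for testing; defaults to the real binary.
safe_dedupe() {
  rclone_bin="${RCLONE:-rclone}"
  target="$1"
  # 1. Assess: report duplicates without changing anything.
  "$rclone_bin" dedupe skip "$target" -v
  # 2. Preview: show exactly what "newest" would delete.
  "$rclone_bin" dedupe newest "$target" --dry-run -v
  # 3. Apply only after reviewing the preview (uncomment when satisfied):
  # "$rclone_bin" dedupe newest "$target" -v
}

# Example (a stub in place of rclone, so this runs anywhere):
# RCLONE=echo; safe_dedupe gdrive:Backup
```

Keeping the destructive step commented out forces a deliberate, reviewed decision before anything is deleted.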

Examples with Output

1. List duplicates without deleting

See how many duplicates exist on your Google Drive. Command:

rclone dedupe skip gdrive:Backup

Output:

2024/01/15 12:00:00 NOTICE: file1.txt: Found 2 files with duplicate names
2024/01/15 12:00:00 NOTICE: file2.jpg: Found 3 files with duplicate names

2. Auto-keep the newest version

Resolve duplicates by preserving the most recent timestamp. Command:

rclone dedupe newest remote:staging -v

Output:

2024/01/15 12:00:00 INFO  : file1.txt: Deleting 1/2 duplicates
2024/01/15 12:00:00 INFO  : file1.txt: Kept newest

3. Rename duplicates to be unique

Safe resolution that preserves all data by adding suffixes. Command:

rclone dedupe rename remote:shared-folder

Output:

2024/01/15 12:00:00 INFO  : document.pdf: renamed from: document.pdf to: document-1.pdf
2024/01/15 12:00:00 INFO  : document.pdf: renamed from: document.pdf to: document-2.pdf

4. Deduplicate based on file content

Only treat files as duplicates if their hashes match, regardless of name. Command:

rclone dedupe --by-hash newest remote:bucket

Output:

(Only deletes files with identical content)

5. Interactive cleanup

Manually choose which file to keep for each conflict. Command:

rclone dedupe interactive remote:docs

Output:

file1.txt: Found 2 files with duplicate names
  1: md5 f88e... (2024-01-01)
  2: md5 a923... (2024-01-15)
s) Skip and do nothing
k) Keep just one (choose which in next step)
r) Rename all to be different (by changing file.jpg to file-1.jpg)
s/k/r>