Clepho can find duplicate photos in your collection using multiple detection methods, helping you reclaim disk space while keeping the best versions.
Clepho detects two kinds of duplicates:
- Exact duplicates: Identical files (same SHA-256 hash)
- Perceptual duplicates: Visually similar images (perceptual hash)
Press u in normal mode to find duplicates in scanned photos.
Finding duplicates...
Found 15 groups with 42 duplicate photos
The duplicates view shows groups of similar photos:
┌─────────────────────────────────────────────────────────────┐
│ Duplicates: Group 3/15 (Exact) - 3 photos                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────┐        ┌─────────┐        ┌─────────┐          │
│  │         │        │         │        │   [X]   │          │
│  │  Photo  │        │  Photo  │        │  Photo  │          │
│  │    1    │        │    2    │        │    3    │          │
│  │         │        │         │        │         │          │
│  └─────────┘        └─────────┘        └─────────┘          │
│  > Keep               Keep               Delete             │
│                                                             │
│  photo_001.jpg      IMG_1234.jpg       photo_001_copy.jpg   │
│  4032x3024          4032x3024          4032x3024            │
│  3.2 MB             3.2 MB             3.2 MB               │
│  Canon EOS R5       Canon EOS R5       Canon EOS R5         │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│ j/k:photo J/K:group Space:mark a:auto x:trash ?:help Esc    │
└─────────────────────────────────────────────────────────────┘
| Key | Action |
|---|---|
| j / k | Navigate between photos in current group |
| J / K | Navigate between duplicate groups |
| h / l | Also navigate photos (vim-style) |

| Key | Action |
|---|---|
| Space | Toggle mark on current photo |
| a | Auto-select duplicates (keep best quality) |
| u | Unmark all in current group |

| Key | Action |
|---|---|
| x | Move marked photos to trash |
| X | Permanently delete marked (no trash) |

| Key | Action |
|---|---|
| Enter | Open photo in external viewer |
| ? | Show duplicates help |
| Esc / q | Exit duplicates view |

Files with identical content (same SHA-256 hash):
- Same file copied multiple times
- Imported twice from camera
- Backup copies
These are 100% identical - safe to delete extras.
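
To illustrate what "same SHA-256 hash" means in practice, here is a minimal Python sketch of exact-duplicate grouping. It is not Clepho's code; the function names `sha256_of` and `exact_duplicate_groups` are hypothetical and exist only for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicate_groups(paths):
    """Return lists of paths whose contents share the same SHA-256 digest."""
    by_hash = defaultdict(list)
    for path in paths:
        by_hash[sha256_of(Path(path))].append(Path(path))
    # Only digests shared by two or more files form a duplicate group.
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Because the comparison is on file content, two files with different names (for example photo_001.jpg and photo_001_copy.jpg) end up in the same group, while a re-saved or resized copy does not.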
Visually similar images (similar perceptual hash):
- Resized versions (thumbnail vs original)
- Recompressed (different JPEG quality)
- Minor edits (cropped, color adjusted)
- Format converted (PNG to JPEG)
Review these carefully - they may have different quality.
When using auto-select (a), Clepho ranks photos by quality:
| Factor | Weight | Best Value |
|---|---|---|
| Resolution | High | Larger dimensions |
| File size | Medium | Larger (less compression) |
| Original name | Low | Camera naming patterns |
The highest-quality photo is kept; the others are marked for deletion.
photo_001.jpg [Quality: 95] ← Keep
photo_001_web.jpg [Quality: 62] ← Delete
photo_001_thumb.jpg [Quality: 31] ← Delete
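
The ranking idea can be sketched in Python as follows. This is an illustration under stated assumptions, not Clepho's scoring formula: the table gives only relative weights, so the sketch treats them as a strict priority order (resolution, then file size, then camera-style name). The `Photo` class, the `CAMERA_NAME` pattern, and the function names are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed pattern for camera-style names such as IMG_1234.jpg or DSC01234.jpg.
CAMERA_NAME = re.compile(r"^(IMG|DSC|DSCF|DSCN|P)_?\d{4,}\.\w+$", re.IGNORECASE)


@dataclass
class Photo:
    name: str
    width: int
    height: int
    size_bytes: int


def quality_key(photo: Photo):
    """Sort key: pixel count first, then file size, then camera-style name."""
    return (
        photo.width * photo.height,
        photo.size_bytes,
        bool(CAMERA_NAME.match(photo.name)),
    )


def auto_select(group):
    """Return (keep, marked): the best photo and the rest, best to worst."""
    ranked = sorted(group, key=quality_key, reverse=True)
    return ranked[0], ranked[1:]


group = [
    Photo("photo_001_thumb.jpg", 400, 300, 40_000),
    Photo("photo_001.jpg", 4032, 3024, 3_200_000),
    Photo("photo_001_web.jpg", 1600, 1200, 600_000),
]
keep, marked = auto_select(group)
print(keep.name, [p.name for p in marked])
```

A weighted numeric score, like the Quality values shown above, would be an alternative to the tuple ordering used here.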
Configure how similar photos must be to be grouped:
[scanner]
# Hamming distance threshold for perceptual hash
# Lower = stricter (fewer matches)
# Higher = looser (more matches)
similarity_threshold = 50

| Value | Matches |
|---|---|
| 10-20 | Nearly identical only |
| 30-40 | Same photo, minor differences |
| 50-60 | Same photo, moderate edits (default) |
| 70-80 | Similar compositions |
| 90+ | May include false positives |
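
For reference, the Hamming distance between two hashes is the number of bit positions in which they differ. The Python sketch below shows the comparison on hex-encoded hashes; the function names are hypothetical, and how Clepho scales `similarity_threshold` relative to the hash length is not stated here, so the threshold is treated as a plain bit count.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two equal-length hex-encoded hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_match(hash_a: str, hash_b: str, threshold: int) -> bool:
    """Two photos are grouped when their distance is at or below the threshold."""
    return hamming_distance(hash_a, hash_b) <= threshold
```

With this definition, lowering the threshold can only remove matches and raising it can only add them, which is the behavior the table describes.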
1. Scan your collection (s)

   Scanning complete: 10,000 photos

2. Find duplicates (d)

   Found 150 groups with 380 duplicate photos

3. Review groups (J/K to navigate)
   - Check each group visually
   - Verify duplicates are actually duplicates

4. Auto-select low quality (a)
   - Marks lower-quality versions
   - Keeps highest-quality in each group

5. Review selections
   - Use Space to adjust if needed
   - Some "duplicates" may be intentional

6. Move to trash (x)

   Moved 230 files to trash

7. Verify and permanently delete
   - Press X to view trash
   - Review trashed files
   - Press c to cleanup or wait for auto-cleanup

By default, duplicates go to trash:
- Located at ~/.local/share/clepho/.trash/
- Can be restored via trash view (t)
- Auto-cleaned based on age/size settings
Use X (uppercase) for immediate permanent deletion:
- Cannot be undone
- Use only when certain
- Bypasses trash entirely
- Backup first - Always have backups
- Check quality - Larger isn't always better
- Check metadata - Some versions have better EXIF
- Check edits - "Duplicates" may be intentional edits
If unrelated photos are grouped:
- Lower the similarity threshold
- Don't mark them - skip to next group
- Consider if perceptual hashing suits your collection
For collections with many duplicates:
- Process in batches by directory
- Use auto-select for obvious duplicates
- Manually review perceptual matches
- Let trash accumulate, then bulk cleanup
- Ensure photos are scanned first (s)
- Check similarity threshold isn't too low
- Verify hash values exist in database
- Lower the similarity threshold
- Stick to exact duplicates only
- Review perceptual matches manually
- Increase similarity threshold
- Ensure all locations are scanned
- Check if files were already deleted
Clepho uses the pHash algorithm:
- Resize image to 32x32
- Convert to grayscale
- Apply DCT (discrete cosine transform)
- Extract 64-bit hash from low frequencies
Similar images have similar hashes (low Hamming distance).
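
The last two steps can be sketched in pure Python. This is a simplified illustration, not Clepho's implementation: it assumes the image has already been resized to 32x32 and converted to grayscale (passed in as a list of rows), uses an unnormalized DCT-II, and thresholds the 8x8 low-frequency block against its median with the DC term excluded, which is one common pHash variant. The function names are hypothetical.

```python
import math


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """Apply the 1-D DCT to every row, then to every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to [y][x]


def phash(gray32):
    """Return a 64-bit hash (16 hex digits) for a 32x32 grayscale matrix."""
    freq = dct_2d(gray32)
    # Keep the 8x8 block of lowest frequencies (top-left corner).
    low = [freq[y][x] for y in range(8) for x in range(8)]
    # Median of the 63 non-DC coefficients; the DC term only encodes brightness.
    ordered = sorted(low[1:])
    median = ordered[len(ordered) // 2]
    bits = 0
    for coeff in low:
        bits = (bits << 1) | int(coeff > median)
    return f"{bits:016x}"
```

The resulting hex string is the kind of value that can be compared by Hamming distance: small changes to an image move few coefficients across the median, so few bits change.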
-- In photos table
sha256_hash TEXT, -- Exact matching
perceptual_hash TEXT -- Visual similarity (hex string)

- Group by identical SHA-256 (exact)
- Compare perceptual hashes pairwise
- Group if Hamming distance ≤ threshold
- Merge overlapping groups
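
The pairwise comparison and merge steps above can be sketched with a union-find structure. This is a minimal Python illustration under assumptions, not Clepho's code: hashes are given as integers keyed by a hypothetical photo id, and the threshold is a plain bit count.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def group_by_similarity(hashes, threshold):
    """Group photo ids whose perceptual hashes are within `threshold` bits.

    `hashes` maps a photo id to its integer hash. Overlapping pairs are
    merged: if A matches B and B matches C, all three share one group.
    """
    parent = {pid: pid for pid in hashes}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(hashes, 2):
        if hamming(hashes[a], hashes[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for pid in hashes:
        groups.setdefault(find(pid), []).append(pid)
    # Singletons are not duplicates, so only groups of two or more are kept.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Note that merging makes grouping transitive, so the first and last photo in a chain of near matches can differ by more than the threshold. The pairwise loop is also quadratic in the number of photos, which is one reason processing large collections in batches can help.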