This guide walks through the primary use cases and how to manually recover data.
Scenario: You have shot some new photos on your 1TB drive and want to safely back them up to your 2TB drive.
- Run
./main.sh - Select "Run Backup / Sync"
- Choose
1TB -> 2TB HDD - Select
Safe Sync - The utility will parse the dry-run, tell you it found 45 new files, 0 deleted, and 0 modified.
- It prompts for approval. You hit
Y. - Once finished, it asks if you'd like to Audit the destination. This is optional but good practice to run periodically.
Scenario: You accidentally deleted the "Family Vacation 2025" folder off the 1TB drive three months ago. You only just realized this today after running main.sh numerous times.
Recovery Steps:
Because you used Safe Sync, rsync did not permanently delete the folder from the 2TB drive. Instead, it moved the folder to the Archive/ directory during the sync cycle when it originally noticed the folder was missing.
- Navigate to
/mnt/2tb/1tb_backup/Archive/ - Look through the timestamped folders for the date you likely ran the sync after deleting it.
- You will find
Family Vacation 2025completely intact. - Copy it back to the active
1tb_backuprepository or the original 1TB drive.
Scenario: You want peace of mind that your thousands of photos haven't succumbed to silent data corruption on the 4TB long-term storage drive.
- Run
./main.sh-> "Run Integrity Auditor" ->4TB HDD. - The auditor will scan the drive. Since it's the first time, it hashes everything, building a baseline in
auditor.dband outputting a log. - Six months later, you run it again. The auditor skips everything that hasn't changed its size or
mtime. - If a file's hash mysteriously un-aligns with the database while its size and
mtimestayed the same... [🚨 CRITICAL: 1 FILES FLAGGED FOR BIT-ROT!] - Open the log file mentioned in the terminal output to find the exact corrupted file.
- You can then restore a healthy version of that file from the
Archive/directories or a secondary backup.
Scenario: Over time, the Archive/ directories on your 2TB and 4TB drives will grow large as they capture every modified and deleted file.
- Protocol: Once you are completely satisfied that the main
1tb_backuprepository is healthy and you have no accidental deletions you wish to restore, you can safelyrm -rfthe older timestamped folders insideArchive/to reclaim space.
Scenario: Over the years, you have accumulated duplicate files across different folders, and you want to detect them without being overwhelmed by duplicates you have intentionally kept.
- Run
./main.sh-> "Find Duplicate Files" ->Select a Drive. - The auditor will use its existing database (no hashing required) to find all active files that share a hash and output an organized
duplicates_YYYY-MM-DD.jsonin thelogs/directory. - If you review this file and decide you want to intentionally keep these duplicates (and don't want the script to keep bugging you about them), simply copy or rename this
.jsonfile to the path specified in yourconfig.envunderKNOWN_DUPLICATES_JSON. - Next time you run "Find Duplicate Files", the auditor will load your known duplicates file. Any file sets that perfectly match the known files you already acknowledged will be intelligently hidden, only showing you new duplicates!
Scenario: You want precisely detailed metrics about your drive's state without writing complex bash find scripts. Since auditor.py builds a full SQLite database mapping your drives, you can query it directly!
How to Query: Enter the SQLite prompt by running:
sqlite3 auditor.dbCreative Query Examples:
-
Find the Top 10 Largest Forgotten Files (Not Modified in over 5 Years)
SELECT file_path, (file_size / 1024 / 1024) || ' MB' as SizeMB FROM file_hashes WHERE status = 'active' -- Find files older than 5 years (approx 157784630 seconds) AND mtime < strftime('%s', 'now') - 157784630 ORDER BY file_size DESC LIMIT 10;
-
Generate a "Corrupted Files" Report
-- Instantly list every file ever flagged by the auditor for Bit-rot. SELECT file_path, datetime(last_seen, 'unixepoch') as Discovered FROM file_hashes WHERE status = 'corrupted';
-
Find Files that Only Exist as "Archive" Backups (Missing from the main drive)
-- Assuming your sync script moves things to /Archive/ folders when modified/deleted: SELECT file_name, file_path FROM file_hashes WHERE file_path LIKE '%/Archive/%' AND status = 'active' AND sha256_hash NOT IN ( SELECT sha256_hash FROM file_hashes WHERE file_path NOT LIKE '%/Archive/%' AND status = 'active' );
-
Storage Consumption by File Extension
SELECT substr(file_name, instr(file_name, '.') + 1) as extension, count(*) as total_files, SUM(file_size) / 1024 / 1024 || ' MB' as TotalSize FROM file_hashes WHERE status = 'active' AND file_name LIKE '%.%' GROUP BY extension ORDER BY SUM(file_size) DESC LIMIT 5;