Skip to content

Commit d749c77

Browse files
phernandezclaude
andcommitted
feat: SPEC-20 enhancements - cleanup, path normalization, and docs
This commit adds several critical improvements discovered during manual testing of SPEC-20 project-scoped rclone sync: **Critical Bug Fixes:** 1. Path Normalization (fixes path doubling bug) - API: Strip /app/data/ prefix in project_router.py - CLI: Defensive normalization in project.py - Rclone: Fix get_project_remote() path construction - Prevents files syncing to /app/data/app/data/project/ 2. Rclone Flag Fix - Changed --filters-file to correct --filter-from flag - Fixes "unknown flag" error in sync and bisync **Enhancements:** 3. Automatic Database Sync - POST to /{project}/project/sync after file operations - Keeps database in sync with files automatically - Skipped on --dry-run operations 4. Enhanced Project Removal - Clean up local sync directory (with --delete-notes) - Always remove bisync state directory - Always remove cloud_projects config entry - Informative messages about what was/wasn't deleted 5. Bisync State Reset Command - New: bm project bisync-reset <project> - Clears corrupted bisync metadata - Safe recovery tool for bisync issues 6. Improved Project List UI - Show Local Path column in cloud mode - Conditionally show/hide columns based on config - Prevent path truncation with no_wrap/overflow - Apply path normalization to display **Documentation:** 7. Cloud CLI Documentation - Add troubleshooting: empty directory bisync issues - Add troubleshooting: bisync state corruption - Document bisync-reset command usage - Explain rclone bisync limitations 8. SPEC-20 Updates - Mark implementation complete - Document all enhancements in Implementation Notes - Update phase checklists with completed work - Add manual testing results **Tests:** 9. Unit Tests for --local-path - Test config persistence with --local-path - Test no config without --local-path - Test tilde expansion - Test nested directory creation All changes tested manually end-to-end. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
1 parent db85186 commit d749c77

7 files changed

Lines changed: 631 additions & 77 deletions

File tree

docs/cloud-cli.md

Lines changed: 62 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ bm project add research --local-path ~/Documents/research
5252
bm project add work --local-path ~/work-notes
5353
bm project add temp # No local sync
5454

55-
# Now you can sync individually:
55+
# Now you can sync individually (after initial --resync):
5656
bm project bisync --name research
5757
bm project bisync --name work
5858
# temp stays cloud-only
@@ -125,10 +125,13 @@ When you add a project with `--local-path`:
125125

126126
### 4. Sync Your Project
127127

128-
Establish the initial sync baseline:
128+
Establish the initial sync baseline. **Best practice:** Always preview with `--dry-run` first:
129129

130130
```bash
131-
# First sync requires --resync to establish baseline
131+
# Step 1: Preview the initial sync (recommended)
132+
bm project bisync --name research --resync --dry-run
133+
134+
# Step 2: If all looks good, run the actual sync
132135
bm project bisync --name research --resync
133136
```
134137

@@ -143,6 +146,14 @@ bm project bisync --name research --resync
143146

144147
**Result:** Local and cloud are in sync. Baseline established.
145148

149+
**Why `--resync`?** This is an rclone requirement for the first bisync run. It establishes the initial state that future syncs will compare against. After the first sync, never use `--resync` unless you need to force a new baseline.
150+
151+
See: https://rclone.org/bisync/#resync
152+
```
153+
--resync
154+
This will effectively make both Path1 and Path2 filesystems contain a matching superset of all files. By default, Path2 files that do not exist in Path1 will be copied to Path1, and the process will then copy the Path1 tree to Path2.
155+
```
156+
146157
### 5. Subsequent Syncs
147158

148159
After the first sync, just run bisync without `--resync`:
@@ -541,6 +552,49 @@ bm project bisync --name research --resync
541552

542553
**Result:** Future syncs work without `--resync`.
543554

555+
### Empty Directory Issues
556+
557+
**Problem:** "Empty prior Path1 listing. Cannot sync to an empty directory"
558+
559+
**Explanation:** Rclone bisync doesn't work well with completely empty directories. It needs at least one file to establish a baseline.
560+
561+
**Solution:** Add at least one file before running `--resync`:
562+
563+
```bash
564+
# Create a placeholder file
565+
echo "# Research Notes" > ~/Documents/research/README.md
566+
567+
# Now run bisync
568+
bm project bisync --name research --resync
569+
```
570+
571+
**Why this happens:** Bisync creates listing files that track the state of each side. When both directories are completely empty, these listing files are considered invalid by rclone.
572+
573+
**Best practice:** Always have at least one file (like a README.md) in your project directory before setting up sync.
574+
575+
### Bisync State Corruption
576+
577+
**Problem:** Bisync fails with errors about corrupted state or listing files
578+
579+
**Explanation:** Sometimes bisync state can become inconsistent (e.g., after mixing dry-run and actual runs, or after manual file operations).
580+
581+
**Solution:** Clear bisync state and re-establish baseline:
582+
583+
```bash
584+
# Clear bisync state
585+
bm project bisync-reset research
586+
587+
# Re-establish baseline
588+
bm project bisync --name research --resync
589+
```
590+
591+
**What this does:**
592+
- Removes all bisync metadata from `~/.basic-memory/bisync-state/research/`
593+
- Forces fresh baseline on next `--resync`
594+
- Safe operation (doesn't touch your files)
595+
596+
**Note:** This command also runs automatically when you remove a project to clean up state directories.
597+
544598
### Too Many Deletes
545599

546600
**Problem:** "Error: max delete limit (25) exceeded"
@@ -591,7 +645,7 @@ If instance is down, wait a few minutes and retry.
591645
## Security
592646

593647
- **Authentication**: OAuth 2.1 with PKCE flow
594-
- **Tokens**: Stored securely in `~/.basic-memory/auth/token`
648+
- **Tokens**: Stored securely in `~/.basic-memory/basic-memory-cloud.json`
595649
- **Transport**: All data encrypted in transit (HTTPS)
596650
- **Credentials**: Scoped S3 credentials (read-write to your tenant only)
597651
- **Isolation**: Your data isolated from other tenants
@@ -655,8 +709,9 @@ bm project ls --name <project> --path <subpath>
655709
1. **Enable cloud mode** - `bm cloud login`
656710
2. **Install rclone** - `bm cloud setup`
657711
3. **Add projects with sync** - `bm project add research --local-path ~/Documents/research`
658-
4. **Establish baseline** - `bm project bisync --name research --resync`
659-
5. **Daily workflow** - `bm project bisync --name research`
712+
4. **Preview first sync** - `bm project bisync --name research --resync --dry-run`
713+
5. **Establish baseline** - `bm project bisync --name research --resync`
714+
6. **Daily workflow** - `bm project bisync --name research`
660715

661716
**Key benefits:**
662717
- ✅ Each project independently syncs (or doesn't)
@@ -668,4 +723,4 @@ bm project ls --name <project> --path <subpath>
668723
**Future enhancements:**
669724
- `--all` flag to sync all configured projects
670725
- Project list showing sync status
671-
- Watch mode for automatic sync
726+
- Watch mode for automatic sync

specs/SPEC-20 Simplified Project-Scoped Rclone Sync.md

Lines changed: 157 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
11
---
22
title: 'SPEC-20: Simplified Project-Scoped Rclone Sync'
33
date: 2025-01-27
4-
status: Draft
4+
updated: 2025-01-28
5+
status: Implemented
56
priority: High
67
goal: Simplify cloud sync by making it project-scoped, safe by design, and closer to native rclone commands
78
parent: SPEC-8
@@ -1019,11 +1020,15 @@ rm -rf ~/basic-memory-cloud-sync/
10191020
- [x] Create `project.py`: Add `project bisync` command
10201021
- [x] Create `project.py`: Add `project check` command
10211022
- [x] Create `project.py`: Add `project ls` command
1023+
- [x] Create `project.py`: Add `project bisync-reset` command
10221024
- [x] Import rclone_commands module and get_mount_info helper
1023-
- [ ] Update `project list` to show sync status (optional)
1024-
- [ ] Update `cloud/core_commands.py`: Simplify `cloud setup` command (optional)
1025-
- [ ] Add helper functions: `get_all_sync_projects()`, `get_project_by_name()` (optional)
1026-
- [ ] Write integration tests for new commands (deferred)
1025+
- [x] Update `project list` to show local sync paths in cloud mode
1026+
- [x] Update `project list` to conditionally show columns based on config
1027+
- [x] Update `project remove` to clean up local directories and bisync state
1028+
- [x] Add automatic database sync trigger after file sync operations
1029+
- [x] Add path normalization to prevent S3 mount point leakage
1030+
- [x] Update `cloud/core_commands.py`: Simplified `cloud setup` command
1031+
- [x] Write unit tests for `project add --local-path` (4 tests passing)
10271032

10281033
### Phase 5: Cleanup ✅
10291034
- [x] Remove `mount_commands.py` (entire file)
@@ -1053,23 +1058,154 @@ rm -rf ~/basic-memory-cloud-sync/
10531058
- [x] Update tests to remove references to deprecated functionality
10541059
- [x] All typecheck errors resolved
10551060

1056-
### Phase 6: Documentation
1057-
- [ ] Update `docs/cloud-cli.md` with new workflow
1058-
- [ ] Add migration guide for existing users
1059-
- [ ] Update command reference
1060-
- [ ] Add troubleshooting section
1061-
- [ ] Update SPEC-8 with "Superseded by SPEC-20" note
1062-
- [ ] Add examples for common workflows
1063-
1064-
### Testing & Validation
1065-
- [ ] Test Scenario 1: New user setup
1066-
- [ ] Test Scenario 2: Multiple projects
1067-
- [ ] Test Scenario 3: Project without sync
1068-
- [ ] Test Scenario 4: Integrity check
1069-
- [ ] Test Scenario 5: Safety features (max delete)
1070-
- [ ] Verify performance targets (setup < 30s, sync < 5s)
1071-
- [ ] Test migration from SPEC-8 implementation
1061+
### Phase 6: Documentation ✅
1062+
- [x] Update `docs/cloud-cli.md` with new workflow
1063+
- [x] Add troubleshooting section for empty directory issues
1064+
- [x] Add troubleshooting section for bisync state corruption
1065+
- [x] Document `bisync-reset` command usage
1066+
- [x] Update command reference with all new commands
1067+
- [x] Add examples for common workflows
1068+
- [ ] Add migration guide for existing users (deferred - no users on old system yet)
1069+
- [ ] Update SPEC-8 with "Superseded by SPEC-20" note (deferred)
10721070

1071+
### Testing & Validation ✅
1072+
- [x] Test Scenario 1: New user setup (manual testing complete)
1073+
- [x] Test Scenario 2: Multiple projects (manual testing complete)
1074+
- [x] Test Scenario 3: Project without sync (manual testing complete)
1075+
- [x] Test Scenario 4: Integrity check (manual testing complete)
1076+
- [x] Test Scenario 5: bisync-reset command (manual testing complete)
1077+
- [x] Test cleanup on remove (manual testing complete)
1078+
- [x] Verify all commands work end-to-end
1079+
- [x] Document known issues (empty directory bisync limitation)
1080+
- [ ] Automated integration tests (deferred)
1081+
- [ ] Test migration from SPEC-8 implementation (N/A - no users yet)
1082+
1083+
## Implementation Notes
1084+
1085+
### Key Improvements Added During Implementation
1086+
1087+
**1. Path Normalization (Critical Bug Fix)**
1088+
1089+
**Problem:** Files were syncing to `/app/data/app/data/project/` instead of `/app/data/project/`
1090+
1091+
**Root cause:**
1092+
- S3 bucket contains projects directly (e.g., `basic-memory-llc/`)
1093+
- Fly machine mounts bucket at `/app/data/`
1094+
- API returns paths like `/app/data/basic-memory-llc` (mount point + project)
1095+
- Rclone was using this full path, causing path doubling
1096+
1097+
**Solution (three layers):**
1098+
- API side: Added `normalize_project_path()` in `project_router.py` to strip `/app/data/` prefix
1099+
- CLI side: Added defensive normalization in `project.py` commands
1100+
- Rclone side: Updated `get_project_remote()` to strip prefix before building remote path
1101+
1102+
**Files modified:**
1103+
- `src/basic_memory/api/routers/project_router.py` - API normalization
1104+
- `src/basic_memory/cli/commands/project.py` - CLI normalization
1105+
- `src/basic_memory/cli/commands/cloud/rclone_commands.py` - Rclone remote path construction
1106+
1107+
**2. Automatic Database Sync After File Operations**
1108+
1109+
**Enhancement:** After successful file sync or bisync, automatically trigger database sync via API
1110+
1111+
**Implementation:**
1112+
- After `project sync`: POST to `/{project}/project/sync`
1113+
- After `project bisync`: POST to `/{project}/project/sync` + update config timestamps
1114+
- Skip trigger on `--dry-run`
1115+
- Graceful error handling with warnings
1116+
1117+
**Benefit:** Files and database stay in sync automatically without manual intervention
1118+
1119+
**3. Enhanced Project Removal with Cleanup**
1120+
1121+
**Enhancement:** `bm project remove` now properly cleans up local artifacts
1122+
1123+
**Behavior with `--delete-notes`:**
1124+
- ✓ Removes project from cloud API
1125+
- ✓ Deletes cloud files
1126+
- ✓ Removes local sync directory
1127+
- ✓ Removes bisync state directory
1128+
- ✓ Removes `cloud_projects` config entry
1129+
1130+
**Behavior without `--delete-notes`:**
1131+
- ✓ Removes project from cloud API
1132+
- ✗ Keeps local files (shows path in message)
1133+
- ✓ Removes bisync state directory (cleanup)
1134+
- ✓ Removes `cloud_projects` config entry
1135+
1136+
**Files modified:**
1137+
- `src/basic_memory/cli/commands/project.py` - Enhanced `remove_project()` function
1138+
1139+
**4. Bisync State Reset Command**
1140+
1141+
**New command:** `bm project bisync-reset <project>`
1142+
1143+
**Purpose:** Clear bisync state when it becomes corrupted (e.g., after mixing dry-run and actual runs)
1144+
1145+
**What it does:**
1146+
- Removes all bisync metadata from `~/.basic-memory/bisync-state/{project}/`
1147+
- Forces fresh baseline on next `--resync`
1148+
- Safe operation (doesn't touch files)
1149+
- Also runs automatically on project removal
1150+
1151+
**Files created:**
1152+
- Added `bisync-reset` command to `src/basic_memory/cli/commands/project.py`
1153+
1154+
**5. Improved UI for Project List**
1155+
1156+
**Enhancements:**
1157+
- Shows "Local Path" column in cloud mode for projects with sync configured
1158+
- Conditionally shows/hides columns based on config:
1159+
- Local Path: only in cloud mode
1160+
- Default: only when `default_project_mode` is True
1161+
- Uses `no_wrap=True, overflow="fold"` to prevent path truncation
1162+
- Applies path normalization to prevent showing mount point details
1163+
1164+
**Files modified:**
1165+
- `src/basic_memory/cli/commands/project.py` - Enhanced `list_projects()` function
1166+
1167+
**6. Documentation of Known Issues**
1168+
1169+
**Issue documented:** Rclone bisync limitation with empty directories
1170+
1171+
**Problem:** "Empty prior Path1 listing. Cannot sync to an empty directory"
1172+
1173+
**Explanation:** Bisync creates listing files that track state. When both directories are completely empty, these listing files are considered invalid.
1174+
1175+
**Solution documented:** Add at least one file (like README.md) before running `--resync`
1176+
1177+
**Files updated:**
1178+
- `docs/cloud-cli.md` - Added troubleshooting sections for:
1179+
- Empty directory issues
1180+
- Bisync state corruption
1181+
- Usage of `bisync-reset` command
1182+
1183+
### Rclone Flag Fix
1184+
1185+
**Bug fix:** Incorrect rclone flag causing sync failures
1186+
1187+
**Error:** `unknown flag: --filters-file`
1188+
1189+
**Fix:** Changed `--filters-file` to correct flag `--filter-from` in both `project_sync()` and `project_bisync()` functions
1190+
1191+
**Files modified:**
1192+
- `src/basic_memory/cli/commands/cloud/rclone_commands.py`
1193+
1194+
### Test Coverage
1195+
1196+
**Unit tests added:**
1197+
- `tests/cli/test_project_add_with_local_path.py` - 4 tests for `--local-path` functionality
1198+
- Test with local path saves to config
1199+
- Test without local path doesn't save to config
1200+
- Test tilde expansion in paths
1201+
- Test nested directory creation
1202+
1203+
**Manual testing completed:**
1204+
- All 10 project commands tested end-to-end
1205+
- Path normalization verified
1206+
- Database sync trigger verified
1207+
- Cleanup on remove verified
1208+
- Bisync state reset verified
10731209

10741210
## Future Enhancements (Out of Scope)
10751211

src/basic_memory/api/routers/project_router.py

Lines changed: 20 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,24 @@
2727
project_resource_router = APIRouter(prefix="/projects", tags=["project_management"])
2828

2929

30+
def normalize_project_path(path: str) -> str:
31+
"""Normalize project path by stripping mount point prefix.
32+
33+
In cloud deployments, the S3 bucket is mounted at /app/data. We strip this
34+
prefix from project paths to avoid leaking implementation details and to
35+
ensure paths match the actual S3 bucket structure.
36+
37+
Args:
38+
path: Project path (e.g., "/app/data/basic-memory-llc")
39+
40+
Returns:
41+
Normalized path (e.g., "/basic-memory-llc")
42+
"""
43+
if path.startswith("/app/data/"):
44+
return path.removeprefix("/app/data")
45+
return path
46+
47+
3048
@project_router.get("/info", response_model=ProjectInfoResponse)
3149
async def get_project_info(
3250
project_service: ProjectServiceDep,
@@ -50,7 +68,7 @@ async def get_project(
5068

5169
return ProjectItem(
5270
name=found_project.name,
53-
path=found_project.path,
71+
path=normalize_project_path(found_project.path),
5472
is_default=found_project.is_default or False,
5573
)
5674

@@ -167,7 +185,7 @@ async def list_projects(
167185
project_items = [
168186
ProjectItem(
169187
name=project.name,
170-
path=project.path,
188+
path=normalize_project_path(project.path),
171189
is_default=project.is_default or False,
172190
)
173191
for project in projects

src/basic_memory/cli/commands/cloud/core_commands.py

Lines changed: 12 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -174,11 +174,18 @@ def setup() -> None:
174174
console.print("\n[bold green]✓ Cloud setup completed successfully![/bold green]")
175175
console.print("\n[bold]Next steps:[/bold]")
176176
console.print("1. Add a project with local sync path:")
177-
console.print(" bm project add research ~/projects/research --local-path ~/sync/research")
178-
console.print("\n2. Sync your project:")
179-
console.print(" bm project bisync --name research --resync # First time")
180-
console.print(" bm project bisync --name research # Subsequent syncs")
181-
console.print("\n[dim]Use 'bm project --help' for more commands[/dim]")
177+
console.print(" bm project add research --local-path ~/Documents/research")
178+
console.print("\n Or configure sync for an existing project:")
179+
console.print(" bm project sync-setup research ~/Documents/research")
180+
console.print("\n2. Preview the initial sync (recommended):")
181+
console.print(" bm project bisync --name research --resync --dry-run")
182+
console.print("\n3. If all looks good, run the actual sync:")
183+
console.print(" bm project bisync --name research --resync")
184+
console.print("\n4. Subsequent syncs (no --resync needed):")
185+
console.print(" bm project bisync --name research")
186+
console.print(
187+
"\n[dim]Tip: Always use --dry-run first to preview changes before syncing[/dim]"
188+
)
182189

183190
except (RcloneInstallError, BisyncError, CloudAPIError) as e:
184191
console.print(f"\n[red]Setup failed: {e}[/red]")

0 commit comments

Comments
 (0)