4 changes: 4 additions & 0 deletions apps/obsidian/.gitignore
@@ -0,0 +1,4 @@
e2e/test-vault*/
e2e/test-results/
e2e/html-report/
e2e/.env
240 changes: 240 additions & 0 deletions apps/obsidian/e2e/NOTES.md
@@ -0,0 +1,240 @@
# E2E Testing for Obsidian Plugin — Notes

Member

This looks like an artifact from an LLM conversation. Should it live in the repo? If so, why?


## Approaches Considered

### Option 1: Playwright `electron.launch()`

The standard Playwright approach for Electron apps — point `executablePath` at the binary and let Playwright manage the process lifecycle.

**Pros:**

- First-class Playwright API — `app.evaluate()` runs code in the main process, not just renderer
- Automatic process lifecycle management (launch, close, cleanup)
- Access to Electron-specific APIs (e.g., `app.evaluate(() => process.env)`)
- Well-documented, widely used for Electron testing

**Cons:**

- **Does not work with Obsidian.** Obsidian's executable is a launcher that loads an `.asar` package (`obsidian-1.11.7.asar`) and forks a new Electron process. Playwright connects to the initial process, which exits, causing `kill EPERM` and connection failures.
- No workaround without modifying Obsidian's startup or using a custom Electron shell

**Verdict:** Not viable for Obsidian.

---

### Option 2: CDP via `chromium.connectOverCDP()` (chosen)

Launch Obsidian as a subprocess with `--remote-debugging-port=9222`, then connect via Chrome DevTools Protocol.

**Pros:**

- Works with Obsidian's forked process architecture — the debug port is inherited by the child process
- Full access to renderer via `page.evaluate()` — Obsidian's global `app` object is available
- Keyboard/mouse interaction works normally
- Can take screenshots, traces, and use all Playwright assertions
- Process is managed explicitly — clear control over startup and teardown

**Cons:**

- No main process access (can't call Electron APIs directly, only renderer-side `window`/`app`)
- Must manually manage process lifecycle (spawn, pkill, port polling)
- Fixed debug port (9222) means tests can't run in parallel across multiple Obsidian instances without port management
- Port polling adds ~2-5s startup overhead
- `pkill -f Obsidian` in setup is aggressive — kills ALL Obsidian instances, not just test ones

**Verdict:** Works well for PoC. Sufficient for single-worker CI/local testing.
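The port polling listed under the cons could look roughly like the following. This is a minimal sketch using only Node's `net` module; `canConnect` and `waitForPort` are hypothetical names rather than the helpers used in this PR, and the timeout and interval defaults are illustrative.

```ts
import * as net from "net";

// Resolve true if a TCP connection to host:port succeeds, false otherwise.
const canConnect = (port: number, host = "127.0.0.1"): Promise<boolean> =>
  new Promise<boolean>((resolve) => {
    const socket = net.connect({ port, host });
    socket.once("connect", () => {
      socket.destroy();
      resolve(true);
    });
    socket.once("error", () => {
      socket.destroy();
      resolve(false);
    });
  });

// Poll until the port accepts connections or the timeout elapses.
export const waitForPort = async (
  port: number,
  timeoutMs = 10_000,
  intervalMs = 250,
): Promise<boolean> => {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await canConnect(port)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
};
```

A caller would await `waitForPort(9222)` after spawning Obsidian and only then call `chromium.connectOverCDP`. An open port does not guarantee the renderer is ready, so connection retry logic is still useful on top of this.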

---

### Option 3: Obsidian's built-in plugin testing (not explored)

Obsidian has no official testing framework. Some community approaches exist (e.g., `obsidian-jest`, hot-reload-based testing), but none are mature or maintained.

**Verdict:** Not a real option today.

---

## What We Learned

### Obsidian internals accessible via `page.evaluate()`

- `app.plugins.plugins["@discourse-graph/obsidian"]` — check plugin loaded
- `app.vault.getMarkdownFiles()` — list files
- `app.vault.read(file)` — read file content
- `app.vault.create(name, content)` — create files
- `app.workspace.openLinkText(path, "", false)` — open a file in the editor
- `app.commands.executeCommandById(id)` — could execute commands directly (alternative to command palette UI)

### Plugin command IDs

Commands are registered with IDs like `@discourse-graph/obsidian:create-discourse-node`. The command palette shows them as "Discourse Graph: Create discourse node".

### Modal DOM structure

The `ModifyNodeModal` renders React inside Obsidian's `.modal-container`:

- Content: `<textarea>` (`.modal-container textarea`) — not an `<input>`
- Node type: `<select>` (`.modal-container select`)
- Confirm: `<button class="mod-cta">` inside `.modal-button-container`

### Vault configuration

Minimum config for plugin to load:

- `.obsidian/community-plugins.json` → `["@discourse-graph/obsidian"]`
- `.obsidian/app.json` → `{"livePreview": true}` (restricted mode must be off, but this is handled by Obsidian detecting the plugins dir)
- Plugin files (`main.js`, `manifest.json`, `styles.css`) in `.obsidian/plugins/@discourse-graph/obsidian/`
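The minimum configuration above can be written to disk in a few lines. The sketch below assumes the file contents listed above are sufficient; `writeMinimalVaultConfig` is a hypothetical name, and it only creates the config files and the plugin directory, leaving the copy of `main.js`, `manifest.json`, and `styles.css` to the caller.

```ts
import * as fs from "fs";
import * as path from "path";

// Create .obsidian/ with the two config files and the (empty) plugin directory.
// Returns the plugin directory so the caller can copy build output into it.
export const writeMinimalVaultConfig = (
  vaultPath: string,
  pluginId: string,
): string => {
  const configDir = path.join(vaultPath, ".obsidian");
  // A scoped ID such as "@discourse-graph/obsidian" maps to nested folders.
  const pluginDir = path.join(configDir, "plugins", ...pluginId.split("/"));
  fs.mkdirSync(pluginDir, { recursive: true });

  fs.writeFileSync(
    path.join(configDir, "community-plugins.json"),
    JSON.stringify([pluginId]),
  );
  fs.writeFileSync(
    path.join(configDir, "app.json"),
    JSON.stringify({ livePreview: true }),
  );
  return pluginDir;
};
```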

---

## Current architecture

### Vault launch flow

Obsidian has no `--vault` CLI flag. Tests select a vault by editing `~/Library/Application Support/obsidian/obsidian.json` before launch:

1. `resolveVaultPath()` returns an absolute path (`e2e/test-vault` by default, or `OBSIDIAN_TEST_VAULT`)
2. `ensureVaultWithPlugin()` creates `.obsidian/` config and copies `dist/` plugin files into the vault
3. `setActiveVault()` clears all `open` flags and sets `open: true` on the matching vault entry (paths compared via `path.resolve`)
4. Obsidian spawns with `--remote-debugging-port=9222`
5. Playwright connects via CDP and `verifyActiveVault()` asserts `app.vault.adapter.basePath` matches the expected path
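Step 3 is the least obvious part of this flow, so here is a sketch of the flag rewrite as a pure function. It assumes `obsidian.json` has a top-level `vaults` map whose entries carry `path` and an optional `open` flag; the type names and `markVaultOpen` are hypothetical, and the real `setActiveVault()` may differ (for example in how it handles a vault that is not yet registered).

```ts
import * as path from "path";

type VaultEntry = { path: string; ts?: number; open?: boolean };
type ObsidianConfig = { vaults?: Record<string, VaultEntry> };

// Clear every `open` flag, then set `open: true` on the entry whose
// resolved path equals the resolved target path.
export const markVaultOpen = (
  config: ObsidianConfig,
  vaultPath: string,
): { config: ObsidianConfig; matched: boolean } => {
  const target = path.resolve(vaultPath);
  const vaults: Record<string, VaultEntry> = {};
  let matched = false;

  for (const [id, entry] of Object.entries(config.vaults ?? {})) {
    const next: VaultEntry = { ...entry };
    delete next.open;
    if (path.resolve(entry.path) === target) {
      next.open = true;
      matched = true;
    }
    vaults[id] = next;
  }
  return { config: { ...config, vaults }, matched };
};
```

Returning `matched` lets the caller decide what to do when the test vault has no entry yet, instead of silently launching whichever vault Obsidian picks.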

### Obsidian CLI vs CDP launch

The official **Obsidian CLI** (`obsidian help`, `obsidian eval`, etc.) is a control channel to a **running** app. It cannot start an instance with `--remote-debugging-port` — that flag must be passed at launch.

E2E tests therefore:

1. Launch via macOS `open -na Obsidian.app --args --remote-debugging-port=9222`
2. Connect Playwright with `chromium.connectOverCDP`
3. Optionally use CLI afterward for vault ops (`obsidian vault=... eval "..."`)
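The launch command and the CDP endpoint both depend on the same port, so it can help to derive them from one value. A small sketch, with hypothetical helper names `buildOpenArgs` and `cdpEndpoint`; the argument list mirrors the `open -na` command in step 1.

```ts
// Arguments for macOS `open`, matching:
//   open -na Obsidian.app --args --remote-debugging-port=<port>
export const buildOpenArgs = (
  port: number,
  appName = "Obsidian.app",
): string[] => ["-na", appName, "--args", `--remote-debugging-port=${port}`];

// HTTP endpoint to hand to Playwright's chromium.connectOverCDP().
export const cdpEndpoint = (port: number): string =>
  `http://127.0.0.1:${port}`;
```

With these, the fixture could run `spawn("open", buildOpenArgs(9222))` from `child_process` and then `chromium.connectOverCDP(cdpEndpoint(9222))`, which keeps a future switch to a random port a one-line change.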

Enable CLI in **Settings → General → Command line interface** anyway — newer Obsidian builds gate command-line launches on this setting. Reinstall the latest `.app` from [obsidian.md/download](https://obsidian.md/download) if you see "installer out of date".

### Environment variables

Copy `e2e/.env.example` to `apps/obsidian/.env` (loaded by `playwright.config.ts`):

| Variable | Purpose |
| --------------------- | ------------------------------------------------------------------------------------------- |
| `OBSIDIAN_APP_PATH` | Path to Obsidian executable (default: `/Applications/Obsidian.app/Contents/MacOS/Obsidian`) |
| `OBSIDIAN_TEST_VAULT` | Optional override vault path (skips vault cleanup on teardown) |

### Single Obsidian instance

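A worker-scoped Playwright fixture in `e2e/fixtures/obsidian.ts` launches Obsidian once per test run (`workers: 1`). All specs share the same `obsidian` fixture, which exposes the CDP `browser`, the `page`, and the `vaultPath`. Teardown closes the browser connection, kills the process via the debug port, restores `obsidian.json`, and cleans the test vault unless `OBSIDIAN_TEST_VAULT` is set.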

### Test organization

```
e2e/
├── fixtures/obsidian.ts   # Worker fixture: launch + teardown
├── scenarios/             # Test logic (assertions, interactions)
│   ├── plugin-load.ts
│   └── node-creation.ts
└── tests/
    └── smoke.spec.ts      # Thin orchestrator: calls each scenario
```

`pnpm test:e2e` runs `smoke.spec.ts` only. Scenarios hold the real test logic; the spec file just wires them in order. Tests use **cumulative vault state** — later steps may see notes created by earlier ones.

### Run commands

```bash
cd apps/obsidian
pnpm build
pnpm test:e2e # full smoke suite
pnpm test:e2e:ui # Playwright UI (humans)
```

---

## Proposal: Full Agentic Testing Flow

### Goal

AI coding agents (Cursor, Claude Code) can run `pnpm test:e2e` after making changes to automatically verify features work end-to-end. The test suite should be comprehensive enough to catch regressions, fast enough to run frequently, and deterministic enough to trust the results.

### Phase 1: Stabilize the PoC

**Done:**

- Single shared vault (`e2e/test-vault`) with path normalization and post-launch verification
- Configurable `OBSIDIAN_APP_PATH` via `.env`
- Worker-scoped fixture for one Obsidian instance per run
- Port-based process teardown (`killObsidianOnDebugPort`)
- CDP connection retry logic
- Shared scenario helpers with thin spec files + smoke orchestrator

**Remaining:**

- Use a unique temp directory per test run (`os.tmpdir()`) instead of a fixed `test-vault/` path
- Use a random debug port to allow parallel runs
- Add a `test.beforeEach` that resets vault state (delete all non-config files) instead of cumulative state
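The first two remaining items can be sketched with Node's standard library alone. `createRunVaultDir` and `getFreePort` are hypothetical names, and the `obsidian-e2e-` prefix is an arbitrary choice.

```ts
import * as fs from "fs";
import * as net from "net";
import * as os from "os";
import * as path from "path";

// Create a fresh, uniquely named vault directory under the OS temp dir.
export const createRunVaultDir = (): string =>
  fs.mkdtempSync(path.join(os.tmpdir(), "obsidian-e2e-"));

// Ask the OS for an unused TCP port by binding to port 0, then release it.
export const getFreePort = (): Promise<number> =>
  new Promise<number>((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        server.close();
        reject(new Error("Could not determine the assigned port"));
        return;
      }
      const { port } = address;
      server.close(() => resolve(port));
    });
  });
```

There is a small window between releasing the port and Obsidian binding it, so another process could claim it first. For local and single-runner CI use that risk is usually acceptable, but the launch code should still fail clearly if the CDP connection never comes up.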

**Done (post-architecture):**

- macOS `open -na` launch for reliable CDP port
- Modal selectors updated for `ModifyNodeModal` textarea UI
- Single `smoke.spec.ts` orchestrates all scenarios; `pnpm test:e2e` runs smoke only

### Phase 2: Expand test coverage

**Core plugin features to test:**

- Create each discourse node type (Question, Claim, Evidence, Source)
- Verify frontmatter (`nodeTypeId`) is set correctly
- Verify file naming conventions (e.g., `QUE - `, `CLM - `, `EVD - `, `SRC - `)
- Open node type menu via hotkey (`Cmd+\`)
- Discourse context view toggle
- Settings panel opens and renders

**Vault-level tests:**

- Create multiple nodes and verify they appear in file explorer
- Verify node format regex matching (files follow the format pattern)
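A file-name check for these tests could start as a simple prefix lookup. The sketch below pairs each node type with a prefix in the order the lists above give them; only `QUE - ` for Question is confirmed by the constants in this PR, so the other three pairings are assumptions, and the plugin's real format patterns may be user-configurable. `detectNodeType` is a hypothetical helper.

```ts
// Assumed type-to-prefix mapping (only the Question prefix is confirmed).
const NODE_PREFIXES = {
  Question: "QUE - ",
  Claim: "CLM - ",
  Evidence: "EVD - ",
  Source: "SRC - ",
} as const;

type NodeType = keyof typeof NODE_PREFIXES;

// Return the node type whose prefix starts the basename, or null.
// A bare prefix with no title after it does not count as a match.
export const detectNodeType = (basename: string): NodeType | null => {
  for (const [type, prefix] of Object.entries(NODE_PREFIXES)) {
    if (basename.startsWith(prefix) && basename.length > prefix.length) {
      return type as NodeType;
    }
  }
  return null;
};
```

A scenario could then list the vault's files via `app.vault.getMarkdownFiles()` and assert that each newly created note maps to the expected type.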

**Use `app.commands.executeCommandById()` as the primary way to trigger commands** — faster, more reliable, and avoids flaky command palette typing. Reserve command palette tests for testing the palette itself.

### Phase 3: Agentic integration

**For agents to use the tests effectively:**

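1. **Fast feedback loop**: Tests should complete in <30s total. Current PoC is ~14s for 2 tests, which is good. Keep Obsidian running between test files; the worker-scoped fixture already does this with `workers: 1`, and Playwright's `globalSetup`/`globalTeardown` is an alternative if the fixture becomes limiting.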

2. **Clear error messages** — When a test fails, the agent needs to understand WHY. Add descriptive assertion messages:

```ts
expect(
  pluginLoaded,
  "Plugin should be loaded — check dist/ is built and plugin ID matches manifest.json",
).toBe(true);
```

3. **Screenshot-on-failure for visual debugging** — Already configured. Consider adding `page.screenshot()` at key checkpoints even on success, so agents can visually verify state.

4. **Test file organization** — Add new scenarios under `e2e/scenarios/`, then call them from `smoke.spec.ts`.

5. **CI integration**: Run in GitHub Actions with a macOS runner. Obsidian would need to be pre-installed on the runner (or downloaded in a setup step). This is the biggest open question: Obsidian doesn't have a headless mode, so it needs a real display. `xvfb` is an X11 tool and only helps on Linux runners; a macOS runner would have to rely on its own desktop session.

6. **Agent-executable test commands:**
```bash
pnpm test:e2e # run all tests
pnpm test:e2e -- --grep "node creation" # run specific tests
pnpm test:e2e:ui # interactive Playwright UI (for humans)
```

### Phase 4: Advanced (future)

- **Visual regression testing** — Compare screenshots against baselines to catch UI regressions
- **Obsidian version matrix** — Test against multiple Obsidian versions (download different `.asar` files)
- **Headless mode wrapper** — Investigate running Obsidian with `--disable-gpu --headless` flags (may not work due to Obsidian's renderer requirements)
- **Test data fixtures** — Pre-built vaults with specific node/relation configurations for testing complex scenarios
- **Performance benchmarks** — Measure plugin load time, command execution time

### Open Questions

1. **CI runner setup** — How to install Obsidian on GitHub Actions macOS runners? Is there a `.dmg` download URL that's stable? Or do we cache the `.app` bundle?
2. **Obsidian updates** — Obsidian auto-updates the `.asar`. Should tests pin a specific version? How to prevent auto-update during test runs?
3. **Multiple vaults** — Obsidian tracks known vaults globally. Test vaults may accumulate in Obsidian's vault list. Need cleanup strategy.
4. **Restricted mode** — The PoC doesn't explicitly disable restricted mode via config. The plugin loads because the `community-plugins.json` file is present, but a fresh Obsidian install might prompt the user to enable community plugins. Need to investigate if there's a config flag to skip this.
23 changes: 23 additions & 0 deletions apps/obsidian/e2e/constants.ts
@@ -0,0 +1,23 @@
import fs from "fs";
import path from "path";

type PluginManifest = {
  id: string;
  name: string;
};

const manifest = JSON.parse(
  fs.readFileSync(path.join(__dirname, "..", "manifest.json"), "utf-8"),
) as PluginManifest;

export const PLUGIN_ID = manifest.id;
export const PLUGIN_NAME = manifest.name;
export const CREATE_NODE_COMMAND_ID = "create-discourse-node";
export const CREATE_NODE_PALETTE_LABEL = `${PLUGIN_NAME}: Create discourse node`;
export const QUESTION_NODE_PREFIX = "QUE - ";
export const PLUGIN_BUILD_FILES = [
  "main.js",
  "manifest.json",
  "styles.css",
] as const;
export const E2E_TIMEOUT = 10_000;
58 changes: 58 additions & 0 deletions apps/obsidian/e2e/fixtures/obsidian.ts
@@ -0,0 +1,58 @@
import { test as base } from "@playwright/test";
import type { Browser, Page } from "@playwright/test";
import {
  DEFAULT_TEST_VAULT,
  createTestVault,
  ensureVaultWithPlugin,
  launchObsidian,
  restoreObsidianConfig,
  resolveVaultPath,
  isCustomVault,
  killObsidianOnDebugPort,
  cleanTestVault,
} from "../helpers/obsidian-setup";
import { cleanupE2eScratchFiles } from "../helpers/commands";

type ObsidianWorker = {
  browser: Browser;
  page: Page;
  vaultPath: string;
  originalObsidianConfig?: string;
};

export const test = base.extend<object, { obsidian: ObsidianWorker }>({
  obsidian: [
    async ({}, use) => {
      const vaultPath = resolveVaultPath(DEFAULT_TEST_VAULT);
      if (isCustomVault()) {
        ensureVaultWithPlugin(vaultPath);
      } else {
        createTestVault(vaultPath);
      }

      const launched = await launchObsidian(vaultPath);

**P1:** Restore Obsidian state when launch setup fails

When `launchObsidian()` throws before `use` is reached (for example, the debug port never opens, or `verifyActiveVault` fails after `setActiveVault` has rewritten `~/Library/Application Support/obsidian/obsidian.json`), Playwright never executes the teardown code below, so the user's Obsidian config remains pointed at the test vault and any launched process may keep running. Wrap the launch/setup path in `try`/`finally` (or make `launchObsidian` roll back on failure) so cleanup happens even when fixture setup fails.
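One way to get the rollback behavior described in this comment is a small wrapper around the launch step. This is only a sketch of the pattern; `launchWithRollback` is a hypothetical helper, not code from this PR.

```ts
// Run `launch`; if it rejects, run `rollback` before rethrowing the error.
export const launchWithRollback = async <T>(
  launch: () => Promise<T>,
  rollback: () => Promise<void> | void,
): Promise<T> => {
  try {
    return await launch();
  } catch (error) {
    // Undo side effects of a partial launch, then surface the original error.
    await rollback();
    throw error;
  }
};
```

In the fixture, `launch` would wrap `launchObsidian(vaultPath)` and `rollback` would kill the process on the debug port and restore `obsidian.json`. Since the original config is currently returned by `launchObsidian`, the fixture would need to snapshot it before launching for the rollback to have something to restore. The part after `use` can sit in a `finally` block so teardown also runs when a test worker fails.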



      await use({
        browser: launched.browser,
        page: launched.page,
        vaultPath,
        originalObsidianConfig: launched.originalObsidianConfig,
      });

      await cleanupE2eScratchFiles(launched.page).catch(() => {
        // Vault may already be unavailable if Obsidian shut down early
      });
      await launched.browser.close();
      await killObsidianOnDebugPort();
      if (launched.originalObsidianConfig) {
        restoreObsidianConfig(launched.originalObsidianConfig);
      }
      if (!isCustomVault()) {
        cleanTestVault(vaultPath);
      }
    },
    { scope: "worker" },
  ],
});

export { expect } from "@playwright/test";
70 changes: 70 additions & 0 deletions apps/obsidian/e2e/helpers/commands.ts
@@ -0,0 +1,70 @@
import type { Page } from "@playwright/test";
import {
  CREATE_NODE_COMMAND_ID,
  CREATE_NODE_PALETTE_LABEL,
  E2E_TIMEOUT,
  PLUGIN_ID,
} from "../constants";

export const E2E_SCRATCH_FILE = "e2e-scratch.md";
const LEGACY_SCRATCH_PREFIX = "scratch-e2e-";

export const executeCommand = async (
  page: Page,
  commandId: string = CREATE_NODE_COMMAND_ID,
): Promise<void> => {
  /* eslint-disable @typescript-eslint/no-unsafe-call, @typescript-eslint/no-unsafe-member-access */
  await page.evaluate(
    ({ pluginId, id }) => {
      // @ts-expect-error - Obsidian's global `app` is available at runtime
      app.commands.executeCommandById(`${pluginId}:${id}`);
    },
    { pluginId: PLUGIN_ID, id: commandId },
  );
  /* eslint-enable @typescript-eslint/no-unsafe-call, @typescript-eslint/no-unsafe-member-access */
};

/**
 * Execute a command via the command palette UI.
 * Use this when testing the palette interaction itself.
 */
export const executeCommandViaPalette = async (
  page: Page,
  commandLabel: string = CREATE_NODE_PALETTE_LABEL,
): Promise<void> => {
  await page.keyboard.press("Meta+p");
  await page.waitForSelector(".prompt-input", { timeout: E2E_TIMEOUT });

  await page.keyboard.type(commandLabel, { delay: 30 });
  await page.waitForSelector(".suggestion-item", { timeout: E2E_TIMEOUT });

  await page.keyboard.press("Enter");
  await page.waitForSelector(".prompt-container", {
    state: "hidden",
    timeout: E2E_TIMEOUT,
  });
};

/**
 * Remove e2e scratch files from the vault (including legacy scratch-e2e-* files).
 */
export const cleanupE2eScratchFiles = async (page: Page): Promise<void> => {
  /* eslint-disable @typescript-eslint/no-unsafe-call, @typescript-eslint/no-unsafe-member-access, @typescript-eslint/no-unsafe-assignment, @typescript-eslint/no-unsafe-return */
  await page.evaluate(
    async ({ scratchPath, legacyPrefix }) => {
      // @ts-expect-error - Obsidian's global `app` is available at runtime
      const vault = app.vault;
      const files = vault
        .getMarkdownFiles()
        .filter(
          (f: { path: string; basename: string }) =>
            f.path === scratchPath || f.basename.startsWith(legacyPrefix),
Comment on lines +60 to +61

**P2:** Delete the nodes created in custom vault runs

When `OBSIDIAN_TEST_VAULT` is set, teardown intentionally skips `cleanTestVault`, but this cleanup only removes `e2e-scratch.md` and legacy `scratch-e2e-*` files. The smoke scenario creates `QUE - What is discourse graph testing ...` notes in that same user-provided vault, so every e2e run leaves test nodes behind; track/delete the created basename or give e2e-created nodes a cleanup prefix that this filter removes.
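The "track the created basename" option from this comment could be expressed as a predicate that extends the existing filter. A sketch, assuming scenarios record each note they create in a set that is passed to teardown; `isE2eArtifact`, `CleanupOptions`, and `createdBasenames` are hypothetical names.

```ts
type VaultFile = { path: string; basename: string };

type CleanupOptions = {
  scratchPath: string;
  legacyPrefix: string;
  // Basenames of notes the current run created (recorded by the scenarios).
  createdBasenames: ReadonlySet<string>;
};

// True for the scratch file, legacy scratch files, and notes created this run.
export const isE2eArtifact = (
  file: VaultFile,
  { scratchPath, legacyPrefix, createdBasenames }: CleanupOptions,
): boolean =>
  file.path === scratchPath ||
  file.basename.startsWith(legacyPrefix) ||
  createdBasenames.has(file.basename);
```

Because `page.evaluate` serializes its arguments, the set would need to be passed as an array and rebuilt inside the callback. Tracking exact basenames avoids deleting a user's own `QUE - ` notes in a custom vault, which a prefix-only filter could not guarantee.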


        );
      for (const file of files) {
        await vault.delete(file);
      }
    },
    { scratchPath: E2E_SCRATCH_FILE, legacyPrefix: LEGACY_SCRATCH_PREFIX },
  );
  /* eslint-enable @typescript-eslint/no-unsafe-call, @typescript-eslint/no-unsafe-member-access, @typescript-eslint/no-unsafe-assignment, @typescript-eslint/no-unsafe-return */
};