Merged
160 commits
ba55134
docs(openspec): draft init/config UX overhaul (3 changes)
Aaronontheweb May 23, 2026
8f57cf5
docs: address self-scrutiny review of openspec changes
Aaronontheweb May 23, 2026
43ec915
docs(openspec): align init and config workflows
Aaronontheweb May 24, 2026
5915559
feat(config): prototype schema-driven search editor
Aaronontheweb May 25, 2026
1999ebf
fix(cli): stabilize init and config TUI flows
Aaronontheweb May 25, 2026
797681d
docs(ui): add search config redesign POC
Aaronontheweb May 26, 2026
4de2f79
docs(ui): simplify search config mockups
Aaronontheweb May 26, 2026
d16c102
refine(config): align search editor with init TUI
Aaronontheweb May 26, 2026
1cfb7be
refine(config): introduce typed search editor model
Aaronontheweb May 26, 2026
30f42af
refine(config): streamline search editor editing
Aaronontheweb May 26, 2026
d421c7a
refine(config): make search setup a focused workflow
Aaronontheweb May 27, 2026
665dde3
refine(config): keep search save flow in context
Aaronontheweb May 27, 2026
6c6ddff
feat(config): add workflow editor pilot
Aaronontheweb May 28, 2026
a8125b0
fix(tui): show active config selections
Aaronontheweb May 28, 2026
b4634c4
fix(config): preserve inactive exposure settings
Aaronontheweb May 28, 2026
b135531
refine(config): centralize editor session merges
Aaronontheweb May 28, 2026
d48879d
fix(config): reset exposure editor on reopen
Aaronontheweb May 28, 2026
e08cbbe
fix(config): return from search saved screen
Aaronontheweb May 28, 2026
af459c3
refine(config): route search saves through editor session
Aaronontheweb May 28, 2026
1c41bbc
feat(config): inline enabled feature toggles
Aaronontheweb May 29, 2026
c181016
feat(config): add inline security editors
Aaronontheweb May 29, 2026
a388d07
refine(config): improve audience profile editor
Aaronontheweb May 30, 2026
24fd27e
refine(config): use Termina back navigation for MCP grants
Aaronontheweb May 30, 2026
7786e77
refine(config): harden security access editors
Aaronontheweb May 30, 2026
1db6f2f
feat(config): add channels summary page
Aaronontheweb May 30, 2026
623ae4d
feat(config): add channels management editor
Aaronontheweb May 31, 2026
e89b790
fix(config): persist channel connection resets
Aaronontheweb May 31, 2026
14454dd
refine(config): unify channels validation
Aaronontheweb May 31, 2026
c178e24
fix(config): validate channel target contracts
Aaronontheweb May 31, 2026
0271a78
chore(opencode): use gpt-5.5 xhigh defaults
Aaronontheweb May 31, 2026
c7322b8
fix(config): restore post-rebase imports
Aaronontheweb May 31, 2026
da3d301
docs(testing): inventory cross-boundary contracts
Aaronontheweb May 31, 2026
bda9ba7
test(config): automate TUI regression gates
Aaronontheweb May 31, 2026
1459a35
test(config): audit config editor coverage
Aaronontheweb May 31, 2026
40634d4
test(config): prove channel validation uses persisted secrets
Aaronontheweb May 31, 2026
3b3eb9b
test(config): generalize leaf validation coverage
Aaronontheweb May 31, 2026
6650824
test(config): prove security access reset semantics
Aaronontheweb May 31, 2026
3e605af
fix(config): refresh daemon auth after exposure changes
Aaronontheweb Jun 1, 2026
7ace600
docs(plan): mark security access review complete
Aaronontheweb Jun 1, 2026
54b5747
docs(plan): scope remaining config surfaces
Aaronontheweb Jun 1, 2026
c90dee7
docs(plan): require config surface review
Aaronontheweb Jun 1, 2026
753f69b
fix(config): block invalid exposure bootstrap state
Aaronontheweb Jun 1, 2026
60e652c
feat(config): implement task 1 config surfaces
Aaronontheweb Jun 1, 2026
7aaf5f5
feat(config): add ops config surfaces
Aaronontheweb Jun 1, 2026
4b94910
fix(config): clarify channel management rows
Aaronontheweb Jun 4, 2026
cd6d923
fix(config): resolve saved channel labels
Aaronontheweb Jun 4, 2026
1aeda07
fix(config): stabilize channel setup input
Aaronontheweb Jun 4, 2026
db0ed4d
docs(config): specify autosave interaction contract
Aaronontheweb Jun 4, 2026
64cd19f
fix(config): autosave inline settings changes
Aaronontheweb Jun 4, 2026
91e43fa
fix(config): add channel done affordance
Aaronontheweb Jun 6, 2026
aff34ea
fix(config): use shared spinner for search validation
Aaronontheweb Jun 6, 2026
ea09850
docs(config): redesign skill sources workflows
Aaronontheweb Jun 6, 2026
a488687
feat(config): add skill sources manager
Aaronontheweb Jun 6, 2026
1ed87f5
fix(config): wire skill source detail actions
Aaronontheweb Jun 6, 2026
adae9c4
fix(config): clarify skill source input screens
Aaronontheweb Jun 6, 2026
5963323
docs(openspec): propose validated UI components
Aaronontheweb Jun 6, 2026
244360d
docs(openspec): specify validated UI contracts
Aaronontheweb Jun 6, 2026
1eec86e
docs(openspec): design validated UI components
Aaronontheweb Jun 6, 2026
2cfbeb5
docs(openspec): plan validated UI migration
Aaronontheweb Jun 6, 2026
7ed10c6
feat(tui): add validated UI commit pipeline
Aaronontheweb Jun 6, 2026
3d35510
feat(tui): validate skill source local paths
Aaronontheweb Jun 7, 2026
270c04e
feat(tui): validate skill source remote urls
Aaronontheweb Jun 7, 2026
a7b5770
feat(tui): validate skill source auth selection
Aaronontheweb Jun 7, 2026
e3b37e0
feat(tui): validate skill source remote save
Aaronontheweb Jun 7, 2026
1e8bd8c
fix(tui): keep validated text drafts current
Aaronontheweb Jun 7, 2026
1115e05
feat(tui): validate skill source edit screens
Aaronontheweb Jun 7, 2026
2de9633
feat(tui): validate skill source actions
Aaronontheweb Jun 7, 2026
28a2ef5
test(tui): guard skill source validation wiring
Aaronontheweb Jun 7, 2026
8f1e787
feat(tui): standardize probe validation dialogs
Aaronontheweb Jun 7, 2026
810d989
fix(tui): reuse native validated input controls
Aaronontheweb Jun 7, 2026
9b9bf80
fix(tui): return failed skill probes to edit screen
Aaronontheweb Jun 7, 2026
e64955d
Import terminal-faithful TUI prototype as design reference
Aaronontheweb Jun 9, 2026
74bd7ea
Add /opsx reconciliation plan for init + config
Aaronontheweb Jun 9, 2026
1abd627
feat(init): simplify netclaw init to a 5-step bootstrap with existing…
Aaronontheweb Jun 9, 2026
87b8865
refactor(config): remove validated-UI commit framework; inline Skill …
Aaronontheweb Jun 9, 2026
ca4b114
feat(config): ship prototype-proven config UX deltas
Aaronontheweb Jun 9, 2026
cd724e0
chore(openspec): archive section-editor-abstraction (confirmed valid)
Aaronontheweb Jun 9, 2026
0230b47
fix(config,init): address code-review findings
Aaronontheweb Jun 9, 2026
b0cdf28
fix(config): address manual TUI review findings
Aaronontheweb Jun 10, 2026
288e856
fix(config): persist channels by the picker's enabled state, not the …
Aaronontheweb Jun 10, 2026
c554606
fix(config): drop a stale default UpdateChannel when the wizard prese…
Aaronontheweb Jun 10, 2026
4f24c7b
fix(config): persist all channels and flag unresolved ones instead of…
Aaronontheweb Jun 10, 2026
e12d5f6
fix(config): stop blocking channel save when probe Success is false f…
Aaronontheweb Jun 11, 2026
3c6b269
fix(channels): sync text inputs on TextChanged so auto-routed pastes …
Aaronontheweb Jun 11, 2026
7f79d1e
fix(channels): normalize resolved Slack channel names to IDs on re-open
Aaronontheweb Jun 11, 2026
be45666
feat(config): directory pickers for skill folders and the workspaces …
Aaronontheweb Jun 12, 2026
233b4e5
chore(design): extract the browser TUI prototype to a standalone repo
Aaronontheweb Jun 12, 2026
e647d0d
test(smoke): cover directory pickers with dedicated single-launch tapes
Aaronontheweb Jun 12, 2026
b519489
fix(config): document intentional cancellation swallow in channel lab…
Aaronontheweb Jun 12, 2026
d1b712c
test(smoke): guard config tapes against alt-screen restore race after…
Aaronontheweb Jun 12, 2026
c41bc69
refactor(init): drop workspaces + notification-webhook substeps from …
Aaronontheweb Jun 12, 2026
ffb2b14
fix(init): surface the container-supervisor reason when the daemon ne…
Aaronontheweb Jun 12, 2026
3fd5096
test(smoke): align init-wizard tape with the trimmed Identity step
Aaronontheweb Jun 12, 2026
e94443b
feat(init): auto-launch chat once the health check passes
Aaronontheweb Jun 15, 2026
844a1ee
chore(config): drop dead schema-projection engine and UI/UX prototype…
Aaronontheweb Jun 15, 2026
8044150
fix(config): auto-pair the configuring client when enabling non-local…
Aaronontheweb Jun 15, 2026
e82ff61
docs(openspec): reconcile config-surface + onboarding specs to as-built
Aaronontheweb Jun 15, 2026
bf72e9b
docs(openspec): mark reconcile-config-onboarding-specs deltas verified
Aaronontheweb Jun 15, 2026
77d9820
refactor(cli): apply branch code-review findings — safe reuse, 2 bug …
Aaronontheweb Jun 16, 2026
1bb156a
refactor(cli): remove legacy single-purpose wizard steps superseded b…
Aaronontheweb Jun 16, 2026
e6599c6
docs(openspec): plan config-TUI hardening from deep C# review
Aaronontheweb Jun 16, 2026
ee39d23
fix(cli): atomic config & secrets writes via shared AtomicFile seam
Aaronontheweb Jun 16, 2026
7832530
fix(cli): atomic device-registry writes; consolidate owner-only harde…
Aaronontheweb Jun 16, 2026
bae05f9
fix(cli): cancel-and-await background channel-label refresh before save
Aaronontheweb Jun 16, 2026
cbb460e
fix(cli): channels autosave persists without blocking the UI on a net…
Aaronontheweb Jun 16, 2026
a481c8a
fix(cli): atomic CTS ownership in ProviderStep probe lifecycle
Aaronontheweb Jun 16, 2026
d8b33d9
fix(cli): synchronize HealthCheck Results list across async writer an…
Aaronontheweb Jun 16, 2026
8f19b1a
fix(cli): track and cancel ProviderManager detail revalidation
Aaronontheweb Jun 16, 2026
4330c68
fix(tui): remove DiscordStepViewModel background channel-resolution d…
Aaronontheweb Jun 16, 2026
37148e1
fix(tui): run skill-feed reachability probe off the TUI loop
Aaronontheweb Jun 16, 2026
21c53e6
fix(config): fail closed on an unparseable deployment posture
Aaronontheweb Jun 16, 2026
0f86b11
test(config): merge the two local-path rejection tests into a theory
Aaronontheweb Jun 16, 2026
ea61dff
fix(config): default-deny browser MCP server without explicit Enabled
Aaronontheweb Jun 16, 2026
ed247fe
fix(config): flag a plaintext skill-feed token instead of silently us…
Aaronontheweb Jun 16, 2026
3b34d94
fix(config): guard exposure/audience parse on render and mutation paths
Aaronontheweb Jun 16, 2026
d06b250
fix(config): guard dashboard section summaries against malformed config
Aaronontheweb Jun 16, 2026
687dc1d
fix(config): surface skill-sources/workspaces config-write IO failures
Aaronontheweb Jun 16, 2026
b612d6f
fix(wizard): release health-check step on an unexpected exception
Aaronontheweb Jun 16, 2026
0b042c1
fix(config): autosave the channel-audience left/right toggle
Aaronontheweb Jun 16, 2026
cb61629
fix(provider): persist fixed credentials only after the probe succeeds
Aaronontheweb Jun 16, 2026
d77b29e
fix(wizard): omit unresolved Slack channels from the ACL audience map
Aaronontheweb Jun 16, 2026
e90a285
fix(config): stop add-channel crashing when a channel resolves to "dm"
Aaronontheweb Jun 16, 2026
29c2c97
fix(slopwatch): resolve SW003 empty-catch violations failing CI
Aaronontheweb Jun 16, 2026
efa8ced
fix(config): fix AtomicFile empty-catch in code instead of baselining
Aaronontheweb Jun 16, 2026
973064b
fix(config): allow clearing a webhook auth header
Aaronontheweb Jun 16, 2026
9a704b5
fix(mcp): stop server-access save mutating the live in-memory ACL pro…
Aaronontheweb Jun 16, 2026
4f01735
test(config): accept IOException from the atomic config-write path
Aaronontheweb Jun 16, 2026
061418f
test(wizard): de-flake the HealthCheck ResultsSnapshot concurrency test
Aaronontheweb Jun 16, 2026
d196096
refactor(secrets): derive the protector from paths instead of a globa…
Aaronontheweb Jun 16, 2026
5316096
fix(smoke): repair screenshot regression job
Aaronontheweb Jun 16, 2026
6f3229b
test(smoke): refresh wizard baselines + add config-search baselines
Aaronontheweb Jun 16, 2026
30bacc7
test(smoke): stop anchoring config-search tape on transient probe spi…
Aaronontheweb Jun 16, 2026
03db1dc
fix(config): guard ExposureMode GoNext save against IOException crash…
Aaronontheweb Jun 16, 2026
88018df
fix(config): guard SecurityAccess config writes against IOException c…
Aaronontheweb Jun 16, 2026
9786751
fix(providers): write OAuth token expiry to netclaw.json atomically
Aaronontheweb Jun 16, 2026
52d388a
fix(config): harden Telemetry save — guard OTLP write, preserve in-pr…
Aaronontheweb Jun 16, 2026
9af0dea
fix(config): guard unguarded config read/write paths in three config VMs
Aaronontheweb Jun 16, 2026
dcfe7a9
fix(config): read deployment posture fail-closed in Channels via a sh…
Aaronontheweb Jun 16, 2026
4b6004e
fix(config): cancel and await label refresh before a Channels reset p…
Aaronontheweb Jun 16, 2026
da06977
fix(config): guard ApplyResetConfirmation save+reload against IOExcep…
Aaronontheweb Jun 16, 2026
db78fc4
fix(config): guard SkillSources pre-save and reload reads against mal…
Aaronontheweb Jun 16, 2026
7a62db4
fix(config): guard SkillSources API-key encryption against key-ring f…
Aaronontheweb Jun 16, 2026
38d6337
fix(config): own the Search probe CTS and guard the persisted-draft r…
Aaronontheweb Jun 16, 2026
47d29ac
fix(config): guard constructor-time config reads against malformed co…
Aaronontheweb Jun 16, 2026
eaf77a6
feat(config): block the save when a channel cannot be resolved to an …
Aaronontheweb Jun 16, 2026
fa00c2c
feat(config): resolve Discord/Mattermost channel display-names to ids…
Aaronontheweb Jun 16, 2026
bbc570b
fix(cli): resolve Discord/Mattermost wizard channel refs to ids like …
Aaronontheweb Jun 17, 2026
7a7bf16
docs(skills): add termina-tui-patterns skill for async work in the TUI
Aaronontheweb Jun 17, 2026
ddd21ff
fix(config): canonicalize channel allow-list to ids via async backgro…
Aaronontheweb Jun 17, 2026
a8a3285
fix(config): accept a comma-separated channel list in the add-channel…
Aaronontheweb Jun 17, 2026
43a320e
refactor(config): one channel-resolution path for onboarding and add-…
Aaronontheweb Jun 17, 2026
888f765
fix(config): surface resolved Mattermost channel display name in the …
Aaronontheweb Jun 17, 2026
5f72db6
chore(openspec): archive harden-config-tui + reconcile-config-onboard…
Aaronontheweb Jun 17, 2026
67142a6
fix(config): migrate Channels TUI config writes to async, fixing macO…
Aaronontheweb Jun 17, 2026
8621887
fix(config): harden async config-write lifecycle and fail loud on res…
Aaronontheweb Jun 17, 2026
43c0e3b
fix(config): give the config-write defensive catches a debug trace (s…
Aaronontheweb Jun 17, 2026
dacd5d7
ci(test): capture a full hang dump + sequence file on a stalled test
Aaronontheweb Jun 17, 2026
481156f
fix(test): bound health snapshot concurrency test
Aaronontheweb Jun 17, 2026
faaad6c
fix(tui): publish async updates on loop
Aaronontheweb Jun 17, 2026
9867c87
Revert "fix(tui): publish async updates on loop"
Aaronontheweb Jun 17, 2026
302 changes: 302 additions & 0 deletions .claude/skills/termina-tui-patterns.md
@@ -0,0 +1,302 @@
---
name: termina-tui-patterns
description: How to do async work correctly in the Termina TUI (R3 + single-threaded render loop). Activate when editing anything under src/Netclaw.Cli/Tui/ that touches a network/disk probe, a background refresh, streaming output, spinners, or when you are tempted to write `.GetAwaiter().GetResult()` in a view-model.
---

# Termina TUI Patterns (async, R3, the render loop)

## The myth that wastes hours

> "Termina has no `SynchronizationContext`, so I can't `await` — I have to
> `.GetAwaiter().GetResult()` to stay on the loop thread."

**This is wrong, and it is the single most common mistake agents make in this
codebase.** Blocking the loop thread on a network probe freezes input *and*
rendering for the entire round-trip (the spinner stops spinning, keys queue up).
"No SyncContext" does **not** mean "no async". It means async continuations
resume on arbitrary thread-pool threads, so the continuation must publish its
result through a thread-safe boundary before the Termina loop renders or handles
input from that state.

The whole TUI already runs async the right way: `netclaw chat` streams live LLM
tokens to the screen, provider/search probes spin without blocking, and this
config editor resolves channel labels *after the page loads*. Copy those. Do not
reach for `GetResult()`.

## How Termina actually works (the mental model)

Termina (package `Termina` 0.12.1, which pulls `R3` 1.3.1) runs **one** loop:
`TerminaApplication.RunAsync` does `await foreach` over an **unbounded
`Channel<object>`**, and after every dequeued event calls `RenderCurrentPage()`.
That loop is the single-threaded *serializer* — exactly one event is processed
and one render happens at a time. It runs on a thread-pool thread with **no
installed `SynchronizationContext`** (`TerminaHostedService` launches it via
`Task.Run`).
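
The loop described above can be sketched with nothing but the BCL. This is a minimal illustration, not Termina's actual code: `MiniLoop`, `RenderCount`, and `Stop` are hypothetical names, and the real loop renders a page rather than bumping a counter. It shows a single reader draining an unbounded `Channel<object>` while any thread may enqueue a redraw request.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Demo: two background tasks signal the loop; only the loop "renders".
var loop = new MiniLoop();
var run = Task.Run(() => loop.RunAsync());       // thread pool, no SynchronizationContext
await Task.WhenAll(
    Task.Run(() => loop.RequestRedraw()),
    Task.Run(() => loop.RequestRedraw()));
loop.Stop();
await run;
Console.WriteLine($"renders: {loop.RenderCount}");

// Hypothetical stand-in for the Termina loop; these are not the library's types.
sealed class MiniLoop
{
    private sealed record RedrawRequested;
    private readonly Channel<object> _events = Channel.CreateUnbounded<object>();

    // Mutated only inside RunAsync, i.e. only by the single reader.
    public int RenderCount { get; private set; }

    // Safe from any thread: it only enqueues a message for the loop.
    public void RequestRedraw() => _events.Writer.TryWrite(new RedrawRequested());

    public void Stop() => _events.Writer.Complete();

    public async Task RunAsync()
    {
        await foreach (var evt in _events.Reader.ReadAllAsync())
        {
            // A real loop would dispatch evt (input, redraw, ...) and then render the page.
            RenderCount++;
        }
    }
}
```

The point of the design is that the channel is the only thing shared between threads; everything the render step touches stays owned by the reader.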

Four consequences that define every correct pattern:

1. **`RequestRedraw()` is a redraw signal, not a general UI-thread marshal.** It
is literally `_eventChannel.Writer.TryWrite(RedrawRequested.Instance)` and is
safe to call from any thread. The loop later dequeues it and renders. That
does not make unrelated mutable fields, dictionaries, lists, `ReactiveProperty`
fan-out, focus changes, navigation, or `DynamicLayoutNode.Invalidate()` safe to
perform from a background continuation.
2. **Input handlers run synchronously on the loop thread.** Input is delivered
inside the loop via R3 `Subject.OnNext` (a synchronous in-line fan-out, no
scheduler). So `Input.OfType<KeyPressed>().Subscribe(HandleKeyPress)` runs on
the loop thread — the *synchronous prefix* of your handler is on-loop.
3. **R3 `ReactiveProperty.Value = ...` is synchronous fan-out.** If a page
subscription invalidates a `DynamicLayoutNode`, changes focus, navigates, or
mutates Termina nodes, that work runs on the thread that set `.Value`. Setting
a reactive property from a background continuation is therefore an off-loop UI
mutation unless every subscriber is known to be thread-safe.
4. **Every background-to-UI handoff needs an explicit publication strategy.** Use
one of these, and document which one applies: locked snapshots, immutable
replacement values, `Volatile`/`Interlocked` for scalar flags and counters, or
a genuine loop-owned action processed by a Termina input/redraw path. Canceling
and awaiting a background task prevents stale writers, but it is not a memory
barrier for fields concurrently read by render/input.

On ARM64 this distinction matters. x64's stronger memory ordering can hide plain
field races; Apple Silicon will not. A field written by a background continuation
and read by render/input must be synchronized even if every local x64 test passes.
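
To make consequence 3 concrete, here is a small sketch of synchronous fan-out. It deliberately does not use R3: `InlineProperty<T>` is a hypothetical stand-in that invokes its subscribers inline from the setter, which is the behavior the text attributes to `ReactiveProperty.Value`. The subscriber records which thread it ran on.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var prop = new InlineProperty<int>();
int subscriberThread = -1;
prop.Subscribe(_ => subscriberThread = Environment.CurrentManagedThreadId);

// Simulate a background continuation setting the value.
int setterThread = await Task.Run(() =>
{
    prop.Value = 42;
    return Environment.CurrentManagedThreadId;
});

// The subscriber ran on the setter's thread, not on any "UI" thread.
Console.WriteLine(subscriberThread == setterThread);

// Hypothetical stand-in for a reactive property with inline (unscheduled) fan-out.
sealed class InlineProperty<T>
{
    private readonly List<Action<T>> _subscribers = new();
    private T _value = default!;

    public void Subscribe(Action<T> onNext) => _subscribers.Add(onNext);

    public T Value
    {
        get => _value;
        set
        {
            _value = value;
            foreach (var s in _subscribers) s(value); // runs on the caller's thread
        }
    }
}
```

If a subscriber like this one invalidated a layout node or navigated, that work would happen on the thread-pool thread, which is exactly the off-loop mutation the list warns about.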

## The async shape to copy (with synchronized publish)

Use this control flow for probes and refreshes: synchronous loop-owned setup,
tracked background task, cancellation check after the await, synchronized publish,
then `RequestRedraw()`. Do not copy older examples that publish plain fields or
reactive properties off-loop without auditing their subscribers.

```csharp
private CancellationTokenSource? _probeCts; // owned CTS
private Task? _probeTask;                   // TRACKED task (never .GetResult() it)

// Called from a synchronous (loop-thread) key/selection handler.
private void StartBackgroundProbe(/* inputs */)
{
    _probeCts?.Cancel();
    _probeCts?.Dispose();
    _probeCts = new CancellationTokenSource();

    SetStatus("Validating…", ConfigStatusTone.Neutral); // 1. sync "working" state…
    RequestRedraw();                                    // …painted on the loop thread

    _probeTask = RunProbeAsync(_probeCts.Token);        // 2. fire-and-forget, TRACKED
}

private async Task RunProbeAsync(CancellationToken ct)
{
    Result result;
    try { result = await _probe.ProbeAsync(ct); }       // 3. await OFF-loop (thread pool)
    catch (OperationCanceledException) { return; }      // superseded/abandoned → drop

    if (ct.IsCancellationRequested) return;             // 4. re-check before publishing
                                                        //    (a stale result must not clobber)
    PublishProbeResult(result);                         // 5. synchronized publish; see below
    RequestRedraw();                                    // 6. schedule render. NEVER navigate here.
}

// Tests await this instead of Task.Delay / Thread.Sleep:
internal Task? PendingProbe => _probeTask;
```

The rules baked into that shape:

- **Track the task in a field.** Fire-and-forget is fine, *untracked* is not — you
need it to cancel-and-await before a save (below) and to expose it to tests.
- **Own a `CancellationTokenSource`;** on restart, `Cancel()`+`Dispose()` the old
one. Re-check `ct.IsCancellationRequested` *after* the await, before you publish —
this is what stops a superseded probe from overwriting fresh state.
- **The continuation may only publish through a synchronized boundary and call
`RequestRedraw()`. It must NEVER navigate, change focus, invalidate layout nodes,
or set `ReactiveProperty` values with UI-mutating subscribers** off the loop.
- **If the published value is read by render/input, synchronize it.** Use a `lock`
around a mutable collection plus a snapshot method (copy `HealthCheckStepViewModel`
/ `HealthCheckRunner`), replace the whole value with an immutable object, or use
`Volatile`/`Interlocked` for simple scalar state.
- **Do not assume `RequestRedraw()` orders every later read.** Even if the channel
enqueue/dequeue gives the redraw event an ordering edge, input events, timer
invalidations, existing subscriptions, and current renders can read the same state
outside that edge.
- **Expose the `Task`** (`PendingProbe`) so tests await it deterministically. No
`Task.Delay`/`Thread.Sleep` in tests (see CLAUDE.md Testing Guidelines).
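
As a rough illustration of why exposing the tracked task matters, the sketch below drives a probe deterministically from a test-like harness. Everything here is an assumption made for the example: `ProbeViewModel`, `LastResult`, and the `Func`-based probe are hypothetical and far simpler than the real view-models. The harness completes a `TaskCompletionSource` and then awaits `PendingProbe`; no sleeps are involved.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// The "test" controls when the fake probe finishes, then awaits the tracked task.
var gate = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
var vm = new ProbeViewModel(_ => gate.Task);
vm.StartBackgroundProbe();
gate.SetResult("ok");
await vm.PendingProbe!;
Console.WriteLine(vm.LastResult);

// Hypothetical view-model reduced to the tracked-task shape.
sealed class ProbeViewModel
{
    private readonly Func<CancellationToken, Task<string>> _probe;
    private CancellationTokenSource? _probeCts;
    private Task? _probeTask;
    private string? _lastResult;

    public ProbeViewModel(Func<CancellationToken, Task<string>> probe) => _probe = probe;

    internal Task? PendingProbe => _probeTask;
    internal string? LastResult => Volatile.Read(ref _lastResult);

    public void StartBackgroundProbe()
    {
        _probeCts?.Cancel();
        _probeCts?.Dispose();
        _probeCts = new CancellationTokenSource();
        _probeTask = RunProbeAsync(_probeCts.Token);
    }

    private async Task RunProbeAsync(CancellationToken ct)
    {
        string result;
        try { result = await _probe(ct); }
        catch (OperationCanceledException) { return; }

        if (ct.IsCancellationRequested) return;  // a superseded probe drops its result
        Volatile.Write(ref _lastResult, result); // synchronized publish of a single reference
    }
}
```

The same harness style extends to the supersede case: start a second probe before completing the first, await the first task, and assert that nothing was published.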

## The save-vs-background-write discipline

When a background task can **write the same state** a save reads (e.g. the label
refresh normalizes names->ids and persists), the save must cancel-and-await it
first so it can't land a stale snapshot over the fresh save:

```csharp
private async Task CancelAndAwaitLabelRefreshAsync()
{
    _labelResolutionCts?.Cancel();
    var inFlight = _labelRefreshTask;
    if (inFlight is null) return;
    await inFlight; // the refresh swallows its own exceptions
    _labelRefreshTask = null;
}
// SaveAsync awaits this at its top, in an async method — NOT via .GetResult().
```

Keep the *consumer* async too: the save path is an `async Task`, dispatched
fire-and-forget from the handler (`_ = ViewModel.SaveFromInputAsync();`) or via
`ConfigAutosave.RunAsync`. Do **not** re-block it with `.GetAwaiter().GetResult()`.

This rule solves stale-writer ordering. It does **not** make the background task's
ordinary field writes safe while render/input can read them concurrently. Those
fields still need locks, immutable replacement, atomics, or loop-owned mutation.

## Streaming (the chat reference)

`netclaw chat` is the proof that async-to-front-end works. The daemon's
server-side `IAsyncEnumerable<token>` arrives over SignalR as a callback push that
is mapped onto an R3 `Subject`, and the page subscribes and appends:

- `DaemonClient.cs:78` — `_connection.On<…>("ReceiveOutput", dto => _outputSubject.OnNext(...))`
- `DaemonClient.cs:153` — `public Observable<SessionOutput> SessionOutput => _outputSubject.AsObservable();`
- `ChatPage.cs:78` — subscribe in `OnBound`; `ChatPage.cs:394-402` — append the delta to the
`StreamingTextNode`; `ChatPage.cs:493` — `RequestRedraw()`.

Do not generalize this into "any off-loop mutation is fine." Chat streaming is a
dedicated push path whose page owns the append/redraw behavior. Before copying it,
verify the target node or subscriber is thread-safe, or publish into synchronized
state that the loop snapshots during render.

## Publication patterns that are safe on ARM64

### Locked mutable collection + snapshot

Use this when a background task appends or replaces items and the render path
enumerates them.

```csharp
private readonly List<HealthCheckItem> _results = [];

private void AddResult(HealthCheckItem item)
{
    lock (_results)
        _results.Add(item);
    RequestRedraw();
}

internal IReadOnlyList<HealthCheckItem> ResultsSnapshot()
{
    lock (_results)
        return _results.ToArray();
}
```

All readers and writers must use the same lock. Do not expose the mutable list as
the render surface unless callers are required to take the same lock.

### Immutable replacement

Use this when the background result is a complete value, not an incremental edit.
Build the value off-loop, then publish one immutable object/array. If the value is
read without a lock from another thread, publish/read via `Volatile` or another
explicit synchronization edge.

```csharp
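// Volatile.Read/Write are generic over reference types only, so the field is an
// ImmutableList<Row> (a class) rather than an ImmutableArray<Row> (a struct).
private ImmutableList<Row> _rows = ImmutableList<Row>.Empty;

private void PublishRows(ImmutableList<Row> rows)
{
    Volatile.Write(ref _rows, rows);
    RequestRedraw();
}

internal ImmutableList<Row> RowsSnapshot() => Volatile.Read(ref _rows);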
```

### Atomic scalar state

Use `Interlocked` for counters and task/CTS ownership; use `Volatile` for simple
single-writer flags. Never use `x++` on a cross-thread reactive version counter.

```csharp
private int _version;

private void PublishChanged()
{
    Interlocked.Increment(ref _version);
    RequestRedraw();
}

internal int Version => Volatile.Read(ref _version);
```

If a `ReactiveProperty<int>` is used only to wake page subscriptions, remember
that `.Value++` synchronously runs those subscriptions on the publishing thread.
Prefer a loop-owned invalidation path or a plain atomic version read by render.

## Current audit flags

These are not all necessarily bugs, but they are the fields/patterns that must be
checked before further TUI async work is considered safe:

- `HealthCheckStepViewModel`: `Results` is lock-synchronized; keep using
`ResultsSnapshot()`. `ResultVersion`, `IsRunning`, `IsComplete`, `Succeeded`, and
`_context.StatusMessage` are written, and `LaunchChat()` is invoked, from async
health-check continuations; none of these should synchronously drive Termina
invalidation/navigation off-loop.
- `ChannelsConfigViewModel`: `RefreshChannelLabelsAsync` / `ReconcileResolvedChannels`
mutate `Step`, `_channelAudiences`, `Status`, `IsSaved`, and persisted config off-loop;
page callbacks invalidate nodes inline. Either move reconciliation onto a loop-owned
action or protect the shared state with a documented lock/snapshot discipline.
- `SkillSourcesConfigViewModel`: `RunProbeAsync` publishes `_pendingRemoteProbeResult`,
`_pendingRemoteProbeMessage`, `Status`, and `IsSaved` from a background continuation;
page subscriptions invalidate inline. Dispose cancels but does not drain `_probeTask`.
- `ProviderManagerViewModel`: eager probes mutate `DisplayProviders` rows and reactive
state from background continuations; `StateVersion.Value++` drives inline invalidation;
`_probeCts` ownership should use the `Interlocked.CompareExchange` pattern from
`ProviderStepViewModel` to avoid one probe disposing a newer probe's CTS.
- `ExposureModeStepViewModel`: currently appears loop-owned; do not add background
readers/writers without one of the publication strategies above.
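
The CTS-ownership concern flagged for `ProviderManagerViewModel` can be sketched as follows. This is one possible shape under stated assumptions, not the code in `ProviderStepViewModel`: `ProbeCtsOwner`, `Begin`, and `End` are hypothetical names. The idea is that a probe may only clear or dispose the slot if it still holds the CTS that probe installed.

```csharp
#nullable enable
using System;
using System.Threading;

var owner = new ProbeCtsOwner();
var firstCts = owner.Begin(out var firstToken);
var secondCts = owner.Begin(out var secondToken);       // supersedes the first probe

Console.WriteLine(firstToken.IsCancellationRequested);  // was the first probe cancelled?
Console.WriteLine(owner.End(firstCts));                 // may the stale probe clear the slot?
Console.WriteLine(secondToken.IsCancellationRequested); // is the newer probe untouched?
Console.WriteLine(owner.End(secondCts));                // may the current probe clear the slot?

// Hypothetical helper: atomic ownership of the "current probe" CTS.
sealed class ProbeCtsOwner
{
    private CancellationTokenSource? _current;

    // Install a fresh CTS and cancel/dispose whichever one it replaced.
    public CancellationTokenSource Begin(out CancellationToken token)
    {
        var next = new CancellationTokenSource();
        token = next.Token; // capture before publishing; Token throws once disposed
        var previous = Interlocked.Exchange(ref _current, next);
        previous?.Cancel();
        previous?.Dispose();
        return next;
    }

    // Clear the slot only if it still holds this probe's CTS.
    public bool End(CancellationTokenSource mine)
    {
        var wasCurrent = ReferenceEquals(
            Interlocked.CompareExchange(ref _current, null, mine), mine);
        if (wasCurrent) mine.Dispose();
        return wasCurrent;
    }
}
```

Because the swap and the compare are single atomic operations, a finishing probe cannot dispose a CTS that a newer probe installed in the meantime.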

Tests for these paths must be bounded. Do not use an unbounded writer loop plus a
large snapshot loop; that creates a CPU/memory stress test instead of a race test.
Use finite handshakes, cancel in `finally`, and `WaitAsync` when awaiting background
writers.
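
A bounded race test along those lines might look like the sketch below. The names (`LockedResults`, the write and read counts, the five-second timeout) are illustrative assumptions rather than values taken from the repository. The writer has a finite workload, the reader takes a fixed number of snapshots, cancellation happens in `finally`, and the final await uses `WaitAsync` so a stall fails quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

const int Writes = 1_000;                           // finite workload, not an endless loop
var results = new LockedResults();
using var cts = new CancellationTokenSource();
var writer = Task.Run(() =>
{
    for (var i = 0; i < Writes && !cts.Token.IsCancellationRequested; i++)
        results.Add(i);
});

try
{
    // Bounded reader: every snapshot must be a consistent prefix 0..n-1.
    for (var reads = 0; reads < 100; reads++)
    {
        var snap = results.Snapshot();
        for (var i = 0; i < snap.Count; i++)
            if (snap[i] != i) throw new InvalidOperationException("torn snapshot");
    }
    await writer.WaitAsync(TimeSpan.FromSeconds(5)); // fail fast instead of hanging the run
}
finally
{
    cts.Cancel();                                    // always release the writer
}
Console.WriteLine(results.Snapshot().Count);

// Hypothetical locked collection mirroring the lock-plus-snapshot pattern.
sealed class LockedResults
{
    private readonly List<int> _items = new();
    public void Add(int item) { lock (_items) _items.Add(item); }
    public IReadOnlyList<int> Snapshot() { lock (_items) return _items.ToArray(); }
}
```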

## Spinners and timers: let the node animate itself

Do **not** hand-roll a frame ticker. `SpinnerNode` (via `SpinnerViews`) owns its
own animation timer and bubbles invalidation up the layout tree; `ReactivePage`
subscribes the root node's `Invalidated` and calls `RequestRedraw()` for you. A
hand-rolled spinner tick field is the bug from #1312. For a live elapsed counter,
copy `ElapsedTimeSegment` (an `IAnimatedTextSegment` whose timer fires
`Invalidated.OnNext`). See `src/Netclaw.Cli/Tui/SpinnerViews.cs:16-24`.

## Anti-pattern: `.GetAwaiter().GetResult()` on the loop thread

This **freezes input and rendering** for the whole operation. The "it can't
deadlock because there's no SyncContext" argument is a red herring — no-deadlock
is not the same as non-blocking. Every network-bound `GetResult()` on the loop is
a bug to fix, not a pattern to copy.

If you find an old sync bridge in a TUI network/disk path, migrate it to the
tracked-task shape above. A bounded synchronous wait during disposal is a teardown
backstop, not an event-loop interaction pattern.

## Checklist before you write TUI async code

- [ ] Am I about to type `.GetAwaiter().GetResult()`? Stop. Use the tracked-task pattern.
- [ ] Is the network/disk await off-loop, with only the sync "working" setup on-loop?
- [ ] Owned CTS, cancelled+disposed on restart, re-checked after the await?
- [ ] Continuation publishes through a lock/immutable/atomic/loop-owned boundary, not plain fields?
- [ ] No off-loop `ReactiveProperty.Value` update has subscribers that touch Termina nodes?
- [ ] `RequestRedraw()` is used only to schedule a render, not as the only synchronization mechanism?
- [ ] No off-loop navigation, focus change, or `DynamicLayoutNode.Invalidate()`?
- [ ] Background task tracked in a field, exposed as `PendingX` for deterministic tests?
- [ ] Does any save read state this task writes? If so, cancel-and-await it before the save.

## Key reference files

- `src/Netclaw.Cli/Tui/Config/SkillSourcesConfigViewModel.cs` — useful probe shape, but audit its off-loop publication before copying
- `src/Netclaw.Cli/Tui/Config/ChannelsConfigViewModel.cs` — label-refresh/save ordering; do not copy its off-loop mutable-state publication without fixing synchronization
- `src/Netclaw.Cli/Tui/Wizard/Steps/ProviderStepViewModel.cs` — probe + cosmetic timer (`StartProbe`, `:155-244`)
- `src/Netclaw.Cli/Tui/Wizard/Steps/HealthCheckStepViewModel.cs` — streaming results into a locked list + version-counter redraw
- `src/Netclaw.Cli/Tui/ChatPage.cs` / `ChatViewModel.cs` / `Daemon/DaemonClient.cs` — live streaming to the front end
- `src/Netclaw.Cli/Tui/SpinnerViews.cs` — self-animating spinner (don't hand-roll)
32 changes: 31 additions & 1 deletion .github/workflows/pr_validation.yml
@@ -65,7 +65,37 @@ jobs:

- name: "dotnet test"
shell: bash
-run: dotnet test -c Release
+# blame-hang aborts + writes a Sequence file (naming the in-flight test) and a full process
+# dump if a test stalls (no test activity) for 300s, so a hang fails fast (~minutes, not the
+# 30-min job cap) and is diagnosable from the CI log + the uploaded dump instead of a silent
+# timeout. The full dump carries every thread's stack — needed to confirm the suspected
+# macOS/ARM64 weak-memory-ordering hang in the TUI view-models.
+run: dotnet test -c Release --blame-hang-timeout 300s --blame-hang-dump-type full --results-directory ./TestResults

# Diagnostic: surface the blame Sequence file (names the stuck test) in the CI log so a
# platform-specific hang can be pinpointed even without downloading the dump artifact.
- name: "Show hang sequence (if a test hung)"
if: always()
shell: bash
run: |
seq=$(find ./TestResults -name '*Sequence*.xml' 2>/dev/null || true)
if [ -n "$seq" ]; then
for f in $seq; do echo "===== HANG SEQUENCE: $f ====="; cat "$f"; echo; done
else
echo "No blame Sequence file — no test hang detected."
fi

# Upload the hang dump + sequence file so a dedicated agent can open the dump (dotnet-dump
# analyze / lldb) and read the stalled thread stacks. Per-OS name; only the macOS run is
# expected to produce one today.
- name: "Upload hang dump"
if: always()
uses: actions/upload-artifact@v4
with:
name: test-hang-dump-${{ matrix.os }}
path: ./TestResults
if-no-files-found: ignore
retention-days: 14

- name: "Publish CLI (single-file, self-contained)"
if: runner.os != 'Windows'
35 changes: 31 additions & 4 deletions AGENTS.md
@@ -20,6 +20,7 @@ Read first:

- `PROJECT_CONTEXT.md`
- `TOOLING.md`
- `IMPLEMENTATION_PLAN.md`
- `docs/prd/README.md`
- `.opencode/skills/netclaw-*/SKILL.md`
- `.claude/skills/ralph-*.md`
@@ -88,14 +89,40 @@ task checkboxes in `openspec/changes/*/tasks.md` during RALPH iterations.

Before coding a capability, discover in this order:

-1. matching PRD in `docs/prd/`
-2. matching engineering spec in `docs/spec/`
-3. matching OpenSpec capability in `openspec/specs/`
-4. active change plan in `openspec/changes/<name>/`
+1. active task in `IMPLEMENTATION_PLAN.md`
+2. matching PRD in `docs/prd/`
+3. matching engineering spec in `docs/spec/`
+4. matching OpenSpec capability in `openspec/specs/`
+5. active change plan in `openspec/changes/<name>/`

If planning and implementation artifacts conflict, fix planning artifacts first.
If discovery artifacts conflict with each other, update them before implementing.

## Cross-Boundary Contract Rule

When a change writes data consumed by another subsystem, identify the consumer
before implementation and verify the producer emits the consumer's canonical
representation. This applies to config editors, persistence records, actor
messages, protocol payloads, tool schemas, and security policy inputs.

For configuration changes, tests must prove both:

- invalid or unresolved values are rejected before persistence
- persisted values match what runtime ACL/routing/startup code expects

Do not treat UI-level save success or schema validity as sufficient when runtime
behavior depends on provider IDs, canonical names, permissions, or security
policy keys.

## Automation Floor

Recent regressions define mandatory automated proof classes. TUI text input must
have headless typed-key coverage and native smoke coverage for critical flows.
Dynamic validation must have fake-failure tests proving save is blocked before
persistence. Legacy/new config paradigm changes must have load/round-trip tests
from the old shape to the runtime-consumed shape. Human manual testing is a
last-mile confidence check, not a substitute for these gates.
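
A fake-failure test of the kind described could be sketched like this. All types here are hypothetical (`ChannelSaver`, `RecordingStore`, and the resolver delegate are not from the Netclaw codebase), and the sample channel id is made up. The sketch checks both halves of the rule: an unresolved value never reaches persistence, and what is persisted is the canonical id rather than the display name.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var store = new RecordingStore();

// Fake failure: the resolver cannot map the display name to an id.
var failing = new ChannelSaver(_ => Task.FromResult<string?>(null), store);
var saved = await failing.SaveAsync("#general");
Console.WriteLine($"saved={saved}, writes={store.Writes.Count}");

// Success path: the store receives the canonical id, not the display name.
var working = new ChannelSaver(_ => Task.FromResult<string?>("C123"), store);
await working.SaveAsync("#general");
Console.WriteLine(store.Writes[0]);

// Hypothetical persistence seam that records every write.
sealed class RecordingStore
{
    public List<string> Writes { get; } = new();
    public void Persist(string channelId) => Writes.Add(channelId);
}

// Hypothetical save path: resolve first, persist only a resolved id.
sealed class ChannelSaver
{
    private readonly Func<string, Task<string?>> _resolve;
    private readonly RecordingStore _store;

    public ChannelSaver(Func<string, Task<string?>> resolve, RecordingStore store)
        => (_resolve, _store) = (resolve, store);

    public async Task<bool> SaveAsync(string channelRef)
    {
        var id = await _resolve(channelRef);
        if (id is null) return false; // reject before persistence
        _store.Persist(id);           // persist what the runtime consumer expects
        return true;
    }
}
```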

## Configuration Schema Sync Rule

When adding or changing properties on any `*Config` type in `Netclaw.Configuration`,