As of 2026-03-17, the core slices in this document are implemented. Selection semantics, grouped gizmo discovery/execution, and structured messages/letters/alerts are now part of the live bridge surface. Treat the remainder of this file as design context for why those seams were chosen, not as an unimplemented backlog snapshot.
The next bridge layer should make RimWorld state easier for an AI to validate semantically while staying deterministic and close to real game behavior. The immediate focus is not broader generic remote control. It is:
- selection-scoped inspect data
- semantic gizmo discovery and execution
- structured messages, letters, and alerts
This should let an agent verify mod behavior without relying on screenshots for every state transition.
For the first semantic layer, the bridge should read canonical game objects instead of scraping window text or screen geometry.
Verified seams from the real RimWorld 1.6 assembly:
- `ISelectable.GetInspectString()`
- `ISelectable.GetGizmos()`
- `LetterStack.LettersListForReading`
- `Messages.liveMessages`
- `AlertsReadout.activeAlerts`
These are more stable than screenshot parsing and preserve the exact gameplay semantics already used by the game.
Gizmos are fundamentally contextual. A gizmo id only makes sense relative to the current selection. The bridge should therefore expose selection-scoped gizmo ids instead of pretending gizmos are globally stable entities.
The id strategy should:
- bind to the current selection fingerprint
- be deterministic for the same selection and gizmo ordering
- fail clearly if the selection or gizmo set changed before execution
Recommended shape:
- compute a selection fingerprint from the selected objects in order
- recompute the grouped gizmo list on each call
- assign each representative gizmo a synthetic id derived from the selection fingerprint, the gizmo's stable ordinal in the grouped list, and a compact semantic fingerprint of the gizmo itself
This avoids stale execution against the wrong selection while still giving the agent an opaque handle it can pass back.
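The recommended shape can be sketched roughly as follows. This is a minimal illustration, not the bridge's actual implementation: the `Gizmo` fields, the id format `selection:ordinal:semantic`, the `StaleGizmoError` type, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


def _digest(*parts: str) -> str:
    """Short deterministic hex digest of the given parts (order-sensitive)."""
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class Gizmo:
    # Hypothetical fields; the real bridge may fingerprint other properties.
    type_name: str
    label: str


class StaleGizmoError(Exception):
    """Raised when an id no longer matches the current selection or gizmo set."""


def selection_fingerprint(selected_ids: list[str]) -> str:
    # The same objects selected in a different order produce a different fingerprint.
    return _digest(*selected_ids)


def assign_gizmo_ids(selected_ids: list[str], gizmos: list[Gizmo]) -> dict[str, Gizmo]:
    """Map synthetic ids to the grouped gizmos for the current selection."""
    sel = selection_fingerprint(selected_ids)
    ids: dict[str, Gizmo] = {}
    for ordinal, gizmo in enumerate(gizmos):
        semantic = _digest(gizmo.type_name, gizmo.label)
        ids[f"{sel}:{ordinal}:{semantic}"] = gizmo
    return ids


def resolve_gizmo(gizmo_id: str, selected_ids: list[str], gizmos: list[Gizmo]) -> Gizmo:
    """Recompute ids for the live state and fail clearly if the handle is stale."""
    current = assign_gizmo_ids(selected_ids, gizmos)
    if gizmo_id not in current:
        raise StaleGizmoError(f"gizmo id {gizmo_id!r} does not match the current selection")
    return current[gizmo_id]
```

Because ids are recomputed on every call rather than cached, a handle issued for one selection cannot silently resolve against another: any change to the selection order, the gizmo ordering, or the gizmo's own fingerprint makes the lookup fail.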
The bridge should not expose raw per-object gizmos and call that "what the player sees." RimWorld groups gizmos across the selection before drawing them, merges compatible commands, applies special representative selection rules for toggles, and fans interactions back out across the grouped commands.
The bridge should mirror that behavior closely enough that:
- `list_selected_gizmos` corresponds to the actionable commands in the UI
- `execute_gizmo` produces the same grouped side effects that the UI would
Directly exposing every raw GetGizmos() result would be simpler, but it would be wrong for multi-selection and would give an AI a surface that does not match the real game.
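As a rough illustration of why grouping matters, the sketch below merges per-object gizmos into groups and fans execution back out to every member. It is a deliberately simplified model: the `group_key` field stands in for the game's own compatibility check, the first gizmo seen is used as the representative, and the special representative rules for toggles are not modeled. All names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RawGizmo:
    owner: str                   # which selected object produced this gizmo
    group_key: str               # hypothetical stand-in for the game's grouping check
    action: Callable[[], None]   # side effect to run when the gizmo is activated


@dataclass
class GizmoGroup:
    representative: RawGizmo
    members: list[RawGizmo] = field(default_factory=list)


def group_gizmos(per_object: list[list[RawGizmo]]) -> list[GizmoGroup]:
    """Merge gizmos that share a group key, preserving first-seen order."""
    groups: dict[str, GizmoGroup] = {}
    for gizmos in per_object:
        for gizmo in gizmos:
            if gizmo.group_key in groups:
                groups[gizmo.group_key].members.append(gizmo)
            else:
                groups[gizmo.group_key] = GizmoGroup(representative=gizmo, members=[gizmo])
    return list(groups.values())


def execute_group(group: GizmoGroup) -> None:
    """Fan the interaction out to every grouped gizmo, not just the representative."""
    for gizmo in group.members:
        gizmo.action()
```

With two selected pawns that each offer a draft command, this model lists one command and executing it affects both pawns, whereas a raw per-object listing would show two commands that each affect only one.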
Messages, letters, and alerts should be exposed as structured state that can be polled and diffed:
- live messages for short-lived feedback
- letter stack contents for long-form notifications
- active alerts for colony-wide state validation
This is better than parsing logs because many mod-relevant outcomes are user-facing UI signals rather than bridge logs.
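Polling and diffing can be as simple as comparing two snapshots by a stable key. The sketch below assumes each entry is a dictionary with an `id` field that stays stable between polls; that field name and the entry shape are assumptions for illustration, not the bridge's documented payload.

```python
def diff_snapshots(before: list[dict], after: list[dict], key: str = "id") -> dict[str, list[dict]]:
    """Return entries that appeared or disappeared between two polls."""
    before_by_key = {entry[key]: entry for entry in before}
    after_by_key = {entry[key]: entry for entry in after}
    return {
        "added": [e for k, e in after_by_key.items() if k not in before_by_key],
        "removed": [e for k, e in before_by_key.items() if k not in after_by_key],
    }
```

An agent could snapshot the alert list, perform an action, snapshot again, and assert on the `added` entries instead of comparing screenshots.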
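Bridge surface for selection and gizmos:
- `rimworld/get_selection_semantics`
- `rimworld/list_selected_gizmos`
- `rimworld/execute_gizmo`
Bridge surface for messages, letters, and alerts:
- `rimworld/list_messages`
- `rimworld/list_letters`
- `rimworld/open_letter`
- `rimworld/dismiss_letter`
- `rimworld/list_alerts`
- `rimworld/activate_alert`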
Not part of this layer:
- Lua-layer changes
- richer inspect-tab introspection
- full model-aware mod-settings interaction through the native dialog path
- richer generalized dialog semantics on top of the UI workbench
- broader remote-control surface expansion
Those can build on this layer later, but they are not required to make autonomous mod validation materially better right now.