Commit 678f515

feat(runtime): project active workspace identity into Data Machine engine_data (#425)
* feat(runtime): project active workspace identity into Data Machine engine_data

Adds ActiveWorkspaceProjector, which listens on DM's datamachine_engine_snapshot filter (added in data-machine v0.10.3) and projects active workspace identity into engine_data at job initialization. Lets AI directives, abilities, and tool calls answer "which repo is this job operating against" by reading $engine->get('active_workspace').

## Schema

The projected entry has a stable, generic shape:

    active_workspace: {
      handle:       "<repo>@<branch>" or "<repo>" for primary
      repo:         short name (last segment of handle)
      owner:        GitHub owner when handle is in owner/repo form
      full_name:    "owner/repo" when both known
      branch:       worktree branch, omitted for primary
      path:         absolute filesystem path
      primary:      true for primary checkout
      origin_site:  site that created the worktree, when known
      origin_agent: agent slug that created the worktree, when known
      task_url:     linked task URL, when set
      pr_url:       linked PR URL, when set
    }

Missing fields are omitted (not nulled) so consumers can use isset() checks cleanly.

## Caller contract

Callers (e.g. homeboy-extensions's CI workload) opt in by passing active_workspace.handle via the run-flow ability's initial_data input:

    wp_get_ability( 'datamachine/run-flow' )->execute( array(
        'flow_id'      => $flow_id,
        'initial_data' => array(
            'active_workspace' => array(
                'handle' => 'extrachill-artist-platform@docs/agent-run-123',
            ),
        ),
    ) );

Any additional fields the caller supplies override fields derived from worktree metadata. There is no automatic "current workspace" tracking: identity is always explicit so concurrent jobs cannot clobber each other.

## Layer purity

The projector talks about workspaces only, never docs, voice, audience, or any consumer-specific concept. Downstream plugins (e.g. extrachill-docs) consume active_workspace to make their own routing decisions through the datamachine_code_active_workspace filter we expose for further enrichment.

## Closes #423

Depends on Extra-Chill/data-machine PR for the datamachine_engine_snapshot filter (feat-engine-snapshot-filter branch). No-op without that filter, so DMC stays installable against older DM versions.

* chore(lint): suppress UnusedFunctionParameter on project_into_snapshot

The filter callback signature must match the 4-argument contract from DM's datamachine_engine_snapshot filter, but the implementation only needs $snapshot. Add a phpcs:ignore matching DMC's established pattern (see WorkspaceAbilities::getCapabilities and friends) so the parameter list stays accurate to the filter contract while satisfying lint.
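
As an illustration of the "omitted, not nulled" rule in the schema, here is a minimal sketch of how a consumer might read the entry. The helper name `sketch_describe_workspace()` is hypothetical and not part of this commit; in a real consumer the array would presumably come from `$engine->get( 'active_workspace' )`, whereas here it is passed in directly so the sketch runs as plain PHP without WordPress or Data Machine loaded.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical consumer helper (not part of this commit): summarize an
 * active_workspace entry, touching optional fields only when present.
 *
 * @param array<string,mixed> $workspace Entry shaped like the documented schema.
 */
function sketch_describe_workspace( array $workspace ): string {
	// Prefer full_name, then repo, then the raw handle.
	$label = (string) ( $workspace['full_name'] ?? $workspace['repo'] ?? $workspace['handle'] ?? 'unknown' );

	// branch is omitted for the primary checkout, so isset() is sufficient.
	if ( isset( $workspace['branch'] ) ) {
		$label .= ' (branch ' . $workspace['branch'] . ')';
	}

	// pr_url is only present when a PR has been linked.
	if ( isset( $workspace['pr_url'] ) ) {
		$label .= ' PR: ' . $workspace['pr_url'];
	}

	return $label;
}

// Example entry for a worktree handle, written out by hand for illustration.
echo sketch_describe_workspace(
	array(
		'handle'  => 'extrachill-artist-platform@docs/agent-run-123',
		'repo'    => 'extrachill-artist-platform',
		'branch'  => 'docs/agent-run-123',
		'primary' => false,
	)
), PHP_EOL;
```

Because absent fields are simply missing keys, the consumer never has to distinguish "unset" from "set to null".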
1 parent f1d953a commit 678f515

2 files changed

Lines changed: 221 additions & 0 deletions

data-machine-code.php

Lines changed: 5 additions & 0 deletions
@@ -87,6 +87,11 @@ function datamachine_code_bootstrap() {
 	new \DataMachineCode\Abilities\WordPressRuntimeAbilities();
 	( new \DataMachineCode\Bundle\WorkspacePreloadArtifact() )->register();

+	// Project active workspace identity into Data Machine's engine_data
+	// snapshot at job init. Requires DM's datamachine_engine_snapshot
+	// filter (added in data-machine v0.10.3); no-op on older DM versions.
+	\DataMachineCode\Runtime\ActiveWorkspaceProjector::register();
+
 	// Load Handlers (they self-register).
 	new \DataMachineCode\Handlers\GitHub\GitHub();
 	new \DataMachineCode\Handlers\GitHub\GitHubIssuePublish();
Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
+<?php
+/**
+ * Active Workspace Projector
+ *
+ * Projects active workspace identity (repo, handle, branch, path) into
+ * Data Machine's engine_data snapshot at job initialization so AI
+ * directives, abilities, and tool calls can read which repo the
+ * current job is operating against.
+ *
+ * == How identity arrives ==
+ *
+ * Callers (e.g. homeboy-extensions's CI workload) pass workspace
+ * identity via `initial_data.active_workspace` on the
+ * datamachine/run-flow ability call. The filter callback below reads
+ * that input, looks up the matching worktree metadata via
+ * WorktreeContextInjector, and stamps the enriched entry into
+ * engine_data so any directive or tool can read it via
+ * $engine->get( 'active_workspace' ).
+ *
+ * No automatic "current workspace" tracking. Identity is always
+ * explicit so concurrent jobs cannot clobber each other and the
+ * extension surface is one clearly-documented input on run-flow.
+ *
+ * == Schema ==
+ *
+ * The projected entry has a stable shape suitable for any consumer:
+ *
+ *   active_workspace: {
+ *     handle:       "<repo>@<branch>" or "<repo>" for primary
+ *     repo:         short name (last segment of handle, no @branch)
+ *     owner:        GitHub owner (when handle includes owner/repo format)
+ *     full_name:    "owner/repo" when both are known
+ *     branch:       worktree branch, omitted for primary
+ *     path:         absolute filesystem path to the worktree
+ *     primary:      true when the handle is the primary checkout
+ *     origin_site:  site that created the worktree, when known
+ *     origin_agent: agent slug that created the worktree, when known
+ *     task_url:     linked task URL (issue/PR) when set
+ *     pr_url:       linked PR URL, when set
+ *   }
+ *
+ * Missing fields are omitted (not nulled) so consumers can use
+ * isset() checks cleanly.
+ *
+ * == Caller contract ==
+ *
+ * Minimum required to activate the projection: pass an
+ * active_workspace.handle to run-flow:
+ *
+ *   wp_get_ability( 'datamachine/run-flow' )->execute( array(
+ *       'flow_id'      => $flow_id,
+ *       'initial_data' => array(
+ *           'active_workspace' => array(
+ *               'handle' => 'extrachill-artist-platform@docs/agent-run-123',
+ *           ),
+ *       ),
+ *   ) );
+ *
+ * Any additional fields the caller supplies (e.g. owner, full_name) are
+ * preserved verbatim and override fields derived from worktree metadata.
+ *
+ * == Layer purity ==
+ *
+ * This class talks about workspaces, not docs, voice, or any consumer
+ * concept. Downstream plugins (e.g. extrachill-docs) consume the
+ * active_workspace entry to make their own routing decisions. DMC stays
+ * generic.
+ *
+ * @package DataMachineCode\Runtime
+ * @since   0.46.0
+ */
+
+namespace DataMachineCode\Runtime;
+
+use DataMachineCode\Workspace\WorktreeContextInjector;
+
+defined( 'ABSPATH' ) || exit;
+
+class ActiveWorkspaceProjector {
+
+	/**
+	 * Bootstrap: register the engine_snapshot filter.
+	 *
+	 * @since 0.46.0
+	 * @return void
+	 */
+	public static function register(): void {
+		add_filter(
+			'datamachine_engine_snapshot',
+			array( self::class, 'project_into_snapshot' ),
+			20,
+			4
+		);
+	}
+
+	/**
+	 * Filter callback: enrich the engine snapshot with active_workspace.
+	 *
+	 * Reads explicit active_workspace input that the caller passed via
+	 * initial_data on run-flow. Looks up the worktree metadata to fill
+	 * in fields the caller did not supply. No-op when no handle was
+	 * passed.
+	 *
+	 * @since 0.46.0
+	 *
+	 * @param array $snapshot Engine snapshot about to be persisted.
+	 * @param int   $job_id   Job being initialized.
+	 * @param array $flow     Flow row.
+	 * @param array $pipeline Pipeline row.
+	 * @return array Modified snapshot.
+	 */
+	public static function project_into_snapshot( array $snapshot, int $job_id, array $flow, array $pipeline ): array { // phpcs:ignore Generic.CodeAnalysis.UnusedFunctionParameter.FoundAfterLastUsed
+		$explicit = is_array( $snapshot['active_workspace'] ?? null )
+			? (array) $snapshot['active_workspace']
+			: array();
+
+		$handle = (string) ( $explicit['handle'] ?? '' );
+		if ( '' === $handle ) {
+			// No handle: preserve any pre-existing entry (e.g. hand-set
+			// by tests) and exit.
+			return $snapshot;
+		}
+
+		$entry = self::build_entry( $handle, $explicit );
+		if ( empty( $entry ) ) {
+			return $snapshot;
+		}
+
+		$snapshot['active_workspace'] = $entry;
+
+		return $snapshot;
+	}
+
+	/**
+	 * Build the active_workspace entry from a handle and optional caller overrides.
+	 *
+	 * @since 0.46.0
+	 *
+	 * @param string              $handle    Workspace handle.
+	 * @param array<string,mixed> $overrides Caller-provided fields to preserve.
+	 * @return array<string,mixed>
+	 */
+	private static function build_entry( string $handle, array $overrides ): array {
+		$metadata   = WorktreeContextInjector::get_metadata( $handle );
+		$is_primary = ! str_contains( $handle, '@' );
+
+		$entry = array(
+			'handle'  => $handle,
+			'primary' => $is_primary,
+		);
+
+		// Derive repo + branch from handle.
+		$handle_parts = explode( '@', $handle, 2 );
+		$repo_slug    = $handle_parts[0] ?? '';
+		if ( '' !== $repo_slug ) {
+			$entry['repo'] = $repo_slug;
+		}
+		if ( ! $is_primary && isset( $handle_parts[1] ) && '' !== $handle_parts[1] ) {
+			$entry['branch'] = $handle_parts[1];
+		}
+
+		// Enrich from persisted metadata.
+		if ( is_array( $metadata ) ) {
+			foreach ( array( 'repo', 'branch', 'path', 'origin_site', 'origin_agent', 'pr_url' ) as $field ) {
+				if ( isset( $metadata[ $field ] ) && '' !== (string) $metadata[ $field ] ) {
+					$entry[ $field ] = (string) $metadata[ $field ];
+				}
+			}
+
+			$task = is_array( $metadata['origin_task'] ?? null ) ? $metadata['origin_task'] : array();
+			if ( isset( $task['task_url'] ) && '' !== (string) $task['task_url'] ) {
+				$entry['task_url'] = (string) $task['task_url'];
+			}
+		}
+
+		// If repo looks like "owner/repo" (common when callers pass full_name
+		// or when the handle is "owner/repo"), split it into owner + repo.
+		$repo_value = (string) ( $entry['repo'] ?? '' );
+		if ( str_contains( $repo_value, '/' ) ) {
+			$parts              = explode( '/', $repo_value, 2 );
+			$entry['owner']     = $parts[0];
+			$entry['repo']      = $parts[1];
+			$entry['full_name'] = $repo_value;
+		}
+
+		// Caller overrides win for scalar fields (preserve explicit input).
+		foreach ( $overrides as $key => $value ) {
+			if ( is_string( $key ) && '' !== $key && ! is_array( $value ) ) {
+				$entry[ $key ] = $value;
+			}
+		}
+
+		// Synthesize full_name from owner + repo when both present.
+		if ( empty( $entry['full_name'] ) && ! empty( $entry['owner'] ) && ! empty( $entry['repo'] ) ) {
+			$entry['full_name'] = $entry['owner'] . '/' . $entry['repo'];
+		}
+
+		/**
+		 * Filter the projected active_workspace entry before it lands in engine_data.
+		 *
+		 * Lets extensions enrich or override fields. Returning a non-array
+		 * silently preserves the projector's own value.
+		 *
+		 * @since 0.46.0
+		 *
+		 * @param array<string,mixed> $entry  Projected entry.
+		 * @param string              $handle Source handle.
+		 */
+		$filtered = apply_filters( 'datamachine_code_active_workspace', $entry, $handle );
+		if ( is_array( $filtered ) ) {
+			$entry = $filtered;
+		}
+
+		return $entry;
+	}
+}
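
The derivation order inside build_entry() (handle split, owner/repo split, caller overrides, full_name synthesis) can be traced without WordPress loaded. The following is a simplified sketch, not the class itself: `sketch_entry_from_handle()` is a hypothetical standalone function that deliberately skips the WorktreeContextInjector metadata lookup and the datamachine_code_active_workspace filter, and it assumes PHP 8+ for `str_contains()`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical standalone mirror of the handle-derivation steps.
 * Omits the metadata lookup and the WordPress filter on purpose.
 *
 * @param string              $handle    "<repo>" or "<repo>@<branch>".
 * @param array<string,mixed> $overrides Caller-supplied fields.
 * @return array<string,mixed>
 */
function sketch_entry_from_handle( string $handle, array $overrides = array() ): array {
	$is_primary = ! str_contains( $handle, '@' );
	$parts      = explode( '@', $handle, 2 );

	$entry = array(
		'handle'  => $handle,
		'primary' => $is_primary,
	);

	// Repo is everything before the first "@"; branch is everything after.
	if ( '' !== $parts[0] ) {
		$entry['repo'] = $parts[0];
	}
	if ( ! $is_primary && '' !== ( $parts[1] ?? '' ) ) {
		$entry['branch'] = $parts[1];
	}

	// An "owner/repo" slug is split into owner, repo, and full_name.
	$repo = (string) ( $entry['repo'] ?? '' );
	if ( str_contains( $repo, '/' ) ) {
		list( $owner, $name ) = explode( '/', $repo, 2 );
		$entry['owner']       = $owner;
		$entry['repo']        = $name;
		$entry['full_name']   = $repo;
	}

	// Non-array caller overrides win over derived values.
	foreach ( $overrides as $key => $value ) {
		if ( is_string( $key ) && '' !== $key && ! is_array( $value ) ) {
			$entry[ $key ] = $value;
		}
	}

	// Fill in full_name when owner and repo are both known.
	if ( empty( $entry['full_name'] ) && ! empty( $entry['owner'] ) && ! empty( $entry['repo'] ) ) {
		$entry['full_name'] = $entry['owner'] . '/' . $entry['repo'];
	}

	return $entry;
}

// Worktree handle plus a caller-supplied owner override.
print_r(
	sketch_entry_from_handle(
		'extrachill-artist-platform@docs/agent-run-123',
		array( 'owner' => 'Extra-Chill' )
	)
);
```

Note that the branch may itself contain a slash (as in `docs/agent-run-123`); only the repo part before the `@` is checked for the owner/repo form, so the branch is left intact.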
