1 change: 1 addition & 0 deletions services/kiloclaw/AGENTS.md
@@ -20,6 +20,7 @@ These are non-negotiable. Do not reintroduce shared/fallback paths.
- **Two-phase destroy.** Fly resource IDs (`pendingDestroyMachineId`, `pendingDestroyVolumeId`) are persisted before deletion attempts. DO state is only cleared when both are confirmed deleted. The alarm retries on failure.
- **No machine recreation on transient errors.** `startExistingMachine()` only creates a new machine on 404 (confirmed gone). Transient Fly API errors (500, timeout) are re-thrown, not masked by duplicate creation.
- **Machine ID persisted before waiting.** `createNewMachine()` writes `flyMachineId` to durable storage immediately after `fly.createMachine()`, before `waitForState()`. This prevents orphaning machines if the wait times out.
- **Pipelock proxy env vars go only to the OpenClaw child.** When `KILOCLAW_PIPELOCK_ENABLED` is set, the controller injects `HTTPS_PROXY` and friends into the gateway supervisor's child env only. The Pipelock sidecar must receive a scrubbed allowlist env with no agent secrets and no proxy/CA env vars, so Pipelock does not recurse through itself or cross the capability boundary. See [`docs/pipelock.md`](docs/pipelock.md).
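
The env split described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: `buildSidecarEnv`, `buildProxiedChildEnv`, the allowlist contents, the `HTTP_PROXY` key, and the proxy URL used in the usage note are all hypothetical names and values chosen for the example. The real logic lives in `getPipelockChildEnv()` and `getOpenClawProxyEnv()`.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical allowlist, for illustration only. The real list is owned by
// the controller's pipelock module and may differ.
const SIDECAR_ENV_ALLOWLIST = ['PATH', 'HOME', 'TZ', 'LANG'] as const;

// Sidecar env: start from an empty object and copy only allowlisted keys.
// The parent env is never spread, so agent secrets and proxy/CA variables
// cannot reach the sidecar unless someone adds them to the allowlist.
function buildSidecarEnv(
  parentEnv: Env,
  allowlist: readonly string[] = SIDECAR_ENV_ALLOWLIST
): Env {
  const scrubbed: Env = {};
  for (const key of allowlist) {
    const value = parentEnv[key];
    if (value !== undefined) scrubbed[key] = value;
  }
  return scrubbed;
}

// OpenClaw child env: the full parent env plus proxy settings. Only this
// child is routed through the proxy. HTTP_PROXY is an assumed example of
// the "and friends" variables.
function buildProxiedChildEnv(parentEnv: Env, proxyUrl: string): Env {
  return { ...parentEnv, HTTPS_PROXY: proxyUrl, HTTP_PROXY: proxyUrl };
}
```

With a parent env that holds a secret, `buildSidecarEnv(parent)` returns only the allowlisted keys, while `buildProxiedChildEnv(parent, 'http://127.0.0.1:8888')` keeps the secret and adds the proxy variables. Building the sidecar env by allowlist rather than by denylist means a newly introduced secret is excluded by default.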

## Architecture

30 changes: 30 additions & 0 deletions services/kiloclaw/Dockerfile
@@ -130,6 +130,36 @@ RUN GOBIN=/usr/local/bin go install github.com/steipete/gogcli/cmd/gog@v0.12.0 \
RUN curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR=/usr/local/bin sh \
    && uv --version

# ── Pipelock (optional agent firewall sidecar) ─────────────────────────
# Installed unconditionally so the per-instance opt-in via
# KILOCLAW_PIPELOCK_ENABLED does not require a rebuild. The controller
# ignores the binary entirely when the flag is unset.
#
# Upstream release: https://github.com/luckyPipewrench/pipelock/releases
# Artifact naming: pipelock_<version>_linux_<arch>.tar.gz (no `v` in filename,
# even though the tag is `vX.Y.Z`; confirmed in the goreleaser archive
# name_template).
#
# SHA256 checksums are published in `checksums.txt` on the release page.
# Update all three ARGs together when bumping the pinned version.
ARG PIPELOCK_VERSION=v2.3.0
ARG PIPELOCK_SHA256_AMD64=2723df194492bf07ef5c7f329d97eb03d48d712eed3360e6086cc1783ca5668b
ARG PIPELOCK_SHA256_ARM64=7bcc4170159718d7ddfb6bb035b352bf90949c995e9208d8e6982f912bac5983
RUN set -e \
    && VERSION_NO_V="${PIPELOCK_VERSION#v}" \
    && ARCH="$(dpkg --print-architecture)" \
    && case "${ARCH}" in \
         amd64) PIPELOCK_ARCH="linux_amd64"; PIPELOCK_SHA256="${PIPELOCK_SHA256_AMD64}" ;; \
         arm64) PIPELOCK_ARCH="linux_arm64"; PIPELOCK_SHA256="${PIPELOCK_SHA256_ARM64}" ;; \
         *) echo "pipelock: unsupported architecture ${ARCH}" >&2; exit 1 ;; \
       esac \
    && curl -fsSL "https://github.com/luckyPipewrench/pipelock/releases/download/${PIPELOCK_VERSION}/pipelock_${VERSION_NO_V}_${PIPELOCK_ARCH}.tar.gz" \
         -o /tmp/pipelock.tar.gz \
    && echo "${PIPELOCK_SHA256}  /tmp/pipelock.tar.gz" | sha256sum -c - \
    && tar -xzf /tmp/pipelock.tar.gz -C /usr/local/bin pipelock \
    && rm /tmp/pipelock.tar.gz \
    && pipelock version

# ── Builder stage: package custom plugins ───────────────────────────
# Build the vendored plugin tarball in an isolated stage so dev-time
# dependencies do not inflate the runtime image layers.
126 changes: 121 additions & 5 deletions services/kiloclaw/controller/src/index.ts
@@ -36,6 +36,19 @@ import { getOpenclawVersion } from './openclaw-version';
import { startCheckin } from './checkin';
import { collectProductTelemetry } from './product-telemetry';
import { GoogleOAuthTokenProvider } from './google-oauth-token-provider';
import {
  ensurePipelockCa,
  ensurePipelockCaBundle,
  ensurePipelockConfig,
  getOpenClawProxyEnv,
  getPipelockChildEnv,
  getPipelockSupervisorOptions,
  isPipelockEnabled,
  PIPELOCK_LISTEN_HOST,
  PIPELOCK_LISTEN_PORT,
  waitForPipelockReady,
  waitForSupervisorAlive,
} from './pipelock';

export type RuntimeConfig = {
  port: number;
@@ -244,6 +257,10 @@ export async function startController(env: NodeJS.ProcessEnv = process.env): Pro
  // eslint-disable-next-line prefer-const -- assigned after critical bootstrap completes
  let supervisor: Supervisor | undefined;
  let gmailWatchSupervisor: Supervisor | undefined;
  // Pipelock sidecar supervisor. Created and started only when
  // KILOCLAW_PIPELOCK_ENABLED is set; remains undefined otherwise so the
  // shutdown path can skip it cleanly.
  let pipelockSupervisor: Supervisor | undefined;
  // eslint-disable-next-line prefer-const -- assigned after pairing cache is created
  let pairingCache: ReturnType<typeof createPairingCache> | undefined;
  let stopCheckin: (() => void) | undefined;
Expand All @@ -261,10 +278,13 @@ export async function startController(env: NodeJS.ProcessEnv = process.env): Pro
    pairingCache?.cleanup();
    stopCheckin?.();
    stopWatchRenewal();
-   const shutdowns: Promise<void>[] = [];
-   if (supervisor) shutdowns.push(supervisor.shutdown(signal));
-   if (gmailWatchSupervisor) shutdowns.push(gmailWatchSupervisor.shutdown(signal));
-   await Promise.all(shutdowns);
    await Promise.all([
      supervisor?.shutdown(signal) ?? Promise.resolve(),
      gmailWatchSupervisor?.shutdown(signal) ?? Promise.resolve(),
    ]);
    // Pipelock last: OpenClaw's in-flight fetches should get a chance to
    // drain through the proxy before the proxy itself exits.
    if (pipelockSupervisor) await pipelockSupervisor.shutdown(signal);
    await new Promise<void>(resolve => {
      server.close(() => resolve());
    });
@@ -331,8 +351,17 @@ export async function startController(env: NodeJS.ProcessEnv = process.env): Pro
    }),
  });

  // When Pipelock is enabled, route the OpenClaw child's outbound fetches
  // through the local forward proxy by injecting HTTPS_PROXY and friends.
  // The controller's own env is untouched so pipelock does not recurse
  // through itself.
  const gatewayChildEnv: NodeJS.ProcessEnv | undefined = isPipelockEnabled(env)
    ? { ...env, ...getOpenClawProxyEnv(env) }
    : undefined;

  supervisor = createSupervisor({
    args: ['gateway', ...config.gatewayArgs],
    env: gatewayChildEnv,
    onStdoutLine: line => pc.onPairingLogLine(line),
  });

@@ -483,7 +512,94 @@ export async function startController(env: NodeJS.ProcessEnv = process.env): Pro
    console.error('[gog] Failed to install shim:', err);
  }

- // ── Phase 7: Start gateway ──────────────────────────────────────────
  // ── Phase 7: Start Pipelock sidecar (if enabled) ────────────────────
  // Fail-closed: any failure here leaves the controller in degraded mode
  // without starting OpenClaw. Running OpenClaw unproxied when the
  // operator asked for scanning would silently undo the security model.
  if (isPipelockEnabled(env)) {
    try {
      ensurePipelockCa(env);
      ensurePipelockCaBundle(env);
      ensurePipelockConfig(env);
    } catch (err) {
      const fullError = err instanceof Error ? err.message : String(err);
      controllerState.current = {
        state: 'degraded',
        error: toPublicDegradedError('pipelock-init'),
      };
      console.error('[controller] Pipelock init failed, running in degraded mode:', fullError);
      return;
    }

    const pipelockOptions = getPipelockSupervisorOptions(env);
    if (pipelockOptions) {
      // Capability separation: the sidecar runs with an explicit allowlist
      // env, NOT the controller's process.env. After bootstrap.ts decryption,
      // process.env carries the agent's API key, gateway token, and channel
      // secrets; none of those belong in the proxy's trust zone. The same
      // allowlist also strips HTTPS_PROXY/CA env vars so pipelock cannot
      // recurse through itself if a future caller mutates env upstream.
      pipelockSupervisor = createSupervisor({
        command: pipelockOptions.command,
        args: pipelockOptions.args,
        env: getPipelockChildEnv(env),
      });

      try {
        await pipelockSupervisor.start();
      } catch (err) {
        const fullError = err instanceof Error ? err.message : String(err);
        controllerState.current = {
          state: 'degraded',
          error: toPublicDegradedError('pipelock-start'),
        };
        console.error(
          '[controller] Pipelock supervisor start failed, running in degraded mode:',
          fullError
        );
        return;
      }

      // supervisor.start() returns before the spawn outcome is known. Surface
      // ENOENT / EPERM (binary missing or non-executable) as pipelock-start
      // within ~2s rather than waiting out the full readiness ceiling.
      const alive = await waitForSupervisorAlive(pipelockSupervisor, 2_000);
      if (!alive) {
        controllerState.current = {
          state: 'degraded',
          error: toPublicDegradedError('pipelock-start'),
        };
        console.error(
          '[controller] Pipelock child failed to spawn (binary missing or non-executable), running in degraded mode'
        );
        await pipelockSupervisor.shutdown('SIGTERM');
        return;
      }

      // 30s ceiling matches the gateway healthy-threshold in supervisor.ts.
      // Pipelock's steady-state startup is sub-second; only a misconfigured
      // or broken binary should exhaust this budget.
      const ready = await waitForPipelockReady(PIPELOCK_LISTEN_HOST, PIPELOCK_LISTEN_PORT, 30_000);
      if (!ready) {
        controllerState.current = {
          state: 'degraded',
          error: toPublicDegradedError('pipelock-listen'),
        };
        console.error(
          `[controller] Pipelock did not report healthy forward proxy + TLS interception on ${PIPELOCK_LISTEN_HOST}:${PIPELOCK_LISTEN_PORT} within 30s, running in degraded mode`
        );
        // Best-effort cleanup so we do not leak a half-started child.
        await pipelockSupervisor.shutdown('SIGTERM');
        return;
      }

      console.log(
        `[controller] Pipelock ready, listening on ${PIPELOCK_LISTEN_HOST}:${PIPELOCK_LISTEN_PORT}`
      );
    }
  }

  // ── Phase 8: Start gateway ──────────────────────────────────────────
  controllerState.current = { state: 'starting' };
  console.log('[controller] Bootstrap complete, starting gateway...');
