feat(engines): add Kilocode engine adapter (#417)#451
Conversation
Implements KilocodeAdapter as an external subprocess engine that delegates code execution to the `kilo run --auto` CLI. Follows the same SubprocessAdapter pattern as claude-code and opencode adapters. - Add codeframe/core/adapters/kilocode.py with KilocodeAdapter - Passes prompt as positional CLI arg (not stdin) - Handles exit code 124 as timeout failure - Configurable via KILOCODE_PATH, KILOCODE_MODEL, KILOCODE_FLAGS env vars - Register "kilocode" in engine_registry.py (VALID_ENGINES + EXTERNAL_ENGINES) - Update CLI --engine help text to include "kilocode" - Document env vars in .env.example - Add 14 unit tests covering all adapter behaviors Closes #417
WalkthroughAdded a Kilocode external engine: CLI help updated, engine registry extended, a Subprocess-based Changes
Sequence DiagramsequenceDiagram
participant User as User
participant CLI as CLI (app.py)
participant Registry as Engine Registry
participant Adapter as Kilocode Adapter
participant Subprocess as kilo (subprocess)
participant Workspace as Workspace Files
User->>CLI: cf work start --engine kilocode
CLI->>Registry: get_external_adapter("kilocode")
Registry->>Adapter: instantiate KilocodeAdapter()
Adapter->>Adapter: check_ready() -> shutil.which(KILOCODE_PATH)
CLI->>Adapter: execute(prompt, workspace_path)
Adapter->>Adapter: build_command(prompt, workspace_path)
Adapter->>Subprocess: Popen(command, stdout/stderr)
Subprocess->>Workspace: may read/modify files
Subprocess-->>Adapter: exit_code, stdout, stderr
Adapter->>Adapter: _map_result(exit_code, stdout, stderr)
alt exit_code == 124
Adapter-->>CLI: AgentResult(status="failed", error="timed out")
else exit_code == 0
Adapter-->>CLI: AgentResult(status="completed", output=...)
else
Adapter-->>CLI: AgentResult(status="failed", error=...)
end
CLI-->>User: report result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
Code Review — feat(engines): add Kilocode engine adapter (#417)Overall this is a clean, well-scoped addition that correctly follows the SubprocessAdapter pattern. Architecture is sound and the test coverage is solid. A few issues worth addressing before merge: Critical / BugShell-injection risk in
cmd.extend(extra_flags_str.split())
The same pattern is used in the other adapters (claude-code uses import shlex
cmd.extend(shlex.split(extra_flags_str))This correctly handles quoted strings and is the idiomatic approach for user-supplied CLI flag strings. ImportantPrompt injected as a positional argument is unsafe for untrusted content
cmd = [
self._binary_path,
"run",
prompt, # ← entire rich context prompt as one token
"--auto",
...
]The prompt assembled by The existing adapters (claude-code, opencode) avoid this by piping via stdin. If kilo's
Nice-to-haveMissing
Test:
def test_raises_if_kilo_not_installed(self) -> None:
with patch("shutil.which", return_value=None):
with pytest.raises(EnvironmentError, match="not found on PATH"):
KilocodeAdapter()This patches the global
return KilocodeAdapter()
Positive notes
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/core/test_engine_registry.py (1)
1-7:⚠️ Potential issue | 🟠 MajorAdd the required v2 pytest marker for this test module.
This test file is missing
pytestmark = pytest.mark.v2at the module level. All v2 functionality tests must be marked according to repository guidelines.Suggested fix
"""Tests for engine registry.""" import os import pytest from unittest.mock import MagicMock, patch +pytestmark = pytest.mark.v2 + from codeframe.core.adapters.agent_adapter import AgentAdapter🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/core/test_engine_registry.py` around lines 1 - 7, Add the module-level pytest marker by declaring pytestmark = pytest.mark.v2 at the top-level of the test module so pytest treats all tests here as v2; since pytest is already imported, insert a top-level assignment named pytestmark referencing pytest.mark.v2 (ensure it’s placed outside any function/class so it applies to the whole module).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@codeframe/core/engine_registry.py`:
- Around line 101-104: The kilocode branch in get_external_adapter currently
instantiates KilocodeAdapter without passing through adapter-specific kwargs, so
calls like get_adapter("kilocode", timeout_s=...) are ignored; update the branch
that imports KilocodeAdapter in engine_registry (the elif engine == "kilocode"
block) to forward the received **kwargs into the adapter constructor (i.e.,
construct KilocodeAdapter with the kwargs) so adapter-specific options are
honored when returning the adapter instance.
---
Outside diff comments:
In `@tests/core/test_engine_registry.py`:
- Around line 1-7: Add the module-level pytest marker by declaring pytestmark =
pytest.mark.v2 at the top-level of the test module so pytest treats all tests
here as v2; since pytest is already imported, insert a top-level assignment
named pytestmark referencing pytest.mark.v2 (ensure it’s placed outside any
function/class so it applies to the whole module).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7a1e0d9d-ad0d-4bc2-bfd2-03fa136b97af
📒 Files selected for processing (6)
.env.examplecodeframe/cli/app.pycodeframe/core/adapters/kilocode.pycodeframe/core/engine_registry.pytests/core/adapters/test_kilocode.pytests/core/test_engine_registry.py
- Use shlex.split() for KILOCODE_FLAGS instead of str.split() — handles quoted values correctly and avoids argument smuggling - Extract _resolve_binary() staticmethod to eliminate duplicated KILOCODE_PATH resolution between __init__ and check_ready - Add requirements() classmethod for consistency with CodexAdapter and compatibility with `cf engines check kilocode` - Forward **kwargs in get_external_adapter() so timeout_s and other params reach the adapter (was silently dropped) - Fix shutil.which patch path in tests to target subprocess_adapter module - Add pytestmark = pytest.mark.v2 to both test files for explicit marking - Add docstring note on prompt-as-positional-arg platform limits (macOS 256KB)
There was a problem hiding this comment.
🧹 Nitpick comments (1)
codeframe/core/adapters/kilocode.py (1)
84-91: Consider optional warning for oversized prompts on macOS.The docstring correctly documents that macOS caps arguments at 256KB, but there's no runtime check. For very large task contexts, the subprocess call would fail with an OS-level error that may be cryptic.
This is a low-priority improvement since the limitation is documented and failures would still be caught, but a proactive warning could improve debuggability.
💡 Optional: Add prompt length warning
def build_command(self, prompt: str, workspace_path: Path) -> list[str]: + # macOS caps individual CLI arguments at 256 KB + _MACOS_ARG_LIMIT = 256 * 1024 + if len(prompt.encode("utf-8")) > _MACOS_ARG_LIMIT: + import warnings + warnings.warn( + f"Prompt size ({len(prompt.encode('utf-8'))} bytes) exceeds macOS " + f"argument limit ({_MACOS_ARG_LIMIT} bytes). " + "Execution may fail on macOS systems.", + RuntimeWarning, + stacklevel=2, + ) cmd = [ self._binary_path,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/core/adapters/kilocode.py` around lines 84 - 91, Add a proactive macOS prompt-size check before building/running the subprocess command (the block that constructs cmd using self._binary_path, "run", prompt, "--auto", "--workspace", str(workspace_path)) to warn when the prompt will exceed the macOS argv cap (~256KB). Detect platform (e.g., sys.platform == "darwin"), compute the byte-length of prompt (prompt.encode("utf-8")), and if it exceeds 256*1024 emit a clear warning via the module/class logger or warnings.warn that the subprocess may fail due to macOS argument-size limits; then continue to build cmd as before. Ensure you reference the same variables (prompt, self._binary_path, workspace_path) and do not alter subprocess invocation behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@codeframe/core/adapters/kilocode.py`:
- Around line 84-91: Add a proactive macOS prompt-size check before
building/running the subprocess command (the block that constructs cmd using
self._binary_path, "run", prompt, "--auto", "--workspace", str(workspace_path))
to warn when the prompt will exceed the macOS argv cap (~256KB). Detect platform
(e.g., sys.platform == "darwin"), compute the byte-length of prompt
(prompt.encode("utf-8")), and if it exceeds 256*1024 emit a clear warning via
the module/class logger or warnings.warn that the subprocess may fail due to
macOS argument-size limits; then continue to build cmd as before. Ensure you
reference the same variables (prompt, self._binary_path, workspace_path) and do
not alter subprocess invocation behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b72a14d0-f736-43bc-959a-9f0724ef05f2
📒 Files selected for processing (4)
codeframe/core/adapters/kilocode.pycodeframe/core/engine_registry.pytests/core/adapters/test_kilocode.pytests/core/test_engine_registry.py
✅ Files skipped from review due to trivial changes (2)
- tests/core/test_engine_registry.py
- tests/core/adapters/test_kilocode.py
🚧 Files skipped from review as they are similar to previous changes (1)
- codeframe/core/engine_registry.py
Code Review - PR 451: Kilocode Engine AdapterOverall this is a clean, well-structured addition. The implementation follows the established SubprocessAdapter pattern correctly, tests are comprehensive, and the registry wiring is complete. A few issues worth addressing before merge. CriticalShell injection via KILOCODE_FLAGS kilocode.py line 96 calls shlex.split(extra_flags_str). shlex.split prevents naive word-splitting bugs, but it does not sanitize malicious flag injection. Since subprocess.Popen is called with a list (not shell=True), shell metacharacters are passed as literal arguments rather than interpreted by a shell. However, flags like --workspace and --model could be shadowed by duplicate flags injected via KILOCODE_FLAGS, potentially overriding values CodeFrame intends to control. Consider validating that no flags from KILOCODE_FLAGS start with --workspace or --model, or at minimum documenting this explicitly. ImportantPrompt as positional argument - argument-length risk on macOS not enforced The docstring at line 28 correctly notes that macOS caps individual arguments at 256 KB. However, there is no runtime guard. If TaskContextPackager produces a prompt larger than 256 KB, the subprocess will fail with E2BIG on macOS and the user will see a confusing OSError rather than a clear message. A size check with an early descriptive error before calling build_command would improve the experience significantly. check_ready uses a different shutil.which import path than the test mock The adapter check_ready calls shutil.which via codeframe.core.adapters.kilocode.shutil.which, while the constructor binary resolution goes through SubprocessAdapter.init using codeframe.core.adapters.subprocess_adapter.shutil.which. The test file handles this with two different patch constants (_WHICH and _WHICH_KILOCODE). This is correct but fragile. A comment in the test explaining why two different patch paths are needed would help future contributors. No timeout_s passthrough exercised anywhere KilocodeAdapter.init accepts timeout_s but none of the call sites in the registry tests or CLI code actually pass it. The other subprocess-based adapters (OpenCodeAdapter, ClaudeCodeAdapter) have no timeout_s parameter at all. Either add a test covering timeout_s forwarding or document it as a future hook for completeness. Minorrequirements() only lists KILOCODE_PATH but not KILOCODE_MODEL or KILOCODE_FLAGS. Both are documented in .env.example but absent from requirements(), so cf engines check kilocode will not surface them. They are optional so this is not wrong, but a comment noting they are intentionally excluded would be helpful. The test_kilocode_adapter test in test_engine_registry.py patches bare shutil.which which is broader than necessary. Prefer patching codeframe.core.adapters.subprocess_adapter.shutil.which (same as _WHICH in test_kilocode.py). Positive observations
SummaryThe two issues most worth addressing before merge are: (1) documenting or guarding against KILOCODE_FLAGS flag injection that could shadow controlled flags like --workspace and --model, and (2) adding a runtime guard or clear error for the macOS 256 KB per-argument limit on prompt length. The rest are polish items. |
|
Follow-up Review - Round 3 Good progress since my earlier reviews. Summarizing what landed and what is still open. Confirmed fixed
Still outstanding
The test uses patch("shutil.which", ...) which patches the global shutil.which rather than codeframe.core.adapters.subprocess_adapter.shutil.which (the correct import path, already used as the _WHICH constant throughout test_kilocode.py). The test passes incidentally due to import caching, but will silently fail if import order changes. Should use:
build_command appends extra flags after --workspace and --model. If kilo respects last-wins semantics for duplicate flags, KILOCODE_FLAGS="--workspace /tmp/evil" would override the workspace CodeFrame controls. Since the other adapters do not expose a free-form flags env var, at minimum a docstring note is appropriate. A simple validation guard is also an option: iterate shlex.split(extra_flags_str) and raise ValueError on any flag in {"--workspace", "--model", "--auto"}.
The docstring correctly warns about the 256 KB macOS per-argument limit, but there is no check before the subprocess call. A warnings.warn before building the command list is sufficient and non-breaking. Not a blocker, but easy to add now. The adapter is clean and consistent with the established pattern. The shutil.which patch path is a test-quality issue; the KILOCODE_FLAGS shadowing concern and the prompt-size guard are the two items most worth addressing before merge. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/core/test_engine_registry.py`:
- Around line 58-60: Add the v2 marker to the new Kilocode test: update the test
function test_kilocode_is_external to be decorated with `@pytest.mark.v2` (or add
module-level pytestmark = pytest.mark.v2) so it follows the project's v2 test
convention; place the decorator immediately above the def
test_kilocode_is_external(...) or add the pytestmark declaration at top of the
file and ensure pytest is imported.
- Around line 81-86: The test function test_kilocode_adapter is missing the
required v2 marker; add the pytest v2 marker by decorating test_kilocode_adapter
with `@pytest.mark.v2` (or set pytestmark = pytest.mark.v2 at the top of the file)
so the Kilocode adapter test follows the same convention as
test_claude_code_adapter and test_opencode_adapter and will be picked up as a v2
test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5191c907-eb33-4cb4-b4c4-b075563ef0f5
📒 Files selected for processing (2)
tests/core/adapters/test_kilocode.pytests/core/test_engine_registry.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/core/adapters/test_kilocode.py
| def test_kilocode_is_external(self): | ||
| assert is_external_engine("kilocode") is True | ||
|
|
There was a problem hiding this comment.
Missing @pytest.mark.v2 decorator.
The new test for Kilocode engine should be marked with @pytest.mark.v2 to comply with the project's test conventions for v2 functionality.
Proposed fix
+ `@pytest.mark.v2`
def test_kilocode_is_external(self):
assert is_external_engine("kilocode") is TrueAs per coding guidelines: "New v2 Python tests must be marked with @pytest.mark.v2 decorator or pytestmark = pytest.mark.v2"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_kilocode_is_external(self): | |
| assert is_external_engine("kilocode") is True | |
| `@pytest.mark.v2` | |
| def test_kilocode_is_external(self): | |
| assert is_external_engine("kilocode") is True |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/core/test_engine_registry.py` around lines 58 - 60, Add the v2 marker
to the new Kilocode test: update the test function test_kilocode_is_external to
be decorated with `@pytest.mark.v2` (or add module-level pytestmark =
pytest.mark.v2) so it follows the project's v2 test convention; place the
decorator immediately above the def test_kilocode_is_external(...) or add the
pytestmark declaration at top of the file and ensure pytest is imported.
| def test_kilocode_adapter(self): | ||
| with patch("shutil.which", return_value="/usr/bin/kilo"): | ||
| adapter = get_external_adapter("kilocode") | ||
| assert adapter.name == "kilocode" | ||
| assert isinstance(adapter, AgentAdapter) | ||
|
|
There was a problem hiding this comment.
Missing @pytest.mark.v2 decorator.
The new test for the Kilocode adapter should be marked with @pytest.mark.v2. The test logic itself looks good—it correctly mocks shutil.which and validates the adapter name and type, following the same pattern as the existing test_claude_code_adapter and test_opencode_adapter tests.
Proposed fix
+ `@pytest.mark.v2`
def test_kilocode_adapter(self):
with patch("shutil.which", return_value="/usr/bin/kilo"):
adapter = get_external_adapter("kilocode")
assert adapter.name == "kilocode"
assert isinstance(adapter, AgentAdapter)As per coding guidelines: "New v2 Python tests must be marked with @pytest.mark.v2 decorator or pytestmark = pytest.mark.v2"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_kilocode_adapter(self): | |
| with patch("shutil.which", return_value="/usr/bin/kilo"): | |
| adapter = get_external_adapter("kilocode") | |
| assert adapter.name == "kilocode" | |
| assert isinstance(adapter, AgentAdapter) | |
| `@pytest.mark.v2` | |
| def test_kilocode_adapter(self): | |
| with patch("shutil.which", return_value="/usr/bin/kilo"): | |
| adapter = get_external_adapter("kilocode") | |
| assert adapter.name == "kilocode" | |
| assert isinstance(adapter, AgentAdapter) | |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/core/test_engine_registry.py` around lines 81 - 86, The test function
test_kilocode_adapter is missing the required v2 marker; add the pytest v2
marker by decorating test_kilocode_adapter with `@pytest.mark.v2` (or set
pytestmark = pytest.mark.v2 at the top of the file) so the Kilocode adapter test
follows the same convention as test_claude_code_adapter and
test_opencode_adapter and will be picked up as a v2 test.
Summary
KilocodeAdapteras a subprocess-based external engine, allowingcf work start <id> --execute --engine kilocodeSubprocessAdapterpattern used byclaude-codeandopencodeadapterskilo run <prompt> --auto --workspace <path>) rather than via stdinfailedresult with a descriptive messageKILOCODE_PATH,KILOCODE_MODEL, andKILOCODE_FLAGSenvironment variablesChanges
codeframe/core/adapters/kilocode.py— newKilocodeAdapterextendingSubprocessAdaptercodeframe/core/engine_registry.py— registers"kilocode"inVALID_ENGINESandEXTERNAL_ENGINES; wires factory functionscodeframe/cli/app.py— updates--enginehelp text (bothwork startandwork batch run).env.example— documentsKILOCODE_PATH,KILOCODE_MODEL,KILOCODE_FLAGStests/core/adapters/test_kilocode.py— 14 unit tests covering all adapter behaviorstests/core/test_engine_registry.py— adds kilocode coverage to registry testsTest plan
KilocodeAdapter— all passingcf work start <id> --execute --engine kilocoderoutes correctly throughexecute_agent()Closes #417
Summary by CodeRabbit
New Features
Tests