Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file removed .DS_Store
Binary file not shown.
Binary file removed .github/.DS_Store
Binary file not shown.
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,9 @@ archive/
.claw/sessions/
.clawhip/
status-help.txt

# HackCode local artifacts (sessions, agents) — never commit to a public repo
.hackcode/
.hackcode-agents/
# macOS
.DS_Store
27 changes: 0 additions & 27 deletions .hackcode-todos.json

This file was deleted.

Binary file removed .hackcode/.DS_Store
Binary file not shown.
47 changes: 0 additions & 47 deletions .hackcode/sessions/bc692bcac69824a6/session-1776068029370-0.jsonl

This file was deleted.

Binary file removed cheatsheets/.DS_Store
Binary file not shown.
Binary file removed rust/.DS_Store
Binary file not shown.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

1 change: 1 addition & 0 deletions rust/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 5 additions & 1 deletion rust/crates/api/src/client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,10 @@ impl ProviderClient {
Some(meta) if meta.auth_env == "DASHSCOPE_API_KEY" => {
OpenAiCompatConfig::dashscope()
}
Some(meta) if meta.auth_env == "OPENCODE_API_KEY" => {
OpenAiCompatConfig::opencode()
}
Some(meta) if meta.auth_env == "NIM_API_KEY" => OpenAiCompatConfig::nim(),
_ => OpenAiCompatConfig::openai(),
};
Ok(Self::OpenAi(OpenAiCompatClient::from_env(config)?))
Expand Down Expand Up @@ -168,7 +172,7 @@ mod tests {

#[test]
fn resolves_existing_and_grok_aliases() {
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-6");
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-8");
assert_eq!(resolve_model_alias("grok"), "grok-3");
assert_eq!(resolve_model_alias("grok-mini"), "grok-3-mini");
}
Expand Down
87 changes: 82 additions & 5 deletions rust/crates/api/src/providers/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,15 @@ const MODEL_REGISTRY: &[(&str, ProviderMetadata)] = &[
default_base_url: anthropic::DEFAULT_BASE_URL,
},
),
(
"fable",
ProviderMetadata {
provider: ProviderKind::Anthropic,
auth_env: "ANTHROPIC_API_KEY",
base_url_env: "ANTHROPIC_BASE_URL",
default_base_url: anthropic::DEFAULT_BASE_URL,
},
),
(
"grok",
ProviderMetadata {
Expand Down Expand Up @@ -143,9 +152,10 @@ pub fn resolve_model_alias(model: &str) -> String {
.find_map(|(alias, metadata)| {
(*alias == lower).then_some(match metadata.provider {
ProviderKind::Anthropic => match *alias {
"opus" => "claude-opus-4-6",
"opus" => "claude-opus-4-8",
"sonnet" => "claude-sonnet-4-6",
"haiku" => "claude-haiku-4-5-20251213",
"haiku" => "claude-haiku-4-5",
"fable" => "claude-fable-5",
Comment on lines +155 to +158

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find remaining opus 4.6 aliases/defaults after `opus` now resolves to 4.8.
# Expectation: Product defaults that should track `opus` are migrated to claude-opus-4-8;
# legacy fixtures may remain only if intentionally pinned.
rg -n -C 2 'claude-opus-4-6|resolve_model_alias\("opus"\)|"smart"\s*:' --glob '!target/**'

Repository: itwizardo/hackcode

Length of output: 33429


🏁 Script executed:

cat -n rust/crates/api/src/providers/mod.rs | sed -n '150,165p'

Repository: itwizardo/hackcode

Length of output: 873


Update documentation, hardcoded defaults, and config fixtures to reflect the opus alias migration from claude-opus-4-6 to claude-opus-4-8.

The opus alias now resolves to claude-opus-4-8 (line 155), but the migration is incomplete across the codebase:

  • Documentation stale: USAGE.md (lines 200, 317) and rust/README.md (lines 112–113, 213) still document opusclaude-opus-4-6.
  • Hardcoded defaults unchanged: DEFAULT_MODEL in rust/crates/rusty-claude-cli/src/main.rs:59 and DEFAULT_AGENT_MODEL in rust/crates/tools/src/lib.rs:3661 remain hardcoded to claude-opus-4-6.
  • Config fixture inconsistency: USAGE.md:334 hardcodes "smart": "claude-opus-4-6" (stale); rust/crates/rusty-claude-cli/src/main.rs:11044 uses "smart": "opus" (alias); rust/crates/runtime/src/config.rs:1855–1860 show mixed values across test fixtures.

Update all references, documentation, and fixture defaults to use claude-opus-4-8 or the alias opus consistently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/api/src/providers/mod.rs` around lines 155 - 158, The `opus`
alias migration from `claude-opus-4-6` to `claude-opus-4-8` is incomplete across
the codebase. Update all references to reflect this change: in USAGE.md (lines
200, 317, 334), change documentation and config fixture references from
`claude-opus-4-6` to `claude-opus-4-8`; in rust/README.md (lines 112–113, 213),
update documentation to reference `claude-opus-4-8` instead of
`claude-opus-4-6`; in rust/crates/rusty-claude-cli/src/main.rs:59, change the
DEFAULT_MODEL constant from `claude-opus-4-6` to `claude-opus-4-8`; in
rust/crates/tools/src/lib.rs:3661, update DEFAULT_AGENT_MODEL from
`claude-opus-4-6` to `claude-opus-4-8`; in
rust/crates/rusty-claude-cli/src/main.rs:11044, ensure config fixture for
"smart" uses the consistent value; and in
rust/crates/runtime/src/config.rs:1855–1860, align test fixtures to use either
`claude-opus-4-8` or the `opus` alias consistently throughout.

_ => trimmed,
},
ProviderKind::Xai => match *alias {
Expand All @@ -154,7 +164,13 @@ pub fn resolve_model_alias(model: &str) -> String {
"grok-2" => "grok-2",
_ => trimmed,
},
ProviderKind::OpenAi | ProviderKind::Ollama => trimmed,
ProviderKind::OpenAi => match *alias {
// `kimi` is the friendly alias for Moonshot's current flagship
// on DashScope; expand it to the canonical model id.
"kimi" => "kimi-k2.5",
_ => trimmed,
},
ProviderKind::Ollama => trimmed,
})
})
.map_or_else(|| trimmed.to_string(), ToOwned::to_owned)
Expand Down Expand Up @@ -204,6 +220,39 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
default_base_url: openai_compat::DEFAULT_DASHSCOPE_BASE_URL,
});
}
// Moonshot Kimi models (kimi-k2.5, kimi-k1.5, kimi/<id>) also speak the
// OpenAI-compat shape on DashScope. Matched before the Ollama colon-heuristic
// so a model id like `kimi-k2.5` is not misrouted to a local backend.
if canonical.starts_with("kimi/") || canonical.starts_with("kimi-") {
return Some(ProviderMetadata {
provider: ProviderKind::OpenAi,
auth_env: "DASHSCOPE_API_KEY",
base_url_env: "DASHSCOPE_BASE_URL",
default_base_url: openai_compat::DEFAULT_DASHSCOPE_BASE_URL,
});
}
// opencode "Zen" gateway — OpenAI-compatible, free models (minimax-m3,
// deepseek-v4-pro, fable-5-go, glm-5.1, …). Routed via the `opencode/` prefix;
// the bare model id is sent on the wire. Key in OPENCODE_API_KEY.
if canonical.starts_with("opencode/") {
return Some(ProviderMetadata {
provider: ProviderKind::OpenAi,
auth_env: "OPENCODE_API_KEY",
base_url_env: "OPENCODE_BASE_URL",
default_base_url: openai_compat::DEFAULT_OPENCODE_BASE_URL,
});
}
// NVIDIA NIM — OpenAI-compatible, free quota. Routed via the `nim/` prefix;
// the vendor/model id after the prefix is sent on the wire
// (e.g. nim/deepseek-ai/deepseek-r1 → deepseek-ai/deepseek-r1). Key in NIM_API_KEY.
if canonical.starts_with("nim/") {
return Some(ProviderMetadata {
provider: ProviderKind::OpenAi,
auth_env: "NIM_API_KEY",
base_url_env: "NIM_BASE_URL",
default_base_url: openai_compat::DEFAULT_NIM_BASE_URL,
});
}
// Ollama local models — HuggingFace models pulled via `ollama pull hf.co/...`
// and any model containing a colon (e.g. "llama3.2:1b", "gemma:7b")
if canonical.starts_with("hf.co/")
Expand All @@ -229,6 +278,19 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
#[must_use]
pub fn detect_provider_kind(model: &str) -> ProviderKind {
if let Some(metadata) = metadata_for_model(model) {
// An explicitly-configured OpenAI-compatible endpoint takes precedence
// over the local-Ollama *fallback heuristic* — model ids like
// "qwen2.5-coder:7b" only match Ollama by shape (a colon), so when the
// user has pointed OPENAI_BASE_URL at their own server, route there.
// Explicit provider metadata (anthropic/grok/dashscope/openai/...) is
// resolved before the Ollama branch in `metadata_for_model`, so this
// only ever reinterprets the heuristic fallback.
if metadata.provider == ProviderKind::Ollama
&& std::env::var_os("OPENAI_BASE_URL").is_some()
&& openai_compat::has_api_key("OPENAI_API_KEY")
Comment on lines +288 to +290

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Treat empty OPENAI_BASE_URL as unset before overriding Ollama routing.

var_os("OPENAI_BASE_URL").is_some() is true for OPENAI_BASE_URL=, so an Ollama-shaped model plus OPENAI_API_KEY gets routed to OpenAI with an empty base URL; read_base_url then produces a /chat/completions request URL instead of using the Ollama/default route.

🐛 Proposed fix
+fn openai_base_url_is_configured() -> bool {
+    matches!(std::env::var("OPENAI_BASE_URL"), Ok(value) if !value.trim().is_empty())
+}
+
 #[must_use]
 pub fn detect_provider_kind(model: &str) -> ProviderKind {
     if let Some(metadata) = metadata_for_model(model) {
@@
         if metadata.provider == ProviderKind::Ollama
-            && std::env::var_os("OPENAI_BASE_URL").is_some()
+            && openai_base_url_is_configured()
             && openai_compat::has_api_key("OPENAI_API_KEY")
         {
             return ProviderKind::OpenAi;
         }
@@
-    if std::env::var_os("OPENAI_BASE_URL").is_some() && openai_compat::has_api_key("OPENAI_API_KEY")
+    if openai_base_url_is_configured() && openai_compat::has_api_key("OPENAI_API_KEY")
     {
         return ProviderKind::OpenAi;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/api/src/providers/mod.rs` around lines 288 - 290, The condition
checking `std::env::var_os("OPENAI_BASE_URL").is_some()` returns true even when
the environment variable is set to an empty string, causing Ollama models to be
incorrectly routed to OpenAI with an empty base URL. Modify the check for
OPENAI_BASE_URL to validate that not only does the variable exist, but that it
also contains a non-empty value. This way, empty values will be treated as
unset, and Ollama routing will proceed correctly instead of creating requests
with an empty base URL.

{
return ProviderKind::OpenAi;
}
return metadata.provider;
}
// When OPENAI_BASE_URL is set, the user explicitly configured an
Expand Down Expand Up @@ -257,7 +319,9 @@ pub fn detect_provider_kind(model: &str) -> ProviderKind {
pub const fn model_family_identity_for_kind(kind: ProviderKind) -> runtime::ModelFamilyIdentity {
match kind {
ProviderKind::Anthropic => runtime::ModelFamilyIdentity::Claude,
ProviderKind::Xai | ProviderKind::OpenAi => runtime::ModelFamilyIdentity::Generic,
ProviderKind::Xai | ProviderKind::OpenAi | ProviderKind::Ollama => {
runtime::ModelFamilyIdentity::Generic
}
}
}

Expand Down Expand Up @@ -293,14 +357,27 @@ pub fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
let canonical = resolve_model_alias(model);
let base_model = canonical.rsplit('/').next().unwrap_or(canonical.as_str());
match base_model {
"claude-opus-4-8" | "claude-fable-5" => Some(ModelTokenLimit {
max_output_tokens: 128_000,
context_window_tokens: 1_000_000,
}),
"claude-opus-4-6" => Some(ModelTokenLimit {
max_output_tokens: 32_000,
context_window_tokens: 200_000,
}),
"claude-sonnet-4-6" | "claude-haiku-4-5-20251213" => Some(ModelTokenLimit {
"claude-sonnet-4-6" => Some(ModelTokenLimit {
max_output_tokens: 64_000,
// Sonnet 4.6 ships a 200K context window by default (the 1M window is
// a separate beta opt-in); advertising 1M here would make the
// preflight under-block requests the API actually rejects.
context_window_tokens: 200_000,
}),
"claude-haiku-4-5" | "claude-haiku-4-5-20251001" | "claude-haiku-4-5-20251213" => {
Some(ModelTokenLimit {
max_output_tokens: 64_000,
context_window_tokens: 200_000,
})
}
"grok-3" | "grok-3-mini" => Some(ModelTokenLimit {
max_output_tokens: 64_000,
context_window_tokens: 131_072,
Expand Down
55 changes: 54 additions & 1 deletion rust/crates/api/src/providers/openai_compat.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ pub const DEFAULT_XAI_BASE_URL: &str = "https://api.x.ai/v1";
pub const DEFAULT_OPENAI_BASE_URL: &str = "https://api.openai.com/v1";
pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";
pub const DEFAULT_OLLAMA_BASE_URL: &str = "http://localhost:11434/v1";
pub const DEFAULT_OPENCODE_BASE_URL: &str = "https://opencode.ai/zen/go/v1";
pub const DEFAULT_NIM_BASE_URL: &str = "https://integrate.api.nvidia.com/v1";
const REQUEST_ID_HEADER: &str = "request-id";
const ALT_REQUEST_ID_HEADER: &str = "x-request-id";
const DEFAULT_INITIAL_BACKOFF: Duration = Duration::from_secs(1);
Expand All @@ -43,6 +45,8 @@ const XAI_ENV_VARS: &[&str] = &["XAI_API_KEY"];
const OPENAI_ENV_VARS: &[&str] = &["OPENAI_API_KEY"];
const DASHSCOPE_ENV_VARS: &[&str] = &["DASHSCOPE_API_KEY"];
const OLLAMA_ENV_VARS: &[&str] = &["OLLAMA_API_KEY"];
const OPENCODE_ENV_VARS: &[&str] = &["OPENCODE_API_KEY"];
const NIM_ENV_VARS: &[&str] = &["NIM_API_KEY"];

// Provider-specific request body size limits in bytes
const XAI_MAX_REQUEST_BODY_BYTES: usize = 52_428_800; // 50MB
Expand Down Expand Up @@ -101,13 +105,42 @@ impl OpenAiCompatConfig {
}
}

/// opencode "Zen" gateway (`https://opencode.ai/zen/go/v1`) — OpenAI-compatible,
/// serves models like minimax-m3, deepseek-v4-pro, fable-5-go, glm-5.1.
/// Key comes from the opencode auth store, exported as OPENCODE_API_KEY.
#[must_use]
pub const fn opencode() -> Self {
Self {
provider_name: "opencode-go",
api_key_env: "OPENCODE_API_KEY",
base_url_env: "OPENCODE_BASE_URL",
default_base_url: DEFAULT_OPENCODE_BASE_URL,
max_request_body_bytes: OPENAI_MAX_REQUEST_BODY_BYTES,
}
}

/// NVIDIA NIM (`https://integrate.api.nvidia.com/v1`) — OpenAI-compatible,
/// free quota. Serves vendor/model ids (deepseek-ai/deepseek-r1, etc.).
#[must_use]
pub const fn nim() -> Self {
Self {
provider_name: "NVIDIA-NIM",
api_key_env: "NIM_API_KEY",
base_url_env: "NIM_BASE_URL",
default_base_url: DEFAULT_NIM_BASE_URL,
max_request_body_bytes: OPENAI_MAX_REQUEST_BODY_BYTES,
}
}

#[must_use]
pub fn credential_env_vars(self) -> &'static [&'static str] {
match self.provider_name {
"xAI" => XAI_ENV_VARS,
"OpenAI" => OPENAI_ENV_VARS,
"DashScope" => DASHSCOPE_ENV_VARS,
"Ollama" => OLLAMA_ENV_VARS,
"opencode-go" => OPENCODE_ENV_VARS,
"NVIDIA-NIM" => NIM_ENV_VARS,
_ => &[],
}
}
Expand Down Expand Up @@ -897,7 +930,7 @@ fn strip_routing_prefix(model: &str) -> &str {
let prefix = &model[..pos];
// Only strip if the prefix before "/" is a known routing prefix,
// not if "/" appears in the middle of the model name for other reasons.
if matches!(prefix, "openai" | "xai" | "grok" | "qwen" | "kimi") {
if matches!(prefix, "openai" | "xai" | "grok" | "qwen" | "kimi" | "opencode" | "nim") {
&model[pos + 1..]
} else {
model
Expand Down Expand Up @@ -1019,9 +1052,29 @@ pub fn build_chat_completion_request(
payload["reasoning_effort"] = json!(effort);
}

// Local aeon (Qwen3.5-MTP) intermittently opens a `<think>` block and never
// emits the closing `</think>` within the token budget — even on trivial
// turns. The server's `--reasoning-parser qwen3` then cannot split the
// output, so `content` comes back empty and the REPL prints "(no response)".
// Disable thinking for this model via the Qwen chat template switch so it
// answers directly. Targeted by wire model name; no effect on other backends.
if model_disables_thinking(wire_model) {
payload["chat_template_kwargs"] = json!({ "enable_thinking": false });
}
Comment on lines +1061 to +1063

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Narrow the thinking-disable predicate to the local aeon variants.

model_disables_thinking currently matches every qwen3.5* / qwen3_5* wire model, so DashScope/opencode/NIM models with those names will receive the non-standard chat_template_kwargs field even though the documented target is the local aeon/Qwen3.5-MTP rig.

🐛 Proposed fix
 pub fn model_disables_thinking(wire_model: &str) -> bool {
     let canonical = wire_model.rsplit('/').next().unwrap_or(wire_model);
     let lowered = canonical.to_ascii_lowercase();
-    lowered == "aeon" || lowered.starts_with("qwen3.5") || lowered.starts_with("qwen3_5")
+    matches!(lowered.as_str(), "aeon" | "qwen3.5-mtp" | "qwen3_5_mtp")
 }

Also applies to: 1072-1075

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/api/src/providers/openai_compat.rs` around lines 1061 - 1063, The
`model_disables_thinking` function is currently too broad and matches all
qwen3.5* and qwen3_5* wire models, causing unintended models from
DashScope/opencode/NIM to receive the non-standard chat_template_kwargs field.
Narrow the predicate to specifically match only the local aeon variants
(aeon/Qwen3.5-MTP rig) by adding an additional check to verify the model is from
the local aeon source, not just checking the model name pattern. This ensures
the enable_thinking: false setting is only applied to the intended local aeon
implementation.


payload
}

/// Returns true for local models that must run with thinking disabled to avoid
/// unclosed `<think>` blocks swallowing the entire response. Currently the
/// Qwen3.5-MTP "aeon" served on the local vLLM rig.
#[must_use]
pub fn model_disables_thinking(wire_model: &str) -> bool {
let canonical = wire_model.rsplit('/').next().unwrap_or(wire_model);
let lowered = canonical.to_ascii_lowercase();
lowered == "aeon" || lowered.starts_with("qwen3.5") || lowered.starts_with("qwen3_5")
}

/// Returns true for models that do NOT support the `is_error` field in tool results.
/// kimi models (via Moonshot AI/Dashscope) reject this field with 400 Bad Request.
/// Returns true for models that do NOT support the `is_error` field in tool results.
Expand Down
4 changes: 2 additions & 2 deletions rust/crates/commands/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5735,11 +5735,11 @@ mod tests {
let _guard = env_guard();
let workspace = temp_dir("mcp-degrades-144");
let config_home = temp_dir("mcp-degrades-144-cfg");
fs::create_dir_all(workspace.join(".claw")).expect("create workspace .claw dir");
fs::create_dir_all(workspace.join(".hackcode")).expect("create workspace .claw dir");
fs::create_dir_all(&config_home).expect("create config home");
// One valid server + one malformed entry missing `command`.
fs::write(
workspace.join(".claw.json"),
workspace.join(".hackcode.json"),
r#"{
"mcpServers": {
"everything": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-everything"]},
Expand Down
31 changes: 8 additions & 23 deletions rust/crates/runtime/src/bash.rs
Original file line number Diff line number Diff line change
Expand Up @@ -176,32 +176,17 @@ async fn execute_bash_async(

let mut command = prepare_tokio_command(&input.command, &cwd, &sandbox_status, true);

// The model often passes timeout values thinking they're seconds (e.g. 60)
// but the parameter is in milliseconds. Enforce a 30-second floor so
// network commands (nmap, curl, ping) actually have time to finish.
// `timeout` is honored exactly as supplied (milliseconds). We deliberately do
// not floor it: an explicit timeout should mean what it says so callers can
// bound a hung `cargo test`/network probe precisely. Sensible defaults for
// long-running network commands belong at the caller, not a silent override.
let output_result = if let Some(timeout_ms) = input.timeout {
let timeout_ms = timeout_ms.max(30_000);
match timeout(Duration::from_millis(timeout_ms), command.output()).await {
Ok(result) => (result?, false),
Err(_) => {
return Ok(BashCommandOutput {
stdout: String::new(),
stderr: format!("Command exceeded timeout of {timeout_ms} ms"),
raw_output_path: None,
interrupted: true,
is_image: None,
background_task_id: None,
backgrounded_by_user: None,
assistant_auto_backgrounded: None,
dangerously_disable_sandbox: input.dangerously_disable_sandbox,
return_code_interpretation: Some(String::from("timeout")),
no_output_expected: Some(true),
structured_content: None,
persisted_output_path: None,
persisted_output_size: None,
sandbox_status: Some(sandbox_status),
});
}
// On timeout, classify the failure (a hung `cargo test`/`pytest`
// reads differently from a slow network command) and emit structured
// provenance instead of a bare "timeout" string.
Err(_) => return Ok(timeout_output(&input, timeout_ms, sandbox_status)),
Comment on lines 177 to +189

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Tokio process Command::output future cancellation Child drop behavior kill_on_drop documentation

💡 Result:

In the Tokio asynchronous process module, the behavior of a child process when its handle or the future representing its execution is dropped depends entirely on the configuration of the Command object [1][2]. By default, the kill_on_drop property for a Command is set to false [1][3]. When this is the case, dropping the Child handle or the future returned by Command::output does not terminate the child process; the process will continue to execute independently of the parent [1][2]. This behavior aligns with the standard library's process handling and differs from the typical expectations of the futures paradigm, where dropping a future often implies cancellation [1][4][2]. To change this behavior, you must explicitly call.kill_on_drop(true) on the Command object before spawning the process [1][3]. When kill_on_drop is enabled, the destructor of the associated Child handle or the future returned by Command::output will issue a kill signal to the child process if it has not yet exited [1][2]. Important considerations regarding this mechanism include: - Resource Cleanup (Zombies): On Unix platforms, terminated processes must be "reaped" by their parent to release system resources [4][5]. If a process is killed by the drop mechanism, it may become a "zombie" process [4][6]. Tokio makes a best-effort attempt to reap these processes in the background, but provides no guarantees regarding the speed or frequency of this cleanup [4][5]. - Recommendation: Because the drop-based cleanup is best-effort, it is recommended to avoid dropping a Child handle or the output future before it has been fully awaited if strict cleanup guarantees are necessary [1][4]. For reliable management, explicitly call child.kill.await or child.wait.await [1][3]. - Output Future Specifics: When using Command::output, the future returned will also respect the kill_on_drop setting [1][2]. If set to true, dropping the future before it resolves will kill the process [1]. If set to false, the process will continue to run [2].

Citations:


🏁 Script executed:

# Find and read the Cargo.toml to check Tokio version
fd -name "Cargo.toml" | head -5

Repository: itwizardo/hackcode

Length of output: 293


🏁 Script executed:

# Read the full function context around the timeout code
cat -n rust/crates/runtime/src/bash.rs | sed -n '150,210p'

Repository: itwizardo/hackcode

Length of output: 2780


🏁 Script executed:

# Find Cargo.toml and grep for tokio version
find . -name "Cargo.toml" -type f | xargs grep -l "tokio"

Repository: itwizardo/hackcode

Length of output: 245


🏁 Script executed:

# Look for the prepare_tokio_command function definition
rg "fn prepare_tokio_command" -A 20

Repository: itwizardo/hackcode

Length of output: 1387


🏁 Script executed:

# Read the runtime Cargo.toml to find tokio version
cat rust/crates/runtime/Cargo.toml | grep -A 5 "tokio"

Repository: itwizardo/hackcode

Length of output: 216


🏁 Script executed:

# Get the full prepare_tokio_command function
rg "fn prepare_tokio_command" -A 40 rust/crates/runtime/src/bash.rs

Repository: itwizardo/hackcode

Length of output: 1308


Kill the foreground child when the timeout future is cancelled.

Line 184 times out command.output(), but line 189 returns immediately without terminating the child process. By default, Tokio's Command::kill_on_drop is false, so dropping the timed-out future leaves the shell command running. This becomes a practical issue now that tiny caller-provided timeouts are honored exactly (lines 179–182).

Set kill_on_drop(true) on the command before the timeout to ensure the process terminates:

Proposed fix
     let mut command = prepare_tokio_command(&input.command, &cwd, &sandbox_status, true);
 
     // `timeout` is honored exactly as supplied (milliseconds). We deliberately do
@@ -183,6 +183,7 @@ async fn execute_bash_async(
     // not floor it: an explicit timeout should mean what it says so callers can
     // bound a hung `cargo test`/network probe precisely. Sensible defaults for
     // long-running network commands belong at the caller, not a silent override.
     let output_result = if let Some(timeout_ms) = input.timeout {
+        command.kill_on_drop(true);
         match timeout(Duration::from_millis(timeout_ms), command.output()).await {

For stricter cleanup guarantees, use explicit spawn() + timeout + kill()/wait() instead of command.output().

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let mut command = prepare_tokio_command(&input.command, &cwd, &sandbox_status, true);
// The model often passes timeout values thinking they're seconds (e.g. 60)
// but the parameter is in milliseconds. Enforce a 30-second floor so
// network commands (nmap, curl, ping) actually have time to finish.
// `timeout` is honored exactly as supplied (milliseconds). We deliberately do
// not floor it: an explicit timeout should mean what it says so callers can
// bound a hung `cargo test`/network probe precisely. Sensible defaults for
// long-running network commands belong at the caller, not a silent override.
let output_result = if let Some(timeout_ms) = input.timeout {
let timeout_ms = timeout_ms.max(30_000);
match timeout(Duration::from_millis(timeout_ms), command.output()).await {
Ok(result) => (result?, false),
Err(_) => {
return Ok(BashCommandOutput {
stdout: String::new(),
stderr: format!("Command exceeded timeout of {timeout_ms} ms"),
raw_output_path: None,
interrupted: true,
is_image: None,
background_task_id: None,
backgrounded_by_user: None,
assistant_auto_backgrounded: None,
dangerously_disable_sandbox: input.dangerously_disable_sandbox,
return_code_interpretation: Some(String::from("timeout")),
no_output_expected: Some(true),
structured_content: None,
persisted_output_path: None,
persisted_output_size: None,
sandbox_status: Some(sandbox_status),
});
}
// On timeout, classify the failure (a hung `cargo test`/`pytest`
// reads differently from a slow network command) and emit structured
// provenance instead of a bare "timeout" string.
Err(_) => return Ok(timeout_output(&input, timeout_ms, sandbox_status)),
let mut command = prepare_tokio_command(&input.command, &cwd, &sandbox_status, true);
// `timeout` is honored exactly as supplied (milliseconds). We deliberately do
// not floor it: an explicit timeout should mean what it says so callers can
// bound a hung `cargo test`/network probe precisely. Sensible defaults for
// long-running network commands belong at the caller, not a silent override.
let output_result = if let Some(timeout_ms) = input.timeout {
command.kill_on_drop(true);
match timeout(Duration::from_millis(timeout_ms), command.output()).await {
Ok(result) => (result?, false),
// On timeout, classify the failure (a hung `cargo test`/`pytest`
// reads differently from a slow network command) and emit structured
// provenance instead of a bare "timeout" string.
Err(_) => return Ok(timeout_output(&input, timeout_ms, sandbox_status)),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/crates/runtime/src/bash.rs` around lines 177 - 189, The timeout
operation on the command does not terminate the child process when the timeout
is exceeded because kill_on_drop is false by default in Tokio's Command. To fix
this, call kill_on_drop(true) on the command object returned by
prepare_tokio_command before passing it to the timeout call. This ensures that
when the timeout future is cancelled and dropped, the underlying shell process
is properly terminated rather than left running in the background.

}
} else {
(command.output().await?, false)
Expand Down
Loading