Issue: Foreground subagents exhaust single API key rate limit, causing 429 errors and hangs

### What feature would you like to see?

### Problem Description

When running multiple foreground subagents concurrently (e.g., 3–4 `coder` or `explore` subagents working on independent tasks), **all subagents share the same API key** as the root runtime. This leads to severe rate-limit contention:

1. **Rate limit exhaustion**: A single `KIMI_API_KEY` has finite TPM/RPM (tokens-per-minute and requests-per-minute) quotas. With 3–4 subagents each making multi-step LLM calls, the quota is exhausted within the first few calls.
2. **429 errors and retries**: Subsequent requests hit `429 Too Many Requests`. Subagents either retry (wasting tokens) or hang waiting for quota recovery.
3. **Poor user experience**: From the user's perspective, subagents that should complete in seconds instead take minutes or fail silently. The shell UI shows subagents as "running" with no visible progress.
4. **No backend attribution**: All requests appear in the Kimi console as `KimiCLI/1.44.0` with no way to distinguish root agent calls from subagent calls, making it impossible to diagnose which subagent is consuming quota.
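
To make the contention concrete, here is a toy simulation of one minute of traffic against a single shared key. The quota numbers (60 RPM, 20 calls per subagent) and the `FixedWindowQuota` / `simulate` names are made up for illustration; the real limits and limiter algorithm on the Kimi side are not known to me.

```python
from dataclasses import dataclass


@dataclass
class FixedWindowQuota:
    """Toy per-minute request quota for one API key (hypothetical numbers)."""

    rpm_limit: int
    used: int = 0

    def try_request(self) -> bool:
        # Accept while quota remains; a rejection stands in for a 429 response.
        if self.used < self.rpm_limit:
            self.used += 1
            return True
        return False


def simulate(subagents: int, calls_per_subagent: int, rpm_limit: int) -> tuple[int, int]:
    """Return (accepted, rejected) for one window of round-robin traffic on a shared key."""
    quota = FixedWindowQuota(rpm_limit)
    accepted = rejected = 0
    for _ in range(calls_per_subagent):
        for _ in range(subagents):
            if quota.try_request():
                accepted += 1
            else:
                rejected += 1
    return accepted, rejected


if __name__ == "__main__":
    print(simulate(1, 20, 60))  # one subagent stays under the quota
    print(simulate(4, 20, 60))  # four subagents overrun the same quota
```

With these assumed numbers, a single subagent never hits the limit, while four subagents issuing the same per-agent load have a quarter of their requests rejected within the same window.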

### Reproduction Steps

1. Configure a single API key via `/login` or `KIMI_API_KEY`.
2. Launch 3+ foreground subagents concurrently:
   ```
   /coder "Analyze app.py"
   /coder "Review key pool design"
   /coder "Check test coverage"
   ```
3. Observe that:
   - Subagent response latency increases dramatically after the first few LLM calls
   - `429` errors appear in logs (if debug mode is enabled)
   - Subagents may run past any configured timeout and get killed, or hang indefinitely if no timeout is set

### Expected Behavior

- Each concurrent subagent should use a **distinct API key** when multiple keys are available
- The system should enforce a **concurrency limit** based on available key count to avoid exhausting all keys
- Subagent requests should carry a **discernible User-Agent** so backend monitoring can attribute quota consumption correctly
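
A rough sketch of what the behavior above could look like, assuming an asyncio-based runtime. `KeyPool`, `subagent_user_agent`, and `run_subagent` are hypothetical names, not existing Kimi CLI APIs, and the User-Agent format is only a suggestion. Leasing keys from a queue gives both properties at once: each running subagent holds a distinct key, and concurrency is capped at the number of keys.

```python
import asyncio
from contextlib import asynccontextmanager


class KeyPool:
    """Hands out each API key to at most one subagent at a time (hypothetical helper)."""

    def __init__(self, keys: list[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._free: asyncio.Queue[str] = asyncio.Queue()
        for key in keys:
            self._free.put_nowait(key)

    @asynccontextmanager
    async def lease(self):
        # Waits when every key is in use, so concurrency never exceeds the key count.
        key = await self._free.get()
        try:
            yield key
        finally:
            self._free.put_nowait(key)


def subagent_user_agent(base: str, subagent_type: str, subagent_id: str) -> str:
    # Assumed format; the actual scheme would be up to the maintainers.
    return f"{base} (subagent; type={subagent_type}; id={subagent_id})"


async def run_subagent(pool: KeyPool, name: str, in_use: set[str], peak: list[int]) -> str:
    async with pool.lease() as key:
        assert key not in in_use  # no two concurrent subagents share a key
        in_use.add(key)
        peak[0] = max(peak[0], len(in_use))
        await asyncio.sleep(0.01)  # stand-in for the subagent's LLM calls
        in_use.discard(key)
        return f"{name}:{key}"


async def demo() -> tuple[list[str], int]:
    pool = KeyPool(["key-a", "key-b"])  # placeholder keys
    in_use: set[str] = set()
    peak = [0]
    results = await asyncio.gather(
        *(run_subagent(pool, f"coder-{i}", in_use, peak) for i in range(4))
    )
    return list(results), peak[0]


if __name__ == "__main__":
    print(asyncio.run(demo()))
    print(subagent_user_agent("KimiCLI/1.44.0", "coder", "1"))
```

In this sketch, four subagents with two keys run at most two at a time; the other two wait for a key to be returned instead of piling onto a key that is already busy.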

### Environment

- Kimi CLI version: 1.44.0
- OS: macOS / Linux
- Python: 3.14
- Provider: kimi (Kimi Code platform)

### Additional Context

- The root agent itself also consumes the same key for compaction, user replies, etc. With subagents added, the contention becomes even worse.
- There is currently **no concurrency limit** for foreground subagents (only background tasks have a hardcoded limit), so a user can accidentally spawn an unbounded number of subagents and rate-limit themselves out of their own API key.
- The Agent tool schema describes the timeout as "Foreground: no default timeout (runs until completion)", so a subagent that hangs in rate-limit backoff is never killed.
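
For the last point, a minimal sketch of a default deadline for foreground subagents, again assuming an asyncio-based runtime. `run_with_deadline` is a hypothetical wrapper and the timeout values are arbitrary; how the real Agent tool schedules subagents may differ.

```python
import asyncio


async def run_with_deadline(coro, timeout_s: float):
    """Cancel a foreground subagent that exceeds timeout_s (hypothetical wrapper)."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point.
        return None


async def hung_subagent() -> str:
    await asyncio.sleep(3600)  # stand-in for endless rate-limit backoff
    return "done"


async def quick_subagent() -> str:
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_deadline(hung_subagent(), 0.05)))
    print(asyncio.run(run_with_deadline(quick_subagent(), 1.0)))
```

Returning `None` on timeout is just a placeholder; a real implementation would presumably surface a clear "timed out while rate-limited" status to the shell UI.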

### Additional information

_No response_

Issue: Foreground subagents exhaust single API key rate limit, causing 429 errors and hangs #2368
