feat(sisyphus): add GLM-5.x dedicated prompt builder and speed overlay by islee23520 · Pull Request #3736 · code-yeongyu/oh-my-openagent

islee23520 · 2026-04-30T14:23:42Z

Summary

GLM-5.x models overthink during Sisyphus orchestration. Since budgetTokens is not supported, thinking restraint must come from prompt structure. This PR introduces a dedicated GLM prompt system that enforces a fast dispatch-first execution loop and prevents text-only models from attempting image analysis.

Changes

New GLM-specific prompt builder (glm-prompt.ts) with 8-block structure enforcing DISPATCH→DELEGATE→COLLECT→SYNTHESIZE→DONE
Vision constraints applied to all GLM-using agents (Sisyphus, Sisyphus-Junior (SJ), Oracle, Metis, Momus)
SJ speed overlay for concise GLM delegation
AI slop removal across agent prompts
Quality benchmarks (32 tests)

Verification

bun run typecheck — clean
bun run build — clean
bun test — 6026 pass (45 pre-existing failures in tmux/background-agent, unrelated)

Checklist

Code follows project conventions
bun run typecheck passes
bun run build succeeds
Tested locally with OpenCode
Updated documentation if needed (README, AGENTS.md)
No version changes in package.json

- Add isGlmSisyphusHarnessModel for GLM-5/5.1/5-turbo detection
- Route GLM harness models to specialized prompts in Sisyphus agents
- Add Small Context Working Memory with state slices for GLM context optimization
- Add GLM-specific context priorities and vision constraints for Sisyphus Junior
- Add comprehensive tests for GLM prompt validation and routing

- Refactor sisyphus/glm.ts: 387→69 lines via overlay pattern (string replacement)
- Add isGlmThinkingModel() for GLM-5+ text models (excludes VLM)
- Add isGlmVisionModel() for GLM VLM variants (glm-4.6v, glm-5v-turbo)
- Oracle/Metis/Momus: GLM-5+ text → thinking: { type: enabled }, Claude → budgetTokens
- Sisyphus-Junior: GLM → thinking: { type: enabled } (was bare base)
- Sisyphus: GLM overlay injection + thinking config, fact-checked comments
- Update stale test: sisyphus-junior GLM now returns thinking
- Add 100-test factory benchmark (5 agents × 7 GLM variants + cross-agent guards)
- Add runtime benchmark script (scripts/benchmark-glm-thinking.ts)

Benchmark: 100 factory tests pass, 452 agent tests pass, typecheck clean
Verified: No GLM text model receives budgetTokens across any agent
Refs: code-yeongyu#3210, code-yeongyu#3256, code-yeongyu#3568
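The two predicates named in this commit could look roughly like the sketch below. It is an illustration, not the code in the PR: the model names come from this thread (GLM-5, 5.1, 5-turbo as text models; glm-4.6v and glm-5v-turbo as VLM variants), and the allowlist approach follows the later "explicit allowlist set" refactor commit. The `normalizeModelId` helper and the assumption that IDs may arrive as `provider/model` are hypothetical.

```typescript
// Hypothetical sketch: names mirror the commit message, bodies are assumed.

// Text models treated as "thinking" models (GLM-5 / 5.1 / 5-turbo per this PR).
const GLM_THINKING_MODELS: ReadonlySet<string> = new Set([
  "glm-5",
  "glm-5.1",
  "glm-5-turbo",
]);

// Vision-language variants named in the commit message.
const GLM_VISION_MODELS: ReadonlySet<string> = new Set([
  "glm-4.6v",
  "glm-5v-turbo",
]);

// Assumption: model IDs may be prefixed with a provider and use mixed case.
function normalizeModelId(model: string): string {
  const lastSlash = model.lastIndexOf("/");
  return model.slice(lastSlash + 1).trim().toLowerCase();
}

// True only for GLM-5+ text models; VLM variants are excluded by construction.
export function isGlmThinkingModel(model: string): boolean {
  return GLM_THINKING_MODELS.has(normalizeModelId(model));
}

// True only for the GLM vision-language variants.
export function isGlmVisionModel(model: string): boolean {
  return GLM_VISION_MODELS.has(normalizeModelId(model));
}
```

An exact-match set keeps the two groups disjoint, so a VLM ID such as `glm-5v-turbo` cannot accidentally match a text-model pattern the way a loose prefix regex might.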

…tale comments

…rlay

- New src/agents/sisyphus/glm-prompt.ts: 8-block GLM-specific Sisyphus prompt (DISPATCH→DELEGATE→COLLECT→SYNTHESIZE→DONE execution loop replacing EXPLORE→PLAN→ROUTE→EXECUTE→VERIFY→RETRY→DONE)
- New src/agents/glm-prompt-quality.test.ts: 32 quality benchmarks across Instruction Compliance (10), Speed (10), Accuracy (9), Cross-Agent (3)
- Extended src/agents/sisyphus-junior/glm.ts: SJ speed overlay with execution-first mindset, brief thinking, re-entry rule, exploration budget (2-iteration cap), tiered verification V1/V2/V3, token economy
- Modified src/agents/sisyphus.ts: GLM routing from overlay string.replace to dedicated buildGlmSisyphusPrompt() builder (matches Kimi K2.x pattern)

GLM-5.x does not support budgetTokens. Excessive thinking was controlled via prompt engineering: concise thinking mandate, re-entry rule (suppress re-verbalization for resolved turns), exploration budget hard stops, and tiered verification (V1/V2/V3) to avoid over-verification on trivial changes.

Hephaestus delegation strategy included: sequential edits >= 3 automatically routed to Hephaestus (deep-thinking worker) to keep Sisyphus unblocked.

All 140 GLM-related tests pass. Typecheck clean. AI slop removed.
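To make the shape of a dedicated builder concrete, here is a minimal sketch of a block-based prompt assembled from constants. It is not the contents of glm-prompt.ts: the real prompt has 8 blocks, while this shows only three, and the block tags, the `PromptBlock` type, and the function name `buildGlmSisyphusPromptSketch` are all hypothetical. The execution loop, the 2-iteration exploration cap, and the 3-edit Hephaestus threshold are taken from the commit message.

```typescript
// Hypothetical sketch of a block-based prompt builder; not the PR's code.

// Execution loop named in the commit message.
const GLM_EXECUTION_LOOP = ["DISPATCH", "DELEGATE", "COLLECT", "SYNTHESIZE", "DONE"] as const;

// Per the commit message: 3 or more sequential edits are routed to Hephaestus.
const HEPHAESTUS_SEQUENTIAL_EDIT_THRESHOLD = 3;

// Per the commit message: exploration is capped at 2 iterations.
const EXPLORATION_ITERATION_CAP = 2;

interface PromptBlock {
  tag: string; // assumed: each block is wrapped in a named tag
  body: string;
}

// Wrap a block body in its opening and closing tag.
function renderBlock(block: PromptBlock): string {
  return `<${block.tag}>\n${block.body}\n</${block.tag}>`;
}

// Routing rule from the commit message, expressed as a predicate.
export function shouldDelegateToHephaestus(sequentialEdits: number): boolean {
  return sequentialEdits >= HEPHAESTUS_SEQUENTIAL_EDIT_THRESHOLD;
}

// Assemble the (abbreviated) prompt from independent blocks.
export function buildGlmSisyphusPromptSketch(): string {
  const blocks: PromptBlock[] = [
    { tag: "execution_loop", body: GLM_EXECUTION_LOOP.join(" → ") },
    {
      tag: "exploration_budget",
      body: `Stop exploring after ${EXPLORATION_ITERATION_CAP} iterations.`,
    },
    {
      tag: "delegation",
      body: `Route ${HEPHAESTUS_SEQUENTIAL_EDIT_THRESHOLD}+ sequential edits to Hephaestus.`,
    },
  ];
  return blocks.map(renderBlock).join("\n\n");
}
```

Compared with patching a base prompt via string.replace, a builder like this fails loudly at compile time when a block is renamed, instead of silently leaving the base prompt unmodified when the replacement target no longer matches.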

cubic-dev-ai

No issues found across 18 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

_{Auto-approved: GLM-specific changes are strictly isolated using model-specific predicates. Non-GLM logic and configurations are preserved, verified by 140 tests including 32 new quality benchmarks.}

- Add buildGlmSubagentVisionBlock() for concise subagent vision warnings
- Apply vision constraint to Oracle, Metis, Momus (GLM branches)
- Simplify GLM SJ speed overlay prompt (remove 4 redundant lines)
- Remove redundant JSDoc from metis.ts, marketing language from sisyphus.ts
- Centralize Sisyphus description as SISYPHUS_DESCRIPTION constant

cubic-dev-ai

No issues found across 18 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

_{Requires human review: Large PR (1800+ lines) modifies Sisyphus metadata descriptions globally, not just for GLM models. 45 failing tests reported, and complex model-routing logic changes require manual verification.}

TheAsda · 2026-05-01T20:43:59Z

Looking forward to trying these changes.

…unior prompt test

The SJ GLM prompt intentionally omits the .sisyphus/state/{plan-or-session}/ path and individual slice filenames (goal.md, decisions.md, etc.) that the main Sisyphus GLM prompt includes. The test incorrectly expected the full ledger path; align it with the lightweight memory contract.

cubic-dev-ai

0 issues found across 1 file (changes from recent commits).

_{Requires human review: The PR removes extensive JSDoc and descriptive comments from multiple agent files and types.ts, which is a regression in code maintainability and documentation.}

# Conflicts:
#	src/agents/momus.ts

…r GLM

Upstream removed GLM-specific thinking config from Momus, causing budgetTokens: 32000 to be applied to GLM models that do not support it. Restore the isGlmThinkingModel branch matching Metis and Oracle.

Also add test coverage for:
- Momus GLM thinking config without budgetTokens
- Sisyphus call_omo_agent permission (allow) vs Hephaestus (deny)
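The restored branch can be pictured with the sketch below. The config shapes are inferred from the commit messages in this thread (`thinking: { type: enabled }` for GLM, `budgetTokens: 32000` otherwise) and may not match the repository's actual types. The function name `momusThinkingConfig` and the inline `isGlmThinkingModel` stand-in are hypothetical.

```typescript
// Hypothetical sketch; config shapes are assumed from the commit messages.

type ThinkingConfig =
  | { type: "enabled" } // GLM: thinking on, no token budget
  | { type: "enabled"; budgetTokens: number }; // budget-capable models

// Assumed stand-in for the repository's isGlmThinkingModel predicate.
const isGlmThinkingModel = (model: string): boolean =>
  ["glm-5", "glm-5.1", "glm-5-turbo"].includes(model.toLowerCase());

// GLM text models must never receive budgetTokens; other models keep the budget.
export function momusThinkingConfig(model: string): ThinkingConfig {
  if (isGlmThinkingModel(model)) {
    return { type: "enabled" };
  }
  return { type: "enabled", budgetTokens: 32000 };
}
```

A regression test for this branch only needs to assert that the returned object has no `budgetTokens` key for each GLM text model, which is the property the merge from upstream broke.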

cubic-dev-ai

0 issues found across 11 files (changes from recent commits).

_{Requires human review: Prompt modifications (e.g., shortening sisyphus role description) could cause subtle behavioral regressions; cannot guarantee 100% no regression.}

…ries

…mpt builder

cubic-dev-ai

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/plugin-handlers/tool-config-handler.ts">

<violation number="1">
P1: Sisyphus’s `call_omo_agent` permission was changed from allow to deny, which can break its ability to delegate to other agents.</violation>
</file>

…="deep") delegation

cubic-dev-ai

0 issues found across 1 file (changes from recent commits).

_{Requires human review: Large PR (2561 lines) touches core agent logic with significant prompt changes. Cannot be 100% sure of no regressions despite clean tests.}

…tion routing

cubic-dev-ai

0 issues found across 3 files (changes from recent commits).

_{Requires human review: Prompt identity changes (Sisyphus role/description) affect all models, not just GLM. Cannot guarantee zero regression from altering agent personality framing.}

Restore architecture and routing intent comments in types.ts, oracle.ts, and sisyphus/glm.ts that document design decisions.

cubic-dev-ai

0 issues found across 3 files (changes from recent commits).

_{Requires human review: PR adds 2544 lines across 12+ files (new agents, prompt builders, config changes). Despite clean AI review and tests, the strict '100% sure no regressions' criteria cannot be met without manual review}

…J, sisyphus

Restore architecture intent comments:
- metis.ts: agent role/responsibilities JSDoc
- sisyphus-junior/agent.ts: routing order, BLOCKED_TOOLS intent
- sisyphus.ts: Gemini overlay placement rationale, GLM thinking note

cubic-dev-ai

0 issues found across 3 files (changes from recent commits).

_{Requires human review: Cannot be 100% sure of zero regressions: modifies core orchestration agents (sisyphus, metis, oracle, momus, sisyphus-junior) with new GLM routing logic, which could introduce edge-case misrouting or}

… prompt

The GLM_SJ_Speed_Optimizations section duplicated content already present in the base SJ prompt and Sisyphus system prompt. Only the GLM-specific context priorities and vision constraint are kept.

delegation-scorecard and event-metric-collector are test-only utilities with zero runtime consumers. They are referenced only by scripts/benchmark-*. Move to scripts/ if needed later.

cubic-dev-ai

0 issues found across 12 files (changes from recent commits).

_{Requires human review: Unrelated change in ralph-loop-event-handler.ts (runtime error retry cap) introduces behavioral regression risk. Not 100% sure of no regressions per custom criteria.}

cubic-dev-ai · 2026-05-07T02:18:48Z

You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment @cubic-dev-ai review.

islee23520 added 7 commits April 30, 2026 18:31

style(agents): remove redundant JSDoc and inline comments

b30a1b5

style(tests): remove redundant test comments

f68ff5e

style(tests): remove section dividers and redundant casts in benchmark

7f8cb96

fix(benchmark): correct factoryTestResults property name and remove s…

ab49716

…tale comments

Merge branch 'code-yeongyu:dev' into tune/glm-performance

24df263

islee23520 changed the title ~~feat(agents): GLM-5.x thinking optimization, overlay refactor, and GPT-5.5 prompt hardening~~ feat(agents): GLM-5.x thinking optimization and overlay refactor Apr 30, 2026

islee23520 added 2 commits May 1, 2026 09:04

Merge remote-tracking branch 'upstream/dev' into tune/glm-performance

9b4b655

islee23520 changed the title ~~feat(agents): GLM-5.x thinking optimization and overlay refactor~~ feat(sisyphus): add GLM-5.x dedicated prompt builder and speed overlay May 1, 2026

cubic-dev-ai Bot approved these changes May 1, 2026

islee23520 added 2 commits May 1, 2026 15:30

Merge branch 'code-yeongyu:dev' into tune/glm-performance

c2538a9

cubic-dev-ai Bot reviewed May 1, 2026

islee23520 added 3 commits May 3, 2026 11:42

Merge branch 'code-yeongyu:dev' into tune/glm-performance

647cc6a

Merge branch 'code-yeongyu:dev' into tune/glm-performance

1a0d800

cubic-dev-ai Bot reviewed May 3, 2026

islee23520 added 5 commits May 5, 2026 21:59

Merge branch 'code-yeongyu:dev' into tune/glm-performance

1cf61a5

perf(sisyphus): add direct Hephaestus delegation for GLM routing

44426b1

Merge branch 'dev' into tune/glm-performance

1116027

# Conflicts:
#	src/agents/momus.ts

Merge branch 'code-yeongyu:dev' into tune/glm-performance

485fb08

cubic-dev-ai Bot reviewed May 6, 2026

islee23520 added 4 commits May 6, 2026 23:51

Merge branch 'code-yeongyu:dev' into tune/glm-performance

d08c131

fix(security): revert call_omo_agent permission to deny for Sisyphus

a87b865

fix(ralph-loop): add runtime error retry cap to prevent unbounded ret…

f34fa5a

…ries

refactor(types): replace GLM model regex with explicit allowlist set

b6ea936

islee23520 added 2 commits May 7, 2026 00:32

refactor(glm-prompt): extract section builder functions from main pro…

b93bf4b

…mpt builder

chore(cli): remove benchmark-only exports from public CLI surface

d6c5bdb

cubic-dev-ai Bot reviewed May 6, 2026

fix(glm-prompt): replace call_omo_agent references with task(category…

7d32830

…="deep") delegation