Skip to content

test(thinking-recovery): add unit tests for all exported functions#550

Merged
giuliovv merged 1 commit intomainfrom
test/thinking-recovery
Apr 28, 2026
Merged

test(thinking-recovery): add unit tests for all exported functions#550
giuliovv merged 1 commit intomainfrom
test/thinking-recovery

Conversation

@giuliovv
Copy link
Copy Markdown
Collaborator

Cover analyzeConversationState, needsThinkingRecovery, closeToolLoopForThinking, looksLikeCompactedThinkingTurn, and hasPossibleCompactedThinking.

Tests verify: empty/null guards, tool-loop detection, turn-start tracking, thinking-strip on recovery, synthetic message content (singular/plural/fallback), and compacted-turn heuristic boundary cases.

Cover analyzeConversationState, needsThinkingRecovery, closeToolLoopForThinking,
looksLikeCompactedThinkingTurn, and hasPossibleCompactedThinking.

Tests verify: empty/null guards, tool-loop detection, turn-start tracking,
thinking-strip on recovery, synthetic message content (singular/plural/fallback),
and compacted-turn heuristic boundary cases.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 28, 2026

Greptile Summary

This PR adds a new unit test file for thinking-recovery.ts, covering all five exported functions (analyzeConversationState, needsThinkingRecovery, closeToolLoopForThinking, looksLikeCompactedThinkingTurn, hasPossibleCompactedThinking) with 20+ cases. All test assertions are logically correct against the current implementation; the remaining feedback is limited to minor coverage gaps and fixture clarity.

Confidence Score: 4/5

Safe to merge — test-only change with correct assertions and no production code modified.

All P2s: tests are logically correct, no production code is touched, and the only findings are minor coverage gaps and a slightly misleading fixture.

src/plugin/thinking-recovery.test.ts — minor coverage gap for Anthropic-style thinking blocks in the strip test.

Important Files Changed

Filename Overview
src/plugin/thinking-recovery.test.ts New test file covering all five exported functions with 20+ cases. Logic and expectations are correct against the implementation; minor gaps in Anthropic-format thinking-strip coverage, a misleading empty-contents fixture, and a sparse non-array guard assertion.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[analyzeConversationState] --> B{inToolLoop?}
    B -- No --> C[Return state as-is]
    B -- Yes --> D[needsThinkingRecovery]
    D --> E{turnHasThinking?}
    E -- Yes --> C
    E -- No --> F[closeToolLoopForThinking]
    F --> G[stripAllThinkingBlocks]
    G --> H[countTrailingToolResults]
    H --> I{count}
    I -- 0 --> J[Processing previous context.]
    I -- 1 --> K[Tool execution completed.]
    I -- N --> L[N tool executions completed.]
    J & K & L --> M[Append synthetic model + user messages]
    M --> N[New conversation with fresh turn]

    subgraph Compaction Detection
        O[looksLikeCompactedThinkingTurn] --> P{has functionCall?}
        P -- No --> Q[false]
        P -- Yes --> R{has thinking?}
        R -- Yes --> Q
        R -- No --> S{text before call?}
        S -- Yes --> Q
        S -- No --> T[true — looks compacted]
        U[hasPossibleCompactedThinking] --> V[iterate from turnStartIdx]
        V --> O
    end
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 196-210

Comment:
**Thinking-strip test only checks `thought: true` format**

The assertion `p?.thought === true` only verifies the Gemini-style thinking format. `stripAllThinkingBlocks` (and `isThinkingPart`) also handles Anthropic-style blocks (`type === "thinking"` and `type === "redacted_thinking"`). If someone broke those branches in the implementation, this test would not catch it.

Consider broadening the predicate to match `isThinkingPart`'s full logic:
```ts
const hasThinking = parts.some(
  (p: any) =>
    p?.thought === true ||
    p?.type === "thinking" ||
    p?.type === "redacted_thinking",
);
```
Or add a parallel fixture using Anthropic-format thinking blocks.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 270-272

Comment:
**Test fixture uses `turnStartIdx = 0` but the guard only blocks `< 0`**

The test name says "returns false for empty contents", implying the guard catches it. However `turnStartIdx = 0` is not caught by `if (turnStartIdx < 0)`; the function proceeds to the loop, which just happens to be empty. A `turnStartIdx` of `-1` would make the test self-documenting and exercise the actual guard path.

```suggestion
    expect(hasPossibleCompactedThinking([], -1)).toBe(false);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 62-68

Comment:
**`non-array input` test only asserts `inToolLoop`**

The implementation returns a full default `ConversationState` for non-array input, but only one field is asserted here. A follow-on edge-case regression (e.g. a wrong default for `turnStartIdx`) would go undetected. Consider also asserting the other default values:
```ts
expect(state.turnStartIdx).toBe(-1);
expect(state.lastModelIdx).toBe(-1);
expect(state.turnHasThinking).toBe(false);
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "test(thinking-recovery): add unit tests ..." | Re-trigger Greptile

Comment on lines +196 to +210
const hasThinking = parts.some((p: any) => p?.thought === true);
expect(hasThinking).toBe(false);
}
});

it("uses singular message for single tool result", () => {
const contents = [userMsg("go"), modelWithToolCall(), toolResultMsg()];
const result = closeToolLoopForThinking(contents);
const syntheticModel = result[result.length - 2];
expect(syntheticModel?.parts[0]?.text).toBe("[Tool execution completed.]");
});

it("uses plural message for multiple tool results", () => {
const contents = [
userMsg("go"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Thinking-strip test only checks thought: true format

The assertion p?.thought === true only verifies the Gemini-style thinking format. stripAllThinkingBlocks (and isThinkingPart) also handles Anthropic-style blocks (type === "thinking" and type === "redacted_thinking"). If someone broke those branches in the implementation, this test would not catch it.

Consider broadening the predicate to match isThinkingPart's full logic:

const hasThinking = parts.some(
  (p: any) =>
    p?.thought === true ||
    p?.type === "thinking" ||
    p?.type === "redacted_thinking",
);

Or add a parallel fixture using Anthropic-format thinking blocks.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 196-210

Comment:
**Thinking-strip test only checks `thought: true` format**

The assertion `p?.thought === true` only verifies the Gemini-style thinking format. `stripAllThinkingBlocks` (and `isThinkingPart`) also handles Anthropic-style blocks (`type === "thinking"` and `type === "redacted_thinking"`). If someone broke those branches in the implementation, this test would not catch it.

Consider broadening the predicate to match `isThinkingPart`'s full logic:
```ts
const hasThinking = parts.some(
  (p: any) =>
    p?.thought === true ||
    p?.type === "thinking" ||
    p?.type === "redacted_thinking",
);
```
Or add a parallel fixture using Anthropic-format thinking blocks.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +270 to +272
parts: [
{ text: "I will now call the tool." },
{ functionCall: { name: "t", args: {} } },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Test fixture uses turnStartIdx = 0 but the guard only blocks < 0

The test name says "returns false for empty contents", implying the guard catches it. However turnStartIdx = 0 is not caught by if (turnStartIdx < 0); the function proceeds to the loop, which just happens to be empty. A turnStartIdx of -1 would make the test self-documenting and exercise the actual guard path.

Suggested change
parts: [
{ text: "I will now call the tool." },
{ functionCall: { name: "t", args: {} } },
expect(hasPossibleCompactedThinking([], -1)).toBe(false);
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 270-272

Comment:
**Test fixture uses `turnStartIdx = 0` but the guard only blocks `< 0`**

The test name says "returns false for empty contents", implying the guard catches it. However `turnStartIdx = 0` is not caught by `if (turnStartIdx < 0)`; the function proceeds to the loop, which just happens to be empty. A `turnStartIdx` of `-1` would make the test self-documenting and exercise the actual guard path.

```suggestion
    expect(hasPossibleCompactedThinking([], -1)).toBe(false);
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +62 to +68
const state = analyzeConversationState(null as any);
expect(state.inToolLoop).toBe(false);
});

it("detects a simple user→model conversation (not in tool loop)", () => {
const contents = [userMsg("hello"), modelMsg("hi there")];
const state = analyzeConversationState(contents);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 non-array input test only asserts inToolLoop

The implementation returns a full default ConversationState for non-array input, but only one field is asserted here. A follow-on edge-case regression (e.g. a wrong default for turnStartIdx) would go undetected. Consider also asserting the other default values:

expect(state.turnStartIdx).toBe(-1);
expect(state.lastModelIdx).toBe(-1);
expect(state.turnHasThinking).toBe(false);
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/plugin/thinking-recovery.test.ts
Line: 62-68

Comment:
**`non-array input` test only asserts `inToolLoop`**

The implementation returns a full default `ConversationState` for non-array input, but only one field is asserted here. A follow-on edge-case regression (e.g. a wrong default for `turnStartIdx`) would go undetected. Consider also asserting the other default values:
```ts
expect(state.turnStartIdx).toBe(-1);
expect(state.lastModelIdx).toBe(-1);
expect(state.turnHasThinking).toBe(false);
```

How can I resolve this? If you propose a fix, please make it concise.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 28, 2026

Walkthrough

A new test file is added that implements a comprehensive Vitest test suite for the thinking-recovery plugin. The suite contains tests validating the behavior of functions including analyzeConversationState, needsThinkingRecovery, closeToolLoopForThinking, looksLikeCompactedThinkingTurn, and hasPossibleCompactedThinking across various conversation scenarios. Tests cover input validation, flow detection, thinking identification, tool-loop sequence recognition, turn-start tracking, and message mutation behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding unit tests for all exported functions in the thinking-recovery module.
Description check ✅ Passed The description is directly related to the changeset, providing specific details about which functions are tested and what aspects are covered.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/plugin/thinking-recovery.test.ts (2)

186-199: Broaden strip assertions to all thinking encodings.

This test currently checks only part.thought === true. The heuristic code also recognizes type: "thinking" and type: "redacted_thinking", so this assertion should include those forms too.

Suggested assertion hardening
   it("strips thinking blocks from prior messages", () => {
@@
     for (const msg of modelMessages) {
       const parts: any[] = msg.parts ?? [];
-      const hasThinking = parts.some((p: any) => p?.thought === true);
+      const hasThinking = parts.some(
+        (p: any) =>
+          p?.thought === true ||
+          p?.type === "thinking" ||
+          p?.type === "redacted_thinking",
+      );
       expect(hasThinking).toBe(false);
     }
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plugin/thinking-recovery.test.ts` around lines 186 - 199, The test in
closeToolLoopForThinking should assert that no model message parts match any
supported "thinking" encodings; update the loop that checks parts (in the it
block using modelWithThinking, toolResultMsg and closeToolLoopForThinking) so
hasThinking is true if part.thought === true OR part.type === "thinking" OR
part.type === "redacted_thinking", and assert that hasThinking is false for all
model message parts.

66-146: Add assistant-role parity cases for analyzeConversationState.

analyzeConversationState treats both "model" and "assistant" as model turns, but this section validates only "model". Adding one or two "assistant" variants would prevent regressions in mixed-role providers.

Suggested test additions
 describe("analyzeConversationState", () => {
+  it("treats assistant role as model role", () => {
+    const contents = [
+      userMsg("hello"),
+      { role: "assistant", parts: [{ text: "hi there" }] },
+    ];
+    const state = analyzeConversationState(contents);
+    expect(state.lastModelIdx).toBe(1);
+    expect(state.inToolLoop).toBe(false);
+  });
+
+  it("detects thinking in assistant role messages", () => {
+    const contents = [
+      userMsg("hello"),
+      { role: "assistant", parts: [{ thought: true, text: "thinking..." }] },
+    ];
+    const state = analyzeConversationState(contents);
+    expect(state.lastModelHasThinking).toBe(true);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/plugin/thinking-recovery.test.ts` around lines 66 - 146, Add parity tests
that use role "assistant" where current cases use "model" to ensure
analyzeConversationState treats both roles identically: duplicate a couple of
existing specs (e.g., the simple user→model conversation, the "detects thinking
in last model message", and one tool-loop case) but replace role "model" with
"assistant"; also add one mixed-role case where earlier messages use "model" and
later ones "assistant" to catch regressions in mixed-role providers; reference
analyzeConversationState and the existing test helper patterns like
modelWithThinking/modelWithToolCall to create the new cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/plugin/thinking-recovery.test.ts`:
- Around line 186-199: The test in closeToolLoopForThinking should assert that
no model message parts match any supported "thinking" encodings; update the loop
that checks parts (in the it block using modelWithThinking, toolResultMsg and
closeToolLoopForThinking) so hasThinking is true if part.thought === true OR
part.type === "thinking" OR part.type === "redacted_thinking", and assert that
hasThinking is false for all model message parts.
- Around line 66-146: Add parity tests that use role "assistant" where current
cases use "model" to ensure analyzeConversationState treats both roles
identically: duplicate a couple of existing specs (e.g., the simple user→model
conversation, the "detects thinking in last model message", and one tool-loop
case) but replace role "model" with "assistant"; also add one mixed-role case
where earlier messages use "model" and later ones "assistant" to catch
regressions in mixed-role providers; reference analyzeConversationState and the
existing test helper patterns like modelWithThinking/modelWithToolCall to create
the new cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 76bc6d04-3e6c-40b3-9482-3e8e81482e83

📥 Commits

Reviewing files that changed from the base of the PR and between 09ccf4b and 5b76820.

📒 Files selected for processing (1)
  • src/plugin/thinking-recovery.test.ts
📜 Review details
🔇 Additional comments (1)
src/plugin/thinking-recovery.test.ts (1)

53-321: Strong, high-value test coverage across recovery paths.

This suite exercises the main state transitions and recovery heuristics well, including immutability and boundary conditions.

@giuliovv giuliovv merged commit ce49176 into main Apr 28, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant