Commit 2fe643c

Copilot and owndev committed
Add comprehensive test scenarios for code interpreter
Co-authored-by: owndev <69784886+owndev@users.noreply.github.com>
1 parent 4efcd79 commit 2fe643c

1 file changed

Lines changed: 182 additions & 0 deletions

TEST_SCENARIOS.md

@@ -0,0 +1,182 @@
# Test Scenarios for Code Interpreter Fix

## Prerequisites
1. Open WebUI is running with the updated Gemini pipeline
2. Code interpreter feature is enabled in Open WebUI settings
3. Google API key or Vertex AI credentials are configured
4. A Gemini model is selected (e.g., gemini-2.5-pro, gemini-2.0-flash)

## Test Scenario 1: Simple Code Execution

**Prompt:**
```
Write and execute Python code to calculate pi to 10 decimal places using the Leibniz formula.
```

**Expected Behavior:**
- Gemini should respond with an explanation of the code
- Code should execute automatically
- Results should be displayed showing pi ≈ 3.1415926536 (the Leibniz series converges slowly, so the printed value may match fewer digits unless the code sums a very large number of terms)

**What to Check:**
- No repeated text (the original bug symptom)
- Code is displayed in a code block
- Execution results are shown
- No errors in console logs

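
When judging the value shown in Scenario 1, it helps to know how many digits a given number of terms can deliver. The following is a minimal reference sketch for the tester, not the code Gemini is expected to produce. The names `leibniz_pi` and `error_bound` and the term counts are illustrative assumptions. It relies on the alternating-series bound: after `n` terms the error is at most `4 / (2n + 1)`.

```python
import math


def leibniz_pi(terms: int) -> float:
    """Approximate pi with the first `terms` terms of the Leibniz series."""
    total = 0.0
    for k in range(terms):
        # Terms alternate in sign: 1 - 1/3 + 1/5 - 1/7 + ...
        total += (-1.0) ** k / (2 * k + 1)
    return 4.0 * total


def error_bound(terms: int) -> float:
    """Alternating-series bound: the error is at most the first omitted term."""
    return 4.0 / (2 * terms + 1)


if __name__ == "__main__":
    for n in (1_000, 100_000):
        approx = leibniz_pi(n)
        print(f"{n} terms: {approx:.10f} (error <= {error_bound(n):.1e})")
    print(f"math.pi:  {math.pi:.10f}")
```

Because the error shrinks roughly like `1/n`, matching all ten decimals would take on the order of 10^10 terms. A response that is accurate to fewer digits is therefore not necessarily a pipeline failure.
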
## Test Scenario 2: Data Visualization

**Prompt:**
```
Create a simple bar chart showing the first 5 Fibonacci numbers using matplotlib.
```

**Expected Behavior:**
- Code generates a bar chart
- Chart is displayed in the response
- No repeated text errors

**What to Check:**
- Code executes successfully
- Image/chart is visible
- Proper error handling if matplotlib isn't available

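
For comparison with whatever Gemini generates in Scenario 2, here is a rough sketch of a chart script that also degrades cleanly when matplotlib is missing. It assumes the sequence starts `1, 1` (the prompt does not say whether to start at 0 or 1), and the function names and output path are hypothetical.

```python
def first_fibonacci(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers, starting 1, 1 (an assumption)."""
    values = []
    a, b = 1, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values


def plot_fibonacci(path: str = "fibonacci.png") -> bool:
    """Save a bar chart to `path`; return False if matplotlib is unavailable."""
    try:
        import matplotlib
        matplotlib.use("Agg")  # headless backend, no display required
        import matplotlib.pyplot as plt
    except ImportError:
        return False

    values = first_fibonacci(5)
    fig, ax = plt.subplots()
    ax.bar(range(1, len(values) + 1), values)
    ax.set_xlabel("n")
    ax.set_ylabel("F(n)")
    ax.set_title("First 5 Fibonacci numbers")
    fig.savefig(path)
    plt.close(fig)
    return True
```

Returning `False` instead of raising mirrors the "proper error handling" check: the missing dependency is reported without crashing the run.
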
## Test Scenario 3: Error Handling

**Prompt:**
```
Execute this Python code: print(1/0)
```

**Expected Behavior:**
- Code attempts to execute
- Division by zero error is caught and displayed
- Error message is clear and doesn't break the UI

**What to Check:**
- Error is handled gracefully
- No system crash or hung requests
- Error message is visible to user

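
To know what error text to look for in Scenario 3, the snippet can be run locally. The harness below is a hypothetical illustration (`run_snippet` is not part of Open WebUI or the pipeline, and it does not model how Open WebUI runs code). It only shows the exception type and message that standard Python produces for `1/0`.

```python
import traceback


def run_snippet(source: str) -> dict:
    """Execute `source` and report either success or a captured error."""
    try:
        exec(source, {})
    except Exception as exc:
        return {
            "ok": False,
            "error_type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }
    return {"ok": True}


result = run_snippet("print(1/0)")
print(result["error_type"], "-", result["message"])
# prints: ZeroDivisionError - division by zero
```

A passing run of the scenario should surface this `ZeroDivisionError` text somewhere in the response rather than an empty or truncated result.
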
## Test Scenario 4: Multi-turn Conversation

**Prompt 1:**
```
Create a Python function to calculate factorial of a number.
```

**Prompt 2:**
```
Now use that function to calculate factorial of 10.
```

**Expected Behavior:**
- First response creates and shows the function
- Second response uses the function and shows result (3628800)
- Context is maintained between turns

**What to Check:**
- Multi-turn context works correctly
- Variables/functions from previous turns are available
- No repeated text issues

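
The expected result for Scenario 4 can be confirmed independently. This is one possible shape for the function from the first turn, written as a sketch. Gemini may well produce a recursive or `math.factorial`-based version instead.

```python
import math


def factorial(n: int) -> int:
    """Iterative factorial; raises ValueError for negative input."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


print(factorial(10))  # prints: 3628800
```
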
## Test Scenario 5: Complex Calculation

**Prompt:**
```
Calculate the first 20 prime numbers using Python.
```

**Expected Behavior:**
- Code is generated and executed
- List of primes is displayed: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

**What to Check:**
- Complex logic executes correctly
- Results are accurate
- No timeout errors

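
To check the accuracy of the Scenario 5 output without eyeballing the list, a small reference implementation is enough. The sketch below uses trial division by previously found primes. The name `first_primes` is an illustrative choice, and the generated code may use a different method such as a sieve.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` primes using trial division."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # Only primes up to sqrt(candidate) need to be tested as divisors.
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


print(first_primes(20))
```
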
## Test Scenario 6: Streaming Mode

Enable streaming responses and test:

**Prompt:**
```
Generate a simple "Hello, World!" program in Python and execute it.
```

**Expected Behavior:**
- Response streams in real-time
- Code execution still works
- No "chunk too big" errors

**What to Check:**
- Streaming works smoothly
- Tool calls are detected in streaming mode
- No response corruption

## Test Scenario 7: Non-Streaming Mode

Disable streaming responses and test:

**Prompt:**
```
Calculate the sum of numbers from 1 to 100 using Python.
```

**Expected Behavior:**
- Response arrives all at once
- Code executes and shows result (5050)
- No repeated text

**What to Check:**
- Non-streaming mode works
- Tool calls are detected and emitted
- Format is correct

## Debugging Tips

If tests fail, check:

1. **Browser Console**: Look for JavaScript errors or failed API calls
2. **Open WebUI Logs**: Check for Python exceptions or warnings
3. **Network Tab**: Inspect the API request/response format
4. **Event Emitter**: Verify events are being emitted correctly

Key indicators of success:
- ✅ No repeated text in responses
- ✅ Code blocks are properly formatted
- ✅ Execution results are displayed
- ✅ Tool call events appear in logs
- ✅ Format matches OpenAI tool call structure

Key indicators of issues:
- ❌ Text repeats multiple times
- ❌ Code doesn't execute
- ❌ "function_call" errors in logs
- ❌ Missing tool_calls in API response
- ❌ Malformed JSON in arguments

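
Two of the indicators above (OpenAI tool call structure, malformed JSON in arguments) can be checked mechanically on a payload copied from the Network tab. The sketch below assumes the OpenAI Chat Completions shape, where each entry in `tool_calls` has `id`, `type: "function"`, and a `function` object whose `arguments` field is a JSON-encoded string. The helper name `validate_tool_call`, the id `call_1`, and the function name `execute_code` are hypothetical.

```python
import json


def validate_tool_call(tool_call: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    if not isinstance(tool_call.get("id"), str):
        problems.append("missing or non-string 'id'")
    if tool_call.get("type") != "function":
        problems.append("'type' should be 'function'")

    function = tool_call.get("function")
    if not isinstance(function, dict):
        problems.append("missing 'function' object")
        return problems

    name = function.get("name")
    if not isinstance(name, str) or not name:
        problems.append("missing function 'name'")

    arguments = function.get("arguments")
    if not isinstance(arguments, str):
        problems.append("'arguments' should be a JSON-encoded string")
    else:
        try:
            json.loads(arguments)
        except json.JSONDecodeError as exc:
            problems.append(f"malformed JSON in arguments: {exc.msg}")
    return problems


example = {
    "id": "call_1",  # hypothetical id
    "type": "function",
    "function": {
        "name": "execute_code",  # hypothetical tool name
        "arguments": json.dumps({"code": "print(1)"}),
    },
}
print(validate_tool_call(example))  # prints: []
```
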
## Log Monitoring

Watch for these log messages (set log level to DEBUG):

**Success indicators:**
```
Detected tool call: <function_name> with args: <args>
Emitted tool call: <function_name> with args: <args>
```

**Error indicators:**
```
Error processing content part: ...
Failed to access content parts: ...
```

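
Scanning a long DEBUG log by eye is error-prone, so the messages listed above can be matched with a small script. This is a sketch under assumptions: the patterns are taken from the message prefixes in this document, while the surrounding log format (level prefixes, timestamps) and the sample lines are hypothetical.

```python
import re

# Prefixes copied from the success and error indicators listed above.
SUCCESS_PATTERNS = [
    re.compile(r"Detected tool call: \S+ with args: "),
    re.compile(r"Emitted tool call: \S+ with args: "),
]
ERROR_PATTERNS = [
    re.compile(r"Error processing content part: "),
    re.compile(r"Failed to access content parts: "),
]


def classify_log_lines(lines):
    """Group log lines into success and error indicators; ignore the rest."""
    summary = {"success": [], "error": []}
    for line in lines:
        if any(p.search(line) for p in SUCCESS_PATTERNS):
            summary["success"].append(line.strip())
        elif any(p.search(line) for p in ERROR_PATTERNS):
            summary["error"].append(line.strip())
    return summary


sample = [  # hypothetical log lines
    "DEBUG Detected tool call: execute_code with args: {'code': 'print(1)'}",
    "INFO request finished",
    "ERROR Failed to access content parts: boom",
]
summary = classify_log_lines(sample)
print(len(summary["success"]), "success,", len(summary["error"]), "error")
```
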
## Comparison with Azure Pipeline

To verify the fix matches Azure's behavior, test the same prompts with both:
1. Azure AI pipeline (known working)
2. Gemini pipeline (after fix)

Both should execute code successfully and show results.
