Commit 3dfd44a

Complete Unit 15.11: Remove OpenTelemetry dependencies and disable OTel in Lambda
- Remove explicit OpenTelemetry dependencies from requirements.txt
- Let Strands SDK manage its own OpenTelemetry transitive dependencies
- Disable OpenTelemetry in Lambda environment variables
- Should resolve StopIteration errors and allow webhook to return HTTP 200

Root cause: CodeRipple doesn't use OpenTelemetry directly, only Strands does

See dev_log/015_troubleshooting_011.md for detailed analysis
1 parent b4d257b commit 3dfd44a

3 files changed

Lines changed: 182 additions & 11 deletions

coderipple/requirements.txt

Lines changed: 2 additions & 6 deletions
@@ -21,12 +21,8 @@ markdown-it-py==3.0.0
 mcp==1.9.3
 mdurl==0.1.2
 mpmath==1.3.0
-opentelemetry-api==1.34.0
-opentelemetry-exporter-otlp-proto-common==1.34.0
-opentelemetry-exporter-otlp-proto-http==1.34.0
-opentelemetry-proto==1.34.0
-opentelemetry-sdk==1.34.0
-opentelemetry-semantic-conventions==0.55b0
+# OpenTelemetry dependencies removed - let Strands SDK manage its own OTel dependencies
+# This prevents version conflicts and StopIteration errors in Lambda
 pillow==11.2.1
 prompt_toolkit==3.0.51
 protobuf==5.29.5

dev_log/015_troubleshooting_011.md

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
# Unit 15.11: Remove OpenTelemetry Dependencies from CodeRipple

**Date:** 2025-06-28
**Status:** COMPLETED
**Type:** Dependency Management & Runtime Error Resolution

## Problem Statement

The Lambda function continues to return HTTP 500 errors caused by OpenTelemetry `StopIteration` exceptions, even after:
- ✅ Downgrading to Python 3.12 (Unit 15.10)
- ✅ Enabling OpenTelemetry configuration (Unit 15.11 attempt)
- ✅ Setting GitHub repository owner variable (Unit 15.12)

CloudWatch logs show persistent OpenTelemetry context loading errors:
```
[ERROR] Failed to load context: contextvars_context, fallback to contextvars_context
Traceback (most recent call last):
  File "/opt/python/opentelemetry/context/__init__.py", line 46, in _load_runtime_context
    return next(  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^
StopIteration
```

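The traceback ends in a bare `next(...)` call, which raises `StopIteration` whenever the iterator it receives is empty. The following is a minimal sketch of that failure mode, not the OpenTelemetry source itself. It assumes (the log does not confirm this) that the lookup iterates over installed package entry points and that none were discoverable in the Lambda layer. The helper names and the `opentelemetry_context` group name used in the usage section are illustrative assumptions.

```python
from importlib.metadata import entry_points


def entry_points_in_group(group):
    """Return the installed entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+ selection API
        return list(eps.select(group=group))
    return list(eps.get(group, []))  # older dict-style result


def load_first(candidates):
    """Mimic the failing pattern: next() on an iterator that may be empty."""
    return next(iter(candidates))


def load_first_or_none(candidates):
    """Same lookup, but an empty iterator yields None instead of raising."""
    return next(iter(candidates), None)


if __name__ == "__main__":
    # Group name is an assumption made for illustration.
    found = entry_points_in_group("opentelemetry_context")
    print(f"{len(found)} entry point(s) found")
    try:
        load_first(found)
    except StopIteration:
        print("no entry points discoverable: next() raised StopIteration")
```

Running this inside the deployed environment would show whether any entry points are visible at all, which helps separate a packaging problem (missing metadata in the layer) from a configuration problem.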
## Root Cause Analysis

### Investigation: Does CodeRipple Actually Need OpenTelemetry?

**Key Discovery:** CodeRipple does NOT use OpenTelemetry directly in its source code.

```bash
# Search for OpenTelemetry usage in CodeRipple source
find ./coderipple/src -name "*.py" -exec grep -l "from opentelemetry\|import opentelemetry" {} \;
# Result: No files found

grep -r "opentelemetry" ./coderipple/src/
# Result: No OpenTelemetry usage found in CodeRipple source code
```

**Dependency Analysis:**
- **CodeRipple source:** No direct OpenTelemetry imports or usage
- **Strands SDK:** Requires OpenTelemetry as a transitive dependency
- **Current requirements.txt:** Explicit OpenTelemetry dependencies (unnecessary)

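The `grep` search above matches text, so it can also hit comments or strings. As a complementary check, the sketch below parses each file and looks only at real import statements. The function names are hypothetical, and the `./coderipple/src` path is taken from the commands above.

```python
import ast
from pathlib import Path


def imports_package(source, package="opentelemetry"):
    """Return True if `source` imports `package` or one of its submodules."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) cannot name a top-level package.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        if any(n == package or n.startswith(package + ".") for n in names):
            return True
    return False


def files_importing(root, package="opentelemetry"):
    """List the .py files under `root` that import `package`."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*.py")
        if imports_package(path.read_text(encoding="utf-8"), package)
    )


if __name__ == "__main__":
    print(files_importing("./coderipple/src"))
```

An empty list from `files_importing` would agree with the finding that CodeRipple has no direct OpenTelemetry imports.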
### The Real Issue

OpenTelemetry dependencies in `requirements.txt` were:
1. **Not needed by CodeRipple directly**
2. **Potentially conflicting with Strands SDK's own OpenTelemetry requirements**
3. **Causing StopIteration errors in the Lambda runtime**

## Solution Implementation

### Step 1: Remove Explicit OpenTelemetry Dependencies

**File:** `coderipple/requirements.txt`

**Removed dependencies:**
```python
# REMOVED - Let Strands SDK manage its own OpenTelemetry dependencies
opentelemetry-api==1.34.0
opentelemetry-exporter-otlp-proto-common==1.34.0
opentelemetry-exporter-otlp-proto-http==1.34.0
opentelemetry-proto==1.34.0
opentelemetry-sdk==1.34.0
opentelemetry-semantic-conventions==0.55b0
```

**Replaced with:**
```python
# OpenTelemetry dependencies removed - let Strands SDK manage its own OTel dependencies
# This prevents version conflicts and StopIteration errors in Lambda
```

### Step 2: Disable OpenTelemetry in Lambda Environment

**File:** `infra/terraform/functions.tf`

**Updated configuration:**
```hcl
# OpenTelemetry configuration - disabled because CodeRipple doesn't use it directly
# Only Strands SDK requires it, but we can disable it to prevent StopIteration errors
OTEL_SDK_DISABLED = "true"
OTEL_TRACES_EXPORTER = "none"
OTEL_METRICS_EXPORTER = "none"
OTEL_LOGS_EXPORTER = "none"
```

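The four variables in Step 2 only help if they actually reach the function's environment. The sketch below is a hypothetical startup check (`otel_disabled` is not part of CodeRipple or Strands) that reports whether the environment matches the configuration above. It assumes the value `"true"` is compared case-insensitively.

```python
import os

# Exporter variables set to "none" in the Terraform configuration above.
EXPORTER_VARS = (
    "OTEL_TRACES_EXPORTER",
    "OTEL_METRICS_EXPORTER",
    "OTEL_LOGS_EXPORTER",
)


def otel_disabled(env=None):
    """Return True if the SDK is disabled and every exporter is set to "none"."""
    env = os.environ if env is None else env
    sdk_off = env.get("OTEL_SDK_DISABLED", "").strip().lower() == "true"
    exporters_off = all(
        env.get(name, "").strip().lower() == "none" for name in EXPORTER_VARS
    )
    return sdk_off and exporters_off


if __name__ == "__main__":
    print("OpenTelemetry disabled:", otel_disabled())
```

Logging this result once at cold start would make a misconfigured deployment visible in CloudWatch before any webhook is processed.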
## Technical Rationale

### Why This Approach Works

1. **Dependency Separation:** Let Strands SDK manage its own OpenTelemetry version requirements
2. **Version Conflict Prevention:** Avoid explicit version pinning that might conflict with Strands
3. **Runtime Disabling:** Disable OpenTelemetry at runtime to prevent context loading errors
4. **Minimal Impact:** CodeRipple functionality unaffected since it doesn't use OpenTelemetry directly

### Dependency Management Strategy

**Before (Problematic):**
```
CodeRipple requirements.txt
├── opentelemetry-api==1.34.0 (explicit)
├── opentelemetry-sdk==1.34.0 (explicit)
└── strands-agents==0.1.6
    └── opentelemetry-* (transitive, potentially different versions)
```

**After (Clean):**
```
CodeRipple requirements.txt
└── strands-agents==0.1.6
    └── opentelemetry-* (transitive, managed by Strands)
```

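To keep explicit OpenTelemetry pins from creeping back into `requirements.txt`, a small check can flag them. The sketch below is a simplified parser written for illustration: `pinned_packages` is a hypothetical helper, and it only handles plain `name==version` lines, not the full requirements file syntax.

```python
def pinned_packages(requirements_text, prefix="opentelemetry-"):
    """Return {name: version} for `name==version` pins whose name starts with `prefix`."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        version = version.split(";", 1)[0].strip()  # drop environment markers
        if name.strip().lower().startswith(prefix):
            pins[name.strip()] = version
    return pins


if __name__ == "__main__":
    before = "opentelemetry-api==1.34.0\nopentelemetry-sdk==1.34.0\nstrands-agents==0.1.6\n"
    after = "# OpenTelemetry dependencies removed\nstrands-agents==0.1.6\n"
    print(pinned_packages(before))
    print(pinned_packages(after))
```

Wiring this into CI as a failing check whenever the result is non-empty is one option. Whether that is worth the maintenance cost is a judgment call for the project.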
## Expected Results

### Lambda Function Behavior
- **No more StopIteration errors** from OpenTelemetry context loading
- **Successful Strands SDK initialization** with its own OpenTelemetry dependencies
- **HTTP 200 webhook responses** instead of HTTP 500 errors
- **Proper CodeRipple agent execution** without OpenTelemetry interference

### Deployment Impact
- **Smaller Lambda package** - fewer explicit dependencies
- **Faster cold starts** - less dependency resolution overhead
- **Better compatibility** - no version conflicts between explicit and transitive dependencies

## Validation Steps

### 1. Commit Changes
```bash
git add infra/terraform/functions.tf coderipple/requirements.txt
git commit -m "Unit 15.11: Remove OpenTelemetry dependencies and disable OTel in Lambda"
```

### 2. Deploy and Test
- Deploy via GitHub Actions pipeline
- Test webhook endpoint: `curl -X POST https://API_ID.execute-api.us-east-1.amazonaws.com/prod/webhook`
- Expected: HTTP 200 response instead of HTTP 500

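The webhook test can also be scripted so the status code is checked rather than read by eye. This is a sketch using only the standard library. The helper names are hypothetical, the JSON body is an assumption (the `curl` command above sends no body), and `API_ID` remains a placeholder that must be filled in.

```python
import json
import urllib.error
import urllib.request


def build_webhook_request(url, payload):
    """Build a JSON POST request for the webhook endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def post_status(request, timeout=30):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Usage (replace API_ID with the deployed API Gateway ID before running):
# url = "https://API_ID.execute-api.us-east-1.amazonaws.com/prod/webhook"
# print(post_status(build_webhook_request(url, {})))
```

Catching `HTTPError` matters here because `urlopen` raises on a 500 response, and the status code is exactly what this validation step needs to see.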
### 3. Monitor CloudWatch Logs
- Should see no more OpenTelemetry StopIteration errors
- Should see successful Strands agent initialization
- Should see proper CodeRipple webhook processing

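Once log lines have been exported from CloudWatch (how they are fetched is outside this sketch), the first check can be automated by scanning for the two markers seen in the original traceback. `otel_error_lines` and `ERROR_MARKERS` are hypothetical names.

```python
# Substrings taken from the CloudWatch error shown in the Problem Statement.
ERROR_MARKERS = ("StopIteration", "Failed to load context")


def otel_error_lines(log_lines):
    """Return the log lines that contain any of the known error markers."""
    return [
        line for line in log_lines
        if any(marker in line for marker in ERROR_MARKERS)
    ]


if __name__ == "__main__":
    sample = [
        "[ERROR] Failed to load context: contextvars_context, fallback to contextvars_context",
        "StopIteration",
        "[INFO] webhook processed",
    ]
    for line in otel_error_lines(sample):
        print(line)
```

An empty result over the logs of a fresh deployment would satisfy the first bullet above. The other two bullets depend on log messages whose exact wording is not shown here.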
## Lessons Learned

### Dependency Management Best Practices
1. **Only include direct dependencies** in requirements.txt
2. **Let libraries manage their own transitive dependencies**
3. **Avoid version pinning unless specifically required**
4. **Investigate actual usage before adding dependencies**

### Troubleshooting Approach
1. **Question assumptions** - "Does CodeRipple actually need OpenTelemetry?"
2. **Analyze source code** - Verify actual usage vs. listed dependencies
3. **Understand dependency relationships** - Direct vs. transitive dependencies
4. **Test minimal configurations** - Remove unnecessary complexity

## Integration with Previous Units

**Unit 15.10 (Python 3.12):** ✅ Still valid - Python 3.12 provides better OpenTelemetry compatibility when needed
**Unit 15.12 (GitHub Repo Owner):** ✅ Still needed - CodeRipple requires repository information
**Unit 15.11 (This Unit):** ✅ Addresses the core OpenTelemetry StopIteration issue

## Success Criteria

- [ ] Lambda function deploys successfully without OpenTelemetry dependency conflicts
- [ ] Webhook returns HTTP 200 instead of HTTP 500
- [ ] CloudWatch logs show no OpenTelemetry StopIteration errors
- [ ] Strands SDK initializes successfully with its own OpenTelemetry dependencies
- [ ] CodeRipple agents execute properly for webhook processing

**Status:** Implementation complete, awaiting deployment validation.

infra/terraform/functions.tf

Lines changed: 6 additions & 5 deletions
@@ -73,11 +73,12 @@ resource "aws_lambda_function" "coderipple_orchestrator" {
       CODERIPPLE_DEPENDENCIES_LAYER = aws_lambda_layer_version.coderipple_dependencies.arn
       CODERIPPLE_PACKAGE_LAYER = aws_lambda_layer_version.coderipple_package.arn

-      # OpenTelemetry configuration - enabled for Python 3.12 compatibility with Strands SDK
-      OTEL_SDK_DISABLED = "false"
-      OTEL_TRACES_EXPORTER = "otlp"
-      OTEL_METRICS_EXPORTER = "otlp"
-      OTEL_LOGS_EXPORTER = "otlp"
+      # OpenTelemetry configuration - disabled because CodeRipple doesn't use it directly
+      # Only Strands SDK requires it, but we can disable it to prevent StopIteration errors
+      OTEL_SDK_DISABLED = "true"
+      OTEL_TRACES_EXPORTER = "none"
+      OTEL_METRICS_EXPORTER = "none"
+      OTEL_LOGS_EXPORTER = "none"
     }
   }
