Commit 43bed27

Updated prompt for Bedrock
1 parent 404dedd commit 43bed27

1 file changed

Lines changed: 70 additions & 9 deletions

infrastructure/scripts/setup/thread-dump-lambda/src/index.py

@@ -136,14 +136,76 @@ def analyze_thread_dump(thread_dump: str) -> str:
     bedrock = boto3.client("bedrock-runtime", region_name=region_name)
 
     logger.info(f"Using Bedrock in region: {bedrock.meta.region_name}")
-    prompt = f"""Please analyze the following Java thread dump. Your task is to identify performance issues and provide actionable insights. Structure the output into the following four sections:
-
-1. **Summary of Thread States**: Count and categorize all thread states (e.g., RUNNABLE, WAITING).
-2. **Key Issues Identified**: Describe any threads that appear stuck, blocked, or problematic (e.g., deadlocks, high CPU).
-3. **Optimization Recommendations**: Suggest practical improvements based on your findings (e.g., code, configuration, GC tuning).
-4. **Detailed Analysis**: Provide a technical breakdown of the most interesting or problematic threads.
-
-Thread Dump:
+    prompt = f"""You are an expert in Java performance analysis with extensive experience diagnosing production issues.
+
+Analyze the following Java thread dump and return your findings as a comprehensive Markdown document with these sections:
+
+1. **Executive Summary**
+- Provide a concise 2-3 sentence overview of the thread dump health
+- Highlight the most critical issue identified
+- Include an overall system health assessment (Healthy/Degraded/Critical)
+
+2. **Summary of Thread States**
+- Count and list all thread states (RUNNABLE, WAITING, BLOCKED, etc.)
+- Include totals and percentages for each state
+- Present a simple ASCII chart or table showing the distribution
+- Note any unusual state distributions that might indicate problems
+
+3. **Key Issues Identified**
+For each issue, include:
+- Issue description with severity rating (Critical/High/Medium/Low)
+- Confidence level in the diagnosis (High/Medium/Low)
+- Affected threads (count and examples)
+- Potential business impact
+
+Look specifically for:
+- Deadlocks or potential deadlocks
+- Resource contention patterns
+- Threads blocked on synchronization
+- Excessive CPU usage patterns
+- Thread pool saturation
+- Database connection issues
+- I/O bottlenecks
+- Common framework-specific issues (Spring, Hibernate, etc.)
+
+4. **Optimization Recommendations**
+Provide actionable recommendations organized by:
+- Immediate actions (can be implemented quickly with low risk)
+- Short-term improvements (days to implement)
+- Long-term architectural changes (if applicable)
+
+Include for each recommendation:
+- Specific code, configuration, or architectural changes
+- Expected impact level (High/Medium/Low)
+- Implementation complexity (High/Medium/Low)
+
+Consider these areas:
+- Thread pool sizing and configuration
+- Synchronization and locking strategies
+- Database query and connection handling
+- Garbage collection tuning
+- Resource allocation
+- Caching strategies
+- Asynchronous processing opportunities
+
+5. **Detailed Analysis of Critical Threads**
+For the 3-5 most problematic threads:
+- Thread name, ID, and state
+- Relevant stack trace snippet (focus on most important frames)
+- Explanation of why this thread is significant
+- What normal behavior would look like
+- Other threads with similar patterns or related issues
+- Specific code areas that should be investigated
+
+6. **System Context Analysis**
+- Identify patterns across multiple threads suggesting systemic issues
+- Note any evidence of recent garbage collection activity
+- Identify potential memory issues (if detectable from thread patterns)
+- Comment on thread creation patterns and lifecycle management
+
+If the thread dump appears incomplete or insufficient for complete analysis, clearly state this limitation and what additional information would be helpful.
+
+**Thread Dump Input**:
 {thread_dump}
 """
 
@@ -186,7 +248,6 @@ def analyze_thread_dump(thread_dump: str) -> str:
 
     return "Failed to analyze thread dump after multiple retries due to throttling."
 
-
 class ECSClient:
     def __init__(self, cluster_name: str):
         self.cluster_name = cluster_name
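
The retry loop that ends in the "Failed to analyze thread dump after multiple retries due to throttling." return is not part of this diff. As a rough illustration only, a throttling-aware retry with exponential backoff might look like the sketch below. `ThrottledError`, `call_with_backoff`, and the `invoke` callable are hypothetical names, not taken from `index.py`. With boto3, throttling would more likely surface as a `botocore.exceptions.ClientError` whose error code is `ThrottlingException`, which is an assumption about how the real function detects it.

```python
import random
import time
from typing import Callable

FAILURE_MESSAGE = (
    "Failed to analyze thread dump after multiple retries due to throttling."
)


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by the model call."""


def call_with_backoff(
    invoke: Callable[[], str],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call invoke(), retrying on ThrottledError with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return invoke()
        except ThrottledError:
            # Do not sleep after the final failed attempt.
            if attempt < max_retries - 1:
                # Delay doubles each attempt; jitter spreads out retries.
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
                sleep(delay)
    # Mirrors the fallback string returned by analyze_thread_dump in the diff.
    return FAILURE_MESSAGE
```

Passing `sleep` as a parameter is a design convenience: tests can substitute a no-op so the backoff path runs without real delays.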

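
Section 2 of the new prompt asks the model for per-state counts and percentages. A cheap local cross-check of those numbers could be computed directly from the dump, as in this sketch. It assumes the dump is in the usual HotSpot `jstack` format, where each thread has a `java.lang.Thread.State: <STATE>` line. `summarize_thread_states` is a hypothetical helper, not something that exists in `index.py`.

```python
import re
from collections import Counter
from typing import Dict, Tuple

# Matches lines such as "   java.lang.Thread.State: TIMED_WAITING (sleeping)".
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def summarize_thread_states(thread_dump: str) -> Dict[str, Tuple[int, float]]:
    """Return {state: (count, percentage)} for every state found in the dump."""
    counts = Counter(STATE_RE.findall(thread_dump))
    total = sum(counts.values())
    # An empty Counter yields an empty dict, so no division by zero occurs.
    return {
        state: (n, round(100.0 * n / total, 1))
        for state, n in counts.most_common()
    }


sample = '''
"main" #1 prio=5 tid=0x01 nid=0x1 runnable
   java.lang.Thread.State: RUNNABLE
"worker-1" #12 prio=5 tid=0x02 nid=0x2 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-2" #13 prio=5 tid=0x03 nid=0x3 runnable
   java.lang.Thread.State: RUNNABLE
"timer" #14 prio=5 tid=0x04 nid=0x4 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (sleeping)
'''

summary = summarize_thread_states(sample)
```

Threads without a `Thread.State` line (for example some JVM-internal threads) are not counted here, so totals may be lower than the thread count the model reports.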