| 1 | +# perf-analyzer |
| 2 | + |
| 3 | +The brain of the agentic performance platform. Receives triggers (developer |
| 4 | +on-demand and Grafana webhook), drives the `perf-collector` to dump JFR and |
| 5 | +thread snapshots, queries Pyroscope for the top hot functions, calls Amazon |
| 6 | +Bedrock via Spring AI with source-code grounding, and writes a Markdown |
| 7 | +report to S3 alongside the raw artifacts. |
| 8 | + |
| 9 | +The full deployment walkthrough lives in the workshop content at |
| 10 | +`content/analysis/perf-platform/analyzer/`. This README covers only the |
| 11 | +build and the service contract. |
| 12 | + |
| 13 | +## Build |
| 14 | + |
| 15 | +```bash |
| 16 | +cd apps/perf-analyzer |
| 17 | +mvn compile jib:build -Dimage="${ECR_URI}/perf-analyzer:latest" |
| 18 | +``` |
| 19 | + |
| 20 | +The image is multi-arch (`linux/amd64` + `linux/arm64`) and uses an Amazon Corretto 25 JRE base. |
| 21 | + |
| 22 | +## Endpoints |
| 23 | + |
| 24 | +| Method | Path | Purpose | |
| 25 | +|--------|------|---------| |
| 26 | +| POST | `/api/v1/analyze` | Developer on-demand. Body: `{service, platform, pod or task, reason}` (`pod` on EKS, `task` on ECS). | |
| 27 | +| POST | `/api/v1/grafana-webhook` | Grafana alert payload; one analysis per firing alert. | |
| 28 | +| GET | `/actuator/health` | Liveness + readiness probes. | |
| 29 | +| GET | `/actuator/prometheus` | Metrics. | |
| 30 | + |
| 31 | +Both `/api/v1/*` endpoints return `202 Accepted` with |
| 32 | +`{analysisId, s3Prefix}`. Analysis runs asynchronously on a virtual thread. |
| 33 | + |
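A minimal sketch of an on-demand trigger, assuming the body is plain JSON with the field names from the endpoints table. The base URL, the `eks` platform value, and the sample field values are hypothetical, and the class name `AnalyzeRequestSketch` is for illustration only.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class AnalyzeRequestSketch {
    // Builds (but does not send) a POST to /api/v1/analyze.
    // Values are inserted without JSON escaping, so this is only safe for
    // simple identifiers; a real client would use a JSON library.
    static HttpRequest build(String baseUrl, String service, String platform,
                             String pod, String reason) {
        String body = String.format(
            "{\"service\":\"%s\",\"platform\":\"%s\",\"pod\":\"%s\",\"reason\":\"%s\"}",
            service, platform, pod, reason);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/analyze"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical host and target names.
        HttpRequest request = build("http://localhost:8080",
            "unicorn-store-spring", "eks", "unicorn-store-spring-abc123", "high CPU");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` should yield `202 Accepted` with `{analysisId, s3Prefix}`; the report appears under that prefix once the asynchronous analysis finishes.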
| 34 | +## Environment variables |
| 35 | + |
| 36 | +| Name | Required | Description | |
| 37 | +|------|----------|-------------| |
| 38 | +| `AWS_REGION` | yes | AWS Region for Bedrock, S3, ECS SDK clients. | |
| 39 | +| `AWS_S3_BUCKET` | yes | Workshop bucket (SSM `workshop-bucket-name`). | |
| 40 | +| `PYROSCOPE_URL` | yes | `http://pyroscope.monitoring:4040` on EKS. | |
| 41 | +| `SPRING_AI_BEDROCK_CONVERSE_CHAT_OPTIONS_MODEL` | yes | Claude Sonnet 4.6 model id. | |
| 42 | +| `SPRING_AI_BEDROCK_CONVERSE_CHAT_OPTIONS_MAX_TOKENS` | no | Default 10000. | |
| 43 | +| `GITHUB_REPO_URL` | no | `https://api.github.com/repos/aws-samples/java-on-aws`. | |
| 44 | +| `GITHUB_REPO_PATH` | no | `apps/unicorn-store-spring`. | |
| 45 | +| `GITHUB_TOKEN` | no | GitHub PAT for private repos. | |
| 46 | + |
| 47 | +## Flow |
| 48 | + |
| 49 | +``` |
| 50 | +POST /api/v1/analyze or /api/v1/grafana-webhook |
| 51 | + │ |
| 52 | + ▼ |
| 53 | +AnalysisService.submit ──► 202 {analysisId, s3Prefix} |
| 54 | + │ (virtual thread) |
| 55 | + ▼ |
| 56 | +Resolve collector URL |
| 57 | + - EKS: pod → nodeName → DaemonSet pod IP on that node (K8s API) |
| 58 | + - ECS: task → DescribeTasks → sidecar ENI private IP |
| 59 | + │ |
| 60 | + ▼ |
| 61 | +Three parallel lanes (virtual threads): |
| 62 | + a. POST /dump{jfr} → poll S3 → JfrParser.formatForModel |
| 63 | + b. POST /dump{threaddump} → poll S3 → first 200 lines |
| 64 | + c. PyroscopeTool.topFunctions (pre-fetched canonical prompt section) |
| 65 | + │ |
| 66 | + ▼ |
| 67 | +AiService builds layered prompt, calls Bedrock via Spring AI ChatClient |
| 68 | + │ |
| 69 | + ▼ |
| 70 | +S3 writes: request.json, events.md, threaddump.json, analysis.md |
| 71 | +``` |
| 72 | + |
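The "three parallel lanes" step in the flow can be sketched with the JDK's virtual-thread executor (Java 21 or later). This is an illustration of the pattern, not the actual `AnalysisService` code: the class name `ParallelLanesSketch` is hypothetical and the lanes are stand-in lambdas rather than real collector or Pyroscope calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLanesSketch {
    // Runs each lane on its own virtual thread and returns results in lane order.
    static List<String> runLanes(List<Callable<String>> lanes) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every lane has completed or failed.
            for (Future<String> future : executor.invokeAll(lanes)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lanes", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a lane failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Placeholder lanes standing in for the JFR, thread dump, and Pyroscope work.
        List<Callable<String>> lanes = List.of(
            () -> "jfr summary",
            () -> "thread dump excerpt",
            () -> "pyroscope top functions");
        System.out.println(runLanes(lanes));
    }
}
```

Because `invokeAll` preserves submission order, the prompt builder can rely on fixed positions for each lane's output. A real implementation would also need per-lane timeouts for the S3 polling, which this sketch omits.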
| 73 | +## Spring AI tools (@Tool) |
| 74 | + |
| 75 | +- **PyroscopeTool** — top-level `@Component`. Always registered. Used both |
| 76 | + by `AnalysisService` to pre-fetch the canonical Pyroscope section and by |
| 77 | + the model to request additional windows/labels during reasoning. |
| 78 | +- **GitHubSourceCodeTool** — nested package-private class in `AiService`. |
| 79 | + Instantiated only when `GITHUB_REPO_URL` is set. Lets the model look up |
| 80 | + methods from stacks and cite file paths and line numbers in findings. |
| 81 | + |
| 82 | +## S3 layout |
| 83 | + |
| 84 | +``` |
| 85 | +perf-platform/ |
| 86 | + analysis/{platform}/{service}/{target}/{YYYYMMDD-HHMMSS-hex}/ |
| 87 | + request.json # normalized AnalysisRequest |
| 88 | + events.md # JfrParser Markdown (model input, captured) |
| 89 | + threaddump.json # raw Thread.print wrapped in JSON |
| 90 | + analysis.md # Markdown report (model output) |
| 91 | + profiling/{platform}/{service}/{target}/ |
| 92 | + dump-{jobId}.jfr # collector drop consumed by the JFR lane |
| 93 | + dump-{jobId}.json # thread dump drop consumed by the thread-dump lane |
| 94 | +``` |
| 95 | + |
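As a rough illustration of the analysis prefix in the layout above, the following sketch assembles `perf-platform/analysis/{platform}/{service}/{target}/{YYYYMMDD-HHMMSS-hex}/`. The UTC time zone, the length of the hex suffix, and the class and method names are assumptions; the README specifies only the overall shape.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class S3PrefixSketch {
    // Assumption: timestamps are rendered in UTC.
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss").withZone(ZoneOffset.UTC);

    // Builds the analysis prefix; `hex` is a caller-supplied suffix of unspecified length.
    static String analysisPrefix(String platform, String service, String target,
                                 Instant at, String hex) {
        return String.format("perf-platform/analysis/%s/%s/%s/%s-%s/",
            platform, service, target, STAMP.format(at), hex);
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration.
        System.out.println(analysisPrefix("eks", "unicorn-store-spring", "pod-1",
            Instant.parse("2025-01-02T03:04:05Z"), "a1b2c3"));
    }
}
```

Under these assumptions, the objects `request.json`, `events.md`, `threaddump.json`, and `analysis.md` would be written as keys directly beneath the returned prefix.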
| 96 | +## JFR events extracted |
| 97 | + |
| 98 | +`jdk.ExecutionSample`, `jdk.CPULoad`, `jdk.GCHeapSummary`, `jdk.JVMInformation`, |
| 99 | +`jdk.GCPhasePause`, `jdk.Compilation`, `jdk.Deoptimization`, |
| 100 | +`jdk.JavaMonitorEnter`, `jdk.SafepointBegin`, `jdk.ContainerConfiguration`. |
| 101 | + |
| 102 | +Each event type is reduced to top-5 aggregates for the model input; raw events are not sent. |
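One way to picture a top-5 aggregate is a frequency count over a per-event key (for example, the top stack frame of each `jdk.ExecutionSample`), keeping only the most frequent entries. The sketch below shows that general technique on plain strings; it does not reproduce `JfrParser`, and the choice of key, the tie-breaking rule, and the class name are assumptions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class TopAggregatesSketch {
    // Counts occurrences of each key and returns the n most frequent,
    // breaking ties alphabetically so the output is deterministic.
    static List<Map.Entry<String, Long>> topN(List<String> keys, int n) {
        return keys.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Long>comparingByKey()))
            .limit(n)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical top frames taken from execution samples.
        List<String> frames = List.of(
            "UnicornService.create", "JsonMapper.write", "UnicornService.create",
            "HikariPool.getConnection", "UnicornService.create", "JsonMapper.write");
        topN(frames, 5).forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Sending a handful of aggregates instead of raw events keeps the prompt small and stable in size regardless of how long the recording ran.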