Move docs to docs/ folder and expand contributing guide

ChengaDev · claude · ChengaDev · commit d668379ae0dd · 2026-04-02T00:51:19.000+03:00
- Move WHY_OPSAGENT.md, EXAMPLES.md, RUNNING_LOCALLY.md → docs/, renamed to
  lowercase kebab-case (why-opsagent.md, examples.md, running-locally.md)
- Update README nav links to docs/ paths
- CONTRIBUTING.md: add "most wanted" section with notification channel
  wishlist (PagerDuty, Teams, Discord, Opsgenie, Datadog, Email, Telegram),
  a step-by-step guide for adding a new channel, and a list of common log-pattern gaps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
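
Note on the new-channel guide: the steps this commit adds to CONTRIBUTING.md (add a tool function, take the URL as a parameter, POST the payload, return a plain status string) could look roughly like the sketch below for Discord. This is a minimal illustration, not code from the repository. The names `build_discord_payload` and `send_discord_notification` and the injectable `post` argument are assumptions made for the example, and the `@mcp.tool()` decorator is left out so the snippet runs without the MCP package. The embed shape and the 256 / 4096 character caps follow Discord's documented webhook limits.

```python
from typing import Callable, Optional

DISCORD_TITLE_LIMIT = 256          # Discord caps embed titles at 256 characters
DISCORD_DESCRIPTION_LIMIT = 4096   # and embed descriptions at 4096 characters


def build_discord_payload(title: str, summary: str) -> dict:
    """Build a Discord webhook body containing a single embed."""
    return {
        "embeds": [
            {
                "title": title[:DISCORD_TITLE_LIMIT],
                "description": summary[:DISCORD_DESCRIPTION_LIMIT],
            }
        ]
    }


def send_discord_notification(
    webhook_url: str,
    title: str,
    summary: str,
    post: Optional[Callable[..., object]] = None,
) -> str:
    """Hypothetical tool body; in the repo it would carry @mcp.tool()."""
    if not webhook_url:
        return "✗ Error: webhook URL is empty"
    if post is None:
        # Imported lazily so the payload helper works without httpx installed.
        import httpx
        post = httpx.post
    try:
        response = post(
            webhook_url,
            json=build_discord_payload(title, summary),
            timeout=10.0,
        )
        response.raise_for_status()
    except Exception as exc:
        # Any transport or HTTP failure is reported as a plain status string.
        return f"✗ Error: {exc}"
    return "✓ Sent"
```

Passing `post` explicitly is only a convenience for tests that should not touch the network. The guide itself suggests mocking the `httpx` call with `respx`, which would make that parameter unnecessary.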
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,6 +1,6 @@
 # Contributing to OpsAgent
 
-Contributions are welcome. Please open an issue first for anything beyond a small bug fix so we can align on direction before you invest time in a PR.
+Contributions are very welcome — OpsAgent is a young project and there is a lot of room to grow. Please open an issue first for anything beyond a small bug fix so we can align on direction before you invest time in a PR.
 
 ## Requirements for every PR
 
@@ -18,11 +18,51 @@ pip install -e ".[all-providers]"
 pytest tests/ -v
 ```
 
-## Good first areas
+---
 
-- **New MCP servers** — Jira, PagerDuty, Datadog log fetcher, `kubectl` live pod state
-- **New log patterns** — add to `_PATTERNS` in `mcp_tools/log_analyzer.py` with a matching fixture and test
-- **Streaming output** — stream Claude's reasoning in real time
+## Most wanted contributions
+
+### 🔔 New notification channels
+
+This is the highest-impact area right now. OpsAgent currently supports Slack, generic webhooks, and GitHub PR comments. We'd love to add:
+
+| Channel | Notes |
+|---|---|
+| **PagerDuty** | Create an incident via the Events API v2 |
+| **Microsoft Teams** | Adaptive Card payload via Incoming Webhook |
+| **Discord** | Embed payload via Discord webhook |
+| **Opsgenie** | Create alert via Opsgenie REST API |
+| **Datadog** | Post event to Datadog Events API |
+| **Email** | SMTP or SendGrid for direct email delivery |
+| **Telegram** | Bot API message to a chat or channel |
+
+Each channel lives in `mcp_tools/notification_server.py` as a new MCP tool. Follow the pattern of `send_slack_notification`: accept a webhook URL or token as a parameter (callers read it from an env var), build the payload, send it, and return a success/error string.
+
+### 🔍 New log patterns
+
+Add to `_PATTERNS` in `mcp_tools/log_analyzer.py` with a matching fixture log and test. Common gaps:
+
+- Ruby / Bundler errors
+- Gradle / Maven build failures
+- Go module errors
+- Rust / Cargo compilation errors
+
+### 🛠️ New MCP servers
+
+- **Jira** — create or update a ticket from the RCA
+- **Datadog** — fetch recent logs or metrics for a service
+- **`kubectl`** — live pod state, describe, events
+
+---
+
+## Adding a new notification channel
+
+1. Add a new `@mcp.tool()` function in `mcp_tools/notification_server.py`
+2. Accept the target URL / token as a parameter (callers pass it from env)
+3. Build the channel-specific payload and POST it with `httpx`
+4. Return a plain string: `"✓ Sent"` or `"✗ Error: <message>"`
+5. Add a test in `tests/test_notification_server.py` using `respx` to mock the HTTP call
+6. Document the new env var in the README CLI reference table
 
 ## Adding a new log pattern
 
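
Note on the log-pattern gaps listed in CONTRIBUTING.md: this diff does not show the shape of `_PATTERNS`, so the sketch below uses a stand-in table, `_EXAMPLE_PATTERNS`, that maps a label to a compiled regex. Both that structure and the helper `match_patterns` are assumptions for illustration only. The two expressions target the `error[E....]:` / `error:` lines that Cargo prints and Go's `missing go.sum entry` message.

```python
import re

# Hypothetical stand-in for the real _PATTERNS table in mcp_tools/log_analyzer.py.
_EXAMPLE_PATTERNS = {
    # Cargo / rustc: "error[E0308]: mismatched types" or "error: could not compile ..."
    "cargo_compile_error": re.compile(r"^error(?:\[E\d{4}\])?: .+", re.MULTILINE),
    # Go modules: a dependency is imported but absent from go.sum
    "go_missing_sum_entry": re.compile(
        r"missing go\.sum entry for module providing package \S+"
    ),
}


def match_patterns(log_text: str) -> dict:
    """Return the first matching text for every pattern that hits."""
    hits = {}
    for name, pattern in _EXAMPLE_PATTERNS.items():
        found = pattern.search(log_text)
        if found:
            hits[name] = found.group(0)
    return hits
```

A matching fixture would then be a short log file containing one such line, with a test asserting that the expected label comes back.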
diff --git a/README.md b/README.md
@@ -6,7 +6,7 @@
 [![Release](https://github.com/ChengaDev/opsagent/actions/workflows/release.yml/badge.svg)](https://github.com/ChengaDev/opsagent/actions/workflows/release.yml)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 
-[Why OpsAgent?](WHY_OPSAGENT.md) · [Examples](EXAMPLES.md) · [Running locally](RUNNING_LOCALLY.md) · [Contributing](CONTRIBUTING.md)
+[Why OpsAgent?](docs/why-opsagent.md) · [Examples](docs/examples.md) · [Running locally](docs/running-locally.md) · [Contributing](CONTRIBUTING.md)
 
 ---
 
@@ -118,7 +118,7 @@ pip install -e .
     ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
 ```
 
-See [EXAMPLES.md](EXAMPLES.md) for Python, Node.js, Helm, Terraform, GitLab CI, Jenkins, and more.
+See [docs/examples.md](docs/examples.md) for Python, Node.js, Helm, Terraform, GitLab CI, Jenkins, and more.
 
 ---
 
diff --git a/docs/examples.md b/docs/examples.md
@@ -0,0 +1,217 @@
+# GitHub Actions Examples
+
+> **Tip:** Always use `set -o pipefail` before piping through `tee` — without it, the pipeline returns `tee`'s exit code (0) even when your command fails, so `if: failure()` never triggers.
+
+## Python / pytest
+
+```yaml
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: pip install -r requirements.txt
+
+      - name: Run tests
+        run: |
+          set -o pipefail
+          pytest tests/ -v 2>&1 | tee "${{ runner.temp }}/pytest.log"
+
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/pytest.log
+          workspace: ${{ github.workspace }}
+          slack-webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+## Node.js / npm
+
+```yaml
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Install and build
+        run: |
+          npm ci
+          set -o pipefail
+          npm run build 2>&1 | tee "${{ runner.temp }}/ci.log"
+
+      - name: Run tests  # appends to the same log, so the RCA step finds it whether the build or the tests fail
+        run: |
+          set -o pipefail
+          npm test 2>&1 | tee -a "${{ runner.temp }}/ci.log"
+
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/ci.log
+          workspace: ${{ github.workspace }}
+          slack-webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+## Post RCA as a PR comment
+
+```yaml
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/test.log
+          workspace: ${{ github.workspace }}
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+OpsAgent posts the full RCA as a comment on the pull request that triggered the failure — no webhook configuration needed.
+
+## Save the RCA to a file
+
+```yaml
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/test.log
+          workspace: ${{ github.workspace }}
+          output: ${{ runner.temp }}/rca.md
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+
+      - name: Upload RCA report
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: rca-report
+          path: ${{ runner.temp }}/rca.md
+```
+
+## Use a custom model
+
+```yaml
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/test.log
+          workspace: ${{ github.workspace }}
+          model: claude-opus-4-6
+          investigate-model: claude-haiku-4-5-20251001
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+## Use a different provider
+
+```yaml
+      # Google Gemini
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/test.log
+          workspace: ${{ github.workspace }}
+          provider: google
+        env:
+          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
+```
+
+```yaml
+      # OpenAI
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/test.log
+          workspace: ${{ github.workspace }}
+          provider: openai
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+```
+
+## CD pipeline — Helm deploy
+
+```yaml
+      - name: Deploy
+        run: |
+          set -o pipefail
+          helm upgrade --install my-service ./charts/my-service \
+            --namespace production \
+            --set image.tag=${{ github.sha }} \
+            --wait --timeout 5m 2>&1 | tee "${{ runner.temp }}/deploy.log"
+
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/deploy.log
+          workspace: ${{ github.workspace }}
+          slack-webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
+          webhook-url: ${{ secrets.WEBHOOK_URL }}
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+## CD pipeline — Terraform
+
+```yaml
+      - name: Terraform apply
+        run: |
+          set -o pipefail
+          terraform apply -auto-approve 2>&1 | tee "${{ runner.temp }}/tf.log"
+
+      - name: Run OpsAgent RCA
+        if: failure()
+        uses: ChengaDev/opsagent@v1
+        with:
+          log-path: ${{ runner.temp }}/tf.log
+          workspace: ${{ github.workspace }}
+          slack-webhook: ${{ secrets.SLACK_WEBHOOK_URL }}
+          webhook-url: ${{ secrets.WEBHOOK_URL }}
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+```
+
+## GitLab CI
+
+```yaml
+rca:
+  stage: .post
+  when: on_failure
+  script:
+    - pip install "git+https://github.com/ChengaDev/opsagent.git[all-providers]"
+    - opsagent --log-path build.log --workspace .
+  variables:
+    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
+```
+
+## Jenkins
+
+```groovy
+post {
+  failure {
+    sh '''
+      pip install "git+https://github.com/ChengaDev/opsagent.git[all-providers]"
+      opsagent --log-path build.log --workspace .
+    '''
+  }
+}
+```
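
Note on the `set -o pipefail` tip at the top of docs/examples.md: the behaviour it describes can be checked locally. The sketch below assumes `bash` and `tee` are on `PATH`. It runs the same failing pipeline twice and compares exit codes, with `false` standing in for a failing test command. `pipeline_exit_code` is a helper name made up for this example.

```python
import subprocess


def pipeline_exit_code(script: str) -> int:
    """Run a snippet under bash and return the script's exit status."""
    return subprocess.run(["bash", "-c", script], capture_output=True).returncode


without_pipefail = pipeline_exit_code("false | tee /dev/null")
with_pipefail = pipeline_exit_code("set -o pipefail; false | tee /dev/null")

print(without_pipefail)  # 0: tee succeeded, so the failure of `false` is hidden
print(with_pipefail)     # 1: the failing command's status is propagated
```

With the first form a CI step would be reported as successful, which is why the `if: failure()` condition on the RCA step would never fire.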
diff --git a/docs/running-locally.md b/docs/running-locally.md
@@ -0,0 +1,81 @@
+# Running OpsAgent Locally
+
+## Setup
+
+```bash
+git clone https://github.com/ChengaDev/opsagent.git
+cd opsagent
+python3 -m venv .venv && source .venv/bin/activate
+pip install -e ".[all-providers]"
+cp .env.example .env
+# Add your API key to .env
+```
+
+## Mock mode — no API key needed
+
+The `demo.py` script runs the **full LangGraph pipeline** with a mock LLM against realistic fixture logs. Real MCP servers start and real tools execute; only the LLM response is mocked.
+
+```bash
+python demo.py                                        # default: python import error
+python demo.py --fixture oom_killed.log               # OOM killed container
+python demo.py --fixture test_failure.log             # pytest failures
+python demo.py --fixture k8s_crash_loop.log           # Kubernetes CrashLoopBackOff
+python demo.py --fixture helm_upgrade_failed.log      # Helm upgrade timeout
+python demo.py --fixture terraform_error.log          # Terraform apply error
+python demo.py --fixture registry_auth_error.log      # Docker registry auth failure
+python demo.py --fixture health_check_failed.log      # readiness probe failure
+python demo.py --list                                 # show all available fixtures
+```
+
+## Production mode — real LLM
+
+```bash
+# Anthropic (default)
+python cli.py \
+  --log-path tests/fixtures/k8s_crash_loop.log \
+  --workspace .
+
+# OpenAI
+python cli.py --provider openai \
+  --log-path tests/fixtures/k8s_crash_loop.log \
+  --workspace .
+
+# Google Gemini
+python cli.py --provider google \
+  --log-path tests/fixtures/k8s_crash_loop.log \
+  --workspace .
+
+# Custom models
+python cli.py --provider anthropic \
+  --model claude-opus-4-6 \
+  --investigate-model claude-haiku-4-5-20251001 \
+  --log-path tests/fixtures/k8s_crash_loop.log \
+  --workspace .
+
+# With Slack notification and saved report
+python cli.py \
+  --log-path tests/fixtures/helm_upgrade_failed.log \
+  --workspace . \
+  --slack-url "$SLACK_WEBHOOK_URL" \
+  --output rca_report.md
+
+# With a generic webhook (Discord, Teams, PagerDuty)
+python cli.py \
+  --log-path tests/fixtures/helm_upgrade_failed.log \
+  --workspace . \
+  --webhook-url "$WEBHOOK_URL"
+```
+
+## Running tests
+
+```bash
+pytest tests/ -v
+```
+
+## Build executable locally
+
+```bash
+pip install -e ".[build]"
+pyinstaller opsagent.spec
+./dist/opsagent --log-path tests/fixtures/terraform_error.log --workspace .
+```
diff --git a/docs/why-opsagent.md b/docs/why-opsagent.md