jmbish04
diff --git a/.agent/rules/AGENT_GOVERNANCE.md b/.agent/rules/AGENT_GOVERNANCE.md
Lines changed: 24 additions & 0 deletions
diff --git a/.agent/rules/HEALTH_GOVERNANCE.md b/.agent/rules/HEALTH_GOVERNANCE.md
Lines changed: 32 additions & 0 deletions
diff --git a/.agent/rules/actions-llm.md b/.agent/rules/actions-llm.md
Lines changed: 11 additions & 0 deletions
diff --git a/.agent/rules/logging-standards.md b/.agent/rules/logging-standards.md
Lines changed: 30 additions & 0 deletions
diff --git a/.agent/workflows/IMPLEMENT_RESEARCH_TEAM.md b/.agent/workflows/IMPLEMENT_RESEARCH_TEAM.md
Lines changed: 57 additions & 0 deletions
diff --git a/.agent/workflows/implement-github-judge.md b/.agent/workflows/implement-github-judge.md
Lines changed: 55 additions & 0 deletions
diff --git a/.agent/workflows/run-health-suite.md b/.agent/workflows/run-health-suite.md
Lines changed: 38 additions & 0 deletions
diff --git a/.colbyignore b/.colbyignore
Lines changed: 25 additions & 0 deletions
diff --git a/.dev.vars.example b/.dev.vars.example
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,24 @@
+---
+trigger: always_on
+---
+
+# AGENT & WORKFLOW GOVERNANCE
+
+## 1. The Manager-Worker Pattern
+*   **Rule**: Agents (`extends Agent`) **MUST NOT** perform long-running blocking tasks (>10s).
+*   **Rule**: Long-running tasks (Cloning, Vectorizing, Scraping) **MUST** be offloaded to Cloudflare Workflows (`extends WorkflowEntrypoint`).
+*   **Rule**: Agents act as "Managers" (State/Decision); Workflows act as "Workers" (Execution).
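A minimal sketch of the manager side of this pattern, assuming the Workflow binding exposes a `create({ params })` method that resolves to an instance with an `id`. The `WorkflowBinding`, `ResearchParams`, and `ManagerEnv` interfaces and the `startResearch` function are local stand-ins for illustration, not types from the Agents SDK.

```typescript
// Local stand-in for a Cloudflare Workflow binding (assumption: the real
// binding exposes a compatible `create` method).
interface WorkflowBinding<P> {
  create(options: { params: P }): Promise<{ id: string }>;
}

interface ResearchParams {
  repoUrl: string;
}

interface ManagerEnv {
  DEEP_RESEARCH_WORKFLOW: WorkflowBinding<ResearchParams>;
}

// Hypothetical manager-side handler: it makes the decision and hands the
// slow work (clone, vectorize, scrape) to the Workflow instead of doing it inline.
async function startResearch(
  env: ManagerEnv,
  repoUrl: string,
): Promise<{ status: "queued"; workflowId: string }> {
  const instance = await env.DEEP_RESEARCH_WORKFLOW.create({ params: { repoUrl } });
  // Return right away; progress is expected to arrive later from the Workflow.
  return { status: "queued", workflowId: instance.id };
}
```

The agent only waits for the instance to be created, which keeps it well under the 10-second limit regardless of how long the clone or embedding work takes.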
+
+## 2. Tool Integration (MCP)
+*   **Constraint**: Do NOT import Node.js-exclusive packages (e.g., `fs`, `child_process`) directly into the Worker.
+*   **Strategy**: Adapt the *logic* of official tools into the Agents SDK `sql`-backed state or stateless `Octokit` calls.
+*   **Schema**: All tools must strictly define Zod schemas for the Agents SDK to generate valid MCP interfaces.
+
+## 3. Sandbox Usage
+*   **Lifecycle**: Sandboxes are ephemeral. Data must be extracted (to R2, D1, or Vectorize) before the `step` completes.
+*   **Security**: Never pass raw user input directly to `sandbox.exec()`. Sanitize command arguments.
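One way to follow the security rule is to validate and quote every argument before building the command string that goes to `sandbox.exec()`. This is a sketch only: `assertGitHubRepoUrl`, `shellQuote`, and `buildCloneCommand` are hypothetical helpers, and the URL allow-list is an assumption about which inputs should be accepted.

```typescript
// Hypothetical helpers; none of these are part of the Sandbox SDK.

// Accept only plain https GitHub repository URLs (an assumed allow-list).
const GITHUB_REPO_URL = /^https:\/\/github\.com\/[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/;

function assertGitHubRepoUrl(input: string): string {
  if (!GITHUB_REPO_URL.test(input)) {
    throw new Error(`Rejected repository URL: ${JSON.stringify(input)}`);
  }
  return input;
}

// POSIX single-quote escaping: wrap the argument in single quotes and rewrite
// each embedded single quote as '\'' (close quote, escaped quote, reopen).
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Build the command string that would be handed to sandbox.exec().
function buildCloneCommand(repoUrl: string, targetDir: string): string {
  const url = shellQuote(assertGitHubRepoUrl(repoUrl));
  return `git clone --depth 1 -- ${url} ${shellQuote(targetDir)}`;
}
```

Validation rejects unexpected input outright, and quoting covers anything that still reaches the shell, so the two checks fail independently.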
+
+## 4. Vectorization & RAG
+*   **Chunking**: Code must be chunked (e.g., by function/class) before embedding.
+*   **Model**: Use `@cf/baai/bge-large-en-v1.5` for embeddings (1024 dimensions).
+*   **Metadata**: Upserts MUST include `{ repo, filepath, commit_sha }`.
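The metadata rule can be enforced at the type level so that an upsert without `{ repo, filepath, commit_sha }` does not compile. In the sketch below, `VectorRecord` and `toVectorRecords` are hypothetical names, the record shape is assumed to match what the index accepts, and the id scheme is illustrative only.

```typescript
interface ChunkMetadata {
  repo: string;
  filepath: string;
  commit_sha: string;
}

// Assumed record shape for an upsert: an id, the embedding values, and metadata.
interface VectorRecord {
  id: string;
  values: number[];
  metadata: ChunkMetadata;
}

const EMBEDDING_DIMENSIONS = 1024; // output size of bge-large-en-v1.5

// Hypothetical helper: pairs each chunk embedding with the required metadata.
function toVectorRecords(embeddings: number[][], meta: ChunkMetadata): VectorRecord[] {
  return embeddings.map((values, index) => {
    if (values.length !== EMBEDDING_DIMENSIONS) {
      throw new Error(
        `Chunk ${index}: expected ${EMBEDDING_DIMENSIONS} dimensions, got ${values.length}`,
      );
    }
    // Illustrative id scheme; a real one may need hashing to respect the
    // index's id length limit.
    const id = `${meta.commit_sha.slice(0, 12)}:${meta.filepath}:${index}`;
    return { id, values, metadata: { ...meta } };
  });
}
```

Checking the dimension count before the upsert surfaces a model or index mismatch early, rather than as a rejected write.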
@@ -0,0 +1,32 @@
+# Health Check Governance
+
+## Rule: Every New Module Must Register a Health Check
+
+When adding a new domain module under `backend/src/`, you **MUST**:
+
+1. Create a `health.ts` file co-located with the module
+2. Export `checkHealth(env: Env): Promise<HealthStepResult>`
+3. Register the check in `backend/src/health/coordinator.ts` → `CODE_CHECKS` array
+4. Assign a `HealthCategory` from the union in `backend/src/health/types.ts`
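The four steps above might look like the following for a module that depends on D1. The `HealthCategory` members, the `HealthStepResult` fields, and the trimmed-down `Env` are assumptions made for this sketch; the real definitions live in `backend/src/health/types.ts` and may differ.

```typescript
// Hypothetical shapes for illustration only.
type HealthCategory = "database" | "ai" | "github";

interface HealthStepResult {
  name: string;
  category: HealthCategory;
  status: "pass" | "fail";
  durationMs: number;
  message?: string;
}

// Minimal stand-in for the Worker Env: only what this check touches.
interface Env {
  DB: { prepare(query: string): { first(): Promise<unknown> } };
}

async function checkHealth(env: Env): Promise<HealthStepResult> {
  const started = Date.now();
  try {
    await env.DB.prepare("SELECT 1").first();
    return { name: "example-module", category: "database", status: "pass", durationMs: Date.now() - started };
  } catch (err) {
    // Report the failure instead of throwing, so one bad check cannot abort the suite.
    return {
      name: "example-module",
      category: "database",
      status: "fail",
      durationMs: Date.now() - started,
      message: err instanceof Error ? err.message : String(err),
    };
  }
}

// Registration, as it might appear in coordinator.ts.
const CODE_CHECKS: Array<(env: Env) => Promise<HealthStepResult>> = [checkHealth];
```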
+
+## Rule: Dynamic Tests via D1
+
+Runtime endpoint monitoring uses the `health_test_definitions` table. CRUD is available at:
+
+- `GET /api/health/tests` — list all
+- `POST /api/health/tests` — create (Zod-validated)
+- `DELETE /api/health/tests/:id` — remove
+
+## Rule: AI Remediation
+
+Failed health checks automatically receive AI-powered remediation hints via `analyzeFailure()`. These are stored in the `ai_suggestion` column of `health_results`.
+
+## Rule: Cron Schedule
+
+The health suite runs weekly via cron `0 3 * * 0` (Sundays at 03:00 UTC).  
+On-demand runs are available via `POST /api/health/run`.
+
+## Rule: Frontend Sync
+
+The frontend at `/health` **must** display all categories defined in `CATEGORY_META` in `Health.tsx`.
+When adding a new `HealthCategory` to `types.ts`, also add an entry to the frontend registry.
@@ -0,0 +1,11 @@
+# GitHub Actions LLM Rules
+
+## 1. Resilience
+
+- Always use `response_format={"type": "json_object"}` when interacting with `gpt-oss-120b` for data extraction.
+- Implement `try/except` blocks around the LLM call to prevent a single bad generation from crashing the CI pipeline.
+
+## 2. Dependency Management
+
+- Keep scripts self-contained within the YAML (using `cat <<EOF`) for simple tasks, or use a dedicated `scripts/` folder for complex ones.
+- Prioritize well-established packages (`openai`, `pydantic`) over experimental ones to ensure stability in the CI runner.
@@ -0,0 +1,30 @@
+# Research Logging Standards
+
+## 1. The "Glass Box" Principle
+
+The user must see HOW the agent arrived at a conclusion.
+
+- **BAD:** Agent returns "I found React."
+- **GOOD:**
+  1. Agent logs: "User asked for frontend frameworks."
+  2. Agent logs: "Tool 'GoogleSearch' called with query 'best frontend frameworks 2026'."
+  3. Agent logs: "Tool returned 15 results."
+  4. Agent logs: "Evaluating 'React' - it matches criteria."
+
+## 2. Structured Metadata
+
+Do not dump JSON into the `content` text field.
+
+- Use the `metadata` JSON column for large payloads (e.g., full HTML body, raw search JSON).
+- Keep `content` human-readable (e.g., "Parsing search results...").
+
+## 3. Error Visibility
+
+If a tool fails (e.g., Browser Rendering timeout):
+
+- Log it as `step_type: 'error'`.
+- Do not hide it. The user needs to see that the "Search Agent" failed to connect.
+
+## 4. Async Writes
+
+- Use `ctx.waitUntil()` for log inserts so that database writes do not block the agent's main execution path.
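Taken together, sections 2 to 4 suggest a log row with a short `content` string, a JSON `metadata` payload, and a `step_type` that is written without being awaited. The sketch below assumes that row shape; apart from `'error'`, the `StepType` members are invented for illustration, and `buildLogRow` and `logAsync` are hypothetical helpers.

```typescript
// Only 'error' comes from the rules above; the other members are assumed.
type StepType = "thought" | "tool_call" | "tool_result" | "error";

interface LogRow {
  step_type: StepType;
  content: string; // short and human-readable
  metadata: string; // JSON-encoded payload for the `metadata` column
}

function buildLogRow(stepType: StepType, content: string, payload: unknown = {}): LogRow {
  return { step_type: stepType, content, metadata: JSON.stringify(payload) };
}

// The part of the execution context this sketch relies on.
interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void;
}

// Queue the insert without awaiting it; `insert` is whatever writes the row.
function logAsync(
  ctx: WaitUntilContext,
  insert: (row: LogRow) => Promise<void>,
  row: LogRow,
): void {
  ctx.waitUntil(insert(row).catch((err) => console.error("log insert failed", err)));
}
```

A tool timeout would then be recorded as `buildLogRow("error", "Search Agent failed to connect", { tool: "browser" })`, with the raw details kept out of `content`.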
@@ -0,0 +1,57 @@
+---
+description: Implement Agentic Research Team
+---
+
+# Workflow: Implement Agentic Research Team
+
+## Phase 1: Infrastructure & MCP Layer
+1.  **Wrangler Config**:
+    *   Edit `wrangler.jsonc` to add the `workflows`, `vectorize`, and `send_email` keys.
+    *   Run `npx wrangler types` to update `worker-configuration.d.ts`.
+2.  **Vectorize Setup**:
+    *   Run `npx wrangler vectorize create research-index --dimensions 1024 --metric cosine`.
+3.  **MCP Adapter**:
+    *   Create `src/mcp/github-official-adapter.ts`.
+    *   Implement standard GitHub tools (`list_files`, `read_file`, `search_repositories`) using `src/octokit` logic but matching official tool names/schemas.
+    *   Import and register this in `src/tools/index.ts` to combine with custom tools.
+
+## Phase 2: The Research Workflow (The Muscle)
+1.  **Scaffold Workflow**:
+    *   Create `src/workflows/DeepResearchWorkflow.ts` extending `WorkflowEntrypoint`.
+2.  **Sandbox Integration**:
+    *   Implement `step.do('clone')`:
+        ```typescript
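+        import { getSandbox } from '@cloudflare/sandbox';
+        // ...
+        // Assumes a Sandbox binding named `Sandbox` is configured in wrangler.jsonc.
+        const sandbox = getSandbox(env.Sandbox, `research-${event.instanceId}`);
+        // gitCheckout avoids interpolating repoUrl into a shell command string.
+        await sandbox.gitCheckout(repoUrl);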
+        ```
+3.  **Analysis & RAG**:
+    *   Implement `step.do('process')`:
+        *   Read file tree.
+        *   Split code files.
+        *   `env.AI.run('@cf/baai/bge-large-en-v1.5', { text: chunks })`.
+        *   `env.RESEARCH_INDEX.upsert(vectors)`.
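The "split code files" step could start with a plain line-window splitter before moving to function- or class-level chunking. The sketch below uses that simpler technique, fixed windows with overlap, not syntax-aware chunking; `CodeChunk`, `chunkByLines`, and the default sizes are illustrative assumptions.

```typescript
interface CodeChunk {
  filepath: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

// Split a file into windows of `maxLines` lines, where consecutive windows
// share `overlap` lines so that context is not lost at the boundaries.
function chunkByLines(
  filepath: string,
  source: string,
  maxLines = 40,
  overlap = 5,
): CodeChunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new Error("Require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split("\n");
  const chunks: CodeChunk[] = [];
  const stride = maxLines - overlap;
  for (let start = 0; start < lines.length; start += stride) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      filepath,
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // this window reached the end of the file
  }
  return chunks;
}
```

Keeping `startLine` and `endLine` on each chunk lets a search hit be mapped back to an exact location in the repository.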
+
+## Phase 3: The Orchestrator (The Brain)
+1.  **Create Agent**:
+    *   Create `src/agents/ResearchAgent.ts` extending `Agent`.
+2.  **Logic Implementation**:
+    *   **Plan**: `onMessage` -> LLM generates research plan.
+    *   **Execute**: Call `env.DEEP_RESEARCH_WORKFLOW.create()`.
+    *   **Monitor**: Expose a `reportProgress` RPC method that the Workflow calls to update the Agent.
+    *   **HITL**: If the plan involves "Create Issue" or "PR", pause and send `type: 'approval_request'` to WebSocket.
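The HITL gate can be a pure function that inspects the plan before anything executes. In this sketch the `PlanStep` shape, the action names in `WRITE_ACTIONS`, and the `steps` field on the message are assumptions; only `type: 'approval_request'` comes from the plan above.

```typescript
interface PlanStep {
  action: string; // e.g. "search", "clone", "create_issue"
  description: string;
}

interface ApprovalRequest {
  type: "approval_request";
  steps: PlanStep[];
}

// Assumed names for actions that write to GitHub; adjust to the real tool names.
const WRITE_ACTIONS = new Set(["create_issue", "create_pull_request"]);

// Returns a message to send over the WebSocket, or null when the plan can
// run without human approval.
function buildApprovalRequest(plan: PlanStep[]): ApprovalRequest | null {
  const risky = plan.filter((step) => WRITE_ACTIONS.has(step.action));
  return risky.length > 0 ? { type: "approval_request", steps: risky } : null;
}
```

When the result is non-null, the agent would serialize it, send it to the client, and hold the plan until a reply arrives.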
+
+## Phase 4: Daily Discovery & Email
+1.  **Cron Handler**:
+    *   Update `src/index.ts` to export a `scheduled` handler.
+    *   Logic: Fetch "trending" -> Call `DeepResearchWorkflow` with `mode: 'discovery'` -> Aggregate Findings.
+2.  **Email**:
+    *   Install `mimetext`.
+    *   Generate HTML report.
+    *   Send via `env.EMAIL_SENDER`.
+
+## Phase 5: Verification
+1.  Deploy: `npx wrangler deploy`.
+2.  Test MCP: Connect a generic MCP client to the Agent.
+3.  Test Full Loop: Trigger `ResearchAgent` via Chat UI.
@@ -0,0 +1,55 @@
+---
+description: This plan deploys the self-contained "LLM-as-a-Judge" workflow to GitHub Actions.
+---
+
+# Implementation Plan: GitHub Actions Research Judge
+
+This plan deploys the self-contained "LLM-as-a-Judge" workflow to GitHub Actions.
+
+## User Intent
+
+Create a robust GitHub Action that uses Cloudflare Workers AI (`@cf/openai/gpt-oss-120b`) to orchestrate, execute, and evaluate GitHub repository searches before syncing them to a Cloudflare Worker.
+
+## Technical Context
+
+- **Infrastructure**: GitHub Actions (`ubuntu-latest`)
+- **Language**: Python 3.11
+- **AI Provider**: Cloudflare AI Gateway (OpenAI Compatible Endpoint)
+- **Model**: `gpt-oss-120b` (128k context)
+
+## Execution Steps
+
+### 1. Create Workflow File
+
+- **Path**: `.github/workflows/research-judge.yml`
+- **Content**: Copy the provided YAML exactly.
+- **Key Features**:
+  - Embeds `research_judge.py` directly (no extra file management).
+  - Uses `pydantic` for strict JSON schema validation from the LLM.
+  - Implements a `TinyAgent` class to wrap the OpenAI SDK interactions.
+
+### 2. Configure Secrets (Manual)
+
+You must add the following secrets to your GitHub Repository:
+
+- `CLOUDFLARE_ACCOUNT_ID`: Your CF Account ID.
+- `CLOUDFLARE_GATEWAY_ID`: The ID of your AI Gateway.
+- `CLOUDFLARE_API_TOKEN`: Token with Workers AI permissions.
+- `WORKER_API_KEY`: Token to authenticate with your Hono Worker.
+
+### 3. Usage
+
+- **Manual**: Go to "Actions" -> "Deep Research Judge" -> "Run workflow" -> Enter a prompt.
+- **Automated**: Send a POST request from your Worker:
+  ```typescript
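+  await fetch("https://api.github.com/repos/OWNER/REPO/dispatches", {
+    method: "POST",
+    headers: {
+      // repository_dispatch requires an authenticated request.
+      Authorization: `Bearer ${env.GITHUB_TOKEN}`,
+      Accept: "application/vnd.github+json",
+      // GitHub rejects API requests that lack a User-Agent header.
+      "User-Agent": "research-worker", // placeholder value
+    },
+    body: JSON.stringify({
+      event_type: "deep-research",
+      client_payload: {
+        query: "Find react agents",
+        callback_url: "https://your.worker/callback",
+      },
+    }),
+  });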
+  ```
@@ -0,0 +1,38 @@
+---
+description: Run the full health suite and review results
+---
+
+# Run Health Suite
+
+// turbo-all
+
+## Steps
+
+1. Trigger the health check via API:
+
+```bash
+curl -s -X POST https://core-github-api.126colby.workers.dev/api/health/run \
+  -H "Content-Type: application/json" \
+  -H "x-api-key: $API_KEY" | jq .
+```
+
+2. Check the latest results:
+
+```bash
+curl -s https://core-github-api.126colby.workers.dev/api/health/latest \
+  -H "x-api-key: $API_KEY" | jq '.results[] | {category, name, status, ai_suggestion}'
+```
+
+3. View run history:
+
+```bash
+curl -s "https://core-github-api.126colby.workers.dev/api/health/history?limit=5" \
+  -H "x-api-key: $API_KEY" | jq '.runs[] | {id: .run.id, status: .run.status, created: .run.created_at}'
+```
+
+4. List dynamic test definitions:
+
+```bash
+curl -s https://core-github-api.126colby.workers.dev/api/health/tests \
+  -H "x-api-key: $API_KEY" | jq .
+```
@@ -0,0 +1,25 @@
+# Dependencies
+node_modules/
+.pnpm-store/
+
+# Cloudflare Wrangler
+.wrangler/
+.dev.vars
+dist/
+
+# Logs
+*.log
+npm-debug.log*
+pnpm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+
+# IDEs and editors
+.idea/
+.vscode/
+*.swp
+*.swo
+
+# OS-specific
+.DS_Store
+Thumbs.db
@@ -0,0 +1,11 @@
+GOOGLE_API_KEY=
+GEMINI_API_KEY=
+WORKER_API_KEY=
+GITHUB_TOKEN=
+AI_GATEWAY_URL=
+AI_GATEWAY_TOKEN=
+CLOUDFLARE_API_TOKEN=
+GITHUB_ACTION_CLOUDFLARE_ACCOUNT_ID=
+GITHUB_APP_ID=
+GITHUB_APP_PRIVATE_KEY=
+