AGENTS.md

Cloudflare Bindings & Naming

script_name vs worker_name: The Cloudflare API often refers to the worker as script_name, but this is equivalent to the name field (the worker_name) defined in wrangler.jsonc or wrangler.toml.
Bindings Management Philosophy: Automated bindings management is create-only. The system provisions new resources (such as a D1 database) in the Cloudflare account and then updates the repository's wrangler.jsonc by opening a GitHub PR or patching an existing one. The system is NOT responsible for attaching bindings to the worker via the Cloudflare API; attachment happens through the normal CI/CD deployment pipeline.
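The create-only rule can be sketched as a pure function over the parsed wrangler.jsonc. This is a minimal illustration, not the project's implementation: it assumes the config carries a d1_databases array with binding, database_name, and database_id fields, and the helper name addD1Binding and all sample values are hypothetical.

```typescript
// One D1 entry as it appears in wrangler.jsonc.
interface D1Binding {
  binding: string;
  database_name: string;
  database_id: string;
}

interface WranglerConfig {
  name: string; // the worker_name; the Cloudflare API calls this script_name
  d1_databases?: D1Binding[];
}

// Hypothetical helper: returns a new config with the binding appended.
// Create-only: an existing binding is never modified or removed.
function addD1Binding(config: WranglerConfig, entry: D1Binding): WranglerConfig {
  const existing = config.d1_databases ?? [];
  if (existing.some((b) => b.binding === entry.binding)) {
    return config; // already declared, nothing to create
  }
  return { ...config, d1_databases: [...existing, entry] };
}

const original: WranglerConfig = { name: "example-worker" };
const updated = addD1Binding(original, {
  binding: "DB",
  database_name: "example-db",
  database_id: "00000000-0000-0000-0000-000000000000", // placeholder ID
});
console.log(JSON.stringify(updated, null, 2));
```

The returned object is what would be serialized into the PR; nothing here calls the Cloudflare API to attach the binding, which is left to the deployment pipeline.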

Localized Agent Documentation (MANDATORY CRAWL)

This repository contains localized AGENTS.md and AGENTS-REVIEW.md files that dictate behavior for specific directories and domains. You MUST read the relevant localized file when working in its respective directory.

🛑 AGENT REQUIRED READING 🛑

Protocol: You are operating in a pnpm monorepo.
Verification: Every time you run a command, ask yourself: "Am I using --filter?"
Instruction: If you are unsure of the project structure, run ls -R or check pnpm-workspace.yaml.

Golden Rule: ALWAYS use the @google/genai SDK. NEVER use @google/generative-ai.

PNPM Workspace Commands

This project is a pnpm monorepo with packages: frontend and container.

Installing Dependencies: Never install dependencies at the root unless they are project-wide dev tools (e.g., turbo, prettier).
Targeted Install: Use the --filter flag to target specific packages from the root:
- Example: pnpm add zod --filter frontend
Root Install: If a package must go to the root, use the -w flag:
- Example: pnpm add -Dw typescript
Internal Dependencies: When adding one workspace package to another, use the workspace:* protocol so the dependency resolves to the local package rather than the registry.
- Example: pnpm add "@workspace/common@workspace:*" --filter frontend

State Management & Sync

When updating schemas in frontend/src/db that are shared with the backend, keep the backend as the source of truth.
Always run pnpm install from the root after manual package.json edits to update the lockfile.

📦 PNPM Workspace Protocol

This repository is a pnpm monorepo.

Root Directory: Contains the backend and global workspace commands.
Frontend Directory: Contains the Astro/React/Shadcn application.

Installation Rules:

Never run pnpm install <pkg> at the root unless it is a workspace-wide dev tool (e.g., turbo, prettier).
Targeted Install: Always use the --filter flag from the root to add dependencies to specific packages.
- ✅ Correct: pnpm add zod --filter frontend
- ✅ Correct: pnpm add drizzle-orm --filter frontend
CD Method: Alternatively, cd into the package directory before running pnpm add.

Schema Sync:

When modifying schema.ts or validations.ts, ensure they are placed in the directory where the Drizzle client is instantiated (currently frontend/src/db).
After adding a dependency via the agent, always run pnpm install at the root to refresh the lockfile.

Core Directives

SDK: import { GoogleGenAI } from "@google/genai";
Instantiation: const ai = new GoogleGenAI({ apiKey: ... });
Models:
- General: gemini-2.5-flash (or gemini-2.0-flash-exp if requested)
- Reasoning: gemini-2.0-flash-thinking-exp-1219 (if available) or gemini-2.5-pro
- Images: gemini-2.5-flash-image
Configuration: Pass responseMimeType: "application/json" and responseSchema for structured output.
Environment Types: The Env interface is globally available via worker-configuration.d.ts (automatically loaded via tsconfig.json). NEVER manually define an Env or Bindings interface.
Forbidden Imports: import { Bindings } from '@utils/hono'; is strictly prohibited. Use the global Env type directly in your Hono app definitions (e.g., new Hono<{ Bindings: Env }>()).

Package Management (PNPM Workspace)

Since this is a monorepo using pnpm workspaces, you MUST use specific flags when installing packages to avoid the ERR_PNPM_ADDING_TO_ROOT error that pnpm raises by default.

Root Dependencies (e.g., dev tools, shared types):
```
pnpm add <package-name> -w
```
Workspace Requirements: To install a package for a specific workspace (e.g., frontend or backend), use the --filter flag:
```
pnpm add <package-name> --filter <workspace-name>
```
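For tooling that issues these commands programmatically, the two cases above can be captured in one small builder. This is only a sketch; the type InstallTarget and the function pnpmAddCommand are hypothetical names, not part of this repository.

```typescript
// Where a dependency should land in the monorepo.
type InstallTarget =
  | { kind: "root" } // workspace-wide dev tool
  | { kind: "package"; name: string }; // e.g. "frontend" or "container"

// Hypothetical helper that builds the pnpm command string for a target.
function pnpmAddCommand(pkg: string, target: InstallTarget, dev = false): string {
  const parts = ["pnpm", "add"];
  if (dev) parts.push("-D");
  parts.push(pkg);
  if (target.kind === "root") {
    parts.push("-w"); // explicit opt-in to the workspace root
  } else {
    parts.push("--filter", target.name);
  }
  return parts.join(" ");
}

console.log(pnpmAddCommand("zod", { kind: "package", name: "frontend" }));
console.log(pnpmAddCommand("typescript", { kind: "root" }, true));
```

Forcing every call through a target value makes it impossible to build a bare pnpm add that would fall through to the root by accident.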

Code Patterns

✅ Correct (New SDK)

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: env.GEMINI_API_KEY });

const result = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: [{ role: "user", parts: [{ text: "Hello" }] }],
  config: {
    responseMimeType: "application/json",
    // responseSchema: ... (Zod schema converted to JSON)
  },
});

console.log(result.text); // Getter, returns string

❌ Incorrect (Legacy/Deprecated)

require('@google/generative-ai')
genai.getGenerativeModel(...)
model.generateContent(...) (Called on model instance instead of ai.models)
generationConfig (Use config property instead)
result.response.text() (Method call)

Durable Object Abstraction (MANDATE)

To prevent type ambiguity and routing errors, raw Durable Object mounting (idFromName, .get(), raw .fetch()) is strictly forbidden.

Stateful AI Agents: MUST be accessed via HoniClient.getStub() or HoniClient.fetch() (from @utils/honi-client).
WebSocket Broadcasters: MUST be accessed via BroadcastClient (from @utils/do-broadcast). For details, see .agent/rules/02-do-abstraction.md.

Structured Outputs (MANDATE)

CRYSTAL CLEAR RULE: You MUST use AiProvider.generateStructuredResponse (or generateStructuredWithTools exported from @/ai/providers) anytime the AI model is being instructed to respond with a structured JSON response.

FORBIDDEN: Do NOT rely on Agent SDK schema enforcements (e.g., passing outputType: MySchema as any to @openai/agents), as they are prone to brittle string extraction failures or 400 errors via the Cloudflare AI Gateway.

Correct Pattern (Agent with Tools):

Let the Agent execute its internal tool loop freely (returning markdown text).
Take the Agent's result.finalOutput and pass it into generateStructuredResponse along with your schema.

import { generateStructuredResponse } from "@/ai/providers";
import { zodToJsonSchema } from "zod-to-json-schema";
import { z } from "zod";

const MySchema = z.object({ ... });

// 1. Let agent run
const result = await runner.run(agent, prompt);

// 2. Extract strictly
const finalData = await generateStructuredResponse<z.infer<typeof MySchema>>(
  env,
  `Extract the exact data from the Agent's response:\n\n${result.finalOutput}`,
  zodToJsonSchema(MySchema as any, "structured_output")
);

AI Provider Routing & Resolution

MANDATORY IMPORT PATH: Agents must always and exclusively import AI functions from @/ai/providers.
FORBIDDEN IMPORTS: It is never acceptable to import directly from specific provider files (e.g., ai/providers/openai, ai/providers/gemini) or the index file explicitly (e.g., ai/providers/index).
FUNCTION USAGE: When using functions like generateText, generateStructuredResponse, etc., the agent should specify the provider and model arguments when known.
FALLBACK BEHAVIOR:
- If no provider or model is provided by the caller, the system relies on the index.ts routing to default to workers-ai, which then utilizes its internal business logic to select the correct fallback model.
- Similarly, if a provider is specified but no model is provided, the specific provider module's logic determines the default model.
- Agents should not hardcode default models unless explicitly required by the business logic.
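The fallback order described above can be pictured roughly as follows. This is a sketch under assumptions: the provider list is taken from the paths mentioned in this section, the default model strings are placeholders (the real defaults live inside each provider module), and resolveRoute is a hypothetical name rather than the function exported from @/ai/providers.

```typescript
type Provider = "workers-ai" | "openai" | "gemini";

interface AiRequest {
  provider?: Provider;
  model?: string;
}

// Placeholder defaults: each provider module owns its real default model.
const defaultModelFor: Record<Provider, () => string> = {
  "workers-ai": () => "<workers-ai-default-model>",
  openai: () => "<openai-default-model>",
  gemini: () => "<gemini-default-model>",
};

// Hypothetical resolver mirroring the documented fallback order:
// 1. no provider -> workers-ai, 2. no model -> the provider's own default.
function resolveRoute(req: AiRequest): { provider: Provider; model: string } {
  const provider = req.provider ?? "workers-ai";
  const model = req.model ?? defaultModelFor[provider]();
  return { provider, model };
}

console.log(resolveRoute({}));
console.log(resolveRoute({ provider: "gemini" }));
```

Callers pass provider and model when they know them and otherwise pass nothing, which keeps default model names out of calling code.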

Full-Code Output Rule

Agents must never return elided or partial code using shortcuts such as:

// ... rest of the function remains the same ...
// leaving as is
// ... rest of code ...

If a file is in scope, return the complete file content for that file. If a function is rewritten, return the full rewritten function. Do not replace omitted code with commentary.
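A reviewer or pre-commit hook could flag such shortcuts mechanically. The sketch below is one possible check, not an existing script in this repository; the pattern list is illustrative and will miss other phrasings, and findElisions is a hypothetical name.

```typescript
// Patterns that signal elided code. Illustrative, not exhaustive.
const ELISION_PATTERNS: RegExp[] = [
  /\/\/\s*\.{3}.*\.{3}\s*$/, // "// ... rest of code ..."
  /\/\/\s*leaving as is\s*$/i,
  /\/\/\s*\.{3}\s*(rest|remaining|existing)\b/i,
];

// Returns the 1-based line numbers that contain an elision marker.
function findElisions(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (ELISION_PATTERNS.some((p) => p.test(line))) hits.push(index + 1);
  });
  return hits;
}

const sample = [
  "function add(a: number, b: number) {",
  "  // ... rest of the function remains the same ...",
  "  return a + b;",
  "}",
  "// leaving as is",
].join("\n");

console.log(findElisions(sample));
```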

Tools (MCP)

When integrating tools:

Use src/lib/mcp.ts to connect to Cloudflare Docs or other MCP servers.

Container / Sandbox Protocol

When modifying the Cloudflare Sandbox SDK (@cloudflare/sandbox or containers), follow these strict architectural and troubleshooting rules:

Version Requirement: Ensure that package.json SDK dependencies exactly match the tags in container/Dockerfile (e.g. 0.8.0). Do not use latest.
Verification: Validated by scripts/package/verify-sandbox-version.mjs on pnpm run deploy. Mismatched versions invariably cause 500 Internal Server Error.
Container Base Image: We use the native Cloudflare Sandbox images (merging -opencode into -python). NEVER overwrite the base image with FROM oven/bun or standard Node/Alpine images. Doing so destroys the Sandbox supervisor network and causes immediate crashes upon sandbox.fetch().
Lockfile Sync: If the Docker build fails with "lockfile is frozen", it means container/bun.lockb is out of sync. Standard fix: run cd container && bun install locally before deploying to synchronize the definitions.
Port Exposure: Any process running inside the container (e.g. agent-sdk.ts on port 3001) MUST have a corresponding EXPOSE 3001 directive in the Dockerfile for the host network proxy to recognize it.
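The version and port rules above lend themselves to a mechanical check. The following sketch is not the contents of scripts/package/verify-sandbox-version.mjs; it assumes the Dockerfile pins its base image as "...sandbox:<version>" with an optional variant suffix, and the image name in the sample, like both function names, is an assumption for illustration.

```typescript
// Assumed tag format: "FROM <registry>/sandbox:<major.minor.patch>[-variant]".
const FROM_TAG = /^FROM\s+\S*sandbox:(\d+\.\d+\.\d+)/m;

// Returns a list of problems; an empty list means SDK and image agree.
function checkSandboxVersions(packageJson: string, dockerfile: string): string[] {
  const problems: string[] = [];
  const deps = JSON.parse(packageJson).dependencies ?? {};
  const sdk: string | undefined = deps["@cloudflare/sandbox"];
  const tag = dockerfile.match(FROM_TAG)?.[1];

  if (!sdk) problems.push("@cloudflare/sandbox is not listed in dependencies");
  if (!tag) problems.push("no pinned sandbox image tag found in Dockerfile");
  if (sdk && !/^\d+\.\d+\.\d+$/.test(sdk)) {
    problems.push(`SDK version "${sdk}" is not an exact pin`);
  }
  if (sdk && tag && sdk !== tag) {
    problems.push(`SDK ${sdk} does not match image tag ${tag}`);
  }
  return problems;
}

// Returns the ports that have no EXPOSE directive in the Dockerfile.
function missingExposePorts(dockerfile: string, ports: number[]): number[] {
  const exposed = new Set<number>();
  for (const line of dockerfile.split("\n")) {
    const match = line.match(/^EXPOSE\s+(.+)$/);
    if (!match) continue;
    for (const token of match[1].trim().split(/\s+/)) {
      exposed.add(parseInt(token, 10)); // "3001/tcp" parses as 3001
    }
  }
  return ports.filter((p) => !exposed.has(p));
}

const sampleDockerfile = "FROM docker.io/cloudflare/sandbox:0.8.0-python\nEXPOSE 3001\n";
const samplePackage = JSON.stringify({ dependencies: { "@cloudflare/sandbox": "0.8.0" } });
console.log(checkSandboxVersions(samplePackage, sampleDockerfile));
console.log(missingExposePorts(sampleDockerfile, [3001, 8080]));
```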

Exit Criteria & Verification

Before reporting a task or turn as complete, you MUST:

Clear Linting Errors: Ensure bun run check (or checking the IDE output) reveals no linting or compilation errors.
Verify Deployment: Run bun run dry-run to validate the worker configuration and build process.
- This executes wrangler deploy --dry-run to catch binding issues, bundle size limits, or config errors.
- Fix any errors reported by this command before finishing.

Antigravity Strategy: Agentic Research Team

Context

We are deploying a dedicated Agentic Research Team consisting of a stateful Orchestrator (ResearchAgent) and durable execution pipelines (DeepResearchWorkflow). This system performs deep code analysis using Sandbox containers and Vectorize RAG, delivering findings via real-time WebSocket updates and daily email reports.

Architectural Pillars

The Brain (Agents SDK): ResearchAgent maintains state, chat history, and HITL (Human-in-the-Loop) approvals.
The Muscle (Workflows): DeepResearchWorkflow handles long-running tasks (Cloning, Vectorizing) without timeout risks.
The Tools (MCP + Sandbox):
- Native MCP Adapter: Adapts official GitHub MCP tool schemas to run on octokit within V8.
- Sandbox: Ephemeral environments for git clone and code execution.
The Signal (Daily Discovery): Cron Trigger -> Workflow -> HTML Report -> Email.

Task List

Infrastructure & Configuration

Config: Update wrangler.jsonc with bindings:
- kv_namespaces: AGENT_CACHE
- vectorize_indexes: RESEARCH_INDEX (Dimensions: 1024 for @cf/baai/bge-large-en-v1.5)
- ai: AI
- workflows: DEEP_RESEARCH_WORKFLOW
- send_email: EMAIL_SENDER
- browser: BROWSER (Sandbox assets)

Component 1: MCP Integration (Native Adapter)

File: src/mcp/github-official-adapter.ts
- Strategy: Replicate the schemas of the official @modelcontextprotocol/server-github but implement the logic using your existing src/octokit client to ensure V8 compatibility.
- Registry: Export these tools to the shared MCP toolkit (src/mcp/index.ts).

Component 2: The Research Team

File: src/agents/ResearchAgent.ts (The Manager)
- State Machine: PLANNING -> RESEARCHING -> REVIEW_REQUIRED -> COMPLETED.
- Capabilities: runWorkflow, waitForEvent (HITL), getAgentByName.
File: src/workflows/DeepResearchWorkflow.ts (The Workers)
- Step 1: setup-sandbox: Init Sandbox, git clone.
- Step 2: analysis-macro: Run ls -R, tree, read README.
- Step 3: vectorize: Chunk code, embed (Workers AI), upsert to RESEARCH_INDEX.
- Step 4: cleanup: Destroy Sandbox.
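The ResearchAgent state machine listed above is strictly linear, which a transition table makes explicit. This is a minimal sketch: the state names come from this document, while the NEXT table and the advance function are hypothetical and ignore any retry or rejection paths the real agent may have.

```typescript
type ResearchState = "PLANNING" | "RESEARCHING" | "REVIEW_REQUIRED" | "COMPLETED";

// Allowed forward transitions, in the order given in this document.
const NEXT: Record<ResearchState, ResearchState | null> = {
  PLANNING: "RESEARCHING",
  RESEARCHING: "REVIEW_REQUIRED",
  REVIEW_REQUIRED: "COMPLETED", // reached after the HITL approval arrives
  COMPLETED: null,
};

// Moves to the next state or throws if the state is terminal.
function advance(state: ResearchState): ResearchState {
  const next = NEXT[state];
  if (next === null) throw new Error(`${state} is terminal`);
  return next;
}

let current: ResearchState = "PLANNING";
while (NEXT[current] !== null) {
  current = advance(current);
  console.log(current);
}
```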

Component 3: Daily Discovery

File: src/schedulers/daily-scan.ts
- Trigger: Cron (e.g., 9 AM UTC).
- Logic: Scans GitHub trending/new -> Triggers DeepResearchWorkflow.
- Report: Generates HTML via LLM -> Sends via env.EMAIL_SENDER.

Verification

MCP: Verify tools gh_official_search and gh_official_read are available in the Agent's tool list.
Research: Send "Analyze facebook/react" to ResearchAgent. Verify Workflow logs showing Sandbox clone.
Email: Trigger cron manually via pnpm dlx wrangler@latest triggers fire --name "daily-scan".

Cross-Repository Architecture & Actions

Rule: The core-github-standardization repository is the source of truth for CI/CD templates, heavy-lifting Python scripts, and global GitHub Actions.
Rule: Any modification to an async task requires two PRs: one to core-github-standardization to update the Python/YAML logic, and one to core-github-api to update the Zod schemas and D1 ingestion logic.

Global Error Handling (Mandatory)

When handling exceptions across the stack, the following strict protocol MUST be followed:

Backend Errors (D1 Mirror): All backend errors (API failures, tool exceptions) must be logged persistently using src/lib/logger.ts. You must invoke logger.error() passing the original error message and call await logger.flush() before returning the JSON error response to ensure the D1 system_logs transaction commits.
Frontend UI (Shadcn): The frontend must catch API errors and pass them to the centralized handleGlobalError service (in @/lib/error-handler). It renders a Sonner toast containing the literal backend message and a "Copy to Clipboard" button so the user can paste the message back to an AI agent. Do not use generic <Alert> blocks or raw toast.error() for structural logic failures. handleGlobalError(error) handles deduplication and metrics dispatch automatically. This rule is strictly enforced.
```
import { handleGlobalError } from "@/lib/error-handler";
handleGlobalError(`Failed to apply decision. ${res}`);
```
Transparent Passthrough: Do not genericize trace messages on the backend. If an external service returns a 404, the JSON payload must contain "error": "GitHub API responded with 404 Not Found", not "Extraction failed".
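On the backend side, the log-then-flush-then-respond order and the passthrough rule can be sketched together. The LoggerLike interface below is a minimal stand-in for the Logger in src/lib/logger.ts, whose real surface may differ; upstreamErrorMessage and toErrorBody are hypothetical helper names.

```typescript
// Minimal stand-in for the project's Logger; the real class may differ.
interface LoggerLike {
  error(message: string, context?: Record<string, unknown>): void;
  flush(): Promise<void>;
}

interface UpstreamFailure {
  service: string; // e.g. "GitHub API"
  status: number;
  statusText: string;
}

// Keeps the literal upstream failure instead of a generic message.
function upstreamErrorMessage(failure: UpstreamFailure): string {
  return `${failure.service} responded with ${failure.status} ${failure.statusText}`;
}

// Logs the error in full and flushes before the caller sends the response.
async function toErrorBody(
  logger: LoggerLike,
  failure: UpstreamFailure,
): Promise<{ error: string }> {
  const message = upstreamErrorMessage(failure);
  logger.error(message, { status: failure.status });
  await logger.flush();
  return { error: message };
}

// In-memory logger used only for this demonstration.
const pendingEntries: string[] = [];
const flushedEntries: string[] = [];
const demoLogger: LoggerLike = {
  error: (message) => { pendingEntries.push(message); },
  flush: async () => { flushedEntries.push(...pendingEntries.splice(0)); },
};

toErrorBody(demoLogger, { service: "GitHub API", status: 404, statusText: "Not Found" })
  .then((body) => console.log(JSON.stringify(body)));
```

A route handler would return the resulting object as its JSON payload with the matching HTTP status.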

Traceability & Logging Governance (MANDATORY)

See .agent/rules/traceability-logging.md for the full rule set.

Logger Class (Strictly Enforced)

ALL backend code MUST use Logger from src/lib/logger.ts. This class outputs structured JSON to console AND mirrors every entry to D1 (system_logs).

// ✅ CORRECT: logger created in the constructor (BaseOrchestrator is an illustrative class name)
import { Logger } from '@/lib/logger';

class BaseOrchestrator {
  protected readonly logger: Logger;

  constructor(protected readonly env: Env, loggerNamespace = 'orchestration/base') {
    this.logger = new Logger(env, loggerNamespace);
  }

  async run(): Promise<void> {
    this.logger.info('Operation', { key: 'value' });
    await this.logger.flush();
  }
}

// ❌ FORBIDDEN: raw console calls bypass D1
console.log("something");
console.error("error:", err);

No Error Truncation (Strictly Enforced)

NEVER truncate error messages or inputs with .slice(), .substring(), or any other method. Full bodies MUST be logged. Truncating hides root causes and is useless for debugging.

// ❌ FORBIDDEN
this.logger.debug(`Running orchestration for: ${input.slice(0, 100)}...`); 
logger.error('failed', { body: errBody.substring(0, 200) });

// ✅ CORRECT
this.logger.debug(`Running orchestration for: ${input}`);
logger.error('failed', { status: res.status, body: errBody });

Agent Evaluation Duty

Every time an agent evaluates, reviews, modifies, or creates code, it MUST also evaluate:

Traceability Coverage: Does every significant code path have adequate logging?
Logger Usage: Is the code using Logger? If raw console.* is found, migrate it.
Error Completeness: Are errors logged in full, without truncation?
Flush Discipline: Is await logger.flush() called before every early return or throw?
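Flush discipline is easy to miss when a function has several exit paths. One way to make it structural is a wrapper built on try/finally, sketched below. FlushableLogger is a minimal stand-in for the project's Logger and withFlush is a hypothetical helper, not an existing utility in this repository.

```typescript
// Minimal stand-in for the project's Logger; the real class may differ.
interface FlushableLogger {
  info(message: string, context?: Record<string, unknown>): void;
  error(message: string, context?: Record<string, unknown>): void;
  flush(): Promise<void>;
}

// Runs `work` and guarantees a flush on every exit path (return or throw).
async function withFlush<T>(logger: FlushableLogger, work: () => Promise<T>): Promise<T> {
  try {
    return await work();
  } catch (err) {
    // Log the full error message, never a truncated slice of it.
    logger.error("operation failed", {
      error: err instanceof Error ? err.message : String(err),
    });
    throw err;
  } finally {
    await logger.flush();
  }
}

// Counting logger used only for this demonstration.
let flushCount = 0;
const countingLogger: FlushableLogger = {
  info: () => {},
  error: () => {},
  flush: async () => { flushCount += 1; },
};

withFlush(countingLogger, async () => "done").then((value) => {
  console.log(value, flushCount);
});
```

Because the flush sits in finally, an early return or a thrown error inside work cannot skip it.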

D1 & Drizzle ORM Governance (Mandatory)

Table Instance Ownership

| D1 Binding | Purpose | Examples |
| --- | --- | --- |
| `DB` (core) | All application tables | `system_logs`, `audit_logs`, `automation_logs`, `repos`, `prs`, `health_*`, `cloudflare_changelog`, everything not a raw webhook event |
| `DB_WEBHOOKS` | Raw GitHub webhook event data ONLY | `webhook_deliveries`, `pull_request`, `push`, `checkRun`, `workflow_run`, `webhook_configs`, `searches`, `repoAnalysis` |

Pre-Table-Creation Scan (MANDATORY)

Before creating ANY new Drizzle table, you MUST:

Run: grep -r "sqliteTable" src/backend/src/db/schemas/ --include="*.ts" -l to list all schema files
Read the relevant domain's index.ts barrel and the table definitions
Ask: "Can I add columns to an existing table instead of creating a new one?"
Only create a new table if no existing table can reasonably serve the purpose
Assign the table to the correct D1 instance based on the ownership table above

ORM Client Rules

DB (core): Always use getDb(env.DB) — imported from @db
DB_WEBHOOKS: Always use getWebhooksDb(env.DB_WEBHOOKS) — imported from @db
NEVER call drizzle(env.DB) or drizzle(env.DB_WEBHOOKS) directly — the schema argument is required

Migration Discipline

NEVER edit files in migrations/core/ or migrations/webhooks/ directly
ALWAYS generate migrations via: pnpm run db:generate:core or pnpm run db:generate:webhooks
ALWAYS apply via: pnpm run migrate:remote:core or pnpm run migrate:remote:webhooks
Exception: if a migration fails and manual repair is explicitly authorized by the user
To reset D1 from scratch: pnpm run db:reset

Full Reset + Seed Protocol

pnpm run db:reset is fully autonomous — UUIDs are read from wrangler.jsonc automatically. No hardcoded constants to update.

After db:reset + deploy completes, restore prior data:

pnpm run db:seed:prep   # normalize exported data for D1 limits (truncates & chunks)
pnpm run db:seed:run    # apply seeds to fresh instances (bulk + per-statement fallback)

⚠️ NEVER put seed files in migrations/ — place them only in scripts/db/seeds/.

D1 Execution Limits (Reference)

| Limit | Value |
| --- | --- |
| Max bound parameters per query | 100 |
| Max SQL statement | 100 KB (scripts use 90 KB) |
| Max query duration | 30 seconds |
| Safe INSERT batch | 100 rows |
| Max D1 database size | 10 GB |
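When rows are inserted with bound parameters, the parameter limit and the row limit interact: each row consumes one parameter per column. The sketch below shows one way to size batches under that assumption; chunkForD1 is a hypothetical helper, and scripts that inline values instead of binding them would be constrained by statement size rather than by this calculation.

```typescript
const MAX_BOUND_PARAMS = 100; // D1 limit per query, from the table above
const MAX_ROWS_PER_INSERT = 100; // "safe INSERT batch", from the table above

// Splits rows into batches so that rows * columns stays within the parameter limit.
function chunkForD1<T>(rows: T[], columnsPerRow: number): T[][] {
  if (columnsPerRow < 1 || columnsPerRow > MAX_BOUND_PARAMS) {
    throw new Error(`columnsPerRow must be between 1 and ${MAX_BOUND_PARAMS}`);
  }
  const batchSize = Math.min(
    MAX_ROWS_PER_INSERT,
    Math.floor(MAX_BOUND_PARAMS / columnsPerRow),
  );
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}

// 250 rows with 7 columns each: 14 rows per batch (98 parameters).
const sampleRows = Array.from({ length: 250 }, (_, i) => ({ id: i }));
const sampleBatches = chunkForD1(sampleRows, 7);
console.log(sampleBatches.length, sampleBatches[0].length);
```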

D1 Health Monitors

Three health checks run automatically as part of POST /api/health/run:

| Check ID | Fails When |
| --- | --- |
| `webhook_staleness` | `webhook_deliveries` empty OR >24h lag behind GitHub API OR >30 days since last delivery |
| `log_staleness` | `system_logs` empty OR latest entry >1 day old |
| `d1_table_scan` | Any table has 0 rows or last row >30 days old across both DB instances |
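The "empty OR older than a threshold" rule shared by these checks reduces to a small predicate. This is a sketch of that rule only, not the health check implementation; isStale is a hypothetical name, and the GitHub API lag comparison used by webhook_staleness is not covered.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// True when a table counts as stale: no rows at all, or the newest row is
// older than maxAgeMs. `latest` is null for an empty table.
function isStale(latest: Date | null, maxAgeMs: number, now: Date = new Date()): boolean {
  if (latest === null) return true;
  return now.getTime() - latest.getTime() > maxAgeMs;
}

const referenceTime = new Date("2026-03-15T12:00:00Z");
const lastLogEntry = new Date("2026-03-13T12:00:00Z"); // two days earlier

console.log(isStale(lastLogEntry, 1 * DAY_MS, referenceTime)); // log_staleness threshold
console.log(isStale(lastLogEntry, 30 * DAY_MS, referenceTime)); // d1_table_scan threshold
```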

To manually verify D1 staleness:

# Quick row count
wrangler d1 execute DB --remote --command "SELECT count(*) FROM system_logs;"
wrangler d1 execute DB_WEBHOOKS --remote --command "SELECT count(*) FROM webhook_deliveries;"

# Full D1 health check (live API)
curl -X POST https://core-github-api.hacolby.workers.dev/api/health/run | \
  python3 -c "import sys, json; [print(r['name'], r['status'], '|', r['message'][:80]) for r in json.load(sys.stdin).get('results', []) if r['name'] in ['Webhook Staleness','Log Staleness','D1 Table Scan']]"

For the full D1 audit workflow, run: /d1-audit
See also: .agent/rules/d1-drizzle-governance.md | .agent/workflows/d1-audit.md

GitHub Webhook Architecture (CRITICAL — READ BEFORE TOUCHING ROUTES)

See .agent/rules/github-webhooks.md for the full rule set.

Canonical Webhook URL (IMMUTABLE)

POST https://core-github-api.hacolby.workers.dev/api/webhooks

This URL is hardcoded in the GitHub App settings (GitHub Settings → Developer → Apps → core-github-api → Webhook URL).

| Property | Value |
| --- | --- |
| Route File | `src/backend/src/routes/api/webhooks/index.ts` |
| Route Mount | `src/backend/src/routes/index.ts` → `.route('/api/webhooks', webhooksApi)` |
| wrangler.jsonc var | `WEBHOOK_URL = "https://core-github-api.hacolby.workers.dev/api/webhooks"` |
| Health Check | `GET /api/health/github-app-webhooks` |

DO NOT rename or move /api/webhooks. Every GitHub event (push, PR, issue, check_run, etc.) from the jmbish04 organization is delivered to this exact path. A path change without a simultaneous GitHub App settings update will cause silent data loss in DB_WEBHOOKS.

Root Cause History (March 2026)

❌ Old (wrong): GitHub App was configured to POST to /webhooks
✅ Fixed: Corrected to /api/webhooks (the actual worker route prefix)
✅ Result: DB_WEBHOOKS.webhook_deliveries started receiving rows immediately after correction
✅ Safeguard: WEBHOOK_URL env var added to wrangler.jsonc as single source of truth

Health Check for GitHub App Webhooks

The endpoint GET /api/health/github-app-webhooks authenticates as the GitHub App (JWT, not installation token) and:

Fetches the current webhook URL from GitHub App settings
Compares it against env.WEBHOOK_URL
Scans the 50 most recent deliveries for status_code >= 400 failures
Returns { status: 'healthy' | 'degraded' | 'unhealthy', urlMatchesExpected, failedDeliveries }
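The comparison and scan steps above can be sketched as a pure function over data that has already been fetched. How the three status values are assigned is not stated in this document, so the mapping below (URL mismatch is unhealthy, any failed delivery is degraded) is an assumption, as are the Delivery shape and the name evaluateWebhookHealth.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface Delivery {
  id: number;
  status_code: number;
}

interface WebhookHealth {
  status: HealthStatus;
  urlMatchesExpected: boolean;
  failedDeliveries: Delivery[];
}

// Assumed mapping: URL mismatch => unhealthy, any failed delivery => degraded.
// `recent` is expected to be ordered newest first.
function evaluateWebhookHealth(
  actualUrl: string,
  expectedUrl: string,
  recent: Delivery[],
): WebhookHealth {
  const urlMatchesExpected = actualUrl === expectedUrl;
  const failedDeliveries = recent
    .slice(0, 50) // only the 50 most recent deliveries are scanned
    .filter((d) => d.status_code >= 400);
  const status: HealthStatus = !urlMatchesExpected
    ? "unhealthy"
    : failedDeliveries.length > 0
      ? "degraded"
      : "healthy";
  return { status, urlMatchesExpected, failedDeliveries };
}

const expectedWebhookUrl = "https://core-github-api.hacolby.workers.dev/api/webhooks";
const sampleDeliveries: Delivery[] = [
  { id: 3, status_code: 200 },
  { id: 2, status_code: 500 },
  { id: 1, status_code: 200 },
];
console.log(evaluateWebhookHealth(expectedWebhookUrl, expectedWebhookUrl, sampleDeliveries).status);
```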

Mobile-First Responsive Standard

All UI development within this ecosystem MUST prioritize fluid, mobile-responsive layouts. Our application shell (Sidebar) manages its own responsive off-canvas state via useIsMobile, but all internal page content (global views, repo-specific views, dashboards, etc.) must degrade gracefully on smaller viewports.

Utility-First: Utilize Tailwind CSS mobile-first breakpoints (e.g., default classes for mobile, shifting to sm:, md:, lg:, xl: for larger screens).
Fluid Widths: Never hardcode pixel widths for layout containers; use percentages or viewport units (e.g., w-full md:w-1/2).
Stacked Layouts: Grid and flex layouts must stack correctly on mobile (e.g., flex-col md:flex-row, grid-cols-1 md:grid-cols-2).
Data Tables: Wide data tables or complex elements must be wrapped in an overflow-x-auto container to prevent viewport breakage.
