docs(ci): add tool integration checklist guidance (Tracer-Cloud#2697)

muddlebee · claude · greptile-apps[bot] · web-flow · commit 3c25fb96d628 · 2026-06-02T14:10:38.000+05:30
* docs(ci): add docs-only CI shortcut and tool checklist

- add a docs/process-only shortcut to CI.md for non-runtime changes
- add TOOL_INTEGRATION_CHECKLIST.md as the dedicated tool and integration review checklist
- add short AGENTS.md references pointing tool and integration work to the standalone checklist

* docs(agents): consolidate New Integration Checklist into TOOL_INTEGRATION_CHECKLIST.md

Replace the duplicate 7-item checklist in AGENTS.md section 6 with a single
pointer to TOOL_INTEGRATION_CHECKLIST.md, which is now the single source of
truth for tool/integration definition of done. Also updates the stale rule
in section 3 to point to the checklist file directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(checklist): improve TOOL_INTEGRATION_CHECKLIST clarity and coverage

- Add retrieval_controls to the metadata completeness check
- Add masking check for tools that may return secrets/tokens/PII
- Add 429/5xx upstream error handling to live payload section
- Add .env.example check for new integration credentials
- Rewrite section 2 Core completeness items as verifiable states
  (not ordering instructions)
- Replace duplicate "Demo / proof" artifact list with a slim "Final gate"
  block pointing back to the detailed checks above

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update CI.md

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* docs(checklist): add runtime registry and planner invocation test checks

Two test coverage checks were missing:
- Runtime registry/discovery test proving the tool is visible on its
  declared surface(s) — distinct from schema/metadata shape checks
- If investigation-relevant, at least one test proving the planner/agent
  can actually discover or invoke the tool through the normal runtime path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
diff --git a/AGENTS.md b/AGENTS.md
@@ -79,6 +79,7 @@ Steps:
 3. Keep the tool self-contained. Put reusable transport or parsing code in `app/services/` or `app/tools/utils/` rather than copying it into the tool body.
 4. If the tool should appear in both investigation and chat surfaces, set `surfaces=("investigation", "chat")`.
 5. Add tests that cover schema shape, availability, extraction, and the runtime behavior that the planner depends on.
+6. Before opening or approving the PR, follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) for tool/integration-specific wiring, payload, docs, and regression checks.
 
 ### Changing the investigation pipeline
 
@@ -128,6 +129,7 @@ Basic steps:
 3. Wire the tool layer after the config path is stable.
 4. Add docs and tests together so the integration is understandable and verifiable.
 5. Run `make verify-integrations` before treating the integration as complete.
+6. Before opening or approving the PR, follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) for integration completeness, investigation wiring, docs, and demo/test requirements.
 
 ## 3. Rules (if X -> do Y)
 
@@ -137,8 +139,9 @@ Basic steps:
 - If an existing feature changes behavior, flags, or config shape -> update the relevant `docs/` page in the same PR; docs and code must stay in sync.
 - When writing or editing a `docs/` page -> write for **users, not contributors**. Open with a command quick-reference table (command | what it does) if the page covers CLI commands. Follow with brief practical examples. Keep internal file formats, JSONL schemas, and implementation details out of user-facing pages — move those to `docs/DEVELOPMENT.md` or a contributor-only reference file if truly needed.
 - If a tool's API or schema changes -> update docs in `docs/` and update the related unit tests, usually under `tests/tools/`. For investigation LLM tool-calling (any provider), follow [docs/investigation-tool-calling.md](docs/investigation-tool-calling.md).
+- If adding or materially changing a tool/integration -> follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) in the same PR.
 - If an integration changes -> update `tests/integrations/` and verify with `make verify-integrations`.
-- If adding a new integration -> follow the New Integration Checklist below before opening the PR for review.
+- If adding a new integration -> follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) before opening the PR for review.
 - If adding new tests -> always place them in `tests/`, never in `app/` (no inline tests).
 - If CI-only tests are added -> mark them with the right pytest marker or place them in the appropriate e2e/synthetic/chaos folder so they do not run in the default local suite.
 - If investigation branching or loop behavior changes -> update `app/pipeline/pipeline.py` and the tests for that path.
@@ -161,12 +164,4 @@ Test commands, routing rules, CI-only paths: **[CI.md](CI.md)**. Live REPL testi
 
 ## 6. New Integration Checklist
 
-When adding a new integration, a PR is only ready when:
-
-- Integration code added under `app/integrations/<name>/`
-- Tool(s) added under `app/tools/` with proper typing
-- Unit/mock tests added under `tests/integrations/`
-- Docs added under `docs/` and registered in `docs/docs.json` `pages`
-- Screenshot or demo GIF showing the integration working
-- E2E or synthetic test added
-- CI checks pass (see [CI.md](CI.md))
+Follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) — it is the single definition of done for all tool and integration work.
diff --git a/CI.md b/CI.md
@@ -2,7 +2,38 @@
 
 This file is the **single source of truth** for local CI readiness before any push or PR.
 
-## 1) Mandatory baseline checks (every code change)
+## 0) Docs / process-only shortcut
+
+If your diff is **only** documentation or contributor-process files, you may
+skip the code-quality and test commands below.
+
+Examples of files that qualify:
+
+- `AGENTS.md`
+- `CI.md`
+- `CONTRIBUTING.md`
+- `README.md`
+- `TESTING.md`
+- `TOOL_INTEGRATION_CHECKLIST.md`
+- `docs/**/*.md`
+- `docs/**/*.mdx`
+- `docs/docs.json`
+
+You may use the shortcut only when **all** changed files are non-runtime and
+non-executable. If the diff touches application code, tests, build tooling,
+dependency manifests, CI workflows, scripts, or anything with runtime impact,
+run the normal harness.
+
+For docs/process-only changes, the minimum required local check is:
+
+```bash
+git status --short
+```
+
+If you are unsure whether the shortcut applies, do **not** use it — run the
+standard checks below.
+
+## 1) Mandatory baseline checks (every code change that is not docs/process-only)
 
 Run all of these first:
 
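Reviewer note (not part of the commit): the docs/process-only rule added to CI.md above can be sketched as a small check over the changed paths. This is only an illustration. The function name `is_docs_only` is hypothetical, and the pattern list is copied from the examples in the hunk, which CI.md presents as examples rather than an exhaustive allowlist.

```python
from fnmatch import fnmatch

# Patterns taken from the example list in CI.md section 0.
# fnmatch's "*" also matches "/", so "docs/*.md" covers nested folders.
DOCS_ONLY_PATTERNS = (
    "AGENTS.md",
    "CI.md",
    "CONTRIBUTING.md",
    "README.md",
    "TESTING.md",
    "TOOL_INTEGRATION_CHECKLIST.md",
    "docs/*.md",
    "docs/*.mdx",
    "docs/docs.json",
)


def is_docs_only(changed_paths):
    """Return True only when every changed path matches a docs/process pattern."""
    paths = list(changed_paths)
    if not paths:
        # Assumption: with nothing to inspect, stay conservative and
        # do not claim the shortcut.
        return False
    return all(
        any(fnmatch(path, pattern) for pattern in DOCS_ONLY_PATTERNS)
        for path in paths
    )
```

For example, `is_docs_only(["CI.md", "app/tools/foo.py"])` is false because one path has runtime impact, which mirrors the "if unsure, run the standard checks" guidance in the hunk.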
diff --git a/TOOL_INTEGRATION_CHECKLIST.md b/TOOL_INTEGRATION_CHECKLIST.md
@@ -0,0 +1,144 @@
+# Tool & Integration Definition of Done
+
+Use this checklist whenever you add or materially change:
+
+- a tool under `app/tools/`
+- an integration under `app/integrations/`
+- a service client under `app/services/` that changes investigation behavior
+- investigation source wiring for an existing tool/integration
+
+This file is the detailed definition of done for tool and integration work. Use it together with [AGENTS.md](AGENTS.md) and [CI.md](CI.md).
+
+## 1. Tool checklist
+
+### Files usually involved
+
+- `app/tools/<ToolName>/__init__.py` or `app/tools/<tool_file>.py`
+- `app/tools/utils/` for shared helpers
+- `app/services/<vendor>/client.py` if transport/parsing should live in a reusable client
+- `docs/<tool_name>.mdx`
+- `docs/docs.json`
+- `tests/tools/test_<tool_name>.py`
+
+### Contract and implementation
+
+- [ ] Pick the simplest shape that fits the tool (`@tool(...)` for lightweight tools, richer class only when needed)
+- [ ] Metadata is complete and accurate: `name`, `description`, `source`, `surfaces`, `requires`, and any `use_cases` / `outputs` / `retrieval_controls`
+- [ ] `input_schema` matches the actual runtime arguments and required fields
+- [ ] `is_available` only returns `True` when the tool can genuinely run
+- [ ] `extract_params` maps resolved integration state into tool args correctly
+- [ ] Failure responses have a stable, investigation-friendly shape
+- [ ] Tool output is normalized enough for the planner/LLM to consume reliably
+- [ ] Reusable transport/parsing logic lives in `app/services/` or `app/tools/utils/` rather than being copied into the tool body
+- [ ] If the tool should appear in both investigation and chat, set `surfaces=("investigation", "chat")`
+- [ ] Output that may contain secrets, tokens, or PII is run through `app/masking/` before being returned
+
+### Live payload parsing
+
+If the tool parses API, MCP, log, or webhook payloads:
+
+- [ ] Validate against the real or documented upstream response shape, not only idealized mocks
+- [ ] Handle alternate field names used in live payloads
+- [ ] Handle missing or partial fields without returning unusable output
+- [ ] Preserve important context when truncating, tailing, paginating, or flattening data
+- [ ] Upstream 429 / 5xx responses are handled and return a clear, investigation-friendly error rather than raising
+- [ ] Add at least one regression test using a realistic fixture payload
+
+Common failure modes to consider:
+
+- grouped + ungrouped log content
+- nested/foldered resources
+- paginated responses
+- `hasMore` / cursor mismatches
+- content-vs-pointer response shapes (`logs_content` vs `logs_url`-style payloads)
+
+## 2. Integration checklist
+
+### Files usually involved
+
+- `app/integrations/<name>.py`
+- `app/integrations/catalog.py`
+- `app/integrations/verify.py`
+- `app/services/<name>/client.py`
+- `app/tools/<Name>Tool/` or `app/tools/<tool_file>.py`
+- `docs/<name>.mdx`
+- `docs/docs.json`
+- `tests/integrations/test_<name>.py`
+- related `tests/tools/`, `tests/e2e/`, or `tests/synthetic/` coverage
+
+### Core completeness
+
+- [ ] Integration config, normalization, and validators are in place under `app/integrations/<name>.py`
+- [ ] Catalog resolution / env loading is wired correctly
+- [ ] Verification path is wired in `app/integrations/verify.py` and adapters/registry as needed
+- [ ] Service client is added under `app/services/<name>/client.py` (only if the integration needs direct remote calls)
+- [ ] Tool layer is wired and stable
+- [ ] CLI setup flow is updated if the integration is user-configurable locally
+- [ ] `opensre onboard` parity is added or intentionally documented as out of scope
+- [ ] Any new required env vars or credentials are added to `.env.example` (never `.env`)
+- [ ] Docs and tests are added together so the integration is understandable and verifiable
+- [ ] If a new `docs/` page is added, it is registered in `docs/docs.json`
+- [ ] `make verify-integrations` passes
+
+## 3. Investigation wiring checklist
+
+If the tool/integration is relevant to investigations:
+
+- [ ] Review alert-source seeding in `app/agent/investigation.py`
+- [ ] Review source-priority/prompt mapping in `app/agent/prompt.py`
+- [ ] Review evidence/source registration in `app/types/` or related state models when relevant
+- [ ] Add scenario coverage proving the tool surfaces useful RCA evidence
+
+If the integration is first-class for an `alert_source`, the source-to-tool maps must be reviewed explicitly.
+
+## 4. Discovery and edge cases
+
+For tools that list, search, or inspect resources:
+
+- [ ] Folder/nested resource layouts are considered where the upstream system supports them
+- [ ] Large result sets are capped or paginated intentionally
+- [ ] Partial fetches are surfaced clearly (`truncated`, `fetch_error`, etc.)
+- [ ] Time/order-sensitive results preserve causal ordering where it matters
+
+## 5. Docs, tests, and demos
+
+### Docs
+
+- [ ] If a new feature is shipped (tool, CLI command, pipeline behavior, integration), add or update a `docs/` page/section in the same PR
+- [ ] If a tool's API or schema changes, update docs in the same PR
+- [ ] If an integration changes, keep docs and config/setup guidance in sync
+- [ ] For investigation LLM tool-calling changes, follow [docs/investigation-tool-calling.md](docs/investigation-tool-calling.md)
+
+### Tests
+
+- [ ] Unit tests for config/normalization
+- [ ] Tool contract tests or equivalent schema/metadata coverage
+- [ ] Runtime registry/discovery test proves the tool is visible on the expected surface(s)
+- [ ] Runtime behavior tests for success and failure paths
+- [ ] At least one realistic fixture for live payload parsing if external payloads are involved
+- [ ] If investigation-relevant, at least one test proves the planner/agent can discover or invoke the tool through the normal runtime path
+- [ ] Synthetic or scenario coverage when the planner/investigation loop depends on the tool
+- [ ] Update `tests/integrations/` when integration wiring changes
+
+Green tests are not enough if they only cover idealized mocks.
+
+### Final gate (new integrations only)
+
+Before the PR is ready for review, verify all of the above are complete **and**:
+
+- [ ] Screenshot or demo GIF showing the integration working end-to-end
+- [ ] E2E or synthetic test added
+- [ ] `make verify-integrations` passes
+- [ ] CI checks pass (see [CI.md](CI.md))
+
+## 6. PR review checklist
+
+Before opening or approving a PR that adds/changes a tool or integration, confirm:
+
+- [ ] alert-source maps were reviewed explicitly
+- [ ] live payload parsing was reviewed explicitly
+- [ ] onboarding/setup/docs parity was reviewed explicitly
+- [ ] pagination/truncation/partial-response behavior was reviewed explicitly
+- [ ] tests cover realistic payloads and investigation usefulness, not only happy paths
+
+Follow [CI.md](CI.md) for the mandatory pre-push commands.
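
Reviewer note (not part of the commit): several checklist items above ask for a stable failure shape, handling of upstream 429 / 5xx without raising, tolerance for alternate field names, and clearly surfaced partial fetches. A minimal sketch of one way to satisfy them together is shown here. The function name `to_tool_result`, the `items` / `results` / `retryable` keys, and the `max_items` default are assumptions for illustration; only `truncated`, `fetch_error`, and `hasMore` are names that appear in the checklist, and none of this is the repository's actual tool API.

```python
from typing import Any, Dict, Optional


def to_tool_result(
    status_code: int,
    payload: Optional[Dict[str, Any]],
    max_items: int = 50,
) -> Dict[str, Any]:
    """Normalize an upstream response into one stable result shape."""
    if status_code >= 400:
        # 429 and 5xx are treated as retryable; the error is reported
        # in the result instead of being raised.
        retryable = status_code == 429 or 500 <= status_code <= 599
        return {
            "ok": False,
            "items": [],
            "truncated": False,
            "fetch_error": f"upstream returned HTTP {status_code}",
            "retryable": retryable,
        }

    data = payload if isinstance(payload, dict) else {}
    # Hypothetical alternate field names for the same list of records.
    items = data.get("items") or data.get("results") or []
    # Mark the result as partial when it is capped here or when the
    # upstream response signals that more data exists.
    truncated = len(items) > max_items or bool(data.get("hasMore"))
    return {
        "ok": True,
        "items": list(items[:max_items]),
        "truncated": truncated,
        "fetch_error": None,
        "retryable": False,
    }
```

Success and failure results share the same keys, so a caller can read `truncated` and `fetch_error` without branching on the response type. A regression test built on a realistic fixture payload would exercise both branches.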