Skip to content

Commit 3c25fb9

Browse files
muddlebeeclaudegreptile-apps[bot]
authored
docs(ci): add tool integration checklist guidance (Tracer-Cloud#2697)
* docs(ci): add docs-only CI shortcut and tool checklist - add a docs/process-only shortcut to CI.md for non-runtime changes - add TOOL_INTEGRATION_CHECKLIST.md as the dedicated tool and integration review checklist - add short AGENTS.md references pointing tool and integration work to the standalone checklist * docs(agents): consolidate New Integration Checklist into TOOL_INTEGRATION_CHECKLIST.md Replace the duplicate 7-item checklist in AGENTS.md section 6 with a single pointer to TOOL_INTEGRATION_CHECKLIST.md, which is now the single source of truth for tool/integration definition of done. Also updates the stale rule in section 3 to point to the checklist file directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(checklist): improve TOOL_INTEGRATION_CHECKLIST clarity and coverage - Add retrieval_controls to the metadata completeness check - Add masking check for tools that may return secrets/tokens/PII - Add 429/5xx upstream error handling to live payload section - Add .env.example check for new integration credentials - Rewrite section 2 Core completeness items as verifiable states (not ordering instructions) - Replace duplicate "Demo / proof" artifact list with a slim "Final gate" block pointing back to the detailed checks above Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Update CI.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * docs(checklist): add runtime registry and planner invocation test checks Two test coverage gaps were missing: - Runtime registry/discovery test proving the tool is visible on its declared surface(s) — distinct from schema/metadata shape checks - If investigation-relevant, at least one test proving the planner/agent can actually discover or invoke the tool through the normal runtime path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
1 parent b0fb4ec commit 3c25fb9

3 files changed

Lines changed: 181 additions & 11 deletions

File tree

AGENTS.md

Lines changed: 5 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -79,6 +79,7 @@ Steps:
7979
3. Keep the tool self-contained. Put reusable transport or parsing code in `app/services/` or `app/tools/utils/` rather than copying it into the tool body.
8080
4. If the tool should appear in both investigation and chat surfaces, set `surfaces=("investigation", "chat")`.
8181
5. Add tests that cover schema shape, availability, extraction, and the runtime behavior that the planner depends on.
82+
6. Before opening or approving the PR, follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) for tool/integration-specific wiring, payload, docs, and regression checks.
8283

8384
### Changing the investigation pipeline
8485

@@ -128,6 +129,7 @@ Basic steps:
128129
3. Wire the tool layer after the config path is stable.
129130
4. Add docs and tests together so the integration is understandable and verifiable.
130131
5. Run `make verify-integrations` before treating the integration as complete.
132+
6. Before opening or approving the PR, follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) for integration completeness, investigation wiring, docs, and demo/test requirements.
131133

132134
## 3. Rules (if X -> do Y)
133135

@@ -137,8 +139,9 @@ Basic steps:
137139
- If an existing feature changes behavior, flags, or config shape -> update the relevant `docs/` page in the same PR; docs and code must stay in sync.
138140
- When writing or editing a `docs/` page -> write for **users, not contributors**. Open with a command quick-reference table (command | what it does) if the page covers CLI commands. Follow with brief practical examples. Keep internal file formats, JSONL schemas, and implementation details out of user-facing pages — move those to `docs/DEVELOPMENT.md` or a contributor-only reference file if truly needed.
139141
- If a tool's API or schema changes -> update docs in `docs/` and update the related unit tests, usually under `tests/tools/`. For investigation LLM tool-calling (any provider), follow [docs/investigation-tool-calling.md](docs/investigation-tool-calling.md).
142+
- If adding or materially changing a tool/integration -> follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) in the same PR.
140143
- If an integration changes -> update `tests/integrations/` and verify with `make verify-integrations`.
141-
- If adding a new integration -> follow the New Integration Checklist below before opening the PR for review.
144+
- If adding a new integration -> follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) before opening the PR for review.
142145
- If adding new tests -> always place them in `tests/`, never in `app/` (no inline tests).
143146
- If CI-only tests are added -> mark them with the right pytest marker or place them in the appropriate e2e/synthetic/chaos folder so they do not run in the default local suite.
144147
- If investigation branching or loop behavior changes -> update `app/pipeline/pipeline.py` and the tests for that path.
@@ -161,12 +164,4 @@ Test commands, routing rules, CI-only paths: **[CI.md](CI.md)**. Live REPL testi
161164

162165
## 6. New Integration Checklist
163166

164-
When adding a new integration, a PR is only ready when:
165-
166-
- Integration code added under `app/integrations/<name>/`
167-
- Tool(s) added under `app/tools/` with proper typing
168-
- Unit/mock tests added under `tests/integrations/`
169-
- Docs added under `docs/` and registered in `docs/docs.json` `pages`
170-
- Screenshot or demo GIF showing the integration working
171-
- E2E or synthetic test added
172-
- CI checks pass (see [CI.md](CI.md))
167+
Follow [TOOL_INTEGRATION_CHECKLIST.md](TOOL_INTEGRATION_CHECKLIST.md) — it is the single definition of done for all tool and integration work.

CI.md

Lines changed: 32 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,38 @@
22

33
This file is the **single source of truth** for local CI readiness before any push or PR.
44

5-
## 1) Mandatory baseline checks (every code change)
5+
## 0) Docs / process-only shortcut
6+
7+
If your diff is **only** documentation or contributor-process files, you may
8+
skip the code-quality and test commands below.
9+
10+
Examples of files that qualify:
11+
12+
- `AGENTS.md`
13+
- `CI.md`
14+
- `CONTRIBUTING.md`
15+
- `README.md`
16+
- `TESTING.md`
17+
- `TOOL_INTEGRATION_CHECKLIST.md`
18+
- `docs/**/*.md`
19+
- `docs/**/*.mdx`
20+
- `docs/docs.json`
21+
22+
You may use the shortcut only when **all** changed files are non-runtime and
23+
non-executable. If the diff touches application code, tests, build tooling,
24+
dependency manifests, CI workflows, scripts, or anything with runtime impact,
25+
run the normal harness.
26+
27+
For docs/process-only changes, the minimum required local check is:
28+
29+
```bash
30+
git status --short
31+
```
32+
33+
If you are unsure whether the shortcut applies, do **not** use it — run the
34+
standard checks below.
35+
36+
## 1) Mandatory baseline checks (every code change that is not docs/process-only)
637

738
Run all of these first:
839

TOOL_INTEGRATION_CHECKLIST.md

Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
# Tool & Integration Definition of Done
2+
3+
Use this checklist whenever you add or materially change:
4+
5+
- a tool under `app/tools/`
6+
- an integration under `app/integrations/`
7+
- a service client under `app/services/` that changes investigation behavior
8+
- investigation source wiring for an existing tool/integration
9+
10+
This file is the detailed definition of done for tool and integration work. Use it together with [AGENTS.md](AGENTS.md) and [CI.md](CI.md).
11+
12+
## 1. Tool checklist
13+
14+
### Files usually involved
15+
16+
- `app/tools/<ToolName>/__init__.py` or `app/tools/<tool_file>.py`
17+
- `app/tools/utils/` for shared helpers
18+
- `app/services/<vendor>/client.py` if transport/parsing should live in a reusable client
19+
- `docs/<tool_name>.mdx`
20+
- `docs/docs.json`
21+
- `tests/tools/test_<tool_name>.py`
22+
23+
### Contract and implementation
24+
25+
- [ ] Pick the simplest shape that fits the tool (`@tool(...)` for lightweight tools, richer class only when needed)
26+
- [ ] Metadata is complete and accurate: `name`, `description`, `source`, `surfaces`, `requires`, and any `use_cases` / `outputs` / `retrieval_controls`
27+
- [ ] `input_schema` matches the actual runtime arguments and required fields
28+
- [ ] `is_available` only returns `True` when the tool can genuinely run
29+
- [ ] `extract_params` maps resolved integration state into tool args correctly
30+
- [ ] Failure responses have a stable, investigation-friendly shape
31+
- [ ] Tool output is normalized enough for the planner/LLM to consume reliably
32+
- [ ] Reusable transport/parsing logic lives in `app/services/` or `app/tools/utils/` rather than being copied into the tool body
33+
- [ ] If the tool should appear in both investigation and chat, set `surfaces=("investigation", "chat")`
34+
- [ ] Output that may contain secrets, tokens, or PII is run through `app/masking/` before being returned
35+
36+
### Live payload parsing
37+
38+
If the tool parses API, MCP, log, or webhook payloads:
39+
40+
- [ ] Validate against the real or documented upstream response shape, not only idealized mocks
41+
- [ ] Handle alternate field names used in live payloads
42+
- [ ] Handle missing or partial fields without returning unusable output
43+
- [ ] Preserve important context when truncating, tailing, paginating, or flattening data
44+
- [ ] Upstream 429 / 5xx responses are handled and return a clear, investigation-friendly error rather than raising
45+
- [ ] Add at least one regression test using a realistic fixture payload
46+
47+
Common failure modes to consider:
48+
49+
- grouped + ungrouped log content
50+
- nested/foldered resources
51+
- paginated responses
52+
- `hasMore` / cursor mismatches
53+
- content-vs-pointer response shapes (`logs_content` vs `logs_url`-style payloads)
54+
55+
## 2. Integration checklist
56+
57+
### Files usually involved
58+
59+
- `app/integrations/<name>.py`
60+
- `app/integrations/catalog.py`
61+
- `app/integrations/verify.py`
62+
- `app/services/<name>/client.py`
63+
- `app/tools/<Name>Tool/` or `app/tools/<tool_file>.py`
64+
- `docs/<name>.mdx`
65+
- `docs/docs.json`
66+
- `tests/integrations/test_<name>.py`
67+
- related `tests/tools/`, `tests/e2e/`, or `tests/synthetic/` coverage
68+
69+
### Core completeness
70+
71+
- [ ] Integration config, normalization, and validators are in place under `app/integrations/<name>.py`
72+
- [ ] Catalog resolution / env loading is wired correctly
73+
- [ ] Verification path is wired in `app/integrations/verify.py` and adapters/registry as needed
74+
- [ ] Service client is added under `app/services/<name>/client.py` (only if the integration needs direct remote calls)
75+
- [ ] Tool layer is wired and stable
76+
- [ ] CLI setup flow is updated if the integration is user-configurable locally
77+
- [ ] `opensre onboard` parity is added or intentionally documented as out of scope
78+
- [ ] Any new required env vars or credentials are added to `.env.example` (never `.env`)
79+
- [ ] Docs and tests are added together so the integration is understandable and verifiable
80+
- [ ] If a new `docs/` page is added, it is registered in `docs/docs.json`
81+
- [ ] `make verify-integrations` passes
82+
83+
## 3. Investigation wiring checklist
84+
85+
If the tool/integration is relevant to investigations:
86+
87+
- [ ] Review alert-source seeding in `app/agent/investigation.py`
88+
- [ ] Review source-priority/prompt mapping in `app/agent/prompt.py`
89+
- [ ] Review evidence/source registration in `app/types/` or related state models when relevant
90+
- [ ] Add scenario coverage proving the tool surfaces useful RCA evidence
91+
92+
If the integration is first-class for an `alert_source`, the source-to-tool maps must be reviewed explicitly.
93+
94+
## 4. Discovery and edge cases
95+
96+
For tools that list, search, or inspect resources:
97+
98+
- [ ] Folder/nested resource layouts are considered where the upstream system supports them
99+
- [ ] Large result sets are capped or paginated intentionally
100+
- [ ] Partial fetches are surfaced clearly (`truncated`, `fetch_error`, etc.)
101+
- [ ] Time/order-sensitive results preserve causal ordering where it matters
102+
103+
## 5. Docs, tests, and demos
104+
105+
### Docs
106+
107+
- [ ] If a new feature is shipped (tool, CLI command, pipeline behavior, integration), add or update a `docs/` page/section in the same PR
108+
- [ ] If a tool's API or schema changes, update docs in the same PR
109+
- [ ] If an integration changes, keep docs and config/setup guidance in sync
110+
- [ ] For investigation LLM tool-calling changes, follow [docs/investigation-tool-calling.md](docs/investigation-tool-calling.md)
111+
112+
### Tests
113+
114+
- [ ] Unit tests for config/normalization
115+
- [ ] Tool contract tests or equivalent schema/metadata coverage
116+
- [ ] Runtime registry/discovery test proves the tool is visible on the expected surface(s)
117+
- [ ] Runtime behavior tests for success and failure paths
118+
- [ ] At least one realistic fixture for live payload parsing if external payloads are involved
119+
- [ ] If investigation-relevant, at least one test proves the planner/agent can discover or invoke the tool through the normal runtime path
120+
- [ ] Synthetic or scenario coverage when the planner/investigation loop depends on the tool
121+
- [ ] Update `tests/integrations/` when integration wiring changes
122+
123+
Green tests are not enough if they only cover idealized mocks.
124+
125+
### Final gate (new integrations only)
126+
127+
Before the PR is ready for review, verify all of the above are complete **and**:
128+
129+
- [ ] Screenshot or demo GIF showing the integration working end-to-end
130+
- [ ] E2E or synthetic test added
131+
- [ ] `make verify-integrations` passes
132+
- [ ] CI checks pass (see [CI.md](CI.md))
133+
134+
## 6. PR review checklist
135+
136+
Before opening or approving a PR that adds/changes a tool or integration, confirm:
137+
138+
- [ ] alert-source maps were reviewed explicitly
139+
- [ ] live payload parsing was reviewed explicitly
140+
- [ ] onboarding/setup/docs parity was reviewed explicitly
141+
- [ ] pagination/truncation/partial-response behavior was reviewed explicitly
142+
- [ ] tests cover realistic payloads and investigation usefulness, not only happy paths
143+
144+
Follow [CI.md](CI.md) for the mandatory pre-push commands.

0 commit comments

Comments
 (0)