docs(skills): adopt create_or_update_*_monitor tools and preview-then-confirm flow #85

Merged
santiagoaguiar merged 2 commits into main from update-monitor-skills-pr9 on May 13, 2026


@santiagoaguiar
Contributor

Summary

  • Renames every create_*_monitor_mac reference in skill docs to the new create_or_update_{table,metric,validation,sql,comparison}_monitor tools introduced in ai-agent#998.
  • Documents the two-call preview-then-confirm flow that those tools enforce: dry_run=True (the default) returns rendered MaC YAML in result.yaml; dry_run=False actually creates/updates the monitor and returns result.monitor_uuid + a deep link in result.instructions (with result.yaml=None by design).
  • Documents monitor_uuid for update-in-place, is_draft for draft mode, and the parity params (schedule_type, interval_minutes, audiences, failure_audiences, notes, priority, tags, plus per-type extras like aggregate_time_sql/segment_sql/sensitivity/collection_lag_hours on metric, query_result_type/custom_sampling_sql/variable_definitions on SQL).
  • Migrates the tune-monitor skill off the now-extended-only direct-GraphQL create_metric_monitor / create_custom_sql_monitor / create_validation_monitor tools (hidden from the public default toolset) onto create_or_update_*_monitor with monitor_uuid for in-place updates.
  • Updates plugins/claude-code/evals/{monitoring-advisor,prevent}/live-evals-*.yaml and the eval README example to assert the new tool names in must_call / must_not_call.
  • Bumps monitoring-advisor to 2.1.0 and tune-monitor to 1.1.0.
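
The two-call flow described above can be sketched roughly as follows. This is an illustration only: `MonitorResult`, `preview_then_confirm`, and the local `create_or_update_metric_monitor` function are hypothetical stand-ins that mimic the behavior described in this summary (the real tool lives on the MCP server), and the parameter values are made up.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MonitorResult:
    """Assumed response shape, using only the field names from the summary."""
    yaml: Optional[str] = None
    monitor_uuid: Optional[str] = None
    instructions: Optional[str] = None


def create_or_update_metric_monitor(dry_run: bool = True, **params) -> MonitorResult:
    """Hypothetical local stub; it only imitates the documented behavior."""
    if dry_run:
        # Preview call: return rendered YAML, create nothing.
        rendered = "\n".join(f"{key}: {value}" for key, value in sorted(params.items()))
        return MonitorResult(yaml=rendered)
    # Confirm call: yaml is None by design; a uuid and a deep link come back.
    return MonitorResult(
        monitor_uuid=params.get("monitor_uuid") or "00000000-0000-0000-0000-000000000000",
        instructions="<deep link to the monitor>",
    )


def preview_then_confirm(
    confirm: Callable[[Optional[str]], bool], **params
) -> Optional[MonitorResult]:
    """Preview first, then deploy only if the caller approves the YAML."""
    preview = create_or_update_metric_monitor(**params)  # dry_run defaults to True
    if not confirm(preview.yaml):
        return None  # nothing was created or updated
    return create_or_update_metric_monitor(dry_run=False, **params)
```

In a skill, `confirm` would correspond to showing the YAML to the user and waiting for approval before the second call is made with the same arguments.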

Plugin-level CHANGELOGs are left for the next release cut (no Unreleased section convention in this repo).

Test plan

  • Spot-check the rendered skill docs to confirm tool names + preview/confirm flow read coherently end-to-end.
  • Run the monitoring-advisor live evals (plugins/claude-code/evals/monitoring-advisor/live-evals-{dev,prod}.yaml) against the updated dev MCP server to confirm that the new tool names match what the server actually exposes in the default toolset.
  • Run the prevent live eval (live-04-monitor-delegation) and confirm the turn-2 trace shows one of the new create_or_update_*_monitor calls.
  • Manually exercise the tune-monitor apply step against a sandbox monitor: dry_run=True returns YAML, dry_run=False updates in place and returns the deep link.
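
As a rough illustration of what the must_call / must_not_call assertions in the eval steps above check, the sketch below matches tool names from a trace against glob patterns. The function name `check_tool_calls`, the use of glob matching, and the trace format (a flat list of tool names) are assumptions for the example; the actual eval harness may match names differently.

```python
from fnmatch import fnmatchcase


def check_tool_calls(called, must_call=(), must_not_call=()):
    """Return a list of failure messages; an empty list means the trace passes."""
    failures = []
    for pattern in must_call:
        # At least one call in the trace has to match each required pattern.
        if not any(fnmatchcase(name, pattern) for name in called):
            failures.append(f"expected a call matching {pattern!r}")
    for pattern in must_not_call:
        # No call in the trace may match a forbidden pattern.
        hits = [name for name in called if fnmatchcase(name, pattern)]
        if hits:
            failures.append(f"forbidden call(s) matching {pattern!r}: {hits}")
    return failures


trace = ["get_monitors", "create_or_update_metric_monitor"]
failures = check_tool_calls(
    trace,
    must_call=["create_or_update_*_monitor"],
    must_not_call=["create_*_monitor_mac"],
)
```

One caveat if globs are used: a pattern such as `create_*_monitor` also matches the new `create_or_update_*_monitor` names, so the old direct-GraphQL tools are safer to forbid by exact name.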

🤖 Generated with Claude Code

…-confirm flow

ai-agent PR #998 renames the MaC creation tools to
create_or_update_{table,metric,validation,sql,comparison}_monitor and
makes dry_run=True the default. Update the monitoring-advisor and
tune-monitor skills (plus claude-code eval YAMLs) to reflect the new
names and the two-call flow, and document monitor_uuid for in-place
updates and the dry_run=False live-deploy response shape.

The direct-GraphQL create_*_monitor tools used by tune-monitor are now
EXTENDED_ONLY (hidden from the public default toolset). Migrate
tune-monitor's apply step to create_or_update_*_monitor with
monitor_uuid for update-in-place. Bumps monitoring-advisor to 2.1.0 and
tune-monitor to 1.1.0.

@mdediana (Contributor) left a comment


LGTM

…ith monitor_uuid

Follow-up to ai-agent PR #998 commit c2f451c, which clarifies that passing
monitor_uuid to create_or_update_*_monitor has PUT semantics — the call
fully replaces the monitor's configuration, and fields you omit revert to
tool defaults rather than being left untouched.

Adds the safe-edit recipe (read current config with get_monitors(...,
include_fields=["config"]), re-pass every kept field, diff the dry_run
YAML against the original) to:

- monitoring-advisor's data-monitor-creation.md (intro + Step 7)
- monitoring-advisor per-type references — augments the monitor_uuid row
  in the parameters table for all five monitor types
- tune-monitor references (metric, custom-sql, validation) — strengthens
  the "Common mistakes" section and cross-references the Phase 1 config
  fetch as the source of truth for the re-pass

Bumps monitoring-advisor to 2.1.1 and tune-monitor to 1.1.1.
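
A minimal sketch of the safe-edit recipe described in this commit, assuming the current configuration has already been fetched (for example with get_monitors and include_fields=["config"]) and flattened into a dict of tool parameters. The helper names `build_update_args` and `yaml_diff` are hypothetical, and the sample values are placeholders.

```python
import difflib


def build_update_args(current_config: dict, changes: dict, monitor_uuid: str) -> dict:
    """Re-pass every kept field, then overlay only the intended changes.

    Because an update with monitor_uuid has PUT semantics, starting from the
    full current config keeps omitted fields from reverting to tool defaults.
    """
    args = dict(current_config)  # copy, so the fetched config stays untouched
    args.update(changes)
    args["monitor_uuid"] = monitor_uuid
    return args


def yaml_diff(original_yaml: str, preview_yaml: str) -> list:
    """Return only the changed lines between the original and dry_run YAML."""
    lines = list(
        difflib.unified_diff(
            original_yaml.splitlines(),
            preview_yaml.splitlines(),
            fromfile="current",
            tofile="dry_run",
            lineterm="",
            n=0,  # no context lines, only additions and removals
        )
    )
    # Drop the two file-header lines and the "@@" hunk markers.
    return [line for line in lines[2:] if not line.startswith("@@")]
```

The diff is meant as a review aid before the dry_run=False call: any changed line beyond the intended edit suggests a field was dropped from the re-pass.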
@santiagoaguiar merged commit f99e234 into main on May 13, 2026
5 checks passed