Commit 36e106f

feat(skills): add diagnose skill for AI workflow health check 🤖🤖🤖
1 parent 746ba55 commit 36e106f

2 files changed

Lines changed: 107 additions & 0 deletions

docs/README.skills.md

Lines changed: 1 addition & 0 deletions
@@ -137,6 +137,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
| [declarative-agents](../skills/declarative-agents/SKILL.md)<br />`gh skills install github/awesome-copilot declarative-agents` | Complete development kit for Microsoft 365 Copilot declarative agents with three comprehensive workflows (basic, advanced, validation), TypeSpec support, and Microsoft 365 Agents Toolkit integration | None |
| [dependabot](../skills/dependabot/SKILL.md)<br />`gh skills install github/awesome-copilot dependabot` | Comprehensive guide for configuring and managing GitHub Dependabot. Use this skill when users ask about creating or optimizing dependabot.yml files, managing Dependabot pull requests, configuring dependency update strategies, setting up grouped updates, monorepo patterns, multi-ecosystem groups, security update configuration, auto-triage rules, or any GitHub Advanced Security (GHAS) supply chain security topic related to Dependabot. | `references/dependabot-yml-reference.md`<br />`references/example-configs.md`<br />`references/pr-commands.md` |
| [devops-rollout-plan](../skills/devops-rollout-plan/SKILL.md)<br />`gh skills install github/awesome-copilot devops-rollout-plan` | Generate comprehensive rollout plans with preflight checks, step-by-step deployment, verification signals, rollback procedures, and communication plans for infrastructure and application changes | None |
| [diagnose](../skills/diagnose/SKILL.md)<br />`gh skills install github/awesome-copilot diagnose` | Perform a systematic diagnostic scan of an AI workflow across 5 quality dimensions — prompt quality, context efficiency, tool health, architecture fitness, and safety — producing a scored report with prioritized remediation actions. | None |
| [documentation-writer](../skills/documentation-writer/SKILL.md)<br />`gh skills install github/awesome-copilot documentation-writer` | Diátaxis Documentation Expert. An expert technical writer specializing in creating high-quality software documentation, guided by the principles and structure of the Diátaxis technical documentation authoring framework. | None |
| [dotnet-best-practices](../skills/dotnet-best-practices/SKILL.md)<br />`gh skills install github/awesome-copilot dotnet-best-practices` | Ensure .NET/C# code meets best practices for the solution/project. | None |
| [dotnet-design-pattern-review](../skills/dotnet-design-pattern-review/SKILL.md)<br />`gh skills install github/awesome-copilot dotnet-design-pattern-review` | Review the C#/.NET code for design pattern implementation and suggest improvements. | None |

skills/diagnose/SKILL.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
---
name: diagnose
description: "Perform a systematic diagnostic scan of an AI workflow across 5 quality dimensions — prompt quality, context efficiency, tool health, architecture fitness, and safety — producing a scored report with prioritized remediation actions."
---

# AI Workflow Diagnostics

You are a systematic AI workflow auditor. Perform a diagnostic scan across 5 dimensions. For each dimension, score 1–5 and provide specific findings.

## Dimension 1: Prompt Quality (1–5)

Evaluate:

- Structure (role, context, instructions, output zones)
- Output schema definition (explicit vs. implicit)
- Instruction clarity (specific vs. vague)
- Edge case handling (addressed vs. ignored)
- Anti-patterns (wall of text, contradictions, implicit format)

## Dimension 2: Context Efficiency (1–5)

Evaluate:

- Context budget allocation (planned vs. ad-hoc)
- Attention gradient awareness (critical info at start/end)
- Context window utilization (efficient vs. wasteful)
- State management (explicit vs. implicit)
- Memory strategy (appropriate for conversation length)

## Dimension 3: Tool Health (1–5)

Evaluate:

- Tool count (3–7 ideal, 13+ problematic)
- Description quality (specific vs. vague)
- Error handling (graceful vs. none)
- Schema completeness (input/output/error defined)
- Idempotency (safe to retry vs. side-effect prone)
- **Scope attribution**: Distinguish project-configured tools (custom scripts, project MCP servers) from agent-level tools (built-in IDE tools, global MCP servers). Only flag tool overhead for tools the project can actually control.
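The tool-count and scope-attribution checks above can be illustrated with a short Python sketch. It is not part of the skill: the `Tool` record, its `scope` field, and `classify_tool_count` are hypothetical names introduced for illustration. The skill only names two bands (3–7 ideal, 13+ problematic), so every other count is reported as `"unclassified"` rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record; `scope` is "project" or "agent"."""
    name: str
    scope: str


def classify_tool_count(tools):
    """Count only project-configured tools and place the count in a band.

    Agent-level tools are excluded because the project cannot control them.
    Bands not defined by the skill (0-2 and 8-12) are left "unclassified".
    """
    count = sum(1 for tool in tools if tool.scope == "project")
    if 3 <= count <= 7:
        band = "ideal"
    elif count >= 13:
        band = "problematic"
    else:
        band = "unclassified"
    return count, band


if __name__ == "__main__":
    tools = [
        Tool("lint", "project"),
        Tool("deploy", "project"),
        Tool("search_docs", "project"),
        Tool("ide_edit", "agent"),  # ignored: not under project control
    ]
    print(classify_tool_count(tools))
```

Filtering by scope before counting keeps a long list of built-in IDE tools from lowering a project's Tool Health score.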

## Dimension 4: Architecture Fitness (1–5)

Evaluate:

- Topology appropriateness (single vs. multi-agent justified)
- Agent boundaries (clear vs. overlapping)
- Handoff protocols (structured vs. ad-hoc)
- Observability (decisions logged vs. black box)
- Cost awareness (budgeted vs. unbounded)

## Dimension 5: Safety & Reliability (1–5)

Evaluate:

- Input validation (present vs. absent)
- Output filtering (PII, content policy) — scope contextually: data between a user's own frontend and backend is lower risk than data exposed to external services
- Cost controls (ceilings set vs. unbounded)
- Error recovery (fallbacks vs. crash)
- Evaluation strategy (golden tests vs. "it seems to work")

## Diagnostic Report Format

```text
╔══════════════════════════════════════╗
║         WORKFLOW DIAGNOSTIC          ║
╠══════════════════════════════════════╣
║  Prompt Quality        ████░  4/5    ║
║  Context Efficiency    ███░░  3/5    ║
║  Tool Health           ██░░░  2/5    ║
║  Architecture          ████░  4/5    ║
║  Safety & Reliability  ██░░░  2/5    ║
╠══════════════════════════════════════╣
║  Overall Score: 15/25                ║
╚══════════════════════════════════════╝

CRITICAL FINDINGS:
1. [Most severe issue — immediate action needed]
2. [Second most severe]
3. [Third]

RECOMMENDED ACTIONS:
1. [Specific remediation for finding #1]
2. [Specific remediation for finding #2]
3. [Specific remediation for finding #3]
```
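The score bars and the overall total in the report above follow simple arithmetic: one filled block per point out of five, and an overall score equal to the sum of the five dimension scores. A minimal Python sketch of that arithmetic follows. The names `render_bar` and `overall` are hypothetical and not part of the skill.

```python
MAX_SCORE = 5  # each dimension is scored 1-5


def render_bar(score: int) -> str:
    """Return a five-character bar: filled blocks for the score, light blocks for the rest."""
    if not 1 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 1 and {MAX_SCORE}, got {score}")
    return "█" * score + "░" * (MAX_SCORE - score)


def overall(scores: dict) -> tuple:
    """Return (total, maximum) across all scored dimensions."""
    return sum(scores.values()), MAX_SCORE * len(scores)


if __name__ == "__main__":
    # Scores copied from the example report above.
    example = {
        "Prompt Quality": 4,
        "Context Efficiency": 3,
        "Tool Health": 2,
        "Architecture": 4,
        "Safety & Reliability": 2,
    }
    for name, score in example.items():
        print(f"{name:<22}{render_bar(score)}  {score}/{MAX_SCORE}")
    total, maximum = overall(example)
    print(f"Overall Score: {total}/{maximum}")
```

With the example scores, the total is 4 + 3 + 2 + 4 + 2 = 15 out of 25, which matches the report.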

## Scoring Guide

| Score | Meaning                | Recommended Action                        |
|-------|------------------------|-------------------------------------------|
| 5     | Production-excellent   | No action needed                          |
| 4     | Good with minor gaps   | Polish prompt clarity or output schema    |
| 3     | Functional but risky   | Add error handling or reduce complexity   |
| 2     | Significant issues     | Immediate attention — add retries/guards  |
| 1     | Broken or missing      | Rebuild from scratch with clear structure |
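One way to turn the scoring guide into a prioritized list is to look up each score's meaning and sort the dimensions from lowest to highest. The Python sketch below rests on that assumption: the skill does not say how findings are ranked, so ordering by ascending score is an illustrative choice, and `MEANINGS` and `prioritize` are hypothetical names.

```python
# Meanings copied from the scoring guide table.
MEANINGS = {
    5: "Production-excellent",
    4: "Good with minor gaps",
    3: "Functional but risky",
    2: "Significant issues",
    1: "Broken or missing",
}


def prioritize(scores: dict) -> list:
    """Return (dimension, score, meaning) tuples, lowest score first.

    Assumption: a lower score means a more urgent finding. Ties keep their
    input order because Python's sort is stable.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, score, MEANINGS[score]) for name, score in ranked]


if __name__ == "__main__":
    example = {
        "Prompt Quality": 4,
        "Context Efficiency": 3,
        "Tool Health": 2,
        "Architecture": 4,
        "Safety & Reliability": 2,
    }
    for name, score, meaning in prioritize(example):
        print(f"{score}/5  {name}: {meaning}")
```

A score outside 1–5 raises `KeyError` here, which is acceptable for a sketch but would need explicit validation in real use.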

## Usage

Invoke this skill when you want to:

- Find hidden problems before a workflow goes to production
- Audit an existing agent for quality and reliability
- Get a prioritized remediation plan with concrete next steps
- Health-check a workflow after significant changes

Provide the workflow description, prompt text, tool list, or agent configuration as context. The more detail you provide, the more precise the findings.