Add Arize and Phoenix LLM observability skills (#1204)
* Add 9 Arize LLM observability skills
Add skills for Arize AI platform covering trace export, instrumentation,
datasets, experiments, evaluators, AI provider integrations, annotations,
prompt optimization, and deep linking to the Arize UI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add 3 Phoenix AI observability skills
Add skills for Phoenix (Arize open-source) covering CLI debugging,
LLM evaluation workflows, and OpenInference tracing/instrumentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Ignoring intentional bad spelling
* Fix CI: remove .DS_Store from generated skills README and add codespell ignore
Remove .DS_Store artifact from winmd-api-search asset listing in generated
README.skills.md so it matches the CI Linux build output. Add queston to
codespell ignore list (intentional misspelling example in arize-dataset skill).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Add arize-ax and phoenix plugins
Bundle the 9 Arize skills into an arize-ax plugin and the 3 Phoenix
skills into a phoenix plugin for easier installation as single packages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix skill folder structures to match source repos
Move arize supporting files from references/ to root level and rename
phoenix references/ to rules/ to exactly match the original source
repository folder structures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fixing file locations
* Fixing readme
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.github/plugin/marketplace.json
12 additions & 0 deletions
@@ -10,6 +10,12 @@
     "email": "copilot@github.com"
   },
   "plugins": [
+    {
+      "name": "arize-ax",
+      "source": "arize-ax",
+      "description": "Arize AX platform skills for LLM observability, evaluation, and optimization. Includes trace export, instrumentation, datasets, experiments, evaluators, AI provider integrations, annotations, prompt optimization, and deep linking to the Arize UI.",
+      "version": "1.0.0"
+    },
     {
       "name": "automate-this",
       "source": "automate-this",
@@ -399,6 +405,12 @@
       "description": "Complete toolkit for developing custom code components using Power Apps Component Framework for model-driven and canvas apps",
       "version": "1.0.0"
     },
+    {
+      "name": "phoenix",
+      "source": "phoenix",
+      "description": "Phoenix AI observability skills for LLM application debugging, evaluation, and tracing. Includes CLI debugging tools, LLM evaluation workflows, and OpenInference tracing instrumentation.",
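Both entries added to `marketplace.json` carry the same four string fields: `name`, `source`, `description`, and `version`. The following is a minimal sketch of a consistency check for such entries. It assumes those four keys are required, which the diff suggests but does not state, and the helper names `check_plugin_entry` and `check_marketplace` are hypothetical, not part of the repository's tooling.

```python
import json

# Assumption: these four keys are required. They are simply the keys
# that appear in the plugin entries added by this diff.
REQUIRED_KEYS = ("name", "source", "description", "version")


def check_plugin_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one marketplace plugin entry."""
    problems = []
    for key in REQUIRED_KEYS:
        value = entry.get(key)
        # Each field is expected to be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")
    return problems


def check_marketplace(text: str) -> dict[str, list[str]]:
    """Map each plugin name (or its index) to the problems found in it."""
    data = json.loads(text)
    report = {}
    for index, entry in enumerate(data.get("plugins", [])):
        label = entry.get("name") or f"plugins[{index}]"
        problems = check_plugin_entry(entry)
        if problems:
            report[label] = problems
    return report


# Illustrative input only: shortened descriptions, second entry lacks "version".
sample = json.dumps({
    "plugins": [
        {"name": "arize-ax", "source": "arize-ax",
         "description": "Arize AX platform skills.", "version": "1.0.0"},
        {"name": "phoenix", "source": "phoenix",
         "description": "Phoenix AI observability skills."},
    ]
})
print(check_marketplace(sample))
```

A check like this could run in CI next to the README generation step, so that an entry with a missing field is reported before the generated docs drift from the manifest.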
docs/README.plugins.md
2 additions & 0 deletions
@@ -25,6 +25,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-plugins) for guidelines on how t
| Name | Description | Items | Tags |
| ---- | ----------- | ----- | ---- |
+|[arize-ax](../plugins/arize-ax/README.md)| Arize AX platform skills for LLM observability, evaluation, and optimization. Includes trace export, instrumentation, datasets, experiments, evaluators, AI provider integrations, annotations, prompt optimization, and deep linking to the Arize UI. | 9 items | arize, llm, observability, tracing, evaluation, instrumentation, datasets, experiments, prompt-optimization |
|[automate-this](../plugins/automate-this/README.md)| Record your screen doing a manual process, drop the video on your Desktop, and let Copilot CLI analyze it frame-by-frame to build working automation scripts. Supports narrated recordings with audio transcription. | 1 items | automation, screen-recording, workflow, video-analysis, process-automation, scripting, productivity, copilot-cli |
|[awesome-copilot](../plugins/awesome-copilot/README.md)| Meta prompts that help you discover and generate curated GitHub Copilot agents, instructions, prompts, and skills. | 4 items | github-copilot, discovery, meta, prompt-engineering, agents |
|[azure-cloud-development](../plugins/azure-cloud-development/README.md)| Comprehensive Azure cloud development tools including Infrastructure as Code, serverless functions, architecture patterns, and cost optimization for building scalable cloud applications. | 11 items | azure, cloud, infrastructure, bicep, terraform, serverless, architecture, devops |
@@ -60,6 +61,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-plugins) for guidelines on how t
|[ospo-sponsorship](../plugins/ospo-sponsorship/README.md)| Tools and resources for Open Source Program Offices (OSPOs) to identify, evaluate, and manage sponsorship of open source dependencies through GitHub Sponsors, Open Collective, and other funding platforms. | 1 items ||
|[partners](../plugins/partners/README.md)| Custom agents that have been created by GitHub partners | 20 items | devops, security, database, cloud, infrastructure, observability, feature-flags, cicd, migration, performance |
|[pcf-development](../plugins/pcf-development/README.md)| Complete toolkit for developing custom code components using Power Apps Component Framework for model-driven and canvas apps | 0 items | power-apps, pcf, component-framework, typescript, power-platform |
+|[phoenix](../plugins/phoenix/README.md)| Phoenix AI observability skills for LLM application debugging, evaluation, and tracing. Includes CLI debugging tools, LLM evaluation workflows, and OpenInference tracing instrumentation. | 3 items | phoenix, arize, llm, observability, tracing, evaluation, openinference, instrumentation |
|[php-mcp-development](../plugins/php-mcp-development/README.md)| Comprehensive resources for building Model Context Protocol servers using the official PHP SDK with attribute-based discovery, including best practices, project generation, and expert assistance | 2 items | php, mcp, model-context-protocol, server-development, sdk, attributes, composer |
|[polyglot-test-agent](../plugins/polyglot-test-agent/README.md)| Multi-agent pipeline for generating comprehensive unit tests across any programming language. Orchestrates research, planning, and implementation phases using specialized agents to produce tests that compile, pass, and follow project conventions. | 9 items | testing, unit-tests, polyglot, test-generation, multi-agent, tdd, csharp, typescript, python, go |
|[power-apps-code-apps](../plugins/power-apps-code-apps/README.md)| Complete toolkit for Power Apps Code Apps development including project scaffolding, development standards, and expert guidance for building code-first applications with Power Platform integration. | 2 items | power-apps, power-platform, typescript, react, code-apps, dataverse, connectors |
docs/README.skills.md
12 additions & 0 deletions
@@ -34,6 +34,15 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
|[apple-appstore-reviewer](../skills/apple-appstore-reviewer/SKILL.md)| Serves as a reviewer of the codebase with instructions on looking for Apple App Store optimizations or rejection reasons. | None |
|[arch-linux-triage](../skills/arch-linux-triage/SKILL.md)| Triage and resolve Arch Linux issues with pacman, systemd, and rolling-release best practices. | None |
|[architecture-blueprint-generator](../skills/architecture-blueprint-generator/SKILL.md)| Comprehensive project architecture blueprint generator that analyzes codebases to create detailed architectural documentation. Automatically detects technology stacks and architectural patterns, generates visual diagrams, documents implementation patterns, and provides extensible blueprints for maintaining architectural consistency and guiding new development. | None |
+|[arize-ai-provider-integration](../skills/arize-ai-provider-integration/SKILL.md)| INVOKE THIS SKILL when creating, reading, updating, or deleting Arize AI integrations. Covers listing integrations, creating integrations for any supported LLM provider (OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Vertex AI, Gemini, NVIDIA NIM, custom), updating credentials or metadata, and deleting integrations using the ax CLI. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-annotation](../skills/arize-annotation/SKILL.md)| INVOKE THIS SKILL when creating, managing, or using annotation configs on Arize (categorical, continuous, freeform), or applying human annotations to project spans via the Python SDK. Configs are the label schema for human feedback on spans and other surfaces in the Arize UI. Triggers: annotation config, label schema, human feedback schema, bulk annotate spans, update_annotations. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-dataset](../skills/arize-dataset/SKILL.md)| INVOKE THIS SKILL when creating, managing, or querying Arize datasets and examples. Covers dataset CRUD, appending examples, exporting data, and file-based dataset creation using the ax CLI. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-evaluator](../skills/arize-evaluator/SKILL.md)| INVOKE THIS SKILL for LLM-as-judge evaluation workflows on Arize: creating/updating evaluators, running evaluations on spans or experiments, tasks, trigger-run, column mapping, and continuous monitoring. Use when the user says: create an evaluator, LLM judge, hallucination/faithfulness/correctness/relevance, run eval, score my spans or experiment, ax tasks, trigger-run, trigger eval, column mapping, continuous monitoring, query filter for evals, evaluator version, or improve an evaluator prompt. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-experiment](../skills/arize-experiment/SKILL.md)| INVOKE THIS SKILL when creating, running, or analyzing Arize experiments. Covers experiment CRUD, exporting runs, comparing results, and evaluation workflows using the ax CLI. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-instrumentation](../skills/arize-instrumentation/SKILL.md)| INVOKE THIS SKILL when adding Arize AX tracing to an application. Follow the Agent-Assisted Tracing two-phase flow: analyze the codebase (read-only), then implement instrumentation after user confirmation. When the app uses LLM tool/function calling, add manual CHAIN + TOOL spans so traces show each tool's input and output. Leverages https://arize.com/docs/ax/alyx/tracing-assistant and https://arize.com/docs/PROMPT.md.|`references/ax-profiles.md`|
+|[arize-link](../skills/arize-link/SKILL.md)| Generate deep links to the Arize UI. Use when the user wants a clickable URL to open a specific trace, span, session, dataset, labeling queue, evaluator, or annotation config. |`references/EXAMPLES.md`|
+|[arize-prompt-optimization](../skills/arize-prompt-optimization/SKILL.md)| INVOKE THIS SKILL when optimizing, improving, or debugging LLM prompts using production trace data, evaluations, and annotations. Covers extracting prompts from spans, gathering performance signal, and running a data-driven optimization loop using the ax CLI. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
+|[arize-trace](../skills/arize-trace/SKILL.md)| INVOKE THIS SKILL when downloading or exporting Arize traces and spans. Covers exporting traces by ID, sessions by ID, and debugging LLM application issues using the ax CLI. |`references/ax-profiles.md`<br />`references/ax-setup.md`|
|[aspire](../skills/aspire/SKILL.md)| Aspire skill covering the Aspire CLI, AppHost orchestration, service discovery, integrations, MCP server, VS Code extension, Dev Containers, GitHub Codespaces, templates, dashboard, and deployment. Use when the user asks to create, run, debug, configure, deploy, or troubleshoot an Aspire distributed application. |`references/architecture.md`<br />`references/cli-reference.md`<br />`references/dashboard.md`<br />`references/deployment.md`<br />`references/integrations-catalog.md`<br />`references/mcp-server.md`<br />`references/polyglot-apis.md`<br />`references/testing.md`<br />`references/troubleshooting.md`|
|[aspnet-minimal-api-openapi](../skills/aspnet-minimal-api-openapi/SKILL.md)| Create ASP.NET Minimal API endpoints with proper OpenAPI documentation | None |
|[automate-this](../skills/automate-this/SKILL.md)| Analyze a screen recording of a manual process and produce targeted, working automation scripts. Extracts frames and audio narration from video files, reconstructs the step-by-step workflow, and proposes automation at multiple complexity levels using tools already installed on the user machine. | None |
@@ -201,6 +210,9 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
|[openapi-to-application-code](../skills/openapi-to-application-code/SKILL.md)| Generate a complete, production-ready application from an OpenAPI specification | None |
|[pdftk-server](../skills/pdftk-server/SKILL.md)| Skill for using the command-line tool pdftk (PDFtk Server) for working with PDF files. Use when asked to merge PDFs, split PDFs, rotate pages, encrypt or decrypt PDFs, fill PDF forms, apply watermarks, stamp overlays, extract metadata, burst documents into pages, repair corrupted PDFs, attach or extract files, or perform any PDF manipulation from the command line. |`references/download.md`<br />`references/pdftk-cli-examples.md`<br />`references/pdftk-man-page.md`<br />`references/pdftk-server-license.md`<br />`references/third-party-materials.md`|
|[penpot-uiux-design](../skills/penpot-uiux-design/SKILL.md)| Comprehensive guide for creating professional UI/UX designs in Penpot using MCP tools. Use this skill when: (1) Creating new UI/UX designs for web, mobile, or desktop applications, (2) Building design systems with components and tokens, (3) Designing dashboards, forms, navigation, or landing pages, (4) Applying accessibility standards and best practices, (5) Following platform guidelines (iOS, Android, Material Design), (6) Reviewing or improving existing Penpot designs for usability. Triggers: "design a UI", "create interface", "build layout", "design dashboard", "create form", "design landing page", "make it accessible", "design system", "component library". |`references/accessibility.md`<br />`references/component-patterns.md`<br />`references/platform-guidelines.md`<br />`references/setup-troubleshooting.md`|
+|[phoenix-cli](../skills/phoenix-cli/SKILL.md)| Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, inspect datasets, and query the GraphQL API. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues. | None |
+| [phoenix-tracing](../skills/phoenix-tracing/SKILL.md) | OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production. | `README.md`<br />`references/annotations-overview.md`<br />`references/annotations-python.md`<br />`references/annotations-typescript.md`<br />`references/fundamentals-flattening.md`<br />`references/fundamentals-overview.md`<br />`references/fundamentals-required-attributes.md`<br />`references/fundamentals-universal-attributes.md`<br />`references/instrumentation-auto-python.md`<br />`references/instrumentation-auto-typescript.md`<br />`references/instrumentation-manual-python.md`<br />`references/instrumentation-manual-typescript.md`<br />`references/metadata-python.md`<br />`references/metadata-typescript.md`<br />`references/production-python.md`<br />`references/production-typescript.md`<br />`references/projects-python.md`<br />`references/projects-typescript.md`<br />`references/sessions-python.md`<br />`references/sessions-typescript.md`<br />`references/setup-python.md`<br />`references/setup-typescript.md`<br />`references/span-agent.md`<br />`references/span-chain.md`<br />`references/span-embedding.md`<br />`references/span-evaluator.md`<br />`references/span-guardrail.md`<br />`references/span-llm.md`<br />`references/span-reranker.md`<br />`references/span-retriever.md`<br />`references/span-tool.md` |
|[php-mcp-server-generator](../skills/php-mcp-server-generator/SKILL.md)| Generate a complete PHP Model Context Protocol server project with tools, resources, prompts, and tests using the official PHP SDK | None |
|[planning-oracle-to-postgres-migration-integration-testing](../skills/planning-oracle-to-postgres-migration-integration-testing/SKILL.md)| Creates an integration testing plan for .NET data access artifacts during Oracle-to-PostgreSQL database migrations. Analyzes a single project to identify repositories, DAOs, and service layers that interact with the database, then produces a structured testing plan. Use when planning integration test coverage for a migrated project, identifying which data access methods need tests, or preparing for Oracle-to-PostgreSQL migration validation. | None |
|[plantuml-ascii](../skills/plantuml-ascii/SKILL.md)| Generate ASCII art diagrams using PlantUML text mode. Use when user asks to create ASCII diagrams, text-based diagrams, terminal-friendly diagrams, or mentions plantuml ascii, text diagram, ascii art diagram. Supports: Converting PlantUML diagrams to ASCII art, Creating sequence diagrams, class diagrams, flowcharts in ASCII format, Generating Unicode-enhanced ASCII art with -utxt flag | None |
"description": "Arize AX platform skills for LLM observability, evaluation, and optimization. Includes trace export, instrumentation, datasets, experiments, evaluators, AI provider integrations, annotations, prompt optimization, and deep linking to the Arize UI.",
Arize AX platform skills for LLM observability, evaluation, and optimization. Includes trace export, instrumentation, datasets, experiments, evaluators, AI provider integrations, annotations, prompt optimization, and deep linking to the Arize UI.

## Installation

```bash
# Using Copilot CLI
copilot plugin install arize-ax@awesome-copilot
```

## What's Included

### Skills

| Skill | Description |
|-------|-------------|
|`arize-trace`| Export and analyze Arize traces and spans for debugging LLM applications using the ax CLI. |
|`arize-instrumentation`| Add Arize AX tracing to applications using a two-phase agent-assisted workflow. |
|`arize-dataset`| Create, manage, and query versioned evaluation datasets using the ax CLI. |
|`arize-experiment`| Run experiments against datasets and compare results using the ax CLI. |
|`arize-evaluator`| Create and run LLM-as-judge evaluators for automated scoring of spans and experiments. |
|`arize-ai-provider-integration`| Store and manage LLM provider credentials for use with evaluators. |
|`arize-annotation`| Create annotation configs and bulk-apply human feedback labels to spans. |
|`arize-prompt-optimization`| Optimize LLM prompts using production trace data, evaluations, and annotations. |
|`arize-link`| Generate deep links to the Arize UI for traces, spans, sessions, datasets, and more. |