| 1 | +--- |
| 2 | +name: phoenix-cli |
| 3 | +description: Debug LLM applications using the Phoenix CLI. Fetch traces, analyze errors, review experiments, inspect datasets, and query the GraphQL API. Use when debugging AI/LLM applications, analyzing trace data, working with Phoenix observability, or investigating LLM performance issues. |
| 4 | +license: Apache-2.0 |
| 5 | +metadata: |
| 6 | + author: arize-ai |
| 7 | + version: "2.0.0" |
| 8 | +--- |
| 9 | + |
| 10 | +# Phoenix CLI |
| 11 | + |
| 12 | +## Invocation |
| 13 | + |
| 14 | +```bash |
| 15 | +px <resource> <action> # if installed globally |
| 16 | +npx @arizeai/phoenix-cli <resource> <action> # no install required |
| 17 | +``` |
| 18 | + |
| 19 | +The CLI uses singular resource commands with subcommands like `list` and `get`: |
| 20 | + |
| 21 | +```bash |
| 22 | +px trace list |
| 23 | +px trace get <trace-id> |
| 24 | +px span list |
| 25 | +px dataset list |
| 26 | +px dataset get <name> |
| 27 | +``` |
| 28 | + |
| 29 | +## Setup |
| 30 | + |
| 31 | +```bash |
| 32 | +export PHOENIX_HOST=http://localhost:6006 |
| 33 | +export PHOENIX_PROJECT=my-project |
| 34 | +export PHOENIX_API_KEY=your-api-key # if auth is enabled |
| 35 | +``` |
| 36 | + |
| 37 | +Always use `--format raw --no-progress` when piping to `jq`. |
| 38 | + |
| 39 | +## Traces |
| 40 | + |
| 41 | +```bash |
| 42 | +px trace list --limit 20 --format raw --no-progress | jq . |
| 43 | +px trace list --last-n-minutes 60 --limit 20 --format raw --no-progress | jq '.[] | select(.status == "ERROR")' |
| 44 | +px trace list --format raw --no-progress | jq 'sort_by(-.duration) | .[0:5]' |
| 45 | +px trace get <trace-id> --format raw | jq . |
| 46 | +px trace get <trace-id> --format raw | jq '.spans[] | select(.status_code != "OK")' |
| 47 | +``` |
| 48 | + |
| 49 | +## Spans |
| 50 | + |
| 51 | +```bash |
| 52 | +px span list --limit 20 # recent spans (table view) |
| 53 | +px span list --last-n-minutes 60 --limit 50 # spans from last hour |
| 54 | +px span list --span-kind LLM --limit 10 # only LLM spans |
| 55 | +px span list --status-code ERROR --limit 20 # only errored spans |
| 56 | +px span list --name chat_completion --limit 10 # filter by span name |
| 57 | +px span list --trace-id <id> --format raw --no-progress | jq . # all spans for a trace |
| 58 | +px span list --include-annotations --limit 10 # include annotation scores |
| 59 | +px span list output.json --limit 100 # save to JSON file |
| 60 | +px span list --format raw --no-progress | jq '.[] | select(.status_code == "ERROR")' |
| 61 | +``` |
| 62 | + |
| 63 | +### Span JSON shape |
| 64 | + |
| 65 | +``` |
| 66 | +Span |
| 67 | + name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT"|"RERANKER"|"GUARDRAIL"|"EVALUATOR"|"UNKNOWN") |
| 68 | + status_code ("OK"|"ERROR"|"UNSET"), status_message |
| 69 | + context.span_id, context.trace_id, parent_id |
| 70 | + start_time, end_time |
| 71 | + attributes (same keys as spans[].attributes under "Trace JSON shape" below) |
| 72 | + annotations[] (with --include-annotations) |
| 73 | + name, result { score, label, explanation } |
| 74 | +``` |
| 75 | + |
| 76 | +### Trace JSON shape |
| 77 | + |
| 78 | +``` |
| 79 | +Trace |
| 80 | + traceId, status ("OK"|"ERROR"), duration (ms), startTime, endTime |
| 81 | + rootSpan — top-level span (parent_id: null) |
| 82 | + spans[] |
| 83 | + name, span_kind ("LLM"|"CHAIN"|"TOOL"|"RETRIEVER"|"EMBEDDING"|"AGENT") |
| 84 | + status_code ("OK"|"ERROR"), parent_id, context.span_id |
| 85 | + attributes |
| 86 | + input.value, output.value — raw input/output |
| 87 | + llm.model_name, llm.provider |
| 88 | + llm.token_count.prompt/completion/total |
| 89 | + llm.token_count.prompt_details.cache_read |
| 90 | + llm.token_count.completion_details.reasoning |
| 91 | + llm.input_messages.{N}.message.role/content |
| 92 | + llm.output_messages.{N}.message.role/content |
| 93 | + llm.invocation_parameters — JSON string (temperature, etc.) |
| 94 | + exception.message — set if span errored |
| 95 | +``` |
| 96 | + |
| 97 | +## Sessions |
| 98 | + |
| 99 | +```bash |
| 100 | +px session list --limit 10 --format raw --no-progress | jq . |
| 101 | +px session list --order asc --format raw --no-progress | jq '.[].session_id' |
| 102 | +px session get <session-id> --format raw | jq . |
| 103 | +px session get <session-id> --include-annotations --format raw | jq '.annotations' |
| 104 | +``` |
| 105 | + |
| 106 | +### Session JSON shape |
| 107 | + |
| 108 | +``` |
| 109 | +SessionData |
| 110 | + id, session_id, project_id |
| 111 | + start_time, end_time |
| 112 | + traces[] |
| 113 | + id, trace_id, start_time, end_time |
| 114 | + |
| 115 | +SessionAnnotation (with --include-annotations) |
| 116 | + id, name, annotator_kind ("LLM"|"CODE"|"HUMAN"), session_id |
| 117 | + result { label, score, explanation } |
| 118 | + metadata, identifier, source, created_at, updated_at |
| 119 | +``` |
| 120 | + |
| 121 | +## Datasets / Experiments / Prompts |
| 122 | + |
| 123 | +```bash |
| 124 | +px dataset list --format raw --no-progress | jq '.[].name' |
| 125 | +px dataset get <name> --format raw | jq '.examples[] | {input, output: .expected_output}' |
| 126 | +px experiment list --dataset <name> --format raw --no-progress | jq '.[] | {id, name, failed_run_count}' |
| 127 | +px experiment get <id> --format raw --no-progress | jq '.[] | select(.error != null) | {input, error}' |
| 128 | +px prompt list --format raw --no-progress | jq '.[].name' |
| 129 | +px prompt get <name> --format text --no-progress # plain text, ideal for piping to AI |
| 130 | +``` |
| 131 | + |
| 132 | +## GraphQL |
| 133 | + |
| 134 | +Use `px api graphql` for ad-hoc queries not covered by the commands above. Responses have the shape `{"data": {...}}`, so `jq` paths start at `.data`. |
| 135 | + |
| 136 | +```bash |
| 137 | +px api graphql '{ projectCount datasetCount promptCount evaluatorCount }' |
| 138 | +px api graphql '{ projects { edges { node { name traceCount tokenCountTotal } } } }' | jq '.data.projects.edges[].node' |
| 139 | +px api graphql '{ datasets { edges { node { name exampleCount experimentCount } } } }' | jq '.data.datasets.edges[].node' |
| 140 | +px api graphql '{ evaluators { edges { node { name kind } } } }' | jq '.data.evaluators.edges[].node' |
| 141 | + |
| 142 | +# Introspect any type |
| 143 | +px api graphql '{ __type(name: "Project") { fields { name type { name } } } }' | jq '.data.__type.fields[]' |
| 144 | +``` |
| 145 | + |
| 146 | +Key root fields: `projects`, `datasets`, `prompts`, `evaluators`, `projectCount`, `datasetCount`, `promptCount`, `evaluatorCount`, `viewer`. |
| 147 | + |
| 148 | +## Docs |
| 149 | + |
| 150 | +Download Phoenix documentation markdown for local use by coding agents. |
| 151 | + |
| 152 | +```bash |
| 153 | +px docs fetch # fetch default workflow docs to .px/docs |
| 154 | +px docs fetch --workflow tracing # fetch only tracing docs |
| 155 | +px docs fetch --workflow tracing --workflow evaluation |
| 156 | +px docs fetch --dry-run # preview what would be downloaded |
| 157 | +px docs fetch --refresh # clear .px/docs and re-download |
| 158 | +px docs fetch --output-dir ./my-docs # custom output directory |
| 159 | +``` |
| 160 | + |
| 161 | +Key options: `--workflow` (repeatable, values: `tracing`, `evaluation`, `datasets`, `prompts`, `integrations`, `sdk`, `self-hosting`, `all`), `--dry-run`, `--refresh`, `--output-dir` (default `.px/docs`), `--workers` (default 10). |
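
The skill's rule of always passing `--format raw --no-progress` when piping to `jq` is easy to forget on individual commands. One way to make it automatic is a small shell wrapper. This is only a sketch: `pxjson` is a hypothetical helper name, not part of the Phoenix CLI, and it assumes every subcommand called through it accepts both flags.

```bash
# pxjson: hypothetical convenience wrapper (not shipped with the Phoenix CLI).
# Runs `px` with the caller's arguments unchanged, then appends the two flags
# the skill recommends for machine-readable output.
# Assumption: the subcommand being called accepts --format and --no-progress.
pxjson() {
  px "$@" --format raw --no-progress
}
```

With this defined, `pxjson trace list --limit 20 | jq .` runs the same command as the first example in the Traces section. Quoting `"$@"` keeps arguments containing spaces, such as a dataset name, intact.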