Commit 32f89a4

Honor AI_AGENT and pass raw values through. (#1683)
## Why

The Go SDK detects AI coding agents and surfaces them as `agent/<name>` in the User-Agent. Today the generic fallback (when no proprietary env var fires) only honors the agents.md `AGENT=<name>` standard. Vercel's `@vercel/detect-agent` library uses a parallel `AI_AGENT=<name>` convention that tools in the Vercel ecosystem set instead; we currently miss those.

Separately, the existing fallback coerces any unrecognized value to the literal string `"unknown"`. That buries useful signal: a tool setting `AI_AGENT=claude-code_2-1-141_agent` ends up as `agent/unknown`, discarding the very signal (tool name plus version variant) we want to see. Bucketing arbitrary names is an ETL concern, not the SDK's.

## Changes

Two behavior changes in `useragent/agent.go`:

1. **`AI_AGENT` fallback.** Add `AI_AGENT=<name>` as a secondary fallback after `AGENT=<name>`. `AGENT` wins when both are set to non-empty values; empty is treated as unset for both. Explicit product matchers (e.g. `CLAUDECODE=1`) still always win over both.
2. **Raw passthrough instead of `"unknown"`.** Drop the known-product lookup in the fallback. The value is piped through the existing `Sanitize()` helper (so disallowed chars become `-` and the User-Agent allowlist `[0-9A-Za-z_.+-]+` is satisfied) and capped at 64 chars to keep the header bounded. Known products like `cursor` or `claude-code` pass through unchanged because they already satisfy the allowlist.

Same change should land in `databricks-sdk-py` and `databricks-sdk-java`; sibling PRs to follow.

## Test plan

- [x] `go test ./useragent/...` passes
- [x] `gofmt -l useragent/` clean
- [x] `AI_AGENT=<known product>` returns the product name
- [x] `AI_AGENT=<unrecognized>` returns the raw sanitized value (no longer `"unknown"`)
- [x] `AGENT` wins over `AI_AGENT` when both are non-empty
- [x] Empty `AGENT` falls through to `AI_AGENT`
- [x] Disallowed chars in `AGENT` / `AI_AGENT` are sanitized to `-`
- [x] Values longer than 64 chars are truncated
- [x] Explicit matcher (e.g. `CLAUDECODE=1`) still wins over both fallbacks
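The fallback rules in the description (AGENT before AI_AGENT, empty treated as unset, sanitize, then cap at 64) can be sketched as a standalone program. This is an illustration under stated assumptions, not the SDK code: `sanitize` is a hypothetical stand-in for the SDK's `Sanitize()` helper, written only from the allowlist `[0-9A-Za-z_.+-]` quoted above, and `agentFallback`, `fromMap`, and `maxLen` are names invented for this sketch. The environment lookup is injected so the example does not read the real process environment.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLen mirrors the 64-character cap described in the commit message.
const maxLen = 64

// sanitize is a hypothetical stand-in for the SDK's Sanitize helper.
// Assumption: every rune outside [0-9A-Za-z_.+-] becomes a single '-'.
func sanitize(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= '0' && r <= '9', r >= 'A' && r <= 'Z', r >= 'a' && r <= 'z':
			return r
		case r == '_' || r == '.' || r == '+' || r == '-':
			return r
		}
		return '-'
	}, s)
}

// agentFallback applies the precedence from the description: AGENT first,
// then AI_AGENT, with an empty value treated the same as unset.
func agentFallback(lookup func(string) string) string {
	v := lookup("AGENT")
	if v == "" {
		v = lookup("AI_AGENT")
	}
	if v == "" {
		return ""
	}
	v = sanitize(v)
	if len(v) > maxLen {
		// Byte slicing is safe here because sanitize leaves only
		// single-byte ASCII characters in v.
		v = v[:maxLen]
	}
	return v
}

// fromMap adapts a map to the lookup signature used by agentFallback.
func fromMap(m map[string]string) func(string) string {
	return func(k string) string { return m[k] }
}

func main() {
	// A versioned variant already satisfies the allowlist.
	fmt.Println(agentFallback(fromMap(map[string]string{
		"AI_AGENT": "claude-code_2-1-141_agent",
	})))
	// AGENT wins over AI_AGENT, and disallowed characters become '-'.
	fmt.Println(agentFallback(fromMap(map[string]string{
		"AGENT":    "claude code/agent",
		"AI_AGENT": "cursor",
	})))
}
```

Sanitizing before truncating matters in this sketch: once only ASCII remains, cutting at a byte offset cannot split a multi-byte character.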
1 parent fb01984 commit 32f89a4

3 files changed

Lines changed: 113 additions & 37 deletions

NEXT_CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -6,6 +6,13 @@

 ### New Features and Improvements

+* Honor the Vercel `AI_AGENT=<name>` env var as a secondary fallback for
+  AI agent detection in the User-Agent header (after the agents.md
+  `AGENT=<name>` standard). Unrecognized fallback values now pass through
+  the User-Agent sanitized and length-capped at 64 chars instead of being
+  coerced to `agent/unknown`, so versioned variants such as
+  `claude-code_2-1-141_agent` surface as-is.
+
 ### Bug Fixes

 ### Documentation

useragent/agent.go

Lines changed: 37 additions & 26 deletions
@@ -13,12 +13,24 @@ type knownAgent struct {
 	product string
 }

-// agentEnvVar is the agents.md standard env var. When set to a value we
-// don't specifically recognize, detection falls back to "unknown".
-const agentEnvVar = "AGENT"
+const (
+	// agentEnvVar is the agents.md standard env var.
+	agentEnvVar = "AGENT"
+
+	// aiAgentEnvVar is the Vercel @vercel/detect-agent convention. It
+	// serves the same purpose as agentEnvVar; agentEnvFallback consults it
+	// only when agentEnvVar is unset or empty.
+	aiAgentEnvVar = "AI_AGENT"
+
+	// maxAgentFallbackLen caps fallback values to keep the User-Agent
+	// bounded. Explicit-matcher products are short by construction; only
+	// the fallback path can carry arbitrary lengths.
+	maxAgentFallbackLen = 64
+)

 // listKnownAgents returns the canonical list of AI coding agents.
-// Keep this list in sync with databricks-sdk-py and databricks-sdk-java.
+// Keep this list, and the AGENT / AI_AGENT fallback handling in
+// agentEnvFallback, in sync with databricks-sdk-py and databricks-sdk-java.
 // Agents are listed alphabetically by product name.
 func listKnownAgents() []knownAgent {
 	return []knownAgent{
@@ -43,19 +55,16 @@ func listKnownAgents() []knownAgent {
 // lookupAgentProvider checks environment variables for known AI agents.
 //
 // Explicit product-specific env vars always take precedence over the generic
-// agents.md AGENT env var. AGENT is consulted only as a fallback when no
-// explicit matcher fires, so that an explicit signal (e.g. CLAUDECODE=1)
-// always wins over a conflicting AGENT=<name> value.
+// AGENT and AI_AGENT env vars, so that an explicit signal (e.g.
+// CLAUDECODE=1) always wins over a conflicting AGENT=<name> value.
 //
 // The function counts how many distinct agents matched via explicit env vars:
 //   - Exactly one agent matched: return its product name.
 //   - More than one agent matched: return "multiple". Agent env vars can be
 //     stacked when one agent invokes another as a subagent (e.g. Claude Code
 //     spawning a Cursor CLI subprocess), so the child process inherits env
 //     vars from multiple layers.
-//   - Zero agents matched: if the agents.md standard AGENT env var is set to
-//     a non-empty value, return that value if it matches a known product name,
-//     or "unknown" otherwise. If AGENT is not set, return "".
+//   - Zero agents matched: see agentEnvFallback.
 func lookupAgentProvider() string {
 	agents := listKnownAgents()

@@ -75,7 +84,7 @@ func lookupAgentProvider() string {
 	case 1:
 		return matches[0]
 	case 0:
-		return agentEnvFallback(agents)
+		return agentEnvFallback()
 	default:
 		return "multiple"
 	}
@@ -104,20 +113,23 @@ func collapseCopilotBYOK(matches []string) []string {
 	return filtered
 }

-// agentEnvFallback honors the agents.md AGENT=<name> standard.
-// Returns the value if it matches a known product name, "unknown" if AGENT
-// is set to any other non-empty value, and "" if AGENT is unset or empty.
-func agentEnvFallback(agents []knownAgent) string {
-	v, ok := os.LookupEnv(agentEnvVar)
-	if !ok || v == "" {
+// agentEnvFallback returns a sanitized, length-capped name from AGENT
+// or AI_AGENT, preferring AGENT when both are non-empty. The value is
+// passed through rather than categorized so that new names are propagated
+// without the need to update the list of known agents.
+func agentEnvFallback() string {
+	v := os.Getenv(agentEnvVar)
+	if v == "" {
+		v = os.Getenv(aiAgentEnvVar)
+	}
+	if v == "" {
 		return ""
 	}
-	for _, a := range agents {
-		if a.product == v {
-			return v
-		}
+	v = Sanitize(v)
+	if len(v) > maxAgentFallbackLen {
+		v = v[:maxAgentFallbackLen]
 	}
-	return "unknown"
+	return v
 }

 var (
@@ -128,13 +140,12 @@ var (
 // AgentProvider returns the detected AI agent name, cached for the process lifetime.
 // Returns one of:
 //   - the known product name when exactly one agent is detected via explicit
-//     env matchers, or when AGENT is set to a known product name and no
-//     explicit matcher fired;
+//     env matchers;
 //   - "multiple" when multiple explicit matchers fire for different agents
 //     (typically nested agents, e.g. Cursor CLI running as a Claude Code
 //     subagent);
-//   - "unknown" when no explicit matcher fired and AGENT is set to a value
-//     that is not a known product name;
+//   - a sanitized, length-capped value from AGENT or AI_AGENT when no
+//     explicit matcher fired (see agentEnvFallback);
 //   - "" when no agent is detected.
 func AgentProvider() string {
 	agentOnce.Do(func() {
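
The resolution order documented on `lookupAgentProvider` (one explicit match returns its product, several return "multiple", none defers to the env fallback) can be sketched as below. This is a simplified illustration, not the SDK implementation: the `matcher` type, the `resolve` function, and the `EXAMPLE_AGENT` entry are hypothetical, and the sketch assumes any non-empty value counts as a match. Only the `CLAUDECODE` to `claude-code` pairing is taken from the tests in this commit. The fallback passed in `main` is deliberately reduced to reading `AI_AGENT`, without sanitization or the `AGENT` precedence.

```go
package main

import "fmt"

// matcher pairs an explicit env var with the product name it signals.
type matcher struct {
	envVar  string
	product string
}

// matchers is illustrative only. CLAUDECODE -> claude-code appears in the
// commit's tests; EXAMPLE_AGENT is a made-up entry for this sketch.
var matchers = []matcher{
	{envVar: "CLAUDECODE", product: "claude-code"},
	{envVar: "EXAMPLE_AGENT", product: "example-agent"},
}

// resolve counts distinct products whose explicit env var is set and
// consults the fallback only when none matched.
func resolve(lookup func(string) string, fallback func() string) string {
	seen := map[string]bool{}
	var matches []string
	for _, m := range matchers {
		// Assumption: any non-empty value counts as "set".
		if lookup(m.envVar) != "" && !seen[m.product] {
			seen[m.product] = true
			matches = append(matches, m.product)
		}
	}
	switch len(matches) {
	case 1:
		return matches[0]
	case 0:
		return fallback()
	default:
		return "multiple"
	}
}

func main() {
	env := map[string]string{"CLAUDECODE": "1", "AI_AGENT": "cursor"}
	lookup := func(k string) string { return env[k] }
	// Simplified fallback: reads AI_AGENT only, no sanitization.
	fallback := func() string { return env["AI_AGENT"] }

	fmt.Println(resolve(lookup, fallback)) // explicit matcher wins

	delete(env, "CLAUDECODE")
	fmt.Println(resolve(lookup, fallback)) // no explicit match: fallback

	env["CLAUDECODE"], env["EXAMPLE_AGENT"] = "1", "1"
	fmt.Println(resolve(lookup, fallback)) // two products: "multiple"
}
```

Passing the fallback as a function keeps the explicit-matcher logic independent of how the generic env vars are read, which matches the diff's move of that logic into `agentEnvFallback()`.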

useragent/agent_test.go

Lines changed: 69 additions & 11 deletions
@@ -1,6 +1,7 @@
 package useragent

 import (
+	"strings"
 	"testing"

 	"github.com/databricks/databricks-sdk-go/internal/env"
@@ -130,24 +131,34 @@ func TestLookupAgentProvider(t *testing.T) {
 		},
 		// AGENT fallback behavior.
 		{
-			name:   "AGENT with unknown value falls back to unknown",
+			name:   "AGENT=cursor falls back to cursor",
+			envs:   map[string]string{"AGENT": "cursor"},
+			expect: "cursor",
+		},
+		{
+			name:   "AGENT with unrecognized value passes through (sanitized)",
 			envs:   map[string]string{"AGENT": "someweirdthing"},
-			expect: "unknown",
+			expect: "someweirdthing",
 		},
 		{
-			name:   "AGENT empty string does not trigger fallback",
-			envs:   map[string]string{"AGENT": ""},
-			expect: "",
+			name:   "AGENT versioned variant passes through unchanged",
+			envs:   map[string]string{"AGENT": "claude-code_2-1-141_agent"},
+			expect: "claude-code_2-1-141_agent",
 		},
 		{
-			name:   "AGENT=cursor falls back to cursor via known product name",
-			envs:   map[string]string{"AGENT": "cursor"},
-			expect: "cursor",
+			name:   "AGENT with disallowed chars is sanitized to hyphens",
+			envs:   map[string]string{"AGENT": "claude code/agent"},
+			expect: "claude-code-agent",
 		},
 		{
-			name:   "AGENT=claude-code falls back to claude-code via known product name",
-			envs:   map[string]string{"AGENT": "claude-code"},
-			expect: "claude-code",
+			name:   "AGENT longer than the cap is truncated",
+			envs:   map[string]string{"AGENT": strings.Repeat("a", 100)},
+			expect: strings.Repeat("a", 64),
+		},
+		{
+			name:   "AGENT empty string does not trigger fallback",
+			envs:   map[string]string{"AGENT": ""},
+			expect: "",
 		},
 		{
 			name:   "known matcher wins over AGENT fallback",
@@ -178,6 +189,53 @@ func TestLookupAgentProvider(t *testing.T) {
 			envs:   map[string]string{"COPILOT_CLI": "1", "COPILOT_MODEL": "gpt-4", "CLAUDECODE": "1"},
 			expect: "multiple",
 		},
+		// AI_AGENT fallback (Vercel @vercel/detect-agent convention).
+		{
+			name:   "AI_AGENT=cursor falls back to cursor",
+			envs:   map[string]string{"AI_AGENT": "cursor"},
+			expect: "cursor",
+		},
+		{
+			name:   "AI_AGENT empty string does not trigger fallback",
+			envs:   map[string]string{"AI_AGENT": ""},
+			expect: "",
+		},
+		{
+			name:   "known matcher wins over AI_AGENT fallback",
+			envs:   map[string]string{"AI_AGENT": "somethingunknown", "CLAUDECODE": "1"},
+			expect: "claude-code",
+		},
+		// AGENT vs AI_AGENT precedence: AGENT wins when both are non-empty.
+		{
+			name:   "AGENT wins over AI_AGENT when both are set to known products",
+			envs:   map[string]string{"AGENT": "claude-code", "AI_AGENT": "cursor"},
+			expect: "claude-code",
+		},
+		{
+			name:   "AGENT set to unrecognized non-empty value still wins over AI_AGENT",
+			envs:   map[string]string{"AGENT": "somethingunknown", "AI_AGENT": "cursor"},
+			expect: "somethingunknown",
+		},
+		{
+			name:   "AGENT set, AI_AGENT empty: AGENT value is used",
+			envs:   map[string]string{"AGENT": "cursor", "AI_AGENT": ""},
+			expect: "cursor",
+		},
+		{
+			name:   "empty AGENT falls through to AI_AGENT",
+			envs:   map[string]string{"AGENT": "", "AI_AGENT": "cursor"},
+			expect: "cursor",
+		},
+		{
+			name:   "both AGENT and AI_AGENT empty returns no agent",
+			envs:   map[string]string{"AGENT": "", "AI_AGENT": ""},
+			expect: "",
+		},
+		{
+			name:   "explicit CLAUDECODE wins over AI_AGENT=cursor",
+			envs:   map[string]string{"AI_AGENT": "cursor", "CLAUDECODE": "1"},
+			expect: "claude-code",
+		},
 	}

 	for _, tt := range tests {
