feat: add Snowflake Cortex as an AI provider (#349)
* feat: add Snowflake Cortex as an AI provider
Adds `snowflake-cortex` as a built-in provider using Programmatic Access
Token (PAT) auth. Users authenticate by entering `<account>::<token>` once;
billing flows through Snowflake credits. Includes Claude, Llama, Mistral,
and DeepSeek models with Cortex-specific request transforms (max_completion_tokens,
tool stripping for unsupported models, synthetic SSE stop to break the AI
SDK's continuation-check loop when Snowflake rejects trailing assistant messages).
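A minimal sketch of the `<account>::<token>` credential parsing described above (the helper name `parseSnowflakePAT` and its return shape are assumptions for illustration, not the exact upstream code):

```typescript
// Hypothetical sketch: split "<account>::<token>" into its two parts.
// Returns null on malformed input (missing separator or empty halves).
function parseSnowflakePAT(code: string): { accountId: string; token: string } | null {
  const sep = code.indexOf("::")
  if (sep <= 0) return null
  const accountId = code.slice(0, sep).trim()
  const token = code.slice(sep + 2).trim()
  if (!accountId || !token) return null
  return { accountId, token }
}
```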
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: harden Snowflake Cortex provider with `altimate_change` markers and edge case fixes
- Add `altimate_change` markers to all upstream-shared files (`provider.ts`,
`schema.ts`, `plugin/index.ts`) to prevent overwrites on upstream merges
- Validate account ID against `^[a-zA-Z0-9._-]+$` to prevent URL injection
- Remove `(auth as any).accountId` casts — use proper type narrowing
- Fix `env` array: `SNOWFLAKE_PAT` → `SNOWFLAKE_ACCOUNT` (matches actual usage)
- Fix `claude-3-5-sonnet` output limit: `8096` → `8192`
- Strip orphaned `tool_calls` and `tool` role messages for no-toolcall models
- Use explicit `Array.isArray(tool_calls)` check for synthetic stop condition
- Remove zero-usage block from synthetic SSE to avoid broken token accounting
- Handle `ArrayBuffer` body type in fetch wrapper
- Reduce PAT expiry from 365 → 90 days (matches Snowflake default TTL)
- Add 14 new test cases covering URL injection, orphaned tool_calls, empty
tool_calls array, SSE format validation, missing messages, and env/output limits
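The account-ID validation above can be sketched as follows (the regex is the one quoted in the commit; the helper name is an assumption):

```typescript
// Only allow alphanumerics plus ".", "_", "-" in the account identifier,
// so characters like "/" or "@" cannot rewrite the request URL.
const VALID_ACCOUNT_RE = /^[a-zA-Z0-9._-]+$/

function isValidAccountId(accountId: string): boolean {
  return VALID_ACCOUNT_RE.test(accountId)
}
```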
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: require oauth auth for snowflake-cortex, don't expose account as credential
Addresses CodeRabbit review comments:
- Require `auth.type === "oauth"` before autoloading — env-only `SNOWFLAKE_ACCOUNT`
no longer makes the provider look configured without a PAT
- Set `env: []` so `state()` env-key scan doesn't treat account name as an API key
- Validate account from env fallback against `^[a-zA-Z0-9._-]+$`
- Add test: env-only without oauth does NOT load the provider
- All provider tests now set up/teardown oauth auth properly via save/restore
- Update env array assertion: `toContain("SNOWFLAKE_ACCOUNT")` → `toEqual([])`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address consensus code review findings
Fixes from 6-model consensus review (Claude, GPT 5.2, Gemini 3.1,
Kimi K2.5, MiniMax M2.5, GLM-5):
- Gate synthetic SSE stop on `stream !== false` to avoid returning SSE
format for non-streaming requests (Major, flagged by GPT 5.2 Codex)
- Delete `content-length` header after body mutation to prevent
length mismatch (Minor, flagged by GPT 5.2/Kimi/Gemini consensus)
- Export `VALID_ACCOUNT_RE` from `snowflake.ts` and import in
`provider.ts` to eliminate duplicated regex (Minor, flagged by GLM-5)
- Add `claude-3-5-sonnet` to toolcall capability test (Kimi K2.5)
- Add 3 new tests: `stream: false` skips synthetic stop, `stream: true`
triggers it, absent `stream` field defaults to streaming behavior
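The streaming gate described above can be sketched as (field name follows the OpenAI-style request body; the surrounding transform is omitted):

```typescript
// Simplified sketch: emit the synthetic SSE stop only when the client asked
// for a streaming response. An absent `stream` field defaults to streaming,
// matching the behavior the new tests cover.
function shouldEmitSyntheticSSEStop(body: { stream?: boolean }): boolean {
  return body.stream !== false
}
```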
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add Cortex E2E tests and sanitize hardcoded credentials
- Add `cortex-snowflake-e2e.test.ts` with 16 E2E tests for the Snowflake
Cortex AI provider: PAT auth, streaming/non-streaming completions, model
availability, request transforms, assistant-last rejection, PAT parsing
- Tests skip via `describe.skipIf` when `SNOWFLAKE_CORTEX_PAT` is not set
- Remove hardcoded credentials from `drivers-snowflake-e2e.test.ts` docstring
— replaced with placeholder values
Run with:

    export SNOWFLAKE_CORTEX_ACCOUNT="<account>"
    export SNOWFLAKE_CORTEX_PAT="<pat>"
    bun test test/altimate/cortex-snowflake-e2e.test.ts --timeout 120000
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update Cortex models from E2E testing against real Snowflake API
Verified against Snowflake account ejjkbko-fub20041 with cross-region
inference enabled. Key findings and fixes:
Model list changes (from real API probing):
- Replace `llama3.3-70b` (unavailable) with `snowflake-llama-3.3-70b`
- Add `llama3.1-70b`, `llama3.1-405b`, `llama3.1-8b`, `mistral-7b`
- All 10 models verified against live Cortex endpoint
Tool calling fix:
- Switch from blocklist (`NO_TOOLCALL_MODELS`) to allowlist
(`TOOLCALL_MODELS`) — only Claude models support tool calls on Cortex,
all others reject with "tool calling is not supported"
E2E test improvements (24 tests, all pass against live API):
- Test all 10 registered models for availability and response shape
- Tool call support test: Claude accepts, non-Claude rejects
- DeepSeek R1 reasoning format test (`<think>` tags in content)
- Support key-pair JWT auth (no PAT required)
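The blocklist-to-allowlist switch can be sketched as below (the set contents here are illustrative Claude IDs from this PR; the exact allowlist lives in `snowflake.ts` and grows in later commits):

```typescript
// Allowlist sketch: only listed models keep their tool definitions;
// for every other model, `tools`/`tool_choice` are stripped before the
// request reaches Cortex, which would otherwise reject it.
const TOOLCALL_MODELS = new Set([
  "claude-3-5-sonnet",
  "claude-3-7-sonnet",
  "claude-sonnet-4-5",
])

function supportsToolCalls(model: string): boolean {
  return TOOLCALL_MODELS.has(model)
}
```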
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add OpenAI and additional Claude models from Snowflake Cortex docs
From official Snowflake documentation (docs.snowflake.com):
New models added (19 → 28 total):
- OpenAI: `openai-gpt-4.1`, `openai-gpt-5`, `openai-gpt-5-mini`,
`openai-gpt-5-nano`, `openai-gpt-5-chat`, `openai-gpt-oss-120b`
- Claude: `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-opus-4-5`,
`claude-4-sonnet`, `claude-4-opus`, `claude-3-7-sonnet`
- Meta: `llama4-maverick`; Mistral: `mistral-large`
Tool calling update:
- Per docs: "Tool calling is supported for OpenAI and Claude models only"
- Updated `TOOLCALL_MODELS` allowlist to include all OpenAI + Claude IDs
Note: OpenAI models were not available on this test account (returned
"unknown model") but are documented in the Cortex REST API reference.
Availability depends on region and account configuration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: verify model availability via live API, remove broken models
Probed all 28 documented models against ejjkbko-fub20041 with
cross-region enabled:
Verified working (13):
- Claude: claude-sonnet-4-6, claude-opus-4-6, claude-sonnet-4-5,
claude-opus-4-5, claude-haiku-4-5, claude-4-sonnet, claude-3-7-sonnet,
claude-3-5-sonnet
- OpenAI: openai-gpt-4.1, openai-gpt-5, openai-gpt-5-mini,
openai-gpt-5-nano, openai-gpt-5-chat
- OpenAI tool calling confirmed working (get_weather test)
Removed from registration (kept as comments):
- claude-4-opus: 403 "account not allowed" (gated)
- openai-gpt-oss-120b: 500 internal error (not stable)
Also verified:
- llama4-maverick, mistral-large: working
- GPT-5 preview variants return 200 but empty content (preview)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs+test: pre-release — docs, test gaps, and full model validation
Documentation:
- Add Snowflake Cortex section to docs/configure/providers.md with
auth instructions, model table, and cross-region note
- Add Snowflake Cortex to model format reference in models.md
- Add v0.5.6 changelog entry
Test gap fixes (46 → 52 unit tests):
- Content-length deletion after body transform
- Synthetic stop returns valid SSE Response object
- Both max_tokens + max_completion_tokens present (max_tokens wins)
- Unknown model tools stripped (allowlist default)
- tool_choice without tools stripped for non-toolcall models
- max_completion_tokens preserved when max_tokens absent
E2E model validation (37 pass against live API):
- All 26 registered models probed: 21 available, 4 gated/broken, 1 preview
- Accept 200/400/403/500 for model availability (accounts vary)
- Handle preview models returning empty content (openai-gpt-5-*)
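The token-limit precedence tested above can be sketched as (field names follow the OpenAI-compatible request shape; Cortex expects `max_completion_tokens`, and the helper name is an assumption):

```typescript
// Sketch: rename max_tokens to max_completion_tokens for Cortex.
// When both fields are present, max_tokens wins; when only
// max_completion_tokens is present, it is preserved unchanged.
function normalizeTokenLimit(body: { max_tokens?: number; max_completion_tokens?: number }) {
  const { max_tokens, ...rest } = body
  const limit = max_tokens ?? rest.max_completion_tokens
  return limit !== undefined ? { ...rest, max_completion_tokens: limit } : rest
}
```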
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: anandgupta42 <anand@altimate.ai>
Excerpt from the model table in docs/configure/providers.md (diff fragment, line markers stripped):

|`llama4-maverick`, `snowflake-llama-3.3-70b`, `llama3.1-70b`, `llama3.1-405b`, `llama3.1-8b`| No |
|`mistral-large`, `mistral-large2`, `mistral-7b`| No |
|`deepseek-r1`| No |

!!! note
    Model availability depends on your Snowflake region. Enable cross-region inference with `ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION'` for full model access.

## Custom / OpenAI-Compatible

Any OpenAI-compatible endpoint can be used as a provider:
Code excerpt from the provider implementation (truncated diff fragment, spacing and indentation restored):

```typescript
          // JSON parse error — pass original body through untransformed
        }
      }

      return fetch(requestInput, { ...init, headers, body })
    },
  }
},
methods: [
  {
    label: "Snowflake PAT",
    type: "oauth",
    authorize: async () => ({
      url: "https://app.snowflake.com",
      instructions:
        "Enter your credentials as: <account-identifier>::<PAT-token>\n e.g. myorg-myaccount::pat-token-here\n Create a PAT in Snowsight: Admin → Security → Programmatic Access Tokens",
      method: "code" as const,
      callback: async (code: string) => {
        const parsed = parseSnowflakePAT(code)
        if (!parsed) return { type: "failed" as const }
        return {
          type: "success" as const,
          access: parsed.token,
          refresh: "",
          // PATs have variable TTLs (default 90 days); use conservative expiry
```