Note
This reference is part of the prototype extension for the Azure CLI. See COMMANDS.md for the full command reference.
The az prototype extension uses AI models to power its agent-driven workflow (design → build → deploy). This document lists the available providers, their supported models, and guidance on choosing the right combination.
The extension supports three AI providers. Set the active provider with:
az prototype config set --key ai.provider --value <provider>
az prototype config set --key ai.model --value <model-id>| Provider | Config Value | Authentication | Best For |
|---|---|---|---|
| GitHub Copilot | copilot |
Copilot OAuth token (OS keychain, env vars, gh CLI) |
Recommended. Broadest model selection, including Anthropic Claude. |
| GitHub Models | github-models |
GitHub PAT via gh auth login (models:read scope) |
Experimentation with open-weight and frontier models. No Anthropic models. |
| Azure OpenAI | azure-openai |
Azure AD via DefaultAzureCredential (az login / managed identity) |
Enterprise deployments with data residency, private networking, or compliance requirements. |
Important
Anthropic Claude models are only available through the copilot provider. They are not available on GitHub Models or Azure OpenAI.
Recommended provider. Routes requests via direct HTTP calls to the GitHub Copilot enterprise API (https://api.enterprise.githubcopilot.com/chat/completions), which exposes models from OpenAI, Anthropic, and Google under a single authentication flow.
The raw OAuth token (gho_, ghu_, ghp_) is sent directly as a Bearer header with editor-identification headers — no JWT exchange or SDK subprocess is required. This makes requests fast and lightweight.
Prerequisites: An active GitHub Copilot Business or Enterprise licence assigned to your GitHub account.
The enterprise endpoint dynamically exposes the full Copilot model catalogue. The provider queries /models at runtime, but the curated table below reflects verified models:
Anthropic Claude
| Model ID | Name | Context Window | Notes |
|---|---|---|---|
claude-sonnet-4 |
Claude Sonnet 4 | 200K tokens | Default. Best balance of quality, speed, and cost for code generation. |
claude-sonnet-4.5 |
Claude Sonnet 4.5 | 200K tokens | Excellent coding model. |
claude-sonnet-4.6 |
Claude Sonnet 4.6 | 200K tokens | Latest Sonnet. |
claude-opus-4.5 |
Claude Opus 4.5 | 200K tokens | High-quality reasoning model. |
claude-opus-4.6 |
Claude Opus 4.6 | 200K tokens | Latest Opus. |
claude-opus-4.6-fast |
Claude Opus 4.6 (Fast) | 200K tokens | Faster variant of Opus 4.6. |
claude-opus-4.6-1m |
Claude Opus 4.6 (1M) | 1M tokens | Extended context Opus. |
claude-haiku-4.5 |
Claude Haiku 4.5 | 200K tokens | Fastest Claude. Good for simpler tasks. |
OpenAI GPT
| Model ID | Name | Context Window | Notes |
|---|---|---|---|
gpt-5.3-codex |
GPT-5.3 Codex | — | Latest GPT codex model. |
gpt-5.2-codex |
GPT-5.2 Codex | — | High-quality codex. |
gpt-5.2 |
GPT-5.2 | — | GPT-5 series. |
gpt-5.1-codex-max |
GPT-5.1 Codex Max | — | Maximum capability codex. |
gpt-5.1-codex |
GPT-5.1 Codex | — | Standard codex. |
gpt-5.1 |
GPT-5.1 | — | General-purpose GPT-5.1. |
gpt-5.1-codex-mini |
GPT-5.1 Codex Mini | — | Lightweight codex. |
gpt-5-mini |
GPT-5 Mini | — | Fast, cost-effective. |
gpt-4.1 |
GPT-4.1 | 1M tokens | Massive context window. |
gpt-4o-mini |
GPT-4o Mini | 128K tokens | Lower cost, good for simpler tasks. |
Google Gemini
| Model ID | Name | Context Window | Notes |
|---|---|---|---|
gemini-3-pro-preview |
Gemini 3 Pro (Preview) | — | Latest Gemini preview. |
gemini-2.5-pro |
Gemini 2.5 Pro | 1M tokens | Google's flagship model. Large context window. |
Model availability is dynamic — run
az prototype config showto see the full list queried from the API at runtime.
The Copilot provider resolves a raw OAuth token from the following sources, in priority order (first match wins):
| Priority | Source | Details |
|---|---|---|
| 1 | COPILOT_GITHUB_TOKEN env var |
Highest priority — set this to override all other sources |
| 2 | GH_TOKEN env var |
GitHub CLI-compatible environment variable |
| 3 | Copilot CLI keychain | Windows Credential Manager / macOS Keychain, written by copilot login |
| 4 | Copilot SDK config files | ~/.config/github-copilot/hosts.json or apps.json |
| 5 | gh auth token |
Reads the active token from the GitHub CLI subprocess |
| 6 | GITHUB_TOKEN env var |
Lowest priority fallback |
Important
The token must originate from an approved Copilot OAuth application (e.g. copilot login or the VS Code Copilot extension). Tokens created via gh auth login alone may return 403 Forbidden if the OAuth app is not approved for Copilot access in your organisation. For EMU (Enterprise Managed User) accounts, copilot login is the recommended setup method.
# Option A — Copilot CLI (recommended, especially for EMU accounts)
copilot login
# Option B — Environment variable
set COPILOT_GITHUB_TOKEN=gho_your_token_here
# Option C — GitHub CLI (may not work for all org policies)
gh auth loginRoutes requests through the GitHub Models inference API (models.inference.ai.azure.com). Uses the OpenAI SDK with model IDs in publisher/model-name format.
Authentication: Requires a GitHub Personal Access Token (PAT). Set via environment variable GITHUB_TOKEN or gh auth token.
Warning
Anthropic (Claude) models are not available on GitHub Models. If you need Claude, switch to the copilot provider.
| Model ID | Name | Provider | Context Window | Notes |
|---|---|---|---|---|
openai/gpt-4o |
GPT-4o | OpenAI | 128K tokens | Default for this provider. Reliable general-purpose model. |
openai/gpt-4.1 |
GPT-4.1 | OpenAI | 1M tokens | Massive context. Good for large repo analysis. |
openai/gpt-4o-mini |
GPT-4o Mini | OpenAI | 128K tokens | Lower cost, good for simpler tasks. |
openai/o3 |
o3 | OpenAI | 200K tokens | Reasoning model. Strong for complex multi-step problems. |
openai/o3-mini |
o3 Mini | OpenAI | 200K tokens | Smaller reasoning model. Faster, lower cost. |
meta/meta-llama-3.1-405b-instruct |
Llama 3.1 405B | Meta | 128K tokens | Largest open-weight model available. |
deepseek/deepseek-r1 |
DeepSeek R1 | DeepSeek | 128K tokens | Open-weight reasoning model. |
Routes requests to your own Azure OpenAI Service deployment. You control the model version, region, and networking.
Authentication: Uses Azure AD (Entra ID) via DefaultAzureCredential. API keys are intentionally not supported — all credentials stay within your Azure tenant.
Supported credential flows (in DefaultAzureCredential priority order):
- Managed Identity — Azure VMs, App Service, AKS, etc.
- Azure CLI —
az loginon a developer workstation - Visual Studio / VS Code — signed-in Azure account
- Environment variables —
AZURE_CLIENT_ID/AZURE_CLIENT_SECRET/AZURE_TENANT_ID
Prerequisites:
- An Azure subscription with an Azure OpenAI resource provisioned.
- At least one model deployed in the resource.
- The
Cognitive Services OpenAI Userrole assigned to your identity. - The
azure-identityPython package installed (pip install azure-identity).
az prototype config set --key ai.provider --value azure-openai
az prototype config set --key ai.azure_openai.endpoint --value https://<resource>.openai.azure.com/
az prototype config set --key ai.model --value <deployment-name>Models depend on what you deploy in your Azure OpenAI resource. Common deployments:
| Deployment Name | Typical Model | Context Window | Notes |
|---|---|---|---|
gpt-4o |
GPT-4o (2024-08-06+) | 128K tokens | Recommended general-purpose deployment. |
gpt-4.1 |
GPT-4.1 | 1M tokens | Available in select regions. |
gpt-4o-mini |
GPT-4o Mini | 128K tokens | Lower cost option. |
Tip
See Azure OpenAI model availability for regional deployment options.
For security, only endpoints matching https://<resource>.openai.azure.com/ are accepted. The following are blocked:
api.openai.com(public OpenAI)chat.openai.com(ChatGPT)platform.openai.com- Any non-Azure-hosted endpoint
For most users, the quickest path to a working setup:
az prototype config set --key ai.provider --value copilot
az prototype config set --key ai.model --value claude-sonnet-4This is the default configuration. Claude Sonnet 4 provides the best balance of code generation quality, architectural reasoning, and speed for the prototype workflow.
| Use Case | Provider | Model | Why |
|---|---|---|---|
| General prototyping | copilot |
claude-sonnet-4 |
Best code generation quality and architectural reasoning. |
| Complex architecture design | copilot |
claude-opus-4.6 |
Most capable model for nuanced design trade-offs. Slower but higher quality. |
| Fast iteration / cost-sensitive | copilot |
claude-haiku-4.5 |
Fastest response times. Good enough for straightforward generation tasks. |
| Very large codebases | copilot |
gpt-4.1 or gemini-2.5-pro |
1M token context windows let you feed entire repos. |
| Enterprise / compliance | azure-openai |
gpt-4o |
Data stays in your Azure tenant. Private endpoints, RBAC, audit logs. |
| Open-weight models | github-models |
meta/meta-llama-3.1-405b-instruct |
No vendor lock-in on the model itself. |
| Reasoning-heavy tasks | github-models |
openai/o3 |
Chain-of-thought reasoning for multi-step deployment planning. |
Different stages benefit from different model characteristics:
| Stage | Recommended Model | Rationale |
|---|---|---|
init |
Any (minimal AI usage) | Initialization is mostly scaffold work. |
design |
claude-sonnet-4 or claude-opus-4.6 |
Architecture design benefits from strong reasoning. Opus for complex multi-service designs. |
build |
claude-sonnet-4 |
Code generation is Sonnet's sweet spot — fast, high-quality Bicep/Terraform/app code. |
deploy |
claude-sonnet-4 |
Deployment troubleshooting needs good code understanding and Azure knowledge. |
analyze error |
claude-sonnet-4 |
Error diagnosis requires correlating logs, code, and Azure docs. |
analyze costs |
claude-haiku-4.5 or gpt-4o-mini |
Cost estimation is structured output — faster models work fine. |
generate docs |
claude-sonnet-4 |
Documentation generation benefits from natural language fluency. |
Note
The extension uses a single model across all stages. Per-stage model selection is planned for a future release.
Change the model at any time:
# Switch to a different model
az prototype config set --key ai.model --value claude-opus-4.6
# Switch provider entirely
az prototype config set --key ai.provider --value github-models
az prototype config set --key ai.model --value openai/gpt-4o
# Check current configuration
az prototype config show| Problem | Solution |
|---|---|
No Copilot credentials found |
Run copilot login, or set COPILOT_GITHUB_TOKEN env var, or try a different provider (--ai-provider github-models). |
403 Forbidden with copilot |
Your token likely came from an unapproved OAuth app. Run copilot login to get a token from the approved Copilot CLI app. Common with EMU accounts using gh auth login. |
401 Unauthorized with copilot |
Ensure you have an active GitHub Copilot Business or Enterprise licence. The provider will retry once automatically. |
401 Unauthorized with github-models |
Check your GitHub token has models:read scope. Run gh auth refresh --scopes models:read. |
Claude model on github-models |
Claude is not available on GitHub Models. Switch to copilot provider. |
Invalid Azure OpenAI endpoint |
Endpoint must match https://<resource>.openai.azure.com/. Public OpenAI endpoints are blocked. |
| Slow responses | Try a smaller/faster model like gpt-4o-mini. The copilot provider uses direct HTTP (no SDK overhead). |
| Token limit exceeded | Switch to a model with a larger context window (gpt-4.1, gemini-2.5-pro). |
| Timeout on large prompts | Increase the timeout: set COPILOT_TIMEOUT=600 (default is 480 seconds). |
| Feature | Copilot | GitHub Models | Azure OpenAI |
|---|---|---|---|
| Anthropic Claude | Yes | No | No |
| OpenAI GPT | Yes | Yes | Yes |
| Google Gemini | Yes | No | No |
| Open-weight models | No | Yes (Meta, DeepSeek) | No |
| Authentication | Copilot OAuth token (copilot login) |
GitHub PAT (gh auth login) |
Azure AD (az login / managed identity) |
| Data residency | GitHub-managed | GitHub-managed | Your Azure tenant |
| Private networking | No | No | Yes (Private Endpoints) |
| SLA | Copilot SLA | Preview (no SLA) | Azure OpenAI SLA |
| Cost | Included in Copilot plan | Free tier + usage | Azure OpenAI pricing |
The following provider names are explicitly blocked for security and policy compliance:
openai,chatgpt,public-openai,anthropic,cohere,aws-bedrock,huggingface
Only Azure-hosted / Microsoft-approved AI services (copilot, github-models, azure-openai) are permitted. Attempting to configure a blocked provider will result in a CLIError.