| 1 | +--- |
| 2 | +title: "How to Configure OrcaRouter with Continue" |
| 3 | +sidebarTitle: "OrcaRouter" |
| 4 | +--- |
| 5 | + |
| 6 | +<Info> |
| 7 | + [OrcaRouter](https://www.orcarouter.ai) is an OpenAI-compatible API gateway that aggregates ~120 chat models from OpenAI, Anthropic, Google, DeepSeek, xAI, Qwen, Kimi, MiniMax, Z-AI, and other vendors behind a single `sk-orca-` key. |
| 8 | +</Info> |
| 9 | + |
| 10 | +<Tip> |
| 11 | + Sign up at [orcarouter.ai](https://www.orcarouter.ai) and obtain an API key from your [console](https://www.orcarouter.ai/console). |
| 12 | +</Tip> |
| 13 | + |
| 14 | +## Quickstart |
| 15 | + |
| 16 | +<Tabs> |
| 17 | + <Tab title="YAML"> |
| 18 | + ```yaml title="config.yaml" |
| 19 | + name: My Config |
| 20 | + version: 0.0.1 |
| 21 | + schema: v1 |
| 22 | + |
| 23 | + models: |
| 24 | + - name: OrcaRouter Auto |
| 25 | + provider: orcarouter |
| 26 | + model: orcarouter/auto |
| 27 | + apiBase: https://api.orcarouter.ai/v1/ |
| 28 | + apiKey: sk-orca-xxxxxxxxxxxxxxxxxx |
| 29 | + ``` |
| 30 | + </Tab> |
| 31 | + <Tab title="JSON (Deprecated)"> |
| 32 | + ```json title="config.json" |
| 33 | + { |
| 34 | + "models": [ |
| 35 | + { |
| 36 | + "title": "OrcaRouter Auto", |
| 37 | + "provider": "orcarouter", |
| 38 | + "model": "orcarouter/auto", |
| 39 | + "apiBase": "https://api.orcarouter.ai/v1/", |
| 40 | + "apiKey": "sk-orca-xxxxxxxxxxxxxxxxxx" |
| 41 | + } |
| 42 | + ] |
| 43 | + } |
| 44 | + ``` |
| 45 | + </Tab> |
| 46 | +</Tabs> |
| 47 | + |
| 48 | +`orcarouter/auto` is a virtual model that adaptively routes each request to a candidate upstream based on a configurable strategy (cheapest / balanced / quality / contextual bandit / difficulty-gated). Routing pools and reward weights are tunable from the [console](https://www.orcarouter.ai/console/routing) without changing client code. |
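
The quickstart hard-codes the key for brevity. A minimal sketch of the same entry that reads the key from a secret instead, assuming a secret named `ORCAROUTER_API_KEY` has been defined for Continue (the same placeholder the later examples on this page use):

```yaml title="config.yaml"
name: My Config
version: 0.0.1
schema: v1

models:
  - name: OrcaRouter Auto
    provider: orcarouter
    model: orcarouter/auto
    apiBase: https://api.orcarouter.ai/v1/
    # Assumption: ORCAROUTER_API_KEY is a secret you have defined; the name is illustrative.
    apiKey: ${{ secrets.ORCAROUTER_API_KEY }}
```

Keeping the key out of `config.yaml` makes the file safer to share or commit.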
| 49 | + |
| 50 | +## Pinning a specific model |
| 51 | + |
| 52 | +You can pin any model from the [OrcaRouter catalog](https://www.orcarouter.ai/models) by passing its full ID: |
| 53 | + |
| 54 | +<Tabs> |
| 55 | + <Tab title="YAML"> |
| 56 | + ```yaml title="config.yaml" |
| 57 | + models: |
| 58 | + - name: Claude Opus 4.7 |
| 59 | + provider: orcarouter |
| 60 | + model: anthropic/claude-opus-4.7 |
| 61 | + apiBase: https://api.orcarouter.ai/v1/ |
| 62 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 63 | + |
| 64 | + - name: GPT-5.5 |
| 65 | + provider: orcarouter |
| 66 | + model: openai/gpt-5.5 |
| 67 | + apiBase: https://api.orcarouter.ai/v1/ |
| 68 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 69 | + |
| 70 | + - name: DeepSeek V4 Pro |
| 71 | + provider: orcarouter |
| 72 | + model: deepseek/deepseek-v4-pro |
| 73 | + apiBase: https://api.orcarouter.ai/v1/ |
| 74 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 75 | + ``` |
| 76 | + </Tab> |
| 77 | +</Tabs> |
| 78 | + |
| 79 | +## Reasoning controls |
| 80 | + |
| 81 | +OrcaRouter passes vendor-native reasoning controls through to the upstream: |
| 82 | + |
| 83 | +### OpenAI / Grok / Gemini / Qwen reasoning families |
| 84 | + |
| 85 | +Use the flat `reasoning_effort` field via `requestOptions.extraBodyProperties`: |
| 86 | + |
| 87 | +```yaml title="config.yaml" |
| 88 | +models: |
| 89 | + - name: GPT-5.5 (High Reasoning) |
| 90 | + provider: orcarouter |
| 91 | + model: openai/gpt-5.5 |
| 92 | + apiBase: https://api.orcarouter.ai/v1/ |
| 93 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 94 | + requestOptions: |
| 95 | + extraBodyProperties: |
| 96 | + reasoning_effort: high # "minimal" | "low" | "medium" | "high" |
| 97 | +``` |
| 98 | + |
| 99 | +### Gemini reasoning caveat |
| 100 | + |
| 101 | +Some Gemini models (including `gemini-3-flash-preview`) are reasoning models that spend most of their `completion_tokens` budget on internal reasoning before producing the final reply. The streaming response stays silent for several seconds during that period, which can look like the chat is blank or stuck. |
| 102 | + |
| 103 | +For fast responses, pass `reasoning_effort: "minimal"`: |
| 104 | + |
| 105 | +```yaml title="config.yaml" |
| 106 | +models: |
| 107 | + - name: Gemini 3 Flash (Fast) |
| 108 | + provider: orcarouter |
| 109 | + model: google/gemini-3-flash-preview |
| 110 | + apiBase: https://api.orcarouter.ai/v1/ |
| 111 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 112 | + requestOptions: |
| 113 | + extraBodyProperties: |
| 114 | + reasoning_effort: minimal |
| 115 | +``` |
| 116 | + |
| 117 | +### Anthropic Claude (thinking block) |
| 118 | + |
| 119 | +Anthropic reasoning models use a native `thinking` block: |
| 120 | + |
| 121 | +```yaml title="config.yaml" |
| 122 | +models: |
| 123 | + - name: Claude Opus 4.7 (Thinking) |
| 124 | + provider: orcarouter |
| 125 | + model: anthropic/claude-opus-4.7 |
| 126 | + apiBase: https://api.orcarouter.ai/v1/ |
| 127 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 128 | + requestOptions: |
| 129 | + extraBodyProperties: |
| 130 | + thinking: |
| 131 | + type: enabled |
| 132 | + budget_tokens: 2000 # >= 1024, must be < max_tokens |
| 133 | +``` |
| 134 | + |
| 135 | +<Note> |
| 136 | + Anthropic reasoning models (e.g. `claude-opus-4.7`) reject `temperature` and `top_k` (`topK` in Continue). Omit both from `defaultCompletionOptions` (`completionOptions` in `config.json`) when targeting these models. |
| 137 | +</Note> |
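
Because `budget_tokens` must stay below `max_tokens`, it can help to set the output limit explicitly alongside the thinking budget. A minimal sketch, assuming Continue's `defaultCompletionOptions.maxTokens` is forwarded to the upstream as `max_tokens` and that 4096 is an acceptable limit for this model (both are assumptions, not confirmed OrcaRouter behavior). `temperature` and `topK` are deliberately left unset:

```yaml title="config.yaml"
models:
  - name: Claude Opus 4.7 (Thinking, explicit limit)
    provider: orcarouter
    model: anthropic/claude-opus-4.7
    apiBase: https://api.orcarouter.ai/v1/
    apiKey: ${{ secrets.ORCAROUTER_API_KEY }}
    defaultCompletionOptions:
      maxTokens: 4096          # assumed to map to max_tokens; illustrative value
      # no temperature / topK here: the note above says these models reject them
    requestOptions:
      extraBodyProperties:
        thinking:
          type: enabled
          budget_tokens: 2000  # >= 1024 and below maxTokens
```

If the upstream rejects the request, check first that the thinking budget is still smaller than the output limit.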
| 138 | + |
| 139 | +## Fallback chain |
| 140 | + |
| 141 | +When the primary upstream fails, OrcaRouter can fall back to a list of alternative models that you supply in the request's `extra_body` field: |
| 142 | + |
| 143 | +```yaml title="config.yaml" |
| 144 | +models: |
| 145 | + - name: OrcaRouter Auto with Fallback |
| 146 | + provider: orcarouter |
| 147 | + model: orcarouter/auto |
| 148 | + apiBase: https://api.orcarouter.ai/v1/ |
| 149 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 150 | + requestOptions: |
| 151 | + extraBodyProperties: |
| 152 | + extra_body: |
| 153 | + models: [deepseek/deepseek-v4-pro] |
| 154 | + route: fallback |
| 155 | +``` |
| 156 | + |
| 157 | +## Agent / tool-calling caveat |
| 158 | + |
| 159 | +The default `orcarouter/auto` pool may include models that do not support function calling. If you use Continue's Agent mode (which sends a `tools` field), either pin a specific tool-capable model such as `anthropic/claude-opus-4.7` or `openai/gpt-5.5`, or restrict the `orcarouter/auto` routing pool in the [console](https://www.orcarouter.ai/console/routing) to tool-capable upstreams only. |
| 160 | + |
| 161 | +```yaml title="config.yaml" |
| 162 | +models: |
| 163 | + - name: Claude Opus 4.7 (for Agent) |
| 164 | + provider: orcarouter |
| 165 | + model: anthropic/claude-opus-4.7 |
| 166 | + apiBase: https://api.orcarouter.ai/v1/ |
| 167 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 168 | + capabilities: |
| 169 | + - tool_use |
| 170 | + roles: |
| 171 | + - chat |
| 172 | + - edit |
| 173 | +``` |
| 174 | + |
| 175 | +## Prompt caching |
| 176 | + |
| 177 | +For Anthropic Claude models, Continue automatically injects `cache_control: { type: "ephemeral" }` on the system message and the last two user turns when `cacheBehavior` or `promptCaching` is enabled: |
| 178 | + |
| 179 | +```yaml title="config.yaml" |
| 180 | +models: |
| 181 | + - name: Claude Opus 4.7 (Cached) |
| 182 | + provider: orcarouter |
| 183 | + model: anthropic/claude-opus-4.7 |
| 184 | + apiBase: https://api.orcarouter.ai/v1/ |
| 185 | + apiKey: ${{ secrets.ORCAROUTER_API_KEY }} |
| 186 | + cacheBehavior: |
| 187 | + cacheSystemMessage: true |
| 188 | + cacheConversation: true |
| 189 | +``` |
| 190 | + |
| 191 | +## See also |
| 192 | + |
| 193 | +- [Full model catalog](https://www.orcarouter.ai/models) |
| 194 | +- [Routing configuration console](https://www.orcarouter.ai/console/routing) |
| 195 | +- [API documentation](https://docs.orcarouter.ai) |