diff --git a/support/inference/articles/api-error-code-401-authentication-failed.mdx b/support/inference/articles/api-error-code-401-authentication-failed.mdx index ed19646af3..67c8216a7b 100644 --- a/support/inference/articles/api-error-code-401-authentication-failed.mdx +++ b/support/inference/articles/api-error-code-401-authentication-failed.mdx @@ -1,30 +1,32 @@ --- -title: "API error code 401 - Authentication failed" +title: "API error code 401: Authentication failed" keywords: ["Authentication & Access"] --- -A 401 error with the message "Authentication failed" means your authentication credentials are incorrect or your W&B project entity and/or name are incorrect. +A `401` error with the message "Authentication failed" means your authentication credentials are incorrect, or your W&B project entity or name is incorrect. This article explains why the Serverless Inference API returns this error and how to resolve it. ## Why this happens The Serverless Inference API requires valid authentication credentials to process requests. This error occurs when: -- Your API key is invalid, expired, or revoked -- Your W&B project entity name is incorrect -- Your W&B project name is incorrect +- Your API key is invalid, expired, or revoked. +- Your W&B project entity name is incorrect. +- Your W&B project name is incorrect. ## What you can do +Work through the following checks in order. Each step rules out one of the common causes listed above so you can isolate the source of the failure. + 1. **Verify your API key** - - Check that you're using the correct API key for your account - - Regenerate your API key if needed from your [W&B settings](https://wandb.ai/settings) + - Check that you're using the correct API key for your account. + - Regenerate your API key if needed from your [W&B settings](https://wandb.ai/settings). 2. 
**Check your project details** - - Ensure your W&B project entity (organization or username) is correct - - Verify that the project name matches an existing project + - Ensure your W&B project entity (organization or username) is correct. + - Verify that the project name matches an existing project. 3. **Contact support** - - If the issue persists after verifying your credentials, reach out to [W&B support](mailto:support@wandb.com) + - If the issue persists after verifying your credentials, contact [W&B support](mailto:support@wandb.com). --- diff --git a/support/inference/articles/api-error-code-402-you-exceeded-your-cur.mdx b/support/inference/articles/api-error-code-402-you-exceeded-your-cur.mdx index 6bcecf6f4c..2f8a7096e0 100644 --- a/support/inference/articles/api-error-code-402-you-exceeded-your-cur.mdx +++ b/support/inference/articles/api-error-code-402-you-exceeded-your-cur.mdx @@ -3,25 +3,22 @@ title: "API error code 402 - You exceeded your current quota" keywords: ["Quotas & Rate Limits", "Billing"] --- -A 402 error with the message "You exceeded your current quota, please check your plan and billing details" means you've run out of credits or reached your monthly spending cap. +A `402` error with the message `You exceeded your current quota, please check your plan and billing details` means you've run out of credits or reached your monthly spending cap. This article explains why the error occurs and how to restore access to Serverless Inference. ## Why this happens Serverless Inference uses a credit-based system. This error occurs when: -- Your account has no remaining credits -- You've reached the monthly spending cap set for your organization +- Your account has no remaining credits. +- You've reached the monthly spending cap set for your organization. -## What you can do +## Restore access -1. 
**Check your plan and billing details** - - Review your current usage and remaining credits in your W&B account settings +To restore access to Serverless Inference, do any of the following: -2. **Get more credits** - - Purchase additional credits or upgrade your plan - -3. **Increase your limits** - - Adjust your monthly spending cap if one is configured +- **Check your plan and billing details.** Review your current usage and remaining credits in your W&B account settings. +- **Get more credits.** Purchase additional credits or upgrade your plan. +- **Increase your limits.** Adjust your monthly spending cap if one is configured. For more information, see [Usage information and limits](/inference/usage-limits/). diff --git a/support/inference/articles/api-error-code-403-country-region-or-ter.mdx b/support/inference/articles/api-error-code-403-country-region-or-ter.mdx index 67f5f189c1..58fcb3de80 100644 --- a/support/inference/articles/api-error-code-403-country-region-or-ter.mdx +++ b/support/inference/articles/api-error-code-403-country-region-or-ter.mdx @@ -3,29 +3,29 @@ title: "API error code 403 - Country, region, or territory not supported" keywords: ["Authentication & Access"] --- -A 403 error with the message "Country, region, or territory not supported" means you're accessing Serverless Inference from an unsupported location. +A `403` error with the message "Country, region, or territory not supported" means you're accessing Serverless Inference from an unsupported location. This article explains why the error occurs and how to regain access to the service. ## Why this happens -Serverless Inference has geographic restrictions due to compliance and regulatory requirements. The service is only accessible from supported geographic locations. +Serverless Inference has geographic restrictions due to compliance and regulatory requirements. You can only access the service from supported geographic locations. ## What you can do -1. 
**Check the geographic restrictions** - - Review the [geographic restrictions](/inference/usage-limits/#geographic-restrictions) for the current list of supported locations +To resolve the error, work through the following options to confirm whether your location is supported and identify a path forward: -2. **Use from a supported location** - - Access the service when in a supported country or region - - Consider using your organization's resources in supported locations - -3. **Contact your account team** - - Enterprise customers can discuss options with their account executive - - Some organizations may have special arrangements +- **Check the geographic restrictions**: Review the [geographic restrictions](/inference/usage-limits/#geographic-restrictions) for the list of supported locations to confirm whether your country or region is eligible. +- **Use from a supported location**: + - Access the service when in a supported country or region. + - Consider using your organization's resources in supported locations. +- **Contact your account team**: + - Enterprise customers can discuss options with their account executive. + - Some organizations might have special arrangements. ## Error details When you see this error: -``` + +```json { "error": { "code": 403, @@ -34,7 +34,7 @@ When you see this error: } ``` -This is determined by your IP address location at the time of the API request. +Serverless Inference determines your location from your IP address at the time of the API request. 
--- diff --git a/support/inference/articles/api-error-code-403-the-inference-gateway.mdx b/support/inference/articles/api-error-code-403-the-inference-gateway.mdx index 30e69b504e..1bfab14414 100644 --- a/support/inference/articles/api-error-code-403-the-inference-gateway.mdx +++ b/support/inference/articles/api-error-code-403-the-inference-gateway.mdx @@ -1,21 +1,18 @@ --- -title: "API error code 403 - The inference gateway is not enabled for your organization" +title: "API error code 403: The inference gateway is not enabled for your organization" keywords: ["Authentication & Access", "Administrator"] --- -A 403 error with the message "The inference gateway is not enabled for your organization" means your organization doesn't have the inference gateway enabled, which is required to use Serverless Inference. +A `403` error with the message `The inference gateway is not enabled for your organization` means your organization doesn't have the inference gateway enabled, which is required to use Serverless Inference. This article explains why the error occurs and how to enable the gateway. ## Why this happens -Serverless Inference requires the inference gateway to be enabled at the organization level. If your organization hasn't enabled this feature, all API requests will be rejected with a 403 error. +Serverless Inference requires the inference gateway to be enabled at the organization level. If your organization hasn't enabled this feature, the gateway rejects all API requests with a `403` error. ## What you can do -1. **Contact your W&B administrator** - - Ask your organization's W&B administrator to enable the inference gateway - -2. **Reach out to W&B support** - - If you're unsure who your administrator is, or need help enabling the gateway, contact [W&B support](mailto:support@wandb.com) for assistance +- **Contact your W&B administrator**: Ask your organization's W&B administrator to enable the inference gateway. 
+- **Contact W&B support**: If you're unsure who your administrator is, or need help enabling the gateway, contact [W&B support](mailto:support@wandb.com). --- diff --git a/support/inference/articles/api-error-code-429-concurrency-limit-rea.mdx b/support/inference/articles/api-error-code-429-concurrency-limit-rea.mdx index 125b76aa87..b1d8fbd546 100644 --- a/support/inference/articles/api-error-code-429-concurrency-limit-rea.mdx +++ b/support/inference/articles/api-error-code-429-concurrency-limit-rea.mdx @@ -3,20 +3,20 @@ title: "API error code 429 - Concurrency limit reached for requests" keywords: ["Quotas & Rate Limits"] --- -A 429 error with the message "Concurrency limit reached for requests" means you're sending too many concurrent requests to the Serverless Inference API. +A `429` error with the message "Concurrency limit reached for requests" means you're sending too many concurrent requests to the Serverless Inference API. This page explains why the error occurs and how to resolve it so your requests succeed. ## Why this happens -Serverless Inference enforces concurrency limits to ensure fair usage and service stability. When the number of simultaneous requests from your account exceeds the allowed limit, additional requests are rejected with a 429 status code. +Serverless Inference enforces concurrency limits to maintain fair usage and service stability. When the number of simultaneous requests from your account exceeds the allowed limit, additional requests are rejected with a `429` status code. ## What you can do -1. **Reduce concurrent requests** - - Implement request queuing or throttling in your application - - Use exponential backoff when retrying failed requests +To resolve the error, choose one or both of the following approaches based on your workload and plan. -2. 
**Increase your limits** - - Review your plan's concurrency limits and upgrade if needed +- **Reduce concurrent requests** to stay within your current limit: + - Implement request queuing or throttling in your application. + - Use exponential backoff when retrying failed requests. +- **Increase your limits** if your workload requires more capacity. Review your plan's concurrency limits and upgrade if needed. For more information, see [Usage information and limits](/inference/usage-limits/). diff --git a/support/inference/articles/api-error-code-500-the-server-had-an-err.mdx b/support/inference/articles/api-error-code-500-the-server-had-an-err.mdx index aa0b5ad05c..3500e620ba 100644 --- a/support/inference/articles/api-error-code-500-the-server-had-an-err.mdx +++ b/support/inference/articles/api-error-code-500-the-server-had-an-err.mdx @@ -3,20 +3,18 @@ title: "API error code 500 - The server had an error while processing your reque keywords: ["Server Errors"] --- -A 500 error with the message "The server had an error while processing your request" indicates an internal server error on the Serverless Inference side. +A `500` error with the message `The server had an error while processing your request` indicates an internal server error in Serverless Inference. ## Why this happens -Internal server errors are typically transient issues caused by temporary problems on the server side. These are not caused by your request or configuration. +Internal server errors are typically transient and originate on the server side. They aren't caused by your request or configuration. ## What you can do -1. **Retry after a brief wait** - - Wait a few seconds and retry your request - - Use exponential backoff for automated retries +To resolve the error, follow these steps: -2. **Contact support if it persists** - - If you continue to see 500 errors after multiple retries, contact [W&B support](mailto:support@wandb.com) with details about your request +1. 
**Retry after a brief wait.** Wait a few seconds, then retry your request. For automated retries, use exponential backoff. +2. **Contact support if the error persists.** If you continue to see `500` errors after multiple retries, contact [W&B support](mailto:support@wandb.com) with details about your request. --- diff --git a/support/inference/articles/api-error-code-503-the-engine-is-current.mdx b/support/inference/articles/api-error-code-503-the-engine-is-current.mdx index 34185b6ecc..6e2752d6e4 100644 --- a/support/inference/articles/api-error-code-503-the-engine-is-current.mdx +++ b/support/inference/articles/api-error-code-503-the-engine-is-current.mdx @@ -1,23 +1,24 @@ --- -title: "API error code 503 - The engine is currently overloaded" +title: "API error code 503: The engine is currently overloaded" keywords: ["Server Errors"] --- -A 503 error with the message "The engine is currently overloaded, please try again later" means the Serverless Inference server is experiencing high traffic and cannot process your request right now. +A `503` error with the message "The engine is currently overloaded, please try again later" means the Serverless Inference server is experiencing high traffic and can't process your request. This page explains why the error occurs and how to mitigate it. ## Why this happens -During periods of high demand, the inference engine may become temporarily overloaded. This is a transient condition that typically resolves on its own as traffic subsides. +During periods of high demand, the inference engine can become temporarily overloaded. This condition typically resolves on its own as traffic subsides. ## What you can do -1. **Retry after a short delay** - - Wait a few seconds before retrying your request - - Use exponential backoff to avoid adding to the congestion +Use the following strategies to recover from a `503` response and reduce the chance of encountering it again: -2. 
**Spread out requests** - - If you're sending many requests, consider spacing them out over time - - Implement request queuing to smooth traffic spikes +- **Retry after a short delay**: + - Wait a few seconds before retrying your request. + - Use exponential backoff to avoid adding to the congestion. +- **Spread out requests**: + - If you're sending many requests, space them out over time. + - Implement request queuing to smooth traffic spikes. --- diff --git a/support/inference/tags/administrator.mdx b/support/inference/tags/administrator.mdx index ecdc4ad5ca..83653f5d0f 100644 --- a/support/inference/tags/administrator.mdx +++ b/support/inference/tags/administrator.mdx @@ -5,6 +5,6 @@ generator: "knowledgebase-nav" template: "scripts/knowledgebase-nav/templates/support_tag.mdx.j2" --- - - A 403 error with the message "The inference gateway is not enabled for your organization" means your organization doesn' ... + + A 403 error with the message The inference gateway is not enabled for your organization means your organization doesn't ... diff --git a/support/inference/tags/authentication-access.mdx b/support/inference/tags/authentication-access.mdx index c3eb0cf251..2566fc8909 100644 --- a/support/inference/tags/authentication-access.mdx +++ b/support/inference/tags/authentication-access.mdx @@ -5,12 +5,12 @@ generator: "knowledgebase-nav" template: "scripts/knowledgebase-nav/templates/support_tag.mdx.j2" --- - - A 401 error with the message "Authentication failed" means your authentication credentials are incorrect or your W&B pro ... + + A 401 error with the message "Authentication failed" means your authentication credentials are incorrect, or your W&B pr ... A 403 error with the message "Country, region, or territory not supported" means you're accessing Serverless Inference f ... - - A 403 error with the message "The inference gateway is not enabled for your organization" means your organization doesn' ... 
+ + A 403 error with the message The inference gateway is not enabled for your organization means your organization doesn't ... diff --git a/support/inference/tags/billing.mdx b/support/inference/tags/billing.mdx index dbd2d29057..d8dec0a136 100644 --- a/support/inference/tags/billing.mdx +++ b/support/inference/tags/billing.mdx @@ -6,5 +6,5 @@ template: "scripts/knowledgebase-nav/templates/support_tag.mdx.j2" --- - A 402 error with the message "You exceeded your current quota, please check your plan and billing details" means you've ... + A 402 error with the message You exceeded your current quota, please check your plan and billing details means you've ru ... diff --git a/support/inference/tags/quotas-rate-limits.mdx b/support/inference/tags/quotas-rate-limits.mdx index 00c17aad9f..49c2412542 100644 --- a/support/inference/tags/quotas-rate-limits.mdx +++ b/support/inference/tags/quotas-rate-limits.mdx @@ -6,7 +6,7 @@ template: "scripts/knowledgebase-nav/templates/support_tag.mdx.j2" --- - A 402 error with the message "You exceeded your current quota, please check your plan and billing details" means you've ... + A 402 error with the message You exceeded your current quota, please check your plan and billing details means you've ru ... A 429 error with the message "Concurrency limit reached for requests" means you're sending too many concurrent requests ... diff --git a/support/inference/tags/server-errors.mdx b/support/inference/tags/server-errors.mdx index ea4eacfda6..2c758a2dd3 100644 --- a/support/inference/tags/server-errors.mdx +++ b/support/inference/tags/server-errors.mdx @@ -6,8 +6,8 @@ template: "scripts/knowledgebase-nav/templates/support_tag.mdx.j2" --- - A 500 error with the message "The server had an error while processing your request" indicates an internal server error ... + A 500 error with the message The server had an error while processing your request indicates an internal server error in ... 
- + A 503 error with the message "The engine is currently overloaded, please try again later" means the Serverless Inference ...
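The 429, 500, and 503 articles touched in this diff each recommend exponential backoff for retries, but none shows what that looks like. A minimal sketch follows, not taken from the W&B docs or SDK: `send_request`, `call_with_backoff`, `backoff_delay`, and the default delay values are all hypothetical names and numbers chosen for illustration, and the set of retryable status codes is simply the three codes these articles describe as transient or limit-related.

```python
import random
import time

# Status codes the articles in this diff describe as worth retrying.
RETRYABLE_STATUSES = {429, 500, 503}


def backoff_delay(attempt, base_delay=1.0, max_delay=30.0, rng=random.random):
    """Return a jittered delay in seconds for a zero-based retry attempt."""
    # Double the ceiling on every attempt, but never exceed max_delay.
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    # "Full jitter": pick a point between 0 and the ceiling so that many
    # clients retrying at once do not all hit the server at the same moment.
    return ceiling * rng()


def call_with_backoff(send_request, max_attempts=5, sleep=time.sleep, **delay_kwargs):
    """Call send_request() until it returns a non-retryable status.

    send_request is a hypothetical zero-argument callable that returns
    (status_code, body); it stands in for whatever HTTP client you use.
    """
    for attempt in range(max_attempts):
        status, body = send_request()
        if status not in RETRYABLE_STATUSES:
            return status, body
        # Sleep only if another attempt will follow.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, **delay_kwargs))
    # Still failing after max_attempts; hand the last response to the caller.
    return status, body
```

Passing `sleep` and `rng` as parameters keeps the sketch testable without real waiting. In application code you would wrap your actual request in `send_request` and, per the 500 article, contact support if the final status is still an error.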