Describe the feature or problem you'd like to solve
Copilot CLI exhibits aggressive retry behavior when encountering HTTP 429 (Too Many Requests) responses. When the CLI receives a 429, it immediately closes the connection and retries without waiting, creating a loop of rapid retries (20+ per minute) even though GitHub requests wait times of 5-10 minutes via the `Retry-After` header. This behavior causes rapid rate limit quota depletion, unnecessary server load from retry spam, and poor user experience with no feedback about rate limiting.
Proposed solution
Implement standard HTTP retry patterns:
**1. Exponential Backoff with Jitter**
`wait_time = min(2^attempt * 60s + random_jitter, 300s)`
- First 429: Wait 60 seconds (or use Retry-After value)
- Second 429: Wait 120 seconds
- Third 429: Wait 240 seconds
- Subsequent: Wait 5 minutes (capped)
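The schedule above can be sketched in a few lines (illustrative Python, not the CLI's actual implementation; the function name and the 0-5 s jitter range are assumptions):

```python
import random

def backoff_delay(attempt: int, base: float = 60.0, cap: float = 300.0,
                  max_jitter: float = 5.0) -> float:
    """wait_time = min(2^attempt * base + jitter, cap); attempt is 0-based."""
    jitter = random.uniform(0.0, max_jitter)
    return min((2 ** attempt) * base + jitter, cap)
```

With `attempt` starting at 0, this yields ~60 s, ~120 s, ~240 s, then a flat 300 s cap, matching the schedule above; the jitter spreads out retries from many clients that were rate-limited at the same moment.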
**2. Respect `Retry-After` Headers**
- Parse the `Retry-After` header from 429 responses
- Use header value as minimum wait time
- Apply exponential backoff if subsequent 429s occur
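Parsing the header needs to handle both forms the HTTP spec allows, delta-seconds and an HTTP-date. A minimal sketch (function name is an assumption):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def parse_retry_after(value: str) -> Optional[float]:
    """Return seconds to wait from a Retry-After value, or None if unparseable.

    Retry-After may be delta-seconds ("120") or an HTTP-date
    ("Wed, 01 Jan 2031 00:00:00 GMT").
    """
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())
```

The parsed value would serve as the floor for the wait; the backoff schedule takes over only when it exceeds the header's value on later attempts.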
**3. Circuit Breaker Pattern**
- After 3 consecutive 429s, enter cooldown mode for 10 minutes
- Display: "Rate limit protection active. Cooling down for 10 minutes..."
- Reset counter on successful request
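A hedged sketch of such a breaker (class name and the injectable clock are illustrative choices, not the CLI's design):

```python
import time

class RateLimitBreaker:
    """Open after `threshold` consecutive 429s; cool down for `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 600.0,
                 clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock          # injectable for testing
        self._failures = 0
        self._open_until = 0.0

    def allow(self) -> bool:
        """True if requests may proceed (breaker closed or cooldown elapsed)."""
        return self._clock() >= self._open_until

    def record_429(self) -> None:
        self._failures += 1
        if self._failures >= self.threshold:
            self._open_until = self._clock() + self.cooldown
            self._failures = 0

    def record_success(self) -> None:
        """Any successful request resets the consecutive-failure counter."""
        self._failures = 0
        self._open_until = 0.0
```

While `allow()` returns False, the CLI would print the cooldown message instead of issuing requests.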
**4. User-Facing Progress Indicators**
- Show clear messages during rate limit waits
- Display countdown timer for retry
- Indicate number of attempts made
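The progress line could be as simple as (wording and function name are suggestions only):

```python
def wait_message(attempt: int, remaining_seconds: float) -> str:
    """Countdown line shown while waiting out a rate limit."""
    minutes, seconds = divmod(int(remaining_seconds), 60)
    return (f"Rate limited by server (attempt {attempt}). "
            f"Retrying in {minutes:02d}:{seconds:02d}...")
```

Redrawing this line once per second gives the countdown timer and attempt count described above without cluttering the terminal.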
Benefits:
- Prevents rapid rate limit quota depletion
- Reduces server load from retry spam
- Provides clear user feedback
- Follows HTTP best practices
Example prompts or workflows
Example 1: Batch file processing
A user processes 50 files with Copilot. Currently, if rate limited mid-batch, Copilot spams retries and burns through the rate limit quota. With proper backoff, it would gracefully pause and resume.
Example 2: Long-running sessions
A developer uses Copilot for a 4-hour coding session. Currently, they hit rate limits in hour 2 and lose the remaining time. With backoff, the session continues at reduced speed instead of breaking.
Example 3: Multi-agent workflows
Copilot delegates to sub-agents for complex tasks. Currently, if a sub-agent hits a rate limit, the parent agent retries immediately, causing a cascading failure. With backoff, the system stabilizes.
Additional context
Current Behavior
- Copilot CLI ignores `Retry-After` headers
- No exponential backoff implemented
- Immediate retry on 429 responses
- No user feedback during rate limiting
Expected Behavior
- Parse and respect `Retry-After` headers
- Implement exponential backoff between retries
- Show user progress during waits
- Circuit breaker after consecutive failures
Environment
- Copilot CLI Version: 1.0.28 (observed, likely affects all versions)
- Operating System: Linux, macOS, Windows (all platforms affected)
Impact
For Users:
- Weekly rate limit quota depleted rapidly (sometimes within hours)
- No feedback indicating rate limiting is occurring
- Extended downtime once limits are hit
- Loss of context when forced to restart sessions
For GitHub Infrastructure:
- Unnecessary server load from retry spam (20+ requests/minute during rate limiting)
- Wasted compute on requests guaranteed to fail
- Escalating rate limit penalties for affected users (delays increase from 5min → 10min → 15min+)
Technical Details
Root Cause:
The issue stems from two missing features in Copilot CLI's HTTP client:
- No exponential backoff - retry logic lacks progressive delay increases
- `Retry-After` header ignored - the client does not parse or respect this standard HTTP header
Client Timeout Behavior:
Testing reveals Copilot CLI has an internal HTTP client timeout of approximately 30-60 seconds:
- Hold durations under 30 seconds: Client waits successfully
- Hold durations over 60 seconds: Client times out and retries immediately
This suggests the client interprets long response delays as connection failures rather than intentional rate limit handling.
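One way to reconcile a short socket timeout with long rate-limit waits is to sleep client-side between attempts rather than holding the connection open. A sketch under that assumption (the injected `send` and `sleep` parameters are for illustration; the CLI's actual HTTP layer will differ):

```python
import time

def request_with_backoff(send, max_attempts: int = 5, sleep=time.sleep) -> int:
    """`send()` performs one HTTP attempt with a short socket timeout and
    returns (status_code, headers). Rate-limit waits happen between attempts,
    so a long Retry-After never trips the connection timeout."""
    status, headers = 0, {}
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        retry_after = headers.get("Retry-After", "")
        # Use the server's hint when present, else exponential backoff capped at 5 min.
        wait = (float(retry_after) if retry_after.isdigit()
                else min((2 ** attempt) * 60.0, 300.0))
        sleep(wait)
    return status
```

Because each individual request completes quickly (with either a response or a 429), the 30-60 second client timeout never fires; the long waits live entirely in the retry loop.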