Commit 52e1edf

Retry CreateOnGithubJob on GitHub auth 401s
Deployment/status creation surfaces a transient Octokit::Unauthorized when a GitHub installation token is rejected or still propagating. CommitDeployment#create_on_github! only rescues NotFound/Forbidden, so the 401 escaped the job unhandled and reopened the Observe issue.

Add retry_on Octokit::Unauthorized to CreateOnGithubJob with polynomially_longer backoff and attempts: 6 (~15m window) so transient auth failures settle before giving up. On exhaustion, log and do not re-raise, matching the existing NotFound/Forbidden give-up behavior.

No token cache or client changes.

Fixes shop/issues#8801
1 parent 3b20110 commit 52e1edf

5 files changed

Lines changed: 47 additions & 4 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # Unreleased
 
+# 0.45.3
+* Retry CreateOnGithubJob on transient GitHub authentication failures.
+* Stabilize PerformTaskJob tests by stubbing the task execution strategy instead of Command#stream!.
+
 # 0.45.2
 * (bugfix) Fix 404 error when removing all permissions from an API client

app/jobs/shipit/create_on_github_job.rb

Lines changed: 10 additions & 0 deletions
@@ -7,6 +7,16 @@ class CreateOnGithubJob < BackgroundJob
     queue_as :default
     on_duplicate :drop
 
+    # GitHub installation-token rejection / propagation lag surfaces as a transient 401.
+    # Retry on a backoff so it settles before we give up, instead of crashing the job.
+    retry_on Octokit::Unauthorized, wait: :polynomially_longer, attempts: 6 do |job, exception|
+      record = job.arguments.first
+      Rails.logger.warn(
+        "[CreateOnGithubJob] Giving up on #{record.class.name} #{record.id} " \
+        "after GitHub authentication failures: #{exception.class} #{exception.message}"
+      )
+    end
+
     # We observe that some objects regularly take longer than the default 10 seconds to create, e.g. deployments
     self.timeout = 40
     self.lock_timeout = 20
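
The retry window implied by wait: :polynomially_longer with attempts: 6 can be estimated with a short sketch. It assumes the base delay documented for Active Job, executions**4 + 2 seconds, and ignores the random jitter Rails adds on top. The names polynomial_wait and ATTEMPTS are illustrative and not part of this commit.

```ruby
# Base delay in seconds for wait: :polynomially_longer, per the Active Job docs,
# with the random jitter term left out (assumption: jitter only adds a small amount).
def polynomial_wait(executions)
  (executions**4) + 2
end

ATTEMPTS = 6

# A retry is scheduled after each of the first ATTEMPTS - 1 failed executions;
# the final failure hands control to the retry_on block instead.
waits = (1...ATTEMPTS).map { |n| polynomial_wait(n) }
total = waits.sum

puts "waits (s): #{waits.inspect}"
puts "total: #{total}s (about #{(total / 60.0).round(1)} min before jitter)"
```

Under these assumptions the five waits add up to 989 seconds, roughly 16.5 minutes before jitter, which is in the same range as the ~15m window quoted in the commit message.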

lib/shipit/version.rb

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Shipit
-  VERSION = '0.45.2'
+  VERSION = '0.45.3'
 end

test/jobs/perform_task_job_test.rb

Lines changed: 3 additions & 3 deletions
@@ -107,7 +107,7 @@ def success?
     end
 
     test "mark deploy as error an unexpected exception is raised" do
-      Command.any_instance.expects(:stream!).at_least_once.raises(Command::Denied)
+      Shipit::TaskExecutionStrategy::Default.any_instance.expects(:capture!).at_least_once.raises(Command::Denied)
 
       @job.perform(@deploy)
 
@@ -116,7 +116,7 @@ def success?
     end
 
     test "mark deploy as timedout if a command timeout" do
-      Command.any_instance.expects(:stream!).at_least_once.raises(Command::TimedOut)
+      Shipit::TaskExecutionStrategy::Default.any_instance.expects(:capture!).at_least_once.raises(Command::TimedOut)
 
       @job.perform(@deploy)
 
@@ -129,7 +129,7 @@ def success?
       begin
         Shipit.timeout_exit_codes = [70].freeze
 
-        Command.any_instance.expects(:stream!).at_least_once.raises(Command::Failed.new('Blah', 70))
+        Shipit::TaskExecutionStrategy::Default.any_instance.expects(:capture!).at_least_once.raises(Command::Failed.new('Blah', 70))
 
         @job.perform(@deploy)
 

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# frozen_string_literal: true
+
+require 'test_helper'
+
+module Shipit
+  class CreateOnGithubJobTest < ActiveSupport::TestCase
+    setup do
+      @deployment = shipit_commit_deployments(:shipit_pending_fourth)
+    end
+
+    test "#perform retries on GitHub authentication errors" do
+      CommitDeployment.any_instance.stubs(:create_on_github!).raises(Octokit::Unauthorized)
+
+      assert_enqueued_with(job: CreateOnGithubJob) do
+        CreateOnGithubJob.perform_now(@deployment)
+      end
+    end
+
+    test "#perform gives up without re-raising after exhausting authentication retries" do
+      CommitDeployment.any_instance.stubs(:create_on_github!).raises(Octokit::Unauthorized)
+      Rails.logger.stubs(:warn)
+
+      job = CreateOnGithubJob.new(@deployment)
+      job.exception_executions = { "[Octokit::Unauthorized]" => 5 }
+
+      assert_nothing_raised { job.perform_now }
+    end
+  end
+end
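
The give-up semantics these tests exercise (retry while attempts remain, then run a handler and swallow the error) can be sketched in plain Ruby without Rails. This is only a simplified model: it retries inline rather than re-enqueueing with a delay, and Unauthorized, run_with_retries, and on_exhausted are hypothetical stand-ins, not names from Shipit, Active Job, or Octokit.

```ruby
# Stand-in for Octokit::Unauthorized (hypothetical, so the sketch needs no gems).
class Unauthorized < StandardError; end

# Runs the block up to `attempts` times. Once the last attempt fails, the
# on_exhausted handler is called and the error is NOT re-raised, mirroring a
# retry_on block that only logs.
def run_with_retries(attempts:, on_exhausted:)
  executions = 0
  begin
    executions += 1
    yield executions
  rescue Unauthorized => e
    retry if executions < attempts
    on_exhausted.call(e, executions)
    :gave_up
  end
end

log = []
handler = ->(error, count) { log << "gave up after #{count} attempts: #{error.class}" }

# Always failing: the handler runs once and nothing propagates to the caller.
result = run_with_retries(attempts: 6, on_exhausted: handler) { raise Unauthorized, "401" }

# Transient failure: succeeds on the third execution, handler is never called.
eventual = run_with_retries(attempts: 6, on_exhausted: handler) do |n|
  raise Unauthorized, "401" if n < 3
  :created
end
```

In the second test above, presetting exception_executions to 5 plays the same role as reaching the sixth execution here: the next failure exhausts the attempts, so the handler runs instead of another retry.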
