Skip to content

Commit 44475cf

Browse files
committed
Fix race condition in ProcessesSync causing SSH key mismatch
The sync could update an LRP with stale SSH route data from a previous version when the process version changed between the initial scan and the batch execution. Timeline of the race condition: Sync User Request ---- ------------ | | | fetch diego LRPs | | (gets {guid}-{vA} with SSH key A) | | | | scan CC processes | | process has version A, matches | | to_update[id] = diego_lrp_A | | | | | update process | | version changes to B | | | | desire_app called | | creates {guid}-{vB} | | with new SSH key B | | | re-fetch process by id | | (now has version B) | | | | update_app(process_vB, diego_lrp_A) | | process_guid = {guid}-{vB} | | but uses SSH route from diego_lrp_A! | | | v v Result: LRP {guid}-{vB} has SSH key A in routes but SSH key B in run_info (sshd arguments), breaking cf ssh. Fix: Before executing desire/update, verify the process version still matches what was recorded during the initial scan. Skip if mismatched - the next sync cycle will handle it correctly.
1 parent e07777b commit 44475cf

File tree

1 file changed

+10
-4
lines changed

1 file changed

+10
-4
lines changed

lib/cloud_controller/diego/processes_sync.rb

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -21,15 +21,15 @@ def sync
2121
@bump_freshness = true
2222
diego_lrps = bbs_apps_client.fetch_scheduling_infos.index_by { |d| d.desired_lrp_key.process_guid }
2323
logger.info('fetched-scheduling-infos')
24-
to_desire = []
24+
to_desire = {}
2525
to_update = {}
2626
batched_processes_for_sync do |processes|
2727
processes.each do |process|
2828
process_guid = ProcessGuid.from_process(process)
2929
diego_lrp = diego_lrps.delete(process_guid)
3030

3131
if diego_lrp.nil?
32-
to_desire.append(process.id)
32+
to_desire[process.id] = process_guid
3333
elsif process.updated_at.to_f.to_s != diego_lrp.annotation
3434
to_update[process.id] = diego_lrp
3535
end
@@ -87,7 +87,10 @@ def process_workpool_exceptions(exceptions)
8787
def update_lrps(to_update)
8888
batched_processes(to_update.keys) do |processes|
8989
processes.each do |process|
90-
workpool.submit(process, to_update[process.id]) do |p, l|
90+
existing_lrp = to_update[process.id]
91+
next if ProcessGuid.from_process(process) != existing_lrp.desired_lrp_key.process_guid
92+
93+
workpool.submit(process, existing_lrp) do |p, l|
9194
logger.info('updating-lrp', process_guid: p.guid, app_guid: p.app_guid)
9295
bbs_apps_client.update_app(p, l)
9396
logger.info('update-lrp', process_guid: p.guid)
@@ -97,8 +100,11 @@ def update_lrps(to_update)
97100
end
98101

99102
def desire_lrps(to_desire)
100-
batched_processes(to_desire) do |processes|
103+
batched_processes(to_desire.keys) do |processes|
101104
processes.each do |process|
105+
desired_process_guid = to_desire[process.id]
106+
next if ProcessGuid.from_process(process) != desired_process_guid
107+
102108
workpool.submit(process) do |p|
103109
logger.info('desiring-lrp', process_guid: p.guid, app_guid: p.app_guid)
104110
bbs_apps_client.desire_app(p)

0 commit comments

Comments
 (0)