Commit 782a9c6

Kyle-Neale and claude committed
[ddev] Atomic config swap to avoid autodiscovery deregister race
The `--config-file` path of `ddev env agent` previously renamed the original config away before writing the override, leaving a window in which the mounted conf.d directory had no config for the integration. Agent autodiscovery rescans on file events; if it scanned during that window it deregistered the check, and the immediately-following `agent check <name>` returned "no valid check found".

This is the actual SNMP master.yml flake fingerprint: agent runs cleanly for 10+ minutes (20-30 successful check cycles), then a single test using `dd_agent_check` (which goes through this code path) hits the race and fails. Two of the last three master.yml SNMP failures match it exactly.

Switch to read-modify-restore in place. `EnvData.write_config` now writes via tmp + os.replace so the file is never transiently absent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
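The window described above can be reproduced without an Agent. The following is a minimal sketch, not the ddev code: `swap_by_rename`, `swap_in_place`, `observe`, and the `snmp.yaml` file name are hypothetical, and the `probe` callback stands in for an autodiscovery rescan that happens to land between the two filesystem steps.

```python
import tempfile
from pathlib import Path


def swap_by_rename(config_file: Path, override: str, probe) -> None:
    """Old sequence: move the original aside, then write the override."""
    backup = config_file.with_name(f'{config_file.name}.bak.example')
    config_file.replace(backup)
    probe(config_file.exists())  # a scan landing here finds no config at all
    config_file.write_text(override)
    probe(config_file.exists())


def swap_in_place(config_file: Path, override: str, probe) -> None:
    """New sequence: stage the override beside the config, then replace it."""
    staged = config_file.with_name(f'.{config_file.name}.swap')
    staged.write_text(override)
    probe(config_file.exists())  # the original is still in place here
    staged.replace(config_file)
    probe(config_file.exists())


def observe(swap) -> list[bool]:
    """Run one swap in a scratch directory and record what each probe saw."""
    seen: list[bool] = []
    with tempfile.TemporaryDirectory() as scratch:
        config_file = Path(scratch) / 'snmp.yaml'  # hypothetical file name
        config_file.write_text('instances: [{}]\n')
        swap(config_file, 'instances: [{"x": 1}]\n', seen.append)
    return seen


print(observe(swap_by_rename))  # [False, True]
print(observe(swap_in_place))   # [True, True]
```

The `False` in the first result is the moment a rescan would find the directory without a config for the integration; the in-place sequence never produces it.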
1 parent 996b3d5 commit 782a9c6

2 files changed

Lines changed: 17 additions & 4 deletions

ddev/src/ddev/cli/env/agent.py

Lines changed: 8 additions & 3 deletions
@@ -71,10 +71,15 @@ def agent(app: Application, *, intg_name: str, environment: str, args: tuple[str
         finally:
             env_data.config_file.unlink()
     else:
-        temp_config_file = env_data.config_file.parent / f'{env_data.config_file.name}.bak.example'
-        env_data.config_file.replace(temp_config_file)
+        # Read-modify-restore in place. The previous implementation renamed
+        # the original config away before writing the override, which left a
+        # window where the conf.d directory contained no config for this
+        # integration; if Agent autodiscovery rescanned during that window it
+        # deregistered the check and the immediately-following `agent check`
+        # returned "no valid check found".
+        original_config = env_data.read_config()
         try:
             env_data.write_config(config)
             agent.invoke(full_args)
         finally:
-            temp_config_file.replace(env_data.config_file)
+            env_data.write_config(original_config)
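The read-modify-restore shape of this hunk can also be expressed as a context manager. This is a sketch under assumptions, not what the commit does: `overridden` is a hypothetical helper that works on raw text rather than on `EnvData`, and it uses a plain `write_text` for brevity. That keeps the path present at all times but can briefly expose partial content, which is why the commit pairs the restore with the tmp + `os.replace` write in `write_config`.

```python
import contextlib
import tempfile
from pathlib import Path


@contextlib.contextmanager
def overridden(config_file: Path, override: str):
    """Temporarily replace a file's content, restoring it on exit."""
    original = config_file.read_text()  # read
    config_file.write_text(override)    # modify
    try:
        yield config_file
    finally:
        # Restore runs on normal exit and when the body raises.
        config_file.write_text(original)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as scratch:
        path = Path(scratch) / 'conf.yaml'  # hypothetical file name
        path.write_text('instances: [{}]\n')
        with overridden(path, 'instances: [{"x": 1}]\n'):
            print(path.read_text(), end='')
        print(path.read_text(), end='')
```

The `finally` clause mirrors the hunk above: the original content comes back even when the command run inside the block fails.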

ddev/src/ddev/e2e/config.py

Lines changed: 9 additions & 1 deletion
@@ -42,13 +42,21 @@ def write_config(self, config: dict[str, Any] | None) -> None:
         if config is None:
             return
 
+        import os
+
         import yaml
 
         if 'instances' not in config:
             config = {'instances': [config]}
 
         self.config_file.parent.ensure_dir_exists()
-        self.config_file.write_text(yaml.safe_dump(config, default_flow_style=False))
+        # Write via tmp + os.replace so the file is never transiently absent.
+        # Agent autodiscovery watches this directory; if it observes the path
+        # missing it deregisters the integration's check, causing later
+        # `agent check <name>` invocations to fail with "no valid check found".
+        tmp = self.config_file.parent / f'.{self.config_file.name}.swap'
+        tmp.write_text(yaml.safe_dump(config, default_flow_style=False))
+        os.replace(tmp, self.config_file)
 
     def read_metadata(self) -> dict[str, Any]:
         import json
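The tmp + `os.replace` write can be pulled out into a standalone helper. The sketch below makes two assumptions that go beyond the diff: the name `atomic_write_text` is hypothetical, and the `finally` cleanup of a leftover staging file is an addition the commit does not make. The staging file sits in the same directory as the target because `os.replace` is only atomic within a single filesystem.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write text so that `target` is replaced in one step, never removed."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so the final rename stays on one filesystem.
    staged = target.with_name(f'.{target.name}.swap')
    try:
        staged.write_text(text)
        os.replace(staged, target)
    finally:
        # No-op after a successful replace; removes the staging file if the
        # write or the replace failed part-way.
        staged.unlink(missing_ok=True)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as scratch:
        target = Path(scratch) / 'conf.d' / 'example.yaml'  # hypothetical path
        atomic_write_text(target, 'instances: []\n')
        atomic_write_text(target, 'instances: [{}]\n')
        print(target.read_text(), end='')
```

A reader of the directory sees either the old content or the new content at `target`, and the dot-prefixed staging file does not outlive the call.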
