5 changes: 1 addition & 4 deletions cisco_aci/README.md
@@ -21,8 +21,6 @@ By integrating ACI metrics with Datadog's infrastructure and application monitor

 ## Setup

-**The Cisco ACI integration is Generally Available. To learn more about billing implications, visit our [pricing page][19].**
-
 ### Installation

 The Cisco ACI check is packaged with the Agent, so simply [install the Agent][1] on a server within your network.
@@ -201,5 +199,4 @@ Contact [Datadog support][9].
 [15]: /dash/integration/242/cisco-aci---overview
 [16]: /logs
 [17]: https://docs.datadoghq.com/logs/log_collection/?tab=host#setup
-[18]: https://github.com/DataDog/integrations-core/blob/master/cisco_aci/tests/cisco_aci_query.py
-[19]: https://www.datadoghq.com/pricing/?product=network-monitoring&tab=network-device-monitoring#products
+[18]: https://github.com/DataDog/integrations-core/blob/master/cisco_aci/tests/cisco_aci_query.py
3 changes: 0 additions & 3 deletions cisco_sdwan/README.md
@@ -20,8 +20,6 @@ By surfacing traffic and throughput metrics at the device and interface level, t

 ## Setup

-**The Cisco SD-WAN NDM integration is Generally Available. To learn more about billing implications, visit our [pricing page][8].**
-
 Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.

 ### Installation
@@ -63,4 +61,3 @@ Need help? Contact [Datadog support][7].
 [5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
 [6]: https://github.com/DataDog/integrations-core/blob/master/cisco_sdwan/metadata.csv
 [7]: https://docs.datadoghq.com/help/
-[8]: https://www.datadoghq.com/pricing/?product=network-monitoring&tab=network-device-monitoring#products
1 change: 1 addition & 0 deletions ddev/changelog.d/23646.fixed
@@ -0,0 +1 @@
+Retry agent check invocations on transient failures to address SNMP E2E flake from autodiscovery reload races.
29 changes: 26 additions & 3 deletions ddev/src/ddev/cli/env/agent.py
@@ -9,6 +9,26 @@

 if TYPE_CHECKING:
     from ddev.cli.application import Application
+    from ddev.e2e.agent.interface import AgentInterface
+
+
+def _invoke_check_with_retry(agent: AgentInterface, args: list[str], *, retries: int = 3, backoff: float = 0.5) -> None:
+    """Invoke ``agent check`` with bounded retry to absorb transient autodiscovery-reload races."""
+    import subprocess
+    import time
+
+    for attempt in range(retries + 1):
+        try:
+            agent.invoke(args)
+            return
+        except subprocess.CalledProcessError:
+            if attempt >= retries:
+                raise
+            click.echo(
+                f'agent check failed (attempt {attempt + 1}/{retries + 1}), retrying in {backoff:.1f}s...',
+                err=True,
+            )
+            time.sleep(backoff)


 @click.command(
@@ -54,7 +74,10 @@ def agent(app: Application, *, intg_name: str, environment: str, args: tuple[str

     if config_file is None or not trigger_run:
         try:
-            agent.invoke(full_args)
+            if trigger_run:
+                _invoke_check_with_retry(agent, full_args)
+            else:
+                agent.invoke(full_args)
         except subprocess.CalledProcessError as e:
             app.abort(code=e.returncode)

@@ -67,14 +90,14 @@ def agent(app: Application, *, intg_name: str, environment: str, args: tuple[str
     if not env_data.config_file.is_file():
         try:
             env_data.write_config(config)
-            agent.invoke(full_args)
+            _invoke_check_with_retry(agent, full_args)
         finally:
             env_data.config_file.unlink()
     else:
         temp_config_file = env_data.config_file.parent / f'{env_data.config_file.name}.bak.example'
         env_data.config_file.replace(temp_config_file)
         try:
             env_data.write_config(config)
-            agent.invoke(full_args)
+            _invoke_check_with_retry(agent, full_args)
         finally:
             temp_config_file.replace(env_data.config_file)
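
The retry behavior added in `agent.py` can be exercised in isolation. The following is a minimal sketch, not part of the PR: `FlakyAgent` is a hypothetical stand-in for ddev's `AgentInterface`, `invoke_with_retry` copies the control flow of `_invoke_check_with_retry` without the `click.echo` logging, and the `["check", "snmp"]` arguments are illustrative only.

```python
import subprocess
import time


class FlakyAgent:
    """Hypothetical stand-in for AgentInterface: fails a fixed number of times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def invoke(self, args: list[str]) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            # Simulate a transient non-zero exit from the agent command.
            raise subprocess.CalledProcessError(returncode=1, cmd=args)


def invoke_with_retry(agent, args: list[str], *, retries: int = 3, backoff: float = 0.5) -> None:
    """Same loop as the helper in the diff, minus the click logging."""
    for attempt in range(retries + 1):
        try:
            agent.invoke(args)
            return
        except subprocess.CalledProcessError:
            if attempt >= retries:
                raise  # Out of retries: surface the last failure to the caller.
            time.sleep(backoff)


# Two transient failures are absorbed; the third call succeeds.
recovering = FlakyAgent(failures=2)
invoke_with_retry(recovering, ["check", "snmp"], backoff=0.0)

# A persistent failure uses all retries + 1 attempts and then re-raises.
broken = FlakyAgent(failures=10)
try:
    invoke_with_retry(broken, ["check", "snmp"], retries=3, backoff=0.0)
    raised = False
except subprocess.CalledProcessError:
    raised = True
```

With the defaults in the diff (`retries=3`, `backoff=0.5`), the sleep between attempts is constant rather than exponential, so a persistently failing check adds at most 1.5 seconds of waiting before the error reaches `app.abort`.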
5 changes: 1 addition & 4 deletions versa/README.md
@@ -22,8 +22,6 @@ This integration provides insight into how applications and users consume bandwi

 ## Setup

-**The Versa NDM integration is Generally Available. To learn more about billing implications, visit our [pricing page][10].**
-
 Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.

 ### Installation
@@ -70,5 +68,4 @@ Need help? Contact [Datadog support][9].
 [6]: https://docs.datadoghq.com/agent/configuration/agent-commands/#agent-status-and-information
 [7]: https://github.com/DataDog/integrations-core/blob/master/versa/metadata.csv
 [8]: https://github.com/DataDog/integrations-core/blob/master/versa/assets/service_checks.json
-[9]: https://docs.datadoghq.com/help/
-[10]: https://www.datadoghq.com/pricing/?product=network-monitoring&tab=network-device-monitoring#products
+[9]: https://docs.datadoghq.com/help/