Expose connector_config.timeout to reduce SSH dial timeout on dead instances #1376

@lorenzstorm1

Problem

When a spot instance is reclaimed while a job is assigned to it, the runner keeps retrying SSH connections to the dead instance. With a 10-minute dial timeout and 3 retries, the job hangs for about 30 minutes before it fails with runner_system_failure.

The [runners.autoscaler.connector_config] section supports a timeout field that controls how long the runner waits for an SSH connection, but the module doesn't expose it.

Current template

[runners.autoscaler.connector_config]
  username          = "${connector_config_user}"
  use_external_addr = false

Requested change

Expose timeout (and ideally use_static_credentials) in the runner_worker_docker_autoscaler variable:

variable "runner_worker_docker_autoscaler" {
  type = object({
    ...
    connector_config_user    = optional(string, "ec2-user")
    connector_config_timeout = optional(string, "")  # e.g. "2m"; "" leaves the field out of the template
    ...
  })
}
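
If the field were exposed as proposed, a caller might set it roughly as in the sketch below. The module label and source address are assumptions, since this issue does not name them, and connector_config_timeout is the hypothetical field requested here. All other inputs are omitted.

```hcl
module "gitlab_runner" {
  # Assumed source address; substitute the one your configuration already uses.
  source = "cattle-ops/gitlab-runner/aws"

  # ... other module inputs omitted ...

  runner_worker_docker_autoscaler = {
    connector_config_user = "ec2-user"

    # Hypothetical field proposed in this issue; not available in 9.5.0.
    # Leaving it unset (or "") would keep the current behaviour.
    connector_config_timeout = "2m"
  }
}
```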

Template would become:

[runners.autoscaler.connector_config]
  username          = "${connector_config_user}"
  use_external_addr = false
%{ if connector_config_timeout != "" ~}
  timeout           = "${connector_config_timeout}"
%{ endif ~}
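
Strip markers (`~`) in Terraform template directives remove adjacent whitespace, including newlines, so it is worth rendering the fragment before relying on it. The following is a minimal standalone sketch for that check, assuming placeholder locals in place of the module's real template variables. It uses only the right-hand strip marker, which removes the whitespace after each directive, so the preceding line keeps its newline.

```hcl
locals {
  # Hypothetical stand-ins for the values the module would pass to the template.
  connector_config_user    = "ec2-user"
  connector_config_timeout = "2m" # set to "" to check that the timeout line is omitted

  connector_config_preview = <<EOT
[runners.autoscaler.connector_config]
  username          = "${local.connector_config_user}"
  use_external_addr = false
%{ if local.connector_config_timeout != "" ~}
  timeout           = "${local.connector_config_timeout}"
%{ endif ~}
EOT
}

# Inspect the rendered TOML fragment with `terraform plan` or `terraform output`.
output "connector_config_preview" {
  value = local.connector_config_preview
}
```

Rendering both the empty and non-empty cases makes it easy to confirm that the timeout line appears on its own line and that the preceding use_external_addr line is left intact.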

Impact

With timeout = "2m", a spot reclaim would fail the job in ~6 minutes (2 min × 3 retries) instead of ~30 minutes. Combined with retry: { max: 2, when: [runner_system_failure] } in CI config, jobs would recover in ~7 minutes total instead of failing the entire pipeline after 30 minutes.

Environment

  • Module version: 9.5.0
  • Executor: docker-autoscaler
  • Instance types: spot, mixed pool with price-capacity-optimized allocation
