Shrinking autoscale group kills in-progress builds #1399

@DanielHeath

Describe the bug

The stack was running nicely and had scaled up to four instances.

There's only enough work for three instances, so the ASG gets told to reduce capacity.

As a result, jobs that are in progress get interrupted partway through. In this case, the job was midway through pushing out a production hotfix, which was a great addition to a morning of incident response :)

Expected behavior

An agent that isn't currently performing work gets selected for termination.

Actual behavior

An instance performing useful work is often killed.

Stack parameters:

  • AWS Region: us-east-2
  • Version: 6.27.0

Context

Changing the size of an ASG is a very blunt instrument: the ASG, not the caller, decides which instance gets terminated.

Consider instead the detach-instances call (DetachInstances in the API), which removes an instance from the ASG and, when asked to, decrements the DesiredCapacity.

Once the detach completes, you could then terminate the instance from the lambda; this lets you pick which instance gets killed.
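
The suggestion above might look roughly like the following. This is only a sketch, not the stack's actual scaler code. The function names and the busy_ids input (however the lambda learns which instances are running jobs) are hypothetical. The two client arguments are assumed to be boto3 clients created with boto3.client("autoscaling") and boto3.client("ec2"). The sketch also does not wait for the detach activity to finish before terminating, which a real implementation may want to do.

```python
from typing import Any, Iterable, Optional


def pick_idle_instance(instance_ids: Iterable[str], busy_ids: Iterable[str]) -> Optional[str]:
    """Return the first instance that is not running a job, or None if all are busy."""
    busy = set(busy_ids)
    for instance_id in instance_ids:
        if instance_id not in busy:
            return instance_id
    return None


def scale_in_one(
    autoscaling: Any,   # assumed: boto3.client("autoscaling")
    ec2: Any,           # assumed: boto3.client("ec2")
    asg_name: str,
    instance_ids: Iterable[str],
    busy_ids: Iterable[str],   # hypothetical: IDs of instances with a job in progress
) -> Optional[str]:
    """Remove one idle instance from the ASG and terminate it.

    Returns the terminated instance ID, or None if every instance is busy
    (in which case nothing is changed).
    """
    victim = pick_idle_instance(instance_ids, busy_ids)
    if victim is None:
        return None

    # Detach the chosen instance and lower DesiredCapacity by one, so the
    # ASG does not launch a replacement.
    autoscaling.detach_instances(
        InstanceIds=[victim],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=True,
    )

    # The instance is no longer managed by the ASG, so terminate it directly.
    ec2.terminate_instances(InstanceIds=[victim])
    return victim
```

If every instance is busy, the sketch does nothing and leaves the scale-in for a later invocation, which is the behavior the issue is asking for. Note that the detach request is rejected if it would take the group below its minimum size.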
