
Commit 8ec14a0

Add support for container policies. (#5)
- Introduce default failure tolerance of 5 failures in 60 seconds.
1 parent 5cda50b commit 8ec14a0

15 files changed

Lines changed: 504 additions & 48 deletions


async-service.gemspec

Lines changed: 1 addition & 1 deletion
@@ -27,6 +27,6 @@ Gem::Specification.new do |spec|
 	spec.required_ruby_version = ">= 3.2"
 
 	spec.add_dependency "async"
-	spec.add_dependency "async-container", "~> 0.29"
+	spec.add_dependency "async-container", "~> 0.33"
 	spec.add_dependency "string-format", "~> 0.2"
 end

bake/async/service/controller.rb

Lines changed: 1 addition & 4 deletions
@@ -10,9 +10,6 @@ def initialize(context)
 end
 
 def run
-	# Warm up the Ruby process by preloading gems and running GC.
-	Async::Service::Controller.warmup
-
 	controller.run
 end
 
@@ -21,5 +18,5 @@ def run
 def controller
 	configuration = context.lookup("async:service:configuration").instance.configuration
 
-	return Async::Service::Controller.new(configuration.services)
+	return configuration.make_controller
 end

context/index.yaml

Lines changed: 4 additions & 0 deletions
@@ -10,6 +10,10 @@ files:
   title: Getting Started
   description: This guide explains how to get started with `async-service` to create
     and run services in Ruby.
+- path: policies.md
+  title: Container Policies
+  description: This guide explains how to configure container policies for your services
+    and understand the default failure handling behavior.
 - path: service-architecture.md
   title: Service Architecture
   description: This guide explains the key architectural components of `async-service`

context/policies.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+# Container Policies
+
+This guide explains how to configure container policies for your services and understand the default failure handling behavior.
+
+## Default Failure Handling
+
+All services use {ruby Async::Service::Policy::DEFAULT}, which monitors failure rates and stops the container when failures exceed a threshold.
+
+**Default threshold:** 6 failures in 60 seconds (0.1 failures per second).
+
+This means:
+- Services can tolerate occasional failures and transient issues.
+- More than 6 failures in any 60-second window stops the container.
+- Fundamentally broken services cannot restart-loop indefinitely.
+
+This fail-fast behavior is appropriate for orchestrated environments (Kubernetes, systemd) where the orchestrator will restart the entire service.
+
+### Why This Default?
+
+Without failure monitoring, a broken service with `restart: true` would restart indefinitely, wasting resources. The default policy:
+
+- **Catches problems quickly**: Broken services stop within 10-20 seconds.
+- **Prevents resource waste**: Doesn't keep trying to start services that will never succeed.
+- **Enables orchestrator recovery**: Systemd/Kubernetes can restart the whole process with a clean state.
+- **Detects environmental issues**: Bad hardware, corrupted pre-fork state, or system-level problems can't be fixed by restarting children; the entire service needs to be restarted (potentially on different hardware).
+- **Signals clear failure**: Exit code indicates the service couldn't maintain healthy operation.
+
+## Configuring Policies
+
+Use `container_policy` in your service configuration to customize failure handling:
+
+``` ruby
+# config/service.rb
+
+# Allow at most 5 failures per minute:
+container_policy Async::Service::Policy.new(maximum_failures: 5, window: 60)
+
+service "web" do
+	# Your service configuration.
+end
+
+service "worker" do
+	# Also uses the same policy.
+end
+```
+
+The policy applies to **all services** in the configuration file.
+
+### Choosing a Threshold
+
+Consider your service characteristics:
+
+**Strict (catch problems immediately):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 1, window: 5)
+```
+
+**Balanced (tolerate transient issues):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 5, window: 60)
+```
+
+**Lenient (allow many retries):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 20, window: 60)
+```
+
+Factors to consider:
+- **Traffic volume**: High-traffic services may have more absolute failures.
+- **Error types**: Some errors are transient (network timeouts, rate limits).
+- **Dependencies**: Upstream services may need time to recover.
+- **Deployment environment**: Kubernetes and systemd handle restarts; local development does not.
+
+## Per-Container Policy Instances
+
+The `container_policy` method accepts a block that's evaluated **each time a container is created**:
+
+``` ruby
+# config/service.rb
+container_policy do
+	# This block is called for EACH container created.
+	# Each container gets its own policy instance with fresh state.
+	Async::Service::Policy.new(maximum_failures: 5, window: 60)
+end
+```
+
+If your policy tracks per-container state, this ensures each container gets a new policy instance with clean state.
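
Note on the threshold semantics: the guide describes the rule as "more than N failures in any W-second window stops the container". The sketch below is one way to read that rule, assuming a simple sliding window over failure timestamps. `FailureWindow` and its `failure!` method are hypothetical names used only for illustration; this is not the `Async::Service::Policy` implementation, which this commit does not show.

```ruby
# Hypothetical illustration only; not the Async::Service::Policy implementation.
class FailureWindow
	def initialize(maximum_failures:, window:)
		@maximum_failures = maximum_failures
		@window = window
		@timestamps = []
	end
	
	# Record a failure at the given time (in seconds).
	# Returns true when the number of failures inside the window exceeds the maximum.
	def failure!(now)
		@timestamps << now
		
		# Discard failures that have fallen out of the window.
		@timestamps.reject!{|time| time <= now - @window}
		
		return @timestamps.size > @maximum_failures
	end
end

# Seven failures, one second apart, against the documented default of 6 in 60 seconds:
window = FailureWindow.new(maximum_failures: 6, window: 60)
results = (1..7).map{|second| window.failure!(second)}
```

Under these assumptions the first six failures are tolerated and the seventh trips the threshold, while failures spaced 15 seconds apart never accumulate enough entries inside one window to trip it.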

guides/links.yaml

Lines changed: 5 additions & 3 deletions
@@ -1,8 +1,10 @@
 getting-started:
   order: 1
-service-architecture:
+policies:
   order: 2
-best-practices:
+service-architecture:
   order: 3
-deployment:
+best-practices:
   order: 4
+deployment:
+  order: 5

guides/policies/readme.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+# Container Policies
+
+This guide explains how to configure container policies for your services and understand the default failure handling behavior.
+
+## Default Failure Handling
+
+All services use {ruby Async::Service::Policy::DEFAULT}, which monitors failure rates and stops the container when failures exceed a threshold.
+
+**Default threshold:** 6 failures in 60 seconds (0.1 failures per second).
+
+This means:
+- Services can tolerate occasional failures and transient issues.
+- More than 6 failures in any 60-second window stops the container.
+- Fundamentally broken services cannot restart-loop indefinitely.
+
+This fail-fast behavior is appropriate for orchestrated environments (Kubernetes, systemd) where the orchestrator will restart the entire service.
+
+### Why This Default?
+
+Without failure monitoring, a broken service with `restart: true` would restart indefinitely, wasting resources. The default policy:
+
+- **Catches problems quickly**: Broken services stop within 10-20 seconds.
+- **Prevents resource waste**: Doesn't keep trying to start services that will never succeed.
+- **Enables orchestrator recovery**: Systemd/Kubernetes can restart the whole process with a clean state.
+- **Detects environmental issues**: Bad hardware, corrupted pre-fork state, or system-level problems can't be fixed by restarting children; the entire service needs to be restarted (potentially on different hardware).
+- **Signals clear failure**: Exit code indicates the service couldn't maintain healthy operation.
+
+## Configuring Policies
+
+Use `container_policy` in your service configuration to customize failure handling:
+
+``` ruby
+# config/service.rb
+
+# Allow at most 5 failures per minute:
+container_policy Async::Service::Policy.new(maximum_failures: 5, window: 60)
+
+service "web" do
+	# Your service configuration.
+end
+
+service "worker" do
+	# Also uses the same policy.
+end
+```
+
+The policy applies to **all services** in the configuration file.
+
+### Choosing a Threshold
+
+Consider your service characteristics:
+
+**Strict (catch problems immediately):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 1, window: 5)
+```
+
+**Balanced (tolerate transient issues):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 5, window: 60)
+```
+
+**Lenient (allow many retries):**
+``` ruby
+container_policy Async::Service::Policy.new(maximum_failures: 20, window: 60)
+```
+
+Factors to consider:
+- **Traffic volume**: High-traffic services may have more absolute failures.
+- **Error types**: Some errors are transient (network timeouts, rate limits).
+- **Dependencies**: Upstream services may need time to recover.
+- **Deployment environment**: Kubernetes and systemd handle restarts; local development does not.
+
+## Per-Container Policy Instances
+
+The `container_policy` method accepts a block that's evaluated **each time a container is created**:
+
+``` ruby
+# config/service.rb
+container_policy do
+	# This block is called for EACH container created.
+	# Each container gets its own policy instance with fresh state.
+	Async::Service::Policy.new(maximum_failures: 5, window: 60)
+end
+```
+
+If your policy tracks per-container state, this ensures each container gets a new policy instance with clean state.
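
Note on value versus block: the practical difference only matters when the policy object carries state. The sketch below is a plain-Ruby illustration of that difference, assuming a stateful policy. `CountingPolicy` is a hypothetical stand-in and not part of `async-service`; the two procs mirror the `proc{value}` and `block` branches in `lib/async/service/loader.rb` from this commit.

```ruby
# Hypothetical stand-in for a stateful policy; not part of async-service.
class CountingPolicy
	attr_reader :failures
	
	def initialize
		@failures = 0
	end
	
	def failure!
		@failures += 1
	end
end

# Passing a value: the same instance is handed out on every call (like `proc{value}`).
value = CountingPolicy.new
shared = proc{value}

# Passing a block: a new instance is built on every call.
fresh = proc{CountingPolicy.new}

# Simulate one container recording a failure under each style.
shared.call.failure!
fresh.call.failure!

# Simulate the next container asking for its policy.
shared_failures = shared.call.failures
fresh_failures = fresh.call.failures
```

In this sketch the shared instance still remembers the earlier failure when the next container asks for its policy, whereas the block form starts from zero each time.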

lib/async/service/configuration.rb

Lines changed: 18 additions & 3 deletions
@@ -52,11 +52,15 @@ def self.for(*environments)
 			end
 
 			# Initialize an empty configuration.
-			def initialize(environments = [])
+			# @parameter environments [Array] Environment instances.
+			# @parameter container_policy [Proc] Optional proc that returns a policy for container lifecycle management.
+			def initialize(environments = [], container_policy: nil)
 				@environments = environments
+				@container_policy = container_policy
 			end
 
 			attr :environments
+			attr_accessor :container_policy
 
 			# Check if the configuration is empty.
 			# @returns [Boolean] True if no environments are configured.
@@ -84,11 +88,22 @@ def services(implementing: nil)
 
 			# Create a controller for the configured services.
 			#
+			# @parameter container_policy [Proc] A proc that returns the policy to use for managing child lifecycle events.
+			# @parameter options [Hash] Additional options passed to the controller.
 			# @returns [Controller] A controller that can be used to start/stop services.
-			def controller(**options)
-				Controller.new(self.services(**options).to_a)
+			def make_controller(container_policy: @container_policy, implementing: nil, **options)
+				controller = Controller.new(self.services(implementing: implementing).to_a, **options)
+
+				if container_policy
+					controller.define_singleton_method(:make_policy, &container_policy)
+				end
+
+				return controller
 			end
 
+			# Alias for backwards compatibility.
+			alias controller make_controller
+
 			# Add the environment to the configuration.
 			def add(environment)
 				@environments << environment
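
Note on `define_singleton_method`: `make_controller` overrides `make_policy` on the one controller instance it returns, rather than on the `Controller` class. The sketch below shows that mechanism in isolation using only core Ruby. `ExampleController` and the symbol return values are hypothetical placeholders chosen for illustration.

```ruby
# Hypothetical stand-in for the controller class; not the async-service class.
class ExampleController
	# Default behaviour, analogous to returning Policy::DEFAULT.
	def make_policy
		:default_policy
	end
end

# A configured policy, wrapped in a proc as the loader does.
container_policy = proc{:custom_policy}

# Override make_policy on this one instance only.
controller = ExampleController.new
controller.define_singleton_method(:make_policy, &container_policy)

# A second instance is unaffected and keeps the class-level default.
other = ExampleController.new

custom = controller.make_policy
default = other.make_policy
```

Because the override lives on the instance's singleton class, controllers built without a `container_policy` keep the default from the class definition.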

lib/async/service/controller.rb

Lines changed: 28 additions & 22 deletions
@@ -4,6 +4,7 @@
 # Copyright, 2024-2025, by Samuel Williams.
 
 require "async/container/controller"
+require_relative "policy"
 
 module Async
 	module Service
@@ -13,32 +14,11 @@ module Service
 		# within containers. It extends Async::Container::Controller to provide
 		# service-specific functionality.
 		class Controller < Async::Container::Controller
-			# Warm up the Ruby process by preloading gems and running GC.
-			def self.warmup
-				begin
-					require "bundler"
-					Bundler.require(:preload)
-				rescue Bundler::GemfileNotFound, LoadError
-					# Ignore.
-				end
-
-				if ::Process.respond_to?(:warmup)
-					::Process.warmup
-				elsif ::GC.respond_to?(:compact)
-					3.times{::GC.start}
-					::GC.compact
-				end
-			end
-
 			# Run a configuration of services.
 			# @parameter configuration [Configuration] The service configuration to run.
 			# @parameter options [Hash] Additional options for the controller.
 			def self.run(configuration, **options)
-				controller = Async::Service::Controller.new(configuration.services.to_a, **options)
-
-				self.warmup
-
-				controller.run
+				configuration.make_controller(**options).run
 			end
 
 			# Create a controller for the given services.
@@ -58,17 +38,43 @@ def initialize(services, **options)
 				@services = services
 			end
 
+			# Warm up the Ruby process by preloading gems, running GC, and compacting memory.
+			# This reduces startup latency and improves copy-on-write efficiency.
+			def warmup
+				begin
+					require "bundler"
+					Bundler.require(:preload)
+				rescue Bundler::GemfileNotFound, LoadError
+					# Ignore.
+				end
+
+				if ::Process.respond_to?(:warmup)
+					::Process.warmup
+				elsif ::GC.respond_to?(:compact)
+					3.times{::GC.start}
+					::GC.compact
+				end
+			end
+
 			# All the services associated with this controller.
 			# @attribute [Array(Async::Service::Generic)]
 			attr :services
 
+			# Create a policy for managing child lifecycle events.
+			# @returns [Policy] The service-level policy with failure rate monitoring.
+			def make_policy
+				Policy::DEFAULT
+			end
+
 			# Start all named services.
 			def start
 				@services.each do |service|
 					service.start
 				end
 
 				super
+
+				self.warmup
 			end
 
 			# Setup all services into the given container.

lib/async/service/loader.rb

Lines changed: 12 additions & 0 deletions
@@ -58,6 +58,18 @@ def service(name = nil, **options, &block)
 
 				@configuration.add(self.environment(**options, &block))
 			end
+
+			# Set the container policy for all services in this configuration.
+			# Can be called with either an argument or a block.
+			# @parameter value [Async::Container::Policy] The policy to use for managing child lifecycle events.
+			# @parameter block [Proc] A block that returns a policy instance.
+			def container_policy(value = nil, &block)
+				if @configuration.container_policy
+					Console.warn(self, "Container policy is already set, overriding previous value!")
+				end
+
+				@configuration.container_policy = block_given? ? block : proc{value}
+			end
 		end
 	end
 end
