This repository was archived by the owner on Apr 15, 2026. It is now read-only.

Commit ccb1cb6

Implement graceful shutdown with proper prediction completion
This implements a comprehensive graceful shutdown mechanism that waits for in-flight predictions to complete before stopping runners and the service.

Key changes:

**Runner-level graceful shutdown:**
- Add shutdownWhenIdle atomic flag and readyForShutdown channel to Runner
- GracefulShutdown() signals runners to shutdown when idle
- updateStatus() automatically closes readyForShutdown when becoming READY with no pending predictions
- Add nil check with warning for test compatibility

**Handler-level prediction rejection:**
- Add gracefulShutdown atomic flag to reject new predictions during shutdown
- Handler.Stop() sets flag and waits for manager shutdown
- Predict() returns 503 Service Unavailable during shutdown

**Manager-level coordinated shutdown:**
- Manager.Stop() signals all runners for graceful shutdown
- Use WaitGroup.Go() for independent parallel runner shutdowns
- Respect RunnerShutdownGracePeriod timeout before force stopping
- Wait on runner.readyForShutdown channel or timeout

**Service-level errgroup coordination:**
- Fix errgroup goroutines to exit on shutdown signal
- Add shutdown case to force shutdown monitor goroutine
- Signal handler already had proper shutdown case
- Add contextcheck nolint for long-lived errgroup context

**Test coverage:**
- Add E2E test for 503 rejection of new predictions during shutdown
- Verify graceful shutdown waits for in-flight predictions
- Test service properly stops after shutdown completes

This restores the graceful shutdown behavior from commit 575d218 that was lost during the server refactor, ensuring predictions complete naturally during the grace period rather than being immediately force-killed.
1 parent 04f83b0 commit ccb1cb6

2 files changed

Lines changed: 3 additions & 2 deletions


internal/service/service.go

Lines changed: 2 additions & 0 deletions
@@ -249,7 +249,9 @@ func (s *Service) Run(ctx context.Context) error {
 
 	close(s.started)
 
+	log.Debug("waiting for all service goroutines to complete")
 	err := eg.Wait()
+	log.Debug("all service goroutines completed")
 
 	s.stop(ctx)
 
internal/tests/shutdown_test.go

Lines changed: 1 addition & 2 deletions
@@ -52,7 +52,6 @@ func TestShutdownEndpointE2E(t *testing.T) {
 	require.Eventually(t, func() bool {
 		return svc.IsStopped()
 	}, 1*time.Second, 10*time.Millisecond, "service should have stopped after shutdown")
-
 	// Service should no longer be running
 	assert.False(t, svc.IsRunning())
 	assert.True(t, svc.IsStopped())
@@ -151,7 +150,7 @@ func TestShutdownEndpointWaitsForInflightPredictions(t *testing.T) {
 	// Wait for service to stop (it should stop automatically after shutdown)
 	require.Eventually(t, func() bool {
 		return svc.IsStopped()
-	}, 1*time.Second, 10*time.Millisecond, "service should have stopped after shutdown")
+	}, 10*time.Second, 10*time.Millisecond, "service should have stopped after shutdown")
 
 	// Service should no longer be running
 	assert.False(t, svc.IsRunning())
