Fleet leases are never reclaimed after TTL expiry; an expired lease blocks re-acquire #20

@Regis-RCR

Summary

A fleet lease outlives its own TTL. Acquire one with a short ttl_seconds, wait
past expires_at, and it stays state: "granted". The expiry never triggers a
reclaim. A second agent that asks for the same scope lands in state: "requested"
and waits there, instead of being granted the lease that should already be free.

Environment

  • memtrace 0.6.11
  • macOS (Apple Silicon)
  • Embedded MemDB, fleet tools called over the MCP transport

Steps to reproduce

  1. Acquire a lease with a short TTL:
    fleet_acquire_lease(repo_id, agent_id: "agent-A", scope: ["sym_x"], ttl_seconds: 3)
    The response is state: "granted" with an expires_at about 3 seconds out.
  2. Wait well past the TTL (for example 25 seconds).
  3. Check the lease:
    fleet_preflight(repo_id, touched: ["sym_x"]) still lists the lease as
    state: "granted", with expires_at now in the past.
  4. From a second agent, acquire the same scope:
    fleet_acquire_lease(repo_id, agent_id: "agent-B", scope: ["sym_x"])
    returns state: "requested" (queued behind the stale holder), not granted.
  5. Re-check much later: a lease can still show state: "granted" about
    10 minutes past its recorded expires_at.
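
The acquire / wait / re-acquire part of these steps can be scripted. The sketch
below is only an illustration: call_tool is a hypothetical wrapper around
whatever MCP client is in use (it is not a memtrace API), and the response is
assumed to be a dict carrying a "state" key, as the responses quoted above
suggest. Tool and parameter names are the ones used in the steps.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport wrapper: call_tool(tool_name, arguments) -> response dict.
# Supply your own implementation on top of your MCP client.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def reproduce_stale_lease(
    call_tool: ToolCaller,
    repo_id: str,
    ttl_seconds: int = 3,
    wait_seconds: float = 25.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Run steps 1, 2 and 4 and report whether agent-B was blocked."""
    # Step 1: agent-A takes a short-lived lease.
    first = call_tool("fleet_acquire_lease", {
        "repo_id": repo_id,
        "agent_id": "agent-A",
        "scope": ["sym_x"],
        "ttl_seconds": ttl_seconds,
    })
    # Step 2: wait well past the TTL.
    sleep(wait_seconds)
    # Step 4: agent-B asks for the same scope.
    second = call_tool("fleet_acquire_lease", {
        "repo_id": repo_id,
        "agent_id": "agent-B",
        "scope": ["sym_x"],
    })
    return {
        "first_state": first.get("state"),
        "second_state": second.get("state"),
        # The bug shows up as granted followed by requested.
        "bug_reproduced": first.get("state") == "granted"
        and second.get("state") == "requested",
    }
```

With the behaviour described in this report, bug_reproduced would come back
True; once expiry is honoured, second_state should be "granted" instead.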

Observed

  • Expired leases linger as granted. No expired state transition is observed.
  • A re-acquire of an expired scope is blocked (requested) until an explicit
    fleet_release_lease or an engine restart.

Expected

After the TTL elapses, the lease should become reclaimable (released or expired)
so another agent can acquire the scope. Today the only recovery, short of an
engine restart, is an explicit release, which a crashed or forgetful holder
will never issue.
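
One way to get the expected behaviour is to reclaim lazily at acquire time,
without a background sweeper. The sketch below is a toy in-memory model written
for this report, not memtrace's implementation: LeaseTable, Lease and the
injected clock are all hypothetical names, and a blocked request is only
reported as "requested" rather than actually queued.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Lease:
    agent_id: str
    expires_at: float
    state: str = "granted"


class LeaseTable:
    """Toy lease table (hypothetical); expired leases are reclaimed on acquire."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._held: Dict[str, Lease] = {}

    def _reclaim_expired(self, symbol: str) -> None:
        lease = self._held.get(symbol)
        if lease is not None and lease.expires_at <= self._clock():
            lease.state = "expired"  # the transition this report never observes
            del self._held[symbol]

    def acquire(self, agent_id: str, scope: List[str], ttl_seconds: float) -> str:
        # Drop any lease whose TTL has elapsed before checking for conflicts.
        for symbol in scope:
            self._reclaim_expired(symbol)
        blocked = any(
            symbol in self._held and self._held[symbol].agent_id != agent_id
            for symbol in scope
        )
        if blocked:
            return "requested"  # a real engine would queue the request here
        expires_at = self._clock() + ttl_seconds
        for symbol in scope:
            self._held[symbol] = Lease(agent_id, expires_at)
        return "granted"
```

In this model agent-B is still answered with "requested" while agent-A's lease
is live, but gets "granted" as soon as the clock passes agent-A's expires_at,
with no explicit release needed.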

Impact

One holder that crashes, or just forgets to release, blocks the scope for every
other agent. There is no automatic recovery path back to a usable lease.

Note

Fleet intents behave differently here. A published intent does drop out on its
TTL: it disappears from node state once the window closes. The TTL mechanism
clearly works for intents, so the gap is specific to leases. Aligning the two
would resolve this.
