Replication in D-LOCKSS is driven by two complementary mechanisms:
- **CRDT Cluster Sync** — Each shard runs an embedded IPFS Cluster with CRDT consensus. When a file is pinned to a shard's cluster, the `LocalPinTracker` on every peer in that shard automatically syncs and pins the content locally.
- **ReplicationRequest Protocol** — The `replicationManager` (extracted from `ShardManager`) periodically broadcasts `ReplicationRequest` messages for pinned manifests. Peers that don't yet have the file perform auto-replication: fetch via `PinRecursive` and add to the cluster.
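A minimal sketch of the receiving side of that second mechanism, assuming illustrative `Pinner` and `Cluster` interfaces (the actual `replicationManager` API may differ):

```go
package replication

import (
	"context"
	"log"
	"time"
)

// Illustrative stand-ins for D-LOCKSS internals; these types and method
// names are assumptions, not the project's real API.
type CID string

type Pinner interface {
	HasPin(c CID) bool
	PinRecursive(ctx context.Context, c CID) error // fetch the full DAG and pin it
}

type Cluster interface {
	Pin(ctx context.Context, c CID) error // add the pin to CRDT cluster state
}

// HandleReplicationRequest sketches a peer's auto-replication path: skip if
// the manifest is already pinned, otherwise fetch it via PinRecursive
// (bounded by the auto-replication timeout) and add it to the cluster.
func HandleReplicationRequest(ctx context.Context, p Pinner, cl Cluster, manifest CID, timeout time.Duration) {
	if p.HasPin(manifest) {
		return // CRDT sync already delivered the content
	}
	fetchCtx, cancel := context.WithTimeout(ctx, timeout) // default: 5m
	defer cancel()
	if err := p.PinRecursive(fetchCtx, manifest); err != nil {
		log.Printf("auto-replication: failed to fetch/pin %s: %v", manifest, err)
		return
	}
	if err := cl.Pin(ctx, manifest); err != nil {
		log.Printf("auto-replication: cluster pin for %s failed: %v", manifest, err)
		return
	}
	log.Printf("auto-replication: fetched and pinned %s", manifest)
}
```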
| Parameter | Default | Env Variable | Location |
|---|---|---|---|
| Replication Check Interval | 1 minute | `DLOCKSS_CHECK_INTERVAL` | `config.Replication.CheckInterval` |
| Root Shard Check Interval | 20 seconds | (hardcoded) | `rootReplicationCheckInterval` |
| Request Cooldown Per Manifest | 5 minutes | (hardcoded) | `replicationRequestCooldownDuration` |
| Max Requests Per Cycle | 50 | (hardcoded) | `maxReplicationRequestsPerCycle` |
| Auto-Replication Enabled | true | `DLOCKSS_AUTO_REPLICATION_ENABLED` | `config.Replication.AutoReplicationEnabled` |
| Auto-Replication Timeout | 5 minutes | `DLOCKSS_AUTO_REPLICATION_TIMEOUT` | `config.Replication.AutoReplicationTimeout` |
| Max Concurrent Checks | 5 | `DLOCKSS_MAX_CONCURRENT_CHECKS` | `config.Replication.MaxConcurrentReplicationChecks` |
| Pin Reannounce Interval | 2 minutes | `DLOCKSS_PIN_REANNOUNCE_INTERVAL` | `config.Replication.PinReannounceInterval` |
| Min Replication | 5 | `DLOCKSS_MIN_REPLICATION` | `config.Replication.MinReplication` |
| Max Replication | 10 | `DLOCKSS_MAX_REPLICATION` | `config.Replication.MaxReplication` |
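These settings plausibly map onto a Go struct as sketched below; the field names follow the Location column, while the env-loading helpers and wiring are assumptions:

```go
package config

import (
	"os"
	"strconv"
	"time"
)

// Replication mirrors the tunables in the table above; field names come
// from the Location column.
type Replication struct {
	CheckInterval                  time.Duration // DLOCKSS_CHECK_INTERVAL
	AutoReplicationEnabled         bool          // DLOCKSS_AUTO_REPLICATION_ENABLED
	AutoReplicationTimeout         time.Duration // DLOCKSS_AUTO_REPLICATION_TIMEOUT
	MaxConcurrentReplicationChecks int           // DLOCKSS_MAX_CONCURRENT_CHECKS
	PinReannounceInterval          time.Duration // DLOCKSS_PIN_REANNOUNCE_INTERVAL
	MinReplication                 int           // DLOCKSS_MIN_REPLICATION
	MaxReplication                 int           // DLOCKSS_MAX_REPLICATION
}

// Hardcoded values (no env override), named as in the table.
const (
	rootReplicationCheckInterval       = 20 * time.Second
	replicationRequestCooldownDuration = 5 * time.Minute
	maxReplicationRequestsPerCycle     = 50
)

// durationEnv and intEnv are illustrative helpers: read an env var,
// fall back to the documented default on absence or parse failure.
func durationEnv(key string, def time.Duration) time.Duration {
	if v := os.Getenv(key); v != "" {
		if d, err := time.ParseDuration(v); err == nil {
			return d
		}
	}
	return def
}

func intEnv(key string, def int) int {
	if v := os.Getenv(key); v != "" {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

// DefaultReplication applies the defaults from the table, letting the
// documented env variables override them.
func DefaultReplication() Replication {
	return Replication{
		CheckInterval:                  durationEnv("DLOCKSS_CHECK_INTERVAL", time.Minute),
		AutoReplicationEnabled:         os.Getenv("DLOCKSS_AUTO_REPLICATION_ENABLED") != "false",
		AutoReplicationTimeout:         durationEnv("DLOCKSS_AUTO_REPLICATION_TIMEOUT", 5*time.Minute),
		MaxConcurrentReplicationChecks: intEnv("DLOCKSS_MAX_CONCURRENT_CHECKS", 5),
		PinReannounceInterval:          durationEnv("DLOCKSS_PIN_REANNOUNCE_INTERVAL", 2*time.Minute),
		MinReplication:                 intEnv("DLOCKSS_MIN_REPLICATION", 5),
		MaxReplication:                 intEnv("DLOCKSS_MAX_REPLICATION", 10),
	}
}
```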
For a newly ingested file to reach full replication across a shard:

- **Ingest (immediate):** File pinned locally, `IngestMessage` broadcast to shard, cluster `Pin()` called.
- **CRDT Sync (seconds):** Cluster state propagates to peers via PubSub; `LocalPinTracker` detects the new pin and starts `PinRecursive`.
- **First Replication Check (up to 20s at root, 1m elsewhere):** `replicationManager.runChecker()` sends a `ReplicationRequest` for all pinned manifests (see the checker sketch after this list).
- **Auto-Replication (seconds to minutes):** Peers receiving the request that don't have the file fetch it via `PinRecursive` (up to the 5-minute timeout).
- **Cooldown (5 minutes):** After sending a request for a manifest, no new request is sent for that manifest for 5 minutes.
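A sketch of the checker's gating logic, assuming a simple cooldown map and the hardcoded per-cycle cap from the table (illustrative names, not the actual `runChecker` internals):

```go
package replication

import (
	"context"
	"time"
)

type CID string

// Assumed constants, matching the configuration table.
const (
	replicationRequestCooldownDuration = 5 * time.Minute
	maxReplicationRequestsPerCycle     = 50
)

// checker sketches only the gating logic of a replication cycle:
// the per-manifest cooldown map and the per-cycle cap.
type checker struct {
	lastSent map[CID]time.Time                             // manifest -> time of last ReplicationRequest
	send     func(ctx context.Context, manifest CID) error // broadcast a ReplicationRequest
}

// runCycle requests replication for pinned manifests, skipping manifests
// still in cooldown and stopping once the per-cycle cap is reached.
func (c *checker) runCycle(ctx context.Context, pinned []CID) {
	sent := 0
	now := time.Now()
	for _, m := range pinned {
		if sent >= maxReplicationRequestsPerCycle {
			return // the rest are picked up on later cycles
		}
		if last, ok := c.lastSent[m]; ok && now.Sub(last) < replicationRequestCooldownDuration {
			continue // a request went out less than 5 minutes ago
		}
		if err := c.send(ctx, m); err != nil {
			continue
		}
		c.lastSent[m] = now // the cooldown starts when the request is sent
		sent++
	}
}
```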
Typical convergence: Most files replicate within 1-2 minutes via CRDT sync alone. Files that fail the initial sync (large DAGs, slow block propagation) recover on the next replication cycle after the 5-minute cooldown.
Once a `ReplicationRequest` is sent for a manifest, `replicationRequestCooldownDuration` prevents resending for 5 minutes. If the first request fails (e.g., the receiving peer's `PinRecursive` times out), the file appears "stuck" until the cooldown expires.
Mitigation: none; the cooldown is deliberate flood prevention, at the cost of visible delays for files that fail on the first attempt.
`PinRecursive` for large files or over slow links may hit the `AutoReplicationTimeout`. The file remains unreplicated until the next replication cycle.
Mitigation: Heartbeat-driven re-pin gradually fills in missing blocks (see below).
The `replicationManager.sem` channel limits concurrent auto-replications to `MaxConcurrentReplicationChecks` (default 5). When all slots are occupied, additional `ReplicationRequest` messages are silently dropped.
Mitigation: Increase `DLOCKSS_MAX_CONCURRENT_CHECKS` for nodes with sufficient bandwidth.
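A sketch of that drop-on-full semaphore pattern, assuming a buffered-channel implementation (the `limiter` type and method names are illustrative):

```go
package replication

import "log"

// limiter sketches the replicationManager.sem pattern: a buffered channel
// sized to MaxConcurrentReplicationChecks. When every slot is taken, the
// request is dropped rather than queued, matching the "silently dropped"
// behaviour above (here made visible with a log line).
type limiter struct {
	sem chan struct{}
}

func newLimiter(maxConcurrent int) *limiter {
	return &limiter{sem: make(chan struct{}, maxConcurrent)} // default: 5
}

// tryAutoReplicate runs fn in a goroutine if a slot is free, else drops it.
func (l *limiter) tryAutoReplicate(fn func()) {
	select {
	case l.sem <- struct{}{}: // acquire a slot without blocking
		go func() {
			defer func() { <-l.sem }() // release on completion
			fn()
		}()
	default:
		log.Println("auto-replication skipped, concurrency limit reached")
	}
}
```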
At most 50 `ReplicationRequest` messages are sent per checker cycle. With thousands of files, not all manifests are requested in a single cycle.
Mitigation: Subsequent cycles pick up remaining manifests. The cooldown map ensures already-sent requests aren't duplicated.
Every heartbeat (~10s), each node picks one pinned manifest CID (round-robin) and:
- Re-pins the `ManifestCID` recursively (`PinRecursive`, 2-minute timeout). Idempotent — returns instantly when the DAG is already complete locally, and incrementally fetches missing blocks otherwise.
- Pins the `PayloadCID` as its own root so Kubo's reprovider (`pinned` strategy) re-announces it.
- Provides both CIDs to the DHT (only if the re-pin succeeded).

A `CompareAndSwap` guard prevents concurrent re-provides from piling up.
Impact: Resource-constrained nodes (e.g., Raspberry Pis) that failed the initial `PinRecursive` gradually complete the DAG over successive heartbeats without manual intervention. DHT provider records (which expire after ~24h) are kept fresh.
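A sketch of one heartbeat step under these assumptions (the `Pinner` interface and `heartbeatRepin` type are illustrative, not the project's actual API):

```go
package replication

import (
	"context"
	"sync/atomic"
	"time"
)

type CID string

// Pinner is an illustrative stand-in for the node's pin/provide surface.
type Pinner interface {
	PinRecursive(ctx context.Context, c CID) error // idempotent recursive pin
	Pin(ctx context.Context, c CID) error          // direct root pin
	Provide(ctx context.Context, c CID) error      // DHT provider announcement
}

// heartbeatRepin sketches the per-heartbeat step for one manifest/payload
// pair. The atomic.Bool CompareAndSwap mirrors the guard that keeps
// concurrent re-provides from piling up.
type heartbeatRepin struct {
	pinner Pinner
	busy   atomic.Bool
}

func (h *heartbeatRepin) step(ctx context.Context, manifest, payload CID) {
	if !h.busy.CompareAndSwap(false, true) {
		return // a previous re-pin/re-provide is still in flight
	}
	defer h.busy.Store(false)

	// Re-pin the ManifestCID: returns quickly if the DAG is complete
	// locally, otherwise incrementally fetches missing blocks.
	pinCtx, cancel := context.WithTimeout(ctx, 2*time.Minute)
	defer cancel()
	if err := h.pinner.PinRecursive(pinCtx, manifest); err != nil {
		return // only provide to the DHT after a successful re-pin
	}

	// Pin the PayloadCID as its own root so the reprovider re-announces it,
	// then refresh both provider records (they expire after ~24h).
	_ = h.pinner.Pin(ctx, payload)
	_ = h.pinner.Provide(ctx, manifest)
	_ = h.pinner.Provide(ctx, payload)
}
```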
Environment variables for tuning replication behaviour:

```bash
export DLOCKSS_CHECK_INTERVAL=15s  # Default: 1m
```

Faster detection at non-root shards. Root shards already check every 20s.

```bash
export DLOCKSS_MAX_CONCURRENT_CHECKS=10  # Default: 5
```

More parallel auto-replications. Higher bandwidth usage.

```bash
export DLOCKSS_AUTO_REPLICATION_TIMEOUT=10m  # Default: 5m
```

Allows more time for large DAG fetches. Ties up semaphore slots longer.
For faster convergence in testnets:
```bash
export DLOCKSS_CHECK_INTERVAL=15s
export DLOCKSS_MAX_CONCURRENT_CHECKS=10
```

For steady-state deployments:

- Keep `CheckInterval` at 1m for reasonable resource usage (root shards already use 20s).
- Keep `AutoReplicationTimeout` at 5m unless dealing with consistently large files.
- The 5-minute request cooldown is a deliberate trade-off between convergence speed and network overhead; files that fail on the first attempt self-heal after the cooldown expires.
The monitor's replication snapshot log line reports:
- `total_manifests`: Number of known manifests
- `total_at_target`: Files with replica count >= `min(MinReplication, shard_peer_count)`
- `avg_replication`: Average replica count across all manifests
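The arithmetic behind these fields is straightforward; a sketch with assumed names and signature:

```go
package monitor

// ReplicationSnapshot sketches the computation behind the snapshot log
// line. replicaCount maps manifest CID -> observed replica count.
func ReplicationSnapshot(replicaCount map[string]int, minReplication, shardPeerCount int) (totalManifests, totalAtTarget int, avgReplication float64) {
	// The target is min(MinReplication, shard_peer_count): a shard with
	// fewer peers than MinReplication can never exceed its peer count.
	target := minReplication
	if shardPeerCount < target {
		target = shardPeerCount
	}
	sum := 0
	for _, n := range replicaCount {
		totalManifests++
		sum += n
		if n >= target {
			totalAtTarget++
		}
	}
	if totalManifests > 0 {
		avgReplication = float64(sum) / float64(totalManifests)
	}
	return
}
```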
Node daemon logs to watch:
"auto-replication: fetched and pinned"— successful auto-replication"auto-replication: failed to fetch/pin"—PinRecursivetimeout or failure"auto-replication skipped, concurrency limit reached"— semaphore full"ReplicationRequest sent"— outbound request (debug level)