Skip to content

Commit 28ba65e

Browse files
badrishcCopilot
andauthored
Hold per-DB checkpoint locks until all general-BGSAVE per-DB checkpoints complete (#1796)
* Hold per-DB checkpoint locks until all general-BGSAVE per-DB checkpoints complete Fixes a residual race in PR #1767 that caused MultiDatabaseSaveInProgressTest to flake in CI Release builds. The general BGSAVE path synchronously paused all per-DB checkpoint locks before returning 'Background saving started', but the per-DB checkpoint helper released each per-DB lock as soon as that single DB's checkpoint completed - not when the entire general save finished. If DB 0's checkpoint completed before the test's 'BGSAVE 0' arrived over the wire, BGSAVE 0 would succeed instead of failing with 'ERR checkpoint already in progress'. Locally the test takes 6-7s and the race never loses; in CI Release it ran in 1s and reliably failed. See https://github.com/microsoft/garnet/actions/runs/25757540604/job/75650328662. Fix: - RunPausedCheckpointsAndReleaseLocksAsync (used by both general and per-DB BGSAVE) resumes ALL pre-paused DBs in its outer finally, after Task.WhenAll. So per-DB locks are held until ALL per-DB checkpoints complete, not just each individual one. A per-DB BGSAVE issued mid-flight reliably observes the in-progress checkpoint. - The per-DB checkpoint inner work is now a local async function TakeOneCheckpointAsync that performs only (TakeCheckpointAsync + UpdateLastSaveData) without resuming. - Pre-fill checkpointTasks[] with Task.CompletedTask so the catch path can safely double-await even if the synchronous task-creation loop throws partway through. The double-await ensures we never resume a per-DB lock while its checkpoint is still running. - Remove the handedOffCount partial-resume bookkeeping that's no longer needed. - The previously-shared RunPausedCheckpointAsync helper is removed - its only other caller (TaskCheckpointBasedOnAofSizeLimitAsync) now inlines the same try/checkpoint/ update/finally/resume sequence so its single-DB pause-resume lifecycle is visible in one place. Net effect: - General BGSAVE: per-DB locks held until ALL per-DB checkpoints complete, so any per-DB BGSAVE issued mid-flight reliably fails with 'checkpoint already in progress'. - Per-DB BGSAVE alone (single-DB path through RunPausedCheckpointsAndReleaseLocksAsync with pausedCount=1): unchanged - that single per-DB lock is still released exactly when that single checkpoint completes. - AOF-size-driven checkpoint: behaviorally unchanged (lock cleanup inlined). - Other legal scenarios (per-DB then per-DB on different DB, per-DB then general, general blocks general) preserved. Verification: 10/10 runs in Release config of MultiDatabaseSaveInProgressTest + MultiDatabaseGeneralSaveBlocksGeneralSaveTest, full MultiDatabaseTests suite (31/31) passes locally. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Pipeline BGSAVE + BGSAVE 0 in MultiDatabaseSaveInProgressTest to eliminate roundtrip race Even with the previous server-side fix that holds all per-DB checkpoint locks until the entire general save completes, the test still flaked in CI Release builds because the actual checkpoint of ~1MB of in-memory data completes in microseconds-to-milliseconds. That window is comparable to (and sometimes shorter than) the client→server roundtrip between issuing the general BGSAVE and the follow-up BGSAVE 0: Failed Garnet.test.MultiDatabaseTests.MultiDatabaseSaveInProgressTest [1 s] Error Message: ERR checkpoint already in progress Assert.That(caughtException, expression) Expected: <StackExchange.Redis.RedisServerException> But was: null (see https://github.com/microsoft/garnet/actions/runs/25773997865) Replace the two sequential SE.Redis Execute calls with a single LightClient pipelined send of 'BGSAVE\r\nBGSAVE 0\r\n'. Both commands arrive at the server in the same network packet, so the server processes BGSAVE 0 immediately after the general BGSAVE's synchronous setup completes - while DB 0's per-DB checkpoint lock is still held. This makes the assertion deterministic regardless of how fast the actual checkpoint runs. CountResponseType.Bytes is used because RESP token-counting in LightClient only treats '-' as an error marker at position 0, so a pipelined response containing two RESP tokens '+...\r\n-...\r\n' would never trigger CompletePendingRequests under the default Tokens mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add [CancelAfter(180_000)] to MultipleReplicasWithVectorSetsAsync This test has been timing out in CI. Set an explicit 180s cancellation timeout so the shared ClusterTestContext.cts is configured accordingly and polling loops (BackOff(cts.Token)) can exit cleanly instead of hanging until the test runner kills the process. Matches the convention applied in PR #1767 for MultipleReplicasWithVectorSetsAndDeletesAsync. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: badrishc <badrishc@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 4cc403f commit 28ba65e

3 files changed

Lines changed: 49 additions & 33 deletions

File tree

libs/server/Databases/MultiDatabaseManager.cs

Lines changed: 33 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -295,7 +295,15 @@ public override async Task TaskCheckpointBasedOnAofSizeLimitAsync(long aofSizeLi
295295

296296
if (pausedDbId < 0) return;
297297

298-
await RunPausedCheckpointAsync(databasesMapSnapshot[pausedDbId], pausedDbId, token, logger).ConfigureAwait(false);
298+
try
299+
{
300+
var storeTailAddress = await TakeCheckpointAsync(databasesMapSnapshot[pausedDbId], logger: logger, token: token).ConfigureAwait(false);
301+
UpdateLastSaveData(pausedDbId, storeTailAddress);
302+
}
303+
finally
304+
{
305+
ResumeCheckpoints(pausedDbId);
306+
}
299307
}
300308
finally
301309
{
@@ -967,69 +975,68 @@ private void CopyDatabases(IDatabaseManager src, bool enableAof)
967975
}
968976

969977
/// <summary>
970-
/// Run pre-paused per-DB checkpoints in parallel, then release the outer locks held by the caller.
978+
/// Run pre-paused per-DB checkpoints in parallel, then resume the per-DB checkpoint locks
979+
/// and release the outer locks held by the caller.
971980
/// Caller must hold <see cref="databasesContentLock"/> as a reader and must have synchronously
972-
/// pause-locked every DB in <paramref name="pausedDbIds"/>[0..<paramref name="pausedCount"/>).
981+
/// pause-locked the first <paramref name="pausedCount"/> entries of <paramref name="pausedDbIds"/>.
982+
/// Per-DB checkpoint locks are held until ALL per-DB checkpoints complete (not just each
983+
/// individual one) so a per-DB BGSAVE issued mid-flight during a general BGSAVE reliably
984+
/// observes the in-progress checkpoint and fails with "checkpoint already in progress".
973985
/// </summary>
974986
private async Task<bool> RunPausedCheckpointsAndReleaseLocksAsync(int[] pausedDbIds, int pausedCount,
975987
bool multiDbLockHeld, CancellationToken token, ILogger logger)
976988
{
989+
// Pre-fill with Task.CompletedTask so the catch path can safely await Task.WhenAll
990+
// even if the synchronous task-creation loop below throws partway through.
991+
var checkpointTasks = new Task[pausedCount];
992+
for (var i = 0; i < pausedCount; i++)
993+
checkpointTasks[i] = Task.CompletedTask;
994+
977995
try
978996
{
979997
// Force async so that the entry point can return synchronously to the caller.
980998
await Task.Yield();
981999

9821000
var databaseMapSnapshot = databases.Map;
983-
var checkpointTasks = new Task[pausedCount];
984-
var handedOffCount = 0;
9851001

9861002
try
9871003
{
9881004
for (var i = 0; i < pausedCount; i++)
989-
{
990-
checkpointTasks[i] = RunPausedCheckpointAsync(databaseMapSnapshot[pausedDbIds[i]], pausedDbIds[i], token, logger);
991-
handedOffCount = i + 1;
992-
}
1005+
checkpointTasks[i] = TakeOneCheckpointAsync(databaseMapSnapshot[pausedDbIds[i]], pausedDbIds[i]);
9931006

9941007
await Task.WhenAll(checkpointTasks).ConfigureAwait(false);
9951008
}
9961009
catch (Exception ex)
9971010
{
9981011
logger?.LogError(ex, "Checkpointing threw exception");
9991012

1000-
// Per-DB helpers that were started always resume their own dbId in finally.
1001-
// Resume locks for any pre-paused DBs that didn't get handed off (very rare —
1002-
// would require the synchronous tasks[] assignment loop above to throw).
1003-
for (var i = handedOffCount; i < pausedCount; i++)
1004-
ResumeCheckpoints(pausedDbIds[i]);
1013+
// Make sure any tasks already started are observed before we resume the per-DB
1014+
// locks in the outer finally (otherwise we could resume a lock while its
1015+
// checkpoint is still running).
1016+
try { await Task.WhenAll(checkpointTasks).ConfigureAwait(false); }
1017+
catch { /* already logged above */ }
10051018
}
10061019
}
10071020
finally
10081021
{
1022+
for (var i = 0; i < pausedCount; i++)
1023+
ResumeCheckpoints(pausedDbIds[i]);
1024+
10091025
if (multiDbLockHeld)
10101026
multiDbCheckpointingLock.WriteUnlock();
10111027

10121028
databasesContentLock.ReadUnlock();
10131029
}
10141030

10151031
return true;
1016-
}
10171032

1018-
/// <summary>
1019-
/// Run a single pre-paused per-DB checkpoint and resume the per-DB checkpoint lock.
1020-
/// Caller must have already pause-locked <paramref name="dbId"/> via <see cref="TryPauseCheckpoints(int)"/>.
1021-
/// </summary>
1022-
private async Task RunPausedCheckpointAsync(GarnetDatabase db, int dbId, CancellationToken token, ILogger logger)
1023-
{
1024-
try
1033+
// Local function: take one per-DB checkpoint and update LASTSAVE. Does NOT resume the
1034+
// per-DB lock — the outer finally above resumes all paused DBs after WhenAll completes.
1035+
async Task TakeOneCheckpointAsync(GarnetDatabase db, int dbId)
10251036
{
10261037
var storeTailAddress = await TakeCheckpointAsync(db, logger: logger, token: token).ConfigureAwait(false);
10271038
UpdateLastSaveData(dbId, storeTailAddress);
10281039
}
1029-
finally
1030-
{
1031-
ResumeCheckpoints(dbId);
1032-
}
10331040
}
10341041

10351042
private void UpdateLastSaveData(int dbId, long? storeTailAddress)

test/Garnet.test.cluster/VectorSets/ClusterVectorSetTests.cs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -464,6 +464,7 @@ static void Incr(byte[] k)
464464
}
465465

466466
[Test]
467+
[CancelAfter(180_000)]
467468
public async Task MultipleReplicasWithVectorSetsAsync()
468469
{
469470
const int PrimaryIndex = 0;

test/Garnet.test/MultiDatabaseTests.cs

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1377,13 +1377,21 @@ public void MultiDatabaseSaveInProgressTest()
13771377
db2.ListLeftPush($"k{i}o", new string('x', 256));
13781378
}
13791379

1380-
// Issue general background save
1381-
res = db1.Execute("BGSAVE");
1382-
ClassicAssert.AreEqual("Background saving started", res.ToString());
1383-
1384-
// Issue background save to DB 0 while general save is in progress - illegal
1385-
Assert.Throws<RedisServerException>(() => db1.Execute("BGSAVE", "0"),
1386-
Encoding.ASCII.GetString(CmdStrings.RESP_ERR_CHECKPOINT_ALREADY_IN_PROGRESS));
1380+
// Issue a general BGSAVE and a per-DB BGSAVE on DB 0 as a single pipelined batch
1381+
// via LightClient. Pipelining eliminates the client→server roundtrip between the
1382+
// two commands so the per-DB BGSAVE arrives at the server while the general BGSAVE's
1383+
// synchronous setup is still holding DB 0's per-DB checkpoint lock — guaranteeing
1384+
// the "checkpoint already in progress" error regardless of how fast the actual
1385+
// checkpoint completes. Without pipelining, a fast in-memory checkpoint can finish
1386+
// before the per-DB BGSAVE arrives over the wire and the test flakes (CI Release).
1387+
using (var lcRequest = TestUtils.CreateRequest(countResponseType: CountResponseType.Bytes))
1388+
{
1389+
var expectedResponse =
1390+
"+Background saving started\r\n" +
1391+
$"-{Encoding.ASCII.GetString(CmdStrings.RESP_ERR_CHECKPOINT_ALREADY_IN_PROGRESS)}\r\n";
1392+
var response = lcRequest.Execute("BGSAVE", "BGSAVE 0", expectedResponse.Length);
1393+
ClassicAssert.AreEqual(expectedResponse, response);
1394+
}
13871395

13881396
int lastsave_old = lastsave;
13891397
// Wait for save to complete

0 commit comments

Comments
 (0)