Skip to content

Commit 3b8b52c

Browse files
committed
perf(sync): single-shot DCV retry — drop per-order challenge polling
PerformDcvIfNeededAsync now accepts an optional waitForChallengeSecondsOverride parameter. Enroll continues to pass null, preserving the configured DcvWaitForChallengeSeconds budget (default 60s) for one-shot end-to-end issuance. TryRunDcvDuringSyncAsync passes 0, which forces a single TrackOrder call per pending order — if CERTInext hasn't yet exposed the DCV slot, sync moves on and the next sync cycle will pick the order up. Motivation: with the previous code, every pending order in Synchronize/GetSingleRecord could spend up to 60s polling for the DCV challenge to materialize. On accounts with many pending orders this scaled poorly — a live DCV-on sync against ~150 pending orders took 2:35. After: the same sync completes in ~2s of plugin time per record (one TrackOrder per pending order, no per-order delay loop). Already-issued orders were and remain skipped (the gate at Synchronize line 719 only runs the retry for EXTERNALVALIDATION). Tests: - New SyncDcvRetry_DoesSingleShotTrackOrder_WhenChallengeNotReady pins the single-shot behaviour: even with DcvWaitForChallengeSeconds=60 in config, the sync path makes exactly ONE TrackOrder call for an order whose domainVerification is null and exits in <10s wall-clock. - Existing happy-path tests still exercise the Enroll budget (which keeps the full polling behaviour) via the null override. 146/146 unit tests pass.
1 parent 797f666 commit 3b8b52c

2 files changed

Lines changed: 84 additions & 5 deletions

File tree

CERTInext.Tests/CERTInextCAPluginDcvTests.cs

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -395,6 +395,65 @@ public void SetDomainValidatorFactory_SecondCall_OverridesFirst()
395395
"the most recent SetDomainValidatorFactory call must replace the earlier one");
396396
}
397397

398+
// ---------------------------------------------------------------------------
399+
// Sync path is single-shot for the DCV challenge wait
400+
// ---------------------------------------------------------------------------
401+
402+
[Fact]
403+
public async Task SyncDcvRetry_DoesSingleShotTrackOrder_WhenChallengeNotReady()
404+
{
405+
// Sync MUST NOT poll the configured DcvWaitForChallengeSeconds budget per
406+
// pending order — that would scale O(orders × 60s) per cycle and tie up
407+
// gateway threads for minutes per sync. When TrackOrder returns null
408+
// domainVerification, sync exits immediately and lets the next sync cycle
409+
// pick the order up.
410+
var mock = NewMock();
411+
412+
// High config budget — would normally drive 6+ polls × 5s waits. The sync
413+
// override of 0 must prevent that.
414+
var config = DcvConfig(dcvWaitForChallengeSeconds: 60);
415+
416+
mock.Setup(c => c.TrackOrderAsync(MockCertificateData.DcvOrderId, It.IsAny<CancellationToken>()))
417+
.ReturnsAsync(new TrackOrderResponse
418+
{
419+
OrderDetails = new TrackOrderResponseDetails
420+
{
421+
OrderStatusId = "1",
422+
CertificateStatusId = "1",
423+
DomainVerification = null
424+
}
425+
});
426+
427+
// GetSingleRecord calls GetCertificateAsync first to materialize the record;
428+
// the sync-DCV-retry kicks in afterwards. The pending response keeps the
429+
// retry path engaged so we exercise the override.
430+
mock.Setup(c => c.GetCertificateAsync(MockCertificateData.DcvOrderId, It.IsAny<CancellationToken>()))
431+
.ReturnsAsync(MockCertificateData.PendingCertRecord(MockCertificateData.DcvOrderId));
432+
433+
var validator = new FakeDomainValidator();
434+
var plugin = BuildPlugin(mock.Object, new FakeDomainValidatorFactory(validator), config);
435+
436+
var sw = System.Diagnostics.Stopwatch.StartNew();
437+
// GetSingleRecord calls TryRunDcvDuringSyncAsync internally — which is the
438+
// sync-style path with waitForChallengeSecondsOverride=0.
439+
var record = await plugin.GetSingleRecord(MockCertificateData.DcvOrderId);
440+
sw.Stop();
441+
442+
record.Should().NotBeNull();
443+
// The 0-budget single shot must complete well under the 60s config budget.
444+
// Use a generous 10s ceiling to tolerate slow CI hosts; the actual cost is
445+
// ~1 TrackOrder. Without the override we'd be ≥60s.
446+
sw.Elapsed.Should().BeLessThan(TimeSpan.FromSeconds(10),
447+
"sync's DCV retry must be single-shot, not poll the configured challenge budget");
448+
449+
mock.Verify(c => c.TrackOrderAsync(MockCertificateData.DcvOrderId, It.IsAny<CancellationToken>()),
450+
Times.Exactly(1),
451+
"PerformDcvIfNeededAsync's single-shot challenge check must make exactly ONE " +
452+
"TrackOrder call when waitForChallengeSecondsOverride=0 and the slot is null. " +
453+
"Without the override, the polling loop would issue many more calls within " +
454+
"the 60s budget.");
455+
}
456+
398457
// ---------------------------------------------------------------------------
399458
// Failure modes
400459
// ---------------------------------------------------------------------------

CERTInext/CERTInextCAPlugin.cs

Lines changed: 25 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1059,7 +1059,15 @@ public DomainValidatorConfigProvider(Dictionary<string, object> config)
10591059
/// * a bounded DCV timeout linked to the caller's cancellation token,
10601060
/// * swallowing of non-cancellation exceptions so a single bad order does not
10611061
/// halt a 12-hour sync — the order will be retried on the next cycle.
1062-
/// Returns <c>true</c> when DCV actually executed, <c>false</c> when skipped.
1062+
///
1063+
/// Uses a single-shot challenge check (<c>waitForChallengeSeconds=0</c>) by default
1064+
/// because sync runs periodically: if CERTInext hasn't yet exposed the DCV slot for
1065+
/// this order, the next sync cycle will pick it up. Waiting per-order during sync
1066+
/// scales poorly — a single pending order's 60s budget becomes minutes of wasted
1067+
/// gateway thread time across an account with many orders. See PR #2 discussion.
1068+
///
1069+
/// Returns <c>true</c> when DCV actually executed (or DCV is already complete),
1070+
/// <c>false</c> when skipped.
10631071
/// </summary>
10641072
private async Task<bool> TryRunDcvDuringSyncAsync(string orderNumber, CancellationToken ct)
10651073
{
@@ -1081,10 +1089,11 @@ private async Task<bool> TryRunDcvDuringSyncAsync(string orderNumber, Cancellati
10811089
dcvCts.CancelAfter(TimeSpan.FromMinutes(timeoutMinutes));
10821090

10831091
_logger.LogInformation(
1084-
"Attempting deferred DCV during sync/refresh. OrderNumber={OrderNumber}, DcvTimeoutMinutes={Timeout}",
1092+
"Attempting deferred DCV during sync/refresh (single-shot challenge check). " +
1093+
"OrderNumber={OrderNumber}, DcvTimeoutMinutes={Timeout}",
10851094
orderNumber, timeoutMinutes);
10861095

1087-
return await PerformDcvIfNeededAsync(orderNumber, dcvCts.Token);
1096+
return await PerformDcvIfNeededAsync(orderNumber, dcvCts.Token, waitForChallengeSecondsOverride: 0);
10881097
}
10891098
catch (OperationCanceledException) when (ct.IsCancellationRequested)
10901099
{
@@ -1110,15 +1119,26 @@ private async Task<bool> TryRunDcvDuringSyncAsync(string orderNumber, Cancellati
11101119
///
11111120
/// Rule: if the order is already issued we never attempt DCV — it would be a no-op
11121121
/// at best and could confuse the CA at worst.
1122+
///
1123+
/// <paramref name="waitForChallengeSecondsOverride"/> lets the sync path force a
1124+
/// single-shot challenge check (pass <c>0</c>) so a sync cycle doesn't spend up to
1125+
/// <c>DcvWaitForChallengeSeconds</c> per pending order waiting for CERTInext to
1126+
/// expose the DCV slot — sync runs periodically, so unexposed orders are picked up
1127+
/// on the next cycle instead. Enroll passes <c>null</c> to keep the full configured
1128+
/// budget (user-visible latency benefits from a one-shot end-to-end finish).
11131129
/// </summary>
1114-
private async Task<bool> PerformDcvIfNeededAsync(string orderNumber, CancellationToken ct)
1130+
private async Task<bool> PerformDcvIfNeededAsync(
1131+
string orderNumber,
1132+
CancellationToken ct,
1133+
int? waitForChallengeSecondsOverride = null)
11151134
{
11161135
// Poll TrackOrder until CERTInext exposes the DCV challenge (domainVerification
11171136
// populated) OR the cert reaches a terminal state OR the wait budget expires.
11181137
// Under concurrent enrollment load CERTInext sometimes takes a few seconds to
11191138
// materialize the slot after GenerateOrderSSL returns — without this wait a
11201139
// race-condition order skips DCV entirely and waits for the next sync cycle.
1121-
int waitBudgetSeconds = _config.GetEffectiveDcvWaitForChallengeSeconds();
1140+
int waitBudgetSeconds = waitForChallengeSecondsOverride
1141+
?? _config.GetEffectiveDcvWaitForChallengeSeconds();
11221142
// Challenge-wait poll interval is clamped to [1s, 5s] so it's responsive even
11231143
// when an admin has set DcvPropagationDelaySeconds high for slow zones (that
11241144
// setting governs how long we wait *after* publishing a TXT record, which is a

0 commit comments

Comments
 (0)