fix: Honor x-ld-fd-fallback header in FDv2 initializer phase and on successful responses#251
Conversation
…uccessful responses
The .NET SDK previously only inspected the x-ld-fd-fallback header on
FDv2 synchronizer error responses. Two gaps:
- The header was ignored on successful (200) responses. The server
can include it alongside a valid payload, expecting the SDK to
apply the payload and then switch to FDv1.
- The header was ignored entirely during the initializer phase. An
initializer that received the directive would just fail through to
the next initializer or to the FDv2 synchronizer chain rather than
switching directly to the FDv1 fallback synchronizer.
Polling and streaming data sources now check the response/connection
headers on the success path and emit Off+FDv1Fallback after applying
the payload. The FDv2 composite attaches the FDv1 fallback applier to
the initializers entry as well as the synchronizers entry, with
extra-skip semantics so an initializer-phase fallback bypasses the
synchronizer chain entirely. A shared latch coordinates the FDv1
fallback applier with the surrounding fallback/recovery appliers so
exhaustion signals from disposed entries cannot block the fallback
synchronizer that just started.
…nit coverage
The contract-test harness now treats the FDv1 Fallback Synchronizer as a
distinct field on the data system (dataSystem.fdv1Fallback) rather than
deriving it from the FDv2 Primary/Fallback synchronizer chain, and gates
the directive subtests on a new fdv1-fallback capability.
Wire the test service to match:
- declare the fdv1-fallback capability
- accept the new fdv1Fallback config field as SdkConfigPollingParams
- build the FDv1 Fallback Synchronizer from that field directly,
instead of coercing it out of the last polling synchronizer
- bump the FDv2 contract-tests pin from v3.0.0-alpha.3 to v3.0.0-alpha.6
Add unit-level coverage for the new SDK behavior:
- FDv2PollingDataSourceTest: success+fallback header path
- FDv2StreamingDataSourceTest: success+fallback header path
- FDv2DataSourceTest: FDv1FallbackActionApplier extra-skip behavior
and the shared GatedObserver latch.
…egardless of state The InitializationTracker only transitioned to FallingBack when it observed an underlying source's raw Off state with FDv1Fallback=true. But by the time the outer composite's secondaries see status updates from an inner composite, the inner CompositeSource's sanitizer has mapped Off to Interrupted -- so the Off case never matched, and the tracker stayed at NoData (or Data, if a fallback synchronizer's Apply with empty selector arrived). Either way, the tracker.Task never completed, so client initialization timed out at StartWaitTime and the contract-test service threw "Client initialization failed" for the streaming- error and polling-initializer FDv1 fallback directive scenarios. Move the FDv1Fallback check ahead of the state switch so it fires for any state (Interrupted on a recoverable error or sanitized Off, Off on an unrecoverable error, Valid + header on a successful directive) when the signal comes from either the initializer or synchronizer category. Apply the same handling to the initializer-phase fallback now that it is honored. Add three end-to-end tests against a real FDv2DataSource composite that reproduce each failing harness scenario: synchronizer Off+FDv1Fallback, synchronizer success-then-Off+FDv1Fallback, and initializer Off+FDv1Fallback. With the fix removed the first two hang at MakeCustomClient's StartWaitTime (matching the harness failure) and the action applier sequence is unverified; with the fix they all pass and the FDv1 fallback synchronizer is engaged.
Adds RealStreamingSourceWithFallbackHeaderEngagesFDv1FallbackAfterApply, which wires a real FDv2StreamingDataSource (with a mock IEventSource) into the FDv2DataSource composite alongside a mock FDv1 fallback synchronizer. Drives synthetic SSE events that mirror the harness suite "directive on streaming success applies payload then engages FDv1": a server-intent + put-object + payload-transferred sequence with the x-ld-fd-fallback header on the connection-open response. The earlier mock-only tests asserted the action applier sequence in isolation but did not exercise the streaming source's MaybeMarkInitialized + Shutdown path inside the composite. This test bridges that gap.
Mirrors the harness suite "directive on streaming error engages FDv1 fallback" using a real FDv2StreamingDataSource driven by a mock IEventSource that fires an EventSourceServiceUnsuccessfulResponseException with the x-ld-fd-fallback header. Without the InitializationTracker fix this test would hang at MakeCustomClient's StartWaitTime; with the fix it passes in ~25ms. Also adds TriggerError to the integration mock event source so the test can simulate the EventSource library's error path.
When the harness reports a failure (e.g. "Client initialization failed" from the test service), the contract-test logs are the only way to see why the SDK didn't reach the expected state. Capture them as a workflow artifact so subsequent debugging doesn't require re-running locally.
…tyName
The contract-test service uses System.Text.Json with PropertyNamingPolicy =
CamelCase. CamelCase converts the C# property name FDv1Fallback to
"fDv1Fallback" (only the first character lowercased), but the harness
sends the field as "fdv1Fallback" (lowercase 'd' too). The mismatch made
sdkParams.DataSystem.FDv1Fallback always null, which meant
dataSystemBuilder.FDv1FallbackSynchronizer(...) was never called, which
meant dataSystemConfiguration.FDv1FallbackSynchronizer was null, which
caused the OUTER composite's FDv1 fallback factory to throw NRE the moment
the action applier tried to invoke it.
Add an explicit [JsonPropertyName("fdv1Fallback")] attribute so the
deserializer accepts the harness's wire format. Also remove the temporary
debug logging from CompositeSource and SdkClientEntity that was used to
identify the bug.
…er is configured FDv2DataSystem.Create unconditionally constructed a one-entry list with FactoryWithContext(clientContext)(dataSystemConfiguration.FDv1FallbackSynchronizer). When FDv1FallbackSynchronizer was null this still produced a non-null SourceFactory delegate that captured the null configurer. The OUTER composite would then add a phantom FDv1 fallback entry; the moment an action applier advanced to it the entry's factory invocation would NullReferenceException inside FactoryWithContext. Build the list as empty when no FDv1 fallback is configured, so the OUTER composite simply has no FDv1 fallback entry and the data system halts cleanly per the spec.
…lts the data system without a fallback The previous attempt to guard FDv2DataSystem.Create against a null FDv1FallbackSynchronizer (commit dec9450) regressed Requirement 1.6.3(4): when no FDv1 fallback is configured and the directive arrives, the SDK must halt -- it must not keep retrying the failing synchronizer. Skipping the FDv1 fallback factory list also stripped the fdv1FallbackApplier from the synchronizer entry's outer-level observer, so the directive was no longer observed: nothing disposed the streaming source, and the EventSource library kept reconnecting after every 500. Attach the fdv1FallbackApplier (and the initializer variant) to the outer composite unconditionally. When an FDv1 fallback entry is in the outer list the applier advances to it; when one is not, the applier's BlockCurrent + DisposeCurrent + GoToNext sequence exhausts the outer list -- which disposes the current synchronizer (stopping reconnects) and transitions the outer composite to Off via InternalDispose. Add an end-to-end test (SynchronizerFDv1FallbackWithoutFallbackConfiguredHaltsDataSystem) that mirrors the harness "directive without FDv1 fallback configured halts the data system" scenario: a synchronizer reports Interrupted+FDv1Fallback (500 + directive), no FDv1 fallback configured, the underlying source must be disposed, and the data system must reach Off.
dec9450 to
0edfb9d
Compare
… with FDv1 fallback latch When an FDv2 initializer applies a non-empty changeset (success path) and the response also carries the x-ld-fd-fallback header, the source emits Apply followed by UpdateStatus(Off, FDv1Fallback) on the same propagation chain. Previously, ActionApplierBlacklistWhenSuccessOrOff's Apply path would synchronously enqueue advancement onto the outer composite, and the FDv1FallbackActionApplier's UpdateStatus path would independently enqueue its own advancement. The shared latch could not prevent the blacklist applier because it was only triggered during the later UpdateStatus call -- by then, the blacklist applier had already enqueued its actions. The result was the outer composite double-advancing past the FDv1 fallback entry. The fix: - Add ICompositeSourceActionable.EnqueueAction so observers can defer work onto the composite's serialized queue. - Pass the FDv1FallbackLatch into the blacklist applier. Its UpdateStatus path now consults the latch synchronously (sanitizer maps Off->Interrupted before reaching outer secondaries, so this branch is largely defensive). Its Apply path defers advancement onto the queue with a bounded re-enqueue, giving the source thread time to complete UpdateStatus and set the latch before the deferred action checks it. - Reorder the initializer entry's CompositeObserver so the FDv1 fallback applier runs before the blacklist applier, ensuring the latch is set as early as possible in any UpdateStatus propagation. Adds a composite-level regression test that drives the exact scenario end-to-end against a real FDv2DataSource composite, asserting the FDv1 fallback synchronizer is engaged and the FDv2 streaming synchronizer is never started.
| ActionApplierFactory fdv1FallbackApplierFactory = | ||
| (actionable) => new FDv1FallbackActionApplier(actionable, fdv1FallbackTriggered); | ||
| // From the initializers entry, the FDv1 fallback entry is two ahead when synchronizers | ||
| // are configured (skip past synchronizers), or one ahead when they are not. |
| /// wants to check a latch that is set by a sibling applier in the same propagation. | ||
| /// </remarks> | ||
| /// <param name="action">the action to enqueue</param> | ||
| void EnqueueAction(Action action); |
There was a problem hiding this comment.
Could we instead add a small bit of tagging metadata (FDv2, FDv1) to the entries and then add `void BlockAll(Predicate<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory)> toBlockPredicate).
Then internally, the source list can just block the initializers and synchronizers by the predicate matching the that they are FDv2. Then the only one that remains are FDv1 whatevers (which is the fallback synchronizer).
This way none of the code related to how it moves through the composite source has to change, the internal queue is not exposed for any action to be enqueued, and there isn't special handling for if FDv1 fallback was seen during the initializers phase or the synchronizers phase.
| /// Action applier that blacklists the current datasource when init occurs or when Off status is seen, | ||
| /// then disposes the current datasource, goes to the next datasource, and starts it. | ||
| /// </summary> | ||
| private class ActionApplierBlacklistWhenSuccessOrOff : IDataSourceObserver |
There was a problem hiding this comment.
This ActionApplierBlacklistWhenSuccessOrOff got extremely complicated. Is it just easier to make a ActionApplierForInitializers and ActionApplierForSynchronizers and there is a bit of duplicated code wrt handling the FDv1 fallback signal?
I think this approach gets rid of the need for the GatedObserver, the latch, and the deferred re-enqueuing code.
…ective flag Address review feedback on the cross-applier latch + GatedObserver + deferred-Apply re-enqueue machinery used to coordinate FDv1 fallback. Tag each outer composite entry with a CompositeEntryKind (FDv2 / FDv1Fallback) and add ICompositeSourceActionable.BlockAll(predicate), which removes every matching entry from the source list. The FDv1 fallback applier now calls BlockAll(FDv2) + GoToNext + StartCurrent, which is phase-agnostic and replaces the per-phase _extraEntriesToSkip counting and the initializer-only applier-factory variant. Add a FDv1Fallback flag to ChangeSet so the polling and streaming sources can ride the directive on Apply when a successful response carries the x-ld-fd-fallback header. With the flag inline on the event, the blacklist applier sees it synchronously and bails, and the FDv1 fallback applier triggers off the same Apply -- the success-path race that the latch was solving no longer exists. Each flow-control applier (FastFallback, TimedFallbackAndRecovery, BlacklistWhenSuccessOrOff) now checks newError?.FDv1Fallback and changeSet.FDv1Fallback inline and bails so the FDv1 fallback applier owns the transition uncontested. Drop FDv1FallbackLatch, GatedObserver, the cross-applier latch wiring, the initializer-only fdv1FallbackApplierFactory variant, _extraEntriesToSkip, and the deferred-Apply bounded-re-enqueue branch in ActionApplierBlacklistWhenSuccessOrOff. EnqueueAction is removed from the public actionable interface (it is no longer needed by any applier). Tests: replace FDv1FallbackActionApplierWith*Skip*, GatedObserverSuppressesEventsAfterLatchTriggered with UsesBlockAllAndAdvances, TriggersOnApplyCarryingDirective, and IsIdempotent. The end-to-end Bugbot regression test InitializerSuccessWithFDv1FallbackDirectiveDoesNotOverAdvance still passes against the simpler design.
| namespace LaunchDarkly.Sdk.Server.Internal.FDv2DataSources | ||
| { | ||
| using FactoryList = List<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory)>; | ||
| using OuterFactoryList = List<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory, CompositeEntryKind Kind)>; |
There was a problem hiding this comment.
We don't need to support the list with no kind, we can change the API as this is internal only.
| IList<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory)> factoryTuples, | ||
| Logger logger, | ||
| bool circular = true) | ||
| : this(compositeDescription, updatesSink, WithDefaultKind(factoryTuples), logger, circular) |
There was a problem hiding this comment.
We don't need to support the constructor with a list with no source kind, we can change the API as this is internal only.
| } | ||
|
|
||
| private static IList<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory, CompositeEntryKind Kind)> | ||
| WithDefaultKind(IList<(SourceFactory Factory, ActionApplierFactory ActionApplierFactory)> tuples) |
There was a problem hiding this comment.
This won't be necessary if we require a list with kinds be provided at construction time (see other comment on constructor)
| /// continues to point at whichever element followed the previous head. Returns the number | ||
| /// of elements removed. | ||
| /// </summary> | ||
| public int RemoveAll(Predicate<T> match) |
There was a problem hiding this comment.
Ensure there is sufficient unit test coverage for the various head position edge cases, such as when the predicate matches every entry and all entries are removed.
| /// </summary> | ||
| public int RemoveAll(Predicate<T> match) | ||
| { | ||
| if (match is null) throw new ArgumentNullException(nameof(match)); |
There was a problem hiding this comment.
We don't need to throw if there is no matcher. We can just return as the null matcher does not match anything and so no removal occurs.
| } | ||
| if (newState != DataSourceState.Off) return; | ||
| if (newError?.FDv1Fallback == true) return; | ||
|
|
There was a problem hiding this comment.
Bring back comment, it is still a valid comment:
// When Off status is seen, blacklist current, dispose current, go to next, and start current
| { | ||
| if (Interlocked.CompareExchange(ref _triggered, 1, 0) != 0) return; | ||
|
|
||
| _actionable.BlockAll(kind => kind == CompositeEntryKind.FDv2); |
There was a problem hiding this comment.
Should use _actionable.DisposeCurrent(); to be defensive against a source that does not shut itself down. Since dispose current is supposed to be idempotent, this should not hurt.
| Time = DateTime.Now, | ||
| FDv1Fallback = true | ||
| }; | ||
| _initTask.TrySetResult(true); |
There was a problem hiding this comment.
_initTask.TrySetResult(true); being conditional on if (fdv1Fallback) seems incorrect.
I think MaybeMarkInitialized() just before if (fdv1Fallback) took care of this concern.
There was a problem hiding this comment.
Might make sense for @kinyoklion to review this file as he is more familiar with it than I am.
…k refactor Tighten the CompositeSource API and apply the small comment / defensive cleanup items called out by tanderson-ld and Cursor on the prior commit. Drop the 2-tuple CompositeSource constructor and the WithDefaultKind helper. CompositeSource is internal-only, so all callers now pass a 3-tuple list with explicit CompositeEntryKind. Inner sub-composites in FDv2DataSource pass FDv2 uniformly (the kind is inert for inner composites whose appliers never call BlockAll). Update the six test declarations in CompositeSourceTest.cs to match. Soften SourcesList.RemoveAll(null) to return 0 rather than throw -- a null matcher matches nothing, so there is nothing to remove. Add 9 unit tests for RemoveAll covering the head-position edge cases: matches none, matches all, matches only entries before / after / at the head, single-element list, empty list, and a circular wrap case. Restore the deleted "When Off status is seen, blacklist current..." comment in ActionApplierBlacklistWhenSuccessOrOff.UpdateStatus. In FDv1FallbackActionApplier.Trigger, call DisposeCurrent() between BlockAll and GoToNext as a defensive guard against a source that does not shut itself down. DisposeCurrent is idempotent so this is safe. Update the matching unit-test CallSequence assertions. In FDv2StreamingDataSource, drop the redundant _initTask.TrySetResult inside the FDv1 fallback branch -- MaybeMarkInitialized() on the preceding line already covers it. Update the InitializerSuccess... regression test so the mock initializer sets fdv1Fallback: true on its ChangeSet, mirroring what FDv2PollingDataSource does in production. Without this, the test exercised the pre-refactor race-prone path rather than the new inline-bail path.
…ured The FDv1 fallback applier is now attached to both the initializers and synchronizers entries unconditionally. When the directive fires and no FDv1 fallback synchronizer was configured, BlockAll(FDv2) exhausts the outer composite. Exhaustion-driven Off goes directly to the external sink, bypassing the InitializationTracker's observers, so the tracker permanently stays in FallingBack and dataSource.Start() never resolves. Pass hasFdv1Fallback to InitializationTracker. When a transition would enter FallingBack without a fallback configured, transition to Failed instead so Start() resolves with false (matching the pre-refactor behavior where the applier was conditionally attached and the error propagated through SynchronizersExhausted -> Failed). Strengthen SynchronizerFDv1FallbackWithoutFallbackConfiguredHaltsDataSystem to assert that Start() resolves within 5s with false -- without the fix the test hangs and times out.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default mode and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit d9a0697. Configure here.
| }; | ||
| Shutdown(fallbackError); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Polling source _initTask never completes on store failure with fallback
Low Severity
When the FDv1 fallback header is present on a successful 200 response but the data store rejects the changeset write (e.g., transient store failure), ProcessChangeSet returns early without setting _initTask. Control then falls through to the if (fdv1Fallback) block which calls Shutdown, permanently canceling the repeating poll task. Since _initTask is never completed (no SetResult or TrySetResult), the polling source's Start() task remains pending indefinitely — a minor resource leak. Previously the poll would retry on the next tick; now Shutdown makes it permanent. A _initTask.TrySetResult(false) (or true) before Shutdown would resolve the dangling task.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit d9a0697. Configure here.
…lback If ProcessChangeSet fails (data store rejects the changeset) on a response carrying the FDv1 directive, control falls through to the fdv1Fallback branch in UpdateTaskAsync. Shutdown permanently cancels the poll loop, but _initTask is never resolved -- the polling source's Start() task hangs forever (a resource leak; not user-visible because CompletingDataSource.Start() returns the tracker's task rather than the underlying source's). Add _initTask.TrySetResult(false) before Shutdown. TrySet is a no-op when ProcessChangeSet already set it to true on the success path, so this only affects the store-failure path.
🤖 I have created a release *beep* *boop* --- ## [8.12.1](LaunchDarkly.ServerSdk-v8.12.0...LaunchDarkly.ServerSdk-v8.12.1) (2026-05-12) ### Bug Fixes * Honor x-ld-fd-fallback header in FDv2 initializer phase and on successful responses ([#251](#251)) ([4bb667d](4bb667d)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [LaunchDarkly.ServerSdk](https://github.com/launchdarkly/dotnet-core) | nuget | patch | `8.12.0` -> `8.12.1` | --- ### Release Notes <details> <summary>launchdarkly/dotnet-core (LaunchDarkly.ServerSdk)</summary> ### [`v8.12.1`](https://github.com/launchdarkly/dotnet-core/releases/tag/LaunchDarkly.ServerSdk-v8.12.1): LaunchDarkly.ServerSdk: v8.12.1 [Compare Source](launchdarkly/dotnet-core@LaunchDarkly.ServerSdk-v8.12.0...LaunchDarkly.ServerSdk-v8.12.1) ##### Bug Fixes - Honor x-ld-fd-fallback header in FDv2 initializer phase and on successful responses ([#​251](launchdarkly/dotnet-core#251)) ([4bb667d](launchdarkly/dotnet-core@4bb667d)) *** This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). <!-- CURSOR_SUMMARY --> </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or PR is renamed to start with "rebase!". 👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).


Summary
FDv2PollingDataSourceandFDv2StreamingDataSourcenow honorX-LD-FD-Fallback: trueon successful responses (not just errors): the payload is applied first, then the data source closes withFDv1Fallback=trueso the action applier can swap to FDv1.FDv1FallbackLatch+GatedObservercoordination ensures the existing fallback-on-Off appliers stop reacting after fallback is triggered, preventing them from blocking the now-running FDv1 fallback entry on stale exhaustion signals.fdv1-fallbackcapability and accepts a top-leveldataSystem.fdv1Fallbackconfig object, wiring it directly toFDv1FallbackSynchronizerinstead of inferring from the last polling synchronizer. Bumps the FDv2 contract-tests pin fromv3.0.0-alpha.3tov3.0.0-alpha.6.Mirrors the Go reference change in go-server-sdk#365 and contract-test wiring in go-server-sdk#368. Spec rationale lives in sdk-specs#155; the new harness suite is in sdk-test-harness#336.
Test plan
dotnet test pkgs/sdk/server/test/...— 1555 tests pass (8 new tests added). FDv2-only filter passes 159 across 3 consecutive runs.BlockCurrent → DisposeCurrent → (GoToNext)*N → StartCurrentsequencing, andGatedObserversuppression after the latch fires.v3.0.0-alpha.6runs the new "FDv1 Fallback Directive" suite cleanly in CI.Test coverage caveat
There is no in-repo end-to-end test that drives a real
FDv2DataSourcecomposite through the initializer-phase fallback path (initializer reportsOff+FDv1Fallback→ payload applied → FDv2 synchronizer chain bypassed → FDv1 fallback engaged). An attempt at one was flaky against the existing CompositeSource action queue; rather than commit a flaky test, the unit-level applier and latch tests were kept and the contract-test harness'sfdv1-fallbacksuite is the end-to-end safety net. The synchronizer-phase end-to-end fallback IS covered by the pre-existingErrorWithFDv1FallbackTriggersFallbackToFDv1Synchronizerstest.Note
Medium Risk
Medium risk: changes core FDv2 data-source/composite switching and fallback behavior based on
x-ld-fd-fallback, which could impact initialization and update delivery if edge cases are missed.Overview
FDv2 now honors the
x-ld-fd-fallback: truedirective across more paths and cleanly hands off to FDv1. Polling and streaming sources propagate the directive even on successful responses by tagging the emittedChangeSetand then shutting down withFDv1Fallback=trueso the composite can switch to the FDv1 fallback synchronizer.The FDv2 composite fallback logic is reworked to be phase-agnostic: a new
CompositeEntryKindplusICompositeSourceActionable.BlockAlllets the FDv1 fallback applier remove all FDv2 entries in one shot (initializers and synchronizers) and advance to the FDv1 fallback entry, while other appliers bail out when the directive is present to avoid double-advancing.Contract-test plumbing is updated to support the new harness suite: the test harness pin moves to
v3.0.0-alpha.6, the test service advertisesfdv1-fallbackand acceptsdataSystem.fdv1Fallback(with correct JSON name), and CI uploads the test-service log on FDv2 contract-test failures. Extensive new unit/integration tests cover directive-on-success, directive-on-error, initializer-phase skipping, and “no fallback configured” halt behavior.Reviewed by Cursor Bugbot for commit 752da6a. Bugbot is set up for automated code reviews on this repo. Configure here.