Commit c46043e
Add HTTP/2 PING for broken connection detection. (#49095)
* Extract HTTP/2 PING keepalive from PR #48420
Extracts the HTTP/2 PING health probing feature from PR #48420
(AzCosmos_HttpConnectionMaxLife) as a standalone changeset.
Production code:
- Http2PingHandler: sends PING frames on idle H2 parent channels
- Configs: COSMOS.HTTP2_PING_HEALTH_ENABLED + PING_INTERVAL_IN_SECONDS
- ReactorNettyClient: installs PingHandler via doOnConnected
- IHttpClientInterceptor: test hook for DNS resolver + doOnConnected
- HttpClientConfig/ConnectionPolicy/CosmosClientBuilder: interceptor plumbing
Test code:
- Http2PingKeepaliveTest: proves PINGs sent and ACKed on idle connections
- Http2PingFrameCounterHandler: test utility for counting PING ACK frames
- manual-http-network-fault TestNG suite + Maven profile
Verified in Docker (--cap-add=NET_ADMIN): 5 PINGs sent, 10 ACKs received.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Remove IHttpClientInterceptor plumbing, add PING concurrency guard + broken connection detection
- Remove IHttpClientInterceptor interface, CosmosHttpClientInterceptor, and all
plumbing through CosmosClientBuilder, ConnectionPolicy, ImplementationBridgeHelpers,
RxDocumentClientImpl, HttpClientConfig, and ReactorNettyClient
- Revert CosmosInterceptorHelper to upstream/main (remove registerHttpClientInterceptor)
- Http2PingHandler: at-most-one outstanding PING (pingOutstandingSinceNanos)
- Http2PingHandler: suppress PING when active streams exist (numActiveStreams > 0)
- Http2PingHandler: mark connection unhealthy via PING_HEALTH_DEGRADED channel
attribute when ACK not received within one interval; clear on late ACK
- Http2PingHandler: add global counters + isConnectionHealthDegraded() static helper
- Test uses auto-installed handler via global counters (no interceptor needed)
- Add design spec: azure-cosmos/docs/http2-ping-keepalive-spec.md
* Align PING timing with Rust SDK: interval=1s, timeout=2s, close on timeout
- Default PING interval: 5s -> 1s (aligned with Rust SDK's hyper config)
- Default PING timeout: 10s -> 2s (aligned with Rust SDK's hyper config)
- Worst-case dead connection detection: ~15s -> ~3s
- Minimum check frequency: 1000ms -> 500ms
- On PING timeout: close connection via ctx.close() (aligned with Rust SDK
where hyper kills connections on timeout -> shard eviction)
- PING_HEALTH_DEGRADED attribute set before close for diagnostic visibility
- Spec: clarify sharding is Rust-specific (hyper opens 1 conn per client);
reactor-netty natively pools multiple H2 connections, no shard layer needed
* Add PING interval and timeout tests, remove thin client dependency
- pingFramesSentAtConfiguredInterval: verifies PINGs fire at 1s interval
over a 10s idle period (expects >= 5 PINGs sent and ACKed), asserts
connection is reused (PING kept it alive)
- connectionClosedOnPingTimeout: uses iptables DROP to blackhole traffic,
verifies PING handler closes connection after 2s timeout, asserts
recovery request uses a different connection (Docker --cap-add=NET_ADMIN)
- Remove COSMOS.THINCLIENT_ENABLED from Maven profile — HTTP/2 PING
only requires COSMOS.HTTP2_ENABLED=true, no thin client needed
* Force single-connection H2 pool in tests for deterministic channel assertions
Set COSMOS.HTTP2_MAX/MIN_CONNECTION_POOL_SIZE=1 so that:
- pingFramesSentAtConfiguredInterval can assert SAME connection (pool size=1
means if the connection survives, recovery must use the same one)
- connectionClosedOnPingTimeout has deterministic initial channel ID
Docker-validated: Tests run: 2, Failures: 0, Errors: 0
- pingsSent=14, pingAcksReceived=14, SAME=true (pool maxc:1, minc:1)
- DIFFERENT_CONNECTION=true after iptables DROP + PING timeout
* Add Cosmos_Live_Test_Http2NetworkFault CI stage, delete unused Http2PingFrameCounterHandler
- Add live-http2-network-fault-platform-matrix.json (Linux, single profile)
- Add Cosmos_Live_Test_Http2NetworkFault stage in tests.yml:
MaxParallel=1, profile=manual-http-network-fault, timeout=30min
Passes COSMOS.HTTP2_ENABLED, HTTP2_PING_HEALTH_ENABLED, interval=1s, timeout=2s
- Delete Http2PingFrameCounterHandler — unused, tests use global counters
- Remove Http2PingFrameCounterHandler reference from spec
* Add consecutive failure threshold for defensive PING-based connection close
Instead of closing on first missed ACK, the handler now tracks consecutive
failures and only closes after reaching the threshold (default: 2).
Flow with threshold=2:
1. PING #1 sent, timeout elapses → consecutiveFailures=1, unblock next PING
2. PING #2 sent, timeout elapses → consecutiveFailures=2 >= threshold → close
On ACK received at any point → consecutiveFailures=0 (full reset).
Timing with defaults (interval=1s, timeout=2s, threshold=2):
- Worst-case detection: ~6s (2 rounds of interval+timeout)
- Tolerates 1 transient network blip without closing
New config: COSMOS.HTTP2_PING_FAILURE_THRESHOLD (default: 2)
* Align failure threshold with Rust SDK (5), test overrides to 2 for speed
- Default COSMOS.HTTP2_PING_FAILURE_THRESHOLD: 2 -> 5 (aligned with Rust SDK's
http2_consecutive_failure_threshold = 5)
- Test sets threshold=2 via system property for faster execution (~6s vs ~15s)
- Test wait time: 10s (sufficient for 2-round threshold)
- Docker validated: Tests run: 2, Failures: 0, Errors: 0
- Logs show 'attempt 1/2 — will retry' then 'closing connection' on 2nd failure
* Remove spec doc — design decisions documented in PR description
* Add sudo detection for CI VMs, PreSteps for iptables installation
- Http2PingKeepaliveTest: detect root vs non-root user, prefix iptables
commands with 'sudo' on CI VMs (no sudo when running as root in Docker)
- Add execCommand() helper with error stream logging
- tests.yml: add PreSteps to install iproute2 + iptables, load sch_netem
kernel module on CI Ubuntu VMs (following PR #48420 pattern)
- Matrix already restricts to ubuntu-only (no Windows jobs)
* Log levels: WARN only for connection close, INFO for retries and send failures
* Remove unnecessary isDebugEnabled guards — SLF4J parameterized logging handles this
* Single INFO log for connection close, everything else DEBUG
* Use Http2ConnectionConfig API for pool size, replace em dashes with --
- Tests use GatewayConnectionConfig.getHttp2ConnectionConfig()
.setMaxConnectionPoolSize(1).setMinConnectionPoolSize(1) instead of
system properties for pool size
- Replace all unicode em dashes with plain -- in logs and comments
* Fix readAllBytes() compile error -- use BufferedReader for JDK 8 compat
* Deep review cleanup: remove unused Http2ConnectionConfig import, replace all em dashes
* Replace global counters with channel attribute for handler ref, assert connection identity only
- Remove global AtomicInteger counters and ConcurrentHashMap registry
- Store handler ref as PING_HANDLER_REF channel attribute (lifecycle-bound)
- Add Http2PingHandler.getFrom(Channel) for programmatic access
- Test asserts connection identity (same/different) not PING counts
- Per-instance counters (getPingsSent/getPingAcksReceived) retained for
programmatic access via getFrom()
- Docker validated: Tests run: 2, Failures: 0, Errors: 0
* Add CHANGELOG entry for HTTP/2 PING keepalive
* Simplify CHANGELOG wording
* Remove unused PING_HANDLER_REF attribute and getFrom() method
* Address Copilot review comments
1. extractParentChannelId: throw AssertionError instead of returning
placeholder -- prevents false-positive test passes
2. Http2PingHandler: add thread-safety comment on write-future listener
(runs on same event loop as scheduled task)
3. ReactorNettyClient: move doOnConnected PING registration inside
if (isH2Enabled) branch -- avoids per-connect overhead for H1 clients
Docker validated: Tests run: 2, Failures: 0, Errors: 0
* Address xinlian12 review comments (4-8)
4. Test comment updated -- 10s idle < 60s maxIdleTime, test validates
PING frames flow (connection reuse) not eviction prevention
5. Remove PING_HEALTH_DEGRADED attribute + isConnectionHealthDegraded()
-- no consumer exists, attribute set right before ctx.close()
6. Use getJVMConfigAsBoolean() for isHttp2PingHealthEnabled() --
consistent with 7+ sibling boolean configs
7. Remove sch_netem/iproute2 from CI PreSteps -- tests only use iptables
8. Use pipeline name guard (get(HANDLER_NAME)==null) instead of
AttributeKey -- consistent with Http2ParentChannelExceptionHandler
Docker validated: Tests run: 2, Failures: 0, Errors: 0
* Remove test 1 (pingFramesSentAtConfiguredInterval) -- does not prove PINGs
Test 1 only proved connection reuse during a 10s idle period, but
maxIdleTime=60s means the connection survives regardless of PINGs.
Test 2 (connectionClosedOnPingTimeout) is the definitive proof that
PINGs flow: iptables DROP goes undetected without PING probing.
* Address review round 2: remove dead code, track lastReadNanos, bounds guard
1. Remove dead code: getPingsSent/getPingAcksReceived + AtomicInteger fields,
httpCfgAccessor() private method (zero consumers after test 1 removal)
2. Track lastReadNanos (inbound) instead of lastActivityNanos (read+write) --
PING fires when no response received for interval, detects half-dead
connections where writes succeed but reads never arrive
3. Add Math.max(1, ...) bounds guard on pingTimeoutSeconds to prevent
zero/negative timeout causing instant timeout on every PING
4. Writes no longer reset idle timer (only reads do)
5. pingsSent changed from AtomicInteger to plain int (event-loop confined)
* Add HTTP/2 PING resilience: probe PING and client-level kill switch
Send a probe PING 500ms after handler installation to detect servers
that reject PING frames (e.g., Cosmos DB Dedicated Gateway's old Mux
stack which responds with RST_STREAM(0, PROTOCOL_ERROR)).
If PROTOCOL_ERROR is received while a PING is outstanding, disable
HTTP/2 PING for all connections from this CosmosClient via a shared
AtomicBoolean kill switch. This prevents repeatedly killing connections
to PING-incompatible endpoints.
Changes:
- Http2PingHandler: probe PING in handlerAdded(), exceptionCaught()
override for PROTOCOL_ERROR detection, clientPingDisabled guard in
maybeSendPing() and installIfAbsent()
- ReactorNettyClient: AtomicBoolean http2PingDisabled field shared
with all Http2PingHandler instances
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Self-review: move http2CfgAccessor to static field, remove write() override and active-streams check, add thread-safety comment
* Remove probe PING, remove active-streams check, revert channelInactive kill-switch; PROTOCOL_ERROR via exceptionCaught is the reliable path
* Use lazy init pattern for http2CfgAccessor
* Fix parent channel resolution and customHeaderCleaner duplicate guard
- doOnConnected fires for child stream channels where
Http2MultiplexHandler is absent; resolve ch.parent() before
checking for the multiplexer and installing PingHandler.
- Guard customHeaderCleaner addAfter with null check to prevent
IllegalArgumentException: Duplicate handler name.
- Add Http2PingHandler DEBUG logging to test log4j2 config.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Remove PROTOCOL_ERROR logic and client-level kill switch
The SQLx Mux's Http2ServerProtocolHandler rejects PING with
PROTOCOL_ERROR, but SQLx doesn't negotiate H2 via ALPN so
clients never speak H2 to it. The Proxy (Standard GW,
ThinClient) uses nghttp2 which auto-ACKs PINGs correctly.
Since no live endpoint returns PROTOCOL_ERROR for PINGs, the
defensive kill-switch adds complexity without value. Removed:
- exceptionCaught() PROTOCOL_ERROR detection
- clientPingDisabled AtomicBoolean (shared kill switch)
- Kill-switch check in maybeSendPing()
- Related parameter from constructor and installIfAbsent()
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add late-bound BooleanSupplier gate for HTTP/2 PING handler installation
HttpClient gains a default no-op setHttp2PingScopeSupplier so all client implementations get the new contract for free. ReactorNettyClient stores the supplier in a volatile field (defaults to () -> false) and consults it inside the doOnConnected hook, so PING handler installation is decided per H2 connection at attach time rather than at client construction.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Permanently disable HTTP/2 PING handler on SharedGatewayHttpClient
Shared gateway clients are reused across multiple Cosmos clients with differing thin-client preferences, so a per-client supplier could be flipped on or off by any caller. The constructor pins the inner supplier to () -> false and the setter is a no-op so PING stays off regardless of caller intent. This keeps the rollout defensive: only dedicated thin-client clients can enable PING.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Expose hasThinClientReadLocations as a live AtomicBoolean reference
GlobalEndpointManager already tracks whether the most recent DatabaseAccount refresh returned thinClientReadableLocations. Surface a package-private accessor that returns the underlying AtomicBoolean so downstream code (e.g. the HTTP/2 PING gate) can capture a live reference without holding a strong ref to the manager itself.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Reserve UserAgentFeatureFlags bit 6 for Http2PingHealth
Adds Http2PingHealth = 1 << 6 to the user-agent feature flag bitmap so telemetry can attribute requests to clients that have the HTTP/2 PING health check actively installed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Scope HTTP/2 PING health check to thin-client-enabled clients
Wires the late-bound BooleanSupplier into the reactor HTTP client after globalEndpointManager.init(), using the live AtomicBoolean from GlobalEndpointManager so the gate flips whenever a DatabaseAccount refresh reveals (or removes) thinClientReadableLocations.
Adds isHttp2PingHealthEffectivelyEnabled() which combines: (a) HTTP/2 enabled, (b) PING properties configured, (c) DA exposes thin-client read locations, (d) the underlying HttpClient is not a SharedGatewayHttpClient. The same predicate gates the Http2PingHealth user-agent flag so telemetry stays in sync with runtime behavior.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Update CHANGELOG entry for HTTP/2 PING health check scoping
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Update Http2PingKeepaliveTest for thin-client scoping
Sets COSMOS.THINCLIENT_ENABLED=true alongside HTTP2_ENABLED in @BeforeClass so the DatabaseAccount refresh returns thinClientReadableLocations and the PING handler actually installs. Switches the iptables blackhole from port 443 on the gateway hostname to port 10250 with no host filter, since thin-client H2 traffic flows over port 10250 at regional hostnames - this is what the PING handler is monitoring.
Verified end-to-end against thin-client-mr-bs-ci: connection 8184be81 evicted after PING timeouts, recovery read used fresh connection 584f4d29.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Scope Http2NetworkFault CI stage to thin-client test account
Switches the Http2NetworkFault pipeline stage from the generic test endpoint to the thin-client KeyVault secrets (thinclient-test-endpoint / thinclient-test-key) and sets COSMOS.THINCLIENT_ENABLED=true so the PING health check is actually exercised in CI.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Tighten CHANGELOG entry for HTTP/2 PING keepalive
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Re-check PING scope per tick in Http2PingHandler
Thread the per-client BooleanSupplier scopeSupplier from ReactorNettyClient into Http2PingHandler so maybeSendPing() re-evaluates it on every interval. Closes the gap where an already-installed handler kept PINGing after the account's DatabaseAccount refresh dropped thinClientReadableLocations (e.g., thin-client flipped off at runtime). When the supplier returns false the next tick cancels the scheduled task; the connection stays in the pool until normal idle eviction.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review feedback: ACK payload match, dormant scope-off, JSON parentChannelId extractor, unit tests
- Http2PingHandler: rename lastReadNanos -> lastActivityNanos with clarifying comment;
match PING ACK by payload (RFC 9113 6.7); make scope-off / kill-switch dormant
instead of cancelling the timer so the handler auto-rearms when scope re-enables
- RxDocumentClientImpl: suppress Http2PingHealth UA flag when ping interval <= 0
(the handler's install gate clamps to 1s, but the UA flag should honor intent)
- Http2PingKeepaliveTest: @afterclass alwaysRun=true; replace brittle substring
parentChannelId extractor with JSON walk over gatewayStatisticsList;
remove dead extractHostFromEndpoint
- Add Http2PingHandlerTest: 6 EmbeddedChannel-based unit tests for ACK matching,
mismatched ACK ignored, idempotent install, ctor clamping, supplier null-safety
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Remove HTTP/2 PING thin-client scoping
PING is now installed on any H2 parent channel when the global gate (Configs.isHttp2PingHealthEnabled + H2 enabled + valid interval) is set, not just when the account exposes thin-client read locations.
- Drop scopeSupplier from Http2PingHandler + installIfAbsent
- Drop http2PingScopeSupplier / setHttp2PingScopeSupplier from HttpClient / ReactorNettyClient / SharedGatewayHttpClient
- Drop hasThinClientReadLocations / SharedGateway gates from isHttp2PingHealthEffectivelyEnabled
- Drop getHasThinClientReadLocationsRef from GlobalEndpointManager
- Update Http2PingHandlerTest (remove 2 obsolete tests, simplify rest)
Routing-side hasThinClientReadLocations() use (useThinClientStoreModel) remains untouched.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address PR review: late-ACK guard, execCommand failure, doc refresh
- Http2PingHandler.channelRead: also guard ACK on pingOutstandingSinceNanos != 0 so a late ACK arriving after timeout cleared it -- but before the next PING bumps pingsSent -- cannot mask consecutive failures on a degraded connection.
- Http2PingHandlerTest: add ackAfterTimeoutCleared_doesNotResetState regression test for the above.
- Http2PingKeepaliveTest.execCommand: throw on non-zero exit so a missing NET_ADMIN cap fails the test loudly instead of silently skipping network fault injection.
- Http2PingKeepaliveTest: refresh class Javadoc + beforeClass comment to match the current single test and drop stale 'PING is scoped to thin-client' wording (scoping was removed in d33d311).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Refactor Http2PingKeepaliveTest for two-pipeline runs (thin-client + Compute)
Drive transport selection from COSMOS.THINCLIENT_ENABLED sysprop set by Maven
profile rather than hardcoding port 10250 / thin-client in @BeforeClass.
- manual-http-network-fault profile: THINCLIENT_ENABLED=true (port 10250).
- New manual-http-network-fault-compute profile: THINCLIENT_ENABLED=false
(port 443, IP-scoped iptables since :443 is shared with other TLS traffic).
- Warm-up read now hard-asserts H2 negotiation (fails fast if the regional
gateway only speaks HTTP/1.1, so a silent fallback can't make the test
green for the wrong reason).
- extractEndpointHost helper resolves the regional gateway host from
gatewayStatisticsList for the IP-scoped variant.
- iptables cleanup simplified by capturing the exact -D rule string and
re-running it best-effort in finally.
Validated in Docker against thin-client-mr-bs-ci:
thinClient=true, port=10250, DIFFERENT_CONNECTION=true
thinClient=false, port=443, DIFFERENT_CONNECTION=true
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add Compute matrix entry for Http2NetworkFault stage
The live-http2-network-fault matrix previously had a single include block
running -Pmanual-http-network-fault (thin-client, port 10250). Add a sibling
entry running -Pmanual-http-network-fault-compute (Compute / GW V2, port 443).
CI will now spawn two parallel jobs from this matrix:
HttpNetworkFaultThinClient -> -Pmanual-http-network-fault
HttpNetworkFaultCompute -> -Pmanual-http-network-fault-compute
Both jobs run the same TestNG suite (manual-http-network-fault-testng.xml).
The SDK transport is chosen by the COSMOS.THINCLIENT_ENABLED sysprop the
profile sets, so Http2PingKeepaliveTest exercises both transports without
test-class duplication.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Consolidate HTTP/2 PING install gating into single helper
Add Http2PingHandler.isPingHealthEffectivelyEnabled(Http2ConnectionConfig)
that consolidates the three gates (kill-switch, ping interval > 0, H2
effectively enabled). ReactorNettyClient and RxDocumentClientImpl now
delegate to this single helper so the transport install site and the
user-agent feature flag cannot drift apart.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Use full Netty channel toString in Http2PingHandler log lines
Aligns channel identifier in PING handler log lines with Http2ParentChannelExceptionHandler, which logs ctx.channel() directly (Netty's [id: 0x..., L:..., R:...] form). The short hex via id().asShortText() loses local/remote address context that's useful when correlating PING events with other parent-channel exceptions and pool eviction logs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address xinlian12 review: env-var fallback, dormant state-clear, doOnConnected comment
- Configs: add COSMOS_HTTP2_PING_* env-var fallback for all 4 PING getters (parity with HTTP2_ENABLED-family)
- Http2PingHandler: clear pingOutstandingSinceNanos and consecutiveFailures on dormant return to prevent stale state after re-enable
- Http2PingHandler / ReactorNettyClient: correct doOnConnected comments — reactor-netty 1.2.x fires State.CONFIGURED on parent only; parent resolution + installIfAbsent guard are defensive
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* test(cosmos): unit tests for Http2PingHandler kill-switch clear and channelInactive cleanup
Addresses kushagraThapar review (M1, M2) on PR #49095.
- killSwitchOff_clearsOutstandingPingState: when COSMOS.HTTP2_PING_HEALTH_ENABLED
flips to false, the next maybeSendPing tick clears pingOutstandingSinceNanos and
consecutiveFailures so the handler returns to a dormant state without re-installing.
- channelInactive_cancelsPingTask: closing the channel cancels the scheduled PING
task and nulls the field so no further work is dispatched after handler removal.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci(cosmos): scope HTTP/2 PING flags per profile, fix Compute matrix leg
The Cosmos_live_test_http2_network_fault stage's AdditionalArgs hardcoded
-DCOSMOS.THINCLIENT_ENABLED=true (plus HTTP2_ENABLED / HTTP2_PING_*) at the
stage level, so the value was applied to every matrix entry. The CLI -D
propagates into the failsafe-forked JVM and overrides the per-profile
<systemPropertyVariables> in azure-cosmos-tests/pom.xml, meaning the
'manual-http-network-fault-compute' leg silently ran with thin-client
enabled and hit port 10250 instead of exercising the regional gateway on
:443 -- making it a duplicate of the thin-client leg.
Move the HTTP/2 PING flags (HTTP2_ENABLED, HTTP2_PING_HEALTH_ENABLED,
HTTP2_PING_INTERVAL_IN_SECONDS, HTTP2_PING_TIMEOUT_IN_SECONDS) into both
the 'manual-http-network-fault' and 'manual-http-network-fault-compute'
pom profiles (universal to the test class), and drop them plus
THINCLIENT_ENABLED from tests.yml so the per-profile values control
routing. The stage AdditionalArgs now only carries the truly stage-level
account host / key / leak-detection flag.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(cosmos): count PING write failures toward close threshold
A failed writeAndFlush listener previously cleared pingOutstandingSinceNanos
but did NOT increment consecutiveFailures. If a channel was stuck in a state
where writes consistently fail while channel.isActive() stays true (H2 codec
rejecting frames, stalled flow-control, queued ClosedChannelException not yet
propagated), the handler would retry forever without ever closing.
Treat a write failure as a failed health probe: increment consecutiveFailures
and trigger ctx.close() + cancelPingTask() at the threshold, mirroring the
timeout path. Practical risk is low (channelInactive usually fires shortly),
but this closes a real invariant gap in the state machine.
Added Http2PingHandlerTest.writeFailure_incrementsConsecutiveFailuresAndClosesAtThreshold
that injects a failing outbound handler and verifies the close path fires after
threshold consecutive failures.
Addresses PR #49095 review comment from xinlian12.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(cosmos-tests): close throw-away CosmosAsyncClient created by TestUtils
TestUtils.createDummyQueryFeedOperationState(...,AsyncDocumentClient) builds an inner CosmosAsyncClient via CosmosClientBuilder and previously never closed it. CosmosNettyLeakDetectorFactory then flagged the resulting active client at @afterclass with messages like 'CosmosClient [N] leaked' on tests such as GatewayReadConsistencyStrategySpyWireTest.
Track every inner client created by that overload in a static list inside TestUtils and expose closeDummyClients(). Drain it via a new @afterclass hook in TestSuiteBase and an explicit call in GatewayReadConsistencyStrategySpyWireTest.afterClass() (the only relevant caller that does not extend TestSuiteBase).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(cosmos-tests): drain TestUtils dummy clients from leak detector
The previous attempt added an inherited @afterclass method on
TestSuiteBase, which broke CosmosNettyLeakDetectorFactory's per-class
counter (it decrements once per @afterclass invocation but is sized for
a single one). The detector fired prematurely on the first @afterclass
method, before the derived test's afterClass() had closed its own
CosmosAsyncClient, producing false leak failures across many tests
(e.g. WebExceptionRetryPolicyE2ETests).
Revert the TestSuiteBase and GatewayReadConsistencyStrategySpyWireTest
hooks and instead call TestUtils.closeDummyClients() from
CosmosNettyLeakDetectorFactory.onAfterClassCore() right before
snapshotting active clients. This adds no new @afterclass method and
covers every test class through the existing listener.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(cosmos-tests): isolate dummy client cleanup to GatewayReadConsistencyStrategySpyWireTest
GatewayReadConsistencyStrategySpyWireTest is the only test that triggers
the TestUtils dummy client leak. Restore the explicit drain in its
existing afterClass() (no new @afterclass added, so the leak detector's
per-class counter is unaffected) and revert the broader hook in
CosmosNettyLeakDetectorFactory.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Isolate dummy-client leak fix to GatewayReadConsistencyStrategySpyWireTest
Revert the DUMMY_CLIENTS tracking and closeDummyClients() scaffolding from
TestUtils.java and route the two call sites in
GatewayReadConsistencyStrategySpyWireTest.executeQueryAndCapture and
executeReadManyAndCapture through the existing safe overload of
createDummyQueryFeedOperationState that accepts the outer CosmosAsyncClient
(the same pattern executeChangeFeedAndCapture already uses). No throw-away
inner CosmosAsyncClient is created, so there is nothing to leak and no
shared-class scaffolding to drag along.
Net change vs main is two lines in a single test file. TestUtils.java is
byte-for-byte identical to main.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Address xinlian12 PR comments: TOCTOU guard, doc clarity, ACK-timeout unit test
- ReactorNettyClient: wrap customHeaderCleaner addAfter in try/catch to mirror
the sibling Http2ParentChannelExceptionHandler install pattern. Guards against
a TOCTOU race when the channel is closed between the null-check and addAfter.
- Configs (HTTP2_PING_INTERVAL_IN_SECONDS): replace contradictory `Detected
within 3s` comment. Clarify that one missed PING round is ~3s (interval + 2s
timeout) and connection close requires HTTP2_PING_FAILURE_THRESHOLD
consecutive failures (~15s worst case).
- Http2PingHandlerTest: add ackTimeout_incrementsConsecutiveFailuresAndClosesAtThreshold
to mirror the existing write-failure coverage. Update class Javadoc to move
the ACK-timeout and write-failure paths into the unit-test bullet list.
Verified: azure-cosmos-tests Http2PingHandlerTest 9/9 pass via -Punit verify.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* docs(cosmos): clarify HTTP/2 PING threshold semantics and expand CHANGELOG
Addresses kushagraThapar PR review 4432203224:
- H2: Configs.java threshold comment no longer claims false alignment
with Rust SDK's http2_consecutive_failure_threshold (which is a
per-HTTP-request shard-health knob, not per-PING-ACK). Now explicitly
notes that peer HTTP/2 stacks (Hyper / .NET SocketsHttpHandler /
Go net/http) typically close on the first PING-ACK timeout, and that
Java's threshold of 5 is intentionally more tolerant.
- L8 + H5: CHANGELOG entry for PR 49095 now calls out:
* default-ON state
* all four COSMOS.HTTP2_PING_* tunable knob names
* the defensive concurrent-install hardening on the H2 parent
pipeline (customHeaderCleaner / Http2ParentChannelExceptionHandler)
No code semantics change; comments and changelog only.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* H3: route HTTP/2 PING-driven channel close as retryable transport failure
Addresses kushagraThapar H3 review on PR #49095 - when Http2PingHandler
closes a connection after consecutive PING ACK failures, surface the close
as a retryable transport failure so ClientRetryPolicy retries in the SAME
region without marking the regional gateway endpoint unavailable.
Production:
- Http2PingHandler fires typed Http2PingTimeoutChannelClosedException
through the pipeline before closing, so reactor-netty's response Mono
surfaces the typed cause on any in-flight HTTP read.
- New Http2PingCloseRewrapHandler installed alongside the PING handler
rewraps NIO ClosedChannelException as Http2PingTimeoutChannelClosedException
when fired after a PING-driven close (handles the post-close race where
the exception travels the quiescent path).
- ClientRetryPolicy gains a shouldRetryOnGatewayTimeout H3 branch that
detects Http2PingTimeoutChannelClosedException (and substatus 10006) and
routes to in-region retry without endpoint mark-down.
- RxGatewayStoreModel stamps substatus 10006
(HTTP2_PING_TIMEOUT_CHANNEL_CLOSED) on the resulting CosmosException so
diagnostics and retry policy identify the cause without unwrapping.
- WebExceptionUtility.isWebExceptionRetriable returns true for the typed
exception.
- HttpConstants exposes the new substatus.
Tests:
- Http2PingKeepaliveTest restructured for Option B (in-flight read with
iptables DROP on thin-client port). Verified in Docker end-to-end:
Tests run: 1, Failures: 0; SAME_REGION=true, DIFFERENT_CONNECTION=true.
- New ClientRetryPolicyHttp2PingCloseTest covers the retry-policy branch
in isolation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* H3: align doc/comment/CHANGELOG wording with PING-send-failure close path
The H3 close path fires on EITHER consecutive PING-ACK timeouts OR consecutive PING-send failures (the latter was added later for stalled H2 codec/flow-control). Several comments, the typed exception message, the Javadoc of Http2PingCloseRewrapHandler, the install-site comment in ReactorNettyClient, and the CHANGELOG entry still described only the ACK-timeout half. Update all six sites to mention both paths.
Also corrects the stale 'per-request captor in ReactorNettyClient.onErrorMap' reference in Http2PingHandler -- the actual consumer of PING_TIMEOUT_CLOSED is Http2PingCloseRewrapHandler.channelInactive on each H2 child stream.
Pure documentation cleanup: no behavioral change.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* test(cosmos): account for Http2PingHealth feature flag in user-agent helpers
The HTTP/2 PING keepalive change added a new Http2PingHealth bit (1 << 6) to UserAgentFeatureFlags. When both H2 and PING are enabled (the default), the runtime emits |F50 instead of the previous |F10, breaking userAgent assertions in CosmosDiagnosticsTest and UserAgentSuffixTest.
Both helpers (generateHttp2OptedInUserAgentIfRequired, validateUserAgentSuffix) now compute the hex suffix dynamically from the runtime conditions in RxDocumentClientImpl.addUserAgentSuffix and Http2PingHandler.isPingHealthEffectivelyEnabled: when H2 is enabled, the Http2 bit is set; when the PING kill-switch is on AND interval > 0, the Http2PingHealth bit is OR'd in. Integer.toHexString(featureValue).toUpperCase(Locale.ROOT) matches the runtime UserAgentContainer.setFeatureEnabledFlagsAsSuffix output (|F10 for H2 only, |F50 for H2+PING).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(cosmos): address xinlian12 PR review on retry comment + PING getter robustness
Addresses two of xinlian12's latest review comments on PR #49095:
1. ClientRetryPolicy: clarify that the gateway-timeout retry path may cycle through preferred locations on multi-region accounts (via routeToLocation(failoverRetryCount, true) inside shouldRetryOnGatewayTimeout), not always stay on the same endpoint. The previous wording was accurate only for single-preferred-region accounts. The key invariant (no markEndpointUnavailableFor* call) remains correctly described.
2. Configs: defend the three HTTP/2 PING getters (getHttp2PingIntervalInSeconds, getHttp2PingTimeoutInSeconds, getHttp2PingFailureThreshold) against NumberFormatException from malformed user input. These getters execute inside the doOnConnected lambda in ReactorNettyClient, so a bad value like COSMOS_HTTP2_PING_INTERVAL_IN_SECONDS=1s would fail every new H2 connection. Mirrors the defensive pattern already used by getConnectionAcquireTimeout / getThinClientConnectionTimeoutInMs: try/catch with WARN log and fallback to default. Extracted to a private parseIntConfigOrDefault helper to avoid 3x duplication.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Update CHANGELOG.md
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>1 parent cbeae1e commit c46043e
24 files changed
Lines changed: 1698 additions & 18 deletions
File tree
- sdk/cosmos
- azure-cosmos-tests
- src/test
- java/com/azure/cosmos
- faultinjection
- implementation
- http
- rx
- resources
- azure-cosmos
- src/main/java/com/azure/cosmos/implementation
- directconnectivity
- http
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
857 | 857 | | |
858 | 858 | | |
859 | 859 | | |
| 860 | + | |
| 861 | + | |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
| 867 | + | |
| 868 | + | |
| 869 | + | |
| 870 | + | |
| 871 | + | |
| 872 | + | |
| 873 | + | |
| 874 | + | |
| 875 | + | |
| 876 | + | |
| 877 | + | |
| 878 | + | |
| 879 | + | |
| 880 | + | |
| 881 | + | |
| 882 | + | |
| 883 | + | |
| 884 | + | |
| 885 | + | |
| 886 | + | |
| 887 | + | |
| 888 | + | |
| 889 | + | |
| 890 | + | |
| 891 | + | |
| 892 | + | |
| 893 | + | |
| 894 | + | |
| 895 | + | |
| 896 | + | |
| 897 | + | |
| 898 | + | |
| 899 | + | |
| 900 | + | |
| 901 | + | |
| 902 | + | |
| 903 | + | |
| 904 | + | |
| 905 | + | |
| 906 | + | |
| 907 | + | |
| 908 | + | |
| 909 | + | |
| 910 | + | |
| 911 | + | |
| 912 | + | |
| 913 | + | |
| 914 | + | |
| 915 | + | |
| 916 | + | |
| 917 | + | |
| 918 | + | |
| 919 | + | |
| 920 | + | |
860 | 921 | | |
861 | 922 | | |
862 | 923 | | |
| |||
Lines changed: 11 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
21 | 21 | | |
22 | 22 | | |
23 | 23 | | |
| 24 | + | |
24 | 25 | | |
25 | 26 | | |
26 | 27 | | |
| |||
1978 | 1979 | | |
1979 | 1980 | | |
1980 | 1981 | | |
| 1982 | + | |
| 1983 | + | |
| 1984 | + | |
| 1985 | + | |
| 1986 | + | |
1981 | 1987 | | |
1982 | | - | |
| 1988 | + | |
| 1989 | + | |
| 1990 | + | |
| 1991 | + | |
| 1992 | + | |
1983 | 1993 | | |
1984 | 1994 | | |
1985 | 1995 | | |
| |||
Lines changed: 11 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| 10 | + | |
10 | 11 | | |
11 | 12 | | |
12 | 13 | | |
| |||
15 | 16 | | |
16 | 17 | | |
17 | 18 | | |
| 19 | + | |
| 20 | + | |
18 | 21 | | |
19 | 22 | | |
20 | 23 | | |
| |||
115 | 118 | | |
116 | 119 | | |
117 | 120 | | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
118 | 124 | | |
119 | | - | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
120 | 130 | | |
121 | 131 | | |
122 | 132 | | |
| |||
0 commit comments