Description
ProcessorClientImpl::start() in source/extensions/filters/common/ext_proc/grpc_client_impl.h:220-235 returns nullptr without invoking callbacks.onGrpcError() when getOrCreateRawAsyncClientWithHashKey() returns an error (missing or invalid cluster discovered at runtime, raw async-client construction failure, etc):
auto client_or_error =
client_manager_.getOrCreateRawAsyncClientWithHashKey(config_with_hash_key, scope_, true);
if (!client_or_error.status().ok()) {
ENVOY_LOG_PERIODIC_MISC(error, std::chrono::seconds(10), "Creating raw asyc client failed {}",
client_or_error.status());
return nullptr; // no callback fired
}
The downstream openStream() predicate in source/extensions/filters/http/ext_proc/ext_proc.cc:786-793:
ExternalProcessorStreamPtr stream_object = grpc_client->start(*this, ...);
if (processing_complete_ || stream_object == nullptr) {
// Stream failed while starting and either onGrpcError or onGrpcClose was already called.
return sent_immediate_response_ ? StreamOpenState::Error : StreamOpenState::IgnoreError;
}
…treats stream_object == nullptr as IgnoreError and continues request processing. Since start() did not invoke onGrpcError() / onGrpcClose(), processing_complete_ is still false on the error path — but the predicate widens to accept that case after #39388. The filter therefore silently admits the request rather than running the operator-configured failure_mode_allow policy.
Behavior chronology
First affected release: v1.35.0. Reachable on 1.35.x / 1.36.x / 1.37.x / 1.38.x / current main.
Proposed fix
Call callbacks.onGrpcError(Grpc::Status::Unavailable, "ext_proc client creation failed: <status>") before return nullptr;. The ext_proc filter then takes the operator-configured failure mode (fail-open or fail-close per failure_mode_allow) instead of unconditionally failing open. Pull request follows.
Background
This was first reported privately as GHSA-jp7j-jw2q-9494. @yanjunxiang-google asked for the fix to land in the open as a non-security bug ("This is not an issue which can be exploited by a client or an ext_proc server. We can fix this in open."), so filing here per their redirect.
Description
ProcessorClientImpl::start()insource/extensions/filters/common/ext_proc/grpc_client_impl.h:220-235returnsnullptrwithout invokingcallbacks.onGrpcError()whengetOrCreateRawAsyncClientWithHashKey()returns an error (missing or invalid cluster discovered at runtime, raw async-client construction failure, etc):The downstream
openStream()predicate insource/extensions/filters/http/ext_proc/ext_proc.cc:786-793:…treats
stream_object == nullptrasIgnoreErrorand continues request processing. Sincestart()did not invokeonGrpcError()/onGrpcClose(),processing_complete_is still false on the error path — but the predicate widens to accept that case after #39388. The filter therefore silently admits the request rather than running the operator-configuredfailure_mode_allowpolicy.Behavior chronology
if (processing_complete_) { ... return IgnoreError; }early-return afterstart(), with the invariant that the unhappy paths instart()fireonGrpcError/onGrpcClosesynchronously and setprocessing_complete_ = true.ASSERT(stream_object == nullptr)inside that block and refactored.throwfrom ext_proc data plane on client-start failure; widened the predicate toif (processing_complete_ || stream_object == nullptr). This was correct for the explicit-throw removal, but thestart()path that returnsnullptrwithout firingonGrpcError/onGrpcClosesynchronously now becomes a silent admission instead of an aborted request.First affected release: v1.35.0. Reachable on 1.35.x / 1.36.x / 1.37.x / 1.38.x / current main.
Proposed fix
Call
callbacks.onGrpcError(Grpc::Status::Unavailable, "ext_proc client creation failed: <status>")beforereturn nullptr;. The ext_proc filter then takes the operator-configured failure mode (fail-open or fail-close perfailure_mode_allow) instead of unconditionally failing open. Pull request follows.Background
This was first reported privately as
GHSA-jp7j-jw2q-9494. @yanjunxiang-google asked for the fix to land in the open as a non-security bug ("This is not an issue which can be exploited by a client or an ext_proc server. We can fix this in open."), so filing here per their redirect.