[Feature Request] Add timed_out shard count to shards response section for search timeout observability #21087

@sekubos

Is your feature request related to a problem? Please describe

When the timeout request parameter is used with allow_partial_search_results=true, the search response signals a timeout only through the top-level timed_out: true boolean. Every shard, including those that timed out, is counted as "successful" in the _shards section, so there is no way to tell whether 1 shard timed out or 99% of them did.

This makes it impossible to assess result completeness. A response where 1 out of 40 shards timed out is fundamentally different from one where 39 out of 40 timed out, but both return the same timed_out: true with "successful": 40.

Example current response:

```json
{
  "timed_out": true,
  "_shards": {
    "total": 40,
    "successful": 40,
    "skipped": 0,
    "failed": 0
  }
}
```

Describe the solution you'd like

Add a timed_out count to the _shards response object, alongside the existing total, successful, skipped, and failed fields.
Example:

```json
{
  "_shards": {
    "total": 40,
    "successful": 40,
    "skipped": 0,
    "failed": 0,
    "timed_out": 12
  }
}
```

Timed-out shards would remain counted as successful for backward compatibility. The new field provides explicit visibility into how many shards returned truncated results.
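
To illustrate how a client might consume the proposed field, here is a minimal sketch. It assumes the response has already been parsed into a dict and that `_shards.timed_out` exists as proposed in this issue (it is not part of the current response). The helper name `timed_out_ratio` is hypothetical.

```python
from typing import Optional


def timed_out_ratio(response: dict) -> Optional[float]:
    """Return the fraction of shards that timed out, or None if unknown.

    Assumes the *proposed* `_shards.timed_out` count. Current responses
    do not include it, so this returns None for them.
    """
    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    timed_out = shards.get("timed_out")  # proposed field, may be absent
    if timed_out is None or total == 0:
        return None
    return timed_out / total


# Sample response using the proposed shape from this issue.
proposed = {
    "timed_out": True,
    "_shards": {
        "total": 40,
        "successful": 40,
        "skipped": 0,
        "failed": 0,
        "timed_out": 12,
    },
}
print(timed_out_ratio(proposed))  # 0.3
```

Returning None when the field is absent lets the same client code run against clusters that do not yet emit the count.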

Related component

Search

Describe alternatives you've considered

  1. profile=true parameter: Returns per-shard timing breakdowns, allowing you to infer which shards exceeded the timeout by comparing query time against the timeout value. However, profiling adds significant overhead and is not viable for production traffic.

  2. terminate_after instead of timeout: Uses a document-count limit per shard instead of a time limit. Returns a terminated_early: true flag, which provides clearer truncation visibility. However, it bounds work (document count) rather than time, so it doesn't address the same use case. A shard scanning 1000 documents could take 5ms or 500ms depending on query complexity.
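
For reference, the profile-based workaround in alternative 1 could look roughly like the sketch below. It assumes the profiled response has the `profile.shards[].searches[].query[].time_in_nanos` layout and sums the top-level query times per shard. This is only a heuristic, since profiled query time is not guaranteed to match the time the timeout check actually measures. The function name and the sample shard ids are made up for illustration.

```python
def shards_over_timeout(profile_response: dict, timeout_ms: float) -> list:
    """List ids of shards whose profiled query time reached timeout_ms.

    Heuristic only: sums the top-level `time_in_nanos` of each profiled
    query tree on a shard and compares the sum with the timeout.
    """
    timeout_nanos = timeout_ms * 1_000_000
    slow = []
    for shard in profile_response.get("profile", {}).get("shards", []):
        query_nanos = sum(
            query.get("time_in_nanos", 0)
            for search in shard.get("searches", [])
            for query in search.get("query", [])
        )
        if query_nanos >= timeout_nanos:
            slow.append(shard.get("id"))
    return slow


# Made-up sample data: one fast shard (5 ms) and one slow shard (120 ms).
sample = {
    "profile": {
        "shards": [
            {"id": "[node-1][logs][0]",
             "searches": [{"query": [{"time_in_nanos": 5_000_000}]}]},
            {"id": "[node-1][logs][1]",
             "searches": [{"query": [{"time_in_nanos": 120_000_000}]}]},
        ]
    }
}
print(shards_over_timeout(sample, timeout_ms=100))  # ['[node-1][logs][1]']
```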

Additional context

Use-cases:
Applications using shard-level timeouts to bound query latency need to make runtime decisions about result quality — whether to display results with a "partial results" warning, retry the query, or fall back to cached results. Without knowing the ratio of timed-out shards, there is no way to distinguish between nearly-complete and nearly-empty result sets.

This is particularly important for:

  • E-commerce search (showing partial product results with confidence indicators)
  • Log analytics (knowing whether aggregation results are statistically meaningful)
  • Real-time dashboards (deciding whether to display data or show a "data incomplete" state)
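
The runtime decision described above could be sketched as follows, assuming the proposed timed_out shard count is available. The function name, the action labels, and both thresholds are hypothetical and would need tuning per application.

```python
def choose_action(timed_out: int, total: int,
                  warn_up_to: float = 0.1, retry_up_to: float = 0.5) -> str:
    """Pick a handling strategy from the share of timed-out shards.

    Thresholds are illustrative assumptions, not recommended values.
    """
    if total <= 0:
        return "fall_back_to_cache"  # nothing usable to judge from
    ratio = timed_out / total
    if ratio == 0:
        return "display"
    if ratio <= warn_up_to:
        return "display_with_warning"  # nearly complete results
    if ratio <= retry_up_to:
        return "retry"
    return "fall_back_to_cache"  # nearly empty results


print(choose_action(1, 40))   # display_with_warning
print(choose_action(12, 40))  # retry
print(choose_action(39, 40))  # fall_back_to_cache
```

With today's response, the 1-of-40 and 39-of-40 cases are indistinguishable, so a policy like this cannot be implemented without profiling.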

Labels

Search (Search query, autocomplete ...etc), enhancement (Enhancement or improvement to existing feature or request)
