Is your feature request related to a problem? Please describe
When using the timeout request parameter with allow_partial_search_results=true, the search response only includes a top-level timed_out: true boolean. All shards — including those that timed out — are reported as "successful" in the _shards section. There is no way to determine how many shards actually timed out (1 shard vs 99% of them).
This makes it impossible to assess result completeness. A response where 1 out of 40 shards timed out is fundamentally different from one where 39 out of 40 timed out, but both return the same timed_out: true with "successful": 40.
Example current response:
```json
{
  "timed_out": true,
  "_shards": {
    "total": 40,
    "successful": 40,
    "skipped": 0,
    "failed": 0
  }
}
```
Describe the solution you'd like
Add a timed_out count to the _shards response object, alongside the existing total, successful, skipped, and failed fields.
Example:
```json
"_shards": {
  "total": 40,
  "successful": 40,
  "skipped": 0,
  "failed": 0,
  "timed_out": 12
}
```
Timed-out shards would remain counted as successful for backward compatibility. The new field provides explicit visibility into how many shards returned truncated results.
Related component
Search
Describe alternatives you've considered
- profile=true parameter: Returns per-shard timing breakdowns, allowing you to infer which shards exceeded the timeout by comparing query time against the timeout value. However, profiling adds significant overhead and is not viable for production traffic.
- terminate_after instead of timeout: Uses a document-count limit per shard instead of a time limit. Returns a terminated_early: true flag, which provides clearer truncation visibility. However, it bounds work (document count) rather than time, so it doesn't address the same use case. A shard scanning 1000 documents could take 5 ms or 500 ms depending on query complexity.
Additional context
Use-cases:
Applications using shard-level timeouts to bound query latency need to make runtime decisions about result quality — whether to display results with a "partial results" warning, retry the query, or fall back to cached results. Without knowing the ratio of timed-out shards, there is no way to distinguish between nearly-complete and nearly-empty result sets.
This is particularly important for:
- E-commerce search (showing partial product results with confidence indicators)
- Log analytics (knowing whether aggregation results are statistically meaningful)
- Real-time dashboards (deciding whether to display data or show a "data incomplete" state)
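To illustrate the kind of runtime decision described above, here is a rough client-side sketch in Python. It assumes the proposed `_shards.timed_out` count exists (it does not today), and the function names, action labels, and the 0.5 threshold are all hypothetical choices for illustration, not part of any existing API.

```python
from typing import Any, Mapping, Optional


def timed_out_ratio(response: Mapping[str, Any]) -> Optional[float]:
    """Fraction of shards that timed out, or None if it cannot be determined."""
    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    # ASSUMPTION: "timed_out" is the field proposed in this issue; current
    # responses do not include it, so this lookup returns None today.
    timed_out = shards.get("timed_out")
    if timed_out is None or total <= 0:
        return None
    return timed_out / total


def choose_action(response: Mapping[str, Any], max_partial_ratio: float = 0.5) -> str:
    """Pick a hypothetical handling strategy based on result completeness."""
    if not response.get("timed_out", False):
        return "display"  # no shard hit the timeout
    ratio = timed_out_ratio(response)
    if ratio is None:
        return "warn_unknown"  # today's behavior: completeness is unknowable
    if ratio <= max_partial_ratio:
        return "warn"  # mostly complete: show with a "partial results" notice
    return "retry_or_fallback"  # mostly truncated: retry or serve cached results


if __name__ == "__main__":
    nearly_complete = {"timed_out": True, "_shards": {"total": 40, "successful": 40, "timed_out": 1}}
    nearly_empty = {"timed_out": True, "_shards": {"total": 40, "successful": 40, "timed_out": 39}}
    print(choose_action(nearly_complete))
    print(choose_action(nearly_empty))
```

With the current response format, both example responses would fall into the `warn_unknown` branch, which is exactly the ambiguity this request aims to remove.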