execution/binary: make duplicate series error message deterministic#711
Merged
fpetkovski merged 1 commit intoMay 29, 2026
Conversation
The errManyToManyMatch.Error() method previously formatted the two conflicting series in whichever order they happened to arrive on the "original" / "duplicate" slots. That order is driven by upstream StepVector iteration which is itself fed by Go map iteration (see execution/scan/subquery.go), so the rendered error string was non-deterministic across runs and across parallel engine workers. Downstream consumers (e.g. Cortex's distributor / querier) compare error strings to classify failures, and the flapping ordering caused flaky tests and inconsistent user-visible error messages -- same root cause as the issue tracked in cortexproject/cortex#7546. Fix: lexicographically sort the two label-set strings before formatting the message. This is a one-line if-swap, no new imports required, and preserves the existing message format so existing matchers continue to work. A regression test exercises both right-hand and left-hand side cases as well as adversarial label values containing the same framing punctuation ('[', ']', ',', '{', '}', '"') and asserts byte-identical output regardless of which series came first. Refs: cortexproject/cortex#7546 Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
Open
10 tasks
sandy2008
added a commit
to sandy2008/cortex-1
that referenced
this pull request
May 30, 2026
…-series quirk Addresses review feedback on cortexproject#7550. Replaces the generic bracket-list canonicalizer (errBracketListRE / canonicalizeErrMsg / splitTopLevelCommas) with a single-purpose, anchored duplicateSeriesRE + canonicalizeDuplicateSeriesErr that only normalizes the non-deterministic two-element list in the PromQL "found duplicate series for the match group" error. The regex matches the match group and both entries as whole quote-aware labelsets, so a "," inside a labelset is never mistaken for the entry separator; the two-element shape is hard-coded so the comparator falls back to a strict (failing) compare if upstream reformats. Also restores the *promv1.Error.Detail comparison that the original cmp.Equal performed, and expands TestSameErrorClass to 19 subtests (multi-label reorder, label values containing brackets/commas/braces/escaped quotes, nested brackets, same list with different surrounding text, three-element and changed-separator loud-break guards, and an unrelated-bracket-list anchoring guard). Remove this workaround once prometheus/prometheus#18810 and thanos-io/promql-engine#711 are vendored. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The
errManyToManyMatch.Error()method previously formatted the two conflicting series in whichever order they happened to arrive on theoriginal/duplicateslots. That order is driven by upstreamStepVectoriteration which is itself fed by Go map iteration (seeexecution/scan/subquery.go), so the rendered error string is non-deterministic across runs and across parallel engine workers.Same root cause as the upstream Prometheus issue/PR (prometheus/prometheus#18809 / prometheus/prometheus#18810). Tracked downstream as a flaky test in Cortex: cortexproject/cortex#7546 (tactical workaround in cortexproject/cortex#7550 — removable once both upstream fixes land and are vendored).
Fix: lexicographically sort the two label-set strings before formatting the message. One-line if-swap, no new imports, preserves the existing message format so existing matchers keep working.
A regression test in the new
execution/binary/utils_test.goexercises both right-hand and left-hand side cases as well as adversarial label values containing the same framing punctuation ([,],,,{,},"), and asserts byte-identical output regardless of which series came first. Verified by temporarily reverting the swap: all subtests FAIL with the unsorted ordering; restoring the patch makes them PASS.Fixes #710.