sort: Fix inconsistent sort ordering under i18n-collator with equal sorting keys. #12013

Open
ksgk1 wants to merge 8 commits into uutils:main from ksgk1:sort-fix-inconsistent-sort-behaviour-i18n-collator

Conversation

@ksgk1

@ksgk1 ksgk1 commented Apr 26, 2026

In #11980 it was noted that the sorting behaviour was inconsistent. The cause was lines with equal sorting keys and no fallback comparison to break the tie.

I implemented a fallback for equal keys that fixes this issue.
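
To illustrate the idea (this is not the code from the PR), here is a minimal sketch of a comparison that falls back to the raw bytes when two keys compare equal. `toy_key`, `compare_key_only`, `compare_with_fallback` and `sort_lines` are hypothetical names, and `toy_key` is only a stand-in for a real collation key: it simply drops `_` so that `01` and `0_1` tie. The resulting order is therefore not meant to match what GNU sort prints for a given locale; the sketch only shows that a tie-breaker makes the output independent of the input order.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a locale collation key: it drops '_' so that
/// "01" and "0_1" produce equal keys. A real collator is far more involved.
fn toy_key(line: &str) -> String {
    line.chars().filter(|c| *c != '_').collect()
}

/// Compare by key only; lines with equal keys are reported as equal.
fn compare_key_only(a: &str, b: &str) -> Ordering {
    toy_key(a).cmp(&toy_key(b))
}

/// Compare by key first, then fall back to the raw bytes on a tie.
/// The byte-wise tie-breaker is an assumption made for this sketch.
fn compare_with_fallback(a: &str, b: &str) -> Ordering {
    compare_key_only(a, b).then_with(|| a.as_bytes().cmp(b.as_bytes()))
}

/// Sort a copy of the input lines with the given comparison function.
fn sort_lines(lines: &[&str], cmp: fn(&str, &str) -> Ordering) -> Vec<String> {
    let mut sorted: Vec<String> = lines.iter().map(|s| s.to_string()).collect();
    // `sort_by` is stable, so tied lines keep their input order.
    sorted.sort_by(|a, b| cmp(a.as_str(), b.as_str()));
    sorted
}

fn main() {
    let first = ["0_1", "01", "02"];
    let second = ["01", "0_1", "02"];

    // Key only: the tied lines keep their input order, so the two results differ.
    println!("{:?}", sort_lines(&first, compare_key_only));
    println!("{:?}", sort_lines(&second, compare_key_only));

    // With the fallback: both input orders give the same result.
    println!("{:?}", sort_lines(&first, compare_with_fallback));
    println!("{:?}", sort_lines(&second, compare_with_fallback));
}
```

With the key-only comparison, the order of tied lines depends on the order in which the input files are read; with the fallback, both inputs sort identically.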

@github-actions

github-actions Bot commented Apr 26, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/cp/link-heap is now being skipped but was previously passing.
Congrats! The gnu test tests/tail/pipe-f is now passing!

@codspeed-hq

codspeed-hq Bot commented Apr 26, 2026

Merging this PR will improve performance by 31.27%

⚡ 1 improved benchmark
✅ 310 untouched benchmarks
⏩ 46 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| Memory | cp_recursive_deep_tree[(120, 4)] | 699.2 KB | 532.7 KB | +31.27% |

Comparing ksgk1:sort-fix-inconsistent-sort-behaviour-i18n-collator (573eaa0) with main (5aade31)

Footnotes

  1. 46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@cakebaker cakebaker linked an issue Apr 27, 2026 that may be closed by this pull request
@cakebaker
Contributor

Can you please add a test to tests/by-util/test_sort.rs to ensure we don't regress in the future? Thanks.

@ksgk1
Author

ksgk1 commented Apr 27, 2026

I updated my code and added several test cases.

Comment thread tests/by-util/test_sort.rs Outdated
Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
Comment on lines +2967 to +2973
let expected_output = "01\n01\n0_1\n0_1\n02\n02\n";
new_ucmd!()
    .env("LC_ALL", "en_US.UTF-8")
    .arg("fix_i18n_collate_inconsistency_1.txt")
    .arg("fix_i18n_collate_inconsistency_2.txt")
    .succeeds()
    .stdout_is(expected_output);
Contributor

I think this test is not correct as GNU sort returns a different result when running the command manually:

$ LC_ALL=en_US.UTF-8 sort tests/fixtures/sort/fix_i18n_collate_inconsistency_1.txt tests/fixtures/sort/fix_i18n_collate_inconsistency_2.txt 
0_1
0_1
01
01
02
02

Comment thread tests/by-util/test_sort.rs
@ksgk1
Author

ksgk1 commented Apr 28, 2026

Thanks a lot for the valuable feedback. I will fix those issues as soon as possible and verify my implementation against GNU sort's output.

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
Development

Successfully merging this pull request may close these issues.

Inconsistent sort behavior

2 participants