Skip to content

Strict weak ordering revisited#186

Closed
shewitt-au wants to merge 16 commits into
WerWolv:masterfrom
shewitt-au:sort-checker
Closed

Strict weak ordering revisited#186
shewitt-au wants to merge 16 commits into
WerWolv:masterfrom
shewitt-au:sort-checker

Conversation

@shewitt-au

@shewitt-au shewitt-au commented Jul 30, 2025

Copy link
Copy Markdown
Contributor

Overview

Some time ago I fixed a crash related to sorting in the pattern drawer. The cause was a sort predicate that did not implement strict weak ordering correctly. I have revisited this and found (and fixed) another violation (transitivity of incomparability).

The code I used to find this is included in this PR (disabled).

The issue is here.

A PR without the sort checking code.

A bad sort predicate. So what?

I made this repo to demonstrate some of the potential consequences of incorrect strict weak ordering. I've made it easy. It's simple and can be built with CMake. I've mainly ran it on mingw64 (it exhibits the most severe consequences, and the ImHex I use is built with it). It uses a deliberatly poorly coded sort predicate which attempts to simulate the bug I referred to above. It shuffles a vector; sorts it with a bad predicate; check for gargabe; repeats. Strap in and check it out. Here's an example run I'll kick off right now.

$ ./bin/main.exe
Reference vector:
  1   2   3   4   5   6   7   8   9  10
 11  12  13  14  15  16  17  18  19  20
 21  22  23  24  25  26  27  28  29  30
 31  32  33  34  35  36  37  38  39  40
 41  42  43  44  45  46  47  48  49  50
 51  52  53  54  55  56  57  58  59  60
 61  62  63  64  65  66  67  68  69  70
 71  72  73  74  75  76  77  78  79  80
 81  82  83  84  85  86  87  88  89  90
 91  92  93  94  95  96  97  98  99 100

Times shuffled: 15

vector before sorting:
 82  55  88  32   9  54  38  89  71  21
 87 100  27  92  28  31  83  26  13  91
  6  17   8  60  16  68  84  24  62  75
 64  96  14   1  93  58  74  51  95  29
  3  80  86  44  50  65  10  43  23  70
 25   4  40  19   2  97  36  94  41  49
 18  85  81  78  45  52  35  99  79  61
 57  73  42  98   5  56  90  69  39  72
 59  63  46  34  11  67   7  33  12  20
 47  53  37  66  77  15  48  76  30  22

vector after sorting:
  0   0   0   0   0   0 597167803 268445148   1   2
  3   4   5   6   7   8   9  10  11  12
 13  14  15  16  17  18  19  20  21  22
 23  24  25  26  27  28  29  30  31  32
 33  34  35  36  37  38  39  40  41  42
 43  44  45  46  47  48  49  50  51  52
 53  54  55  56  57  58  59  65  60  61
 62  63  64  66  67  68  69  70  71  72
 73  74  75  76  77  78  79  90  80  81
 82  83  84  85  86  87  88  89 100  95

The problem

Incomparability for us just means equality. If x<y is false and y<x is false (x and y are incomparable) we say x==y.

In the code path were I found this issue (sorting by the "Value" column) the code in std::strong_ordering Token::Literal::operator<=> does the comparison. If lhs and rhs are the same type the implementation is pretty straight forward. It handles some cases where lhs and rhs and different numeric types.

Then we get this

[](auto, auto) -> std::strong_ordering {
    return std::strong_ordering::equal;
}

If lhs and rhs are different types we consider them equal. This is the most sensibe value we could unconditionally return, but in our case it wont do the trick.

Say we have a u128 x, a std::string y and another u128 z.
The code the above gives x==y (different types).
Similarly y==z.

Transitivity of incomparability (equality) mandates that in this case x==z.
But x and z are the same type, u128. The comparison is numerical. If x and z have different values then x==z is NOT true as is required. We have violated the strict weak ordering.

The solution in the PR

I have altered the code quoted above to look like this:

 []<typename TL, typename TR>(TL, TR) -> std::strong_ordering {
     return hlp::variant_type_index_v<TL, LiteralVariantType> <=> hlp::variant_type_index_v<TR, LiteralVariantType>;
    }

hlp::variant_type_index_v is a utility I wrote the get the index of a type in a std::variant.

The std::variant we use looks like this:

std::variant<char, bool, u128, i128, double, std::string, std::shared_ptr<ptrn::Pattern>>;

So char < bool, bool < std::shared_ptr<ptrn::Pattern> etc... The comparison is based on the type index in the std::variant.

In the example I gave above now x<y and y>z, so transitivity does not apply.

Conslusion

A sort predicate that does not implement a strict weak ordering correctly is dangerous. Elements can be deleted. Element that never existed in the sequence to be sorted are sometimes conjured. In the example program I linked above I get heap corruption. MingW64 seems particularly vulnarable. I'm not judging here. Perhaps when the predicate is well formed this is a strength. It's not a strength when it's malformed however.

@shewitt-au shewitt-au changed the title Sort checker Strict weak ordering revisited Jul 30, 2025
@shewitt-au shewitt-au marked this pull request as ready for review July 30, 2025 19:31
@shewitt-au

shewitt-au commented Jul 31, 2025

Copy link
Copy Markdown
Contributor Author

@WerWolv and @paxcut

Could you guys have a look at this? Your opinions would be apreciated.

Thanks.

@WerWolv

WerWolv commented Jul 31, 2025

Copy link
Copy Markdown
Owner

Thank you for looking into this again. I'd like to merge the code fixing the issue but as of right now there seems to be a lot of debugging / error checking code in there that I'm not sure we really need. I'd argue once it's fixed properly it's unlikely to easily break again. It just seems kinda over engineered to have a 200 line change to fix a bug that, from what I can tell, is a one-liner

If we were to keep all the debugging code, I think it would be good to just have it enabled at all times in debug builds to ensure correctness, if it's gated behind a macro it will get forgotten.

On another note, please try to follow the general code style / quality of the rest of the code. For example, don't use using or std::cout or snake_case while the rest of the code base uses camelCase.

@shewitt-au

shewitt-au commented Jul 31, 2025

Copy link
Copy Markdown
Contributor Author

@WerWolv

There is a link in this PR to a PR without debugging code.

It's here.

@WerWolv

WerWolv commented Jul 31, 2025

Copy link
Copy Markdown
Owner

Cool thanks! Could you fix the code style issues I raised before there? Then it should be good for merging

@Nemoumbra

Copy link
Copy Markdown

For example, don't use using

using LiteralVariantType = std::variant<char, bool, u128, i128, double, std::string, std::shared_ptr<ptrn::Pattern>>;
One typedef coming up 🙃

@WerWolv

WerWolv commented Jul 31, 2025

Copy link
Copy Markdown
Owner

For example, don't use using

using LiteralVariantType = std::variant<char, bool, u128, i128, double, std::string, std::shared_ptr<ptrn::Pattern>>; One typedef coming up 🙃

That use of using is fine, I meant the using std::cout; for example

@shewitt-au

Copy link
Copy Markdown
Contributor Author

There should not be any couts in the PR I linked (no checking code). I will recheck however.

@shewitt-au

shewitt-au commented Jul 31, 2025

Copy link
Copy Markdown
Contributor Author

Sorry man. It's been a long time since I've programmed in C++. C++11 era, when it was new. That's one or the reason why I find ImHex interesting (apart from the highlighting, that was a revelation).

I had to Google "snake_case".

I think I've addressed the style issues.

I am unsure on the capitalisation of variantTypeIndex_v.

Note that these changes are in this PR, not this one.

@WerWolv

WerWolv commented Jul 31, 2025

Copy link
Copy Markdown
Owner

No problem, thanks for the work!
Since it's a constexpr value, I generally use PascalCase so VariantTypeIndexV I guess for this case. Or maybe better would be to have the type be named VariantTypeIndexImpl and the constexpr value VariantTypeIndex

@shewitt-au

Copy link
Copy Markdown
Contributor Author

Like this? Again, in the other leaner PR.

@shewitt-au shewitt-au closed this Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants