ICU-23448 Fix Invalid-enum-value crash in udat_open#4056
Conversation
44d3718 to
4acb852
Compare
|
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Validate timeStyle and dateStyle in udat_open to prevent passing invalid styles to C++ APIs, which causes Undefined Behavior (load of invalid enum value) flagged by UBSAN. Also update date_format_fuzzer.cpp to avoid passing invalid enums to C++ APIs directly, and instead test udat_open validation with raw bytes. Add a unit test in cdattst.c to verify. Bug: https://issues.oss-fuzz.com/issues/477633331
These uninitialized members were causing crashes in tests under UBSAN.
Prevent invalid styles from causing integer overflow and other issues. Added unit test TestInvalidStyles.
4acb852 to
d2f04f4
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
richgillam
left a comment
There was a problem hiding this comment.
LOKTM, but a couple things didn't completely make sense to me...
| static_assert(sizeof(dateStyle) == sizeof(dateStyleInt), "sizes must match"); | ||
| static_assert(sizeof(timeStyle) == sizeof(timeStyleInt), "sizes must match"); | ||
| std::memcpy(&dateStyleInt, &dateStyle, sizeof(dateStyle)); | ||
| std::memcpy(&timeStyleInt, &timeStyle, sizeof(timeStyle)); |
There was a problem hiding this comment.
A couple dumb questions: Why do you need to convert it into an int and back here? Is this necessary to check if the input is a valid enum value?
And would it make sense to just treat invalid enum values are equivalent to kNone rather than just failing on construction?
There was a problem hiding this comment.
- Converting to
intviamemcpyis necessary to avoid UBSAN (Undefined Behavior Sanitizer) warnings. In C++, casting an arbitrary invalid integer to an enum and then using it (e.g. in comparison) is undefined behavior.memcpyallows us to inspect the raw bits safely without triggering UB. - Returning
nullptr(failing on construction) is safer than treating invalid values askNone. Treating invalid values askNonemight hide bugs in user code and lead to unexpected formatting behavior instead of a clear failure.
There was a problem hiding this comment.
It's incredibly ugly, but if we have to do it that way, I guess we have to do it that way. I do think it'd help to have some comments in there explaining why we have to do it that way, though.
There was a problem hiding this comment.
memcpy() is unnecessary. Assigning an enum to an int is safe and conventional.
Also, the static_asserts+memcpy could be a portability problem: There is no guarantee that an enum is always the same size as an int32_t. (More likely same as an int.)
Please just use
int32_t dateStyleInt = dateStyle;
int32_t timeStyleInt = timeStyle;
There was a problem hiding this comment.
Done. Replaced memcpy and static_assert with simple assignment. Verified that it compiles and passes tests under UBSAN/ASAN (even with -O3).
There was a problem hiding this comment.
Reverted back to memcpy (with static_assert size checks) because simple assignment actually triggered UBSan errors in CI (e.g. runtime error: load of value 100000, which is not a valid value for type 'EStyle'). In C++, loading an invalid value as an enum is undefined behavior, and UBSan catches this during the assignment. memcpy avoids loading it as enum type.
|
|
||
| #include "cstring.h" | ||
| #include "windtfmt.h" | ||
| #include <cstring> |
There was a problem hiding this comment.
We usually include
- standard C headers
- standard C++ headers (like this one)
- unicode/utypes.h
- other public ICU headers
- internal ICU headers
There was a problem hiding this comment.
Done. Removed the unnecessary <cstring> include and moved <typeinfo> to the top of the file.
| static_assert(sizeof(dateStyle) == sizeof(dateStyleInt), "sizes must match"); | ||
| static_assert(sizeof(timeStyle) == sizeof(timeStyleInt), "sizes must match"); | ||
| std::memcpy(&dateStyleInt, &dateStyle, sizeof(dateStyle)); | ||
| std::memcpy(&timeStyleInt, &timeStyle, sizeof(timeStyle)); |
There was a problem hiding this comment.
memcpy() is unnecessary. Assigning an enum to an int is safe and conventional.
Also, the static_asserts+memcpy could be a portability problem: There is no guarantee that an enum is always the same size as an int32_t. (More likely same as an int.)
Please just use
int32_t dateStyleInt = dateStyle;
int32_t timeStyleInt = timeStyle;
| static UBool isValidStyle(int32_t style) { | ||
| return (style >= DateFormat::kFull && style <= DateFormat::kShort) || | ||
| (style >= DateFormat::kFullRelative && style <= DateFormat::kShortRelative); | ||
| } | ||
|
|
||
| static UBool isValidTimeStyle(int32_t style) { | ||
| return (style >= DateFormat::kFull && style <= DateFormat::kShort); | ||
| } |
There was a problem hiding this comment.
Please put these into an anonymous namespace and remove static.
There was a problem hiding this comment.
Done. Moved the helper functions into an anonymous namespace and removed static in both datefmt.cpp and udat.cpp.
| dateStyleInt = dateStyleInt + kDateOffset; | ||
| } | ||
| return create(timeStyle, dateStyle, aLocale); | ||
| return create(static_cast<EStyle>(timeStyleInt), static_cast<EStyle>(dateStyleInt), aLocale); |
There was a problem hiding this comment.
the timeStyle is unchanged, so please just use it, rather than casting back from the int
There was a problem hiding this comment.
Done. Used timeStyle directly in the create call.
…, include ordering, use unchanged param)
|
PTAL |
richgillam
left a comment
There was a problem hiding this comment.
Your changes and explanations are good enough for me; I hope they'll also be good enough for Markus.
… to memcpy with size check)
|
"int32_t dateStyleInt = dateStyle;" trigger ubsan while the value in dateStyle is not a valid enum, change back to memcpy PTAL |
Making this kind of code "undefined behavior" makes C++ nearly impossible to work with. |
Checklist
Validate timeStyle and dateStyle in udat_open to prevent passing invalid styles to C++ APIs, which causes Undefined Behavior (load of invalid enum value) flagged by UBSAN.
Also update date_format_fuzzer.cpp to avoid passing invalid enums to C++ APIs directly, and instead test udat_open validation with raw bytes.
Add a unit test in cdattst.c to verify.
Bug: https://issues.oss-fuzz.com/issues/477633331