[compiler-rt][profile] Accept Unicode profile names on Windows #202335

Open
woodruffw wants to merge 1 commit into llvm:main from woodruffw-forks:ww/unicode-profile-names

Conversation

@woodruffw

Contributor

This addresses a small FIXME: when opening a profile file on Windows, we now use CreateFileW instead of CreateFileA.

This fixes a relatively niche scenario in which the profile file's name contains Unicode code points that CreateFileA can't handle.

I've also added a small test for lprofOpenFileEx that exercises the change.

@llvmorg-github-actions (bot) added the compiler-rt and PGO (Profile Guided Optimizations) labels on Jun 8, 2026
@llvmorg-github-actions


@llvm/pr-subscribers-pgo

Author: William Woodruff (woodruffw)

Full diff: https://github.com/llvm/llvm-project/pull/202335.diff

2 Files Affected:

  • (modified) compiler-rt/lib/profile/InstrProfilingUtil.c (+18-2)
  • (added) compiler-rt/test/profile/Windows/instrprof-file-ex-unicode.c (+22)
diff --git a/compiler-rt/lib/profile/InstrProfilingUtil.c b/compiler-rt/lib/profile/InstrProfilingUtil.c
index a9d9df813764b..21005bc5ff87f 100644
--- a/compiler-rt/lib/profile/InstrProfilingUtil.c
+++ b/compiler-rt/lib/profile/InstrProfilingUtil.c
@@ -237,10 +237,26 @@ COMPILER_RT_VISIBILITY FILE *lprofOpenFileEx(const char *ProfileName) {
 
   f = fdopen(fd, "r+b");
 #elif defined(_WIN32)
-  // FIXME: Use the wide variants to handle Unicode filenames.
-  HANDLE h = CreateFileA(ProfileName, GENERIC_READ | GENERIC_WRITE,
+  int WideLength = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
+                                       ProfileName, -1, NULL, 0);
+  if (WideLength == 0)
+    return NULL;
+
+  WCHAR *WideProfileName =
+      (WCHAR *)malloc((size_t)WideLength * sizeof(*WideProfileName));
+  if (!WideProfileName)
+    return NULL;
+
+  if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, ProfileName, -1,
+                          WideProfileName, WideLength) == 0) {
+    free(WideProfileName);
+    return NULL;
+  }
+
+  HANDLE h = CreateFileW(WideProfileName, GENERIC_READ | GENERIC_WRITE,
                          FILE_SHARE_READ | FILE_SHARE_WRITE, 0, OPEN_ALWAYS,
                          FILE_ATTRIBUTE_NORMAL, 0);
+  free(WideProfileName);
   if (h == INVALID_HANDLE_VALUE)
     return NULL;
 
diff --git a/compiler-rt/test/profile/Windows/instrprof-file-ex-unicode.c b/compiler-rt/test/profile/Windows/instrprof-file-ex-unicode.c
new file mode 100644
index 0000000000000..ebd3b3bc94488
--- /dev/null
+++ b/compiler-rt/test/profile/Windows/instrprof-file-ex-unicode.c
@@ -0,0 +1,22 @@
+// RUN: %clang_profgen -o %t.exe %s
+// RUN: rm -rf %t.dir
+// RUN: mkdir %t.dir
+// RUN: cd %t.dir
+// RUN: %run %t.exe
+
+#include <stdio.h>
+#include <windows.h>
+
+extern FILE *lprofOpenFileEx(const char *);
+
+int main(void) {
+  const char *Filename = "profile-\xe6\x97\xa5.dump";
+  FILE *File = lprofOpenFileEx(Filename);
+  if (!File)
+    return 1;
+
+  fputs("profile data", File);
+  fclose(File);
+
+  return GetFileAttributesW(L"profile-\u65e5.dump") == INVALID_FILE_ATTRIBUTES;
+}

fputs("profile data", File);
fclose(File);

return GetFileAttributesW(L"profile-\u65e5.dump") == INVALID_FILE_ATTRIBUTES;

Contributor Author

NB: This tests that the file is reachable by its UTF-16 filename, after being opened/created as UTF-8.
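
To see why the two spellings in the test name the same file, here is a minimal portable sketch that decodes the UTF-8 bytes `\xe6\x97\xa5` by hand and checks that they yield U+65E5, the code point written as `\u65e5` in the wide literal. The helpers `decode_utf8` and `filename_bytes_match` are hypothetical names used only for illustration; they are not part of compiler-rt, and the patch itself relies on `MultiByteToWideChar` for the conversion.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of compiler-rt): decodes the single UTF-8
// sequence starting at s and stores its code point in *cp. Returns the
// number of bytes consumed, or 0 if the sequence is malformed.
size_t decode_utf8(const unsigned char *s, uint32_t *cp) {
  size_t len;
  uint32_t value, min;

  if (s[0] < 0x80) { // Single-byte (ASCII) sequence.
    *cp = s[0];
    return 1;
  } else if ((s[0] & 0xE0) == 0xC0) {
    len = 2; value = s[0] & 0x1F; min = 0x80;
  } else if ((s[0] & 0xF0) == 0xE0) {
    len = 3; value = s[0] & 0x0F; min = 0x800;
  } else if ((s[0] & 0xF8) == 0xF0) {
    len = 4; value = s[0] & 0x07; min = 0x10000;
  } else {
    return 0; // Stray continuation byte or invalid lead byte.
  }

  for (size_t i = 1; i < len; ++i) {
    // Every trailing byte must look like 10xxxxxx. A NUL terminator fails
    // this check, so a truncated string is rejected before reading past it.
    if ((s[i] & 0xC0) != 0x80)
      return 0;
    value = (value << 6) | (uint32_t)(s[i] & 0x3F);
  }

  // Reject overlong encodings, UTF-16 surrogates, and out-of-range values.
  if (value < min || value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF))
    return 0;

  *cp = value;
  return len;
}

// Returns 1 if the three bytes used in the test's narrow filename decode to
// U+65E5, the code point used in the test's wide filename.
int filename_bytes_match(void) {
  static const unsigned char Bytes[] = "\xe6\x97\xa5";
  uint32_t cp = 0;
  return decode_utf8(Bytes, &cp) == 3 && cp == 0x65E5;
}
```

U+65E5 lies in the Basic Multilingual Plane, so it occupies a single UTF-16 code unit and no surrogate pair is involved in the wide filename.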

@woodruffw force-pushed the ww/unicode-profile-names branch 2 times, most recently from 96ab6d4 to 171eb7b, on June 8, 2026 15:12
This addresses a small FIXME: when opening a profile file
on Windows, we now use `CreateFileW` instead of `CreateFileA`.

This fixes a relatively niche scenario in which the profile
file's name contains Unicode code points that `CreateFileA` can't handle.

I've also added a small test for `lprofOpenFileEx`
that exercises the change.

Signed-off-by: William Woodruff <william@yossarian.net>
@woodruffw force-pushed the ww/unicode-profile-names branch from 171eb7b to 72038b1 on June 9, 2026 14:10