Print chars from Unicode 15.1-17.0 #408

Open
jjt wants to merge 1 commit into walles:master from jjt:jjt/unicode-17
jjt commented May 8, 2026

Go 1.26 (current stable) still ships Unicode 15.0.0, so unicode.IsPrint() does not recognize blocks added in later Unicode releases. Whitelist them explicitly so they render rather than show as '?'. Go's master branch has bumped to Unicode 17.0, so we can delete this table when that's released.

15.1: CJK Ext I.

16.0: Todhri, Garay, Tulu-Tigalari, Sunuwar, Egyptian Hieroglyphs Ext-A, Gurung Khema, Kirat Rai, Symbols for Legacy Computing Supplement, Ol Onal.

17.0: Sidetic, Sharada Supplement, Tolong Siki, Beria Erfe, Tangut Components Supplement, Misc Symbols Supplement, Tai Yo, CJK Ext J.
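
For readers unfamiliar with the approach, here is a minimal sketch of what such an allow-list could look like. It is not the code from this PR: the names `extraPrintable` and `isPrintable` are hypothetical, and the two ranges are whole-block approximations picked for illustration, so they also cover code points that are still unassigned. `unicode.IsPrint` is consulted first, so the extra table is only reached for runes the standard library rejects.

```go
package main

import (
	"fmt"
	"unicode"
)

// extraPrintable is a hypothetical allow-list of ranges to treat as printable
// even when the standard library's Unicode tables predate them.
// Assumption: the ranges below are whole-block approximations for illustration;
// a real table would list only assigned code points. R32 entries must be
// sorted and non-overlapping.
var extraPrintable = &unicode.RangeTable{
	R32: []unicode.Range32{
		{Lo: 0x1CC00, Hi: 0x1CEBF, Stride: 1}, // Symbols for Legacy Computing Supplement (16.0)
		{Lo: 0x2EBF0, Hi: 0x2EE5F, Stride: 1}, // CJK Unified Ideographs Extension I (15.1)
	},
}

// isPrintable tries the standard library first and falls back to the
// allow-list only when unicode.IsPrint says no.
func isPrintable(r rune) bool {
	if unicode.IsPrint(r) {
		return true
	}
	return unicode.Is(extraPrintable, r)
}

func main() {
	for _, r := range []rune{'a', 0x07, 0x1CC00, 0x2EBF0} {
		fmt.Printf("U+%04X printable=%v\n", r, isPrintable(r))
	}
}
```

Because the fallback only adds to what `unicode.IsPrint` accepts, the table becomes redundant, and can be deleted, once the Go toolchain ships tables that cover these blocks.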

jjt force-pushed the jjt/unicode-17 branch from 09d234b to 2e913bf on May 8, 2026 02:51

Author

jjt commented May 9, 2026

Let me know if you're interested in merging this and I'll update the branch at that point instead of keeping it up to date in the interim.

I've been running this locally with some of the large type pieces from the Symbols for Legacy Computing Supplement in Unicode 16 and it's been working well for me.

I also benchmarked this change using synthetic log output with some Unicode 15.1+ chars mixed in. The impact looks negligible for the majority of users, since we only consult this table when we hit Unicode 15.1+ chars.

jjt force-pushed the jjt/unicode-17 branch from 2e913bf to cc912cc on May 10, 2026 06:03

Owner

walles commented May 15, 2026

Hi!

What kind of input are you looking at that is using these ranges? Trying to determine the urgency.

Go 1.27.0 (which is where I assume current master is heading) is expected in August: https://tip.golang.org/doc/go1.27

Author

jjt commented May 19, 2026

It's not urgent in the sense that I don't have any downstream users, and am running this patch locally anyway. So no pressure on my account to get this merged. I won't be offended if you close in favour of waiting for golang 1.27.0 - that seems like the prudent move here.

For context, I'm using moor as my pager for jujutsu and it's been great! I vibe coded a program which transforms the jj output and, among other things, replaces the standard box drawing graph edge characters with the Large Type Pieces from the Symbols for Legacy Computing Supplement block, because I like the way they look better.

[image: default graph rendering]

[image: graph rendered using Large Type Pieces]

Author

jjt commented May 19, 2026

Jujutsu logs are why I filed #406. I have a shell function that always pipes the output of jj to moor so that one-line output actually stays on one line instead of wrapping. Previously I had tput rmam; jj ...; tput smam to manage this, but it was clunky. Piping through moor --quit-if-one-screen=height sidesteps it cleanly for my use.

