Print chars from Unicode 15.1-17.0 #408
|
Let me know if you're interested in merging this and I'll update the branch at that point instead of keeping it up to date in the interim. I've been running this locally with some of the Large Type Pieces from the Symbols for Legacy Computing Supplement in Unicode 16 and it's been working well for me. I also benchmarked this change with synthetic log output with some Unicode 15.1+ chars mixed in. The impact looks negligible for the majority of users, since we only consult this table when we hit Unicode 15.1+ chars.
Go 1.26 (current stable) still ships Unicode 15.0.0, so unicode.IsPrint() does not recognize blocks added in later Unicode releases. Whitelist them explicitly so they render rather than show as '?'. Go's master branch has bumped to Unicode 17.0, so we can delete this table when that's released.

15.1: CJK Ext I.
16.0: Todhri, Garay, Tulu-Tigalari, Sunuwar, Egyptian Hieroglyphs Ext-A, Gurung Khema, Kirat Rai, Symbols for Legacy Computing Supplement, Ol Onal.
17.0: Sidetic, Sharada Supplement, Tolong Siki, Beria Erfe, Tangut Components Supplement, Misc Symbols Supplement, Tai Yo, CJK Ext J.
|
Hi! What kind of input are you looking at that is using these ranges? Trying to determine the urgency. Go 1.27.0 (which is where I assume current master is heading) is expected in August: https://tip.golang.org/doc/go1.27 |
|
It's not urgent in the sense that I don't have any downstream users, and I'm running this patch locally anyway. So no pressure on my account to get this merged. I won't be offended if you close in favour of waiting for Go 1.27.0 - that seems like the prudent move here. For context, I'm using moor as my pager for jujutsu and it's been great! I vibe coded a program which transforms the jj output and, among other things, replaces the standard box drawing graph edge characters with the Large Type Pieces from the Symbols for Legacy Computing Supplement block because I like the way they look better.
[Image: Default graph rendering]
[Image: Using Large Type Pieces]
|
|
Jujutsu logs are why I filed #406. I have a shell fn that always pipes the output of jj to moor so that the one-line output actually stays as one line instead of wrapping. Previously I had
|


