Commit 724a36d

Document the case-mapping lookup tables
Add a doc-comment to the top of `case_mapping.rs`.
1 parent 95365cc commit 724a36d

1 file changed

Lines changed: 13 additions & 0 deletions

src/tools/unicode-table-generator/src/case_mapping.rs

@@ -1,3 +1,16 @@
+//! Generates lookup tables (LUTs) for "case-mapping": mapping a `char` to its
+//! uppercase or lowercase equivalent(s).
+//!
+//! The first table, `LOWERCASE_TABLE` (respectively `UPPERCASE_TABLE`), is a
+//! sorted array of `(char, u32)` pairs. The case-mapping for a character is
+//! found by binary search.
+//!
+//! If a character expands to multiple characters upon case-mapping, the value
+//! in the LUT is actually an index into a second LUT, `LOWERCASE_TABLE_MULTI`
+//! (respectively `UPPERCASE_TABLE_MULTI`). This is signalled by the 22nd bit
+//! of the value being set; since all Unicode code points are less than
+//! 0x110000, this bit is free for us to use.
+
 use std::char;
 use std::collections::BTreeMap;
 use std::fmt::Write;
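
The lookup that the doc-comment describes can be sketched roughly as follows. This is a minimal illustration and not the generated code itself: the table contents are a four-entry toy subset, the `to_upper` signature and the `'\0'`-padded `[char; 3]` layout of the multi table are assumptions, and `INDEX_MASK` is a hypothetical name whose value (`1 << 22`) is one reading of "the 22nd bit". Any bit at position 21 or above would be free in the same way, since code points fit in 21 bits.

```rust
/// Hypothetical flag bit: when set, the table value is an index into the
/// multi-character table instead of a code point. Assumed to be `1 << 22`.
const INDEX_MASK: u32 = 1 << 22;

/// Toy stand-in for `UPPERCASE_TABLE`: `(char, u32)` pairs sorted by key.
static UPPERCASE_TABLE: &[(char, u32)] = &[
    ('a', 'A' as u32),
    ('b', 'B' as u32),
    ('\u{00DF}', INDEX_MASK),     // U+00DF expands to "SS" (multi index 0)
    ('\u{0149}', INDEX_MASK | 1), // U+0149 expands to U+02BC 'N' (multi index 1)
];

/// Toy stand-in for `UPPERCASE_TABLE_MULTI`; unused slots are padded with '\0'
/// (an assumed layout).
static UPPERCASE_TABLE_MULTI: &[[char; 3]] = &[
    ['S', 'S', '\0'],
    ['\u{02BC}', 'N', '\0'],
];

/// Looks up the uppercase mapping of `c`, padding the result with '\0'.
fn to_upper(c: char) -> [char; 3] {
    // Binary search on the sorted keys of the first table.
    match UPPERCASE_TABLE.binary_search_by(|&(key, _)| key.cmp(&c)) {
        // No entry: the character maps to itself.
        Err(_) => [c, '\0', '\0'],
        Ok(i) => {
            let value = UPPERCASE_TABLE[i].1;
            if value & INDEX_MASK != 0 {
                // Flag set: the low bits are an index into the second table.
                UPPERCASE_TABLE_MULTI[(value & (INDEX_MASK - 1)) as usize]
            } else {
                // Flag clear: the value is itself a code point.
                [char::from_u32(value).unwrap_or(c), '\0', '\0']
            }
        }
    }
}
```

With these toy tables, `to_upper('a')` takes the single-character path, `to_upper('\u{00DF}')` follows the flag into the multi table, and a character with no entry such as `'Z'` comes back unchanged.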
