pub struct CollationReorderingV1<'data> {
    pub min_high_no_reorder: u32,
    pub reorder_table: ZeroVec<'data, u8>,
    pub reorder_ranges: ZeroVec<'data, u32>,
}

Script reordering data

🚧 This code is considered unstable; it may change at any time, in breaking or non-breaking ways, including in SemVer minor releases. While the serde representation of data structs is guaranteed to be stable, their Rust representation might not be. Use with caution.


§min_high_no_reorder: u32

Limit of last reordered range. 0 if no reordering or no split bytes.

Comment from ICU4C’s collationsettings.h

§reorder_table: ZeroVec<'data, u8>

256-byte table for reordering permutation of primary lead bytes; NULL if no reordering. A 0 entry at a non-zero index means that the primary lead byte is “split” (there are different offsets for primaries that share that lead byte) and the reordering offset must be determined via the reorderRanges.

Comment from ICU4C’s collationsettings.h

§reorder_ranges: ZeroVec<'data, u32>

Primary-weight ranges for script reordering, to be used by reorder(p) for split-reordered primary lead bytes.

Each entry is a (limit, offset) pair. The upper 16 bits of the entry are the upper 16 bits of the exclusive primary limit of a range. Primaries between the previous limit and this one have their lead bytes modified by the signed offset (-0xff..+0xff) stored in the lower 16 bits.
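
The packing described above can be illustrated with a small decoding sketch. The function name `decode_range_entry` is hypothetical and not part of this crate, and the sketch assumes the signed offset is stored as a 16-bit two's-complement value, which the comment does not state explicitly.

```rust
/// Hypothetical helper (not part of the crate): splits one `reorder_ranges`
/// entry into its (limit, offset) halves.
///
/// The returned limit keeps only the upper 16 bits of the exclusive primary
/// limit; the lower 16 bits are zeroed.
/// Assumption: the offset is stored as 16-bit two's complement.
fn decode_range_entry(entry: u32) -> (u32, i16) {
    let limit = entry & 0xFFFF_0000;
    let offset = (entry & 0xFFFF) as u16 as i16;
    (limit, offset)
}
```

For example, under that assumption the entry `0x8000_FFFE` would decode to a limit of `0x8000_0000` and an offset of -2.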

CollationData::makeReorderRanges() writes a full list where the first range (at least for terminators and separators) has a 0 offset. The last range has a non-zero offset. minHighNoReorder is set to the limit of that last range.

In the settings object, the initial ranges before the first split lead byte are omitted for efficiency; they are handled by reorder(p) via the reorderTable. If there are no split-reordered lead bytes, then no ranges are needed.

Comment from ICU4C’s collationsettings.h; names refer to ICU4C.
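
Taken together, the three fields suggest a lookup along the following lines. This is a minimal sketch based only on the comments above, not the crate's or ICU4C's actual implementation: the function name `reorder_primary` is hypothetical, plain slices stand in for `ZeroVec`, an empty table is treated as "no reordering", and the offset is assumed to be stored as 16-bit two's complement.

```rust
/// Hypothetical sketch (not the crate's API): reorders a 32-bit primary `p`.
fn reorder_primary(p: u32, min_high_no_reorder: u32, table: &[u8], ranges: &[u32]) -> u32 {
    // Assumption: anything other than a full 256-byte table means no reordering.
    if table.len() != 256 {
        return p;
    }
    let lead = p >> 24;
    let b = table[lead as usize];
    // A 0 entry marks a split lead byte only at a non-zero index.
    if b != 0 || lead == 0 {
        return ((b as u32) << 24) | (p & 0x00FF_FFFF);
    }
    // Split lead byte: primaries at or above the last limit are not reordered.
    if p >= min_high_no_reorder {
        return p;
    }
    for &r in ranges {
        // Compare the upper 16 bits of p with the upper 16 bits of the limit.
        if (p >> 16) < (r >> 16) {
            // Assumption: lower 16 bits hold a two's-complement offset.
            let offset = (r & 0xFFFF) as u16 as i16 as i32;
            let new_lead = ((lead as i32 + offset) & 0xFF) as u32;
            return (new_lead << 24) | (p & 0x00FF_FFFF);
        }
    }
    p
}
```

The range scan is linear, which seems reasonable because the comment says the initial ranges are omitted and only those from the first split lead byte onward are stored.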

