pub struct TitlecaseMapper<CM> { /* private fields */ }
A wrapper around CaseMapper that computes titlecase mappings and can load the additional data needed to support the non-legacy “head adjustment” behavior.

By default, Self::titlecase_segment() and Self::titlecase_segment_to_string() perform “leading adjustment”: they wait until the first relevant character before they begin titlecasing. For example, in the string 'twixt, the apostrophe is ignored because the word starts at the first “t”, which gets titlecased (producing 'Twixt). Other punctuation is ignored as well, as in the string «hello», which is titlecased to «Hello».

This is a separate type from CaseMapper because it loads the additional data required by LeadingAdjustment::Auto to perform the best possible leading adjustment.

If you are planning on only using LeadingAdjustment::None or LeadingAdjustment::ToCased, consider using CaseMapper directly; this type will have no additional behavior.
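As a rough illustration of what adjusting (or not adjusting) the start of a segment means, the following standard-library-only sketch mimics the two behaviors. It is not the ICU4X algorithm: `titlecase_segment_sketch` and its `adjust_leading` flag are hypothetical names, `char::is_alphanumeric` is an assumed stand-in for the data-driven notion of a "relevant" character, and plain `to_uppercase`/`to_lowercase` replace the locale-aware titlecase and lowercase mappings.

```rust
/// Simplified model of single-segment titlecasing.
/// Hypothetical helper for illustration only; not part of icu_casemap.
fn titlecase_segment_sketch(src: &str, adjust_leading: bool) -> String {
    // Byte offset at which casing begins.
    let start = if adjust_leading {
        // Skip leading punctuation; approximation of "first relevant character".
        match src.char_indices().find(|(_, c)| c.is_alphanumeric()) {
            Some((i, _)) => i,
            // Nothing to titlecase: return the input unchanged.
            None => return src.to_string(),
        }
    } else {
        0
    };

    let mut out = String::with_capacity(src.len());
    // Everything before the starting point is copied verbatim.
    out.push_str(&src[..start]);

    let mut chars = src[start..].chars();
    if let Some(first) = chars.next() {
        // Uppercase is used here as a stand-in for the titlecase mapping.
        out.extend(first.to_uppercase());
    }
    for c in chars {
        out.extend(c.to_lowercase());
    }
    out
}
```

With `adjust_leading` set to `false`, the leading punctuation mark is itself treated as the first character, so the letters after it are all lowercased.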

Examples

Basic casemapping behavior:

use icu_casemap::TitlecaseMapper;
use icu_locid::langid;

let cm = TitlecaseMapper::new();
let root = langid!("und");

let default_options = Default::default();

// Note that the subsequent words are not titlecased: this function assumes
// that the entire string is a single segment and only titlecases at the beginning.
assert_eq!(cm.titlecase_segment_to_string("hEllO WorLd", &root, default_options), "Hello world");
assert_eq!(cm.titlecase_segment_to_string("Γειά σου Κόσμε", &root, default_options), "Γειά σου κόσμε");
assert_eq!(cm.titlecase_segment_to_string("नमस्ते दुनिया", &root, default_options), "नमस्ते दुनिया");
assert_eq!(cm.titlecase_segment_to_string("Привет мир", &root, default_options), "Привет мир");

// Some behavior is language-sensitive
assert_eq!(cm.titlecase_segment_to_string("istanbul", &root, default_options), "Istanbul");
assert_eq!(cm.titlecase_segment_to_string("istanbul", &langid!("tr"), default_options), "İstanbul"); // Turkish dotted i

assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &root, default_options), "Եւ երևանի");
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &langid!("hy"), default_options), "Եվ երևանի"); // Eastern Armenian ech-yiwn ligature

assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &root, default_options), "Ijkdijk");
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &langid!("nl"), default_options), "IJkdijk"); // Dutch IJ digraph

Implementations

impl TitlecaseMapper<CaseMapper>

pub const fn new() -> Self

A constructor which creates a TitlecaseMapper using compiled data.

Enabled with the compiled_data Cargo feature.

📚 Help choosing a constructor

pub fn try_new_with_any_provider( provider: &(impl AnyProvider + ?Sized) ) -> Result<Self, DataError>

A version of Self::new that uses custom data provided by an AnyProvider.

📚 Help choosing a constructor

pub fn try_new_with_buffer_provider( provider: &(impl BufferProvider + ?Sized) ) -> Result<Self, DataError>

A version of Self::new that uses custom data provided by a BufferProvider.

Enabled with the serde feature.

📚 Help choosing a constructor

pub fn try_new_unstable<P>(provider: &P) -> Result<Self, DataError>

A version of Self::new that uses custom data provided by a DataProvider.

📚 Help choosing a constructor

⚠️ The bounds on provider may change over time, including in SemVer minor releases.

impl<CM: AsRef<CaseMapper>> TitlecaseMapper<CM>

pub fn try_new_with_mapper_with_any_provider( provider: &(impl AnyProvider + ?Sized), casemapper: CM ) -> Result<Self, DataError>

A version of Self::new_with_mapper that uses custom data provided by an AnyProvider.

📚 Help choosing a constructor

pub fn try_new_with_mapper_with_buffer_provider( provider: &(impl BufferProvider + ?Sized), casemapper: CM ) -> Result<Self, DataError>

A version of Self::new_with_mapper that uses custom data provided by a BufferProvider.

Enabled with the serde feature.

📚 Help choosing a constructor

pub const fn new_with_mapper(casemapper: CM) -> Self

A constructor which creates a TitlecaseMapper from an existing CaseMapper (either owned or as a reference) and compiled data.
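The "either owned or as a reference" part follows from the `CM: AsRef<CaseMapper>` bound on this impl block. The sketch below reproduces that pattern with standard-library traits only; `Mapper`, `Wrapper`, and `mapper_name` are hypothetical stand-ins, not ICU4X types, and the sketch assumes (as the bound suggests) that the wrapped type implements `AsRef` to itself.

```rust
/// Hypothetical stand-in for a shareable mapper type.
struct Mapper {
    name: String,
}

// The mapper can be viewed as a reference to itself.
impl AsRef<Mapper> for Mapper {
    fn as_ref(&self) -> &Mapper {
        self
    }
}

/// Hypothetical wrapper, generic over how the mapper is held.
struct Wrapper<M> {
    mapper: M,
}

impl<M: AsRef<Mapper>> Wrapper<M> {
    /// Accepts `Mapper` (owned) or `&Mapper` (borrowed): the standard
    /// library implements `AsRef<U>` for `&T` whenever `T: AsRef<U>`.
    fn new_with_mapper(mapper: M) -> Self {
        Wrapper { mapper }
    }

    fn mapper_name(&self) -> &str {
        &self.mapper.as_ref().name
    }
}
```

Under this pattern, several wrappers can borrow one mapper instead of each owning a copy, while callers that have no other use for the mapper can simply move it in.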

Enabled with the compiled_data Cargo feature.

📚 Help choosing a constructor

pub fn try_new_with_mapper_unstable<P>( provider: &P, casemapper: CM ) -> Result<Self, DataError>

Construct this object to wrap an existing CaseMapper (or a reference to one), loading additional data as needed. A version of Self::new_with_mapper that uses custom data provided by a DataProvider.

📚 Help choosing a constructor

⚠️ The bounds on provider may change over time, including in SemVer minor releases.

pub fn titlecase_segment<'a>( &'a self, src: &'a str, langid: &LanguageIdentifier, options: TitlecaseOptions ) -> impl Writeable + 'a

Returns the full titlecase mapping of the given string as a Writeable, treating the string as a single segment (and thus only titlecasing the beginning of it).

This should typically be used as a lower-level helper to construct the titlecasing operation desired by the application; for example, one can titlecase on a per-word basis by combining this with a WordSegmenter.

This function is context and language sensitive. Callers should pass the text’s language as a LanguageIdentifier (usually the id field of the Locale) if available, or Default::default() for the root locale.

See Self::titlecase_segment_to_string() for the equivalent convenience function that returns a String, as well as for an example.

pub fn titlecase_segment_to_string( &self, src: &str, langid: &LanguageIdentifier, options: TitlecaseOptions ) -> String

Returns the full titlecase mapping of the given string as a String, treating the string as a single segment (and thus only titlecasing the beginning of it).

This should typically be used as a lower-level helper to construct the titlecasing operation desired by the application; for example, one can titlecase on a per-word basis by combining this with a WordSegmenter.
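The shape of such a per-word operation can be sketched without the ICU4X crates. Everything below is an assumption made for illustration: `naive_breaks` is a whitespace-based stand-in for the break offsets a real word segmenter would produce, `toy_titlecase` is a stand-in for a call to titlecase_segment_to_string with a fixed language and options, and `titlecase_per_segment` is a hypothetical helper that glues the two together.

```rust
/// Stand-in for a word segmenter: byte offsets at every transition
/// between whitespace and non-whitespace, plus the start and end.
fn naive_breaks(src: &str) -> Vec<usize> {
    let mut breaks = vec![0];
    let mut prev_ws: Option<bool> = None;
    for (i, c) in src.char_indices() {
        let ws = c.is_whitespace();
        if let Some(p) = prev_ws {
            if p != ws {
                breaks.push(i);
            }
        }
        prev_ws = Some(ws);
    }
    if !src.is_empty() {
        breaks.push(src.len());
    }
    breaks
}

/// Stand-in for a single-segment titlecasing call.
fn toy_titlecase(segment: &str) -> String {
    let mut chars = segment.chars();
    match chars.next() {
        Some(first) => first
            .to_uppercase()
            .chain(chars.flat_map(|c| c.to_lowercase()))
            .collect(),
        None => String::new(),
    }
}

/// Applies a single-segment titlecasing function to each span between
/// consecutive break offsets and concatenates the results.
fn titlecase_per_segment<F>(src: &str, breaks: &[usize], titlecase_segment: F) -> String
where
    F: Fn(&str) -> String,
{
    let mut out = String::with_capacity(src.len());
    for pair in breaks.windows(2) {
        out.push_str(&titlecase_segment(&src[pair[0]..pair[1]]));
    }
    out
}
```

In a real application the closure would presumably capture the TitlecaseMapper, the language identifier, and the options, and the break offsets would come from the segmenter rather than from whitespace.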

This function is context and language sensitive. Callers should pass the text’s language as a LanguageIdentifier (usually the id field of the Locale) if available, or Default::default() for the root locale.

See Self::titlecase_segment() for the equivalent lower-level function that returns a Writeable.

Examples
use icu_casemap::TitlecaseMapper;
use icu_locid::langid;

let cm = TitlecaseMapper::new();
let root = langid!("und");

let default_options = Default::default();

// Note that the subsequent words are not titlecased: this function assumes
// that the entire string is a single segment and only titlecases at the beginning.
assert_eq!(cm.titlecase_segment_to_string("hEllO WorLd", &root, default_options), "Hello world");
assert_eq!(cm.titlecase_segment_to_string("Γειά σου Κόσμε", &root, default_options), "Γειά σου κόσμε");
assert_eq!(cm.titlecase_segment_to_string("नमस्ते दुनिया", &root, default_options), "नमस्ते दुनिया");
assert_eq!(cm.titlecase_segment_to_string("Привет мир", &root, default_options), "Привет мир");

// Some behavior is language-sensitive
assert_eq!(cm.titlecase_segment_to_string("istanbul", &root, default_options), "Istanbul");
assert_eq!(cm.titlecase_segment_to_string("istanbul", &langid!("tr"), default_options), "İstanbul"); // Turkish dotted i

assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &root, default_options), "Եւ երևանի");
assert_eq!(cm.titlecase_segment_to_string("և Երևանի", &langid!("hy"), default_options), "Եվ երևանի"); // Eastern Armenian ech-yiwn ligature

assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &root, default_options), "Ijkdijk");
assert_eq!(cm.titlecase_segment_to_string("ijkdijk", &langid!("nl"), default_options), "IJkdijk"); // Dutch IJ digraph

Leading adjustment behaviors:

use icu_casemap::titlecase::{LeadingAdjustment, TitlecaseOptions};
use icu_casemap::TitlecaseMapper;
use icu_locid::langid;

let cm = TitlecaseMapper::new();
let root = langid!("und");

let default_options = Default::default();
let mut no_adjust: TitlecaseOptions = Default::default();
no_adjust.leading_adjustment = LeadingAdjustment::None;

// Exhibits leading adjustment when set:
assert_eq!(
    cm.titlecase_segment_to_string("«hello»", &root, default_options),
    "«Hello»"
);
assert_eq!(
    cm.titlecase_segment_to_string("«hello»", &root, no_adjust),
    "«hello»"
);

assert_eq!(
    cm.titlecase_segment_to_string("'Twas", &root, default_options),
    "'Twas"
);
assert_eq!(
    cm.titlecase_segment_to_string("'Twas", &root, no_adjust),
    "'twas"
);

assert_eq!(
    cm.titlecase_segment_to_string("", &root, default_options),
    ""
);
assert_eq!(cm.titlecase_segment_to_string("", &root, no_adjust), "");

Tail casing behaviors:

use icu_casemap::titlecase::{TitlecaseOptions, TrailingCase};
use icu_casemap::TitlecaseMapper;
use icu_locid::langid;

let cm = TitlecaseMapper::new();
let root = langid!("und");

let default_options = Default::default();
let mut preserve_case: TitlecaseOptions = Default::default();
preserve_case.trailing_case = TrailingCase::Unchanged;

// Exhibits trailing case when set:
assert_eq!(
    cm.titlecase_segment_to_string("spOngeBoB", &root, default_options),
    "Spongebob"
);
assert_eq!(
    cm.titlecase_segment_to_string("spOngeBoB", &root, preserve_case),
    "SpOngeBoB"
);
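The difference between the two outcomes above comes down to what happens to the characters after the first one. A minimal standard-library model, with hypothetical names (`TrailingCaseSketch`, `apply_trailing_case`) and without any locale handling or leading adjustment, might look like this:

```rust
/// Hypothetical model of the two trailing-case outcomes shown above.
#[derive(Clone, Copy)]
enum TrailingCaseSketch {
    /// Lowercase everything after the first character.
    Lower,
    /// Leave everything after the first character untouched.
    Unchanged,
}

fn apply_trailing_case(src: &str, trailing: TrailingCaseSketch) -> String {
    let mut chars = src.chars();
    let mut out = String::with_capacity(src.len());
    if let Some(first) = chars.next() {
        // Uppercase is used here as a stand-in for the titlecase mapping.
        out.extend(first.to_uppercase());
    }
    match trailing {
        TrailingCaseSketch::Lower => {
            for c in chars {
                out.extend(c.to_lowercase());
            }
        }
        // `as_str` yields the not-yet-consumed remainder of the input.
        TrailingCaseSketch::Unchanged => out.push_str(chars.as_str()),
    }
    out
}
```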

Trait Implementations

impl<CM: Clone> Clone for TitlecaseMapper<CM>

fn clone(&self) -> TitlecaseMapper<CM>

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since Rust 1.0.0.

impl<CM: Debug> Debug for TitlecaseMapper<CM>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl<CM> RefUnwindSafe for TitlecaseMapper<CM>
where CM: RefUnwindSafe,

impl<CM> !Send for TitlecaseMapper<CM>

impl<CM> !Sync for TitlecaseMapper<CM>

impl<CM> Unpin for TitlecaseMapper<CM>
where CM: Unpin,

impl<CM> UnwindSafe for TitlecaseMapper<CM>
where CM: UnwindSafe,

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T
where T: 'static,

impl<T> MaybeSendSync for T