Tizen Native API
The Unormalization module provides functions for standard Unicode normalization.
#include <utils_i18n.h>
All instances of i18n_unormalizer_h are immutable. Instances returned by i18n_unormalization_get_instance() are singletons that must not be deleted by the caller.
The following sample gets a normalizer instance and normalizes a Unicode string:
i18n_unormalizer_h normalizer = NULL;
i18n_uchar src = 0xAC00;
i18n_uchar dest[4] = {0,};
int dest_str_len = 0;
int i = 0;

// gets instance for normalizer
i18n_unormalization_get_instance( NULL, "nfc", I18N_UNORMALIZATION_DECOMPOSE, &normalizer );

// normalizes a unicode string
i18n_unormalization_normalize( normalizer, &src, 1, dest, 4, &dest_str_len );
dlog_print(DLOG_INFO, LOG_TAG, "src is 0x%x\n", src );
// src is 0xAC00 (0xAC00: A Korean character combined with consonant and vowel)

for ( i = 0; i < dest_str_len; i++ ) {
    dlog_print(DLOG_INFO, LOG_TAG, "dest[%d] is 0x%x\t", i + 1, dest[i] );
    // dest[1] is 0x1100 dest[2] is 0x1161 (0x1100: consonant, 0x1161: vowel)
}
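The result in the sample (0xAC00 becoming 0x1100 followed by 0x1161) matches the arithmetic Hangul syllable decomposition defined by the Unicode Standard. The sketch below reproduces only that arithmetic in plain C, without the Tizen API. The helper name hangul_decompose is hypothetical, uint16_t stands in for a 16-bit code unit, and the function covers precomposed Hangul syllables only, not normalization in general.

```c
#include <stddef.h>
#include <stdint.h>

/* Constants of the Unicode Hangul syllable decomposition algorithm. */
enum {
    S_BASE = 0xAC00,              /* first precomposed syllable */
    L_BASE = 0x1100,              /* first leading consonant */
    V_BASE = 0x1161,              /* first vowel */
    T_BASE = 0x11A7,              /* one before the first trailing consonant */
    V_COUNT = 21,
    T_COUNT = 28,
    N_COUNT = V_COUNT * T_COUNT,  /* 588 syllables per leading consonant */
    S_COUNT = 19 * N_COUNT        /* 11172 precomposed syllables */
};

/*
 * Hypothetical helper, not part of the Tizen API.
 * Decomposes one precomposed Hangul syllable into its jamo.
 * Writes 2 or 3 code units to out and returns how many were written,
 * or returns 0 if the input is not a precomposed Hangul syllable.
 */
size_t hangul_decompose(uint16_t syllable, uint16_t out[3])
{
    if (syllable < S_BASE || syllable >= S_BASE + S_COUNT)
        return 0;

    unsigned s_index = (unsigned)(syllable - S_BASE);
    unsigned t_index = s_index % T_COUNT;

    out[0] = (uint16_t)(L_BASE + s_index / N_COUNT);
    out[1] = (uint16_t)(V_BASE + (s_index % N_COUNT) / T_COUNT);
    if (t_index == 0)
        return 2;                 /* no trailing consonant */

    out[2] = (uint16_t)(T_BASE + t_index);
    return 3;
}
```

Calling hangul_decompose(0xAC00, out) writes 0x1100 and 0x1161 and returns 2, the same two code units the sample logs.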
Functions
int i18n_unormalization_get_instance (const char *package_name, const char *name, i18n_unormalization_mode_e mode, i18n_unormalizer_h *normalizer)
Gets an i18n_unormalizer_h that uses the specified data file and composes or decomposes text according to the specified mode.
int i18n_unormalization_normalize (i18n_unormalizer_h normalizer, const i18n_uchar *src, int32_t len, i18n_uchar *dest, int32_t capacity, int32_t *len_deststr)
Writes the normalized form of the source string to the destination string (replacing its contents).
Typedefs
typedef const void * i18n_unormalizer_h
The i18n normalizer handle.
Result values for normalization quick check functions.
Enumeration of constants for normalization modes. For details about standard Unicode normalization forms and about the algorithms which are also used with custom mapping tables see http://www.unicode.org/unicode/reports/tr15/.
I18N_UNORMALIZATION_COMPOSE | Decomposition followed by composition. Same as standard NFC when using an "nfc" instance. Same as standard NFKC when using an "nfkc" instance. For details about standard Unicode normalization forms see http://www.unicode.org/unicode/reports/tr15/
I18N_UNORMALIZATION_DECOMPOSE | Map and reorder canonically. Same as standard NFD when using an "nfc" instance. Same as standard NFKD when using an "nfkc" instance. For details about standard Unicode normalization forms see http://www.unicode.org/unicode/reports/tr15/
I18N_UNORMALIZATION_FCD | "Fast C or D" form. If a string is in this form, then further decomposition without reordering would yield the same form as DECOMPOSE. Text in "Fast C or D" form can be processed efficiently with data tables that are "canonically closed", that is, that provide equivalent data for equivalent text, without having to be fully normalized. Not a standard Unicode normalization form. Not a unique form: different FCD strings can be canonically equivalent. For details see http://www.unicode.org/notes/tn5/#FCD
I18N_UNORMALIZATION_COMPOSE_CONTIGUOUS | Compose only contiguously. Also known as "FCC" or "Fast C Contiguous". The result will often but not always be in NFC. The result will conform to FCD, which is useful for processing. Not a standard Unicode normalization form. For details see http://www.unicode.org/notes/tn5/#FCC
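As one concrete instance of the composition step that I18N_UNORMALIZATION_COMPOSE performs with an "nfc" instance, the sketch below applies the Unicode Standard's arithmetic Hangul composition to a jamo sequence. It is plain C and does not call the Tizen API. The helper name hangul_compose is hypothetical, uint16_t stands in for a 16-bit code unit, and the general composition algorithm (combining marks, composition exclusions) is deliberately not covered.

```c
#include <stdint.h>

/* Constants of the Unicode Hangul syllable composition algorithm. */
enum {
    S_BASE = 0xAC00,  /* first precomposed syllable */
    L_BASE = 0x1100,  /* first leading consonant */
    V_BASE = 0x1161,  /* first vowel */
    T_BASE = 0x11A7,  /* one before the first trailing consonant */
    L_COUNT = 19,
    V_COUNT = 21,
    T_COUNT = 28
};

/*
 * Hypothetical helper, not part of the Tizen API.
 * Composes a leading consonant, a vowel and an optional trailing
 * consonant (pass 0 for none) into one precomposed Hangul syllable.
 * Returns 0 if any argument is outside its jamo range.
 */
uint16_t hangul_compose(uint16_t l, uint16_t v, uint16_t t)
{
    unsigned t_index = 0;

    if (l < L_BASE || l >= L_BASE + L_COUNT)
        return 0;
    if (v < V_BASE || v >= V_BASE + V_COUNT)
        return 0;
    if (t != 0) {
        if (t <= T_BASE || t >= T_BASE + T_COUNT)
            return 0;
        t_index = (unsigned)(t - T_BASE);
    }

    unsigned l_index = (unsigned)(l - L_BASE);
    unsigned v_index = (unsigned)(v - V_BASE);
    return (uint16_t)(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index);
}
```

Composing 0x1100 and 0x1161 with no trailing consonant gives 0xAC00 again, the inverse of the decomposition that I18N_UNORMALIZATION_DECOMPOSE produces for that syllable.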
int i18n_unormalization_get_instance (const char *package_name, const char *name, i18n_unormalization_mode_e mode, i18n_unormalizer_h *normalizer)
Gets an i18n_unormalizer_h that uses the specified data file and composes or decomposes text according to the specified mode.
[in] | package_name | NULL for ICU built-in data, otherwise application data package name. |
[in] | name | "nfc" or "nfkc" or "nfkc_cf" or the name of the custom data file. |
[in] | mode | The normalization mode (compose or decompose). |
[out] | normalizer | The requested normalizer on success. |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
int i18n_unormalization_normalize (i18n_unormalizer_h normalizer, const i18n_uchar *src, int32_t len, i18n_uchar *dest, int32_t capacity, int32_t *len_deststr)
Writes the normalized form of the source string to the destination string (replacing its contents).
The source and destination strings must be different buffers.
[in] | normalizer | i18n normalizer handle. |
[in] | src | The source string. |
[in] | len | The length of the source string, or -1 if the string is NULL-terminated. |
[out] | dest | The destination string. Its contents are replaced with the normalized src. |
[in] | capacity | The number of i18n_uchar units that can be written to dest. |
[out] | len_deststr | The length of the destination string. |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
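The len and capacity parameters both count code units, not bytes. A caller that prefers to pass an explicit length instead of -1 could compute it with a helper like the one below. This is a sketch under two assumptions: that i18n_uchar is a 16-bit code unit (uint16_t is used as a stand-in so the example compiles without Tizen headers), and the helper name u16_length, which is hypothetical and not part of the Tizen API.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the Tizen API.
 * Counts the 16-bit code units that precede the terminating 0,
 * i.e. the value a caller could pass as len instead of -1.
 */
int32_t u16_length(const uint16_t *s)
{
    int32_t n = 0;

    while (s[n] != 0)
        n++;
    return n;
}
```

For a buffer holding 0xAC00, 0x0041 and a terminating 0, the helper returns 2.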