Tizen Native API
Functions | |
int | i18n_ucollator_create (const char *locale, i18n_ucollator_h *collator) |
Creates an i18n_ucollator_h for comparing strings. | |
int | i18n_ucollator_destroy (i18n_ucollator_h collator) |
Closes an i18n_ucollator_h. | |
int | i18n_ucollator_str_collator (const i18n_ucollator_h collator, const i18n_uchar *src, int32_t src_len, const i18n_uchar *target, int32_t target_len, i18n_ucollator_result_e *result) |
Compares two strings. | |
int | i18n_ucollator_equal (const i18n_ucollator_h collator, const i18n_uchar *src, int32_t src_len, const i18n_uchar *target, int32_t target_len, i18n_ubool *equal) |
Compares two strings for equality. | |
int | i18n_ucollator_set_strength (i18n_ucollator_h collator, i18n_ucollator_strength_e strength) |
Sets the collation strength used in a collator. | |
int | i18n_ucollator_set_attribute (i18n_ucollator_h collator, i18n_ucollator_attribute_e attr, i18n_ucollator_attribute_value_e val) |
Sets an attribute of a collator (universal attribute setter). | |
Typedefs | |
typedef i18n_ucollator_attribute_value_e | i18n_ucollator_strength_e |
Enumeration of collation strengths, used to set the strength of an i18n_ucollator_h.
I18N_UCOLLATOR_PRIMARY: base letters represent a primary difference, for example "abc" < "abd". Set the comparison level to I18N_UCOLLATOR_PRIMARY to ignore secondary and tertiary differences.
I18N_UCOLLATOR_SECONDARY: diacritical differences on the same base letter represent a secondary difference, for example "ä" >> "a". Set the comparison level to I18N_UCOLLATOR_SECONDARY to ignore tertiary differences.
I18N_UCOLLATOR_TERTIARY: uppercase and lowercase versions of the same character represent a tertiary difference, for example "abc" <<< "ABC". Set the comparison level to I18N_UCOLLATOR_TERTIARY to include all comparison differences.
I18N_UCOLLATOR_IDENTICAL: two characters are considered "identical" when they have the same Unicode spellings, for example "ä" == "ä".
i18n_ucollator_strength_e also determines the strength of sort keys generated from an i18n_ucollator_h. These values are defined in the i18n_ucollator_attribute_value_e enum. |
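The difference between primary and tertiary strength can be illustrated with a toy comparison restricted to ASCII letters. This is only a sketch: demo_strength_e and demo_compare() are hypothetical names that are not part of the Tizen API, the secondary (diacritic) level is left out because ASCII has no accented letters, and a real collator uses locale-specific rules rather than tolower().

```c
#include <ctype.h>

/* Hypothetical strength levels, used by this sketch only. */
typedef enum { DEMO_PRIMARY, DEMO_TERTIARY } demo_strength_e;

/* Toy comparison for ASCII text. Base letters decide the primary level.
 * Case is a tertiary difference and only breaks ties at DEMO_TERTIARY.
 * Returns <0, 0, or >0. */
static int demo_compare(const char *a, const char *b, demo_strength_e strength)
{
    int tertiary = 0; /* sign of the first case difference seen, if any */

    for (; *a && *b; a++, b++) {
        int la = tolower((unsigned char)*a);
        int lb = tolower((unsigned char)*b);

        if (la != lb)
            return la < lb ? -1 : 1; /* primary difference */

        /* Same base letter but different case: lowercase sorts first. */
        if (tertiary == 0 && *a != *b)
            tertiary = islower((unsigned char)*a) ? -1 : 1;
    }

    if (*a || *b)
        return *a ? 1 : -1; /* a proper prefix sorts first */

    return strength == DEMO_TERTIARY ? tertiary : 0;
}
```

With this sketch, "abc" and "ABC" compare as equal at DEMO_PRIMARY and as "abc" before "ABC" at DEMO_TERTIARY, while "abc" sorts before "abd" at either strength.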
The Ucollator module performs locale-sensitive string comparison.
#include <utils_i18n.h>
The Ucollator module performs locale-sensitive string comparison. It builds searching and sorting routines for natural language text and provides correct sorting orders for most locales supported.
Converts two byte strings to Unicode strings and compares the Unicode strings to check whether they are equal.
i18n_uchar uchar_src[64] = {0,};
i18n_uchar uchar_target[64] = {0,};
char *src = "tizen";
char *target = "bada";
int uchar_src_len = 0;
int uchar_target_len = 0;
i18n_ucollator_h coll = NULL;
i18n_ubool result = 0; // boolean output value, not a pointer

// converts the UTF-8 byte strings to Unicode strings
i18n_ustring_from_UTF8( uchar_src, 64, NULL, src, -1 );
i18n_ustring_from_UTF8( uchar_target, 64, NULL, target, -1 );

// creates a collator
i18n_ucollator_create( "en_US", &coll );

// sets strength for coll
i18n_ucollator_set_strength( coll, I18N_UCOLLATOR_PRIMARY );

// compares uchar_src with uchar_target
i18n_ustring_get_length( uchar_src, &uchar_src_len );
i18n_ustring_get_length( uchar_target, &uchar_target_len );
i18n_ucollator_equal( coll, uchar_src, uchar_src_len, uchar_target, uchar_target_len, &result );
dlog_print(DLOG_INFO, LOG_TAG, "%s %s %s\n", src, result ? "is equal to" : "is not equal to", target );
// tizen is not equal to bada

// destroys the collator
i18n_ucollator_destroy( coll );
Sorts the given data in ascending order using an i18n_ucollator_h.
i18n_ucollator_h coll = NULL;
char *src[3] = { "cat", "banana", "airplane" };
char *tmp = NULL;
i18n_uchar buf_01[16] = {0,};
i18n_uchar buf_02[16] = {0,};
i18n_ucollator_result_e result = I18N_UCOLLATOR_EQUAL;
int i = 0, j = 0;
int ret = I18N_ERROR_NONE;
int buf_01_len = 0, buf_02_len = 0;

// prints the unsorted data
for ( i = 0; i < sizeof( src ) / sizeof( src[0] ); i++ ) {
    dlog_print(DLOG_INFO, LOG_TAG, "%s\n", src[i] );
}
// cat banana airplane

// creates a collator
ret = i18n_ucollator_create( "en_US", &coll );

// compares and sorts in ascending order (bubble sort)
if ( ret == I18N_ERROR_NONE ) {
    i18n_ucollator_set_strength( coll, I18N_UCOLLATOR_TERTIARY );
    for ( i = 0; i < 2; i++ ) {
        for ( j = 0; j < 2 - i; j++ ) {
            i18n_ustring_copy_ua( buf_01, src[j] );
            i18n_ustring_copy_ua( buf_02, src[j+1] );
            i18n_ustring_get_length( buf_01, &buf_01_len );
            i18n_ustring_get_length( buf_02, &buf_02_len );
            // compares buf_01 with buf_02
            i18n_ucollator_str_collator( coll, buf_01, buf_01_len, buf_02, buf_02_len, &result );
            // swaps the pair if it is out of order
            if ( result == I18N_UCOLLATOR_GREATER ) {
                tmp = src[j];
                src[j] = src[j+1];
                src[j+1] = tmp;
            }
        }
    }
}

// destroys the collator and releases its memory
i18n_ucollator_destroy( coll );

// prints the sorted data
for ( i = 0; i < sizeof( src ) / sizeof( src[0] ); i++ ) {
    dlog_print(DLOG_INFO, LOG_TAG, "%s\n", src[i] );
}
// airplane banana cat
Enumeration for attributes that the collation service understands. All the attributes can take the I18N_UCOLLATOR_DEFAULT value, as well as the values specific to each one.
I18N_UCOLLATOR_FRENCH_COLLATION |
Attribute for direction of secondary weights - used in Canadian French. Acceptable values are I18N_UCOLLATOR_ON, which results in secondary weights being considered backwards, and I18N_UCOLLATOR_OFF which treats secondary weights in the order they appear |
I18N_UCOLLATOR_ALTERNATE_HANDLING |
Attribute for handling variable elements. Acceptable values are I18N_UCOLLATOR_NON_IGNORABLE (default) which treats all the codepoints with non-ignorable primary weights in the same way, and I18N_UCOLLATOR_SHIFTED which causes codepoints with primary weights that are equal or below the variable top value to be ignored at the primary level and moved to the quaternary level |
I18N_UCOLLATOR_CASE_FIRST |
Controls the ordering of upper and lower case letters. Acceptable values are I18N_UCOLLATOR_OFF (default), which orders upper and lower case letters in accordance to their tertiary weights, I18N_UCOLLATOR_UPPER_FIRST which forces upper case letters to sort before lower case letters, and I18N_UCOLLATOR_LOWER_FIRST which does the opposite |
I18N_UCOLLATOR_CASE_LEVEL |
Controls whether an extra case level (positioned before the third level) is generated or not. Acceptable values are I18N_UCOLLATOR_OFF (default), when case level is not generated, and I18N_UCOLLATOR_ON which causes the case level to be generated. Contents of the case level are affected by the value of the I18N_UCOLLATOR_CASE_FIRST attribute. A simple way to ignore accent differences in a string is to set the strength to I18N_UCOLLATOR_PRIMARY and enable case level |
I18N_UCOLLATOR_NORMALIZATION_MODE |
Controls whether the normalization check and necessary normalizations are performed. When set to I18N_UCOLLATOR_OFF (default) no normalization check is performed. The correctness of the result is guaranteed only if the input data is in so-called FCD form (see users manual for more info). When set to I18N_UCOLLATOR_ON, an incremental check is performed to see whether the input data is in the FCD form. If the data is not in the FCD form, incremental NFD normalization is performed |
I18N_UCOLLATOR_DECOMPOSITION_MODE |
An alias for the I18N_UCOLLATOR_NORMALIZATION_MODE attribute |
I18N_UCOLLATOR_STRENGTH |
The strength attribute. Can be either I18N_UCOLLATOR_PRIMARY, I18N_UCOLLATOR_SECONDARY, I18N_UCOLLATOR_TERTIARY, I18N_UCOLLATOR_QUATERNARY, or I18N_UCOLLATOR_IDENTICAL. The usual strength for most locales (except Japanese) is tertiary. Quaternary strength is useful when combined with shifted setting for the alternate handling attribute and for JIS X 4061 collation, when it is used to distinguish between Katakana and Hiragana. Otherwise, quaternary level is affected only by the number of non-ignorable code points in the string. Identical strength is rarely useful, as it amounts to codepoints of the NFD form of the string |
I18N_UCOLLATOR_NUMERIC_COLLATION |
When turned on, this attribute makes substrings of digits sort according to their numeric values. This is a way to get '100' to sort AFTER '2'. Note that the longest digit substring that can be treated as a single unit is 254 digits (not counting leading zeros). If a digit substring is longer than that, the digits beyond the limit will be treated as a separate digit substring. A "digit" in this sense is a code point with General_Category=Nd, which does not include circled numbers, roman numerals, and so on. Only a contiguous digit substring is considered, that is, non-negative integers without separators. There is no support for plus/minus signs, decimals, exponents, and so on |
I18N_UCOLLATOR_ATTRIBUTE_COUNT |
The number of i18n_ucollator_attribute_e constants |
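The ordering described for I18N_UCOLLATOR_NUMERIC_COLLATION can be illustrated with a small, portable comparison function. This is only a sketch of the ordering rule, not the collator's implementation: demo_numeric_compare() is a hypothetical helper, it handles ASCII digits only (not every General_Category=Nd code point), compares the remaining characters by byte value, and does not model the 254-digit limit.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Tizen API: compares two ASCII strings,
 * treating each contiguous run of digits as one non-negative integer.
 * Returns <0, 0, or >0 like strcmp(). */
static int demo_numeric_compare(const char *a, const char *b)
{
    while (*a && *b) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            size_t la = 0, lb = 0;
            int c;

            /* Leading zeros do not change the numeric value. */
            while (*a == '0') a++;
            while (*b == '0') b++;

            /* Measure what is left of each digit run. */
            while (isdigit((unsigned char)a[la])) la++;
            while (isdigit((unsigned char)b[lb])) lb++;

            /* After zero-stripping, the longer run is the larger number. */
            if (la != lb)
                return la < lb ? -1 : 1;

            /* Equal length: digit-by-digit order is the numeric order. */
            c = strncmp(a, b, la);
            if (c != 0)
                return c < 0 ? -1 : 1;

            a += la;
            b += lb;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
            a++;
            b++;
        }
    }

    if (*a == *b)
        return 0;        /* both strings ended */
    return *a ? 1 : -1;  /* the shorter string sorts first */
}
```

A plain strcmp() places "100" before "2" because '1' is less than '2', whereas this function places "2" before "100", which is the behavior the attribute describes.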
Enumeration containing attribute values for controlling collation behavior. Here are all the allowable values. Not every attribute can take every value. The only universal value is I18N_UCOLLATOR_DEFAULT, which resets the attribute value to the predefined value for that locale.
I18N_UCOLLATOR_DEFAULT |
Accepted by most attributes |
I18N_UCOLLATOR_PRIMARY |
Primary collation strength |
I18N_UCOLLATOR_SECONDARY |
Secondary collation strength |
I18N_UCOLLATOR_TERTIARY |
Tertiary collation strength |
I18N_UCOLLATOR_DEFAULT_STRENGTH |
Default collation strength |
I18N_UCOLLATOR_QUATERNARY |
Quaternary collation strength |
I18N_UCOLLATOR_IDENTICAL |
Identical collation strength |
I18N_UCOLLATOR_OFF |
Turn the feature off - works for I18N_UCOLLATOR_FRENCH_COLLATION, I18N_UCOLLATOR_CASE_LEVEL & I18N_UCOLLATOR_DECOMPOSITION_MODE |
I18N_UCOLLATOR_ON |
Turn the feature on - works for I18N_UCOLLATOR_FRENCH_COLLATION, I18N_UCOLLATOR_CASE_LEVEL & I18N_UCOLLATOR_DECOMPOSITION_MODE |
I18N_UCOLLATOR_SHIFTED |
Valid for I18N_UCOLLATOR_ALTERNATE_HANDLING. Alternate handling will be shifted. |
I18N_UCOLLATOR_NON_IGNORABLE |
Valid for I18N_UCOLLATOR_ALTERNATE_HANDLING. Alternate handling will be non ignorable. |
I18N_UCOLLATOR_LOWER_FIRST |
Valid for I18N_UCOLLATOR_CASE_FIRST - lower case sorts before upper case. |
I18N_UCOLLATOR_UPPER_FIRST |
Valid for I18N_UCOLLATOR_CASE_FIRST - upper case sorts before lower case. |
Enumeration for source and target string comparison result. I18N_UCOLLATOR_LESS is returned if the source string is compared to be less than the target string in the i18n_ucollator_str_collator() method. I18N_UCOLLATOR_EQUAL is returned if the source string is compared to be equal to the target string in the i18n_ucollator_str_collator() method. I18N_UCOLLATOR_GREATER is returned if the source string is compared to be greater than the target string in the i18n_ucollator_str_collator() method.
int i18n_ucollator_create | ( | const char * | locale, |
i18n_ucollator_h * | collator | ||
) |
Creates an i18n_ucollator_h for comparing strings.
The i18n_ucollator_h is used in all the calls to the Collation service.
When finished, the collator must be disposed of by calling i18n_ucollator_destroy().
[in] | locale | The locale containing the required collation rules. Special values can be passed in: if NULL is passed, the default locale's collation rules are used. If an empty string ("") or "root" is passed, UCA rules are used. |
[out] | collator | The created i18n_ucollator_h, otherwise 0 if an error occurs |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
int i18n_ucollator_destroy | ( | i18n_ucollator_h | collator | ) |
Closes an i18n_ucollator_h.
Once closed, an i18n_ucollator_h should not be used. Every open collator should be closed. Otherwise, a memory leak will result.
[in] | collator | The i18n_ucollator_h to close |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
int i18n_ucollator_equal | ( | const i18n_ucollator_h | collator, |
const i18n_uchar * | src, | ||
int32_t | src_len, | ||
const i18n_uchar * | target, | ||
int32_t | target_len, | ||
i18n_ubool * | equal | ||
) |
Compares two strings for equality.
This function is equivalent to calling i18n_ucollator_str_collator() and checking whether the result is I18N_UCOLLATOR_EQUAL.
[in] | collator | The i18n_ucollator_h containing the comparison rules |
[in] | src | The source string |
[in] | src_len | The length of the source, otherwise -1 if null-terminated |
[in] | target | The target string |
[in] | target_len | The length of the target, otherwise -1 if null-terminated |
[out] | equal | true if the source is equal to the target, otherwise false |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
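The length convention used by src_len and target_len (an explicit length, or -1 for a null-terminated string) can be sketched as follows. The names demo_uchar and demo_resolve_length() are hypothetical and not part of the Tizen API, and the sketch assumes that i18n_uchar is a 16-bit UTF-16 code unit.

```c
#include <stdint.h>

/* Stand-in for i18n_uchar, assumed here to be a 16-bit UTF-16 code unit. */
typedef uint16_t demo_uchar;

/* Hypothetical helper: returns the number of code units to compare.
 * A non-negative length is used as given. A negative length (such as -1)
 * means the string is scanned up to its terminating zero code unit. */
static int32_t demo_resolve_length(const demo_uchar *s, int32_t len)
{
    int32_t n = 0;

    if (len >= 0)
        return len;

    while (s[n] != 0)
        n++;
    return n;
}
```

Passing an explicit length lets a caller compare only a prefix of a buffer, while -1 is convenient when the buffer is already null-terminated.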
int i18n_ucollator_set_attribute | ( | i18n_ucollator_h | collator, |
i18n_ucollator_attribute_e | attr, | ||
i18n_ucollator_attribute_value_e | val | ||
) |
Sets an attribute of a collator (universal attribute setter).
[in] | collator | The i18n_ucollator_h containing attributes to be changed |
[in] | attr | The attribute type |
[in] | val | The attribute value |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
int i18n_ucollator_set_strength | ( | i18n_ucollator_h | collator, |
i18n_ucollator_strength_e | strength | ||
) |
Sets the collation strength used in a collator.
The strength influences how strings are compared.
[in] | collator | The i18n_ucollator_h to set |
[in] | strength | The desired collation strength, one of i18n_ucollator_strength_e |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |
int i18n_ucollator_str_collator | ( | const i18n_ucollator_h | collator, |
const i18n_uchar * | src, | ||
int32_t | src_len, | ||
const i18n_uchar * | target, | ||
int32_t | target_len, | ||
i18n_ucollator_result_e * | result | ||
) |
Compares two strings.
The strings will be compared using the options already specified.
[in] | collator | The i18n_ucollator_h containing the comparison rules |
[in] | src | The source string |
[in] | src_len | The length of the source, otherwise -1 if null-terminated |
[in] | target | The target string |
[in] | target_len | The length of the target, otherwise -1 if null-terminated |
[out] | result | The result of comparing the strings, one of I18N_UCOLLATOR_EQUAL, I18N_UCOLLATOR_GREATER, or I18N_UCOLLATOR_LESS |
I18N_ERROR_NONE | Successful |
I18N_ERROR_INVALID_PARAMETER | Invalid function parameter |