Tizen Native API  5.0
Ubrk

Ubrk defines methods for finding the location of boundaries in text.

Required Header

#include <utils_i18n.h>

Overview

An i18n_ubreak_iterator_h maintains a current position and scans over text, returning the indices of the characters at which boundaries occur.
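
The scan described above can be sketched as a create / first / next / destroy loop. This is a minimal sketch rather than reference code: list_word_boundaries, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, the "en_US" locale is an arbitrary choice, and the text is passed as uint16_t on the assumption that i18n_uchar is a 16-bit code unit. The __has_include guard is only there so the sketch also compiles where utils_i18n.h is not installed; in that case the helper just reports failure.

```c
#include <stdint.h>
#include <stddef.h>

/* Use the Tizen header when present; otherwise the helper only reports failure. */
#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* Stores up to `capacity` word-boundary offsets of the zero-terminated
 * `text` in `out` and returns the total number of boundaries found. */
int32_t list_word_boundaries(const uint16_t *text, int32_t *out, int32_t capacity)
{
#if HAVE_UTILS_I18N
    i18n_ubreak_iterator_h iter = NULL;
    if (i18n_ubrk_create(I18N_UBRK_WORD, "en_US", (const i18n_uchar *)text,
                         -1, &iter) != I18N_ERROR_NONE)
        return SKETCH_FAILED;

    int32_t count = 0;
    /* Start at offset zero and advance until I18N_UBRK_DONE is returned. */
    for (int32_t pos = i18n_ubrk_first(iter); pos != I18N_UBRK_DONE;
         pos = i18n_ubrk_next(iter)) {
        if (count < capacity)
            out[count] = pos;
        count++;
    }

    i18n_ubrk_destroy(iter);
    return count;
#else
    (void)text; (void)out; (void)capacity;
    return SKETCH_FAILED;
#endif
}
```

Consecutive offsets in out delimit one segment each, so segment k spans the half-open range from out[k] to out[k + 1].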

Functions

int i18n_ubrk_create (i18n_ubreak_iterator_type_e type, const char *locale, const i18n_uchar *text, int32_t text_length, i18n_ubreak_iterator_h *break_iter)
 Opens a new i18n_ubreak_iterator_h for locating text boundaries for a specified locale.
int i18n_ubrk_create_rules (const i18n_uchar *rules, int32_t rules_length, const i18n_uchar *text, int32_t text_length, i18n_ubreak_iterator_h *break_iter, i18n_uparse_error_s *parse_err)
 Opens a new i18n_ubreak_iterator_h for locating text boundaries using specified breaking rules.
int i18n_ubrk_safe_clone (const i18n_ubreak_iterator_h break_iter, void *stack_buffer, int32_t *p_buffer_size, i18n_ubreak_iterator_h *break_iter_clone)
 Thread safe cloning operation.
int i18n_ubrk_destroy (i18n_ubreak_iterator_h break_iter)
 Closes an i18n_ubreak_iterator_h.
int i18n_ubrk_set_text (i18n_ubreak_iterator_h break_iter, const i18n_uchar *text, int32_t text_length)
 Sets an existing iterator to point to a new piece of text.
int32_t i18n_ubrk_current (const i18n_ubreak_iterator_h break_iter)
 Determines the most recently-returned text boundary.
int32_t i18n_ubrk_next (i18n_ubreak_iterator_h break_iter)
 Advances the iterator to the boundary following the current boundary.
int32_t i18n_ubrk_previous (i18n_ubreak_iterator_h break_iter)
 Sets the iterator position to the boundary preceding the current boundary.
int32_t i18n_ubrk_first (i18n_ubreak_iterator_h break_iter)
 Sets the iterator position to zero, the start of the text being scanned.
int32_t i18n_ubrk_last (i18n_ubreak_iterator_h break_iter)
 Sets the iterator position to the index immediately beyond the last character in the text being scanned.
int32_t i18n_ubrk_preceding (i18n_ubreak_iterator_h break_iter, int32_t offset)
 Sets the iterator position to the first boundary preceding the specified offset.
int32_t i18n_ubrk_following (i18n_ubreak_iterator_h break_iter, int32_t offset)
 Advances the iterator to the first boundary following the specified offset.
const char * i18n_ubrk_get_available (int32_t index)
 Gets a locale for which text breaking information is available.
int32_t i18n_ubrk_count_available (void)
 Determines how many locales have text breaking information available.
i18n_ubool i18n_ubrk_is_boundary (i18n_ubreak_iterator_h break_iter, int32_t offset)
 Returns true if the specified position is a boundary position.
int32_t i18n_ubrk_get_rule_status (i18n_ubreak_iterator_h break_iter)
 Returns the status from the break rule that determined the most recently returned break position.
int32_t i18n_ubrk_get_rule_status_vec (i18n_ubreak_iterator_h break_iter, int32_t *fill_in_vec, int32_t capacity)
 Gets the statuses from the break rules that determined the most recently returned break position.
const char * i18n_ubrk_get_locale_by_type (const i18n_ubreak_iterator_h break_iter, i18n_ulocale_data_locale_type_e type)
 Returns the locale of the break iterator. You can choose between the valid and the actual locale.

Typedefs

typedef void * i18n_ubreak_iterator_s
 i18n_ubreak_iterator_s.
typedef void * i18n_ubreak_iterator_h
 i18n_ubreak_iterator_h.

Defines

#define I18N_U_BRK_SAFECLONE_BUFFERSIZE   528
 A recommended size (in bytes) for the memory buffer to be passed to i18n_ubrk_safe_clone().
#define I18N_UBRK_DONE   ((int32_t) -1)
 Value indicating all text boundaries have been returned.

Define Documentation

#define I18N_U_BRK_SAFECLONE_BUFFERSIZE   528

A recommended size (in bytes) for the memory buffer to be passed to i18n_ubrk_safe_clone().

Deprecated:
Deprecated since Tizen 3.0
Since :
2.3.1
#define I18N_UBRK_DONE   ((int32_t) -1)

Value indicating all text boundaries have been returned.

Since :
2.3.1

Typedef Documentation

typedef void* i18n_ubreak_iterator_h

i18n_ubreak_iterator_h.

Since :
2.3.1
typedef void* i18n_ubreak_iterator_s

i18n_ubreak_iterator_s.

Since :
2.3

Enumeration Type Documentation

enum i18n_ubreak_iterator_type_e

The possible types of text boundaries.

Since :
2.3.1
Enumerator:
I18N_UBRK_CHARACTER: Character breaks
I18N_UBRK_WORD: Word breaks
I18N_UBRK_LINE: Line breaks
I18N_UBRK_SENTENCE: Sentence breaks


Function Documentation

int32_t i18n_ubrk_count_available ( void  )

Determines how many locales have text breaking information available.

This function is most useful as determining the loop ending condition for calls to i18n_ubrk_get_available().

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Returns:
The number of locales for which text breaking information is available.
Exceptions:
I18N_ERROR_NONE: Successful
See also:
i18n_ubrk_get_available()
int i18n_ubrk_create ( i18n_ubreak_iterator_type_e  type,
const char *  locale,
const i18n_uchar *  text,
int32_t  text_length,
i18n_ubreak_iterator_h *  break_iter 
)

Opens a new i18n_ubreak_iterator_h for locating text boundaries for a specified locale.

An i18n_ubreak_iterator_h may be used for detecting character, line, word, and sentence breaks in text.

Remarks:
Error codes are described in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] type: The type of i18n_ubreak_iterator_h to open: one of I18N_UBRK_CHARACTER, I18N_UBRK_WORD, I18N_UBRK_LINE, I18N_UBRK_SENTENCE.
[in] locale: The locale specifying the text-breaking conventions. If NULL, the default locale will be used.
[in] text: The text to be iterated over. May be NULL, in which case the iterator is created without any text and the text can be set later with i18n_ubrk_set_text().
[in] text_length: The number of characters in text, or -1 if NULL-terminated.
[out] break_iter: A pointer to the i18n_ubreak_iterator_h for the specified locale.
Returns:
The obtained error code.
Return values:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_create_rules()
int i18n_ubrk_create_rules ( const i18n_uchar *  rules,
int32_t  rules_length,
const i18n_uchar *  text,
int32_t  text_length,
i18n_ubreak_iterator_h *  break_iter,
i18n_uparse_error_s *  parse_err 
)

Opens a new i18n_ubreak_iterator_h for locating text boundaries using specified breaking rules.

Remarks:
Error codes are described in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] rules: A set of rules specifying the text breaking conventions.
[in] rules_length: The number of characters in rules, or -1 if NULL-terminated.
[in] text: The text to be iterated over. May be NULL, in which case i18n_ubrk_set_text() is used to specify the text to be iterated.
[in] text_length: The number of characters in text, or -1 if NULL-terminated.
[out] break_iter: A pointer to the i18n_ubreak_iterator_h for the specified rules.
[out] parse_err: Receives position and context information for any syntax errors detected while parsing the rules.
Returns:
The obtained error code.
Return values:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_create()
int32_t i18n_ubrk_current ( const i18n_ubreak_iterator_h  break_iter)

Determines the most recently-returned text boundary.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The character index most recently returned by i18n_ubrk_next(), i18n_ubrk_previous(), i18n_ubrk_first(), or i18n_ubrk_last().
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter

int i18n_ubrk_destroy ( i18n_ubreak_iterator_h  break_iter)

Closes an i18n_ubreak_iterator_h.

Once closed, an i18n_ubreak_iterator_h may no longer be used.

Remarks:
Error codes are described in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to close. Must not be NULL.
Returns:
The obtained error code.
Return values:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
int32_t i18n_ubrk_first ( i18n_ubreak_iterator_h  break_iter)

Sets the iterator position to zero, the start of the text being scanned.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The new iterator position (zero).
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_last()
int32_t i18n_ubrk_following ( i18n_ubreak_iterator_h  break_iter,
int32_t  offset 
)

Advances the iterator to the first boundary following the specified offset.

The value returned is always greater than offset, or I18N_UBRK_DONE.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
[in] offset: The offset to begin scanning.
Returns:
The text boundary following offset, or I18N_UBRK_DONE.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_preceding()
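
Because i18n_ubrk_following() returns a boundary strictly greater than offset and leaves the iterator there, a single i18n_ubrk_previous() call afterwards yields the start of the segment containing offset. The following is a minimal sketch of that idea under a few assumptions: word_at, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, the default locale is used by passing NULL, and the text is passed as uint16_t on the assumption that i18n_uchar is a 16-bit code unit. The __has_include guard only lets the sketch compile where utils_i18n.h is missing.

```c
#include <stdint.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* Finds the word segment [*start, *end) that contains `offset` in the
 * zero-terminated `text`. Returns 0 on success, SKETCH_FAILED otherwise. */
int word_at(const uint16_t *text, int32_t offset, int32_t *start, int32_t *end)
{
#if HAVE_UTILS_I18N
    i18n_ubreak_iterator_h iter = NULL;
    if (i18n_ubrk_create(I18N_UBRK_WORD, NULL, (const i18n_uchar *)text,
                         -1, &iter) != I18N_ERROR_NONE)
        return SKETCH_FAILED;

    int rc = SKETCH_FAILED;
    int32_t after = i18n_ubrk_following(iter, offset); /* first boundary > offset */
    if (after != I18N_UBRK_DONE) {
        int32_t before = i18n_ubrk_previous(iter);     /* boundary just before it */
        if (before != I18N_UBRK_DONE) {
            *start = before;
            *end = after;
            rc = 0;
        }
    }

    i18n_ubrk_destroy(iter);
    return rc;
#else
    (void)text; (void)offset; (void)start; (void)end;
    return SKETCH_FAILED;
#endif
}
```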
const char* i18n_ubrk_get_available ( int32_t  index)

Gets a locale for which text breaking information is available.

An i18n_ubreak_iterator_h in a locale returned by this function will perform the correct text breaking for the locale.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] index: The index of the desired locale.
Returns:
A locale for which text breaking information is available, or 0 if none.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_count_available()
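
As noted for i18n_ubrk_count_available(), the count serves as the loop bound for i18n_ubrk_get_available(). A minimal sketch of that enumeration follows. print_break_locales, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, and the __has_include guard only lets the sketch compile where utils_i18n.h is missing.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* Prints every locale that has text breaking information and returns
 * how many were printed. */
int32_t print_break_locales(void)
{
#if HAVE_UTILS_I18N
    int32_t total = i18n_ubrk_count_available();
    int32_t printed = 0;

    for (int32_t i = 0; i < total; i++) {
        const char *locale = i18n_ubrk_get_available(i);
        if (locale != NULL) { /* 0 is returned when there is no such locale */
            printf("%s\n", locale);
            printed++;
        }
    }
    return printed;
#else
    return SKETCH_FAILED;
#endif
}
```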

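const char* i18n_ubrk_get_locale_by_type ( const i18n_ubreak_iterator_h  break_iter,
i18n_ulocale_data_locale_type_e  type 
)

Returns the locale of the break iterator. You can choose between the valid and the actual locale.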

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section and in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator. Must not be NULL.
[in] type: The locale type (valid or actual).
Returns:
The locale string.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter

int32_t i18n_ubrk_get_rule_status ( i18n_ubreak_iterator_h  break_iter)

Returns the status from the break rule that determined the most recently returned break position.

The values appear in the rule source within brackets, {123}, for example. For rules that do not specify a status, a default value of 0 is returned.

For word break iterators, the possible values are defined in enum i18n_uchar_u_word_break_values_e.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The status from the break rule that determined the most recently returned break position.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
int32_t i18n_ubrk_get_rule_status_vec ( i18n_ubreak_iterator_h  break_iter,
int32_t *  fill_in_vec,
int32_t  capacity 
)

Gets the statuses from the break rules that determined the most recently returned break position.

The values appear in the rule source within brackets, {123}, for example. The default status value for rules that do not explicitly provide one is zero.

For word break iterators, the possible values are defined in enum i18n_uchar_u_word_break_values_e.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section and in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
[out] fill_in_vec: An array to be filled in with the status values.
[in] capacity: The length of the supplied vector. A length of zero causes the function to return the number of status values, in the normal way, without attempting to store any values.
Returns:
The number of rule status values from rules that determined the most recent boundary returned by the break iterator.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
i18n_ubool i18n_ubrk_is_boundary ( i18n_ubreak_iterator_h  break_iter,
int32_t  offset 
)

Returns true if the specified position is a boundary position.

As a side effect, leaves the iterator pointing to the first boundary position at or after offset.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
[in] offset: The offset to check.
Returns:
True if "offset" is a boundary position.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
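
The side effect of i18n_ubrk_is_boundary() can be used to snap an arbitrary offset forward to a valid position, for example to avoid splitting a base letter from its combining mark. The sketch below relies on the documented behavior that the iterator is left at the first boundary at or after offset. snap_to_character_boundary, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, the default locale is used by passing NULL, and the text is passed as uint16_t on the assumption that i18n_uchar is a 16-bit code unit. The __has_include guard only lets the sketch compile where utils_i18n.h is missing.

```c
#include <stdint.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* Returns `offset` if it is a character boundary in the zero-terminated
 * `text`, otherwise the first character boundary after it. */
int32_t snap_to_character_boundary(const uint16_t *text, int32_t offset)
{
#if HAVE_UTILS_I18N
    i18n_ubreak_iterator_h iter = NULL;
    if (i18n_ubrk_create(I18N_UBRK_CHARACTER, NULL, (const i18n_uchar *)text,
                         -1, &iter) != I18N_ERROR_NONE)
        return SKETCH_FAILED;

    int32_t snapped = offset;
    if (!i18n_ubrk_is_boundary(iter, offset)) {
        /* Not a boundary: the iterator now points at the next one. */
        snapped = i18n_ubrk_current(iter);
    }

    i18n_ubrk_destroy(iter);
    return snapped;
#else
    (void)text; (void)offset;
    return SKETCH_FAILED;
#endif
}
```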
int32_t i18n_ubrk_last ( i18n_ubreak_iterator_h  break_iter)

Sets the iterator position to the index immediately beyond the last character in the text being scanned.

This is not the same as the last character.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The character offset immediately beyond the last character in the text being scanned.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_first()
int32_t i18n_ubrk_next ( i18n_ubreak_iterator_h  break_iter)

Advances the iterator to the boundary following the current boundary.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The character index of the next text boundary, or I18N_UBRK_DONE if all text boundaries have been returned.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_previous()
int32_t i18n_ubrk_preceding ( i18n_ubreak_iterator_h  break_iter,
int32_t  offset 
)

Sets the iterator position to the first boundary preceding the specified offset.

The new position is always smaller than offset, or I18N_UBRK_DONE.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
[in] offset: The offset to begin scanning.
Returns:
The text boundary preceding offset, or I18N_UBRK_DONE.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_following()
int32_t i18n_ubrk_previous ( i18n_ubreak_iterator_h  break_iter)

Sets the iterator position to the boundary preceding the current boundary.

Remarks:
The specific error code can be obtained using the get_last_result() method. Error codes are described in Exceptions section.
Since :
2.3.1
Parameters:
[in] break_iter: The break iterator to use. Must not be NULL.
Returns:
The character index of the preceding text boundary, or I18N_UBRK_DONE if all text boundaries have been returned.
Exceptions:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
See also:
i18n_ubrk_next()
int i18n_ubrk_safe_clone ( const i18n_ubreak_iterator_h  break_iter,
void *  stack_buffer,
int32_t *  p_buffer_size,
i18n_ubreak_iterator_h *  break_iter_clone 
)

Thread safe cloning operation.

Remarks:
Error codes are described in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] break_iter: The iterator to be cloned. Must not be NULL.
[in] stack_buffer: (Deprecated since 3.0. Use NULL instead.) User-allocated space for the new clone. If NULL, new memory will be allocated. If the buffer is not large enough, new memory will be allocated. Clients can use I18N_U_BRK_SAFECLONE_BUFFERSIZE, which will probably be enough to avoid memory allocations.
[in] p_buffer_size: (Deprecated since 3.0. Use NULL instead.) A pointer to the size of the allocated space. If *p_buffer_size == 0, a sufficient size for use in cloning will be returned ('pre-flighting'). If *p_buffer_size is not enough for a stack-based safe clone, new memory will be allocated.
[out] break_iter_clone: A pointer to the cloned i18n_ubreak_iterator_h.
Returns:
The obtained error code.
Return values:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
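
Since the stack buffer parameters are deprecated, a clone can be requested by passing NULL for both and letting the library allocate. The sketch below clones an iterator and advances only the clone, to illustrate that the two then move independently. It assumes the clone starts at the original's position. clone_and_advance, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, and the text is passed as uint16_t on the assumption that i18n_uchar is a 16-bit code unit. The __has_include guard only lets the sketch compile where utils_i18n.h is missing.

```c
#include <stdint.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* Clones a word iterator over `text`, advances the clone once, and reports
 * both positions. Returns 0 on success, SKETCH_FAILED otherwise. */
int clone_and_advance(const uint16_t *text, int32_t *orig_pos, int32_t *clone_pos)
{
#if HAVE_UTILS_I18N
    i18n_ubreak_iterator_h iter = NULL;
    if (i18n_ubrk_create(I18N_UBRK_WORD, NULL, (const i18n_uchar *)text,
                         -1, &iter) != I18N_ERROR_NONE)
        return SKETCH_FAILED;
    i18n_ubrk_first(iter);

    int rc = SKETCH_FAILED;
    i18n_ubreak_iterator_h clone = NULL;
    /* NULL buffer and NULL size: the clone is heap-allocated by the library. */
    if (i18n_ubrk_safe_clone(iter, NULL, NULL, &clone) == I18N_ERROR_NONE) {
        i18n_ubrk_next(clone);                 /* moves the clone only */
        *orig_pos = i18n_ubrk_current(iter);
        *clone_pos = i18n_ubrk_current(clone);
        i18n_ubrk_destroy(clone);              /* each iterator is closed separately */
        rc = 0;
    }

    i18n_ubrk_destroy(iter);
    return rc;
#else
    (void)text; (void)orig_pos; (void)clone_pos;
    return SKETCH_FAILED;
#endif
}
```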
int i18n_ubrk_set_text ( i18n_ubreak_iterator_h  break_iter,
const i18n_uchar *  text,
int32_t  text_length 
)

Sets an existing iterator to point to a new piece of text.

Remarks:
Error codes are described in i18n_error_code_e description.
Since :
2.3.1
Parameters:
[in] break_iter: The iterator to use. Must not be NULL.
[in] text: The text to be set. Must not be NULL.
[in] text_length: The length of the text.
Returns:
The obtained error code.
Return values:
I18N_ERROR_NONE: Successful
I18N_ERROR_INVALID_PARAMETER: Invalid function parameter
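
One use of i18n_ubrk_set_text() is to create a single iterator without text and reuse it for several strings instead of creating one iterator per string. The sketch below counts sentence segments that way. count_sentences_in_each, HAVE_UTILS_I18N and SKETCH_FAILED are hypothetical names, the caller supplies explicit lengths because the reference does not say whether -1 is accepted here, and the text is passed as uint16_t on the assumption that i18n_uchar is a 16-bit code unit. The __has_include guard only lets the sketch compile where utils_i18n.h is missing.

```c
#include <stdint.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<utils_i18n.h>)
#    include <utils_i18n.h>
#    define HAVE_UTILS_I18N 1
#  endif
#endif
#ifndef HAVE_UTILS_I18N
#  define HAVE_UTILS_I18N 0
#endif

#define SKETCH_FAILED (-2) /* hypothetical sentinel, not part of the API */

/* For each of the `n` strings, stores the number of sentence segments in
 * counts[i]. Returns 0 on success, SKETCH_FAILED otherwise. */
int count_sentences_in_each(const uint16_t *const texts[], const int32_t lengths[],
                            int32_t n, int32_t counts[])
{
#if HAVE_UTILS_I18N
    i18n_ubreak_iterator_h iter = NULL;
    /* Created without text; the text is supplied per string below. */
    if (i18n_ubrk_create(I18N_UBRK_SENTENCE, NULL, NULL, 0, &iter) != I18N_ERROR_NONE)
        return SKETCH_FAILED;

    int rc = 0;
    for (int32_t i = 0; i < n; i++) {
        if (i18n_ubrk_set_text(iter, (const i18n_uchar *)texts[i],
                               lengths[i]) != I18N_ERROR_NONE) {
            rc = SKETCH_FAILED;
            break;
        }
        int32_t segments = 0;
        i18n_ubrk_first(iter);
        /* Every boundary after the first one closes one segment. */
        while (i18n_ubrk_next(iter) != I18N_UBRK_DONE)
            segments++;
        counts[i] = segments;
    }

    i18n_ubrk_destroy(iter);
    return rc;
#else
    (void)texts; (void)lengths; (void)n; (void)counts;
    return SKETCH_FAILED;
#endif
}
```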