libunibreak  4.1
linebreak.h File Reference

Header file for the line breaking algorithm.

#include <stddef.h>
#include "unibreakbase.h"

Macros

#define LINEBREAK_MUSTBREAK   0
 Break is mandatory.
 
#define LINEBREAK_ALLOWBREAK   1
 Break is allowed.
 
#define LINEBREAK_NOBREAK   2
 No break is possible.
 
#define LINEBREAK_INSIDEACHAR   3
 A UTF-8/16 sequence is unfinished.
 

Functions

void init_linebreak (void)
 Initializes the second-level index to the line breaking properties.
 
void set_linebreaks_utf8 (const utf8_t *s, size_t len, const char *lang, char *brks)
 Sets the line breaking information for a UTF-8 input string.
 
void set_linebreaks_utf16 (const utf16_t *s, size_t len, const char *lang, char *brks)
 Sets the line breaking information for a UTF-16 input string.
 
void set_linebreaks_utf32 (const utf32_t *s, size_t len, const char *lang, char *brks)
 Sets the line breaking information for a UTF-32 input string.
 
int is_line_breakable (utf32_t char1, utf32_t char2, const char *lang)
 Tells whether a line break can occur between two Unicode characters.
 

Detailed Description

Header file for the line breaking algorithm.

Author
Wu Yongwei

Macro Definition Documentation

◆ LINEBREAK_ALLOWBREAK

#define LINEBREAK_ALLOWBREAK   1

Break is allowed.

◆ LINEBREAK_INSIDEACHAR

#define LINEBREAK_INSIDEACHAR   3

A UTF-8/16 sequence is unfinished.

◆ LINEBREAK_MUSTBREAK

#define LINEBREAK_MUSTBREAK   0

Break is mandatory.

◆ LINEBREAK_NOBREAK

#define LINEBREAK_NOBREAK   2

No break is possible.
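
The four values above are what callers read back from the output array. As a minimal sketch of how they might be consumed, the helpers below name a value and find the next position where a break may occur. The names linebreak_name and next_break_opportunity are hypothetical and not part of libunibreak; the fallback #define lines only mirror the values documented above so that the snippet compiles without the header.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback definitions mirroring the documented values; they are
 * skipped when linebreak.h has already been included. */
#ifndef LINEBREAK_MUSTBREAK
#define LINEBREAK_MUSTBREAK   0
#define LINEBREAK_ALLOWBREAK  1
#define LINEBREAK_NOBREAK     2
#define LINEBREAK_INSIDEACHAR 3
#endif

/* Hypothetical helper: human-readable name for one break value. */
static const char *linebreak_name(char brk)
{
    switch (brk) {
    case LINEBREAK_MUSTBREAK:   return "MUSTBREAK";
    case LINEBREAK_ALLOWBREAK:  return "ALLOWBREAK";
    case LINEBREAK_NOBREAK:     return "NOBREAK";
    case LINEBREAK_INSIDEACHAR: return "INSIDEACHAR";
    default:                    return "UNKNOWN";
    }
}

/* Hypothetical helper: index of the first entry at or after `from`
 * that is a mandatory or allowed break, or `len` if there is none. */
static size_t next_break_opportunity(const char *brks, size_t len, size_t from)
{
    size_t i;

    assert(brks != NULL || len == 0);
    for (i = from; i < len; ++i) {
        if (brks[i] == LINEBREAK_MUSTBREAK || brks[i] == LINEBREAK_ALLOWBREAK)
            return i;
    }
    return len;
}
```

Returning len when nothing is found lets a caller treat "no further opportunity" and "end of input" the same way in a scanning loop.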

Function Documentation

◆ init_linebreak()

void init_linebreak ( void  )

Initializes the second-level index to the line breaking properties.

If it is not called, get_char_lb_class_lang (and thus the main functionality) can be very slow, especially for large code points such as those of Chinese characters.

◆ is_line_breakable()

int is_line_breakable (utf32_t char1, utf32_t char2, const char *lang)

Tells whether a line break can occur between two Unicode characters.

This is a wrapper function that exposes a simple interface. It is generally better to use set_linebreaks_utf32 instead, because this function cannot correctly process complicated cases involving combining marks, spaces, etc.

Parameters
char1  the first Unicode character
char2  the second Unicode character
lang   language of the input
Returns
one of LINEBREAK_MUSTBREAK, LINEBREAK_ALLOWBREAK, LINEBREAK_NOBREAK, or LINEBREAK_INSIDEACHAR

◆ set_linebreaks_utf16()

void set_linebreaks_utf16 (const utf16_t *s, size_t len, const char *lang, char *brks)

Sets the line breaking information for a UTF-16 input string.

Parameters
[in]   s     input UTF-16 string
[in]   len   length of the input
[in]   lang  language of the input
[out]  brks  pointer to the output breaking data, containing LINEBREAK_MUSTBREAK, LINEBREAK_ALLOWBREAK, LINEBREAK_NOBREAK, or LINEBREAK_INSIDEACHAR
See also
set_linebreaks for a note about lang.

◆ set_linebreaks_utf32()

void set_linebreaks_utf32 (const utf32_t *s, size_t len, const char *lang, char *brks)

Sets the line breaking information for a UTF-32 input string.

Parameters
[in]   s     input UTF-32 string
[in]   len   length of the input
[in]   lang  language of the input
[out]  brks  pointer to the output breaking data, containing LINEBREAK_MUSTBREAK, LINEBREAK_ALLOWBREAK, LINEBREAK_NOBREAK, or LINEBREAK_INSIDEACHAR
See also
set_linebreaks for a note about lang.

◆ set_linebreaks_utf8()

void set_linebreaks_utf8 (const utf8_t *s, size_t len, const char *lang, char *brks)

Sets the line breaking information for a UTF-8 input string.

Parameters
[in]   s     input UTF-8 string
[in]   len   length of the input
[in]   lang  language of the input
[out]  brks  pointer to the output breaking data, containing LINEBREAK_MUSTBREAK, LINEBREAK_ALLOWBREAK, LINEBREAK_NOBREAK, or LINEBREAK_INSIDEACHAR
See also
set_linebreaks for a note about lang.
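
Once one of the set_linebreaks_* functions has filled brks, a caller typically walks the array to decide where each line ends. The following is a rough sketch of a greedy wrap step, not the library's own method. The function name next_line_end is hypothetical, width is counted in code units rather than display columns (a deliberate simplification), and the sketch assumes that the entry at index i describes the break opportunity after the character ending at that index. The fallback #define lines mirror the documented values so the snippet compiles without the header.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback definitions mirroring the documented values; they are
 * skipped when linebreak.h has already been included. */
#ifndef LINEBREAK_MUSTBREAK
#define LINEBREAK_MUSTBREAK   0
#define LINEBREAK_ALLOWBREAK  1
#define LINEBREAK_NOBREAK     2
#define LINEBREAK_INSIDEACHAR 3
#endif

/* With libunibreak available, brks (one entry per code unit, so at
 * least len entries) would be filled first, for example:
 *
 *     init_linebreak();
 *     set_linebreaks_utf8((const utf8_t *)text, len, "en", brks);
 */

/* Hypothetical helper: returns the exclusive end index of the line
 * that starts at `start`. The line ends right after a mandatory
 * break, or at the last allowed break seen before the line grew past
 * `max_units` code units. If no break is available within the limit,
 * the line overflows up to the first opportunity that follows. */
static size_t next_line_end(const char *brks, size_t len,
                            size_t start, size_t max_units)
{
    size_t last_allowed = 0;  /* exclusive end of best candidate; 0 = none */
    size_t i;

    assert(brks != NULL || len == 0);
    for (i = start; i < len; ++i) {
        size_t width = i - start + 1;

        if (width > max_units && last_allowed != 0)
            return last_allowed;      /* fall back to the last candidate */
        if (brks[i] == LINEBREAK_MUSTBREAK)
            return i + 1;             /* forced line end */
        if (brks[i] == LINEBREAK_ALLOWBREAK)
            last_allowed = i + 1;     /* remember as a candidate */
    }
    return len;
}
```

Calling the helper repeatedly, passing each returned index as the next start until it returns len, yields the successive line ranges. Because breaks are only taken at MUSTBREAK or ALLOWBREAK entries, a line never ends at an entry marked LINEBREAK_INSIDEACHAR.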