This section describes how to scan a string containing multibyte characters, one character at a time. The difficulty lies in knowing how many bytes each character contains. Your program can use mblen to find this out.
Function: int mblen (const char *string, size_t size)
The mblen function with a non-null string argument returns the number of bytes that make up the multibyte character beginning at string, never examining more than size bytes. (The idea is to supply for size the number of bytes of data you have in hand.)
The return value of mblen distinguishes three possibilities: the first size bytes at string start with a valid multibyte character, they start with an invalid byte sequence or just part of a character, or string points to an empty string (a null character).
For a valid multibyte character, mblen returns the number of bytes in that character (always at least 1, and never more than size). For an invalid byte sequence, mblen returns -1. For an empty string, it returns 0.
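The three return values map directly onto a scanning loop. The following is a minimal sketch, not part of the library: count_mbchars is a hypothetical helper name, and the sketch assumes the caller has already selected the desired locale (for example with setlocale) and wants scanning to stop at the first null character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): count the multibyte
   characters in the first SIZE bytes at STRING, stopping early at a
   null character.  Returns -1 if mblen reports an invalid byte
   sequence or an incomplete character.  */
long
count_mbchars (const char *string, size_t size)
{
  long count = 0;

  assert (string != NULL);

  /* Put the shift state in its initial value before a new scan.  */
  (void) mblen (NULL, 0);

  while (size > 0)
    {
      int len = mblen (string, size);

      if (len < 0)
        return -1;              /* invalid sequence or partial character */
      if (len == 0)
        break;                  /* null character: end of the string */

      string += len;            /* len is at least 1 and at most size */
      size -= (size_t) len;
      count++;
    }

  return count;
}
```

Passing the number of bytes still remaining as size on each call keeps mblen from reading past the data you have in hand.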
If the multibyte character code uses shift characters, then mblen maintains and updates a shift state as it scans. If you call mblen with a null pointer for string, that initializes the shift state to its standard initial value. In that case it returns a nonzero value if the multibyte character code in use actually has a shift state, and zero otherwise. See Shift State.
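As an illustration of the null-pointer call, the sketch below resets the shift state and reports whether the current encoding has one. The name reset_mblen_state is hypothetical, and the size argument of 0 is an arbitrary choice here, since no bytes are examined when string is a null pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): reset the shift state
   kept by mblen to its initial value.  Returns 1 if the multibyte
   character code in use has a shift state, 0 otherwise.  */
int
reset_mblen_state (void)
{
  return mblen (NULL, 0) != 0;
}
```

A program might call such a helper before scanning each new string, so that a state left over from an earlier, interrupted scan does not affect the result.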
The function mblen is declared in `stdlib.h'.