Wide characters are much simpler than multibyte characters. They are simply characters with more than eight bits, so that they have room for more than 256 distinct codes. The wide character data type, wchar_t, has a range large enough to hold extended character codes as well as old-fashioned ASCII codes.
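As a rough illustration of the range of wchar_t, the following sketch reports its width and checks whether a given code value fits. The helper names wchar_bits and fits_in_wchar are hypothetical and chosen only for this example; the actual width of wchar_t varies between implementations.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Width of wchar_t in bits on this implementation.
   (Hypothetical helper, for illustration only.) */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Return nonzero if the code value CODE is no larger than the
   largest value wchar_t can hold.  CODE is just a number here; it
   is not tied to any particular character set. */
int fits_in_wchar(unsigned long code)
{
    return code <= (unsigned long) WCHAR_MAX;
}
```

On an implementation where wchar_t is wider than eight bits, fits_in_wchar (0x100) is nonzero, whereas that value cannot be stored in an eight-bit char.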
An advantage of wide characters is that each character is a single data object, just like ordinary ASCII characters. Wide characters also have some disadvantages:
A program must be modified and recompiled in order to use wide characters at all.
Files of wide characters cannot be read by programs that expect ordinary characters.
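One reason for the second disadvantage can be sketched as follows: when wchar_t is wider than char, the stored form of even an ASCII character contains zero bytes, which a byte-oriented program is likely to treat as string terminators. The helper count_zero_bytes is a hypothetical name used only for this illustration.

```c
#include <stddef.h>

/* Count the bytes with value zero in the SIZE bytes starting at DATA.
   (Hypothetical helper, for illustration only.) */
size_t count_zero_bytes(const void *data, size_t size)
{
    const unsigned char *bytes = data;
    size_t zeros = 0;

    for (size_t i = 0; i < size; i++)
        if (bytes[i] == 0)
            zeros++;
    return zeros;
}
```

For example, given wchar_t w = L'A', the call count_zero_bytes (&w, sizeof w) returns 3 on an implementation with a four-byte wchar_t, while the same call on a char holding 'A' returns 0.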
Wide character values 0 through 0177 (octal; 127 decimal) are always identical in meaning to the ASCII character codes. The wide character value zero is often used to terminate a string of wide characters, just as a single byte with value zero often terminates a string of ordinary characters.
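The terminator convention can be sketched with a small loop that scans a wide string until it reaches the wide character value zero. The name wide_length is hypothetical; where <wchar.h> is available, the standard function wcslen does the same job.

```c
#include <stddef.h>

/* Return the number of wide characters in S, not counting the
   terminating zero.  (Hypothetical helper; compare wcslen.) */
size_t wide_length(const wchar_t *s)
{
    size_t n = 0;

    while (s[n] != L'\0')
        n++;
    return n;
}
```

Here wide_length (L"hello") returns 5. Each element of that string is a whole wchar_t object, so the loop never needs to examine individual bytes.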
If your system supports extended characters, then each extended character has both a wide character code and a corresponding multibyte basic sequence.
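The correspondence between the two representations can be sketched with the ISO C functions mbstowcs and wcstombs, which convert a whole string in each direction according to the current locale. The function round_trips and the buffer sizes below are assumptions made for this example. In an encoding that uses shift sequences, the bytes produced on the way back need not match the input exactly, so treat this as a sketch and not a general test.

```c
#include <stdlib.h>
#include <string.h>

/* Convert the multibyte string MB to wide characters and back again.
   Return 1 if the result matches MB, and 0 on any conversion error
   or if a buffer is too small.  (Hypothetical helper.) */
int round_trips(const char *mb)
{
    wchar_t wide[64];
    char back[256];

    size_t nwide = mbstowcs(wide, mb, 64);
    if (nwide == (size_t) -1 || nwide >= 64)
        return 0;   /* invalid sequence, or no room for the terminator */

    size_t nbytes = wcstombs(back, wide, sizeof back);
    if (nbytes == (size_t) -1 || nbytes >= sizeof back)
        return 0;   /* unconvertible code, or no room for the terminator */

    return strcmp(mb, back) == 0;
}
```

In the default "C" locale, round_trips ("abc") returns 1, since each ASCII character corresponds to a single byte and to a single wide character code.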
In this chapter, the term code is used to refer to a single extended character object to emphasize the distinction from the char data type.