The GNU C Library - Example of Conversion


Character-by-Character Conversion Example

Here is an example that reads multibyte character text from the file descriptor input and writes the corresponding wide characters to the file descriptor output. We need to convert characters one by one for this example because mbstowcs cannot continue past a null character, and cannot cope with an apparently invalid partial character by reading more input.
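As a small illustration of the first limitation, the sketch below passes mbstowcs a buffer with a null character in the middle. The helper name converted_before_null and the sample bytes are made up for this illustration, and the result described assumes the default "C" locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not part of the library: convert a fixed
   buffer that contains an embedded null character and report how
   many wide characters mbstowcs produced.  */
size_t
converted_before_null (wchar_t *out, size_t outlen)
{
  /* Five data bytes with a null in the middle, plus the usual
     terminating null added by the compiler.  */
  static const char bytes[] = "ab\0cd";

  assert (out != NULL && outlen >= sizeof bytes);

  /* mbstowcs treats the first null as the end of the string, so
     the bytes after it are never examined.  */
  return mbstowcs (out, bytes, outlen);
}
```

Called with an array of eight wide characters, this should convert only the "ab" prefix, leaving "cd" untouched. That is why a whole-buffer conversion is not suitable for data read from a file.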

	#include <error.h>    /* error (GNU extension) */
	#include <limits.h>   /* MB_LEN_MAX */
	#include <stdio.h>    /* BUFSIZ, perror */
	#include <stdlib.h>   /* mbtowc, MB_CUR_MAX, wchar_t */
	#include <string.h>   /* memmove */
	#include <unistd.h>   /* read, write */

	int
	file_mbstowcs (int input, int output)
	{
	  /* Room for one full read plus a partial character carried
	     over from the previous read.  */
	  char buffer[BUFSIZ + MB_LEN_MAX];
	  int filled = 0;
	  int eof = 0;

	  while (!eof)
	    {
	      int nread;
	      int nwrite;
	      char *inp = buffer;
	      /* Every byte in buffer may turn into one wide character. */
	      wchar_t outbuf[BUFSIZ + MB_LEN_MAX];
	      wchar_t *outp = outbuf;

	      /* Fill up the buffer from the input file.  */
	      nread = read (input, buffer + filled, BUFSIZ);
	      if (nread < 0)
	        {
	          perror ("read");
	          return 0;
	        }
	      /* If we reach end of file, make a note to read no more. */
	      if (nread == 0)
	        eof = 1;

	      /* filled is now the number of bytes in buffer. */
	      filled += nread;

	      /* Convert those bytes to wide characters--as many as we can. */
	      while (filled > 0)
	        {
	          int thislen = mbtowc (outp, inp, filled);
	          /* Stop converting at invalid character;
	             this can mean we have read just the first part
	             of a valid character.  */
	          if (thislen == -1)
	            break;
	          /* Treat null character like any other,
	             but also reset shift state. */
	          if (thislen == 0)
	            {
	              thislen = 1;
	              mbtowc (NULL, NULL, 0);
	            }
	          /* Advance past this character. */
	          inp += thislen;
	          filled -= thislen;
	          outp++;
	        }

	      /* Write the wide characters we just made.  */
	      nwrite = write (output, outbuf,
	                      (outp - outbuf) * sizeof (wchar_t));
	      if (nwrite < 0)
	        {
	          perror ("write");
	          return 0;
	        }

	      /* See if we have a real invalid character. */
	      if ((eof && filled > 0) || filled >= MB_CUR_MAX)
	        {
	          error (0, 0, "invalid multibyte character");
	          return 0;
	        }

	      /* If any characters must be carried forward,
	         put them at the beginning of buffer.  The two regions
	         may overlap, so use memmove rather than memcpy.  */
	      if (filled > 0)
	        memmove (buffer, inp, filled);
	    }

	  return 1;
	}
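
The function above has to guess whether a failed conversion means "invalid" or "not yet complete", because mbtowc reports both cases as -1. As a hedged alternative, the sketch below uses the restartable function mbrtowc, which returns (size_t) -2 for an incomplete character and keeps the partial bytes in an mbstate_t object. The helper name convert_chunk and its parameters are hypothetical and are not part of the example above. The sketch also assumes that a null character occupies exactly one input byte.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: convert bytes src[0..len) into at most cap
   wide characters in dst.  Returns the number of wide characters
   stored, sets *consumed to the number of bytes used, and sets
   *invalid to 1 if an invalid sequence was found.  */
size_t
convert_chunk (wchar_t *dst, size_t cap, const char *src, size_t len,
               mbstate_t *state, size_t *consumed, int *invalid)
{
  size_t used = 0;
  size_t made = 0;

  assert (dst != NULL && src != NULL && state != NULL);
  assert (consumed != NULL && invalid != NULL);
  *invalid = 0;

  while (used < len && made < cap)
    {
      size_t n = mbrtowc (dst + made, src + used, len - used, state);

      if (n == (size_t) -1)
        {
          /* Definitely invalid; more input will not help.  */
          *invalid = 1;
          break;
        }
      if (n == (size_t) -2)
        {
          /* Incomplete character: the remaining bytes are now
             remembered in *state, so nothing is carried forward.  */
          used = len;
          break;
        }
      if (n == 0)
        n = 1;  /* Null character; assumed to be one byte long.  */

      used += n;
      made++;
    }

  *consumed = used;
  return made;
}
```

A caller would zero-initialize one mbstate_t before the first read and pass the same object for every chunk. If the conversion state is not in its initial state at end of file (see mbsinit), the input ended in the middle of a character.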
