Iterating over string/strlen with umlauted characters

This is a follow-up to my previous question. I succeeded in implementing the algorithm for checking umlauted characters. The next problem comes from iterating over all characters in a string. I do this like so:

int main()
{
    char* str = "Hej du kalleåäö";
    printf("length of str: %zu", strlen(str));

    for (int i = 0; i < strlen(str); i++)
        printf("%s ", to_morse(str[i]));
    return 0;
}

The problem is that, because of the umlauted characters, it prints 18 instead of 15, and the to_morse function also fails (it ignores these characters). The to_morse function accepts an unsigned char as a parameter. What would be the best way to solve this? I know I can check for the umlaut character here instead of in the letterNr function, but I don't know if that would be a pretty/logical solution.


Normally, you'd store the string as wchar_t and use something like wcslen to get its length. That would give you the number of printed characters as opposed to the number of bytes you stored.

You really shouldn't be implementing UTF or Unicode or whatever multibyte character handling yourself; there are libraries for that sort of thing.
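As a rough illustration of the wchar_t approach in plain C, the sketch below walks a wide string one element at a time. The helper name count_non_ascii is hypothetical (it is not from the question), and the sketch assumes every character of interest fits in a single wchar_t, which holds for å, ä and ö on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters outside 7-bit ASCII
   in a wide string, for example the umlauted letters. */
size_t count_non_ascii(const wchar_t *ws)
{
    assert(ws != NULL);
    size_t count = 0;
    for (size_t i = 0; ws[i] != L'\0'; i++) {
        if (ws[i] > 0x7F)   /* each element is one whole character here */
            count++;
    }
    return count;
}
```

With the string written as the wide literal L"Hej du kalle\u00e5\u00e4\u00f6", wcslen should report 15 and count_non_ascii should report 3, because each umlauted letter occupies one element rather than two bytes.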

On OS X, Cocoa is a solution. Note the use of "%C" in NSLog; that's a unichar (16-bit Unicode character):

#import <Cocoa/Cocoa.h>

int main()
{
        NSAutoreleasePool * pool = [NSAutoreleasePool new];
        NSString * input = @"Hej du kalleåäö";

        /* length returns an NSUInteger, so cast it for %d */
        printf("length of str: %d", (int)[input length]);
        int i=0;
        for (i = 0; i < [input length]; i++)
                NSLog(@"%C", [input characterAtIndex:i]);

        [pool release];
        return 0;
}

You could do something like

for (int i = 0; str[i] != '\0'; ++i) {
    // do something with str[i]
}

Strings in C are terminated with '\0', so it is possible to check for the end of the string like that.

EDIT: What locale are you using?

If you are going to iterate over a string, don't bother getting its length with strlen. Just iterate until you see a NUL character:

char *p = str;
while (*p != '\0') {
    printf("%c\n", *p);
    ++p;
}

As for the umlauted characters and such, are they UTF-8? If the string is multi-byte, you could do something like this:

size_t n = strlen(str);
char *p = str;
char *e = p + n;
while (*p != '\0') {
    wchar_t wc;
    /* mbtowc decodes according to the current locale (see setlocale) */
    int l = mbtowc(&wc, p, e - p);
    if (l <= 0) break;
    p += l;
    /* do whatever with wc which is now in wchar_t form */
}

I honestly don't know if mbtowc will simply return -1 if it encounters a NUL in the middle of a MB character. If it does, you could just pass MB_CUR_MAX instead of e - p and do away with the strlen call. But I have a feeling this is not the case.
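If the string is known to be UTF-8, the gap between bytes and characters in the question can also be shown without relying on the locale. The sketch below is only an illustration under that assumption: utf8_length is a hypothetical helper, it does not validate its input, and it counts code points rather than what a reader would perceive as characters (combining sequences would be counted more than once).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts code points in a NUL-terminated string,
   assuming it is valid UTF-8. Continuation bytes have the form 10xxxxxx,
   so every byte that does not match that pattern starts a character. */
size_t utf8_length(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For "Hej du kalleåäö" encoded as UTF-8, strlen reports 18 bytes while utf8_length reports 15, since each of å, ä and ö is a lead byte followed by one continuation byte. For real text processing a library or the locale-aware functions above remain the safer choice.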
