What's a portable way of converting the byte order of strings in C?

I am trying to write a server that will communicate with any standard client that can make socket connections (e.g. a telnet client).

It started out as an echo server, which of course did not need to worry about network byte ordering.

I am familiar with the ntohs, ntohl, htons, and htonl functions. These would be great by themselves if I were transferring either 16-bit or 32-bit ints, or if the characters in the string being sent were multiples of 2 or 4 bytes.

I'd like to create a function that operates on strings, such as:

str_ntoh(char* net_str, char* host_str, int len)
{
    uint32_t* netp, hostp;
    netp = (uint32_t*)&net_str;
    for(i=0; i < len/4; i++){
         hostp[i] = ntoh(netp[i]);
    }
}

Or something similar. The above assumes that the word size is 32 bits. We can't be sure that the word size on the sending machine is not 16 bits or 64 bits, right?

Client programs such as telnet must be using hton* before they send and ntoh* after they receive data, correct?

EDIT: For the people who think that because one char is a byte, endianness doesn't matter:

int main(void)
{
    uint32_t a = 0x01020304;
    char* c = (char*)&a;
    printf("%x %x %x %x\n", c[0], c[1], c[2], c[3]);

}

Run this snippet of code. The output for me is as follows:

$ ./a.out
  4 3 2 1

Those on PowerPC chipsets should get 1 2 3 4, but those of us on Intel chipsets should, for the most part, see what I got above.

Best answer

Maybe I'm missing something here, but are you sending strings, that is, sequences of characters? Then you don't need to worry about byte order. Byte order only matters for multi-byte values such as integers. The characters in a string are always in the "right" order.

EDIT:

Derrick, to address your code example, I've run the following (slightly expanded) version of your program on an Intel i7 (little-endian) and on an old Sun SPARC (big-endian):

#include <stdio.h>
#include <stdint.h> 

int main(void)
{
    uint32_t a = 0x01020304;
    char* c = (char*)&a;
    char d[] = { 1, 2, 3, 4 };
    printf("The integer: %x %x %x %x\n", c[0], c[1], c[2], c[3]);
    printf("The string:  %x %x %x %x\n", d[0], d[1], d[2], d[3]);
    return 0;
}

As you can see, I've added a real char array to your print-out of an integer.

The output from the little-endian Intel i7:

The integer: 4 3 2 1
The string:  1 2 3 4

And the output from the big-endian Sun:

The integer: 1 2 3 4
The string:  1 2 3 4

Your multi-byte integer is indeed stored in different byte order on the two machines, but the characters in the char array have the same order.
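To make the distinction concrete, here is a minimal sketch of a length-prefixed message, which is an assumed wire format for illustration and not something the question specifies. The helper names pack_string and unpack_string are hypothetical. Only the 32-bit length goes through htonl/ntohl; the characters are copied byte for byte.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 4-byte length in network byte order, then the
 * characters of `text` unchanged. Returns bytes written, or 0 if `out` is too small. */
size_t pack_string(const char *text, unsigned char *out, size_t out_size)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    if (len > UINT32_MAX || out_size < sizeof(uint32_t) + len)
        return 0;
    uint32_t net_len = htonl((uint32_t)len);    /* only the integer is converted */
    memcpy(out, &net_len, sizeof net_len);
    memcpy(out + sizeof net_len, text, len);    /* characters are copied as-is */
    return sizeof net_len + len;
}

/* Hypothetical helper: inverse of pack_string. Returns the string length,
 * or -1 if the input is truncated or `text` is too small. */
long unpack_string(const unsigned char *in, size_t in_size,
                   char *text, size_t text_size)
{
    assert(in != NULL && text != NULL);
    uint32_t net_len;
    if (in_size < sizeof net_len)
        return -1;
    memcpy(&net_len, in, sizeof net_len);
    uint32_t len = ntohl(net_len);              /* convert the integer back */
    if (len > in_size - sizeof net_len || len >= text_size)
        return -1;
    memcpy(text, in + sizeof net_len, len);     /* no per-character conversion */
    text[len] = '\0';
    return (long)len;
}
```

Packing "hello" this way yields the length bytes 00 00 00 05 followed by the five characters in their original order, on either kind of machine.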

Answers

With your function signature as posted, you don't have to worry about byte order. It accepts a char*, which can only handle 8-bit characters. With one byte per character, you cannot have a byte order problem.

You'd only run into a byte order problem if you send Unicode in UTF-16 or UTF-32 encoding and the endianness of the sending machine doesn't match that of the receiving machine. The simple solution is to use UTF-8 encoding, which is what most text is sent as across networks. Being byte-oriented, it doesn't have a byte order issue either. Or you could send a BOM (byte order mark).
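As a rough sketch of the BOM option for UTF-16: the receiver looks at the first two bytes (FE FF means big-endian, FF FE means little-endian) and then assembles each 16-bit code unit from individual bytes, so the receiver's own endianness never enters into it. The enum and function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum utf16_order { UTF16_UNKNOWN, UTF16_BE, UTF16_LE };

/* Hypothetical helper: inspect the first two bytes of a buffer for a UTF-16 BOM. */
enum utf16_order detect_utf16_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 2) {
        if (buf[0] == 0xFE && buf[1] == 0xFF)
            return UTF16_BE;
        if (buf[0] == 0xFF && buf[1] == 0xFE)
            return UTF16_LE;
    }
    return UTF16_UNKNOWN;   /* no BOM: the order must be agreed on some other way */
}

/* Hypothetical helper: build one 16-bit code unit from two bytes in the given
 * order. Shifts work on values, so the result is the same on any host.
 * Anything other than UTF16_LE is treated as big-endian here. */
uint16_t read_utf16_unit(const unsigned char *p, enum utf16_order order)
{
    assert(p != NULL);
    if (order == UTF16_LE)
        return (uint16_t)(p[0] | (p[1] << 8));
    return (uint16_t)((p[0] << 8) | p[1]);
}
```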

If you'd like to send them as an 8-bit encoding (the fact that you're using char implies this is what you want), there's no need to byte swap. However, for the unrelated issue of non-ASCII characters, so that the same character > 127 appears the same on both ends of the connection, I would suggest that you send the data in something like UTF-8, which can represent all Unicode characters and can be safely treated as ASCII strings. The way to get UTF-8 text based on the default encoding varies by the platform and set of libraries you're using.

If you're sending a 16-bit or 32-bit encoding, you can include one character with the byte order mark, which the other end can use to determine the endianness of the characters. Or you can assume network byte order and use htons() or htonl() as you suggest. But if you'd like to use char, please see the previous paragraph. :-)
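The "assume network byte order" option might look like the following for 16-bit code units. This is only a sketch: the function names utf16_hton and utf16_ntoh are hypothetical, and it assumes the text is already held as an array of uint16_t.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert `count` 16-bit code units to network byte order. */
void utf16_hton(const uint16_t *host, uint16_t *net, size_t count)
{
    assert(host != NULL && net != NULL);
    for (size_t i = 0; i < count; i++)
        net[i] = htons(host[i]);
}

/* Hypothetical helper: convert `count` 16-bit code units back to host byte order. */
void utf16_ntoh(const uint16_t *net, uint16_t *host, size_t count)
{
    assert(net != NULL && host != NULL);
    for (size_t i = 0; i < count; i++)
        host[i] = ntohs(net[i]);
}
```

The sender calls utf16_hton before writing to the socket and the receiver calls utf16_ntoh after reading, so neither side needs to know the other's endianness.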

It seems to me that the function prototype doesn't match its behavior. You're passing in a char *, but you're then casting it to uint32_t *. And, looking more closely, you're casting the address of the pointer, rather than the contents, so I'm concerned that you'll get unexpected results. Perhaps the following would work better:

void arr_ntoh(const uint32_t* netp, uint32_t* hostp, int len)
  {
  /* len is the number of 32-bit elements, not bytes */
  for (int i = 0; i < len; i++)
    hostp[i] = ntohl(netp[i]);
  }

I'm basing this on the assumption that what you've really got is an array of uint32_t and you want to run ntohl() on all of them.
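If the data does arrive in a byte buffer, as the original char* signature suggests, one way to avoid the pointer cast altogether is to copy each 4-byte group into a uint32_t with memcpy before converting it. This is a sketch under that assumption; buf_ntohl and its return convention are hypothetical.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interpret `len` bytes as 32-bit integers in network
 * byte order and store them in host order. Returns the number of integers
 * converted, or -1 if `len` is not a multiple of 4. */
long buf_ntohl(const unsigned char *net_buf, size_t len, uint32_t *host)
{
    assert(net_buf != NULL && host != NULL);
    if (len % sizeof(uint32_t) != 0)
        return -1;
    size_t count = len / sizeof(uint32_t);
    for (size_t i = 0; i < count; i++) {
        uint32_t word;
        /* memcpy sidesteps alignment problems that a pointer cast could cause */
        memcpy(&word, net_buf + i * sizeof word, sizeof word);
        host[i] = ntohl(word);
    }
    return (long)count;
}
```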

I hope this is helpful.




