English 中文(简体)
Is searching a HashTable slower when the key are strings and the strings contain spaces
原标题:

Today I was discussing with another developer about a limitation in a third party library, where we couldn t use spaces in a string. The reasoning was that the strings were used as keys in a .NET Hashtable, and that searching a .NET HashTable was significantly slower when the keys contained spaces.

Now since I m too lazy to write a test, but I still want to understand why this would be so, I ask my question here:

Is it slower to search a Hashtable when the string that is used contains a space?

I would not expect this, since before the search is performed, the hash is obtained using String.GetHashCode() and then that hash is used to locate the entry in the table.

Thanks!

最佳回答

Straight from the Rotor source, the core of the String.GetHashcode method:

                int     c;
                char *s = src;
                while ((c = s[0]) != 0) {
                    hash1 = ((hash1 << 5) + hash1) ^ c;
                    c = s[1];
                    if (c == 0)
                        break;
                    hash2 = ((hash2 << 5) + hash2) ^ c;
                    s += 2;
                }

What I can make up of this: spaces do not get any special treatment.

Conclusion:

  • The third party does not use HashTable or wraps something around to string to make spaces slower.
  • Or they are trying to obfuscate their implementation by telling stories.
问题回答

It shouldn t be slower. It uses GetHashCode() internally so set of characters in string doesn t matter.

This said, performance is dependent only on GetHashCode implementation for String. You may get different results for different framework versions (from MSDN):

The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.

White space increases the length of the string slowing the hash function but I expect this to be really insignificant. On the other side, leaving the white spaces in the string can lead to a better hash with less collisions. So I don t think there s any problem using a string with spaces in a HashTable.





相关问题
Simple JAVA: Password Verifier problem

I have a simple problem that says: A password for xyz corporation is supposed to be 6 characters long and made up of a combination of letters and digits. Write a program fragment to read in a string ...

Case insensitive comparison of strings in shell script

The == operator is used to compare two strings in shell script. However, I want to compare two strings ignoring case, how can it be done? Is there any standard command for this?

Trying to split by two delimiters and it doesn t work - C

I wrote below code to readin line by line from stdin ex. city=Boston;city=New York;city=Chicago and then split each line by ; delimiter and print each record. Then in yet another loop I try to ...

String initialization with pair of iterators

I m trying to initialize string with iterators and something like this works: ifstream fin("tmp.txt"); istream_iterator<char> in_i(fin), eos; //here eos is 1 over the end string s(in_i, ...

break a string in parts

I have a string "pc1|pc2|pc3|" I want to get each word on different line like: pc1 pc2 pc3 I need to do this in C#... any suggestions??

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签