The exact hash algorithm isn t specified by the standard, so the results
will vary. The algorithm used by VC10 doesn t seem to take all of the
characters into account if the string is longer than 10 characters; it
advances with an increment of 1 + s.size() / 10
. This is legal,
albeit from a QoI point of view, rather disappointing; such hash codes
are known to perform very poorly for some typical sets of data (like
URLs). I d strongly suggest you replace it with either a FNV hash or
one based on a Mersenne prime:
FNV hash:
struct hash
{
size_t operator()( std::string const& s ) const
{
size_t result = 2166136261U ;
std::string::const_iterator end = s.end() ;
for ( std::string::const_iterator iter = s.begin() ;
iter != end ;
++ iter ) {
result = (16777619 * result)
^ static_cast< unsigned char >( *iter ) ;
}
return result ;
}
};
Mersenne prime hash:
struct hash
{
size_t operator()( std::string const& s ) const
{
size_t result = 2166136261U ;
std::string::const_iterator end = s.end() ;
for ( std::string::const_iterator iter = s.begin() ;
iter != end ;
++ iter ) {
result = 127 * result
+ static_cast< unsigned char >( *iter ) ;
}
return result ;
}
};
(The FNV hash is supposedly better, but the Mersenne prime hash will be
faster on a lot of machines, because multiplying by 127 is often
significantly faster than multiplying by 16777619.)