Unexpected collision with std::hash

I know that hashing an infinite number of strings into a 32-bit int must produce collisions, but I expect a hash function to give a reasonably good distribution.

Isn't it strange that these two strings have the same hash?

size_t hash0 = std::hash<std::string>()("generated_id_0");
size_t hash1 = std::hash<std::string>()("generated_id_1");
//hash0 == hash1

I know I can use boost::hash<std::string> or something else, but I want to know what is wrong with std::hash. Am I using it wrong? Should I somehow "seed" it?

Best answer

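You are not doing anything wrong in your use of std::hash. The problem is that the specialization std::hash<std::string> provided by the standard library implementation bundled with Visual Studio 2010 uses only a subset of the string's characters to compute the hash value (presumably for performance reasons). As it happens, the last character of a 14-character string is not part of that subset, which is why both strings yield the same hash value.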
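To make the effect concrete, here is a minimal sketch of a hash that samples only some characters. It assumes the stride of 1 + size / 10 reported in another answer on this page and borrows the FNV-style mixing step from that answer; the names sampled_hash and full_hash are made up for illustration, and this is not the actual Visual Studio source.

```cpp
#include <cstddef>
#include <string>

// Illustrative only: an FNV-style hash that visits every `stride`-th
// character, with stride = 1 + size / 10 (an assumption taken from the
// description of VC10's behaviour, not from its source code).
std::size_t sampled_hash(const std::string& s)
{
    std::size_t result = 2166136261U;
    const std::size_t stride = 1 + s.size() / 10;
    for (std::size_t i = 0; i < s.size(); i += stride) {
        result = (16777619U * result) ^ static_cast<unsigned char>(s[i]);
    }
    return result;
}

// Same mixing step, but visiting every character.
std::size_t full_hash(const std::string& s)
{
    std::size_t result = 2166136261U;
    for (std::size_t i = 0; i < s.size(); ++i) {
        result = (16777619U * result) ^ static_cast<unsigned char>(s[i]);
    }
    return result;
}
```

For a 14-character string the stride is 2, so only indices 0, 2, ..., 12 are read and index 13 (the trailing digit) is never seen. Under that assumption sampled_hash gives the same value for "generated_id_0" and "generated_id_1", while full_hash distinguishes them.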

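As far as I know, this behaviour conforms to the standard, which demands only that multiple calls to the hash function with the same argument always return the same value. The probability of a hash collision, however, should be minimal. The VS2010 implementation fulfils the mandatory part but fails to account for the optional one.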

For details, see the implementation in the header file xfunctional (starting at line 869 in my copy) and §17.6.3.4 of the C++ standard (latest public draft).

If you absolutely need a better hash function for strings, you should implement it yourself. It's actually not that hard.

Other answers

The exact hash algorithm isn't specified by the standard, so the results will vary. The algorithm used by VC10 doesn't seem to take all of the characters into account if the string is longer than 10 characters; it advances with an increment of 1 + s.size() / 10. This is legal, albeit rather disappointing from a QoI (quality of implementation) point of view; such hash codes are known to perform very poorly for some typical sets of data (like URLs). I'd strongly suggest you replace it with either an FNV hash or one based on a Mersenne prime:

FNV hash:

struct hash
{
    size_t operator()( std::string const& s ) const
    {
        // 32-bit FNV offset basis
        size_t result = 2166136261U ;
        std::string::const_iterator end = s.end() ;
        for ( std::string::const_iterator iter = s.begin() ;
              iter != end ;
              ++ iter ) {
            // FNV-1 step: multiply by the 32-bit FNV prime, then XOR in the byte
            result = (16777619 * result)
                    ^ static_cast< unsigned char >( *iter ) ;
        }
        return result ;
    }
};

Mersenne prime hash:

struct hash
{
    size_t operator()( std::string const& s ) const
    {
        size_t result = 2166136261U ;
        std::string::const_iterator end = s.end() ;
        for ( std::string::const_iterator iter = s.begin() ;
              iter != end ;
              ++ iter ) {
            // 127 = 2^7 - 1 is a Mersenne prime
            result = 127 * result
                   + static_cast< unsigned char >( *iter ) ;
        }
        return result ;
    }
};

(The FNV hash is supposedly better, but the Mersenne prime hash will be faster on a lot of machines, because multiplying by 127 is often significantly faster than multiplying by 16777619.)
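Either functor can be handed to the standard unordered containers in place of std::hash<std::string>. The following is a minimal sketch of that, assuming a C++11 standard library; FnvStringHash, IdMap and make_id_map are hypothetical names, and the functor body is the FNV version from above written more compactly.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical name; the body repeats the FNV functor from this answer.
struct FnvStringHash
{
    std::size_t operator()(const std::string& s) const
    {
        std::size_t result = 2166136261U;
        for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
            result = (16777619U * result) ^ static_cast<unsigned char>(*it);
        }
        return result;
    }
};

// The third template argument replaces the default std::hash<std::string>.
typedef std::unordered_map<std::string, int, FnvStringHash> IdMap;

// Builds a small map keyed by the two strings from the question.
IdMap make_id_map()
{
    IdMap ids;
    ids["generated_id_0"] = 0;
    ids["generated_id_1"] = 1;
    return ids;
}
```

Because this functor reads every character, the two keys from the question get different hash values regardless of which standard library is in use.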

You may well get different hash values. I do (GCC 4.5):

hashtest.cpp

#include <string>
#include <iostream>
#include <functional>

int main(int argc, char** argv)
{
    size_t hash0 = std::hash<std::string>()("generated_id_0");
    size_t hash1 = std::hash<std::string>()("generated_id_1");
    std::cout << hash0 << (hash0 == hash1 ? " == " : " != ") << hash1 << "\n";
    return 0;
}

Output

# g++ hashtest.cpp -o hashtest -std=gnu++0x
# ./hashtest
16797002355621538189 != 16797001256109909978

You do not seed a hash function; at most you can salt it.

The function is used in the right way and this collision could be just fortuitous.

You cannot tell whether a hash function distributes its values unevenly unless you run a large-scale test with random keys.
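A rough sketch of such a test follows. It is only an illustration: count_collisions and ConstantHash are hypothetical names, and for simplicity it feeds sequential keys of the form used in the question rather than random ones (random keys could be generated and passed through the same counting loop). It assumes a C++11 standard library for std::to_string.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical helper: hashes the keys "<prefix>0" .. "<prefix>(n-1)" and
// returns how many of them produced a hash value that was already seen.
template <typename Hasher>
std::size_t count_collisions(Hasher hasher, const std::string& prefix, std::size_t n)
{
    std::unordered_set<std::size_t> seen;
    std::size_t collisions = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::string key = prefix + std::to_string(i);
        // insert() reports in .second whether the value was new
        if (!seen.insert(hasher(key)).second) {
            ++collisions;
        }
    }
    return collisions;
}

// Deliberately bad hasher, used only to show what the counter reports.
struct ConstantHash
{
    std::size_t operator()(const std::string&) const { return 42; }
};
```

Calling count_collisions(std::hash<std::string>(), "generated_id_", n) for a large n gives a first impression of how a given library's hash behaves on this kind of key, while ConstantHash shows the worst case of n - 1 collisions. Counting full-value collisions is only one measure; how evenly values spread over a container's buckets would need a separate check.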

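The TR1 hash function and the newest standard define proper overloads for things like strings. When I run this code using std::tr1::hash (g++ 4.1.2), I get different hash values for the two strings.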




