English 中文(简体)
Anagram Hash Function
原标题:

I know something like this has been asked before, but the answer was sort of side tracked.

I want to develop a hash function which will take a word and spit out an address of an array.

So, for example, if you input god:

  1. sort the word, d o g

  2. perform some sort of function on this to get an address d o g -> some number

  3. insert dog into address some_number in array[].

I can t seem to make a function which doesn t get screwed up somehow.

  public static int hashCode(String word){
     char[] x = word.toCharArray();
     Arrays.sort(x);
     int hash = 0;
     for(int i =0; i<x.length; i++)
     {
        hash +=(x[i]-96)*(x[i]-96)*(x[i]-96)*(i+1)*(i+1)+i; 
     }
     hash %=size; // get a value that s inside the bounds of the array
     if(hash<0)
      hash = hash + size;

     return (hash); 
    }

This is my current algorithm but there are two problems.

  • the array size has the be huge so that there aren t a ton of collisions

  • there still are a few collisions, chair for example, produces: smarminess, parr, chair

What do you guys think? I really appreciate your help

最佳回答

Your hash function looks totally arbitrary. Why are you using that?

There are a few common, well known and relatively good hash functions, see a description here:

http://www.azillionmonkeys.com/qed/hash.html

See also https://stackoverflow.com/questions/263400#263416

问题回答

There is a lot of research on hash functions and collision resolution. Here s a place to start: Hash Table

I guess that -- from your title and from the Arrays.sort(x) function -- that you re looking for a hash function that expressly collides when two strings are anagrams of each other. Is this correct? If so, you should specify that requirement INSIDE the question.

The article that Vinko suggested is good. I also recommend Integer Hash Function for other algorithms that you might try.

Good luck!

If you you really want to develop a "hash" that deliberately collides for all anagrams (in other words one that s amenable to finding anagrams in a hash table) then why not split the string into an array of characters, filter out any characters you want to ignore (non-letters) and sort the results, concatenate and then hash that string.

Thus "dog" and "god" both get munged into "dgo" and that s your key for all anagrams of "dog."

In modern versions of Python all that verbiage can be summarized in the following one-line function:

def anagrash(s):
    return   .join(sorted([x for x in s.lower() if s.isalpha()]))

... which you might use as:

anagrams = dict()
for each in phrases:
    ahash = anagrash(each)
    if ahash not in anagrams:
        anagrams[ahash] = list()
    anagrams[ahash].append(each)

... to build a dictionary of possible anagrams from a list of phrases.

Then, to filter out all of the phrases for which no anagram was found:

for key,val in anagrams:
    if len(val) < 2:
        del anagrams[key]

So, there s your homework assignment. Less than a dozen lines of Python. Porting that to whatever language your instructor is teaching and wrapping it in logic to read in the phrases and write out results is all left as an exercise to the student.





相关问题
Spring Properties File

Hi have this j2ee web application developed using spring framework. I have a problem with rendering mnessages in nihongo characters from the properties file. I tried converting the file to ascii using ...

Logging a global ID in multiple components

I have a system which contains multiple applications connected together using JMS and Spring Integration. Messages get sent along a chain of applications. [App A] -> [App B] -> [App C] We set a ...

Java Library Size

If I m given two Java Libraries in Jar format, 1 having no bells and whistles, and the other having lots of them that will mostly go unused.... my question is: How will the larger, mostly unused ...

How to get the Array Class for a given Class in Java?

I have a Class variable that holds a certain type and I need to get a variable that holds the corresponding array class. The best I could come up with is this: Class arrayOfFooClass = java.lang....

SQLite , Derby vs file system

I m working on a Java desktop application that reads and writes from/to different files. I think a better solution would be to replace the file system by a SQLite database. How hard is it to migrate ...

热门标签