English 中文(简体)
使用 Levenshtien 的字词社交网络
原标题:social network of word using Levenshtien Distance
  • 时间:2012-05-25 11:25:56
  •  标签:
  • algorithm

这是从"这里 提取的问题。

Two words are friends if they have a Levenshtein distance of 1 (For details see http://en.wikipedia.org/wiki/Levenshtein_distance). That is, you can add, remove, or substitute exactly one letter in word X to create word Y. A word’s social network consists of all of its friends, plus all of their friends, and all of their friends’ friends, and so on. Write a program to tell us how big the social network for the word hello is, using this word list https://raw.github.com/codeeval/Levenshtein-Distance-Challenge/master/input_levenshtein_distance.txt Input

Your program should accept as its first argument a path to a filename.The input file contains the word list. This list is also available at https://raw.github.com/codeeval/Levenshtein-Distance-Challenge/master/input_levenshtein_distance.txt Output

例如,abcde一词的社会网络是4846。

Can any one help to come up with some logic for the same. It is not a home work problem.

问题回答

A simple O(n^2) solution would be to model the problem as a graph:
G = (V,E), where V = { all words } and E = { (u,v) | u is friend of v }.

下一个算法如下(高等级伪代码):

1. Create the graph from the data
2. Run a BFS from the source, and continue while there are more 
   vertices that can be discovered. 
3. When you are done, the size of the `visited` set is the size of 
   the social network (this set is the actual social network)

<强度 > 复杂度:

  • Creating this graph is O(n^2) (check all pairs).
  • BFS is also O(n^2) since |E| < n^2, so you get total of O(n^2) algorithm.

如果你知道如何找到Levenshtein距离, 你需要知道的只是两个词之间的Levenstein距离。

在这里您不需要绘制完整的图表。 更好的方法将是维持一个散列表格, 上面写着您知道的单词。 这样您就可以避免多余的对子。 这正是我的意思 。

Suppose the words are: Right Bright Wright

所有对子都有一个编辑距离。 但如果你只想要 Rights S 社交网络, 你不需要考虑对对Bright和Wright。

继续这样对所检查的所有单词进行检查, 直到您所检查的列表中没有添加 。

您可以使用 BFS 或 DFS 或任何算法来返回图形的树覆盖, 它与您的口味相当相近 。





相关问题
How to add/merge several Big O s into one

If I have an algorithm which is comprised of (let s say) three sub-algorithms, all with different O() characteristics, e.g.: algorithm A: O(n) algorithm B: O(log(n)) algorithm C: O(n log(n)) How do ...

Grokking Timsort

There s a (relatively) new sort on the block called Timsort. It s been used as Python s list.sort, and is now going to be the new Array.sort in Java 7. There s some documentation and a tiny Wikipedia ...

Manually implementing high performance algorithms in .NET

As a learning experience I recently tried implementing Quicksort with 3 way partitioning in C#. Apart from needing to add an extra range check on the left/right variables before the recursive call, ...

Print possible strings created from a Number

Given a 10 digit Telephone Number, we have to print all possible strings created from that. The mapping of the numbers is the one as exactly on a phone s keypad. i.e. for 1,0-> No Letter for 2->...

Enumerating All Minimal Directed Cycles Of A Directed Graph

I have a directed graph and my problem is to enumerate all the minimal (cycles that cannot be constructed as the union of other cycles) directed cycles of this graph. This is different from what the ...

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签