Best matches for a partially specified word in Python
  • Date: 2011-03-25 16:36:31
  • Tags: python

I have a file dict.txt that has all words in the English language.

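The user will input a partial word.

In a typical input, unknown characters would preferably be marked with an underscore (_) rather than a hyphen (-).

I want the program to produce a list of all the best matches found in the dictionary.

Example: for a given partial word, the list would contain every dictionary word that fits it.

Is there any way to do this using for loops?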

Answers

Instead of using _ to denote wildcards, use \w. Add \b to the beginning and end of the pattern, then just run the dictionary through a regexp matcher. So -un--- becomes:

>>> import re
>>> re.findall(r'\b\wun\w\w\w\b', "run runner bunt bunter bunted bummer")
['runner', 'bunter', 'bunted']

\w matches any word character. \b matches any word boundary.

This sounds like a job for a search algorithm of some kind, but this should give you a starting point.
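The substitution described above can be done programmatically. The following is a minimal sketch, not the answerer's code: the helper name `pattern_to_regex` and its `wildcard` parameter are made up for illustration. It escapes the literal characters with `re.escape` so that stray punctuation in the input is not read as regex syntax.

```python
import re

def pattern_to_regex(partial, wildcard="-"):
    """Turn a partial word such as '-un---' into a compiled regex.

    Hypothetical helper: each wildcard becomes \\w, everything else is
    matched literally, and \\b anchors the match to whole words.
    """
    body = "".join(r"\w" if ch == wildcard else re.escape(ch) for ch in partial)
    return re.compile(r"\b" + body + r"\b")

text = "run runner bunt bunter bunted bummer"
matches = pattern_to_regex("-un---").findall(text)
print(matches)  # ['runner', 'bunter', 'bunted']
```

Passing `wildcard="_"` would accept underscores instead. Note that `\w` also matches digits and underscores, which may or may not matter for a given word list.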

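One solution might be to index the file (if that can be done in a reasonable time) into a tree structure, with each character as a node value and each child as a following character. You can then traverse the tree using the input as a map. A character tells you which node to go to next, while a dash means that all child nodes should be included. Each time you reach a leaf at depth n, where n is the length of the input, you know you have a match.

The good news is that once the index is built, your searches will be much faster. Building the index may take forever, though...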
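A rough sketch of that tree (a trie) might look like the following. The class and function names and the sample word list are invented for illustration. One deliberate difference from the description: a match is recorded with an `is_word` flag rather than by reaching a leaf, because a word such as "run" can be a prefix of a longer word and so need not sit on a leaf.

```python
class TrieNode(object):
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.is_word = False  # True if a dictionary word ends at this node

def build_trie(words):
    """Index the words one character per tree level."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root

def search(node, pattern, prefix="", wildcard="-"):
    """Yield every stored word that fits the pattern."""
    if not pattern:
        if node.is_word:
            yield prefix
        return
    head, rest = pattern[0], pattern[1:]
    if head == wildcard:
        # A wildcard follows every child; sorting keeps the output stable.
        candidates = sorted(node.children.items())
    elif head in node.children:
        candidates = [(head, node.children[head])]
    else:
        candidates = []
    for ch, child in candidates:
        for found in search(child, rest, prefix + ch, wildcard):
            yield found

# Made-up sample words, standing in for the real dictionary file.
trie = build_trie(["ran", "rat", "rob", "run", "rune", "sun"])
print(list(search(trie, "r-n")))  # ['ran', 'run']
```

The recursion depth equals the pattern length, so ordinary words stay well within Python's default recursion limit.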

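It uses a fair amount of memory, but this does the job:

import re
import sys

# Build the regex from the command-line argument, e.g. "r-n" becomes \br\wn\b
word = r'\b' + sys.argv[1].replace('-', r'\w') + r'\b'
print(word)

with open('data.txt', 'r') as fh:
    # Reads the whole file into memory and scans it in one pass
    print(re.findall(word, fh.read()))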

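A few approaches occur to me.

The first is to preprocess your dictionary into words[wordlength][offset][char_at_offset] = set(matching words); your query then becomes the intersection of all the relevant word sets. Very fast, but memory-intensive, and it needs a lot of setup work.

# search for 'r-n'
matches = list(words[3][0]['r'] & words[3][2]['n'])

The second is a linear scan of the dictionary using regular expressions; much slower, but with a minimal memory footprint.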

import re

# match() only anchors at the start, so add $ to reject longer words such as "rung"
foundMatch = re.compile(r'r.n$').match
matches = [word for word in allWords if foundMatch(word)]

The third would be a recursive search through a word trie.

The fourth, which sounds like what you want, is a naive word matcher:

with open('dictionary.txt') as inf:
    all_words = [word.strip().lower() for word in inf]  # one word per line

find_word = 'r-tt-r'
matching_words = []
for word in all_words:
    if len(word)==len(find_word):
        if all(find==ch or find=='-' for find,ch in zip(find_word, word)):
            matching_words.append(word)
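Since the question prefers underscores as the wildcard, the same loop can be wrapped in a small function that accepts either character. This is only a sketch: the name `fits`, the `wildcards` parameter, and the sample words are made up for illustration.

```python
def fits(pattern, word, wildcards="-_"):
    """True if word has the pattern's length and agrees at every fixed position."""
    return len(pattern) == len(word) and all(
        p in wildcards or p == ch for p, ch in zip(pattern, word)
    )

# Made-up sample words, standing in for the real dictionary file.
sample_words = ["ratter", "rotter", "rutted", "butter", "rot"]
print([w for w in sample_words if fits("r_tt_r", w)])  # ['ratter', 'rotter']
```

The length check comes first because `zip` silently stops at the shorter of its two inputs.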

Edit: the full code for the first option follows:

from collections import defaultdict
from functools import reduce   # a builtin on Python 2, must be imported on Python 3
import operator

try:
    inp = raw_input    # Python 2.x
except NameError:
    inp = input        # Python 3.x

class Words(object):
    @classmethod
    def fromFile(cls, fname):
        with open(fname) as inf:
            return cls(inf)

    def __init__(self, words=None):
        super(Words,self).__init__()
        self.words = set()
        # index[word_length][offset][char_at_offset] -> set of matching words
        self.index = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
        _addword = self.addWord
        for word in words or ():
            _addword(word.strip().lower())

    def addWord(self, word):
        if not word:
            return    # skip blank lines
        self.words.add(word)
        _ind = self.index[len(word)]
        for ind,ch in enumerate(word):
            _ind[ind][ch].add(word)

    def findAll(self, pattern):
        pattern = pattern.strip().lower()
        _ind = self.index[len(pattern)]
        sets = [_ind[ind][ch] for ind,ch in enumerate(pattern) if ch!='-']
        if not sets:
            # all wildcards: every word of this length matches
            return set(w for w in self.words if len(w)==len(pattern))
        # intersect the word sets for every fixed position
        return reduce(operator.__and__, sets)

def main():
    print('Loading dict... ')
    words = Words.fromFile('dict.txt')
    print('done.')

    while True:
        seek = inp('Enter partial word ("-" is wildcard, nothing to exit): ').strip()
        if seek:
            print("Matching words: "+' '.join(words.findAll(seek))+'\n')
        else:
            break

if __name__=="__main__":
    main()



