I have a file dict.txt that has all words in the English language.
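The user will input a partial word.
In a typical input, unknown characters would preferably be marked with an underscore rather than a hyphen (-).
I want the program to come up with a list of all the best matches found in the dictionary.
Example: if the partial word was r--, the list would contain run, ran, rat, rob, and so on.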
Is there a way to do this using for loops?
There is a convenient way to do this, but since it is not clear whether this question is homework, the details are left to the reader.
Instead of using _ to denote wildcards, use \w. Add \b to the beginning and end of the pattern, then just run the dictionary through a regexp matcher. So -un--- becomes:
>>> import re
>>> re.findall(r'\b\wun\w\w\w\b', "run runner bunt bunter bunted bummer")
['runner', 'bunter', 'bunted']
\w matches any word character. \b matches any word boundary.
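The translation from a user pattern to a regular expression can be automated. The following is a minimal sketch, not part of the original answer: the helper names pattern_to_regex and find_matches are made up for illustration, and it assumes Python 3.4+ for re.fullmatch. It matches word by word instead of relying on \b, and it escapes literal characters so that input such as "." is not treated as regex syntax.

```python
import re

def pattern_to_regex(pattern, wildcard="-"):
    # Hypothetical helper: each wildcard becomes \w, every other
    # character is escaped so it only matches itself.
    parts = [r"\w" if ch == wildcard else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

def find_matches(pattern, words, wildcard="-"):
    # fullmatch anchors at both ends, so the word length must equal
    # the pattern length.
    regex = pattern_to_regex(pattern, wildcard)
    return [w for w in words if regex.fullmatch(w)]

words = "run runner bunt bunter bunted bummer".split()
print(find_matches("-un---", words))
```

Passing wildcard="_" gives the underscore behaviour the question asks for.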
If you want to do this repeatedly, you should build an index:
wordlist = [word.strip() for word in "run, ran, rat, rob, fish, tree".split(',')]

from collections import defaultdict

class Index(object):
    def __init__(self, wordlist=()):
        # a single dict holds two kinds of keys: word lengths and (pos, char) markers
        self.trie = defaultdict(set)
        for word in wordlist:
            self.add_word(word)

    def add_word(self, word):
        """ adds word to the index """
        # save the length of the word
        self.trie[len(word)].add(word)
        for marker in enumerate(word):
            # add word to the set of words with (pos,char)
            self.trie[marker].add(word)

    def find(self, pattern, wildcard='-'):
        # get all words with matching length as candidates
        # (copy the set, otherwise &= below would shrink the index itself)
        candidates = set(self.trie[len(pattern)])
        # keep only the words that have all the markers
        for marker in enumerate(pattern):
            if marker[1] != wildcard:
                candidates &= self.trie[marker]
                # exit early if there are no candidates
                if not candidates:
                    return None
        return candidates

# replace the sample list with the real dictionary, one word per line
with open('dict.txt', 'rt') as lines:
    wordlist = [word.strip() for word in lines]
s = Index(wordlist)
print(s.find("r--"))
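Tries are made for searching strings. This is a simple trie-like index that uses a single dictionary.
Sounds like a job for a search algorithm or something similar, but I'll give you a start.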
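One solution could be to index the file (if that can be done in a reasonable time) into a tree structure, where each character is a node value and each child is a character that can follow it. You can then traverse the tree using the input as a map: a character tells you which node to go to next, and a dash means that all child nodes should be included. Every time you hit a leaf at depth n, where n is the length of the input, you know you have found a match.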
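The tree traversal described above might look roughly like the sketch below. It is an illustration under assumptions, not the answerer's code: TrieNode, build_trie and search are made-up names, and an end-of-word flag is used in place of "hitting a leaf", because one word can be a prefix of another (run and runner).

```python
class TrieNode(object):
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a word ends at this node

def build_trie(words):
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root

def search(node, pattern, wildcard="-", prefix=""):
    # Pattern used up: it is a match only if a word ends exactly here.
    if not pattern:
        return [prefix] if node.is_word else []
    head, rest = pattern[0], pattern[1:]
    if head == wildcard:
        # A wildcard follows every child (sorted for stable output).
        branches = sorted(node.children.items())
    elif head in node.children:
        branches = [(head, node.children[head])]
    else:
        branches = []
    matches = []
    for ch, child in branches:
        matches.extend(search(child, rest, wildcard, prefix + ch))
    return matches

trie = build_trie(["run", "ran", "rat", "rob", "fish", "tree", "runner"])
print(search(trie, "r--"))
```

The recursion depth equals the pattern length, so ordinary dictionary words stay well within Python's default recursion limit.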
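The nice thing is that once you have built the index, your searches will be much faster. The indexing itself could take forever, though...
This takes a bit of memory, but it does the job: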
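import re
import sys

# turn each "-" into \w and anchor the pattern on word boundaries
word = r'\b' + sys.argv[1].replace('-', r'\w') + r'\b'
print(word)
with open('data.txt', 'r') as fh:
    # reads the whole file into memory, then scans it once
    print(re.findall(word, fh.read()))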
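A few approaches occur to me.
The first is to preprocess your dictionary into words[length][offset][char] = set(matching words); your query then becomes an intersection of all the relevant word sets. Very fast, but memory-intensive and a lot of set-up work.
Example:
# search for r-n
matches = list(words[3][0]['r'] & words[3][2]['n'])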
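The second is a linear scan of the dictionary using regular expressions; much slower, but with a minimal memory footprint.
Example:
import re
# "." stands for the unknown letter; "$" stops longer words such as "rental" from matching
foundMatch = re.compile(r'r.n$').match
# allWords is the list of dictionary words
matches = [word for word in allWords if foundMatch(word)]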
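The third would be a recursive search into a word trie.
The fourth, which sounds like what you want, is a naive scan over every word: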
with open('dictionary.txt') as inf:
    all_words = [word.strip().lower() for word in inf]  # one word per line

find_word = 'r-tt-r'
matching_words = []
for word in all_words:
    # only words of the same length can match
    if len(word) == len(find_word):
        # every position must be equal or be a wildcard
        if all(find == ch or find == '-' for find, ch in zip(find_word, word)):
            matching_words.append(word)
Edit: full code for the first option follows:
from collections import defaultdict
from functools import reduce  # reduce is no longer a builtin in Python 3.x
import operator

try:
    inp = raw_input    # Python 2.x
except NameError:
    inp = input        # Python 3.x

class Words(object):
    @classmethod
    def fromFile(cls, fname):
        with open(fname) as inf:
            return cls(inf)

    def __init__(self, words=()):
        super(Words, self).__init__()
        self.words = set()
        # index[length][offset][char] -> set of matching words
        self.index = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
        _addword = self.addWord
        for word in words:
            _addword(word.strip().lower())

    def addWord(self, word):
        self.words.add(word)
        _ind = self.index[len(word)]
        for ind, ch in enumerate(word):
            _ind[ind][ch].add(word)

    def findAll(self, pattern):
        pattern = pattern.strip().lower()
        _ind = self.index[len(pattern)]
        # intersect the word sets of every non-wildcard position;
        # note: a pattern made only of wildcards returns every word, whatever its length
        sets = (_ind[ind][ch] for ind, ch in enumerate(pattern) if ch != '-')
        return reduce(operator.__and__, sets, self.words)

def main():
    print('Loading dict... ')
    words = Words.fromFile('dict.txt')
    print('done.')

    while True:
        seek = inp('Enter partial word ("-" is wildcard, nothing to exit): ').strip()
        if seek:
            print("Matching words: " + ' '.join(words.findAll(seek)) + '\n')
        else:
            break

if __name__ == "__main__":
    main()