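I need to process a huge file that consists of one single, very long line, word by word. I know the usual methods that iterate over a file line by line, but they don't work for me because of the single-line structure.
Any alternatives?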
It really depends on your definition of "word". But try this:
f = open("your-filename-here").read()
for word in f.split():
    # do something with word
    print(word)
This will use whitespace as the word delimiter.
Of course, this is just a quick example.
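To make the "whitespace as delimiter" behaviour concrete, here is a small sketch. The sample string is made up and stands in for the file's contents; it only illustrates how `split()` with no arguments differs from `split(" ")`.

```python
# Hypothetical sample text standing in for the contents of the file.
text = "alpha  beta\tgamma\n delta "

# split() with no arguments splits on runs of any whitespace
# (spaces, tabs, newlines) and ignores leading/trailing whitespace.
words = text.split()

# split(" ") splits on every single space, so empty strings show up
# and tabs/newlines are left inside the pieces.
pieces = text.split(" ")

print(words)
print(pieces)
```

If the definition of "word" is anything other than "separated by whitespace", the no-argument form is probably not enough on its own.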
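A long line? I assume the line is too big to reasonably fit in memory, so you need some kind of buffering.
First of all, this is bad form; if you have any control over the file, make it one word per line.
If not, use something like this:
def read_words(input_file):
    # Generator: yields space-separated words, reading the file in chunks
    line = ''
    while True:
        word, space, line = line.partition(' ')
        if space:
            # A word was found
            yield word
        else:
            # A word was not found; read a chunk of data from file
            next_chunk = input_file.read(1000)
            if next_chunk:
                # Add the chunk to our line
                line = word + next_chunk
            else:
                # No more data; yield the last word and return
                yield word.rstrip('\n')
                return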
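The buffering idea above can also be sketched so that it treats any whitespace (not only a single space) as a separator. This is a minimal sketch under assumptions, not the answer's own code: the name `iter_words` and the `chunk_size` parameter are made up for illustration, and the stream is assumed to be opened in text mode.

```python
import io


def iter_words(stream, chunk_size=4096):
    """Yield whitespace-separated words from a text stream, chunk by chunk."""
    tail = ""  # possibly incomplete word carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file
        buffer = tail + chunk
        parts = buffer.split()
        if not parts:
            # The buffer held only whitespace.
            tail = ""
            continue
        if buffer[-1].isspace():
            # The buffer ends on a separator, so every part is a complete word.
            tail = ""
        else:
            # The last part may continue in the next chunk; hold it back.
            tail = parts.pop()
        for word in parts:
            yield word
    if tail:
        yield tail


# Usage with an in-memory stream and a deliberately tiny chunk size.
sample = io.StringIO("one two  three\nfour")
print(list(iter_words(sample, chunk_size=3)))
```

Only one chunk plus one partial word is held at a time, although a single extremely long word would still be accumulated in full.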
You really should consider using a generator:
def word_gen(file):
    for line in file:
        for word in line.split():
            yield word

with open('somefile') as f:
    # The generator does nothing until it is iterated over
    for word in word_gen(f):
        print(word)
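There are more efficient ways of doing this, but syntactically this is probably the shortest:
words = open('myfile').read().split()
If memory is a concern, you won't want to do this, because it loads the entire file into memory instead of iterating over it.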
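If memory is the concern, one possible alternative (a different technique from the one-liner above, not something this answer proposed) is to memory-map the file and scan it lazily with a regular expression. The function name `iter_words_mmap` is made up for illustration. Note that it yields `bytes`, not `str`, and that `mmap` raises `ValueError` on an empty file.

```python
import mmap
import re


def iter_words_mmap(path):
    """Lazily yield whitespace-separated words (as bytes) from a file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages data in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # rb"\S+" matches each run of non-whitespace bytes.
            for match in re.finditer(rb"\S+", mm):
                yield match.group()
```

A caller would decode each word if text is needed, for example `word.decode("utf-8")`, assuming the file is UTF-8 encoded.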
I've answered a similar question before, but I have refined the method used in that answer and here is the updated version (copied from a recent answer):
Here is my totally functional approach, which avoids having to read and split lines. It makes use of the itertools module.
Note: for Python 3, replace itertools.imap with map.
import itertools

def readwords(mfile):
    byte_stream = itertools.groupby(
        itertools.takewhile(lambda c: bool(c),
            itertools.imap(mfile.read,
                itertools.repeat(1))), str.isspace)

    return ("".join(group) for pred, group in byte_stream if not pred)
Sample usage:
>>> import sys
>>> for w in readwords(sys.stdin):
...     print (w)
...
I really love this new method of reading words in python
I
really
love
this
new
method
of
reading
words
in
python
It's soo very Functional!
It's
soo
very
Functional!
>>>
I guess in your case, this would be the way to use the function:
with open('words.txt', 'r') as f:
    for word in readwords(f):
        print(word)
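For Python 3, the same character-at-a-time grouping idea can be sketched with the two-argument form of `iter`, which keeps calling a function until it returns a sentinel value. This is a variation for illustration only; the name `readwords_py3` is made up, and the stream is assumed to be opened in text mode.

```python
import io
import itertools


def readwords_py3(stream):
    """Yield whitespace-separated words, reading one character at a time."""
    # iter(callable, sentinel) calls stream.read(1) repeatedly and stops
    # when it returns "" (end of file).
    chars = iter(lambda: stream.read(1), "")
    # groupby clusters consecutive characters by whether they are whitespace.
    for is_space, group in itertools.groupby(chars, key=str.isspace):
        if not is_space:
            yield "".join(group)


# Usage with an in-memory stream standing in for an open file.
print(list(readwords_py3(io.StringIO("It's soo very\tFunctional!\n"))))
```

Reading one character per call keeps memory use minimal, but it is likely to be slower than chunked reading on large files.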
Read it in as a normal string, then split it on whitespace to break it into words?
Something like:
word_list = loaded_string.split()
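After reading the file, you could do:
# text holds the file contents (renamed from str to avoid shadowing the built-in)
l = len(pattern)
i = 0
while True:
    i = text.find(pattern, i)
    if i == -1:
        break
    print(text[i:i + l])  # or do whatever
    i += l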
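The search loop above can be wrapped into a small helper that collects the start positions of each non-overlapping match. This is only a sketch; the name `find_all` is made up, and the guard against an empty pattern is an addition, since an empty pattern would otherwise never advance the search position.

```python
def find_all(text, pattern):
    """Return the start indices of non-overlapping occurrences of pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    positions = []
    i = text.find(pattern)
    while i != -1:
        positions.append(i)
        # Resume the search just past the match that was found.
        i = text.find(pattern, i + len(pattern))
    return positions


print(find_all("abcabcab", "ab"))
```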
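Donald Miner's suggestion looks good. Simple and short. I used the following in some code I wrote a while ago:
l = []
# The original used mode "rU"; Python 3 handles universal newlines by default
f = open("filename.txt", "r")
for line in f:
    for word in line.split():
        l.append(word)
It is just a longer version of what Donald Miner suggested.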