Original title: Iterate through words of a file in Python

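I need to iterate through the words of a large file that consists of a single, very long line. I know the methods for iterating through a file line by line, but they do not apply in my case because of the single-line structure.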

Are there any alternatives?

Answers

It really depends on your definition of a word. But try this:

# Read the whole file into memory, then split it on whitespace
with open("your-filename-here") as f:
    text = f.read()
for word in text.split():
    # do something with word
    print(word)

This uses whitespace as the word boundary.

Of course, this is just a quick example.
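To make the point about the definition of a word concrete, here is a small sketch comparing two definitions. The sample string and the regular expression are assumptions chosen for illustration, not part of the original answer.

```python
import re

text = "Hello, world! It's 2 words... or more?"

# Definition 1: a word is any run of non-whitespace characters,
# so punctuation stays attached to the words.
by_whitespace = text.split()

# Definition 2: a word is a run of ASCII letters, digits, or apostrophes,
# so surrounding punctuation is dropped.
by_regex = re.findall(r"[A-Za-z0-9']+", text)

print(by_whitespace)
print(by_regex)
```

Which definition is right depends on what the words are used for afterwards.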

A very long line? I assume the line is too big to reasonably fit in memory, so you need some kind of buffering.

First of all, this is a bad format; if you have any control over the file, make it one word per line.

If not, use something like this:

# Wrapped in a generator function so that `yield` is valid;
# the name read_words is only illustrative.
def read_words(input_file):
    line = ''
    while True:
        word, space, line = line.partition(' ')
        if space:
            # A word was found
            yield word
        else:
            # A word was not found; read a chunk of data from file
            next_chunk = input_file.read(1000)
            if next_chunk:
                # Add the chunk to our line
                line = word + next_chunk
            else:
                # No more data; yield the last word and return
                yield word.rstrip('\n')
                return
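The loop above splits on single spaces only, so consecutive spaces produce empty words and newlines are not treated as separators. A minimal sketch of the same chunked-reading idea that splits on any whitespace could look like this; the function name `iter_words` and the `chunk_size` parameter are hypothetical.

```python
import io


def iter_words(input_file, chunk_size=1000):
    """Yield whitespace-separated words, reading chunk_size characters at a time."""
    leftover = ""
    while True:
        chunk = input_file.read(chunk_size)
        if not chunk:
            break
        chunk = leftover + chunk
        words = chunk.split()
        # If the chunk does not end in whitespace, its last word may continue
        # in the next chunk, so hold it back instead of yielding it now.
        if words and not chunk[-1].isspace():
            leftover = words.pop()
        else:
            leftover = ""
        yield from words
    # Whatever is still held back at end of file is the final word.
    if leftover:
        yield leftover


# Usage with an in-memory file and a tiny chunk size to exercise the buffering
for word in iter_words(io.StringIO("one  two\nthree four"), chunk_size=3):
    print(word)
```

Only one chunk plus one partial word is kept in memory at a time, as long as no single word is itself enormous.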

You really should consider using a generator:

def word_gen(file):
    for line in file:
        for word in line.split():
            yield word

with open('somefile') as f:
    # The generator does nothing until it is iterated
    for word in word_gen(f):
        print(word)

There are more efficient ways of doing this, but syntactically this might be the shortest:

words = open('myfile').read().split()

If memory is a concern, you won't want to do this, because it loads the entire file into memory instead of iterating over it.
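One possible way to iterate without loading everything into a Python string is to memory-map the file and scan it with a regular expression. This is only a sketch under assumptions: the function name `iter_words_mmap` is hypothetical, the file is assumed to be non-empty and UTF-8 encoded, and a word is taken to be a run of non-whitespace bytes.

```python
import mmap
import re


def iter_words_mmap(path):
    """Lazily yield whitespace-separated words from a memory-mapped file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # A bytes pattern can be matched directly against the mapped buffer.
            for match in re.finditer(rb"\S+", mm):
                yield match.group().decode("utf-8")
```

The operating system pages the file in as the scan advances, so the process does not need to hold the whole file in memory as one string.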

I've answered a similar question before, but I have refined the method used in that answer, and here is the updated version (copied from a recent answer):

Here is my totally functional approach which avoids having to read and split lines. It makes use of the itertools module:

Note: for Python 3, replace itertools.imap with map.

import itertools

def readwords(mfile):
    byte_stream = itertools.groupby(
      itertools.takewhile(lambda c: bool(c),
          itertools.imap(mfile.read,
              itertools.repeat(1))), str.isspace)

    return ("".join(group) for pred, group in byte_stream if not pred)

Sample usage:

>>> import sys
>>> for w in readwords(sys.stdin):
...     print (w)
... 
I really love this new method of reading words in python
I
really
love
this
new
method
of
reading
words
in
python
           
It's soo very Functional!
It's
soo
very
Functional!
>>>

I guess in your case, this would be the way to use the function:

with open('words.txt', 'r') as f:
    for word in readwords(f):
        print(word)
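For reference, applying the Python 3 note (the built-in `map` instead of `itertools.imap`) gives roughly the following. This is a sketch with added comments and an in-memory sample; the sample text is an assumption for illustration.

```python
import io
import itertools


def readwords(mfile):
    # Read one character at a time; read(1) returns '' at end of file,
    # which is falsy and stops takewhile.
    chars = itertools.takewhile(bool, map(mfile.read, itertools.repeat(1)))
    # Group consecutive characters by whether they are whitespace.
    groups = itertools.groupby(chars, str.isspace)
    # Keep only the non-whitespace groups and join each into a word.
    return ("".join(group) for is_space, group in groups if not is_space)


sample = io.StringIO("I really love\nthis   method")
words = list(readwords(sample))
print(words)
```

Reading one character per call is simple but slow on large files; it trades speed for never holding more than one word in memory.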

Read in the line as normal, then split it on whitespace to break it into words?

Something like:

word_list = loaded_string.split()

After reading the line, you could do:

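# `line` is assumed to hold the string read from the file
# (the original used `str`, which shadows the built-in type)
l = len(pattern)
i = 0
while True:
    i = line.find(pattern, i)
    if i == -1:
        break
    print(line[i:i+l])  # or do whatever
    i += l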


What Donald Miner suggested looks good. Simple and short. I used the following in some code I wrote a while ago:

l = []
# "rU" mode was removed in Python 3.11; plain "r" already handles universal newlines
with open("filename.txt", "r") as f:
    for line in f:
        for word in line.split():
            l.append(word)

This is a longer version of what Donald Miner suggested.




