English 中文(简体)
Structured programming and Python generators?
原标题:

Update: What I really wanted all along were greenlets.


Note: This question mutated a bit as people answered and forced me to "raise the stakes", as my trivial examples had trivial simplifications; rather than continue to mutate it here, I will repose the question when I have it clearer in my head, as per Alex s suggestion.


Python generators are a thing of beauty, but how can I easily break one down into modules (structured programming)? I effectively want PEP 380, or at least something comparable in syntax burden, but in existing Python (e.g. 2.6)

As an (admittedly stupid) example, take the following:

def sillyGenerator():
  for i in xrange(10):
    yield i*i
  for i in xrange(12):
    yield i*i
  for i in xrange(8):
    yield i*i

Being an ardent believer in DRY, I spot the repeated pattern here and factor it out into a method:

def quadraticRange(n):
  for i in xrange(n)
    yield i*i

def sillyGenerator():
  quadraticRange(10)
  quadraticRange(12)
  quadraticRange(8)

...which of course doesn t work. The parent must call the new function in a loop, yielding the results:

def sillyGenerator():
  for i in quadraticRange(10):
    yield i
  for i in quadraticRange(12):
    yield i
  for i in quadraticRange(8):
    yield i

...which is even longer than before!

If I want to push part of a generator into a function, I always need this rather verbose, two-line wrapper to call it. It gets worse if I want to support send():

def sillyGeneratorRevisited():
  g = subgenerator()
  v = None
  try:
    while True:
      v = yield g.send(v)
  catch StopIteration:
    pass
  if v < 4:
    # ...
  else:
    # ...

And that s not taking into account passing in exceptions. The same boilerplate every time! Yet one cannot apply DRY and factor this identical code into a function, because...you d need the boilerplate to call it! What I want is something like:

def sillyGenerator():
  yield from quadraticRange(10)
  yield from quadraticRange(12)
  yield from quadraticRange(8)

def sillyGeneratorRevisited():
  v = yield from subgenerator()
  if v < 4:
    # ...
  else:
    # ...

Does anyone have a solution to this problem? I have a first-pass attempt, but I d like to know what others have come up with. Ultimately, any solution will have to tackle examples where the main generator performs complex logic based on the result of data sent into the generator, and potentially makes a very large number of calls to sub-generators: my use-case is generators used to implement long-running, complex state machines.

最佳回答

However, I d like to make my reusability criteria one notch harder: what if I need a control structure around my repeated generation?

itertools often helps even there - you need to provide concrete examples where you think it doesn t.

For instance, I might want to call a subgenerator forever with different parameters.

itertools.chain.from_iterable.

Or my subgenerators might be very costly, and I only want to start them up as and when they are reached.

Both chain and chain_from_iterable do that -- no sub-iterator is "started up" until the very instant the first item from it is needed.

Or (and this is a real desire) I might want to vary what I do next based on what my controller passes me using send().

A specific example would be greatly appreciated. Anyway, worst case, you ll be coding for x in blargh: yield x where the suspended Pep3080 would let you code yield from blargh -- about 4 extra characters (not a tragedy;-).

And if some sophisticated coroutine-version of some itertools functionality (itertools mostly supports iterators - there s no equivalent coroutools module yet) becomes warranted, because a certain pattern of coroutine composition is often repeated in your code, then it s not too hard to code it yourself.

For example, suppose we often find ourselves doing something like: first yield a certain value; then, repeatedly, if we re sent foo , yield the next item from fooiter, if bla , from blaiter, if zop , from zopiter, anything else, from defiter. As soon as we spot the second occurrence of this compositional pattern, we can code:

def corou_chaiters(initsend, defiter, val2itermap):
  currentiter = iter([initsend])
  while True:
    val = yield next(currentiter)
    currentiter = val2itermap(val, defiter)

and call this simple compositional function as and when needed. If we need to compose other coroutines, rather than general iterators, we ll have a slightly different composer using the send method instead of the next built-in function; and so forth.

If you can offer an example that s not easily tamed by such techniques, I suggest you do so in a separate question (specifically targeted to coroutine-like generators), as there s already a lot of material on this one that will have little to do with your other, much more complex/sophisticated, example.

问题回答

You want to chain several iterators together:

from itertools import chain

def sillyGenerator(a,b,c):
    return chain(quadraticRange(a),quadraticRange(b),quadraticRange(c))

Impractical (unfortunately) answer:

from __future__ import PEP0380

def sillyGenerator():
    yield from quadraticRange(10)
    yield from quadraticRange(12)
    yield from quadraticRange(8)

Potentially practical reference: Syntax for delegating to a subgenerator

Unfortunately making this impractical: Python language moratorium

UPDATE Feb 2011:

The moratorium has been lifted, and PEP 380 is on the TODO list for Python 3.3. Hopefully this answer will be practical soon.

Read Guido s remarks on comp.python.devel

import itertools

def quadraticRange(n):
  for i in xrange(n)
    yield i*i

def sillyGenerator():
  return itertools.chain(
    quadraticRange(10),
    quadraticRange(12),
    quadraticRange(8),
  )

def sillyGenerator2():
  return itertools.chain.from_iterable(
    quadraticRange(n) for n in [10, 12, 8])

The last is useful if you want to make sure one iterator is exhausted before another starts (including its initialization code).

There is a Python Enhancement Proposal for providing a yield from statement for "delegating generation". Your example would be written as:

def sillyGenerator():
  sq = lambda i: i * i
  yield from map(sq, xrange(10))
  yield from map(sq, xrange(12))
  yield from map(sq, xrange(8))

Or better, in the spirit of DRY:

def sillyGenerator():
  for i in [10, 12, 8]:
    yield from quadraticRange(i)

The proposal is in draft status and its eventual inclusion is not certain, but it shows that other developers share your thoughts about generators.

For an arbitrary number of calls to quadraticRange:

from itertools import chain

def sillyGenerator(*args):
    return chain(*map(quadraticRange, args))

This code uses map and itertools.chain. It takes an arbitrary number of arguments and passes them in order to quadraticRange. The resulting iterators are then chained.

There is a pattern which I call "generator kernel" where generators don t yield directly to the user but to some "kernel" loop that treats (some of) their yields as "system calls" with special meaning.

You can apply it here by an intermediate function that accepts yielded generators and unrolls them automatically. To make it easy to use, we ll create that intermediate function in a decorator:

import functools, types

def flatten(values_or_generators):
    for x in values_or_generators:
        if isinstance(x, GeneratorType):
            for y in x:
                yield y
        else:
            yield x

# Better name anyone?
def subgenerator(g):
    """Decorator making ``yield <gen>`` mean ``yield from <gen>``."""

    @functools.wraps(g)
    def flat_g(*args, **kw):
        return flatten(g(*args, **kw))
    return flat_g

and then you can just write:

def quadraticRange(n):
    for i in xrange(n)
        yield i*i

@subgenerator
def sillyGenerator():
    yield quadraticRange(10)
    yield quadraticRange(12)
    yield quadraticRange(8)

Note that subgenerator() is unrolling exactly one level of the hierarchy. You could easily make it multi-level (by managing a manual stack, or just replacing the inner loop with for y in flatten(x): - but I think it s better as it is, so that every generator that wants to use this non-standard syntax has to be explicitly wrapped with @subgenerator.

Note also that detection of generators is imperfect! It will detect things written as generators, but that s an implementation detail. As a caller of a generator, all you care about is that it returns an iterator. It could be a function returning some itertools object, and then this decorator would fail.

Checking whether the object has a .next() method is too broad - you won t be able to yield strings without them being taken apart. So the most reliable way would be to check for some explicit marker, so you d write e.g.:

@subgenerator
def sillyGenerator():
    yield  from , quadraticRange(10)
    yield  from , quadraticRange(12)
    yield  from , quadraticRange(8)

Hey, that s almost like the PEP!

[credits: this answer gives a similar function - but it s deep (which I consider wrong) and is not a framed as a decorator]

class Communicator:
    def __init__(self, inflow):
        self.outflow = None
        self.inflow = inflow

Then you do:

c = Communicator(something)
yield c
response = c.outflow

And instead of the boilerplate code you can simply do:

 for i in run():
     something = i.inflow
     # ...
     i.outflow = value_to_return_back

It s simple enough code that works without much brain.





相关问题
Can Django models use MySQL functions?

Is there a way to force Django models to pass a field to a MySQL function every time the model data is read or loaded? To clarify what I mean in SQL, I want the Django model to produce something like ...

An enterprise scheduler for python (like quartz)

I am looking for an enterprise tasks scheduler for python, like quartz is for Java. Requirements: Persistent: if the process restarts or the machine restarts, then all the jobs must stay there and ...

How to remove unique, then duplicate dictionaries in a list?

Given the following list that contains some duplicate and some unique dictionaries, what is the best method to remove unique dictionaries first, then reduce the duplicate dictionaries to single ...

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

How can I make the PyDev editor selectively ignore errors?

I m using PyDev under Eclipse to write some Jython code. I ve got numerous instances where I need to do something like this: import com.work.project.component.client.Interface.ISubInterface as ...

How do I profile `paster serve` s startup time?

Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...

Pragmatically adding give-aways/freebies to an online store

Our business currently has an online store and recently we ve been offering free specials to our customers. Right now, we simply display the special and give the buyer a notice stating we will add the ...

Converting Dictionary to List? [duplicate]

I m trying to convert a Python dictionary into a Python list, in order to perform some calculations. #My dictionary dict = {} dict[ Capital ]="London" dict[ Food ]="Fish&Chips" dict[ 2012 ]="...

热门标签