There aren't all that many cases where a list comprehension (LC for short) will be substantially more useful than the equivalent generator expression (GE for short, i.e., using round parentheses instead of square brackets, to generate one item at a time rather than "all in bulk at the start").
Sometimes you can get a little extra speed by "investing" the extra memory to hold the list all at once, depending on vagaries of optimization and garbage collection on one or another version of Python, but that hardly amounts to substantial extra usefulness of LC vs GE.
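As a rough illustration of the memory being "invested", here is a minimal sketch using sys.getsizeof on made-up data (the name `n` and the range are arbitrary). It measures only the container objects themselves, not the floats they refer to, and it says nothing about speed, which varies by Python version:

```python
import sys

n = 10_000  # arbitrary size chosen for illustration

as_list = [float(i) for i in range(n)]   # all n items exist at once
as_gen = (float(i) for i in range(n))    # items are produced on demand

# The list object must hold n references; the generator object
# holds only its suspended execution state.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
```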
Essentially, to get substantial extra use out of the LC as compared to the GE, you need use cases which intrinsically require "more than one pass" on the sequence. In such cases, a GE would require you to generate the sequence once per pass, while, with an LC, you can generate the sequence once, then perform multiple passes on it (paying the generation cost only once). Multiple generation may also be problematic if the GE / LC are based on an underlying iterator that's not trivially restartable (e.g., a "file" that's actually a Unix pipe).
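The "one pass only" behavior is easy to see with a small sketch. The `data` list here is a made-up stand-in for an expensive or non-restartable source:

```python
# Hypothetical sample data standing in for a costly or one-shot source.
data = ["1", "2", "3"]

gen = (float(s) for s in data)
gen_first = sum(gen)    # this pass consumes the generator
gen_second = sum(gen)   # nothing is left, so this sums an empty sequence

lst = [float(s) for s in data]
lst_first = sum(lst)
lst_second = sum(lst)   # a list can be traversed as many times as needed
```

To get a second pass out of the GE you would have to build it again from the source, which is exactly the cost (or impossibility) described above.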
For example, say you are reading a non-empty open text file f which has a bunch of (textual representations of) numbers separated by whitespace (including newlines here and there, empty lines, etc.). You could transform it into a sequence of numbers with either a GE:
G = (float(s) for line in f for s in line.split())
or an LC:
L = [float(s) for line in f for s in line.split()]
Which one is better? Depends on what you're doing with it (i.e., the use case!). If all you want is, say, the sum, sum(G) and sum(L) will do just as well. If you want the average, sum(L)/len(L) is fine for the list, but won't work for the generator, since a generator has no len() -- given the difficulty in "restarting f", to avoid an intermediate list you'll have to do something like:
tot = 0.0
for i, x in enumerate(G): tot += x
return tot/(i+1)
nowhere near as snappy, fast, concise and elegant as return sum(L)/len(L).
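Here is a self-contained sketch of the two approaches side by side. The function names and the sample text are made up for illustration, and io.StringIO stands in for the open text file f; both functions assume the input contains at least one number:

```python
import io

def mean_streaming(f):
    # Single pass, no intermediate list: count items while summing them.
    numbers = (float(s) for line in f for s in line.split())
    total = 0.0
    count = 0
    for count, x in enumerate(numbers, start=1):
        total += x
    return total / count

def mean_with_list(f):
    # Build the list once, then make two passes over it (sum and len).
    numbers = [float(s) for line in f for s in line.split()]
    return sum(numbers) / len(numbers)

# Hypothetical file contents, including an empty line and stray spaces.
text = "1 2\n\n3.5\n 4.5 \n"
streaming_result = mean_streaming(io.StringIO(text))
list_result = mean_with_list(io.StringIO(text))
```

Both give the same answer; the streaming version just needs more bookkeeping to get there.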
Remember that sorted(G) does return a list (inevitably), so L.sort() (which is in-place) is the rough equivalent in this case -- sorted(L) would be supererogatory (as now you have two lists). So when sorting is needed, a generator may often be preferred simply due to conciseness.
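A short sketch of the two sorting routes, again on made-up data:

```python
# Hypothetical sample data.
data = ["3", "1", "2"]

# sorted() accepts any iterable and returns a new list,
# so the GE can be passed to it directly.
from_gen = sorted(float(s) for s in data)

# With an LC you already hold a list; sort it in place.
lst = [float(s) for s in data]
sort_return = lst.sort()   # list.sort() returns None, the list itself is reordered
```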
All in all, since L is identically equivalent to list(G), it's hard to get very excited about the ability to express it via punctuation (square brackets instead of round parentheses) instead of a single, short, pronounceable and obvious word like list ;-). And that's all an LC is -- a punctuation-based syntax shortcut for list(some_genexp)...!
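For the ordinary case discussed here, the equivalence is easy to check on made-up data:

```python
# Hypothetical sample data.
data = ["1.5", "2", "-3"]

via_brackets = [float(s) for s in data]      # list comprehension
via_list = list(float(s) for s in data)      # list() around a generator expression
```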