Iterators¶
From Generators to Iterators¶
Back to our initial very simple generator. What’s the YieldType
, SendType
and ReturnType
of this generator?
def get_something_generator():
yield "foo"
yield "bar"
yield "baz"
Actually it could be written as:
from typing import Generator
def get_something_generator() -> Generator[str, None, None]:
yield "foo"
yield "bar"
yield "baz"
This generator doesn’t receive and return anything and just yields a value on
every call. That means when calling we iterate over some values? Yes indeed.
Therefore Generator[YieldType, None, None]
could be rewritten as
Iterator[YieldType]
.
from typing import Iterator
def get_something_iterator() -> Iterator[str]:
yield "foo"
yield "bar"
yield "baz"
What’s an Iterator¶
Citing the Python docs, an Iterator is
An object representing a stream of data. Repeated calls to the iterator’s
__next__()
method […] return successive items in the stream. When no more data are available a StopIteration exception is raised instead.
Iterator base class¶
From the definition above we can derive a base class:
class Iterator:
def __next__(self):
"""Return the next item from the iterator.
When exhausted, raise StopIteration
"""
Hey, but if an Iterator
is a Generator
where are the generator protocol
methods?
Let’s take a look at the methods of a generator
object again:
print(dir(get_something_iterator()))
It contains the __next__
dunder method of the Iterator
. Does that mean …
Yes maybe from looking at the dir(get_something_iterator())
output you already
recognized the culprit. We lied initially about the Generator base class. I am
very sorry!
Actually the Generator
base class looks like:
class Generator(Iterator):
def send(self, value):
"""Send a value into the generator.
Return next yielded value or raise StopIteration.
"""
def throw(self, typ, val=None, tb=None):
"""Raise an exception in the generator.
Return next yielded value or raise StopIteration.
"""
def close(self):
"""Raise GeneratorExit inside generator.
"""
A Generator
is derived from an Iterator
which provides __next__
. That’s
why the Python glossary uses the term generator iterator
instead of
generator
but actually it might be a iterator generator
. Because this is a
bit confusing and the class name is also just generator
the term generator
fits best to me.
When providing the Iterator protocol for the Generator it could be implemented
with the following __next__
method:
def __next__(self):
"""Return the next item from the generator.
When exhausted, raise StopIteration.
"""
return self.send(None)
To verify this behavior let us take look at our simple generator/iterator function:
def get_something_iterator():
yield "foo"
yield "bar"
yield "baz"
Using it as a Generator
:
generator = get_something_iterator()
generator.send(None)
generator.send(None)
generator.send(None)
generator.send(None)
Output:
>>> generator = get_something_iterator()
>>> generator.send(None)
'foo'
>>> generator.send(None)
'bar'
>>> generator.send(None)
'baz'
>>> generator.send(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Which is actually the same as using the Iterator
behavior with the
next
function:
iterator = get_something_iterator()
next(iterator)
next(iterator)
next(iterator)
next(iterator)
Output:
>>> iterator = get_something_iterator()
>>> next(iterator)
'foo'
>>> next(iterator)
'bar'
>>> next(iterator)
'baz'
>>> next(iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
For Loop¶
Simplified version of a function mimicking a for ... in
loop behavior to
print the values of an Iterator
:
def for_loop(iterator):
while True:
try:
value = next(iterator)
print(value)
except StopIteration:
return
for_loop(get_something_iterator())
The same behavior written with a real for ... in
loop:
for value in get_something_iterator():
print(value)
Call Flow¶
What’s the interesting thing about Generators
/Iterators
and the yield
statement? Let us look at the following call flows:
def get_something_generator():
print("Within generator 1")
yield "foo"
print("Within generator 2")
yield "bar"
print("Within generator 3")
yield "baz"
print("Within generator 4")
for i in get_something_generator():
print("Within for 1")
print(i)
print("Within for 2")
Output:
Within generator 1
Within for 1
foo
Within for 2
Within generator 2
Within for 1
bar
Within for 2
Within generator 3
Within for 1
baz
Within for 2
Within generator 4
In contrast to iterating over a list:
for i in list(get_something_generator()):
print("Within for 1")
print(i)
print("Within for 2")
Output:
Within generator 1
Within generator 2
Within generator 3
Within generator 4
Within for 1
foo
Within for 2
Within for 1
bar
Within for 2
Within for 1
baz
Within for 2
A yield
statements suspend the current execution sequence and returns to the
calling statement! As you can see therefore Python Generators/Iterators are
coroutines!
Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed.