Iterators

From Generators to Iterators

Let’s go back to our initial, very simple generator. What are the YieldType, SendType and ReturnType of this generator?

def get_something_generator():
    yield "foo"
    yield "bar"
    yield "baz"

With type annotations, it could be written as:

from typing import Generator

def get_something_generator() -> Generator[str, None, None]:
    yield "foo"
    yield "bar"
    yield "baz"

This generator doesn’t receive or return anything; it just yields a value each time it is resumed. Does that mean we simply iterate over some values when using it? Yes, indeed. Therefore Generator[YieldType, None, None] can be rewritten as Iterator[YieldType].

from typing import Iterator

def get_something_iterator() -> Iterator[str]:
    yield "foo"
    yield "bar"
    yield "baz"
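As a side note, since Python 3.9 the typing.Iterator alias is deprecated in favor of collections.abc.Iterator, which can be subscripted directly. A minimal sketch of the same function with that import, assuming Python 3.9 or newer:

```python
from collections.abc import Iterator


def get_something_iterator() -> Iterator[str]:
    yield "foo"
    yield "bar"
    yield "baz"


# The annotation does not change runtime behavior; the values stay the same.
values = list(get_something_iterator())
print(values)
```

The annotation only documents the intent for readers and type checkers; the function is still a regular generator function.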

What’s an Iterator

Citing the Python docs, an Iterator is

An object representing a stream of data. Repeated calls to the iterator’s __next__() method […] return successive items in the stream. When no more data are available a StopIteration exception is raised instead.

Source

Iterator base class

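From the definition above we can derive a simplified base class (the full iterator protocol additionally requires an __iter__() method that returns the iterator itself):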

class Iterator:
    def __next__(self):
        """Return the next item from the iterator.
        When exhausted, raise StopIteration
        """
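To make the protocol more tangible, here is a minimal sketch of a hand-written iterator that produces the same three values as our generator. The class name SomethingIterator and its attributes are made up for illustration. It also implements __iter__(), which the full iterator protocol expects:

```python
class SomethingIterator:
    """Hypothetical class-based iterator over "foo", "bar" and "baz"."""

    def __init__(self):
        self._values = ["foo", "bar", "baz"]
        self._index = 0

    def __iter__(self):
        # An iterator returns itself, so it can be used directly in a for loop.
        return self

    def __next__(self):
        if self._index >= len(self._values):
            # No more data available: signal exhaustion.
            raise StopIteration
        value = self._values[self._index]
        self._index += 1
        return value


iterator = SomethingIterator()
first = next(iterator)   # consumes "foo"
rest = list(iterator)    # consumes the remaining items
print(first, rest)
```

Compared to the generator function, all the bookkeeping (current position, exhaustion) has to be done by hand here.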

Hey, but if our generator can be described as an Iterator, where are the generator protocol methods?

Let’s take a look at the methods of a generator object again:

print(dir(get_something_iterator()))

It contains the __next__ dunder method of the Iterator. Does that mean … Yes, maybe you already spotted the culprit in the dir(get_something_iterator()) output. We lied initially about the Generator base class. I am very sorry!

Actually the Generator base class looks like:

class Generator(Iterator):
    def send(self, value):
        """Send a value into the generator.
        Return next yielded value or raise StopIteration.
        """

    def throw(self, typ, val=None, tb=None):
        """Raise an exception in the generator.
        Return next yielded value or raise StopIteration.
        """

    def close(self):
        """Raise GeneratorExit inside generator.
        """

A Generator is derived from an Iterator, which provides __next__. That’s why the Python glossary uses the term generator iterator instead of generator, although it might just as well be called an iterator generator. Because this is a bit confusing, and the class name is also just generator, the term generator fits best to me.
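This relationship can be checked at runtime with the abstract base classes from collections.abc. The following is a small sketch; the variable names are chosen only for illustration:

```python
from collections.abc import Generator, Iterator


def get_something_iterator():
    yield "foo"
    yield "bar"
    yield "baz"


generator = get_something_iterator()

# A generator object satisfies both abstract base classes.
is_iterator = isinstance(generator, Iterator)
is_generator = isinstance(generator, Generator)

# The Generator ABC itself is a subclass of the Iterator ABC.
generator_derives_from_iterator = issubclass(Generator, Iterator)

# The reverse does not hold: a plain list iterator has no send/throw/close.
list_iterator_is_generator = isinstance(iter(["foo", "bar"]), Generator)

print(is_iterator, is_generator, generator_derives_from_iterator,
      list_iterator_is_generator)
```

So every generator is an iterator, but not every iterator is a generator.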

The Generator can provide the Iterator protocol with the following __next__ method:

    def __next__(self):
        """Return the next item from the generator.
        When exhausted, raise StopIteration.
        """
        return self.send(None)
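The collections.abc.Generator abstract base class follows this pattern: a subclass only has to implement send() and throw(), while __next__(), __iter__() and close() come as mixin methods. The sketch below relies on that; the class SomethingGenerator is a hypothetical example, not how generator objects are implemented internally:

```python
from collections.abc import Generator


class SomethingGenerator(Generator):
    """Hypothetical hand-written generator over "foo", "bar" and "baz"."""

    def __init__(self):
        self._values = ["foo", "bar", "baz"]
        self._index = 0

    def send(self, value):
        # The sent value is ignored here; we only hand out the next item.
        if self._index >= len(self._values):
            raise StopIteration
        item = self._values[self._index]
        self._index += 1
        return item

    def throw(self, typ, val=None, tb=None):
        # Reuse the default behavior of the ABC: raise the given exception.
        return super().throw(typ, val, tb)


generator = SomethingGenerator()
first = next(generator)        # mixin __next__ delegates to send(None)
second = generator.send(None)  # calling send directly advances the same state
remaining = list(generator)    # mixin __iter__ returns the generator itself
print(first, second, remaining)
```

Note that next() works although the class never defines __next__ itself.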

To verify that __next__ behaves like send(None), let us take a look at our simple generator/iterator function:

def get_something_iterator():
    yield "foo"
    yield "bar"
    yield "baz"

Using it as a Generator:

generator = get_something_iterator()
generator.send(None)
generator.send(None)
generator.send(None)
generator.send(None)

Output:

>>> generator = get_something_iterator()
>>> generator.send(None)
'foo'
>>> generator.send(None)
'bar'
>>> generator.send(None)
'baz'
>>> generator.send(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

This is actually the same as using the Iterator behavior with the next function:

iterator = get_something_iterator()
next(iterator)
next(iterator)
next(iterator)
next(iterator)

Output:

>>> iterator = get_something_iterator()
>>> next(iterator)
'foo'
>>> next(iterator)
'bar'
>>> next(iterator)
'baz'
>>> next(iterator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
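Since both calls advance the same underlying state, they can even be mixed on a single generator object. The following sketch illustrates this and also uses the optional default argument of next(), which returns a fallback value instead of raising StopIteration:

```python
def get_something_iterator():
    yield "foo"
    yield "bar"
    yield "baz"


generator = get_something_iterator()

first = next(generator)           # Iterator style
second = generator.send(None)     # Generator style, same object
third = next(generator)           # Iterator style again

# With a default, next() returns it once the generator is exhausted.
after_end = next(generator, "exhausted")

print(first, second, third, after_end)
```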

For Loop

Here is a simplified version of a function that mimics the behavior of a for ... in loop to print the values of an Iterator:

def for_loop(iterator):
    while True:
        try:
            value = next(iterator)
            print(value)
        except StopIteration:
            return

for_loop(get_something_iterator())

The same behavior written with a real for ... in loop:

for value in get_something_iterator():
    print(value)
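A real for ... in loop additionally calls iter() on its argument first, which is why it also accepts lists and other iterables that are not iterators themselves. A slightly more faithful sketch could look like this; the function name for_each and its body parameter are assumptions made for illustration:

```python
def get_something_iterator():
    yield "foo"
    yield "bar"
    yield "baz"


def for_each(iterable, body):
    """Hypothetical helper: call body(value) for every value of iterable."""
    iterator = iter(iterable)  # turn any iterable into an iterator first
    while True:
        try:
            value = next(iterator)
        except StopIteration:
            return
        # Run the loop body outside the try block, so that a StopIteration
        # raised by the body itself is not silently swallowed.
        body(value)


from_list = []
for_each(["foo", "bar", "baz"], from_list.append)

from_generator = []
for_each(get_something_iterator(), from_generator.append)

print(from_list, from_generator)
```

Calling iter() on an iterator simply returns the iterator itself, so generators work unchanged.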

Call Flow

What’s the interesting thing about Generators/Iterators and the yield statement? Let us look at the following call flows:

def get_something_generator():
    print("Within generator 1")
    yield "foo"
    print("Within generator 2")
    yield "bar"
    print("Within generator 3")
    yield "baz"
    print("Within generator 4")

for i in get_something_generator():
    print("Within for 1")
    print(i)
    print("Within for 2")

Output:

Within generator 1
Within for 1
foo
Within for 2
Within generator 2
Within for 1
bar
Within for 2
Within generator 3
Within for 1
baz
Within for 2
Within generator 4

In contrast, iterating over a list built from the generator runs the generator to completion first:

for i in list(get_something_generator()):
    print("Within for 1")
    print(i)
    print("Within for 2")

Output:

Within generator 1
Within generator 2
Within generator 3
Within generator 4
Within for 1
foo
Within for 2
Within for 1
bar
Within for 2
Within for 1
baz
Within for 2
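The lazy behavior also means that code after a yield only runs if someone asks for the next value. As a small sketch, the variant below records its steps in a list named log (an assumption for illustration, instead of printing) and takes only the first two values with itertools.islice:

```python
from itertools import islice

log = []


def get_something_generator():
    log.append("generator 1")
    yield "foo"
    log.append("generator 2")
    yield "bar"
    log.append("generator 3")
    yield "baz"
    log.append("generator 4")


# Only request the first two values; the generator is never resumed again.
first_two = list(islice(get_something_generator(), 2))

print(first_two)
print(log)
```

The steps "generator 3" and "generator 4" are never recorded, because the generator stays suspended at its second yield.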

A yield statement suspends the current execution sequence and returns control to the calling statement! As you can see, Python Generators/Iterators are therefore coroutines!

Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed.
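Suspending and resuming becomes even more visible once the generator also receives values through send(). The following is a minimal sketch of such a generator; the function running_total is a made-up example that uses a SendType of int:

```python
from typing import Generator


def running_total() -> Generator[int, int, None]:
    """Hypothetical generator: yields the sum of all values sent so far."""
    total = 0
    while True:
        # Execution is suspended here until the caller sends the next value.
        received = yield total
        total += received


generator = running_total()
start = next(generator)          # run to the first yield
after_five = generator.send(5)   # resume, add 5, suspend again
after_ten = generator.send(10)   # resume, add 10, suspend again
generator.close()                # raises GeneratorExit inside the generator

print(start, after_five, after_ten)
```

Between the calls the local variable total survives inside the suspended generator, which is exactly what distinguishes it from a regular subroutine.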