AsyncIO - Introduction
What is AsyncIO
async IO (Asynchronous IO): a language-agnostic paradigm (model) that has implementations across a host of programming languages
async/await: two Python keywords (added in Python 3.5) that are used to define coroutines
asyncio: the Python package that provides a foundation and API for running and managing coroutines
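As a minimal sketch of how these three terms fit together: `async def` defines a coroutine, `await` suspends it until another awaitable has finished, and `asyncio.run` starts an event loop to execute it. The names `greet` and `main` are hypothetical and chosen only for illustration.

```python
import asyncio


async def greet(name: str) -> str:
    # `await` hands control back to the event loop until the sleep finishes.
    await asyncio.sleep(0.01)
    return f"Hello, {name}!"


async def main() -> str:
    # Calling greet() only creates a coroutine object; `await` actually runs it.
    return await greet("asyncio")


# asyncio.run() creates an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(result)
```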
Background
C10K
It’s time for web servers to handle ten thousand clients simultaneously
How can servers handle more than ten thousand requests simultaneously? In the past, two solutions were prominent:
Multiprocessing - requires a lot of memory, and the processes have to communicate via IPC.
Threading - still requires some memory per thread, and synchronization is difficult.
Clearly there must be another solution: “Asynchronous I/O”
The term “Asynchronous I/O” is used to refer to […] an execution model for network programming, where the scalability of an I/O bound application is governed by the number of open socket connections that can be handled in a single OS process rather than by the number of concurrent OS level threads. This approach can significantly improve the scalability of an application, as most POSIX based operating systems can effectively manage thousands or tens of thousands of open socket connections without any significant tuning of process options, but only hundreds of threads (with the default size of the C stack being a key culprit - consuming the resources of an entire thread to wait for an I/O operation can waste a whole lot of memory).
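The idea in the quote can be approximated without real sockets. The sketch below is an assumption-laden stand-in, not a real server: `handle_client` is a hypothetical coroutine, and `asyncio.sleep` simulates waiting on a network connection. It illustrates that ten thousand pending waits can be managed by one event loop in one process, without one OS thread per client.

```python
import asyncio
import time


async def handle_client(client_id: int) -> int:
    # Stand-in for waiting on a socket: the coroutine is suspended here
    # and does not occupy an OS thread of its own while it waits.
    await asyncio.sleep(0.1)
    return client_id


async def serve(n_clients: int) -> list:
    # Start all simulated clients and wait until every one has finished.
    return await asyncio.gather(*(handle_client(i) for i in range(n_clients)))


start = time.perf_counter()
results = asyncio.run(serve(10_000))
elapsed = time.perf_counter() - start

print(f"handled {len(results)} simulated clients in {elapsed:.2f} s")
```

Handled one after another, the same waits would add up to roughly 1000 seconds (10,000 times 0.1 s); because the waits overlap, the total stays far below that.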
Parallelism vs. Concurrency
You can find this section in nearly all articles about asyncio, and it really is necessary for understanding its foundations. This chapter gives only a short overview.
To get a deeper understanding the following links can be recommended:
What’s concurrency? Some tasks are run concurrently. What’s parallelism? Some tasks are run in parallel. But isn’t this the same? Not really. Parallelism is a subset of concurrency. A task can be suspended and another can be started; both tasks are still concurrent because neither has finished yet. In contrast, two parallel tasks run at the same time. In terms of CPUs, concurrent tasks can share a single CPU core, where each one is interrupted, paused, and resumed, while parallel tasks run on different CPU cores.
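Concurrency without parallelism can be made visible with a small sketch. Two hypothetical workers, `A` and `B`, share one thread; each records a step and then yields to the event loop with `asyncio.sleep(0)`. Neither finishes before the other starts, yet only one of them runs at any given moment.

```python
import asyncio


async def worker(name: str, steps: int, log: list) -> None:
    for i in range(steps):
        log.append(f"{name}{i}")
        # Yield control so the event loop can resume the other worker.
        await asyncio.sleep(0)


async def main() -> list:
    log = []
    # Both workers are in progress at once (concurrent),
    # but they take turns on a single thread (not parallel).
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


log = asyncio.run(main())
print(log)  # ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
```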
So what is each best for? Parallel tasks are best when the CPU is busy with calculations, for example when rendering a video. Task switching is a good fit for situations where the CPU has to wait for something: while it waits for some event to happen, a different task can be executed. And when does this happen often? During I/O. I/O devices such as memory, disks, and networks are many times slower than the CPU, often by orders of magnitude. While waiting for data to be written or read, the CPU can execute some other task.
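The benefit of switching tasks during I/O waits can be sketched with simulated I/O. In the example below, `fetch` is a hypothetical coroutine whose `asyncio.sleep` stands in for a slow read; the same three waits are run first one after another and then overlapped. Exact timings depend on the machine, so none are shown.

```python
import asyncio
import time


async def fetch(delay: float) -> float:
    # Simulated I/O: the CPU is idle while this coroutine waits.
    await asyncio.sleep(delay)
    return delay


async def sequential(delays: list) -> list:
    # Each wait must finish before the next one starts.
    return [await fetch(d) for d in delays]


async def concurrent(delays: list) -> list:
    # All waits are started together and overlap.
    return await asyncio.gather(*(fetch(d) for d in delays))


def timed(coro_fn, delays: list) -> float:
    start = time.perf_counter()
    asyncio.run(coro_fn(delays))
    return time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
t_seq = timed(sequential, delays)
t_con = timed(concurrent, delays)
print(f"sequential: {t_seq:.2f} s, concurrent: {t_con:.2f} s")
```

The sequential run takes about the sum of the delays, while the concurrent run takes about as long as the longest single delay.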
Name | Python Standard Library | Concurrency Type | Memory Consumption | CPUs | Best For
---|---|---|---|---|---
Processes | multiprocessing | Multiprocessing | High | many | CPU bound
Threads | threading | preemptive multitasking | Medium | 1 (because of the GIL) | I/O bound
AsyncIO | asyncio | cooperative multitasking | Low | 1 | I/O bound
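Two of the rows above can be compared side by side in a short sketch. It runs the same simulated I/O job with a thread pool and with asyncio and records which OS thread executed each job. The function names are hypothetical, `time.sleep` and `asyncio.sleep` stand in for real I/O, and the multiprocessing row is left out to keep the example small.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_io(_: int) -> int:
    # Simulated blocking I/O; the OS decides when to switch threads (preemptive).
    time.sleep(0.05)
    return threading.get_ident()


async def async_io() -> int:
    # Simulated non-blocking I/O; the task yields voluntarily (cooperative).
    await asyncio.sleep(0.05)
    return threading.get_ident()


async def run_async(n: int) -> list:
    return await asyncio.gather(*(async_io() for _ in range(n)))


# Threads: each job may run on a different worker thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    thread_ids = set(pool.map(blocking_io, range(4)))

# AsyncIO: all tasks run on the single thread that owns the event loop.
task_thread_ids = set(asyncio.run(run_async(4)))

print(f"thread pool used {len(thread_ids)} thread(s)")
print(f"asyncio used {len(task_thread_ids)} thread(s)")
```

The asyncio tasks all report the same thread identifier, which matches the lower memory consumption listed for cooperative multitasking: there is no per-task thread stack.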