Asynchronous HTTP Requests in Python with aiohttp and asyncio
Asynchronous code has increasingly become a mainstay of Python development. With asyncio becoming part of the standard library and many third party packages providing features compatible with it, this paradigm is not going away anytime soon.
Let's walk through how to use the aiohttp library to take advantage of this for making asynchronous HTTP requests, which is one of the most common use cases for non-blocking code.
What is non-blocking code?
You may hear terms like "asynchronous", "non-blocking" or "concurrent" and be a little confused as to what they all mean. According to this much more detailed tutorial, two of the primary properties are:
- Asynchronous routines are able to "pause" while waiting on their ultimate result to let other routines run in the meantime.
- Asynchronous code, through the mechanism above, facilitates concurrent execution. To put it differently, asynchronous code gives the look and feel of concurrency.
So asynchronous code is code that can pause while waiting for a result, in order to let other code run in the meantime. It doesn't "block" other code from running, so we can call it "non-blocking" code.
The asyncio library provides a variety of tools for Python developers to do this, and aiohttp provides an even more specific functionality for HTTP requests. HTTP requests are a classic example of something that is well-suited to asynchronicity because they involve waiting for a response from a server, during which time it would be convenient and efficient to have other code running.
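To make the "pause and let other routines run" idea concrete, here is a minimal sketch that uses only the standard library. The names worker and demo and the delays are made up for illustration, and asyncio.sleep stands in for any operation that waits on a result, such as an HTTP response.

```python
import asyncio


async def worker(name, delay, log):
    log.append(f"{name} start")
    # The coroutine pauses here; the event loop is free to run other coroutines.
    await asyncio.sleep(delay)
    log.append(f"{name} done")


async def demo():
    log = []
    # Run both workers concurrently and wait for both to finish.
    await asyncio.gather(worker("slow", 0.2, log), worker("fast", 0.1, log))
    return log


events = asyncio.run(demo())
print(events)
# ['slow start', 'fast start', 'fast done', 'slow done']
```

The "fast" worker starts and finishes while the "slow" worker is still paused, which is the non-blocking behavior described above.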
Setting up
Make sure to have your Python environment set up before we get started. Follow this guide up through the virtualenv section if you need some help. Getting everything working correctly, especially with respect to virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine. You will need Python 3.7 or higher in order to run the code in this post.
Now that your environment is set up, you're going to need to install some third party libraries. We're going to use aiohttp for making asynchronous requests, and the requests library for making regular synchronous HTTP requests in order to compare the two later on. Install both of these with the following command after activating your virtual environment:
pip install aiohttp==3.7.4.post0 requests==2.25.1
With this you should be ready to move on and write some code.
Making an HTTP Request with aiohttp
Let's start off by making a single GET request using aiohttp, to demonstrate how the keywords async and await work. We're going to use the Pokemon API as an example, so let's start by trying to get the data associated with the legendary 151st Pokemon, Mew.
Run the following Python code, and you should see the name "mew" printed to the terminal:
import aiohttp
import asyncio


async def main():
    # One session can be reused for every request made inside this block.
    async with aiohttp.ClientSession() as session:
        pokemon_url = 'https://pokeapi.co/api/v2/pokemon/151'
        async with session.get(pokemon_url) as resp:
            # Wait for the body to arrive and parse it as JSON.
            pokemon = await resp.json()
            print(pokemon['name'])

asyncio.run(main())
In this code, we're creating a coroutine called main, which we are running with the asyncio event loop. In here we are opening an aiohttp client session, a single object that can be used for quite a number of individual requests and by default can hold up to 100 simultaneous connections. With this session, we are making a request to the Pokemon API and then awaiting a response.
The async keyword tells the Python interpreter that the function we're defining is a coroutine, meant to be run asynchronously by an event loop. The await keyword passes control back to the event loop, suspending the execution of the surrounding coroutine and letting the event loop run other things until the result that is being "awaited" is returned.
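As a small illustration of what async changes, the sketch below shows that calling a coroutine function does not run its body; it only creates a coroutine object, which does nothing until an event loop drives it. The function name fetch_name is hypothetical, and asyncio.sleep(0) stands in for a real network wait.

```python
import asyncio
import inspect


async def fetch_name():
    # Stand-in for a network call: yield control to the event loop once.
    await asyncio.sleep(0)
    return "mew"


coro = fetch_name()                  # the body has not started running yet
is_coro = inspect.iscoroutine(coro)  # we only hold a coroutine object
name = asyncio.run(coro)             # the event loop runs it to completion
print(is_coro, name)
# True mew
```

This is why the examples in this post end with asyncio.run(main()) rather than a plain main() call.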
Making a large number of requests
Making a single asynchronous HTTP request is great because we can let the event loop work on other tasks instead of blocking the entire thread while waiting for a response. But this functionality truly shines when trying to make a larger number of requests. Let's demonstrate this by performing the same request as before, but for the first 150 Pokemon.
Let's take the previous request code and put it in a loop, updating which Pokemon's data is being requested and using await for each request:
import aiohttp
import asyncio
import time

start_time = time.time()


async def main():
    async with aiohttp.ClientSession() as session:
        # Request Pokemon 1 through 150, one after another.
        for number in range(1, 151):
            pokemon_url = f'https://pokeapi.co/api/v2/pokemon/{number}'
            async with session.get(pokemon_url) as resp:
                # Each await finishes before the next request is sent.
                pokemon = await resp.json()
                print(pokemon['name'])

asyncio.run(main())
print("--- %s seconds ---" % (time.time() - start_time))
This time, we're also measuring how much time the whole process takes. If you run this code, you should see the 150 Pokemon names printed to your terminal, followed by the elapsed time, which came to roughly 8 seconds in this case.
8 seconds seems pretty good for 150 requests, but we don't really have anything to compare it to. Let's try accomplishing the same thing synchronously using the requests library.
Comparing speed with synchronous requests
Requests was designed to be an HTTP library "for humans", so it has a very beautiful and simple API. I highly recommend it for any projects in which speed might not be of primary importance compared to developer-friendliness and easy-to-follow code.
To print the first 150 Pokemon as before, but using the requests library, run the following code:
import requests
import time

start_time = time.time()

for number in range(1, 151):
    url = f'https://pokeapi.co/api/v2/pokemon/{number}'
    # Blocks until the full response has been received.
    resp = requests.get(url)
    pokemon = resp.json()
    print(pokemon['name'])

print("--- %s seconds ---" % (time.time() - start_time))
You should see the same output with a different runtime, nearly 29 seconds in this case.
This is significantly slower than the previous code. For each consecutive request, we have to wait for the previous step to finish before even beginning the process. It takes much longer because this code is waiting for 150 requests to finish sequentially.
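The cost of waiting sequentially can be seen without touching the network. In this rough sketch, fake_blocking_request is a hypothetical stand-in that uses time.sleep to imitate a blocking call, and the latencies are made-up numbers; the point is only that the total time is about the sum of the individual waits.

```python
import time


def fake_blocking_request(latency):
    # Stand-in for a blocking HTTP call: nothing else can run while we sleep.
    time.sleep(latency)
    return latency


latencies = [0.05, 0.1, 0.15]  # made-up response times in seconds

start = time.perf_counter()
for latency in latencies:
    fake_blocking_request(latency)
elapsed = time.perf_counter() - start

print(f"sum of latencies: {sum(latencies):.2f}s, elapsed: {elapsed:.2f}s")
```

With 150 real requests, the same addition of per-request waits is what pushes the runtime up.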
Utilizing asyncio for improved performance
So 8 seconds compared to 29 seconds is a huge jump in performance, but we can do even better using the tools that asyncio provides. In the original example, we are using await after each individual HTTP request, which isn't quite ideal because each request still has to finish before the next one starts. It's still faster than the requests example, most likely in large part because the aiohttp session reuses connections between requests while each requests.get call opens a new one, but we can instead run all of these requests "concurrently" as asyncio tasks and then check the results at the end, using asyncio.ensure_future and asyncio.gather.
If the code that actually makes the request is broken out into its own coroutine function, we can create a list of tasks, consisting of futures for each request. We can then unpack this list into a gather call, which runs them all together. When we await this call to asyncio.gather, we will get back a list of the results of all the futures that were passed in, in the same order as they were passed in. This way we're only awaiting one time.
To see what happens when we implement this, run the following code:
import aiohttp
import asyncio
import time

start_time = time.time()


async def get_pokemon(session, url):
    # Fetch a single Pokemon and return only its name.
    async with session.get(url) as resp:
        pokemon = await resp.json()
        return pokemon['name']


async def main():
    async with aiohttp.ClientSession() as session:
        tasks = []
        for number in range(1, 151):
            url = f'https://pokeapi.co/api/v2/pokemon/{number}'
            # Schedule the request without waiting for it to finish.
            tasks.append(asyncio.ensure_future(get_pokemon(session, url)))

        # Wait once for all requests; results come back in task order.
        original_pokemon = await asyncio.gather(*tasks)
        for pokemon in original_pokemon:
            print(pokemon)

asyncio.run(main())
print("--- %s seconds ---" % (time.time() - start_time))
This brings our time down to a mere 1.53 seconds for 150 HTTP requests! That is a vast improvement over even our initial async/await example. Because all of the requests are in flight at the same time, the total time to run all 150 of them is, in the ideal case, roughly equal to the amount of time that the longest request took to run. The exact numbers will vary depending on your internet connection.
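Both claims here, that results keep their order and that the total time tracks the slowest request, can be checked with a small simulation. This is only a sketch: fake_request is a hypothetical stand-in that uses asyncio.sleep instead of a real HTTP call, and the latencies are invented.

```python
import asyncio
import time


async def fake_request(number, latency):
    # Stand-in for an HTTP request that takes `latency` seconds.
    await asyncio.sleep(latency)
    return number


async def demo():
    latencies = [0.3, 0.1, 0.2]  # request 1 is the slowest
    start = time.perf_counter()
    tasks = [
        asyncio.ensure_future(fake_request(number, latency))
        for number, latency in enumerate(latencies, start=1)
    ]
    results = await asyncio.gather(*tasks)
    elapsed = time.perf_counter() - start
    return results, elapsed, latencies


results, elapsed, latencies = asyncio.run(demo())
print(results)  # [1, 2, 3], even though request 2 finished first
print(f"slowest: {max(latencies):.2f}s, elapsed: {elapsed:.2f}s")
```

On Python 3.7 and later, asyncio.create_task is generally the recommended way to schedule a coroutine; asyncio.ensure_future is kept here only to match the code in this post.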
Concluding Thoughts
As you can see, using libraries like aiohttp to rethink the way you make HTTP requests can add a huge performance boost to your code and save a lot of time when making a large number of requests. By default, it is a bit more verbose than synchronous libraries like requests, but that is by design as the developers wanted to make performance a priority.
In this tutorial, we have only scratched the surface of what you can do with aiohttp and asyncio, but I hope that this has made starting your journey into the world of asynchronous Python a little easier.