Load testing using Locust
Many applications and libraries can be used to load test a service. Most of them report the service's response time and latency, along with its failure rate. Locust is one of the best tools for this purpose. We will use it to load test three methods for serving a Transformer-based model: using FastAPI only, using Dockerized FastAPI, and TFX-based serving using FastAPI. Let’s get started:
- First, we must install locust:
$ pip install locust
This command will install Locust. The next step is to make all the services serve the same task with the same model. Fixing these two parameters, the task and the model, ensures that the services are functionally identical, so any difference we measure comes from the deployment method rather than from what is being served.
- Once everything is ready, you can start load testing...