“What response times can we expect from your GraphQL API?”
This is probably one of the most frequently asked questions we get at Marketplacer in relation to our GraphQL API. It’s a very reasonable question, and we understand why our customers ask it, but it’s not always that easy to answer…
In this article, we share one of the approaches we’ve taken to benchmarking GraphQL API response times at Marketplacer.
But first, some background…
The “problem” with GraphQL…
The entire value proposition of the GraphQL design pattern (for me anyway) is that:
- It avoids over-fetching
- It avoids under-fetching
Or, to put it more positively, you get exactly the payload response you want. The ability to write queries and shape your payload to your exact needs is what makes it such an attractive proposition. To illustrate what I mean, let’s take a look at one of the queries we offer at Marketplacer: advertSearch.
An “advert” in Marketplacer is really just a product, but for various reasons I won’t go into here, we call them “adverts” in the Marketplacer GraphQL API.
advertSearch allows consumers of our GraphQL API to pull products hosted in Marketplacer into whichever upstream system needs them. Two typical use-cases would be:
- To render a list of products on a search results page
- To display a single Product Detail Page (or “PDP” in the land of eCom)
These 2 use-cases have different characteristics:
- The 1st requires multiple objects, but each with a “small” amount of data
- The 2nd requires just 1 object, but with a “large” amount of data
In both these cases, you can use the advertSearch query to get this data, but is it safe to assume both queries would have similar response times? Er, no it would not…
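To make the contrast concrete, here is roughly what those two query shapes look like. These are illustrative sketches: the field names below are hypothetical and not necessarily the exact Marketplacer schema.

```graphql
# Search-results style: many adverts, but a small selection per advert
query AdvertList {
  advertSearch(first: 10) {
    adverts {
      nodes {
        id
        title
      }
    }
  }
}

# PDP style: a single advert, but with a much richer selection
query AdvertDetail {
  advertSearch(first: 1) {
    adverts {
      nodes {
        id
        title
        description
        images { nodes { url } }
        variants {
          nodes {
            sku
            price
          }
        }
      }
    }
  }
}
```

Both requests hit the same advertSearch query, but the server does very different amounts of work to resolve them.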
This conundrum really brings us back to where we started, and the question: “What response times can we expect from your GraphQL API?” There isn’t really 1 answer…
Query Profiles
To attempt to answer the question around response times, we designed a number of “query profiles” that cover a range of common use cases.
We then run each of these query profiles at a regular cadence using K6 (a very popular load-testing framework) and capture the resulting data for subsequent reporting. This not only gives us fairly decent answers to the aforementioned question, but it also:
- Allows us to track response times over time
- Makes it easy to introduce new profiles should our customers request them
The current characteristics of our profiles include the following:
- Query Depth: How many connections are we traversing in the query? This ranges from 1 upwards. The “deeper” the query, the more costly it is, which can impact response times
- Page Size: Requesting a maximum of 10 results per page, as opposed to 1000, has implications for response times
- Filters: How many filters are being applied with search-type queries?
So for example, here are 2 query profiles for advertSearch (we have many more, but these are 2 extreme examples):
- 1-Level Depth / 10 Results Per Page / No Filtering
- 2-Level Depth / 1000 Results Per Page / 1x Filter
As I’m sure you can imagine, the response time of the 1st query is lower than the 2nd one…
Measurements Taken
As mentioned, we use K6 as the primary vehicle to run tests, with each test being run with the following parameters:
- Virtual Users (VUs): 1
- Duration: 10s
These can, of course, be changed, but for benchmarking purposes, we feel this is OK. We then gather the following metrics from the result of each test:
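As a sketch, a minimal k6 test for one profile might look like the script below. The endpoint, headers, and query body are placeholders, not our real setup; the `options` object is where the VU count and duration above are configured.

```javascript
// k6 script sketch -- run with `k6 run profile.js`, not with Node.
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  vus: 1,          // Virtual Users: 1
  duration: '10s', // run the profile for 10 seconds
};

// Illustrative profile: 1-level depth, 10 results per page, no filtering.
const query = `
  query AdvertList {
    advertSearch(first: 10) {
      adverts { nodes { id title } }
    }
  }
`;

export default function () {
  const res = http.post(
    'https://example.marketplacer.com/graphql', // placeholder endpoint
    JSON.stringify({ query }),
    { headers: { 'Content-Type': 'application/json' } },
  );
  check(res, { 'status is 200': (r) => r.status === 200 });
}
```

k6’s end-of-run summary then reports `http_req_duration` statistics, which is where the metrics listed below come from.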
- Median Response Time
- Average Response Time (averages can be skewed badly by outliers, but we capture them anyway)
- p(90): 90% of requests should be faster than the given latency
- p(95): 95% of requests should be faster than the given latency
K6 has a much wider range of metrics available, but again we just wanted to try to answer a simple question in the simplest way possible…
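For clarity, the percentile metrics themselves are simple to compute. This standalone sketch uses the nearest-rank method on some hypothetical response times (k6 may interpolate differently, so treat the exact values as illustrative):

```javascript
// Nearest-rank percentile: the value below which roughly p% of samples fall.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Hypothetical response times in milliseconds for one profile run.
const times = [120, 130, 135, 140, 150, 160, 180, 200, 400, 900];

console.log(percentile(times, 50)); // median -> 150
console.log(percentile(times, 90)); // p(90)  -> 400
console.log(percentile(times, 95)); // p(95)  -> 900
```

Note how the single 900 ms outlier drags the average up while leaving the median untouched, which is exactly why we lean on medians and percentiles rather than averages.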
Final Thoughts
As mentioned in the introduction, we employ some complementary approaches to measuring GraphQL response times; however, this technique gives us a well-defined set of tests that make point-in-time comparisons fairly easy.