How Federation handles the N+1 query problem
GraphQL developers quickly encounter the infamous "N+1 query problem" with operations that return a list:
query TopReviews {
  topReviews(first: 10) {
    id
    rating
    product {
      name
      imageUrl
    }
  }
}
In a monolithic GraphQL server, the execution engine takes these steps:
- Resolve the Query.topReviews field, which returns a list of Reviews.
- For each Review in the list, resolve the Review.product field.
If the Query.topReviews field returns 10 reviews, then the executor resolves the Review.product field 10 times. If the Review.product field makes a database or REST query for a single Product, then we'll see 10 separate calls to the data source. This is suboptimal for the following reasons:
- It's more efficient to fetch the 10 products in a single query (e.g. SELECT * FROM products WHERE id IN (<product ids>)).
- If any reviews refer to the same product, then we're wasting resources fetching something we already have.
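To make the two costs concrete, here is a minimal sketch that counts data source calls for the naive approach and for a single deduplicated batch. The data and the helpers fetchProductByID and fetchProductsByIDs are hypothetical in-memory stand-ins, not a real database client.

```javascript
// Hypothetical data: 10 reviews that reference only 3 distinct products.
const reviews = Array.from({ length: 10 }, (_, i) => ({
  id: `r${i + 1}`,
  productId: `p${(i % 3) + 1}`,
}));

let dataSourceCalls = 0;

// Stand-in for a single-row lookup (one database or REST call per product).
function fetchProductByID(id) {
  dataSourceCalls += 1;
  return { id, name: `Product ${id}` };
}

// Stand-in for a batched lookup such as
// SELECT * FROM products WHERE id IN (<product ids>).
function fetchProductsByIDs(ids) {
  dataSourceCalls += 1;
  return ids.map((id) => ({ id, name: `Product ${id}` }));
}

// Naive approach: one lookup per review, duplicates included.
dataSourceCalls = 0;
reviews.forEach((review) => fetchProductByID(review.productId));
const naiveCalls = dataSourceCalls;

// Batched approach: deduplicate the IDs, then fetch them in one call.
dataSourceCalls = 0;
const uniqueIds = [...new Set(reviews.map((review) => review.productId))];
const products = fetchProductsByIDs(uniqueIds);
const batchedCalls = dataSourceCalls;

console.log({ naiveCalls, batchedCalls, distinctProducts: products.length });
```

The naive loop issues one call per review, while the batched version issues a single call for the distinct product IDs only.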
The solution for monolithic GraphQL APIs is the dataloader pattern, which batches and deduplicates the lookups made while resolving a single operation. Most GraphQL server ecosystems have an implementation of this pattern. The Apollo Server documentation explains how to use the JavaScript implementation in Node.js servers.
The N+1 problem in a federated graph
Consider the same TopReviews operation, but we've implemented the Review and Product types in separate subgraphs.
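As a rough sketch, the two subgraph schemas might look like the following, written as JavaScript template strings. The field types, the nullability, and the choice of id as the entity key are assumptions inferred from the TopReviews operation, and the sketch assumes Federation 2 syntax.

```javascript
// Assumed Reviews subgraph schema: owns Review and references Product by key.
const reviewsSchema = `
  extend schema
    @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])

  type Query {
    topReviews(first: Int): [Review!]!
  }

  type Review {
    id: ID!
    rating: Int
    product: Product
  }

  # Stub of the Product entity: this subgraph only knows the key field.
  type Product @key(fields: "id", resolvable: false) {
    id: ID!
  }
`;

// Assumed Products subgraph schema: owns the remaining Product fields.
const productsSchema = `
  extend schema
    @link(url: "https://specs.apollo.dev/federation/v2.0", import: ["@key"])

  type Product @key(fields: "id") {
    id: ID!
    name: String
    imageUrl: String
  }
`;
```

Because Product is an entity keyed by id, the Reviews subgraph only needs to return each product's id, and the Products subgraph can supply name and imageUrl.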
Fortunately, query planning handles N+1 queries for entities like the Product type by default! The query plan for this operation works like this:
- First, we Fetch the list of Reviews from the Reviews subgraph using the root field Query.topReviews. We also ask for the id of each associated product.
- Next, we extract the Product entity references and Fetch them in a batch to the Products subgraph's Query._entities root field.
- After we get back the Product entities, we merge them into the list of Reviews, indicated by the Flatten step.
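The three steps above can be sketched as a small in-memory simulation. This is not how the router is implemented. The functions fetchTopReviews, fetchEntities, and executeTopReviewsPlan, along with the sample data, are hypothetical stand-ins that only illustrate the shape of the plan.

```javascript
// Hypothetical in-memory data for the two subgraphs.
const reviewRows = [
  { id: "r1", rating: 5, productId: "p1" },
  { id: "r2", rating: 4, productId: "p2" },
  { id: "r3", rating: 3, productId: "p1" },
];
const productRows = {
  p1: { name: "Kettle", imageUrl: "https://example.com/kettle.png" },
  p2: { name: "Toaster", imageUrl: "https://example.com/toaster.png" },
};

let productsSubgraphCalls = 0;

// Stand-in for the Reviews subgraph's Query.topReviews root field.
// Each review carries an entity reference: the product's typename and key.
function fetchTopReviews(first) {
  return reviewRows.slice(0, first).map((row) => ({
    id: row.id,
    rating: row.rating,
    product: { __typename: "Product", id: row.productId },
  }));
}

// Stand-in for the Products subgraph's Query._entities root field.
// Entities come back in the same order as the references.
function fetchEntities(representations) {
  productsSubgraphCalls += 1;
  return representations.map((rep) => ({ ...productRows[rep.id] }));
}

function executeTopReviewsPlan(first) {
  // Step 1: Fetch the reviews and each product's key from Reviews.
  const reviews = fetchTopReviews(first);
  // Step 2: Fetch all Product references from Products in one batch.
  const representations = reviews.map((review) => review.product);
  const products = fetchEntities(representations);
  // Step 3: Flatten, merging each entity back into its review.
  return reviews.map((review, i) => ({
    id: review.id,
    rating: review.rating,
    product: { name: products[i].name, imageUrl: products[i].imageUrl },
  }));
}

const result = executeTopReviewsPlan(10);
console.log(productsSubgraphCalls, result.map((r) => r.product.name));
```

However many reviews come back, the Products subgraph receives one request. The merge in step 3 relies on the entities arriving in the same order as the references.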
Writing efficient entity resolvers
In most subgraph implementations (including @apollo/subgraph), we don't write the Query._entities resolver directly. Instead, we use the reference resolver API for resolving an individual entity reference:
const resolvers = {
  Product: {
    __resolveReference(productRepresentation) {
      return fetchProductByID(productRepresentation.id);
    },
  },
};
The motivation for this API relates to a subtle, critical aspect of the subgraph specification: the order of resolved entities must match the order of the given entity references. If we return entities in the wrong order, those fields are merged with the wrong entities and we'll have incorrect results. To avoid this issue, most subgraph libraries handle entity order for you.
However, resolving references one at a time does reintroduce the N+1 query problem: in the example above, we'll call fetchProductByID once for each entity reference.
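To see why, here is a simplified sketch of how a subgraph library might build Query._entities on top of reference resolvers. The resolveEntities function is a hypothetical stand-in, not the actual @apollo/subgraph implementation, and fetchProductByID is a synchronous in-memory fake. A real library would also handle promises and errors.

```javascript
// Hypothetical data source that counts how often it is queried.
let dataSourceCalls = 0;
const productsById = {
  p1: { id: "p1", name: "Kettle" },
  p2: { id: "p2", name: "Toaster" },
};

function fetchProductByID(id) {
  dataSourceCalls += 1;
  return productsById[id];
}

const resolvers = {
  Product: {
    __resolveReference(productRepresentation) {
      return fetchProductByID(productRepresentation.id);
    },
  },
};

// Simplified stand-in for a library-provided Query._entities resolver.
// Mapping over the references keeps the output in the input order, so the
// ordering requirement is met without any work in the reference resolver.
function resolveEntities(representations, context) {
  return representations.map((representation) =>
    resolvers[representation.__typename].__resolveReference(
      representation,
      context
    )
  );
}

const entities = resolveEntities(
  [
    { __typename: "Product", id: "p2" },
    { __typename: "Product", id: "p1" },
    { __typename: "Product", id: "p2" },
  ],
  {}
);

console.log(entities.map((entity) => entity.name), dataSourceCalls);
```

The order is preserved, but the data source is queried once per reference, including twice for the repeated p2.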
Fortunately, the solution is exactly the same as in a monolithic graph: dataloaders. In nearly every situation, reference resolvers should use a dataloader.
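const resolvers = {
  Product: {
    __resolveReference(product, context) {
      // context.dataloaders.products is assumed to be a per-request
      // batching loader created when the context is built.
      return context.dataloaders.products(product.id);
    },
  },
};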
Now, when the query planner calls the Products subgraph with a batch of Product entities, we'll make a single batched request to the Products data source.
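The following is a minimal sketch of what such a loader could look like. createLoader is a hand-rolled, simplified stand-in for a dataloader library, not the API of the dataloader npm package. fetchProductsByIDs and createContext are hypothetical helpers. The sketch assumes the batch function returns values in the same order as the keys it receives.

```javascript
// Minimal batching loader: collects the keys requested during the current
// synchronous run, then resolves them all with one call to batchFn.
function createLoader(batchFn) {
  let queue = [];
  return function load(key) {
    return new Promise((resolve, reject) => {
      queue.push({ key, resolve, reject });
      if (queue.length > 1) return; // a dispatch is already scheduled
      queueMicrotask(() => {
        const batch = queue;
        queue = [];
        // Deduplicate so repeated references cost nothing extra.
        const uniqueKeys = [...new Set(batch.map((item) => item.key))];
        Promise.resolve()
          .then(() => batchFn(uniqueKeys))
          .then((values) => {
            const byKey = new Map(uniqueKeys.map((k, i) => [k, values[i]]));
            batch.forEach((item) => item.resolve(byKey.get(item.key)));
          })
          .catch((error) => batch.forEach((item) => item.reject(error)));
      });
    });
  };
}

// Hypothetical batched data source call, e.g. one SELECT ... WHERE id IN (...).
let batchedQueries = 0;
const productsById = {
  p1: { id: "p1", name: "Kettle" },
  p2: { id: "p2", name: "Toaster" },
};
async function fetchProductsByIDs(ids) {
  batchedQueries += 1;
  return ids.map((id) => productsById[id]);
}

// Build fresh loaders for every request so results are not shared across users.
function createContext() {
  return { dataloaders: { products: createLoader(fetchProductsByIDs) } };
}

const resolvers = {
  Product: {
    __resolveReference(product, context) {
      return context.dataloaders.products(product.id);
    },
  },
};

// Simulate a Query._entities call with three references, one repeated.
const context = createContext();
const representations = [{ id: "p2" }, { id: "p1" }, { id: "p2" }];
const entitiesPromise = Promise.all(
  representations.map((rep) => resolvers.Product.__resolveReference(rep, context))
);
entitiesPromise.then((entities) => {
  console.log(entities.map((entity) => entity.name), batchedQueries);
});
```

Each reference resolver still handles one entity, so the library keeps control of ordering, while the loader turns those individual calls into one deduplicated request. A production loader also needs caching and error handling per key, which is a good reason to use an established library rather than this sketch.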