A popular library used in GraphQL implementations is DataLoader, and the name is fairly descriptive of its purpose. As described in the repository for the JavaScript (Node.js) implementation:
“DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.”
DataLoader solves the N+1 problem, in which a resolver makes a separate request to a database (or another data source, e.g. another API) for every item in a list, resulting in inefficient and slow data retrieval.
A DataLoader serves as a batching and caching layer that combines multiple requests into a single request. It groups the individual requests together, deduplicates identical ones, and executes them as one batch, minimizing the number of database or API round trips.
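To make the difference concrete, here is a minimal sketch that counts round trips with and without batching. It does not use DataLoader itself; createDb, findAllUsers, findPostsByUserId and findPostsByUserIds are hypothetical names standing in for real database calls.

```javascript
// Hypothetical in-memory data source that counts the queries it receives.
function createDb() {
  const users = [
    { id: 1, name: "John" },
    { id: 2, name: "Jane" },
    { id: 3, name: "Ann" },
  ];
  const posts = [
    { id: 1, userId: 1, title: "Post 1" },
    { id: 2, userId: 2, title: "Post 2" },
    { id: 3, userId: 1, title: "Post 3" },
  ];
  const db = {
    queryCount: 0,
    // One query returning every user (the "1" in N+1).
    async findAllUsers() {
      db.queryCount += 1;
      return users;
    },
    // One query per user (the "N" in N+1).
    async findPostsByUserId(userId) {
      db.queryCount += 1;
      return posts.filter((post) => post.userId === userId);
    },
    // One query covering any number of users.
    async findPostsByUserIds(userIds) {
      db.queryCount += 1;
      return posts.filter((post) => userIds.includes(post.userId));
    },
  };
  return db;
}

// Naive approach: one query for the users, then one more per user.
async function naive(db) {
  const users = await db.findAllUsers();
  for (const user of users) {
    await db.findPostsByUserId(user.id);
  }
  return db.queryCount;
}

// Batched approach: one query for the users, one for all of their posts.
async function batched(db) {
  const users = await db.findAllUsers();
  await db.findPostsByUserIds(users.map((user) => user.id));
  return db.queryCount;
}

async function main() {
  console.log("naive queries:", await naive(createDb())); // 4 with three users
  console.log("batched queries:", await batched(createDb())); // 2
}
main();
```

With three users the naive version issues four queries and the batched version two; the gap grows linearly with the number of users.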
DataLoader Operation:
- Create a new instance of DataLoader, specifying a batch loading function. This function defines how to load the data for a given set of keys.
- The resolver iterates through the collection and, instead of fetching the related data directly, adds the keys for the data to be fetched to the DataLoader instance.
- The DataLoader collects the keys, deduplicates them, and executes the batch loading function once for the whole set.
- Once the batch is executed, DataLoader returns the results, associating them with their respective keys.
- The resolver can then access the response data and resolve the field or relationships as needed.
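The steps above can be sketched with a small hand-rolled loader. This is only an illustration of the idea under simplified assumptions, not the dataloader package's implementation; TinyLoader and its dispatch method are hypothetical names.

```javascript
// Minimal illustration of batching; NOT the dataloader package's implementation.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // async (keys) => values, in the same order as keys
    this.queue = []; // pending { key, resolve, reject } entries
  }

  // Queue a key and return a promise for its value.
  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule a single dispatch for everything queued in this tick.
      if (this.queue.length === 1) {
        process.nextTick(() => this.dispatch());
      }
    });
  }

  async dispatch() {
    const pending = this.queue;
    this.queue = [];
    // Deduplicate so each key is sent to the batch function once.
    const keys = [...new Set(pending.map((entry) => entry.key))];
    try {
      const values = await this.batchFn(keys);
      const byKey = new Map(keys.map((key, index) => [key, values[index]]));
      // Hand every caller the value that belongs to its key.
      for (const { key, resolve } of pending) resolve(byKey.get(key));
    } catch (error) {
      for (const { reject } of pending) reject(error);
    }
  }
}

// Usage: three loads in the same tick become one batch with two unique keys.
const batches = [];
const loader = new TinyLoader(async (keys) => {
  batches.push(keys);
  return keys.map((key) => `value-${key}`);
});

Promise.all([loader.load(1), loader.load(2), loader.load(1)]).then((values) => {
  console.log(values); // ["value-1", "value-2", "value-1"]
  console.log(batches); // [[1, 2]]
});
```

The sketch leaves out caching, per-key errors and batch size limits, all of which a production loader has to handle.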
DataLoader also caches the results of previous requests, so if the same key is requested again, DataLoader returns the result from its cache instead of making another request. This caching further improves performance and reduces redundant fetching.
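The caching behavior can be pictured as a map from key to promise. The following is a simplified sketch of that idea, not the library's code; createCachedLoad and fetchOne are hypothetical names.

```javascript
// Minimal per-key promise cache; NOT the dataloader package's implementation.
function createCachedLoad(fetchOne) {
  const cache = new Map(); // key -> promise of the value

  // Return the cached promise if present, otherwise fetch and remember it.
  function load(key) {
    if (!cache.has(key)) {
      cache.set(key, fetchOne(key));
    }
    return cache.get(key);
  }

  // Forget a key, for example after a mutation, so the next load refetches.
  function clear(key) {
    cache.delete(key);
  }

  return { load, clear };
}

// Usage: count how often the underlying fetch actually runs.
let fetchCount = 0;
const userCache = createCachedLoad(async (id) => {
  fetchCount += 1;
  return { id, name: `User ${id}` };
});

async function main() {
  await userCache.load(1);
  await userCache.load(1); // served from the cache
  console.log(fetchCount); // 1
  userCache.clear(1);
  await userCache.load(1); // fetched again after clearing
  console.log(fetchCount); // 2
}
main();
```

Caching the promise rather than the resolved value means that two loads of the same key issued before the first fetch finishes still share one request.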
DataLoader Implementation Examples
JavaScript & Node.js
The following is a basic implementation of DataLoader for GraphQL using Apollo Server.
const { ApolloServer, gql } = require("apollo-server");
// The dataloader package exports the class itself, so it is not destructured
const DataLoader = require("dataloader");

// Simulated data source
const db = {
  users: [
    { id: 1, name: "John" },
    { id: 2, name: "Jane" },
  ],
  posts: [
    { id: 1, userId: 1, title: "Post 1" },
    { id: 2, userId: 2, title: "Post 2" },
    { id: 3, userId: 1, title: "Post 3" },
  ],
};

// Simulated asynchronous batch loading function
const batchPostsByUserIds = async (userIds) => {
  console.log("Fetching posts for user ids:", userIds);
  const posts = db.posts.filter((post) => userIds.includes(post.userId));
  // Return one array of posts per requested user id, in the same order
  return userIds.map((userId) => posts.filter((post) => post.userId === userId));
};

// Create a DataLoader instance
const postsLoader = new DataLoader(batchPostsByUserIds);

const resolvers = {
  Query: {
    getUserById: (_, { id }) => {
      // ID arguments arrive as strings, so convert before comparing
      const userId = Number(id);
      return db.users.find((user) => user.id === userId);
    },
  },
  User: {
    posts: (user) => {
      // Use DataLoader to load posts for the user
      return postsLoader.load(user.id);
    },
  },
};

// Define the GraphQL schema
const typeDefs = gql`
  type User {
    id: ID!
    name: String!
    posts: [Post]
  }
  type Post {
    id: ID!
    title: String!
  }
  type Query {
    getUserById(id: ID!): User
  }
`;

// Create Apollo Server instance
const server = new ApolloServer({ typeDefs, resolvers });

// Start the server
server.listen().then(({ url }) => {
  console.log(`Server running at ${url}`);
});
In this example, I create a DataLoader instance, postsLoader, using the DataLoader class from the dataloader package. I define a batch loading function, batchPostsByUserIds, that takes an array of user IDs and retrieves the corresponding posts for each user from the db.posts array. The function returns an array of arrays, where each sub-array contains the posts for a specific user, in the same order as the user IDs it received.
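The shape of that return value matters: DataLoader expects one entry per requested key, in the same order as the keys, even when a key has no matching rows. A minimal sketch of a batch function that keeps to this contract while grouping the posts in a single pass might look like the following. It reuses the sample posts from above and does not depend on the dataloader package.

```javascript
// Same sample data as the example above.
const posts = [
  { id: 1, userId: 1, title: "Post 1" },
  { id: 2, userId: 2, title: "Post 2" },
  { id: 3, userId: 1, title: "Post 3" },
];

// Group matching posts by user id once, then answer in key order.
async function batchPostsByUserIds(userIds) {
  const wanted = new Set(userIds);
  const postsByUser = new Map();
  for (const post of posts) {
    if (!wanted.has(post.userId)) continue;
    if (!postsByUser.has(post.userId)) postsByUser.set(post.userId, []);
    postsByUser.get(post.userId).push(post);
  }
  // One entry per requested id; users without posts get an empty array.
  return userIds.map((userId) => postsByUser.get(userId) || []);
}

batchPostsByUserIds([2, 99, 1]).then((result) => {
  // Three entries: posts for user 2, none for user 99, posts for user 1
  console.log(result.map((list) => list.map((post) => post.title)));
});
```

Grouping with a Map avoids re-filtering the whole result set for every key, which starts to matter once batches contain many keys.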
In the User resolver, I use the load method of DataLoader to load the posts for a user. The load method handles batching and caching behind the scenes, ensuring that redundant requests are minimized and results are cached for subsequent requests.
When the GraphQL server resolves the posts field for several User objects in the same request, DataLoader automatically batches those loads and executes the batch loading function once to retrieve the posts.
This example demonstrates a very basic implementation of DataLoader in a GraphQL server. In a real-world scenario there are additional details to work out for your particular situation; for example, loaders are usually created per request so that cached results are not shared between users.
Spring Boot Java Implementation
To round out the examples, the following uses Spring Boot.
First, add the dependencies.
<dependencies>
    <!-- GraphQL for Spring Boot -->
    <dependency>
        <groupId>com.graphql-java</groupId>
        <artifactId>graphql-spring-boot-starter</artifactId>
        <version>5.0.2</version>
    </dependency>
    <!-- DataLoader (published on Maven Central as java-dataloader) -->
    <dependency>
        <groupId>com.graphql-java</groupId>
        <artifactId>java-dataloader</artifactId>
        <version>3.4.0</version>
    </dependency>
</dependencies>
Next, create the components and configure DataLoader.
import com.graphql.spring.boot.context.GraphQLContext;
import graphql.servlet.context.DefaultGraphQLServletContext;
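import graphql.Scalars;
import graphql.schema.GraphQLList;
import graphql.schema.GraphQLObjectType;
import org.dataloader.BatchLoader;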
import org.dataloader.DataLoader;
import org.dataloader.DataLoaderRegistry;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.context.request.WebRequest;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;
@SpringBootApplication
public class DataLoaderExampleApplication {

    // Simulated data source
    private static class Db {
        List<User> users = List.of(
            new User(1, "John"),
            new User(2, "Jane")
        );
        List<Post> posts = List.of(
            new Post(1, 1, "Post 1"),
            new Post(2, 2, "Post 2"),
            new Post(3, 1, "Post 3")
        );
    }

    // User class
    private static class User {
        private final int id;
        private final String name;

        User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() {
            return id;
        }

        String getName() {
            return name;
        }
    }

    // Post class
    private static class Post {
        private final int id;
        private final int userId;
        private final String title;

        Post(int id, int userId, String title) {
            this.id = id;
            this.userId = userId;
            this.title = title;
        }

        int getId() {
            return id;
        }

        int getUserId() {
            return userId;
        }

        String getTitle() {
            return title;
        }
    }

    // DataLoader batch loading function
    private static class BatchPostsByUserIds implements BatchLoader<Integer, List<Post>> {
        private final Db db;

        BatchPostsByUserIds(Db db) {
            this.db = db;
        }

        @Override
        public CompletionStage<List<List<Post>>> load(List<Integer> userIds) {
            System.out.println("Fetching posts for user ids: " + userIds);
            // Build one list of posts per requested user id, in the same order
            List<List<Post>> result = userIds.stream()
                .map(userId -> db.posts.stream()
                    .filter(post -> post.getUserId() == userId)
                    .collect(Collectors.toList()))
                .collect(Collectors.toList());
            return CompletableFuture.completedFuture(result);
        }
    }

    // GraphQL resolver
    private static class UserResolver implements GraphQLResolver<User> {
        private final DataLoader<Integer, List<Post>> postsDataLoader;

        UserResolver(DataLoader<Integer, List<Post>> postsDataLoader) {
            this.postsDataLoader = postsDataLoader;
        }

        // Return the future instead of calling join(): blocking here would wait
        // on a batch that has not been dispatched yet and would defeat batching
        CompletableFuture<List<Post>> getPosts(User user) {
            return postsDataLoader.load(user.getId());
        }
    }

    // GraphQL configuration
    @Bean
    public GraphQLSchemaProvider graphQLSchemaProvider() {
        return (graphQLSchemaBuilder, environment) -> {
            // Define the GraphQL schema; Post comes first because User refers to it
            GraphQLObjectType postObjectType = GraphQLObjectType.newObject()
                .name("Post")
                .field(field -> field.name("id").type(Scalars.GraphQLInt))
                .field(field -> field.name("title").type(Scalars.GraphQLString))
                .build();
            GraphQLObjectType userObjectType = GraphQLObjectType.newObject()
                .name("User")
                .field(field -> field.name("id").type(Scalars.GraphQLInt))
                .field(field -> field.name("name").type(Scalars.GraphQLString))
                .field(field -> field.name("posts").type(new GraphQLList(postObjectType)))
                .build();
            GraphQLObjectType queryObjectType = GraphQLObjectType.newObject()
                .name("Query")
                .field(field -> field.name("getUserById")
                    .type(userObjectType)
                    .argument(arg -> arg.name("id").type(Scalars.GraphQLInt))
                    // Named env so it does not clash with the outer lambda parameter
                    .dataFetcher(env -> {
                        // Retrieve the requested user ID
                        int userId = env.getArgument("id");
                        // Fetch the user by ID from the data source
                        Db db = new Db();
                        return db.users.stream()
                            .filter(user -> user.getId() == userId)
                            .findFirst()
                            .orElse(null);
                    }))
                .build();
            return graphQLSchemaBuilder.query(queryObjectType).build();
        };
    }

    // DataLoader registry bean
    @Bean
    public DataLoaderRegistry dataLoaderRegistry() {
        DataLoaderRegistry dataLoaderRegistry = new DataLoaderRegistry();
        Db db = new Db();
        dataLoaderRegistry.register("postsDataLoader", DataLoader.newDataLoader(new BatchPostsByUserIds(db)));
        return dataLoaderRegistry;
    }

    // GraphQL context builder
    @Bean
    public GraphQLContext.Builder graphQLContextBuilder(DataLoaderRegistry dataLoaderRegistry) {
        return new GraphQLContext.Builder().dataLoaderRegistry(dataLoaderRegistry);
    }

    public static void main(String[] args) {
        SpringApplication.run(DataLoaderExampleApplication.class, args);
    }
}
In this example, I define the Db class as a simulated data source with users and posts lists. I create a BatchPostsByUserIds class that implements the BatchLoader interface from DataLoader for batch loading of posts based on user IDs.
The UserResolver class is a GraphQL resolver that uses the postsDataLoader to load posts for a specific user.
For the configuration, I define the schema using GraphQLSchemaProvider and create a GraphQLObjectType for User and for Post, plus a Query object type with a data fetcher for the getUserById field.
The dataLoaderRegistry bean registers the postsDataLoader with the DataLoader registry.
This implementation will efficiently batch and cache requests for loading posts based on user IDs.
References
- GitHub Repository: graphql/dataloader
- Using DataLoader in GraphQL
- GraphQL.NET’s implementation of DataLoader
- Strawberry DataLoader (Python)
- GraphQL for Spring Java DataLoaders