DataLoader for GraphQL Implementations

A popular library used in GraphQL implementations is called DataLoader, and the name is fairly descriptive of its purpose. As described in the JavaScript repo for the Node.js implementation:

“DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.”

DataLoader solves the N+1 problem, in which a resolver makes one request for a list of N items and then N more individual requests to a database (or other data source, e.g. another API) for the related data, resulting in inefficient and slow data retrieval.
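To make the N+1 problem concrete, here is a minimal sketch that counts the queries a naive resolver issues. The db object, its queryCount field, and the findUsers and findPostsByUserId helpers are all hypothetical stand-ins for a real data source, not part of any library.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
const db = {
  queryCount: 0,
  users: [
    { id: 1, name: "John" },
    { id: 2, name: "Jane" },
  ],
  posts: [
    { id: 1, userId: 1, title: "Post 1" },
    { id: 2, userId: 2, title: "Post 2" },
    { id: 3, userId: 1, title: "Post 3" },
  ],
  findUsers() {
    this.queryCount += 1; // one query for the list of users
    return this.users;
  },
  findPostsByUserId(userId) {
    this.queryCount += 1; // one more query for every user
    return this.posts.filter((post) => post.userId === userId);
  },
};

// Naive resolution: 1 query for the users, then N more, one per user.
function resolveUsersWithPostsNaively() {
  const users = db.findUsers();
  return users.map((user) => ({ ...user, posts: db.findPostsByUserId(user.id) }));
}

const result = resolveUsersWithPostsNaively();
console.log(`users: ${result.length}, queries: ${db.queryCount}`);
// prints "users: 2, queries: 3"
```

With two users that is three queries; with a thousand users it would be a thousand and one, which is the cost that batching is meant to remove.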

A DataLoader serves as a batching and caching layer that combines multiple requests into a single request. It groups together the requests made while a query is being resolved, deduplicates identical keys, and executes them as one batch, minimizing the number of database or API round trips.

DataLoader Operation:

  1. Create a new instance of DataLoader, specifying a batch loading function. This function defines how to load the data for a given set of keys.
  2. The resolver iterates through the collection and, instead of fetching the related data directly, adds the keys for the data to be fetched to the DataLoader instance.
  3. The DataLoader collects the keys, deduplicates them, and executes the batch loading function once with the whole set.
  4. Once the batch is executed, DataLoader returns the results, associating them with their respective keys.
  5. The resolver can then access the response data and resolve the field or relationships as needed.
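The steps above can be sketched with a tiny stand-in class. This is not the dataloader package's code: TinyLoader and its dispatch method are hypothetical names, caching is left out, and I'm assuming queueMicrotask is a good enough approximation of how the real library waits before dispatching a batch.

```javascript
// A tiny, hypothetical stand-in for DataLoader's batching (no caching).
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn; // step 1: the batch loading function
    this.queue = []; // pending { key, resolve, reject } entries
  }

  // Step 2: resolvers call load(key) and immediately get a promise back.
  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      if (this.queue.length === 1) {
        // Schedule a single dispatch for after the current synchronous code.
        queueMicrotask(() => this.dispatch());
      }
    });
  }

  // Steps 3 and 4: deduplicate keys, run one batch, match values to callers.
  async dispatch() {
    const pending = this.queue;
    this.queue = [];
    const uniqueKeys = [...new Set(pending.map((entry) => entry.key))];
    try {
      const values = await this.batchFn(uniqueKeys);
      const byKey = new Map(uniqueKeys.map((key, i) => [key, values[i]]));
      pending.forEach((entry) => entry.resolve(byKey.get(entry.key)));
    } catch (err) {
      pending.forEach((entry) => entry.reject(err));
    }
  }
}

// Usage: three load calls (one with a duplicate key) become one batch call.
const batchCalls = [];
const loader = new TinyLoader(async (keys) => {
  batchCalls.push(keys);
  return keys.map((key) => `value-for-${key}`);
});

function demo() {
  // Step 5: each caller receives the value for its own key.
  return Promise.all([loader.load(1), loader.load(2), loader.load(1)]);
}

demo().then((values) => console.log(values, batchCalls));
```

The three load calls resolve to three values, while the batch function runs a single time with the keys 1 and 2.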

DataLoader also caches the results of previous requests, so if the same key is requested again on the same DataLoader instance it is served from the cache instead of triggering another request. This caching further improves performance and reduces redundant fetching.
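The caching idea can be sketched as simple per-key memoization. The createCachedLoader function below is a hypothetical helper of my own, not the library's implementation; it only assumes that storing the promise for each key is enough to avoid a second fetch.

```javascript
// Hypothetical sketch of per-key memoization; not the dataloader package's code.
function createCachedLoader(fetchOne) {
  const cache = new Map(); // key -> promise of the value
  return {
    load(key) {
      if (!cache.has(key)) {
        cache.set(key, Promise.resolve(fetchOne(key)));
      }
      return cache.get(key);
    },
    clear(key) {
      cache.delete(key); // e.g. after a mutation changes the record
    },
  };
}

let fetchCount = 0;
const userLoader = createCachedLoader(async (id) => {
  fetchCount += 1;
  return { id, name: `User ${id}` };
});

async function demoCache() {
  const first = await userLoader.load(1);
  const second = await userLoader.load(1); // served from the cache
  return { sameObject: first === second, fetchCount };
}

demoCache().then((summary) => console.log(summary));
```

Because a cache like this lives as long as the loader does, it is usually safest to create loaders per request so that one user's cached data is never served to another.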

DataLoader Implementation Examples

JavaScript & Node.js

The following is a basic implementation of DataLoader for GraphQL using Apollo Server.

const { ApolloServer, gql } = require("apollo-server");
// The dataloader package exports the DataLoader class directly
const DataLoader = require("dataloader");

// Simulated data source
const db = {
  users: [
    { id: 1, name: "John" },
    { id: 2, name: "Jane" },
  ],
  posts: [
    { id: 1, userId: 1, title: "Post 1" },
    { id: 2, userId: 2, title: "Post 2" },
    { id: 3, userId: 1, title: "Post 3" },
  ],
};

// Simulated asynchronous batch loading function
const batchPostsByUserIds = async (userIds) => {
  console.log("Fetching posts for user ids:", userIds);
  const posts = db.posts.filter((post) => userIds.includes(post.userId));
  // Return one array of posts per user ID, in the same order as userIds
  return userIds.map((userId) => posts.filter((post) => post.userId === userId));
};

// Create a DataLoader instance
const postsLoader = new DataLoader(batchPostsByUserIds);

const resolvers = {
  Query: {
    getUserById: (_, { id }) => {
      // ID arguments arrive as strings, so convert before comparing
      return db.users.find((user) => user.id === Number(id));
    },
  },
  User: {
    posts: (user) => {
      // Use DataLoader to load posts for the user
      return postsLoader.load(user.id);
    },
  },
};

// Define the GraphQL schema
const typeDefs = gql`
  type User {
    id: ID!
    name: String!
    posts: [Post]
  }

  type Post {
    id: ID!
    title: String!
  }

  type Query {
    getUserById(id: ID!): User
  }
`;

// Create Apollo Server instance
const server = new ApolloServer({ typeDefs, resolvers });

// Start the server
server.listen().then(({ url }) => {
  console.log(`Server running at ${url}`);
});

In this example I create a DataLoader instance, postsLoader, using the DataLoader class from the dataloader package. I define a batch loading function, batchPostsByUserIds, that takes an array of user IDs and retrieves the corresponding posts for each user from the db.posts array. The function returns an array of arrays, in the same order as the user IDs it was given, where each sub-array contains the posts for a specific user.
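DataLoader expects the batch function to return an array with the same length and order as the keys it received. The following sketch shows one way to build that shape from a flat list of rows; groupInKeyOrder is a hypothetical helper name, and the sample posts mirror the ones used above.

```javascript
// Hypothetical helper: group rows by key and return one list per requested key.
function groupInKeyOrder(keys, rows, keyOf) {
  const groups = new Map(keys.map((key) => [key, []]));
  for (const row of rows) {
    const group = groups.get(keyOf(row));
    if (group) group.push(row); // ignore rows for keys nobody asked for
  }
  // Same length and order as `keys`; keys without rows get an empty list.
  return keys.map((key) => groups.get(key));
}

const posts = [
  { id: 1, userId: 1, title: "Post 1" },
  { id: 2, userId: 2, title: "Post 2" },
  { id: 3, userId: 1, title: "Post 3" },
];

const grouped = groupInKeyOrder([2, 1, 99], posts, (post) => post.userId);
console.log(grouped.map((list) => list.map((post) => post.title)));
// prints [ [ 'Post 2' ], [ 'Post 1', 'Post 3' ], [] ]
```

Grouping through a Map makes a single pass over the rows, which matters once a batch covers more than a handful of keys.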

In the User resolver I use the load method of DataLoader to load the posts for a user. The load method handles batching and caching behind the scenes, ensuring that redundant requests are minimized and results are cached for subsequent requests.

When the GraphQL server receives a query for the posts field of a User, the DataLoader automatically batches the requests for multiple users and executes the batch loading function to retrieve the posts.

This example demonstrates a very basic implementation of DataLoader in a GraphQL server. In a real-world scenario there would of course be a number of additional capabilities and implementation details that you’d need to work on for your particular situation.

Spring Boot Java Implementation

To round out the examples, the following is a Spring Boot implementation.

First add the dependencies.

  <!-- GraphQL for Spring Boot -->
  <!-- DataLoader -->

Next create the components and configure DataLoader.

import com.graphql.spring.boot.context.GraphQLContext;
import graphql.Scalars;
import graphql.schema.GraphQLList;
import graphql.schema.GraphQLObjectType;
import graphql.servlet.context.DefaultGraphQLServletContext;
import org.dataloader.BatchLoader;
import org.dataloader.DataLoader;
import org.dataloader.DataLoaderRegistry;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.context.request.WebRequest;
// GraphQLResolver and GraphQLSchemaProvider also need imports; their packages
// depend on the version of the GraphQL Spring Boot starter in use.

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;

@SpringBootApplication
public class DataLoaderExampleApplication {

  // Simulated data source
  private static class Db {
    List<User> users = List.of(
        new User(1, "John"),
        new User(2, "Jane")
    );

    List<Post> posts = List.of(
        new Post(1, 1, "Post 1"),
        new Post(2, 2, "Post 2"),
        new Post(3, 1, "Post 3")
    );
  }
  // User class
  private static class User {
    private final int id;
    private final String name;

    User(int id, String name) {
      this.id = id;
      this.name = name;
    }

    int getId() {
      return id;
    }

    String getName() {
      return name;
    }
  }
  // Post class
  private static class Post {
    private final int id;
    private final int userId;
    private final String title;

    Post(int id, int userId, String title) {
      this.id = id;
      this.userId = userId;
      this.title = title;
    }

    int getId() {
      return id;
    }

    int getUserId() {
      return userId;
    }

    String getTitle() {
      return title;
    }
  }
  // DataLoader batch loading function
  private static class BatchPostsByUserIds implements BatchLoader<Integer, List<Post>> {
    private final Db db;

    BatchPostsByUserIds(Db db) {
      this.db = db;
    }

    @Override
    public CompletionStage<List<List<Post>>> load(List<Integer> userIds) {
      System.out.println("Fetching posts for user ids: " + userIds);
      // Build one list of posts per user ID, in the same order as userIds
      List<List<Post>> result = userIds.stream()
          .map(userId -> db.posts.stream()
              .filter(post -> post.getUserId() == userId)
              .collect(Collectors.toList()))
          .collect(Collectors.toList());
      return CompletableFuture.completedFuture(result);
    }
  }
  // GraphQL resolver
  private static class UserResolver implements GraphQLResolver<User> {
    private final DataLoader<Integer, List<Post>> postsDataLoader;

    UserResolver(DataLoader<Integer, List<Post>> postsDataLoader) {
      this.postsDataLoader = postsDataLoader;
    }

    // Return the future rather than calling join() on it, so the loads can be
    // collected and dispatched as one batch instead of blocking here
    CompletableFuture<List<Post>> getPosts(User user) {
      return postsDataLoader.load(user.getId());
    }
  }
  // GraphQL configuration
  // Note: the exact GraphQLSchemaProvider signature varies by library version.
  @Bean
  public GraphQLSchemaProvider graphQLSchemaProvider() {
    return (graphQLSchemaBuilder, environment) -> {
      // Define the GraphQL schema. Post comes first because User refers to it.
      GraphQLObjectType postObjectType = GraphQLObjectType.newObject()
          .name("Post")
          .field(field -> field.name("id").type(Scalars.GraphQLInt))
          .field(field -> field.name("title").type(Scalars.GraphQLString))
          .build();

      GraphQLObjectType userObjectType = GraphQLObjectType.newObject()
          .name("User")
          .field(field -> field.name("id").type(Scalars.GraphQLInt))
          .field(field -> field.name("name").type(Scalars.GraphQLString))
          .field(field -> field.name("posts").type(new GraphQLList(postObjectType)))
          .build();

      GraphQLObjectType queryObjectType = GraphQLObjectType.newObject()
          .name("Query")
          .field(field -> field.name("getUserById")
              .type(userObjectType)
              .argument(arg -> arg.name("id").type(Scalars.GraphQLInt))
              // Named env so it does not clash with the outer lambda parameter
              .dataFetcher(env -> {
                // Retrieve the requested user ID
                int userId = env.getArgument("id");
                // Fetch the user by ID from the data source
                Db db = new Db();
                return db.users.stream()
                    .filter(user -> user.getId() == userId)
                    .findFirst()
                    .orElse(null);
              }))
          .build();

      return graphQLSchemaBuilder.query(queryObjectType).build();
    };
  }

  // DataLoader registry bean
  @Bean
  public DataLoaderRegistry dataLoaderRegistry() {
    DataLoaderRegistry dataLoaderRegistry = new DataLoaderRegistry();
    Db db = new Db();
    dataLoaderRegistry.register("postsDataLoader", DataLoader.newDataLoader(new BatchPostsByUserIds(db)));
    return dataLoaderRegistry;
  }

  // GraphQL context builder
  @Bean
  public GraphQLContext.Builder graphQLContextBuilder(DataLoaderRegistry dataLoaderRegistry) {
    return new GraphQLContext.Builder().dataLoaderRegistry(dataLoaderRegistry);
  }

  public static void main(String[] args) {
    SpringApplication.run(DataLoaderExampleApplication.class, args);
  }
}

In this example I define the Db class as a simulated data source with users and posts lists. I create a BatchPostsByUserIds class that implements the BatchLoader interface from DataLoader for batch loading of posts based on user IDs.

The UserResolver class is a GraphQL resolver that uses the postsDataLoader to load posts for a specific user.

For the configuration I define the schema using GraphQLSchemaProvider, creating a GraphQLObjectType for User and for Post, plus a Query object type with a data fetcher for the getUserById field.

The dataLoaderRegistry bean registers the postsDataLoader with the DataLoader registry.

This implementation will efficiently batch and cache requests for loading posts based on user IDs.


Other GraphQL Standards, Practices, Patterns, & Related Posts

A Hasura Quick Start with Remote Schema, Remote Joins

I’ve been building GraphQL APIs for a number of years now, alongside RESTful, gRPC, XML, and other API styles I won’t even bring up right now, and so far GraphQL APIs have been great to work with. The libraries in different languages, from .NET’s Hot Chocolate, Go’s graphql-go, and Apollo’s JavaScript based tooling and servers to Java’s GraphQL for Spring, have worked great.

Sometimes you’re in the fortunate situation where you’re using PostgreSQL, SQL Server, or another database supported by a tool like Hasura. Being able to get a full GraphQL (with REST options too) API running in seconds is pretty impressive. From a development perspective it is a massive boost. As Hasura adds more database connectors, as they have with Snowflake and Amazon Athena, the server and tooling become even more powerful.

With that I wanted to show an N+1 demo, where N is day 1 with Hasura. The idea is: what do you do immediately after you get a sample service running with Hasura? How do you integrate it with other services, or more specifically, how do you integrate your Hasura API alongside APIs you’ve written yourself, such as an enterprise GraphQL for Spring based API running against Mongo or another data source? This repo is the basis for several demonstration repositories I am building that will show how you can set up, generally for local development, Hasura + X API with Y language stack.

This is the Hasura quick start repository here, with migrations and metadata for a local setup. The first demonstration repo for a peripheral GraphQL API will be a Spring based API in this repository. The following steps will get the quick start repository up and running.

  1. Clone this repo git clone
  2. From the root (where the docker-compose.yml file is located) execute docker compose up -d.
  3. Navigate into the hasura directory.
  4. Execute hasura metadata apply, then hasura migrate apply, and then hasura metadata apply. Just do it, it’s a strange workflow thing.
  5. Still in the hasura directory, execute hasura console.

These steps are demonstrated in this video from 48 seconds.

What do you get once deployed?

The following are some of the core capabilities of Hasura and showcase what you can get up and running in a matter of seconds, even when you start from a completely empty database! First off you’ll find the database now has 3 tables along with their pertinent schema built out in PostgreSQL and available via Hasura, as shown here under the Data tab of the console.

I also created a schema diagram just to provide a visual of how these tables are designed.

For the remote schema, the Spring API, the following steps will get it cloned and running locally.

  1. Clone this repo git clone
  2. Execute ./gradlew build to get the jar file built. It will then be located in the build/libs directory of the project.
  3. Next build the docker image locally with docker build -t adron/hasura-spring-boot-graphql .
  4. Now you can either start this container with docker compose up -d using the docker-compose.yml in the project or you can run the image with Docker specifically with docker run -p 8081:8080 adron/hasura-spring-boot-graphql.

For a walkthrough of getting the Spring API running, check out 2:28 onward in this video.

Now both of these instances are running locally and you can test each out respectively, but not specifically together. I’ll probably have to write up another post on how to get services that spin up separately to run together for localized development. However, with the way things are set up in the two repos, it’s as if one team is the Hasura team building a GraphQL API and another is a Spring Java GraphQL API team, and they’re working autonomously of each other based just on the contract of the APIs themselves.

Remote Schema

With that being the scenario, I’ve deployed the Spring API out remotely so that I could show how to put together a remote schema connection and then a remote join query, i.e. nested query in GraphQL speak, across these two APIs.

To add the remote schema, click on the remote schemas tab on the console. Add a name (1), then the URI (2), and optionally if needed add appropriate headers (3) or forward all headers from client requests.

Once that’s added, navigate to the relationships tab of the new remote schema and click on add. Then for this example, select remote database (1), then add a name (4) (Customer in the example) and then for type choose object (3) (per the example).

Then scroll down on that console screen and choose sales_data (1) and default, public, and users (2) under the reference database, schema, and table. Next up choose the source field (3) and reference column (4).

Once added it will look like this in the console.

This creates a relationship that makes nested queries against these sources possible with GraphQL. If it were a single contiguous database the schema would look like this. I’ve color coded the sales_data table as red, to signify it is the table we know is in another database (or, specifically, provided via another hosted API). The relationship, however, isn’t in a database; it is stored in the Hasura metadata between users and sales_data.

Now a query across this data shapes up like this. Because of the way the relationship was drawn via the remote schema, the path to the nested object Customer (2) for the sales data starts with the sales_data (1) entity, as shown.

sales_data {
  Customer {

Now we want to add more details about the particular customer like their email and details. To do this we’ll utilize another nesting level within this query that delves into relationships that are in the PostgreSQL database itself.

sales_data {
  Customer {
    emails {
    details {

With this the nested emails (3) and details (4) will be provided. These are foreign key relationships to the primary key table users in the underlying database, made available by Hasura’s relationships in metadata.

Boom! That’s it. Pretty easy setup if the databases and APIs have Hasura available to connect them in this way. Otherwise, this is a huge challenge to develop against if you’re solely using a tech stack like Apollo, Spring Boot, or Hot Chocolate. Often federation, and the additional complexity that comes with it, would come into play. But more on that later, I’ve got a piece coming on federation, stitching, remote schemas, and gateways, among various ways to get multiple GraphQL, or GraphQL and RESTful, APIs together into a singular, or singularly managed, API endpoint.

Hope that was useful. If you’ve got comments, questions, or curiosities let me know in the comments here, or pop over to the video and leave a comment there.


The full video of setup and how the remote schema & joins work in Hasura.

Gradle Build Tool

A few helpful links and details on where to find information about the Gradle Build Tool.


Via SDKMAN with sdk install gradle x.y.z, where x.y.z is the version, like 8.0.2.

Via Brew with brew install gradle.

To install manually, check out the instructions here.

Building a Java Library (or application, Gradle plugin, etc)

Use the init task from inside the directory for the pertinent project.

gradle init

You’ll be prompted for options.

With the project initialized this is what that initialized folder structure looks like.

At this point add the Java code for the library, similar to this example, and execute a build like this.

./gradlew build

Build Collateral

View the test report via the HTML output file at lib/build/reports/tests/test/index.html.

The JAR file is available in lib/build/libs with the name lib.jar. Verify the archive is valid with jar tf lib/build/libs/lib.jar.

Set the version by adding version = '0.1.1' to the build.gradle file.

Run the jar task ./gradlew jar and the build will create a lib/build/libs/lib-0.1.1.jar with the expected version.

To record the library name and version in the JAR manifest, add the following to the build.gradle file:

tasks.named('jar') {
    manifest {
        attributes('Implementation-Title': project.name,
                   'Implementation-Version': project.version)
    }
}

To verify this all works, execute ./gradlew jar and then extract the MANIFEST.MF via jar xf lib/build/libs/lib-0.1.1.jar META-INF/MANIFEST.MF.

Adding API Docs

In the library’s Java source file, replace the /* that opens the comment with /** so that we get Javadoc markup.

Run the ./gradlew javadoc task. The generated javadoc files are located at lib/build/docs/javadoc/index.html.

To add this as a build task, in build.gradle add a section with the following:

java {
    withJavadocJar()
}

Publish a Build Scan

Execute a build scan with ./gradlew build --scan.

Common Issues + Tips n’ Tricks

gradlew – Permission Denied issue

Let’s say you execute Gradle with ./gradlew and whatever parameter, and immediately get a response of “Permission Denied”. The most common solution, especially for gradlew executables included in repositories, is to just give the executable permission to execute. This is done with a simple chmod +x gradlew and you should now be ready to execute!