DataLoader for GraphQL Implementations

A popular library used in GraphQL implementations is called DataLoader, and in many ways the name is somewhat descriptive of its purpose. As described in the JavaScript repo for the Node.js implementation for GraphQL

“DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.”

The DataLoader solvers the N+1 problem that otherwise requires a resolver to make multiple individual requests to a database (or data source, i.e. another API), resulting in inefficient and slow data retrieval.

A DataLoader serves as a batching and caching layer for combining multiple requests int a single request. Grouping together identical requests and executing them more efficiently, thus minimizing the number of database or API round trips.

DataLoader Operation:

  1. Create a new instance of DataLoader, specifying a batch loading function. This function would define how to load the data for a given set of keys.
  2. The resolver iterates through the collection and instead of fetching the related data adds the keys for the data to be fetched to the DataLoader instance.
  3. The DataLoader collects the keys and for multiple keys, deduplicates the request and executes.
  4. Once the batch is executed DataLoader returns the results associating them with their respective keys.
  5. The resolver can then access the response data and resolve the field or relationships as needed.

DataLoader also caches the results of the previous requests so if the same key is requested again DataLoader retrieves from cache instead of making another request. This caching further improves performance and reduces redundant fetching.

DataLoader Implementation Examples

JavaScript & Node.js

The following is a basic implementation using Apollo Server of DataLoader for GraphQL.

const { ApolloServer, gql } = require("apollo-server");
const { DataLoader } = require("dataloader");

// Simulated data source
const db = {
  users: [
    { id: 1, name: "John" },
    { id: 2, name: "Jane" },
  ],
  posts: [
    { id: 1, userId: 1, title: "Post 1" },
    { id: 2, userId: 2, title: "Post 2" },
    { id: 3, userId: 1, title: "Post 3" },
  ],
};

// Simulated asynchronous data loader function
const batchPostsByUserIds = async (userIds) => {
  console.log("Fetching posts for user ids:", userIds);
  const posts = db.posts.filter((post) => userIds.includes(post.userId));
  return userIds.map((userId) => posts.filter((post) => post.userId === userId));
};

// Create a DataLoader instance
const postsLoader = new DataLoader(batchPostsByUserIds);

const resolvers = {
  Query: {
    getUserById: (_, { id }) => {
      return db.users.find((user) => user.id === id);
    },
  },
  User: {
    posts: (user) => {
      // Use DataLoader to load posts for the user
      return postsLoader.load(user.id);
    },
  },
};

// Define the GraphQL schema
const typeDefs = gql`
  type User {
    id: ID!
    name: String!
    posts: [Post]
  }

  type Post {
    id: ID!
    title: String!
  }

  type Query {
    getUserById(id: ID!): User
  }
`;

// Create Apollo Server instance
const server = new ApolloServer({ typeDefs, resolvers });

// Start the server
server.listen().then(({ url }) => {
  console.log(`Server running at ${url}`);
});

This example I created a DataLoader instance postsLoader using the DataLoader class from the dataloader package. I define a batch loading function batchPostsByUserIds that takes an array of user IDs and retrieves the corresponding posts for each user from the db.posts array. The function returns an array of arrays, where each sub-array contains the posts for a specific user.

In the User resolver I user the load method of DataLoader to load the posts for a user. The load method handles batching and caching behind the scenes, ensuring that redundant requests are minimized and results are cached for subsequent requests.

When the GraphQL server receives a query for the posts field of a User the DataLoader automatically batches the requests for multiple users and executes the batch loading function to retrieve the posts.

This example demonstrates a very basic implementation of DataLoader in a GraphQL server. In a real-world scenario there would of course be a number of additional capabilities and implementation details that you’d need to work on for your particular situation.

Spring Boot Java Implementation

Just furthering the kinds of examples, the following is a Spring Boot example.

First add the dependencies.

<dependencies>
  <!-- GraphQL for Spring Boot -->
  <dependency>
    <groupId>com.graphql-java</groupId>
    <artifactId>graphql-spring-boot-starter</artifactId>
    <version>5.0.2</version>
  </dependency>
  
  <!-- DataLoader -->
  <dependency>
    <groupId>org.dataloader</groupId>
    <artifactId>dataloader</artifactId>
    <version>3.4.0</version>
  </dependency>
</dependencies>

Next create the components and configure DataLoader.

import com.graphql.spring.boot.context.GraphQLContext;
import graphql.servlet.context.DefaultGraphQLServletContext;
import org.dataloader.BatchLoader;
import org.dataloader.DataLoader;
import org.dataloader.DataLoaderRegistry;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.context.request.WebRequest;

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.stream.Collectors;

@SpringBootApplication
public class DataLoaderExampleApplication {

  // Simulated data source
  private static class Db {
    List<User> users = List.of(
        new User(1, "John"),
        new User(2, "Jane")
    );

    List<Post> posts = List.of(
        new Post(1, 1, "Post 1"),
        new Post(2, 2, "Post 2"),
        new Post(3, 1, "Post 3")
    );
  }

  // User class
  private static class User {
    private final int id;
    private final String name;

    User(int id, String name) {
      this.id = id;
      this.name = name;
    }

    int getId() {
      return id;
    }

    String getName() {
      return name;
    }
  }

  // Post class
  private static class Post {
    private final int id;
    private final int userId;
    private final String title;

    Post(int id, int userId, String title) {
      this.id = id;
      this.userId = userId;
      this.title = title;
    }

    int getId() {
      return id;
    }

    int getUserId() {
      return userId;
    }

    String getTitle() {
      return title;
    }
  }

  // DataLoader batch loading function
  private static class BatchPostsByUserIds implements BatchLoader<Integer, List<Post>> {
    private final Db db;

    BatchPostsByUserIds(Db db) {
      this.db = db;
    }

    @Override
    public CompletionStage<List<List<Post>>> load(List<Integer> userIds) {
      System.out.println("Fetching posts for user ids: " + userIds);
      List<List<Post>> result = userIds.stream()
          .map(userId -> db.posts.stream()
              .filter(post -> post.getUserId() == userId)
              .collect(Collectors.toList()))
          .collect(Collectors.toList());
      return CompletableFuture.completedFuture(result);
    }
  }

  // GraphQL resolver
  private static class UserResolver implements GraphQLResolver<User> {
    private final DataLoader<Integer, List<Post>> postsDataLoader;

    UserResolver(DataLoader<Integer, List<Post>> postsDataLoader) {
      this.postsDataLoader = postsDataLoader;
    }

    List<Post> getPosts(User user) {
      return postsDataLoader.load(user.getId()).join();
    }
  }

  // GraphQL configuration
  @Bean
  public GraphQLSchemaProvider graphQLSchemaProvider() {
    return (graphQLSchemaBuilder, environment) -> {
      // Define the GraphQL schema
      GraphQLObjectType userObjectType = GraphQLObjectType.newObject()
          .name("User")
          .field(field -> field.name("id").type(Scalars.GraphQLInt))
          .field(field -> field.name("name").type(Scalars.GraphQLString))
          .field(field -> field.name("posts").type(new GraphQLList(postObjectType)))
          .build();

      GraphQLObjectType postObjectType = GraphQLObjectType.newObject()
          .name("Post")
          .field(field -> field.name("id").type(Scalars.GraphQLInt))
          .field(field -> field.name("title").type(Scalars.GraphQLString))
          .build();

      GraphQLObjectType queryObjectType = GraphQLObjectType.newObject()
          .name("Query")
          .field(field -> field.name("getUserById")
              .type(userObjectType)
              .argument(arg -> arg.name("id").type(Scalars.GraphQLInt))
              .dataFetcher(environment -> {
                // Retrieve the requested user ID
                int userId = environment.getArgument("id");
                // Fetch the user by ID from the data source
                Db db = new Db();
                return db.users.stream()
                    .filter(user -> user.getId() == userId)
                    .findFirst()
                    .orElse(null);
              }))
          .build();

      return graphQLSchemaBuilder.query(queryObjectType).build();
    };
  }

  // DataLoader registry bean
  @Bean
  public DataLoaderRegistry dataLoaderRegistry() {
    DataLoaderRegistry dataLoaderRegistry = new DataLoaderRegistry();
    Db db = new Db();
    dataLoaderRegistry.register("postsDataLoader", DataLoader.newDataLoader(new BatchPostsByUserIds(db)));
    return dataLoaderRegistry;
  }

  // GraphQL context builder
  @Bean
  public GraphQLContext.Builder graphQLContextBuilder(DataLoaderRegistry dataLoaderRegistry) {
    return new GraphQLContext.Builder().dataLoaderRegistry(dataLoaderRegistry);
  }

  public static void main(String[] args) {
    SpringApplication.run(DataLoaderExampleApplication.class, args);
  }
}

This example I define the Db class as a simulated data source with users and posts lists. I create a BatchPostsByUserIds class that implements the BatchLoader interface from DataLoader for batch loading of posts based on user IDs.

The UserResolver class is a GraphQL resolver that uses the postsDataLoader to load posts for a specific user.

For the configuration I define the schema using GraphQLSchemaProvider and create GraphQLObjectType for User and Post, and Query object type with a resolver for the getUserById field.

The dataLoaderRegistry bean registers the postsDataLoader with the DataLoader registry.

This implementation will efficiently batch and cache requests for loading posts based on user IDs.

References

Other GraphQL Standards, Practices, Patterns, & Related Posts

Vue.js Studies Day 6 – Beyond “Getting Started”

The previous post to this one, just included some definitions to ensure I’ve covered the core topics as I’m moving along. Before that I covered getting a project app started in day 3 and day 2 posts. Before that I covered the curriculum flow I’m covering in these studies.

Now that I’ve covered a number of ways to get started with a Vue project with; JetBrains WebStorm, hand crafted artisanal, or via NPM it’s time to start covering the basics of how to build an app using Vue.js. I’ll be using the WebStorm IDE for development but will still call out where or what I’d need to do if I were working from the terminal with something else. By using the WebStorm IDE, I’ll also be starting from the generated app that the IDE created for me.

main.js

Starting in the src directory of the project, let’s cover a few of the key parts of this project of the application as created. First, we have a main.js file. In that file is the following code. This little bit of code is what is responsible for working with the Virtual DOM to mount the application to it for rendering the page we eventually see.

import { createApp } from 'vue'
import App from './App.vue'

import './assets/main.css'

createApp(App).mount('#app')

The first two lines import two variables from the ‘vue’ library called createApp and create an App object from ‘./App.vue’ template. Then the third import is just pulling in the main.css file. The fourth line uses the createApp operatig on the App object to .mount the application at the point of the #app anchor in the template App.vue. That leads to needing to take a look at App.vue to see what exactly the vue app is mounted to.

NOTE: I’m not 100% sure I’m describing this as precisely as I should, as the blog entry details, this is day 4 of Vue.js studies. I’d started in the past once or twice, and have done a ton of web dev over the years, but Vue.js is largely new to me. So if you see something worded incorrectly or oddly, please comment with questions or corrections.

App.vue

The App.vue file for the WebStorm generation seems to have a bit more from the other apps generated from other tooling. Whatever the case, let’s pick this apart and learn what is which and which is what in this file.

<script setup>
import HelloWorld from './components/HelloWorld.vue'
import TheWelcome from './components/TheWelcome.vue'
</script>

<template>
  <header>
    <img alt="Vue logo" class="logo" src="./assets/logo.svg" width="125" height="125" />

    <div class="wrapper">
      <HelloWorld msg="You did it!" />
    </div>
  </header>

  <main>
    <TheWelcome />
  </main>
</template>

<style scoped>
header {
  line-height: 1.5;
}

.logo {
  display: block;
  margin: 0 auto 2rem;
}

@media (min-width: 1024px) {
  header {
    display: flex;
    place-items: center;
    padding-right: calc(var(--section-gap) / 2);
  }

  .logo {
    margin: 0 2rem 0 0;
  }

  header .wrapper {
    display: flex;
    place-items: flex-start;
    flex-wrap: wrap;
  }
}
</style>

Right at the beginning of the App.vue file there is a <script setup></script> section that imports two other components that are located in the components directory that the App.vue file is located in. This is the HellowWorld.vue and TheWelcome.vue components. There is a third component, the WelcomeItem.vue component, in the component directory along with an icons directory.

With all of these assets now ready, clicking on the play or debug buttons in WebStorm will issue to the command to execute and display the app in the default browser. I’ll cover more of the assets as we get to each of them, for now I’m going to move along and get into some components and all.

When the IDE (or you do manually) launches the browser to display the app it will look like this.

Checkpoint: Vue App Operational

This is a good time to checkpoint. If you’ve been following along and everything is as shown above, we’re good to move along to the next steps. If not, then something is amiss, probably broken, and that needs resolved first. If you’ve run into a problem at this point, leave a comment with a description of the issue and I’ll endeavor to help you resolve the problem. Otherwise, moving along to some edits and checking out Vue.js’s features and capabilities.

Once the next post is live, I’ll add the link here to that post… it’s coming soon!

Day 5 Studies – Linked Lists 😱

I needed a refresher, as I’m at the point in my career I tend to just implement the things and not remember what they’re called anymore. Whether a refresher or you’re new to coding, maybe my refresher will be a good stroll through linked lists.

First off, what exactly is a linked list? The name sort of defines what we’re talking about, but a definition would definitely help. From an actual dictionary, I dug up the following definition.

“A simple linear data structure, each of whose nodes includes a pointer to the next (and sometimes previous) node in the list, enabling traversal of the structure in at least one direction from any starting point.”

via Wordnick

The parenthesis above overloads the definition a bit in describing both a simple linked list and a doubly linked list. Let’s define further to differentiate those two and throw in a circular one to boot.

Singly Linked Lists

Each node contains a pointer to the next node. The code for a singly linked list would look something like this, if you put it together in JavaScript.

class ListNode {
    constructor(data) {
        this.data = data
        this.next = null
    }
}

class LinkedList {
    constructor(head = null) {
        this.head = head
    }
}

Note however, in this example, the pointer is really the next inferred object. You’d need to use a language like C, or C++, to build a linked list with an actual pointer. that would look something like this.

struct Node {
    int data;
    struct Node* next;
};

#include <bits/stdc++.h>
using namespace std;
 
class Node {
public:
    int data;
    Node* next;
};
 
void printList(Node* n)
{
    while (n != NULL) {
        cout << n->data << " ";
        n = n->next;
    }
}

Here you can see the * used to set a pointer and then the reference to move between the nodes below. It’s a bit more complicated using a language like this, but I wanted to – because I’m often complicated – point out how it would be done with C++. This introduces additional capabilities, but I won’t go on about the nuances of using actual pointers here and am instead going to focus on the linked list concepts themselves. However it doesn’t hurt to know this and along with pointers themselves, this is a great topic for another post another time, so back to linked lists!

Referring back to the JavaScript example above, if you were to build out several nodes like this.

let node1 = new ListNode("a singular node.")
let node2 = new ListNode("some other node.")
let node3 = new ListNode("and another rando node.")

Then put together the nodes, and instantiate a linked list like this.

node1.next = node2
node2.next = node3

let list = new LinkedList(node1)

You can then request various nodes of the linked list or get the linked list object itself like this.

console.log(list.head)  
console.log(list.head.next.data)
console.log(list.head.next.next.data)

The first console log would print out the entire list.head object, which means all of the nested objects too. That would look like this.

ListNode {
  data: 'a singular node.',
  next: ListNode {
    data: 'some other node.',
    next: ListNode { data: 'and another rando node.', next: null }
  }
}

The second and third console logs would print out "some other node." and "and another rando node." respectively.

Double Linked Lists

Each node contains a pointer to the next and previous nodes. That turns the above example into this.

class ListNode {
    constructor(data) {
        this.data = data
        this.next = null
        this.prev = null
    }
}

class LinkedList {
    constructor(head = null) {
        this.head = head
    }
}

Summary

That’s the short SITREP on LinkedLists. There’s a lot more to cover, the functions that work on the list, the traversals, the addition and removal of nodes, etc. But this post is just intended for a quick review of the structure of linked lists at their core. On to next topics, cheers!

Could not resolve “@popperjs/core”

I keep getting this error on running dev. Albeit it doesn’t appear I’m getting it in production.

I run.

npm run dev

Then everything appears to be ok, with the standard message like this from vite.

  vite v2.9.9 dev server running at:

  > Local: http://localhost:3000/
  > Network: use `--host` to expose

  ready in 740ms.

But then, once I navigate to the site to check things out.

X [ERROR] Could not resolve "@popperjs/core"

    node_modules/bootstrap/dist/js/bootstrap.esm.js:6:24:
      6 │ import * as Popper from '@popperjs/core';
        ╵                         ~~~~~~~~~~~~~~~~

  You can mark the path "@popperjs/core" as external to exclude it from the bundle, which will
  remove this error.

…and this error annoyingly crops up.

11:36:23 PM [vite] error while updating dependencies:
Error: Build failed with 1 error:
node_modules/bootstrap/dist/js/bootstrap.esm.js:6:24: ERROR: Could not resolve "@popperjs/core"
    at failureErrorWithLog (C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:1603:15)
    at C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:1249:28
    at runOnEndCallbacks (C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:1034:63)
    at buildResponseToResult (C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:1247:7)
    at C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:1356:14
    at C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:666:9
    at handleIncomingPacket (C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:763:9)
    at Socket.readFromStdout (C:\Users\Adron Hall\Codez\estuary\node_modules\esbuild\lib\main.js:632:7)
    at Socket.emit (events.js:315:20)
    at addChunk (_stream_readable.js:309:12)
Vite Error, /node_modules/.vite/deps/pinia.js?v=72977742 optimized info should be defined
Vite Error, /node_modules/.vite/deps/bootstrap.js?v=14c3224a optimized info should be defined
Vite Error, /node_modules/.vite/deps/pinia.js?v=72977742 optimized info should be defined
Vite Error, /node_modules/.vite/deps/pinia.js?v=72977742 optimized info should be defined (x2)

...

…and on and on and on goes these errors. For whatever reason, even though npm install has been run once just get to the point of running npm run dev, there needs to be a subsequent, specifically executed npm install @popperjs/core to install this particular dependency that throws this error.

So that’s it, that’s your fix. Cheers!

The Best Collected Details on the GraphQL Specification, Section 3

References https://spec.graphql.org specifically October 2021 Edition.

This is the second part (the first part covered the overview and language of GraphQL) to a collection of notes and details, think of this as the cliff notes of the GraphQL Spec. Onward to section 3 of the spec…

The GraphQL type system is described in section 3 of the specification. Per the specification itself,

The GraphQl Type system describes the capabilities of a GraphQL service and is used to determine if a requested operation is valid, to guarantee the type of response results, and describes the input types of variables to determine if values provided at request time are valid.

This feature of the specification for the GraphQL language uses Interface Definition Language (IDL) to describe the type system. This can be used by tools to provide utility function as client code genration or boot-strapping. In a lot of the services and products around GraphQL like AppSync, Hasura, and others you’ll see this specifically in action. Tools that only execute requests can only allow TypeSystemDocument and disallow ExecuteDefintion or TypeSystemExtension to prevent extensions of the type system. If you do this be sure to provide a descriptive error for consumers of your data!

Continue reading “The Best Collected Details on the GraphQL Specification, Section 3”