Efficiently compiling GraphQL queries for MongoDB performance ⚡

We are thrilled to announce the availability of our Hasura MongoDB 1.0 connector.

This improvement shows our dedication to giving our users cutting-edge database solutions, not just simple interfaces to run queries.

Since its release, the performance of our database connector has emerged as a recurrent theme in the feedback we have gotten. One of the most common questions we have heard is, "How does it perform?" And then another popular follow-up is, "What is the magic going on under the hood to achieve performance with MongoDB and GraphQL?"

In this blog post, we break down and explore these important issues.

How do we achieve lightning-quick performance?

So how do we manage to make our Hasura MongoDB 1.0 connector so quick?

It essentially comes down to a few key tactics. The main overarching tactic is database push-downs and query compiling.

Simply put, we aim to transfer less data and make fewer server requests. This helps speed things up. To do this, we reduce unnecessary data fetching and processing on the server's (or clients') side and let the database handle the bulk of the work.

How do we do it? The user sends us a GraphQL request, which we extract, enrich with relations, permissions, and predicates, then compile before sending to the MongoDB data source.

This means we only get the exact data we need and nothing extra. The result?Faster data retrieval and a smoother experience for users.

In the following examples, we’ll be using a Chinook sample database, which can be found here.

Optimizing simple queries

Every query counts when it comes to improving performance, even the simplest ones. For these basic queries, we begin by compiling them with projection.

This makes sure that only the necessary fields are retrieved from the database.

Without Hasura, these queries frequently execute without such projection, resulting in the retrieval of extra fields that are subsequently removed at the server level.

Below is an example of only retrieving the necessary fields for a query from a Collection.

GraphQL Query:

query getArtists {
  Artist {
    Name
  }
}

Compiled Database Query:

{
  "command": {
    "aggregate": "Artist",
    "pipeline": [
      { "$match": {} },
      {
        "$replaceWith": {
          "Name": "$Name"
        }
      }
    ],
    "cursor": {},
    "$db": "chinook"
  }
}

However, it is important to keep in mind that even when data flow between the database and the server is optimized through projection, there is still a performance cost (at the database layer) associated with querying Documents that are overstuffed.

This is why it's crucial to structure your Documents appropriately for performance.
Ensure that Documents are right-sized, and split any non-essential embedded fields to reference Collections where it makes sense.

Really great cross-collection relationships

When it comes to handling MongoDB, one challenging aspect is establishing relationships across Collections, especially if we're keeping our Documents right-sized and making use of reference Collections.

Outside of Hasura, these joins are commonly seen at either the server or even client-client level. This often leads to slowdowns due to excessive data fetching and subsequent matching and filtering.

Our solution? We turn to lookup queries.

Using these, we can create queries that efficiently push joins down to the MongoDB database. This ensures that we only retrieve the specific related fields we require, avoiding unnecessary data bloat.

In the following example, we show how easy it is to take multiple cross-Collection relationships and execute the join at the database level using Hasura:

GraphQL query:

query getArtistsAlbumsTracks {
  Artist {
    Name
    Albums {
      Title
      Tracks {
        Name
        Composer
        Milliseconds
      }
    }
  }
}

Compiled database query:

{
  "command": {
    "aggregate": "Artist",
    "pipeline": [
      { "$match": {} },
      {
        "$lookup": {
          "from": "Album",
          "let": { "v_ArtistId": { "$getField": { "$literal": "ArtistId" } } },
          "pipeline": [
            {
              "$match": {
                "$and": [
                  {
                    "$expr": {
                      "$eq": [
                        "$$v_ArtistId",
                        { "$getField": { "$literal": "ArtistId" } }
                      ]
                    }
                  }
                ]
              }
            },
            { "$match": {} },
            {
              "$lookup": {
                "from": "Track",
                "let": {
                  "v_AlbumId": { "$getField": { "$literal": "AlbumId" } }
                },
                "pipeline": [
                  {
                    "$match": {
                      "$and": [
                        {
                          "$expr": {
                            "$eq": [
                              "$$v_AlbumId",
                              { "$getField": { "$literal": "AlbumId" } }
                            ]
                          }
                        }
                      ]
                    }
                  },
                  { "$match": {} },
                  {
                    "$replaceWith": {
                      "Name": "$Name",
                      "Composer": "$Composer",
                      "Milliseconds": "$Milliseconds"
                    }
                  }
                ],
                "as": "Tracks"
              }
            },
            {
              "$replaceWith": {
                "Title": "$Title",
                "Tracks": { "rows": { "$getField": { "$literal": "Tracks" } } }
              }
            }
          ],
          "as": "Albums"
        }
      },
      {
        "$replaceWith": {
          "Albums": { "rows": { "$getField": { "$literal": "Albums" } } },
          "Name": "$Name"
        }
      }
    ],
    "cursor": {},
    "$db": "chinook"
  }
}

Accelerated access control and permissions

After discussing how we optimize simple queries and relationships directly at the database level, you may be wondering how we incorporate access controls and permissions into this efficient framework. After all, it’s a critical aspect.

The good news is that our permission engine and query compilation system are tightly integrated. This design makes sure that the process is quick and seamless, even when handling complex queries with layered permissions.

As a result, we provide a solution that abstracts away complexities, ensuring both robust security and lightning-fast performance.

In the following example, we’ll be taking the cross-Collection relationship we executed previously while also enforcing access control permission for only seeing records that are assigned to you:

Permission rule (on Artist Collection, for role = `user`):

{"ArtistId":{"_eq":"X-Hasura-User-Id"}}

GraphQL query (role = `user`, user-id = `1`):

query getArtistsAlbumsTracksWithPermission {
  Artist {
    Name
    Albums {
      Title
      Tracks {
        Name
        Composer
        Milliseconds
      }
    }
  }
}

Compiled database query:

{
  "command": {
    "aggregate": "Artist",
    "pipeline": [
      { "$match": { "ArtistId": { "$eq": 1 } } },
      {
        "$lookup": {
          "from": "Album",
          "let": { "v_ArtistId": { "$getField": { "$literal": "ArtistId" } } },
          "pipeline": [
            {
              "$match": {
                "$and": [
                  {
                    "$expr": {
                      "$eq": [
                        "$$v_ArtistId",
                        { "$getField": { "$literal": "ArtistId" } }
                      ]
                    }
                  }
                ]
              }
            },
            { "$match": {} },
            {
              "$lookup": {
                "from": "Track",
                "let": {
                  "v_AlbumId": { "$getField": { "$literal": "AlbumId" } }
                },
                "pipeline": [
                  {
                    "$match": {
                      "$and": [
                        {
                          "$expr": {
                            "$eq": [
                              "$$v_AlbumId",
                              { "$getField": { "$literal": "AlbumId" } }
                            ]
                          }
                        }
                      ]

Flexibly crafting and composing GraphQL queries

With all of the talk about optimizing cross-Collection joins and our intricate permissions layer, let us take a look at another key feature: the art of composing GraphQL queries. We understand that developers frequently require more complex query constructs, particularly when adding arguments and specific predicates, such as "where" clauses.

Here is where a new difficulty arises.

How do we seamlessly incorporate our auto-generated query modeling into this mix?

Our generated query model, like joins and permissions, is seamlessly integrated into our query compiling mechanism. This means that whether you are writing simple queries or complex queries with multiple arguments, the underlying MongoDB queries will always be efficient and optimized.

In the following example, we’ll take the complex relationship query above, including its permissions, and then add in some arbitrary where arguments to additionally filter the records that are returned.

Permission rule (on Artist Collection, for role = `user`):

{"ArtistId":{"_eq":"X-Hasura-User-Id"}}

GraphQL query (role = `user`, user-id = `1`):

query getSpecificAlbumPaginatedTracks {
  Artist {
    Name
    Albums(where: {Title: {_eq: "Let There Be Rock"}}) {
      Title
      Tracks(limit: 5, offset: 5, order_by: {TrackId: desc}) {
        Name
        Composer
        Milliseconds
      }
    }
  }
}

Compiled database query:

{
  "command": {
    "aggregate": "Artist",
    "pipeline": [
      { "$match": { "ArtistId": { "$eq": 1 } } },
      {
        "$lookup": {
          "from": "Album",
          "let": { "v_ArtistId": { "$getField": { "$literal": "ArtistId" } } },
          "pipeline": [
            {
              "$match": {
                "$and": [
                  {
                    "$expr": {
                      "$eq": [
                        "$$v_ArtistId",
                        { "$getField": { "$literal": "ArtistId" } }
                      ]
                    }
                  }
                ]
              }
            },
            { "$match": { "Title": { "$eq": "Let There Be Rock" } } },
            {
              "$lookup": {
                "from": "Track",
                "let": {
                  "v_AlbumId": { "$getField": { "$literal": "AlbumId" } }
                },
                "pipeline": [
                  {
                    "$match": {
                      "$and": [
                        {
                          "$expr": {
                            "$eq": [
                              "$$v_AlbumId",
                              { "$getField": { "$literal": "AlbumId" } }
                            ]
                          }
                        }
                      ]
                    }
                  },
                  { "$match": {} },
                  { "$sort": { "TrackId": 1 } },
                  { "$skip": 5 },
                  { "$limit": 5 },
                  {
                    "$replaceWith": {
                      "Milliseconds": "$Milliseconds",
                      "Name": "$Name",
                      "Composer": "$Composer"
                    }
                  }
                ],
                "as": "Tracks"
              }
            },
            {
              "$replaceWith": {
                "Tracks": { "rows": { "$getField": { "$literal": "Tracks" } } },
                "Title": "$Title"
              }
            }
          ],
          "as": "Albums"
        }
      },
      {
        "$replaceWith": {
          "Name": "$Name",
          "Albums": { "rows": { "$getField": { "$literal": "Albums" } } }
        }
      }
    ],
    "cursor": {},
    "$db": "chinook"
  }
}

Pushing down as much as we can

While we’re able to compile and push down the vast majority of queries, like:

Projection
Permissions
Relationships (joins)
Arguments and filtering

There are a couple of areas where we still depend on the server to implement logic in-memory, such as cross-database joins and multiple unrelated (batch) queries.

Cross-database joins

While we have made significant progress in optimizing queries within the same database, cross-database joins present a unique challenge.

A server-side mechanism is required to perform these joins. However, even with this server-side intervention, we remain dedicated to efficiency.

According to our philosophy of lean and optimized data handling, the queries aimed at the related data sources are carefully compiled at each of their respective sources to ensure minimal data retrieval.

Multiple unrelated (batch) queries

One point worth mentioning is our approach to unrelated queries in PostgreSQL.

Currently, these unrelated queries are also routed directly to the database. Once executed, the results are transformed into JSON format within the database before being sent back to the Hasura server.

This methodology ensures streamlined processing and minimizes unnecessary back-and-forths.

This method is not currently available in MongoDB, but we’re continually exploring ways to refine our approach. Specifically, batching these kinds of queries is on our radar for potential enhancements in the future.

The future for MongoDB performance and Hasura

Our journey to optimize and improve Hasura's performance is ongoing, and we are looking forward to what comes next. We are hard at work on benchmarks, and you can expect a detailed report on our findings in an upcoming blog post.

We have compiled a selection of articles below for those of you interested in delving deeper into the complexities of Hasura's performance and getting a more comprehensive view on our perspectives on query compiling.

We invite you to learn more about the work we do and our commitment to delivering exceptional results.