Building a GraphQL to SQL Compiler on Postgres, MS SQL and MySQL

The release of Hasura 2.0 included one of the most exciting features to date - multiple database backends, and in particular, support for our first new backend, Microsoft SQL Server. Our roadmap already includes support for several new backends -- Google BigQuery (in alpha already), MySQL, Citus (coming soon), and several other databases over time. Soon you will be able to use all of the Hasura features you know and love with the choice of your favorite database!

On the face of it, this probably seems like a relatively straightforward change. After all, Hasura is ostensibly a GraphQL-to-SQL “compiler”, and once we have generated the SQL, we should be able to run it against our choice of database. However, as anyone who has used more than one relational database implementation before will tell you, it is rarely the case that databases support features of SQL in the same way -- or even with the same syntax. Certainly, many features of Hasura make clever use of some lesser-known features of Postgres in order to deliver high performance, and it has not been straightforward to naively translate those to other backend databases.

In addition, Hasura supports a great number of features on top of a simple query language. The ultimate goal is to be able to bring any database and get access to all of Hasura features: row-based permissions, remote schemas with remote joins, custom actions with action relationships, event triggers, caching, distributed tracing, high-performance subscriptions, to name just a few. We are not there yet, and each of these generic features brings its own set of challenges when it comes to the problem of generalizing from Postgres-only to any backend.

In this post, I’m going to take a look at some of the features we have started to support across multiple backends, some of the ones we’re currently working on, and some longer term plans for additional integrations.

An overview of our backend architecture

As I talked about briefly in my last post (Hasura 2.0 Engineering Overview), abstracting over multiple backends is made possible in the Hasura code using several Haskell type classes. Each of these type classes abstracts some set of operations that are used when working with different aspects of the pipeline: metadata parsing, schema generation, data transport, and execution. As you might imagine, these abstractions have been introduced by looking at the previous Postgres implementations and, ultimately, extracting signatures for common operations that we expect to implement across multiple backends.

As we have worked on adding more generic support for existing features across various backends, this set of type class abstractions has expanded. Of course, care is needed to abstract the right things at the right level. We don’t want to reimplement too many lines of code per backend because our level of abstraction was too coarse, but at the same time, we don’t want an explosion of abstractions because the level was too granular.

No doubt the hierarchy of abstractions we need will change over time as refinements are made. The ideal final state is a compact set of abstractions, each with its own clear semantics and -- ideally -- a reference implementation as well.

Queries with joins - not so simple!

Even before we consider advanced features, we already run into trouble when we consider querying functionality.

For example, Postgres allows us to take a query involving several tables, object and array relationships, and push all of those data fetches down as one single query.

How do we do that?

Consider a simple example in SQL, fetching all order information for a single customer ID, along with product information from a product table:

SELECT 
  name, street, city, state, zip 
FROM customers 
WHERE id = $1;

SELECT 
  p.id, p.name, p.price, o.tax_rate, o.quantity
FROM orders o
JOIN products p ON p.id = o.product_id
WHERE o.customer_id = $1;

Here, we make two queries:

The first to fetch the customer data and a second to fetch order and product data for that customer. You can imagine that this might be generated by a GraphQL query.

query { 
  customers(where: { id: { _eq: $cust_id } }) {
    name street city state zip
    orders {
      tax_rate quantity
      Product {
        id name price
      }
    }
  }
}

It wouldn’t make sense to use a join to fetch the customer data for each order/customer row, because the same data would be duplicated across each row. Also, we’d have to do quite a bit of manual work in order to collect all of the data back into a single JSON result after we were done.

Instead, in Postgres, we can do something better, by using JSON aggregations and subselects to make Postgres do the minimal amount of work, and even give us the JSON we want directly!

SELECT 
  json_build_object(
    'name', name, 
    'street', street, 
    'city', city, 
    'state', state, 
    'zip', zip,
    (SELECT json_agg(
       json_build_object(
         'id', p.id, 
         'name', p.name, 
         'price', p.price, 
         'tax_rate', o.tax_rate, 
         'quantity', o.quantity
       ))
     FROM orders o
     JOIN products p ON p.id = o.product_id
     WHERE o.customer_id = customers.id
    ) as orders
FROM customers 
WHERE id = $1;

What’s happening here? At each level, we are using the json_build_object function to construct a JSON object inside of Postgres itself. Instead of returning multiple columns in our result set, we return a single column of type JSON. More importantly, for every array relationship (for example, fetching the orders, here), we use a subselect and the json_agg aggregation function to aggregate a JSON array of subresults.

The JSON aggregation is done at the database, avoiding unnecessary serialisation overhead, and returning a compressed form of the data without duplication.

Clever!

Except there is a problem. Neither BigQuery nor MySQL support JSON aggregations in the specific way that Hasrua requires. Specifically, as the documentation confirms, MySQL does not provide a way to aggregate a JSON array with a well-defined element order.

So, we cannot implement queries for array relationships uniformly across backends, and we reach our first necessary abstraction:

Hasura must abstract over the way code generation proceeds for array relationships.

However, because there are sufficiently many other small differences in code generation between backends, we choose to make the abstraction much coarser. Ultimately, we abstract over the whole query generation, from incoming query AST to generated SQL.

In practice, this means we duplicate some code across backend implementations. But the alternative would be an extremely fine-grained set of abstractions over many details of the target SQL implementation. Instead, to address the possible code duplication problem, we implement similar backends using a set of shared functions. For example, in the case of MySQL, we have to fall back on an approach where we issue multiple queries and perform aggregation in Hasura itself. But this code is generic and can be reused across different backends where it is needed.

As a guideline for backend implementers, we use subselects and JSON aggregation wherever it is available, and fall back on the multiple-queries implementation if we have to.

Subscriptions

Another well-used, well-loved feature in Hasura is support for high-performance, highly-concurrent subscriptions against Postgres tables. In an older blog post, Tanmai has outlined the architecture which allows us to scale Hasura to millions of concurrent subscriptions.

One trick that we require is to batch similar queries for execution. That is, we want to identify queries which only differ in their variables, and send those as a single batch to Postgres.

At the level of code generation, we use some advanced features of Postgres to make sure this can happen. Specifically, we use the unnest function and lateral joins in order to generate a single Postgres query for multiple sets of variables.
Assuming the same database schema as the customer/order example above, we can imagine that we’d like to subscribe to a live list of all recent orders for a customer. In that case, we would generate code to batch queries for several customer IDs, and generate code like this:

SELECT results
FROM unnest($1) query_variables (customer_id)
LEFT JOIN LATERAL (
  SELECT ...
  FROM orders WHERE customer_id = query_variables.customer_id
  LIMIT 10
) results ON true

The lateral join here is going to execute the inner query for each row in the result of the call to unnest, that is, once for each set of variables we pass in. This is great for code reuse, because we can simply take the SQL we would have used to run the query normally, and wrap it in a query of this form.

However, as you may have guessed, these features are not always available across all backends in the way that we need, so we cannot implement this batching behavior generically. Specifically, in MS SQL Server, we don’t have access to parameterized queries using our existing ODBC bindings, so we must use a different approach.

However, the logic for managing cohorts of similar queries for batched execution is common across all backends supporting subscriptions, so we can choose the level of abstraction accordingly: we don’t abstract over the process of batching itself, but just the process of executing a batch of queries for a given set of variables.

Action Relationships

A third example of a Postgres-specific feature comes from action relationships, a feature of Hasura which allows users to perform joins between relational database tables and data from external APIs.

When running a query with an action relationship, we first fetch the data from the external API (the “action”) and then extract portions of the JSON response which will indicate the primary keys we need to fetch from Postgres. This is a little bit like the inverse of the problem from the first example: instead of taking a record set from Postgres and turning it into a single blob of JSON to return to the user, we need to take a single blob of JSON produced by an API and turn it into a Postgres record set, so that we can then join that record set with a table in Postgres.

To do this, we use one of the two functions jsonb_to_recordset or jsonb_to_record, depending on the type of relationship we are implementing.

If you’ve noticed the pattern so far, you’ll know what’s coming next: we can’t expect every backend to provide analogues of these functions, so we cannot provide a generic implementation of action relationships across all backends.

So, when we come to generalize action relationships, we will need to decide on the correct abstraction to add to our collection. We could choose to add a new abstraction for running a query in the context of JSON fetched from an action. However, maybe we don’t need to do this, because we already have a very similar abstraction, namely the one we just saw. Subscriptions need the backend to support batching of queries with multiple sets of variables, which is obviously similar in a lot of ways to the problem here. So hopefully, we will be able to design a single batching abstraction that backends can support, powering both subscriptions and action relationships.

Language Features, Generally

We’ve seen three specific examples of problems which arise when we try to naively generalize existing features to multiple backends, but of course they are all examples of a much more general problem: in each case, the target language (all variants of SQL) did not support some feature we needed, so we were required to work around that, using only the language features which were available to us.

In general, the problem we face is to identify all the different sorts of languages we may want to support in the long term, and to distill those down into a small set of language features which are required by each Hasura feature.

The lack of a language feature in a target language is not necessarily a problem - in the best case, we will be able to add a new abstraction and work around the lack of that feature by using some other approach, maybe sacrificing some performance in order to do so. In the worst case, if we cannot work around the lack of that feature, we will simply need to turn off the corresponding Hasura feature for that backend. This is unavoidable in some cases.

For example, providing row-level permissions for a new backend requires support for code generation of predicates, which seems like a low barrier to entry. However, in the case of remote schemas, we can’t even guarantee that, because an arbitrary GraphQL schema certainly doesn’t need to provide support for where clauses and predicates. This is acceptable, because we can switch our level of abstraction - we don’t really need to be able to generate code for predicates; all that is required is to modify a query for execution to incorporate permission rules. In the case of remote schemas we can do that by passing any relevant session variables to the GraphQL resolver as a function argument, for example.

However, with the recent addition of inherited roles, users can now issue queries in the context of multiple roles. There is not an obvious way to generalize the new abstraction to support inherited roles for remote schemas, and the answer may have to be that we must disable that feature for a subset of backends.

In general, there is a tension between these language features that we demand from new backends, and new general-purpose features that we want to implement for all backends. More language features implemented by a given backend mean that we can claim support for more general purpose features, but correspondingly, as we add more general-purpose features, we must demand more from our backend implementations, and in some cases, that might mean we have to exclude features from certain backends.

There is another interplay between language features and static analyses of queries. For example, with the recent support for query response caching in Hasura, we might want to look at a query and decide if it is likely to perform well in the cache or not. As we add more language features, this problem will become harder and harder, and we may need to pare down the language features we allow in order to be able to support certain analyses.

Conclusion

Adding support for product features can be challenging. Genericising support for product features across multiple back-ends, doubly so.

As we develop these features, we’ll identify a hierarchy of language features that are relied upon. At the bottom of this hierarchy will be something like an arbitrary remote schema, where we can assume almost nothing, and at the top will be backends like Postgres which support a vast collection of language features. New features will have to be developed generically, requiring only abstract language features, and not specific backends. Over time, we’ll fill in the gaps in the matrix of language features and backends, getting us closer to the ideal of supporting every Hasura feature against every type of backend.

As a related aside, if you find any of this work and description exciting, we are hiring for a variety of engineering roles.