Welcome to another quick webinar. We'll be talking about some internals at Hasura, on how we think about GraphQL performance and how Hasura optimizes GraphQL performance. So for those of you who are new to Hasura or are using Hasura, this session should be quite useful to understand a little bit of the internals of how Hasura works and some of the stuff that's coming soon too. So I'll be chatting a little bit about that. Cool. All right. So yeah, like I said, I'll be chatting a little bit about high performance GraphQL and the approaches that Hasura uses internally. Just as a quick introduction, I'm Tanmai, I'm the co-founder at Hasura, I head product as well. So excited to be kind of chatting with you here. Cool.
So I'll be covering a few topics around how Hasura processes queries, how Hasura does predicate pushdown, caching, JSON aggregations, subscriptions, and then also some notes on scaling. So that's kind of our list of things today that we'll be talking about. So with that, let's get started with the first section, which is something I'm sure some of you are already familiar with. But let's kind of dive in and understand kind of what Hasura's approach internally looks like. So when we kind of have this GraphQL query, right, that you're running. And let's say you're making this GraphQL query and thinking about implementing a GraphQL server that could process this query.
And as usual, I have a sample set up here with artists and albums and tracks. So if you have artists that is coming in from a model, let's say a data model, and then you have artists, each artist has multiple albums, each album has multiple tracks, right? When you're kind of trying to resolve this, right, in a typical GraphQL system, you'd be kind of doing a depth-first traversal, right? And what this would result in, in a naive implementation, is that you'd try to resolve artists and, let's say you get a hundred artists, right? So you make a database call to get artists, you get a hundred artists, and then you try to fetch the albums. If you try to kind of naively fetch the album information for each artist as well, you'll be hitting the database N times, where N is the number of artists. If you have a hundred artists, you'll be hitting the database a hundred times to fetch albums. Now obviously you shouldn't be doing that. Because each artist has those albums, ideally you want to kind of make one query to fetch those albums, right?
But this problem might not be something you notice on day one or kind of when you implement something and test it out. It starts getting exacerbated as soon as the number of items you're fetching is large. As soon as you start fetching lists of things, right? Like an object in a nested list. Or especially if you have nested kind of things, right, where you're fetching multiple items that are related. So for example, if each album has tracks also, you'll now be running N-squared database queries, right? Because you'll be trying to fetch the set of tracks per album, for multiple albums within the same artist, and then across all of those artists as well. Right? And so this N plus one problem gets compounded, right, as you kind of move ahead.
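To make the N+1 shape concrete, here's a minimal sketch in Python with an in-memory SQLite database standing in for the artists/albums model. The schema, the query counter, and the resolver loop are all illustrative, not Hasura code:

```python
import sqlite3

# Hypothetical in-memory schema standing in for the artists/albums model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
    INSERT INTO artists VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO albums VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

query_count = 0

def run(sql, args=()):
    global query_count
    query_count += 1  # count every database round trip
    return conn.execute(sql, args).fetchall()

# Naive depth-first resolution: one query for the artist list,
# then one more query per artist to resolve its albums.
artists = run("SELECT id, name FROM artists")
result = []
for artist_id, name in artists:
    albums = run("SELECT title FROM albums WHERE artist_id = ?", (artist_id,))
    result.append({"name": name, "albums": [t for (t,) in albums]})

# 1 + N round trips: 4 queries for just 3 artists.
print(query_count)
```

With a hundred artists this loop would issue a hundred and one queries, and nesting tracks under albums multiplies it again.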
So that is a fairly typical issue, right? Now the straightforward way to resolve this is to start thinking about it with a data loader kind of approach, right? Where the idea is that you make one hit to fetch the top level item or the list of top level items. And then in the second hit, instead of trying to resolve albums per artist, what you want to do is run something like an IN query, which most data sources will support, right? So you'll try to fetch those albums where the album dot artist ID is in one, two, three, four, five, six, right? Where one, two, three, four, five, six are the artists that you fetched at the top level. And so now what you're going to try to do is you're going to start trying to fetch those entities, right? You're going to make a number of requests to the underlying data source depending on the number of nodes, right, that you have in your GraphQL query.
This kind of works, but it starts to get a little bit complicated if you're fetching a large list, because sometimes the [inaudible 00:04:15] kind of thing doesn't support a very large number, right? There's sometimes a little bit of a... you have to make sure that you're deduping a little bit and stuff like that, which ideally your data loader implementation should take care of if you're building this out yourself. But then also sometimes the IN query becomes a little bit hard to process, because maybe you want to fetch an artist, but the top five albums of each artist. So this top five query is not super easy to do, right? Because if you want to fetch each artist and their top five albums, you know, one artist might have a hundred albums or 10 albums or whatever, and you want to fetch the top five according to something, right? So pushing that down into that IN query, right, becomes hard.
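The data loader style batching described above can be sketched like this, again with a hypothetical SQLite schema. Note the dedupe of the parent keys before building the IN query:

```python
import sqlite3

# Same hypothetical schema as the naive example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
    INSERT INTO artists VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO albums VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

# Hit 1: the top-level list.
artist_rows = conn.execute("SELECT id, name FROM artists").fetchall()

# Hit 2: one IN query for all the children, deduping parent keys first
# (the kind of bookkeeping a data loader does for you).
artist_ids = sorted({aid for aid, _ in artist_rows})
placeholders = ", ".join("?" * len(artist_ids))
album_rows = conn.execute(
    f"SELECT artist_id, title FROM albums WHERE artist_id IN ({placeholders})",
    artist_ids,
).fetchall()

# Group the flat child rows back under their parents in memory.
by_artist = {}
for aid, title in album_rows:
    by_artist.setdefault(aid, []).append(title)

result = [{"name": name, "albums": by_artist.get(aid, [])}
          for aid, name in artist_rows]
print(result)  # two database hits total, regardless of the number of artists
```

Notice that the per-artist "top five" ordering mentioned above can't be expressed in this single IN query without something like window functions or lateral joins, which is exactly where this approach gets hairy.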
And so certain elements of sorted pagination become problematic even with this kind of an approach, right, and require a fair amount of custom querying shenanigans to figure out how you can make that query efficiently and still hit the database the least number of times. Given the fact that, in this situation, all of this information is coming from one database, technically we could have done much better and we could have said, "Why can't we just fetch all of this information in one kind of efficient way?", right? Like, what does it take for us to be able to do something like that? Right? And so, when we were kind of building this out at Hasura, when we were thinking about these kinds of problems, we thought that the ideal approach is to not really resolve a GraphQL query, but to take a compilation approach to the GraphQL query, right? Could we compile it instead of resolving it? Instead of doing a depth-first traversal and executing stuff, could we instead traverse the GraphQL query, right, and compile it by transforming that AST into a SQL query, right? One that would then go do exactly what it needed to do and operate on exactly the pieces of data that the query needs, right?
And so that's kind of one core piece of Hasura and the way that it works: it takes this incoming GraphQL query and compiles that out, right? Which is a major reason behind the very snappy performance that you've noticed even on fairly large GraphQL queries with Hasura, right? And so the approach that Hasura takes is, a GraphQL query comes in, Hasura parses that out, has an internal AST, an abstract syntax tree that represents what the GraphQL query is. And then depending on kind of what sources the data is coming from, Hasura starts trying to convert that into a SQL AST or an underlying [inaudible 00:06:52] AST, whatever the data source's AST is, and then finally generates [inaudible 00:06:56] the actual query itself, the query language of the database, and then runs that one query, right?
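As a toy illustration of the last stage of that pipeline, here's a hypothetical "SQL AST" and a renderer for it in Python. Hasura's real ASTs are far richer than this, so treat it purely as a sketch of the compile-don't-resolve idea:

```python
# Hypothetical mini "SQL AST" and renderer: the last stage of a
# parse -> GraphQL AST -> SQL AST -> SQL text pipeline. Not Hasura code.
def render(ast):
    if ast["op"] == "select":
        sql = f"SELECT {', '.join(ast['columns'])} FROM {ast['table']}"
        if "where" in ast:
            sql += " WHERE " + render(ast["where"])
        if "limit" in ast:
            sql += f" LIMIT {ast['limit']}"
        return sql
    if ast["op"] == "eq":
        return f"{ast['column']} = {ast['value']}"
    raise ValueError(f"unknown node: {ast['op']}")

# An AST that a query like `artists(where: {id: {_eq: 1}}, limit: 10)`
# might lower to:
ast = {
    "op": "select",
    "table": "artists",
    "columns": ["id", "name"],
    "where": {"op": "eq", "column": "id", "value": 1},
    "limit": 10,
}
print(render(ast))
```

The point of the intermediate AST is that permissions, relationships, and pagination can all be woven in as tree transformations before a single query string is ever rendered.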
This results in fairly snappy performance, right? Even if I'm just running a single instance Hasura process with a really large number of queries per second, even on nested queries, I can fetch large amounts of data very snappily, right? It's within a certain bound of the performance of the raw database itself, even though you're using an HTTP client to fetch this information. And even on large tables that have large amounts of data, where you're extracting large amounts of data, this process becomes quite fast because databases can kind of handle that load. Now another kind of side effect of that is that it also becomes quite easy to analyze performance, right? Because when you're looking at a particular GraphQL query, it becomes quite easy to understand that there is a particular SQL query that you can look at in aggregate to see what needs to be sped up, right? So you know where you need to, say, add an index, right? Or where you need to optimize something. And that process of understanding that becomes much easier.
And we'll take a quick look at an example of seeing how that works at Hasura, right? So I have a setup here where I have two databases. In my Postgres system I have, say, invoices, like billing invoices, and each invoice has a customer ID because each customer has kind of raised an invoice, right? And so what I can do is kind of run queries where I can say, you know, I want to query customers, right, and let's get the customer's name, let's get the customer's email, and let's just firstly do that query, right? So these are the customers that I have. And you can kind of see the underlying query that Hasura was making. This is just one single element, right? You can look at the underlying database plan itself. And then I can also kind of say invoices, right? And this relationship is because of that customer ID column, right? So we've told Hasura that because invoice has a customer ID, that attribute is what's helping us create the relationship, right? Then I can say invoices, let's see what invoices has [inaudible 00:09:09]... and the billing address... right? And so I can fetch this information.
So you can see I'm fetching a fairly decent amount of data and it's pretty snappy, right? And if you look at the underlying query here, we're making a single query where we're fetching this information from customers and the invoices of those customers as well, right? And then of course I can do things like, I can say I just want the first two, and let's order by... do we have a date? Okay, there we go. So now I can easily fetch the first two invoices for each customer by date, right? Or let's say the latest... just kind of a more practical use case that you'd have, right? And again, this is pretty snappy. But also notice we're able to do this pagination of the nested list quite easily, right, which becomes a little bit interesting to generate if you're trying to make just one query, one hit to the database, to fetch those invoices, right? So that's kind of what this piece looks like.
And like you can see, you can see the underlying query as well, right? So for example, if I was looking at this particular query, because I'm not querying a particularly large data set it's still quite snappy, right? But you know, if these were millions of customers and billions of invoices, the first thing I'd hit is that the sequential scan is what will be a problem, right? So that sequential scan probably needs to become an index scan, right? Which means that we probably need to add an index on the customer ID column on our invoices table, right? But this is the kind of progressive optimization that I can do easily, because if I notice that this query is becoming slow, I can kind of look at this database plan, chuck it into a bunch of different tools, right, to see what kind of optimization I need to start doing. Right? So that's kind of it on the query compilation piece and how compiling those GraphQL queries into underlying queries makes things pretty snappy.
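You can see the same scan-versus-index progression in miniature with SQLite's EXPLAIN QUERY PLAN (Postgres's EXPLAIN output looks different, with "Seq Scan" versus "Index Scan", but the idea is the same; the table and index names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")

def plan(sql):
    # EXPLAIN QUERY PLAN returns rows whose last column is the plan detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM invoices WHERE customer_id = 1"

before = plan(query)   # full table scan of invoices
conn.execute("CREATE INDEX idx_invoices_customer_id ON invoices (customer_id)")
after = plan(query)    # now an index search on customer_id

print(before)
print(after)
```

The same "add an index, re-check the plan" loop is what you'd run against the SQL that Hasura's analyze view shows you.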
Next we'll kind of talk about... well you know, this is just the fetching data aspect of it, right? But an important aspect of this, because this is an API, is that you want to make sure that you have some kind of authorization, right? And when we're kind of thinking about... Fabian had a question which I'll take in just a second. So you have, for example... a typical example would be that in our invoices and customers use case, right, each customer should be able to see only their own invoices, right? And maybe only certain fields, maybe you don't want the customer to see all of the fields of the invoice and stuff like that. So there's kind of an authorization piece that needs to go in.
And if you think about authorization or the logic that you would write for authorization typically, it would be some kind of property or condition that would be a match between the user's information and a property of the data, right? So it's a property of the data and the property of the user's information that is kind of allowing you access to certain fields or certain entities, right? And one of the challenges here from a performance point of view, right, is that when you're kind of fetching this data, you can't actually fetch all of this data and then filter it once you fetch the data, right? If you're thinking about [inaudible 00:12:39] the resolver for albums or invoices or whatever you have, when you're trying to fetch that information, you can't fetch all of the invoices or all of the albums, fetch millions and millions of records, right, and then filter that. That's not going to work, right? Because I mean, that's work that the database should be doing.
So what you'd ideally do is when you're fetching this information, you'd want to run a query where you run that check or filter within the database itself. So if you're doing a select on invoices, you're fetching those invoices where the customer ID is that of the session user ID, right, or is the customer ID from whoever's making that API call. So you want to push that down. Now this aspect of pushing this filter down into the query, this filter condition is a predicate, right? So we call it a predicate pushdown. So what we're doing here is we're doing a predicate pushdown so that we can fetch that data and precisely that slice of data that is required for that particular API call, for that particular user, right?
And the way this kind of works, if you look at a simple query, is to embed this predicate, right, inside the data fetch call itself, right? So you do a select, but in the select you add a where clause, and this where condition is a predicate. This is the authorization rule that's deciding what elements you get access to. Right? So this is, again, particularly useful for lists of data where, out of lists of items, you're trying to fetch a subset of those items that are relevant for your user at that point, right? And again, from a compilation point of view, the way that we think about it is that you take the user ID and the role, right, from the session. You see what permission rule it has, what policy it has, you take that policy and you compile that into the where clause of that SQL statement, if you're talking to a SQL database, right? And if you were talking to, say, Mongo, which we're working on now, then you'd compile it into the where clause of the Mongo query, right? But it's the same predicate that's getting pushed down into the underlying database call, right?
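Here's a minimal sketch of that predicate pushdown, with a hypothetical declarative policy object: the session variable's value is bound into the WHERE clause, so the database itself only ever returns that user's slice of the table. This is illustrative, not Hasura's permission engine:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO invoices VALUES (1, 10, 9.99), (2, 10, 5.0), (3, 11, 7.5);
""")

# Hypothetical declarative policy: a customer may only read rows where
# invoices.customer_id equals their session's customer id.
policy = {"column": "customer_id", "session_var": "x-hasura-customer-id"}

def fetch_invoices(session):
    # The policy is compiled into the WHERE clause (the predicate pushdown),
    # so only this user's rows ever leave the database.
    sql = f"SELECT id, total FROM invoices WHERE {policy['column']} = ?"
    return conn.execute(sql, (session[policy["session_var"]],)).fetchall()

rows = fetch_invoices({"x-hasura-customer-id": 10})
print(rows)  # only customer 10's invoices
```

Contrast this with fetching all invoices and filtering in the API server, which would pull millions of irrelevant rows over the wire just to throw them away.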
And again, kind of just looking at that compilation pipeline: parse the GraphQL query, you get a GraphQL AST, then you look at the permission system, right, and then you modify that internal AST to also have the permissions information in it, then you convert that into the SQL AST, and then you render that into the SQL query. Right? So that's kind of what it looks like. The key piece of information here is: where is that permission rule being stored, right? And if you kind of take a look at the compilation pipeline and visualize that, right, that permission rule is what goes into Hasura metadata. So all of those things that I do on the Hasura UI or via the [inaudible 00:15:22] file or via APIs, you know, however I configure Hasura, that is essentially Hasura metadata.
So that AST processing piece of Hasura, right? What it's doing is it's looking at the Hasura metadata, it's also looking at the raw table information, right, which is how Hasura knows how to compile into the right SQL query, right? So it's taking these two pieces of information, merging them, and then using that to create the SQL query, right? So that's kind of how it's working. And again, take a quick example to show you what that looks like. And then I'll quickly answer Fabian's question as well. So let's add in a simple example here to say, I can only fetch invoices as a customer if the customer ID is equal to something like a user ID. We could even call it a customer name, it doesn't matter what the session variable name is, and we can decide what fields you want to give access to. So let's say we don't want to give access to... I actually want to give access to everything.
But just as an example, let's remove access to billing state, country, and city, and just have address. So as soon as I do this, and I use those permissions in Hasura, right, I say X-Hasura-Role... let's set this to customer, and X-Hasura-Customer-Id, [inaudible 00:16:48], right? The first thing that you notice is that on the left, I can't actually fetch all the other stuff I could as an admin, because Hasura looks at that role and says, "Ah, okay, the GraphQL schema itself only has a rule for invoices". It doesn't have a rule for anything else, right? So now if I do query, invoices, and you know, ID, and let's put the customer ID, right? I'm also not seeing billing city, state, country, right? So now I'm getting only the data for that customer ID, right? I'm not getting information for any other customer ID. Of course, if I change this to 10, I'm going to get this information only for customer ID 10, right?
What's happening here is, again, if I look at the underlying query, right, Hasura's basically adding that where clause in the generated query, right? Which is pretty much exactly what you would've done as well, if you were processing this yourself. Right? And the nice thing is that this now kind of propagates across models, right? So no matter how I compose these models together, these rules will kind of automatically get added, right? So I don't have to worry about how I'm fetching invoices. Am I fetching invoices directly? Am I going through customer invoices? Right. Whatever method I use to fetch that model, that rule will get applied safely. Right? In production, or again, kind of depending on how you're exposing the Hasura GraphQL API to your applications, these two things, the role and these session variables like customer ID or org ID or whatever, right, will not come in through headers typically, but would typically come in through something like an authorization token, right? So you'd have like an auth token that would have these claims or something, right? [inaudible 00:18:25].
And what you'd configure Hasura with is the JWT key, so that Hasura can then peek inside this claim, validate the claim, validate the token, right? Peek inside the token, validate the token, and then also extract these claims from the token. Right? So the token typically will have some kind of a role or sub or something, which will map to a role and a customer ID. And so you can configure Hasura to extract those claims from the token itself. Right? So you won't actually be sending these headers in production, right? Because otherwise you'd need an admin token as well. So that's kind of what this authorization and predicate pushdown looks like. Right? And when we say Hasura does a predicate pushdown, this is essentially what's happening underneath. Cool.
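For illustration, here's how the claims inside a JWT payload can be pulled out with just the Python standard library. This sketch deliberately skips signature verification, which a real server must do against the configured JWT key before trusting any claim; the claim names follow Hasura's namespace convention, but the token itself is fabricated for the demo:

```python
import base64
import json

# Illustrative only: pull session claims out of a JWT's payload segment.
# A real deployment must verify the token's signature first.
def extract_claims(token):
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Fabricate an unsigned token just for the demo.
claims = {"https://hasura.io/jwt/claims": {
    "x-hasura-default-role": "customer",
    "x-hasura-customer-id": "10",
}}
payload = base64.urlsafe_b64encode(
    json.dumps(claims).encode()).decode().rstrip("=")
token = f"header.{payload}.signature"

session = extract_claims(token)["https://hasura.io/jwt/claims"]
print(session["x-hasura-default-role"], session["x-hasura-customer-id"])
```

These extracted values play the role of the X-Hasura-Role and customer ID headers used in the demo, except they come from a token the client can't tamper with.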
So just to quickly take two of the questions here from Fabian and Ravi. One is: how does GraphQL to SQL translation compare between mutations and queries? Yes, mutations are a little more complicated. We don't actually show the explain plan for mutations yet, but we will soon. When you take a mutation and you run that via Hasura, Hasura does a very similar thing for mutations: it opens up a transaction block, it has a begin, and then it has a bunch of mutation statements and an end, right? Depending on how complex that mutation is. Right? So for example, if I have like a mutation where... let me just remove this so that I can get access to mutations here. Oops. So for example, if I did a mutation, right, and let it insert something and then insert something else, right, and then update something else, what Hasura would do is Hasura would kind of open up a begin and then have insert tracks, all of the information there, figure out how to generate the right SQL statements for that, and then insert that in a block, right?
Now typically, when we think about mutation performance, right, there is not a substantial amount of optimization that people do for mutation performance, but there are definitely a few things that you can do, which is why we don't have it in our analyze yet. But typical optimizations that you would want to do for improving mutation performance: you might have to do the opposite, where you might want to remove certain indexes or constraints, right? You might also have situations where you want to have tables that are partitioned, right, you'll use partitioned tables so that inserts can be done faster, updates can be done faster. So there's a bunch of things that you can do to optimize mutation performance as well. Or you might want to do a bulk insert [inaudible 00:20:59] point inserts. Right? And that's kind of how it works. But usually that optimization ends up being a little less on the Hasura side and a little more involved on the database side. But that's kind of how the mutation query generation works.
To Ravi's question on whether the permission system works for a multi tenant system: actually, that's a very good question. Let me just... we did a nice post on Hasura multi tenancy. A recent blog post. So let's see... there's an authorization example here. So I'll dig up the link and send it across to you, Ravi. But we did a recent post, and I don't know if it's been published yet or not, but it's a post on how you can set up multi tenancy... you can add multiple databases into Hasura which are all for different tenants, or have multiple schemas within the same database, which could all belong to different tenants. Or you can have one table that has something like a tenant ID. Right? And all three of those are systems that you could build on Hasura, like how you could represent that in Hasura. From a performance point of view, all three end up working fairly similarly. And there usually isn't a performance impact on the Hasura side: as long as the tables that you're reading from have a tenant ID, or even if they're different per-tenant tables, Hasura is able to push that predicate down. So for any tenant that is fetching information, you'd only fetch information for that tenant specifically.
And from a performance standpoint, all of the stuff that we discussed applies for a multi tenant system as well. Michael had a question on: for permissions, can we reference a value from a parent table as a filter variable? Yes, you can. You do it via a relationship. Just to show you a quick example, right, let's say for example I wanted to say that I only want to allow selecting from invoices if invoice dot customer dot country is equal to something, right? And this might come from a session variable, but if it's not a session variable but a column value, you can use a column comparison. So you can say where that particular column value is equal to some token value, right? So you can use a static value there, you can reference a column's value there, you can compare two columns within the parent table, or between the parent table and the child table. So as long as it's kind of declarative in a property of the data, or a property of the data and the session, Hasura should be able to handle it, right? So that's kind of how it works.
Cool. Awesome. So that's kind of it on the query compilation side and the predicate pushdown side, right? Next let's move on to how Hasura approaches caching, which we kind of break into two aspects. The first is query plan caching. And the second is data caching itself. Right? So query plan caching is the idea that when we take a GraphQL query and compile it and come up with a resulting SQL query that we're going to execute, right, the execution plan itself, that is also something that can be cached, right? So we don't have to go through the whole pipeline again. And that is something that Hasura automatically does, it's transparent, you don't have to think about it. So for multiple queries that you make of the same type, those query execution plans will be cached.
When we think about caching data, that's kind of a little more interesting. And this is something that you'd be able to use with something like Hasura Cloud or [inaudible 00:24:51] because you also attach a [inaudible 00:24:53] system to it. On Hasura Cloud we have like a distributed cache that Hasura Cloud uses automatically. But the caching mechanism is quite interesting and that's kind of what we'll talk about a little bit. So let's take a slightly more complicated example, right? Let's say for example you are making a query to fetch restaurants, right? And I'm using a slash restaurants thing here, but what I mean here is query restaurants and, you know, some restaurants in your area, right? Now, depending on the user ID that's making this query, right, you might be querying for information that depends on a property of the user. Right? So for example, if I'm making a query as user ID one, then I'm in SF, right? I should be fetching those restaurants that are in SF, right. If I'm in Dublin, I should be fetching those restaurants in Dublin. Right? If I'm user ID three who's also in SF, I should be getting a cache hit and fetching the same results from the SF cache, right?
So it's interesting because when we think about caching, in certain use cases it would just be like a public API call, where the entire API call is the same across all users, right? It doesn't depend on a property of the user. But more often than not, especially for APIs, the data that we're fetching depends on a property of the user. And it depends on a property of the user, but it's not the user ID, right? If we tried to cache based on user ID, we would never get a cache hit, because all of the three user IDs are different. What we actually want to cache on is user dot city, because the user dot city is the cache key here, right? And that kind of is an interesting thing to kind of figure out: how we can cache, right?
Now, the way Hasura handles this, right, is based on what we were discussing previously around query compilation and the authorization model itself. What Hasura does is, because each model that we have has declarative authorization and relationship rules, right, all of those relationships between models and the permissions of what data I can access are all declarative, what Hasura is able to do is kind of static analysis on them, so that when it's fetching this information, when Hasura's kind of making that data fetch query, right, Hasura is able to determine what about that query depends on the user, and what property of the user, right? So Hasura is kind of able to use that to say, "Oh, the cache key should actually just be user dot city", right? And that way Hasura can determine the cache key automatically for a particular type of query, based on whatever role is querying that data, and actually optimize for cache hits, right?
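That cache key idea can be sketched as follows. Here `depends_on` stands in for what static analysis of the permission rules would report (just "city" in the restaurants example), and everything else, names included, is hypothetical:

```python
# Hypothetical sketch of session-aware cache keys: instead of keying on the
# whole session (user_id would never repeat), key only on the session
# properties the compiled query actually depends on.
cache = {}

def cached_fetch(query, session, depends_on, fetch):
    # depends_on: the session properties the query reads, per static analysis.
    key = (query, tuple(session[k] for k in depends_on))
    if key not in cache:
        cache[key] = fetch(session)
    return cache[key]

fetch_calls = 0
def fetch_restaurants(session):
    global fetch_calls
    fetch_calls += 1  # counts actual database round trips
    return f"restaurants in {session['city']}"

# Users 1 and 3 are both in SF: the second call is a cache hit.
cached_fetch("restaurants", {"user_id": 1, "city": "SF"}, ["city"], fetch_restaurants)
cached_fetch("restaurants", {"user_id": 3, "city": "SF"}, ["city"], fetch_restaurants)
cached_fetch("restaurants", {"user_id": 2, "city": "Dublin"}, ["city"], fetch_restaurants)
print(fetch_calls)  # 2: SF computed once, Dublin once
```

Keying on `(query, user_id)` instead would have produced three misses out of three calls.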
And so there's a bunch of things here. There's different grades of this, right? So today on Hasura you'll be able to cache public queries, and you'll be able to cache all API calls where the session variables are the same. And then we'll be adding an advanced version that we're working on now, where we'll do a bunch of this automated cache key discovery for complicated caching use cases, again, automatically, right? So that's kind of how Hasura works internally. Right? So if there is an authorization rule, Hasura will do a check to see how it's going to determine the cache key. And if it's not using a complicated authorization rule, then Hasura will just use a common cache for everybody, right? For all of those queries.
And so Hasura is kind of able to determine that the user property here in this restaurants use case is the city key, right? And then it creates a cache keyed on (city, query), right? So that becomes the response cache that Hasura is able to use to make sure that [inaudible 00:28:22] cache [inaudible 00:28:23] and giving you lots of cache hits, right? The internal mechanism that Hasura uses is an LRU kind of cache, and it uses an @cached directive. So for any query that you want to cache, you can just add the @cached directive to it, with a TTL, and then Hasura combines those two strategies for public data and private data automatically, and we are able to kind of get caching. On Hasura Cloud we do forward integration with the CDN as well, that we'll be launching soon. It's available on request right now, but hopefully it'll be out within the next quarter, so that you get forward integration with the CDN as well. So that all of the PoPs that you have, as long as the query has the @cached directive, will be getting super low latency edge caching automatically built in as well, for public and for private data.
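A minimal sketch of an LRU cache with per-entry TTLs, in the spirit of the @cached directive's behavior; this is illustrative only and not Hasura's actual implementation, which also involves a distributed cache on Hasura Cloud:

```python
import time
from collections import OrderedDict

# Minimal LRU-with-TTL cache sketch (hypothetical, not Hasura internals).
class TTLLRUCache:
    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(key)
        if entry is None or entry[0] < now:
            self.entries.pop(key, None)  # expired entries are dropped lazily
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return entry[1]

    def put(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        self.entries[key] = (now + ttl, value)
        self.entries.move_to_end(key)
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict least recently used

cache = TTLLRUCache(max_entries=2)
cache.put("a", 1, ttl=60, now=0)
cache.put("b", 2, ttl=60, now=0)
cache.get("a", now=1)             # touch "a", so "b" is now least recent
cache.put("c", 3, ttl=60, now=1)  # over capacity: evicts "b"
print(cache.get("b", now=1))      # None
```

The `now` parameter is only there to make the sketch deterministic; real code would just use the clock.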
Cool. But with that, I'll move on to JSON aggregations, which is a neat insight that I think contributes to one of the most massive speed ups that we see in Hasura. So when you think about fetching a query, right, like we had authors and articles, or we had customers and their invoices, right? We had artists and albums, whatever two models you're fetching data from. If you make one simple query to fetch this information instead of multiple database hits, the problem with just making that one query is that you'd get what is called a Cartesian product. Right? If you just naively wrote this query in something like SQL, in something that was not fundamentally a JSON store, right, you would basically have a result where, if I ran a select from author joined with article, right, what I would end up fetching is a Cartesian product. So in the result set that I would get from the database, I'll get duplicate information for each author across the articles as well. Right? So I'll get like author one, article one; author one, article two; author one, article three. Right?
And this is a very simple use case, right, but imagine if I was fetching the author's description as well, right, or the author's address. So I would get author one, author one's address, article one; author one, author one's address, article two, article two's content, right? I would get a lot of duplicate information, despite the fact that in the final API result, I don't need this. Right, I already have this author one and I have these articles. I don't actually need this piece. Right? So this result is a Cartesian product, which is one reason why, when you're running the join query, the amount of data that the database is processing and returning over the wire to the API server is actually quite large. Right. And so as the data sets start becoming larger, right, or as your queries become larger and more nested, this will actually start to slow down. Right?
So what Hasura does is a hierarchical JSON aggregation, right? Instead of that naive query, what Hasura does is, Hasura tells the database to not return a Cartesian product, but to actually return a hierarchical JSON result, right? So I'm getting author one just once and I'm getting the articles for that author one. I'm getting author two, that piece of data just once, and I'm getting the articles for that second author. Right? So for this assembly, Hasura gets the database to create the JSON itself. And then in fact, if it's a query to a single database, Hasura is just able to directly send that back to the client.
Hasura doesn't even have to unpack this result and do a serialization and deserialization to convert that into JSON. Otherwise, what you would do is you would talk to a database, you would get this result, you would parse the entire byte stream into a data structure in your programming language, right? In whatever API server environment, right? Process that object, convert that into JSON, and then convert that into a JSON string, and then send that, right? Which is an expensive [inaudible 00:32:30] operation, right? You have to traverse all of those bytes, right? And again, as your queries start becoming larger, that starts taking more and more time, right? So Hasura is able to kind of push that JSON aggregation to the database itself and then extract that result, right? Which is why, when you kind of look at this API call here... when we're looking at this API call here, these snippets, like the row_to_json and the json_agg you see, these are the JSON aggregations that Hasura is pushing down into the database itself. So Hasura is saying, "Hey database, can you give me the JSON shape that is wanted by the end client directly?", right? And we can avoid that processing in the app server, right?
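Here's a runnable miniature of the two result shapes, using SQLite's json_group_array/json_object as stand-ins for Postgres's json_agg/row_to_json; the author/article schema is the toy one from the discussion above:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'author1'), (2, 'author2');
    INSERT INTO article VALUES (1, 1, 'article1'), (2, 1, 'article2'),
                               (3, 2, 'article3');
""")

# Naive join: the author row is repeated once per article (the
# Cartesian-product-style duplication).
flat = conn.execute("""
    SELECT author.name, article.title
    FROM author JOIN article ON article.author_id = author.id
""").fetchall()
print(len(flat))  # 3 rows, author1 appearing twice

# JSON aggregation pushed into the database: each author appears once,
# with their articles already nested as JSON.
nested = conn.execute("""
    SELECT author.name,
           json_group_array(json_object('title', article.title)) AS articles
    FROM author JOIN article ON article.author_id = author.id
    GROUP BY author.id, author.name
""").fetchall()
for name, articles in nested:
    print(name, json.loads(articles))
```

The nested form is already the shape the GraphQL client asked for, so the API server can pass it through without deserializing and re-serializing the whole payload.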
And that, again, is a massive speedup, because the database already has this data and is giving us the result directly. That ends up being very useful for API calls, especially if you're fetching over lists — large lists, or a large number of fields — and even if you have just one level of nesting. There's a lot of duplication and data transfer between the API server and the database that gets cut down massively, which improves the amount of concurrency you can have and reduces latency quite drastically. So that's the JSON aggregations piece.
Now, it's interesting, because not all databases support JSON aggregations. Postgres supports JSON aggregations, SQL Server supports JSON aggregations, BigQuery supports JSON aggregations — MySQL does not. So for MySQL, we don't make one gigantic MySQL query; what we do is make multiple queries. We fall back to a data loader-style pattern, but an optimized one — Hasura does a little bit of processing to make sure you still get things like nested pagination. So depending on the kind of database, the data source, that we're adding support for, Hasura chooses between two strategies: either JSON aggregations, or multiple point queries with a data loader-style approach. So that's the JSON aggregation piece.
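A minimal sketch of that batched fallback — the names and the in-memory "database" are invented for illustration, but the shape is the classic data loader idea: one batched child query with an IN clause instead of one query per parent:

```python
from collections import defaultdict

QUERY_LOG = []  # track how many database round trips we make

ARTISTS = [{"id": i, "name": f"artist {i}"} for i in range(1, 101)]
ALBUMS = [{"id": i, "artist_id": (i % 100) + 1, "title": f"album {i}"}
          for i in range(1, 301)]

def run_query(sql, params):
    """Stand-in for a real database call; logs the query and filters in memory."""
    QUERY_LOG.append(sql)
    if sql == "SELECT * FROM artists":
        return list(ARTISTS)
    if sql == "SELECT * FROM albums WHERE artist_id IN ?":
        wanted = set(params)
        return [a for a in ALBUMS if a["artist_id"] in wanted]
    raise ValueError(sql)

def fetch_artists_with_albums():
    artists = run_query("SELECT * FROM artists", None)
    # One batched query for all albums, instead of one query per artist.
    albums = run_query("SELECT * FROM albums WHERE artist_id IN ?",
                       [a["id"] for a in artists])
    by_artist = defaultdict(list)
    for album in albums:
        by_artist[album["artist_id"]].append(album)
    return [{**a, "albums": by_artist[a["id"]]} for a in artists]

result = fetch_artists_with_albums()
```

With a hundred artists, the naive approach would make 101 queries; this makes 2.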
Cool, next we move on to subscriptions. Hasura subscriptions are essentially live queries: you're able to subscribe to any entity, and if that entity changes, you get the latest result for that entity. Now, this is quite complicated to implement at scale, for several reasons. There's a very detailed deep dive we've published that talks about how Hasura implements this, how we chose between different approaches, and what we went with. But one of the most fundamental problems is this: say you're subscribing to something like an order — you query a particular order, and then subscribe to it so that the data stays live. The problem is not just that you have to keep getting the latest order information; you also have to embed the authorization clause into the query. Because depending on who's subscribing to what, they shouldn't get access to, say, my orders when they're accessing order information.
So the event data itself also has authorization information attached that needs to be processed. There's a re-fetch: if that data changes, we have to re-fetch and recompute some of these things to produce the exact data that can be sent back to the user — even for a subscription, not just a query. The problem is that now, if I have a thousand users subscribing to orders, I'll be running a thousand queries, because for each of those queries I'll have to compute this authorization clause. And that's crazy, right? A thousand subscribers means a thousand queries just to fetch that information.
So what Hasura does is use a technique called multiplexing, where Hasura does not run a separate query for each of the different subscribers that are connected. Hasura is able to collapse that into a single query that fetches information for all of the connected subscribers whenever the data changes. That's how the subscription mechanism works with multiplexing. There's a link in the slides with details on how this works and what some of the underlying architecture choices and decisions were. But that's the multiplexing element that makes it practical to have a large number of subscribers connected and fetching information.
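A rough sketch of the multiplexing idea — simplified, and with invented names. Hasura actually does this with one parameterized SQL query over a set of per-subscriber session-variable tuples, but the essential move is the same: batch every subscriber's variables into a single query, then fan the rows back out to each connection:

```python
# A thousand connected subscribers, each with their own session variables.
subscribers = [
    {"subscriber_id": s, "user_id": f"user-{s}", "order_id": s % 5}
    for s in range(1000)
]

# Stand-in for the orders table.
ORDERS = {o: {"order_id": o, "status": "shipped"} for o in range(5)}

def run_multiplexed_query(variable_tuples):
    """Stand-in for ONE database round trip that takes all the
    (subscriber, user, order) tuples at once, applies the per-tuple
    authorization predicate, and returns one result row per tuple."""
    return [(sub_id, ORDERS.get(order_id))
            for (sub_id, user_id, order_id) in variable_tuples]

def poll_once(subs):
    tuples = [(s["subscriber_id"], s["user_id"], s["order_id"]) for s in subs]
    rows = run_multiplexed_query(tuples)  # one query, not len(subs) queries
    # Fan the results back out, one payload per connected websocket.
    return {sub_id: result for sub_id, result in rows}

results = poll_once(subscribers)
```

A thousand subscribers, one database query per poll instead of a thousand.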
Now, one of the problems, if you've been using subscriptions with Hasura, is that live queries are convenient when you're watching a single model, but not convenient when you're fetching large lists. Technically, streaming events are a subset of live queries — you can transform any event-based thing into a live query model — but it's inconvenient. To give you an example, say I'm subscribing to a chat room and its messages. What Hasura will do is, whenever there's a new message, send you all of the messages again. So over the life of the subscription you'll be getting a hundred messages, then two hundred, then three hundred — every payload keeps growing. And that's going to slow the subscription down quite a bit, just because of the network transfer, because there are so many redundant messages.
So something new that we're working on is being able to stream events easily. What you'll be able to do on the client is add a kind of notation saying "after X", where X is a cursor. This could be based on an ID, a last-updated timestamp — anything that acts as a cursor into the events table or messages table that you have. So you're subscribing to the events after X, and X is state that you maintain on the client. Say you're building something like Slack or WhatsApp, and you have some local state: you have all of the messages up to X, where X is some globally unique, ordered identifier. What you want is to fetch the messages after X and stream them as they happen. Hasura will then send you the messages after that point, and once that backlog is exhausted, anything new that comes in during the lifetime of the connection.
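Client-side, the cursor logic would look something like this — a sketch only, with hypothetical names (the actual subscription syntax will be in the GitHub issue). The key point is that the client advances its own cursor X, so each payload contains only the messages after X:

```python
# Stand-in for the messages table on the server.
MESSAGES = [{"id": i, "body": f"message {i}"} for i in range(1, 8)]

def fetch_after(cursor, limit=3):
    """Stand-in for the streaming subscription: return only the messages
    with id greater than the client's cursor, oldest first."""
    newer = [m for m in MESSAGES if m["id"] > cursor]
    return newer[:limit]

def stream_all(start_cursor=0):
    cursor = start_cursor
    received = []
    while True:
        batch = fetch_after(cursor)
        if not batch:
            break  # backlog exhausted; a real client would now wait for new events
        received.extend(batch)
        cursor = batch[-1]["id"]  # the client advances its own cursor X
    return received

got = stream_all(start_cursor=2)  # resume from local state: "I have up to id 2"
```

Each batch is small and never re-sends what the client already has, which is exactly what the growing live-query payloads above couldn't give you.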
So that's a good balance between making sure that you don't miss events and being able to stream events continuously, without having to fetch the whole thing as a live query. For those of you who are interested in this, please reach out to me. We'll have a GitHub issue for this shortly that you can subscribe to, so that you can start trying this piece out. This should be super useful for streaming information generally, but especially for streaming events. So this is going to be very interesting.
All right. Next, let's take a look at remote joins. Remote joins are when we're joining across two databases, which is something we previewed at HasuraCon a few months ago. I think the final pull request has been reviewed and tested, so we're looking at merging that in next week — super exciting. Cross-database joins have a massive impact on performance, because all of the cool stuff we were talking about for a single database breaks down when you have multiple databases: how can you compile a single SQL query that hits multiple databases? That becomes challenging. So there's a bunch of very interesting work we've done at Hasura to optimize that piece. Just to understand the performance problems you can run into there: you still have to do all of the optimizations we talked about for a single database, and as soon as you're on multiple databases the problem gets exacerbated. How are you going to do nested pagination? How are you going to do nested filtering? How are you going to do this across two systems?
Just to give you an example: if I'm fetching artists from one database and albums from a second database, even though they're related models, you need to fetch those artists and then fetch the albums per artist. You can use a data loader to optimize that and say, I want to fetch all the artists from here, and then all the albums for artists one, two, three, four, five. So you can run that same kind of batched query. But, like I mentioned in the single-database case, this becomes worse when you do something like "first 10, sorted by a particular field" — because now that batched query is not going to work against the second system. How are you going to represent a top-10-per-artist in the second database call? You're not going to be able to push that join down into the second source effectively. So that becomes challenging.
Another example: across these systems, you can end up fetching the same information more than once. Say I'm fetching albums, and artists, and reviews of the albums made by other artists. There's duplication happening there, but it's happening at different depths of the query tree — one piece at one depth, the other piece at another. That again becomes really hard to fit into an effective data loader pattern, because context needs to be shared across distant parts of the tree. So you have to dedupe the artist requests you're making, which Hasura handles for you effectively.
So let's just take a quick look... I was just looking at Gregor's message — yes, that's exactly the use case this is made for, so that you don't have to do the updated-at piece yourself; you can just stream the messages as you get them. Awesome. Cool. So here, I've set up a quick example that has artists in Postgres and albums in SQL Server — the example I was demoing at HasuraCon as well — and I set up a relationship between the artist in Postgres and the album in SQL Server. So I'm able to say: I want the first 10 artists, output the name, and then I want the albums and each album's title. And this query runs pretty snappily, even though it's fetching information across two systems. What Hasura is doing here is running the query for artists, and then — with a technique similar to the multiplexing we use in subscriptions — running a single query on SQL Server across those artists, with the predicate pushed down. We'll do a writeup on what that internal predicate pushdown looks like; there's interesting engineering there. But that's effectively how it's able to work.
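Here's a toy version of that cross-database join, with two separate in-memory SQLite databases standing in for Postgres and SQL Server (schema and data invented; this is a sketch of the batching-plus-pushdown idea, not Hasura's actual query generation):

```python
import sqlite3

pg = sqlite3.connect(":memory:")   # stands in for Postgres
ms = sqlite3.connect(":memory:")   # stands in for SQL Server

pg.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO artists VALUES (1, 'artist one'), (2, 'artist two');
""")
ms.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
    INSERT INTO albums VALUES (1, 1, 'album one'),
                              (2, 1, 'album two'),
                              (3, 2, 'album three');
""")

# Step 1: query the first database for the parent rows.
artists = pg.execute("SELECT id, name FROM artists ORDER BY id").fetchall()

# Step 2: ONE query against the second database, with the join keys
# pushed down as a predicate -- not one query per artist.
ids = [a[0] for a in artists]
placeholders = ",".join("?" for _ in ids)
albums = ms.execute(
    f"SELECT artist_id, title FROM albums WHERE artist_id IN ({placeholders})",
    ids).fetchall()

# Step 3: stitch the two result sets together in the API server.
joined = [
    {"name": name, "albums": [t for (aid, t) in albums if aid == artist_id]}
    for (artist_id, name) in artists
]
```

Two databases, two queries total, and the nesting is assembled in the server — which is exactly where things like per-artist limits and ordering get tricky, as described above.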
And then, just to make this example a little more fun: album has tracks, and I put the tracks back in the Postgres database and set up a relationship with that as well — each track has an album ID. So now I can query artists, then albums, then each album's tracks and each track's name. That goes from Postgres, to SQL Server, and back to Postgres. And again, it's pretty snappy, even though I'm doing a multi-hop join across related models on two databases. So that's how Hasura approaches remote joins and how it optimizes them. Some more details on the exact nature of the query generation is something we'll start putting together. We don't have this available in the explain plan yet, but that should also land soon, so that you can see the underlying plan as well — you can see the trace today, but making the plan visually available is something that will land soon.
Cool. Next, just talking a little bit about scale-up and scale-out. Hasura is multi-threaded by default and uses some interesting concurrency primitives, which makes it very easy to scale Hasura up vertically. Vertical scaling is very useful for things like subscriptions, or heavy queries and heavy mutations, which might require a larger amount of memory to process. So you can scale Hasura vertically quite easily. And then, just like any other system, you can also scale Hasura horizontally. Each Hasura instance is stateless. That matters especially for subscriptions, where it's normally really hard to scale a subscription server horizontally: each websocket connection carries state for its user, and if traffic moves from one system to another, suddenly everything breaks because that state is on the other system. Hasura handles that well — each instance is stateless, which makes it very easy to scale out horizontally.
So if you want to scale out based on CPU, or memory, or any other advanced metrics that Hasura emits, it's very straightforward to set that limit up and scale out. What we recommend, and what we help our users and customers do, is use the benchmarking repository we provide — it's fairly simple. You put in some of your common GraphQL queries, run a GraphQL benchmark, see what your base load looks like, pick a default instance size based on that, and then set it up to scale horizontally. That's usually all you need to do initially to set Hasura up.
So that's a quick overview of how Hasura does a bunch of these performance things internally — I'm hoping that was helpful. Just before we close with some more questions, here are some resources and links with more details on the stuff I was talking about. They dive in a little deeper technically on the query compilation aspect, the subscriptions aspect, remote joins, and caching. Once we share these slides with you, you'll be able to check those links out as well. But in general, if you search for this stuff, you'll find some of these posts and talks.
Cool, with that, we have a little bit of time for questions. I'm just taking a look at Gregor's question... can it handle creations but also updates? Yes — that's a very good question, Gregor. What we want to do is lay out a few patterns for updates and deletes with respect to subscriptions. When we look at the event streaming model, it fundamentally relies on the fact that you have some identifier that determines a change — which could be a timestamp, a UUID, whatever. That means that for updates and deletes, the best way to handle them is often to just create a new instance of that message. It depends on the part of your model: for some parts of your application, you can have direct in-place updates and deletes on the same model — you have a message, you just update that message or delete it in place.
And for some aspects, that works — especially if you're looking at a model like an order: you're tracking an order, subscribing to it, and the order has a status that goes through updates and deletes. That's much easier to model: you can use a last-updated-at column to determine changes, and do a soft delete instead of an actual delete. But for things like messages, where you're streaming information, you're going to have a large volume, a large number of people fetching those messages, and eventually you'll want to partition or shard those message tables out. In those cases — and we'll do a writeup talking about this pattern — what we'd recommend is that you create a new message that references the original via something like a parent message ID, and that new message holds the latest data. So you don't edit in place; for an update you create a new row referencing the original, and for a delete you mark a delete event in that same model. You make it part of the model, and then it just becomes an event stream.
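One way to sketch that append-only pattern (column names are hypothetical): updates and deletes become new immutable rows that reference the parent message, and the client folds the event stream into the latest state.

```python
# Each row is an immutable event; edits and deletes reference the
# original message via parent_id instead of mutating it in place.
events = [
    {"id": 1, "parent_id": None, "kind": "create", "body": "helo"},
    {"id": 2, "parent_id": None, "kind": "create", "body": "second message"},
    {"id": 3, "parent_id": 1,    "kind": "update", "body": "hello"},
    {"id": 4, "parent_id": 2,    "kind": "delete", "body": None},
]

def fold(events):
    """Replay the event stream to compute the current state of each message."""
    state = {}
    for e in events:
        root = e["parent_id"] if e["parent_id"] is not None else e["id"]
        if e["kind"] in ("create", "update"):
            state[root] = e["body"]
        elif e["kind"] == "delete":
            state.pop(root, None)
    return state

current = fold(events)
```

Because rows are never mutated, the stream composes naturally with a cursor ("after X"), and the full event history doubles as an audit log of past versions.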
And this is what you'd see if you look at, for example, GitHub comments, or WhatsApp, or Slack: they actually keep those copies; they're not doing CRUD updates on the raw model. That makes it much easier from a performance point of view, but also from a history and audit point of view. Say tomorrow you want to give admins on your app a feature to look at past versions of a message, like you see on GitHub comments — this unlocks some of that, and it also makes reasoning about streaming much, much easier. Otherwise it becomes challenging. So, very good point, and I think as we do the streaming work we'll start documenting some of these patterns as well, because the question you asked is pretty much the first question people have when they're setting up these models.
All right, cool. I think we're nearly at time. If you have any questions about this stuff, about the internals, please feel free to hit me up — I'm @tanmaigo on Twitter, so feel free to reach out with anything I can help with. And thank you for the kind words. The recording of the webinar and the slides will be sent across soon as well. So hopefully that was useful, and talk to you folks again soon. Have a good day, evening, or night, depending on where you folks are. All right. Thanks, everybody.