Tune Hasura Performance
This page serves as a reference for suggestions on fine-tuning the performance of your Hasura GraphQL Engine instance. We cover database configurations, scaling, observability, and software architecture to help you get the most out of your setup.
Configuration and deployment
PostgreSQL's basic configuration is tuned for wide compatibility rather than performance, so the default parameters are likely to be undersized for your system. You can read more details about the configuration settings at the PostgreSQL Wiki. There are also config generator tools that can help.
With respect to Hasura, several environment variables can be configured to tune performance. The connection pool size (HASURA_GRAPHQL_PG_CONNECTIONS) has the following bounds:
- Minimum: 2 connections
- Default value: 50
- Maximum: Postgres max_connections - 5 (keep some connections free for other services, e.g. pgAdmin and metrics tools)
How many connections is the best setting? It depends on the Postgres server's hardware specifications, and more connections don't necessarily mean higher query performance. There are many great articles that analyze this in more depth.
There isn't a silver bullet for all server specs. You have to test and benchmark carefully to arrive at the final value. However, as a starting point, you can estimate with this formula, and then test around this value:
connections = ((core_count * 2) + effective_spindle_count)
For example, if your server has a 4-core i7 CPU and 1 hard disk, it should have a connection pool of:
9 = ((4 * 2) + 1)
Call it 10, as a nice round number.
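The formula above can be written as a small helper. This is only a sketch of the arithmetic; the function name `estimate_pool_size` is made up for illustration, and the result is a starting point to benchmark around, not a final setting.

```python
def estimate_pool_size(core_count: int, effective_spindle_count: int) -> int:
    """Starting-point pool size: (core_count * 2) + effective_spindle_count."""
    if core_count < 1 or effective_spindle_count < 0:
        raise ValueError("core_count must be >= 1 and spindle count must be >= 0")
    return core_count * 2 + effective_spindle_count


# The example from the text: a 4-core CPU with 1 hard disk.
print(estimate_pool_size(core_count=4, effective_spindle_count=1))  # 9
```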
For high-transaction applications, horizontal scaling with multiple GraphQL Engine clusters is the recommended best practice. However, you should be aware of the total connections of all nodes. The number must be lower than the maximum connections of the Postgres instance.
With read replicas, Hasura can load balance across multiple databases. However, you will need to balance connections between database nodes too:
- Master connections (HASURA_GRAPHQL_PG_CONNECTIONS) are now used for writing only. You can lower the maximum if Hasura doesn't write much, or share connections with other Hasura nodes.
- Currently, read-replica connections use one setting for all databases. You can't configure a specific value for each replica. Therefore, you need to be aware of the total number of connections when scaling Hasura to multiple nodes.
Live query refetch interval. Default: 1000 (ms)
In brief, live query subscribers are grouped with the same query and variables. The GraphQL Engine needs to execute one query and return the same results to clients once every refetch interval.
The smaller the interval, the faster clients receive updates. However, everything has a cost. Small intervals with a large number of subscriptions need more CPU and memory resources. If you don't need such a high frequency, the interval can be set a bit longer. In contrast, with a small to medium number of subscriptions, the default value will suffice.
Imagine there are 1,000,000 subscribers to the same query. Emitting to that many websockets in sequence can cause delays and consume more memory in a long queue, which is why results are processed in batches. However, a small batch size can increase the number of SQL transactions, so this value needs to be kept in balance. If you aren't sure what value to use, use the default.
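The batch-size trade-off can be illustrated with a rough count of batches per refetch interval. This is a deliberately simplified model, not a description of the engine's internals: it assumes each batch costs one SQL transaction per interval, and the function name and batch sizes used here are hypothetical.

```python
import math


def batches_per_interval(subscribers: int, batch_size: int) -> int:
    """Number of batches (assumed: one SQL transaction each) per refetch interval."""
    if subscribers < 0 or batch_size < 1:
        raise ValueError("subscribers must be >= 0 and batch_size must be >= 1")
    return math.ceil(subscribers / batch_size)


# Smaller batches mean more transactions; larger batches mean longer queues per batch.
for size in (100, 1_000, 10_000):
    print(size, batches_per_interval(1_000_000, size))
```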
The Hasura GraphQL Engine binary is containerized by default, so it is easy to scale horizontally. Workloads will be spread across instances, but not necessarily equally. This is normal. In cases where you have many event triggers, consider tweaking the HTTP timeout and the maximum number of event processing threads per instance to suit your needs.
To calculate your total nodes, you will need to estimate the expected concurrent requests per second and benchmark the number of requests one Hasura node can handle, then scale to multiple nodes with this simple calculation:
total_nodes = required_ccu / requests_per_node + backup_node
1, depending on your plan.
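As a sketch of the node calculation above: the helper below rounds the division up, which is an assumption on our part (a fractional node has to become a whole one), and the traffic figures in the example are invented for illustration.

```python
import math


def total_nodes(required_ccu: int, requests_per_node: int, backup_node: int) -> int:
    """total_nodes = required_ccu / requests_per_node + backup_node, rounded up."""
    if requests_per_node < 1:
        raise ValueError("requests_per_node must be >= 1")
    return math.ceil(required_ccu / requests_per_node) + backup_node


# Hypothetical numbers: 5,000 concurrent requests, 1,200 per benchmarked node, 1 backup node.
print(total_nodes(required_ccu=5000, requests_per_node=1200, backup_node=1))  # 6
```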
A good starting point would be n x (2 CPUs with 4GB RAM) Hasura instances to maintain high availability in production, and then scale accordingly.
It's recommended to monitor resource usage patterns and scale the instance size up vertically as needed, since the right size depends on your workload, database size, and so on. 4 CPUs with 8GB RAM is usually a sweet spot that handles a wide variety of production workloads.
Horizontal auto-scaling can be set up based on CPU & memory. It's advisable to start with this, monitor it for a few days and see if there's a need to change based on your workload.
However, you need to be aware of the total Postgres connections, too. The default Hasura connection pool size (HASURA_GRAPHQL_PG_CONNECTIONS) is 50, while the default Postgres max_connections configuration is 100. The Postgres server will easily run out of connections with 3 Hasura nodes, or 2 nodes with events/action services that connect directly to the database.
pg_max_connections >= hasura_nodes * hasura_pg_connections + event_nodes * event_pg_connections
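The inequality above can be checked before adding nodes. The sketch below only restates that arithmetic; the function names are illustrative, and the figure of 10 connections for an event service is an invented example.

```python
def required_pg_connections(
    hasura_nodes: int,
    hasura_pg_connections: int,
    event_nodes: int = 0,
    event_pg_connections: int = 0,
) -> int:
    """Total connections that all Hasura and event/action nodes may open."""
    return hasura_nodes * hasura_pg_connections + event_nodes * event_pg_connections


def fits_in_postgres(pg_max_connections: int, required: int) -> bool:
    """True when pg_max_connections >= the required total."""
    return pg_max_connections >= required


# Defaults from the text: pool size 50, Postgres max_connections 100.
print(fits_in_postgres(100, required_pg_connections(2, 50)))         # True
print(fits_in_postgres(100, required_pg_connections(3, 50)))         # False
print(fits_in_postgres(100, required_pg_connections(2, 50, 1, 10)))  # False
```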
Observability tools help us track issues, alert us to errors, and allow us to monitor performance and hardware usage. They are critical in production. There are many open-source and commercial services. However, you may have to combine several tools because of the architectural complexity. For more information, check out our observability section and our observability best practices.
Software architecture and best practices
Hasura as a data service
Database connection management isn't an easy task, especially when scaling to multiple application nodes. There are common issues such as connection leaking, and maximum connections exceeded. If you use serverless applications for Actions/Event Triggers that connect directly to a database, connection leaking is unavoidable because every invocation may result in a new connection to the database. This is especially true when the number of services has grown into the hundreds.
You can use many solutions such as PgBouncer, vertical scaling, and increasing max_connections in the Postgres configuration. However, raising the connection limit too far can have unintended consequences for your server and degrade performance.
Therefore, you can reduce connection usage by querying data from the GraphQL Engine instead. Connection pooling will be centralized in Hasura nodes.
You can read more in the Hasura Blog.
Understand your data
Hasura's query performance relies on database performance. When there is a performance issue, you need to profile to identify the bottleneck and optimize your database queries. Utilizing the power of Postgres can help boost application speed.
Fundamentals you should know and practice:
- Index your tables.
- Optimize queries with views, materialized views, and functions.
- Use triggers to update data instead of using 2 or more request calls.
- Normalize your data structure.
- Use EXPLAIN and ANALYZE to inspect query plans.
- Avoid too many _or conditions. They often don't utilize the table index.
- ilike is expensive, especially on long text. Use full-text search instead.
- Pagination is necessary.
- Batching multiple queries in one request is useful. However, you shouldn't overuse it. One common case is using an aggregate count with data in the same pagination query. It is okay for small-to-medium tables. However, when the table is large, the query will be slow because the count has to scan the entire table.
- Avoid joining too many tables.
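As an illustration of the pagination advice above, the sketch below builds a paginated request and keeps the aggregate count in a separate query, so that on large tables the count can be run less often than the page fetch. The table name `products` and its fields are hypothetical; `limit`, `offset`, `order_by`, and the `_aggregate` field follow Hasura's generated query API.

```python
# Hypothetical table "products" with columns "id" and "name".
PAGE_QUERY = """
query ProductsPage($limit: Int!, $offset: Int!) {
  products(limit: $limit, offset: $offset, order_by: {id: asc}) {
    id
    name
  }
}
"""

# Kept separate from the page query so it is not re-run on every page fetch.
COUNT_QUERY = """
query ProductsCount {
  products_aggregate {
    aggregate {
      count
    }
  }
}
"""


def page_request(page: int, page_size: int = 20) -> dict:
    """Build the GraphQL request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": PAGE_QUERY,
        "variables": {"limit": page_size, "offset": (page - 1) * page_size},
    }


print(page_request(page=3)["variables"])  # {'limit': 20, 'offset': 40}
```

The returned dictionary can be sent as the JSON body of a POST request to the GraphQL endpoint with any HTTP client.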
Hardware has its limits, and both scaling servers and optimizing data are expensive. Moreover, Postgres doesn't support master-master replication, so a single database becomes a bottleneck if you store all your data in it. Therefore, you can divide your business logic into multiple smaller services, or microservices.
Hasura encourages microservices with Remote Schemas. This can act as an API Gateway that routes to multiple, smaller GraphQL servers.
For example, in an e-commerce application, you can design three Hasura services:
- User + Authentication
- Product management
- Order + Transactions
Depending on a project's scope, the design philosophy is flexible. On a small project, one database will suffice.
Thanks to the open-source community, Postgres has many extensions for various types of applications:
- Time-series data, metrics, IoT: TimescaleDB, CitusDB.
- Spatial and geographic objects: PostGIS.
- Image processing: PostPic.
With Remote Schemas, you can use Hasura with multiple databases for various use cases. For example, Postgres for user data, TimescaleDB for transaction logs, and PostGIS for geographic services.