GraphQL and UUID type on Postgres
Hasura GraphQL Engine has implicit support for Postgres UUID type, which means that UUID values can be provided as strings.
Every table in a database needs a primary key.
A primary key’s purpose is to provide a stable, indexable reference to an entity in a table. Common practice is to use serial ids as primary keys. Serial ids are auto-incrementing integers, which are usually auto-generated by the database.
A robust alternative to a serial id is the UUID (Universally Unique Identifier) or the GUID (Globally Unique Identifier), which is a standard 128-bit number that is very hard to duplicate, thereby providing unique identifiers which are perfect for use as primary keys.
What is the Postgres UUID Type?
The Postgres UUID data type stores UUIDs as defined by RFC 4122, which are 128-bit quantities generated by algorithms that minimize the probability of having duplicate identifiers.
A UUID comprises of 32 hexadecimal digits, represented in groups of 8,4,4,4 and 12.
- Exposing information: Very frequently, the primary id is exposed to the user (REST urls, anyone?). If the primary key is easily visible to the user, this means that anyone can find out your number of users by just looking at the primary key on a freshly created user. This leaks information about the size of your userbase.
- Enumeration of entries: Another problem is that it’s very easy to enumerate all the entries of your table with a primary key. A malicious user or a bot could easily scrape identities of all your users, or spam your API with organised actions on all your users. A UUID is universally random, and is practically impossible for external parties to predict. To be very clear though, UUIDs are not a security mechanism, the actual solution to this problem is to put up stronger access control. UUIDs just provide more protection in this case.
- Independent Generation: The biggest problem with sequential ids is that the client doesn’t know the id of an object being inserted without talking to the database. This is even more of a problem when inserting multiple related objects with foreign keys, and creates inelegant multi step inserts. A UUID on the other hand, can be generated directly by the client making inserts very simple, and solves the foreign key problem quite nicely too.
- Random distribution: The random distribution of the UUIDs can prevent disk hotspots, where a large number of objects are stored in a particular disk block, causing high usage on specific areas of the disk. This works very well for databases distributed over several nodes.
Why not UUID?
- Size: The downside is that UUIDs are four times as big as a serial, and if you have size constraints, they can be a problem.
- Hard to debug: Harder to remember, hard to debug, a small but important QoL issue. It’s easier to test/debug when you have simpler ids that you can remember compared to long complicated UUIDs.
- Lack of easy ordering: The primary key cannot be used to determine the order of insertion, another column with timestamps or a sequence becomes necessary. This also means that clustered indexes on UUID columns are very costly in terms of performance.
- Performance: And also, random UUIDs mean scattered indexes, and this can cause performance issues/high disk usage with high insert rates. (Databases have sequential UUID generation functions to deal with this, but this eliminates the advantage of being able to generate the UUID on the client)
UUIDs in the Hasura GraphQL Engine:
Hasura GraphQL Engine (HGE) is an open source tool that gives instant GraphQL APIs over Postgres.
HGE supports UUID types as implicits, which means that UUID values can be provided as strings.
Let’s set up a simple schema with a UUID column to run some queries on:
Here’s a sample GraphQL query that fetches a specific object from a table with a UUID primary key:
Inserting an object with a UUID generated using a client side library is quite simple too:
You can try out all the above examples on a live HGE instance here!