Primary Key: A primary key in a relational database uniquely identifies the record among many records stored in a table.
Primary keys can be made up of one or more columns, of existing data, or made up data that serves only the purpose to identify the record itself. The following are the kinds of primary keys a table can have.
- Super Key - A super key is a single or multiple keys that make up a group of keys which identifies a row within a table.
- Candidate Key - A candidate key are attributes that uniquely identify rows in a table. A candidate key is a super key with none of the attributes repeated, thus a natural key within the existing stored data.
- Alternate Key - An alternate key is a column or group of columns that uniquely identify every row in the table. It is a key that isn't particularly selected as a primary key, but can be used as one if the primary key weren't available for some reason.
- Compound Key - A compound key has two or more columns included as the primary key of a table to uniquely identify each row of data.
- Composite Key - An artificial key (think of a generated autonumber, short ID, CUID, or UUID) that uniquely idenfies each record. Also referred to as a surrogate key, these keys are used when a natural key doesn't exist in the schema of the existing table.
- Surrogate Key - See Composite Key.
I put together two videos related to specific primary keys: UUIDs & creating Short URL Safe UUIDs. Check those out on Youtube.
The short ID function I add in this short, is embedded below and provided in this gist for easy and quick access!
CREATE OR REPLACE FUNCTION gen_unique_short_id() returns text language plpgsql as $$ DECLARE id text; BEGIN id := encode(gen_random_bytes(6), 'base64'); id := replace(id, '/', '_'); id := replace(id, '+', '_'); RETURN id; END; $$; alter function gen_unique_short_id() owner to postgres;
The above definitions and details I discuss in "A Hasura Bit - What is the UUID column type good for and what exactly is a UUID?" leave some additional details that should be added. UUIDs are great as surrogate keys for many situations, but there are some situations where they can cause significant problems.
For example. Some of the characteristics of UUIDs can become significant problems in higher scale scenarios or in certain hardware situations. Being the UUID is a largely random value and numbered value, the database engine attempts to order these during writes, which if there are too many writes that then require reordering immediately come in, it can cause disk thrashing. Something that in the end can become a significant performance issue! At the same time, 128-bit can be excessively large with large volumes of data just for the key that may not be needed.
These problems, as with any trade off, just emphasize the fact that for each and every scenario the pluses and minuses need to be measured carefully for the intended purpose and outcome for the mission at hand.
Other considerations with both the UUID and short ID above include the issue with writes incurring a read/write and possible disk thrashing if the insert volume gets too high too fast. Another issue for the short IDs is that there is a slightly higher possibility of duplicates being generated. However, in the sense of a high possibility it's still absurdly low and statistically almost nonexistent. However, it doesn't hurt to have a check and write a new key if a duplicate does occur to prevent that error. But the investment in that level vs. just dealing with the error might not be worth it, every project might have a different tolerance for a possible insert error of that sort. Maybe a retry is good enough?
Character Data & Numeric Data Types
In the next two shorts I tackle a few details about character data and numeric data in GraphQL, how it maps back against Postgres types and reference the key grid that maps Postgres types to their respective API types one would get in Hasura served GraphQL JSON result objects here, and below the videos for quick reference in this article.
|bigint||int8||signed eight-byte integer||String|
|bigserial||serial8||autoincrementing eight-byte integer||String|
|bit [ (n) ]||fixed-length bit string||Implicit|
|bit varying [ (n) ]||varbit [ (n) ]||variable-length bit string||Implicit|
|boolean||bool||logical Boolean (true/false)||Bool|
|box||rectangular box on a plane||Implicit|
|bytea||binary data (“byte array”)||Implicit|
|character [ (n) ]||char [ (n) ]||fixed-length character string||Char|
|character varying [ (n) ]||varchar [ (n) ]||variable-length character string||String|
|cidr||IPv4 or IPv6 network address||Implicit|
|circle||circle on a plane||Implicit|
|date||calendar date (year | month | day)||Date|
|double precision||float8||double precision floating-point number (8 bytes)||Float|
|inet||IPv4 or IPv6 host address||Implicit|
|integer||int | int4||signed four-byte integer||Int|
|interval [ fields ] [ (p) ]||time span||Implicit|
|json||textual JSON data||JSON|
|jsonb||binary JSON data | decomposed||JSONB|
|line||infinite line on a plane||Implicit|
|lseg||line segment on a plane||Implicit|
|ltree||labels of data stored in a hierarchical tree-like structure||Implicit|
|geometry||PostGIS Geometry type||Geometry|
|geography||PostGIS Geography type||Geography|
|macaddr||MAC (Media Access Control) address||Implicit|
|macaddr8||MAC (Media Access Control) address (EUI-64 format)||Implicit|
|numeric [ (p | s) ]||decimal [ (p | s) ]||exact numeric of selectable precision||Numeric|
|path||geometric path on a plane||Implicit|
|pg_lsn||PostgreSQL Log Sequence Number||Implicit|
|point||geometric point on a plane||Implicit|
|polygon||closed geometric path on a plane||Implicit|
|real||float4||single precision floating-point number (4 bytes)||Float|
|smallint||int2||signed two-byte integer||Int|
|smallserial||serial2||autoincrementing two-byte integer||Int|
|serial||serial4||autoincrementing four-byte integer||Int|
|text||variable-length character string||String|
|time [ (p) ] (without time zone)||time of day (no time zone)||Implicit|
|time [ (p) ] (with time zone)||timetz||time of day | including time zone||Timetz|
|timestamp [ (p) ] (without time zone)||date and time (no time zone)||Implicit|
|timestamp [ (p) ] (with time zone)||timestamptz||date and time | including time zone||Timestamptz|
|tsquery||text search query||Implicit|
|tsvector||text search document||Implicit|
|txid_snapshot||user-level transaction ID snapshot||Implicit|
|uuid||universally unique identifier||Implicit|