Postgres Containment Operators : Part 2 - Performance comparison with MongoDB

In Part 1 of the series, we looked at what containment operators are and provided a (in)formal definition. In Part 2, we see how containment operators perform compared to MongoDB.

What are we measuring ?

We measure the time taken to execute the following four types of queries,

  1. Simple - Query for a value in a top level key
  2. Embedded Document - Query for a value inside a nested object
  3. Array - Query for the existence of a value inside an array
  4. Multi-key - Query for values across multiple keys

under the following settings,

  1. Postgres without using containment operator
  2. Postgres using containment operator
  3. MongoDB

in both indexed and non-indexed cases.

Measurement Setup

The tests are run on a MacBook Pro (Retina, 13-inch, Late 2013). The versions of Postgres and MongoDB are 9.4.4 and 3.0.7 respectively.

Postgres timing information is obtained by prefixing the queries with EXPLAIN (ANALYZE TRUE, TIMING FALSE). MongoDB timing information is obtained from db.collection.find().explain("executionStats").

Data

Both databases are loaded with 1 million random JSON objects of the form,

{
  "name": "Ochoa",
  "age": 47,
  "spouse": {
    "name": "Santos",
    "age": 35
  },
  "children": [
    "Jannie",
    "Olsen"
  ]
}

The Postgres table the data was loaded into has the schema,

 Column |  Type   |                      Modifiers
--------+---------+-----------------------------------------------------
 id     | integer | not null default nextval('people_id_seq'::regclass)
 person | jsonb   |

Queries

Query Type Query Hits
Simple Find all people with name "John" 478
Embedded Document Find all people with spouse's name "John" 492
Array Find all people having a child named "John" 1532
Multi-key Find all people aged 50 whose spouse is aged 32 2288

The following queries were used.

1. Postgres (without containment)

SELECT person FROM people WHERE person ->> 'name' = 'John'

SELECT person FROM people WHERE person - > 'spouse' - >> 'name' = 'John'

SELECT person  
FROM people  
WHERE EXISTS (  
        SELECT 1
        FROM jsonb_array_elements_text(person - > 'children') children
        WHERE children = 'John'
        )

SELECT person  
FROM people  
WHERE person - >> 'age' = '50'  
    AND person - > 'spouse' - >> 'age' = '32';

2. Postgres (with containment)

SELECT person  
FROM people  
WHERE person @ > '{"name" : "John"}';

SELECT person  
FROM people  
WHERE person @ > '{"spouse" : {"name" : "John"}}'

SELECT person  
FROM people  
WHERE person @ > '{"children" : ["John"]}'

SELECT person  
FROM people  
WHERE person @ > '{"age" : 50, "spouse": {"age" : 32}}'  

3. MongoDB

db.people.find({"name" : "John"})

db.people.find({"spouse.name" : "John"})

db.people.find({"children" : "John"})

db.people.find({"age": 50, "spouse.age" : 32})  

Indexes

In case of Postgres (without containment) and MongoDB we created indices for - name, spouse.name, children, age & spouse.age and in the case of Postgres (with containment) we created one GIN (jsonb_path_ops) index.

Results

We present three results - timing measurements in non-indexed and indexed cases and the index sizes.

Non-Indexed

Query Type Simple (ms) Embedded Document (ms) Array (ms) Multi-key (ms)
Postgres (without containment) 325 425 1990 470
Postgres (with containment) 330 430 465 350
MongoDB 390 455 748 380

Indexed

Query Type Simple (ms) Embedded Document (ms) Array (ms) Multi-key (ms)
Postgres (without containment) 0.75 0.79 1990 20
Postgres (with containment) 0.83 1.12 2.82 10
MongoDB ~ 1 ~ 1 ~ 2 70

Index Sizes

Query Type Simple (MB) Embedded Document (MB) Array (MB) Multi-key (MB)
Postgres (without containment) 23 23 55 42 (21 + 21)
Postgres (with containment)
27
MongoDB 23 23 70 48 (24 + 24)

Conclusion

The numbers suggest that Postgres jsonb performance is at par with MongoDB in most cases and betters it in multi-key containment queries. Moreover it can do it with just one index, thanks to GIN. Well done Postgres!

Balaji Rao

Balaji's Haskell interest brought him to Hasura, and he's joined the HasuraDB team. In his past life, he's been a Linux kernel hacker, a platform orchestration engineer, and an almost math-academic.

comments powered by Disqus