Skip to main content
Version: v3.x

Data Modeling

The MongoDB data connector makes available any database resource that is listed in its configuration. In order to populate the configuration, the connector supports introspecting the database via the CLI update command.

Queryable collections

The MongoDB data connector supports several types of queryable collections:

  • Collections
  • Views
  • Native Queries

Collections and views are typically discovered automatically during the introspection process.

Native queries on the other hand are custom aggregation pipelines that you specify in the data connector configuration. These are queries run using the MongoDB aggregate command which makes them similar to views, except that you may configure parameters that are interpolated into the pipeline at query time (similar to a function), and they do not require any DDL permissions on the database. See the configuration reference on Native Queries.

Scalar types

The MongoDB data connector supports nearly all of scalar types that the MongoDB database does. In addition the MongoDB data connector provides an additional scalar type, extendedJSON which acts like an "any" type.

Data type names

MongoDB's scalar types are listed in MongoDB's BSON Types documentation. The MongoDB data connector configuration uses "alias" names for each type which are camel-cased with a lower-case initial letter. The same names are used in the connector's NDC schema.

In order to use standard GraphQL scalar types rather than custom scalar types in the resulting GraphQL schema, the DDN configuration enables expressing a mapping between data connector types and GraphQL types.

Limited support

The data connector does not support the dbPointer BSON type which is deprecated in MongoDB. Other deprecated types, symbol and undefined, are supported for the time being.

Extended JSON

extendedJSON may represent any MongoDB value including scalars, objects, arrays, or null. The automatically-generated schema in your MongoDB data connector configuration may include fields with the extendedJSON type in cases where the introspection system is unable to infer a specific type. The name is based on the serialization format used: unlike other values, values with the extendedJSON type are serialized in the MongoDB Extended JSON format.

Structural types

The data connector supports MongoDB array and object types, and they will appear as such in the schema it exposes. Since MongoDB is a document store that is intended to store arbitrarily complex data structures, the MongoDB data connector is designed to support the same.

As mentioned previously, although extendedJSON is presented as a scalar type from the perspective of GraphQL it may represent arbitrary structural data encoded using the MongoDB Extended JSON format.

Schema Introspection

The CLI is capable of automatically creating a configuration for the MongoDB data connector which includes a schema for your database. Because MongoDB databases do not have an internal schema, the connector schema is produced by randomly sampling database documents. By default it samples 100 documents - you may get more accurate results by increasing this number which you can do by editing the sampleSize option in configuration.json.

During sampling types of values found are compared between documents. This includes top-level fields, but also nested object fields, and array elements. For a field path where the same type of value is found in every document the schema sets that type. If a value at that path is missing in some documents a nullable type is assumed.

In general automatic schema introspection when encountering arrays or objects will accurately reflect the complex structure seen in the database. These types will in turn map to complex GraphQL types. The introspection system falls back to the extendedJSON when encountering fields with the same name and path in multiple objects that contain incompatible types. For example given a collection with these documents:

{ _id: { $oid: "66cf91a0ec1dfb55954378bd" }, value: 1, created: "2024-02-20" }
{ _id: { $oid: "66cf9230ec1dfb55954378be" }, value: 2, created: { $date: "2024-07-03" }, updated: { $date: "2024-10-25" }}
{ _id: { $oid: "66cf9274ec1dfb55954378bf" }, value: 3, created: { $date: "2024-07-03" } }

(Note that when inputting data to MongoDB { $oid: ... } is interpreted as an objectId value, and { $date: ... } is interpreted as a date value.)

The CLI infers these schema types:

  • _id : objectId
  • value : int
  • created : extendedJSON
  • updated : nullable date

Because updated appears in one document, but not in all of them, it gets a nullable type.

Because created has type string in one document, but has type date in the others, it gets the extendendJSON type which is the only type that can represent both string and date.

Subtyping

The connector assumes subtyping to a limited extent: it treats int as a subtype of double. In cases where different documents use both numeric types with the same field name an automatically-generated connector schema will supply the double type.

Queries

The connector supports all query operations; see the query documentation for details.

Mutations

We currently support mutations in the form of Native Mutations. These are arbitrary MongoDB database commands that you write into the data connector configuration.