Data Modeling
The MongoDB data connector makes available any database resource that is listed in its configuration. In order to
populate the configuration, the connector supports introspecting the database via the CLI update
command.
Queryable collections
The MongoDB data connector supports several types of queryable collections:
- Collections
- Views
- Native Queries
Collections and views are typically discovered automatically during the introspection process.
Native queries on the other hand are custom aggregation pipelines that you specify in the data connector configuration.
These are queries run using the MongoDB aggregate
command which makes them similar to views, except that you may
configure parameters that are interpolated into the pipeline at query time (similar to a function), and they do not
require any DDL permissions on the database. See the configuration reference on
Native Queries.
Scalar types
The MongoDB data connector supports nearly all of scalar types that the MongoDB database does. In addition the MongoDB
data connector provides an additional scalar type, extendedJSON
which acts like an "any" type.
MongoDB's scalar types are listed in MongoDB's BSON Types documentation. The MongoDB data connector configuration uses "alias" names for each type which are camel-cased with a lower-case initial letter. The same names are used in the connector's NDC schema.
In order to use standard GraphQL scalar types rather than custom scalar types in the resulting GraphQL schema, the DDN configuration enables expressing a mapping between data connector types and GraphQL types.
The data connector does not support the dbPointer
BSON type which is deprecated in MongoDB. Other deprecated types,
symbol
and undefined
, are supported for the time being.
extendedJSON
may represent any MongoDB value including scalars, objects, arrays, or null. The automatically-generated
schema in your MongoDB data connector configuration may include fields with the extendedJSON
type in cases where the
introspection system is unable to infer a specific type. The name is based on the serialization format used: unlike
other values, values with the extendedJSON
type are serialized in the
MongoDB Extended JSON format.
Structural types
The data connector supports MongoDB array and object types, and they will appear as such in the schema it exposes. Since MongoDB is a document store that is intended to store arbitrarily complex data structures, the MongoDB data connector is designed to support the same.
As mentioned previously, although extendedJSON
is presented as a scalar type from the perspective of GraphQL it may
represent arbitrary structural data encoded using the MongoDB Extended JSON format.
Schema Introspection
The CLI is capable of automatically creating a configuration for the MongoDB data connector which includes a schema for
your database. Because MongoDB databases do not have an internal schema, the connector schema is produced by randomly
sampling database documents. By default it samples 100 documents - you may get more accurate results by increasing this
number which you can do by editing the sampleSize
option in configuration.json
.
During sampling types of values found are compared between documents. This includes top-level fields, but also nested object fields, and array elements. For a field path where the same type of value is found in every document the schema sets that type. If a value at that path is missing in some documents a nullable type is assumed.
In general automatic schema introspection when encountering arrays or objects will accurately reflect the complex
structure seen in the database. These types will in turn map to complex GraphQL types. The introspection system falls
back to the extendedJSON
when encountering fields with the same name and path in multiple objects that contain
incompatible types. For example given a collection with these documents:
{ _id: { $oid: "66cf91a0ec1dfb55954378bd" }, value: 1, created: "2024-02-20" }
{ _id: { $oid: "66cf9230ec1dfb55954378be" }, value: 2, created: { $date: "2024-07-03" }, updated: { $date: "2024-10-25" }}
{ _id: { $oid: "66cf9274ec1dfb55954378bf" }, value: 3, created: { $date: "2024-07-03" } }
(Note that when inputting data to MongoDB { $oid: ... }
is interpreted as an objectId
value, and { $date: ... }
is
interpreted as a date
value.)
The CLI infers these schema types:
_id
:objectId
value
:int
created
:extendedJSON
updated
: nullabledate
Because updated
appears in one document, but not in all of them, it gets a nullable type.
Because created
has type string
in one document, but has type date
in the others, it gets the extendendJSON
type
which is the only type that can represent both string
and date
.
The connector assumes subtyping to a limited extent: it treats int
as a subtype of double
. In cases where different
documents use both numeric types with the same field name an automatically-generated connector schema will supply the
double
type.
Queries
The connector supports all query operations; see the query documentation for details.
Mutations
We currently support mutations in the form of Native Mutations. These are arbitrary MongoDB database commands that you write into the data connector configuration.