MongoDB Driver Specifications

The modern MongoDB driver consists of a number of components, each of which are thoroughly documented in this repository. Though this information is readily available and extremely helpful, what it lacks is a high level overview to tie the specs together into a cohesive picture of what a MongoDB driver is.

Architecturally an implicit hierarchy exists within the drivers, so expressing drivers in terms of an onion model feels appropriate.

Layers of the Onion

The "drivers onion" is meant to represent how various concepts, components and APIs can be layered atop each other to build a MongoDB driver from the ground up, or to help understand how existing drivers have been structured. Hopefully this representation of MongoDB’s drivers helps provide some clarity, as the complexity of these libraries - like the onion above - could otherwise bring you to tears.

Serialization

At their lowest level all MongoDB drivers will need to know how to work with BSON. BSON (short for "Binary JSON") is a bin­ary-en­coded serialization of JSON-like documents, and like JSON, it sup­ports the nesting of arrays and documents. BSON also contains extensions that al­low representation of data types that are not part of the JSON spec.

Specifications: BSON, ObjectId, Decimal128, UUID, DBRef, Extended JSON

Communication

Once BSON documents can be created and manipulated, the foundation for interacting with a MongoDB host process has been laid. Drivers communicate by sending database commands as serialized BSON documents using MongoDB’s wire protocol.

From the provided connection string and options a socket connection is established to a host, which an initial handshake verifies is in fact a valid MongoDB connection by sending a simple hello. Based on the response to this first command a driver can continue to establish and authenticate connections.

Specifications: OP_MSG, Command Execution, Connection String, URI Options, OCSP, Initial Handshake, Wire Compression, SOCKS5, Initial DNS Seedlist Discovery

Connectivity

Now that a valid host has been found, the cluster’s topology can be discovered and monitoring connections can be established. Connection pools can then be created and populated with connections. The monitoring connections will subsequently be used for ensuring operations are routed to available hosts, or hosts that meet certain criteria (such as a configured read preference or acceptable latency window).

Specifications: SDAM, CMAP, Load Balancer Support

Authentication

Establishing and monitoring connections to MongoDB ensures they’re available, but MongoDB server processes typically will require the connection to be authenticated before commands will be accepted. MongoDB offers many authentication mechanisms such as SCRAM, x.509, Kerberos, LDAP, OpenID Connect and AWS IAM, which MongoDB drivers support using the Simple Authentication and Security Layer (SASL) framework.

Specifications: Authentication

Availability

All client operations will be serialized as BSON and sent to MongoDB over a connection that will first be checked out of a connection pool. Various monitoring processes exist to ensure a driver’s internal state machine contains an accurate view of the cluster’s topology so that read and write requests can always be appropriately routed according to MongoDB’s server selection algorithm.

Specifications: Server Monitoring, SRV Polling for mongos Discovery, Server Selection, Max Staleness

Resilience

At their core, database drivers are client libraries meant to facilitate interactions between an application and the database. MongoDB’s drivers are no different in that regard, as they abstract away the underlying serialization, communication, connectivity, and availability functions required to programmatically interact with your data.

To further enhance the developer experience while working with MongoDB, various resilience features can be added based on logical sessions such as retryable writes, causal consistency, and transactions.

Specifications: Retryability (Reads, Writes), CSOT, Consistency (Sessions, Causal Consistency, Snapshot Reads, Transactions, Convenient Transactions API)

Programmability

Now that we can serialize commands and send them over the wire through an authenticated connection we can begin actually manipulating data. Since all database interactions are in the form of commands, if we wanted to remove a single document we might issue a delete command such as the following:

db.runCommand(
  {
     delete: "orders",
     deletes: [ { q: { status: "D" }, limit: 0 } ]
  }
)

Though not exceedingly complex, a better developer experience can be achieved through more single-purpose APIs. This would allow the above example to be expressed as:

db.orders.deleteMany({ status: "D" })

To provide a cleaner and clearer developer experience, many specifications exist to describe how these APIs should be consistently presented across driver implementations, while still providing the flexibility to make APIs more idiomatic for each language.

Advanced security features such as client-side field level encryption are also defined at this layer.

Specifications: Resource Management (Databases, Collections, Indexes), Data Management (CRUD, Collation, Write Commands, Bulk API, Bulk Write, R/W Concern), Cursors (Change Streams, find/getMore/killCursors), GridFS, Stable API, Security (Client Side Encryption, BSON Binary Subtype 6)

Observability

With database commands being serialized and sent to MongoDB servers and responses being received and deserialized, our driver can be considered fully functional for most read and write operations. As MongoDB drivers abstract away most of the complexity involved with creating and maintaining the connections these commands will be sent over, providing mechanisms for introspection into a driver’s functionality can provide developers with added confidence that things are working as expected.

The inner workings of connection pools, connection lifecycle, server monitoring, topology changes, command execution and other driver components are exposed by means of events developers can register listeners to capture. This can be an invaluable troubleshooting tool and can help facilitate monitoring the health of an application.

const { MongoClient, BSON: { EJSON } } = require('mongodb');

function debugPrint(label, event) {
 console.log(`${label}: ${EJSON.stringify(event)}`);
}

async function main() {
 const client = new MongoClient("mongodb://localhost:27017", { monitorCommands: true });
 client.on('commandStarted', (event) => debugPrint('commandStarted', event));
 client.on('connectionCheckedOut', (event) => debugPrint('connectionCheckedOut', event));
 await client.connect();
 const coll = client.db("test").collection("foo");
 const result = await coll.findOne();
 client.close();
}
main();

Given the example above (using the Node.js driver) the specified connection events and command events would be logged as they’re emitted by the driver:

connectionCheckedOut: {"time":{"$date":"2024-05-17T15:18:18.589Z"},"address":"localhost:27018","name":"connectionCheckedOut","connectionId":1}
commandStarted: {"name":"commandStarted","address":"127.0.0.1:27018","connectionId":1,"serviceId":null,"requestId":5,"databaseName":"test","commandName":"find","command":{"find":"foo","filter":{},"limit":1,"singleBatch":true,"batchSize":1,"lsid":{"id":{"$binary":{"base64":"4B1kOPCGRUe/641MKhGT4Q==","subType":"04"}}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1715959097,"i":1}},"signature":{"hash":{"$binary":"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"00"}},"keyId":0}},"$db":"test"},"serverConnectionId":140}

The preferred method of observing internal behavior would be through standardized logging once it is available in all drivers (DRIVERS-1204), however until that time only event logging is consistently available. In the future additional observability tooling such as Open Telemetry support may also be introduced.

Specifications: Command Logging and Monitoring, SDAM Logging and Monitoring, Standardized Logging, Connection Pool Logging

Testability

Ensuring existing as well as net-new drivers can be effectively tested for correctness and performance, most specifications define a standard set of tests using YAML tests to improve driver conformance. This allows specification authors and maintainers to describe functionality once with the confidence that the tests can be executed alike by language-specific test runners across all drivers.

Though the unified test format greatly simplifies language-specific implementations, not all tests can be represented in this fashion. In those cases the specifications may describe tests to be manually implemented as prose. By limiting the number of prose tests that each driver must implement, engineers can deliver functionality with greater confidence while also minimizing the burden of upstream verification.

Specifications: Unified Test Format, Atlas Data Federation Testing, Performance Benchmarking, BSON Corpus, Replication Event Resilience, FAAS Automated Testing, Atlas Serverless Testing

Conclusion

Most (if not all) the information required to build a new driver or maintain existing drivers technically exists within the specifications, however without a mental mode of their composition and architecture it can be extremely challenging to know where to look.

Peeling the "drivers onion" should hopefully make reasoning about them a little easier, especially with the understanding that everything can be tested to validate individual implementations are "up to spec".