I have mentioned MongoDB in some of my previous blog posts. From those posts, you may know that one of the differences between MongoDB (or NoSQL databases in general) and traditional relational (or SQL) databases is that MongoDB is non-relational. In other words, MongoDB does not enforce relations between entities in the database, whereas relational databases (like MySQL or PostgreSQL) can enforce relations between tables. But, as you will see in this blog post, it is still possible to implement relations in MongoDB. So, how do you create relations in MongoDB?
Different approaches
MongoDB provides a lot of flexibility when it comes to the structure of the data that’s stored in it. As a result of this flexibility, there is more than one approach you can take to implement relations in MongoDB. I will list two approaches in this post. For each of them, I will talk about the advantages and disadvantages. I’ll also touch on when it makes more sense to use one over the other.
Before I get into the different approaches, let us quickly review how data is stored in MongoDB. A table, or entity, in MongoDB is called a collection. Each collection has a set of documents stored in it. Each document is stored in a JSON-like format (BSON, a binary representation of JSON). Below is an example of what an employee document may look like.
{
  "_id": ObjectId("5bf41179997910230e41153e"),
  "firstName": "John",
  "lastName": "Doe",
  "active": true,
  "dateCreated": "2020-11-26"
}
As you can see from the above, documents look like JavaScript objects, where each field is a key-value pair.
Embedding
The first approach to creating a relation in MongoDB is to embed a document inside another. Keeping the same employee document as an example, if we wanted to add department information to it, we could embed a department document inside the employee document. The result would look like this:
{
  "_id": ObjectId("5bf41179997910230e41153e"),
  "firstName": "John",
  "lastName": "Doe",
  "active": true,
  "dateCreated": "2020-11-26",
  "department": {
    "departmentName": "IT",
    "departmentManager": "Jane Smith",
    "departmentLocation": "6th Floor"
  }
}
Here, the relation between the employee entity and the department entity is expressed by embedding the department document inside the employee document.
Advantages
The biggest advantage of this approach is that it is very easy to implement. If you wanted to retrieve an employee record and get information about their department, you could do this with a very simple query. Retrieving the record itself gives you everything you need about the department as well. There’s no need to join or look up data from any other entity.
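To make the single-read idea concrete, here is a minimal sketch that uses a plain JavaScript array as a stand-in for the employees collection. Nothing here talks to a real database, and the helper `findEmployeeByLastName` is a hypothetical name of my own. In the MongoDB shell, the equivalent read would be roughly `db.employees.findOne({ lastName: "Doe" })`.

```javascript
// In-memory stand-in for an employees collection (assumption: no real
// MongoDB connection; _id is a plain string instead of an ObjectId).
const employees = [
  {
    _id: "5bf41179997910230e41153e",
    firstName: "John",
    lastName: "Doe",
    active: true,
    dateCreated: "2020-11-26",
    department: {
      departmentName: "IT",
      departmentManager: "Jane Smith",
      departmentLocation: "6th Floor",
    },
  },
];

// Hypothetical helper: return the first employee with the given last name,
// or null when there is no match.
function findEmployeeByLastName(collection, lastName) {
  return collection.find((doc) => doc.lastName === lastName) || null;
}

// One read returns the employee and the embedded department together.
const john = findEmployeeByLastName(employees, "Doe");
console.log(john.firstName, "reports to", john.department.departmentManager);
```

The department fields come back as part of the employee document, so no second query is involved.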
Disadvantages
A quick glance at the example above makes the biggest disadvantage evident. If the manager of the department changes (or any other field in the department entity, for that matter), then you will have to update that information in every employee document that embeds it. This will quickly become unmanageable as the number of documents gets larger.
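The fan-out of that update can be sketched with the same kind of in-memory stand-in. The extra employees (Ann Lee, Raj Patel), the HR department, and the new manager name are made-up sample data, and `changeDepartmentManager` is a hypothetical helper. Against a real collection, a comparable update would be roughly `db.employees.updateMany({ "department.departmentName": "IT" }, { $set: { "department.departmentManager": "Maria Garcia" } })`.

```javascript
// In-memory stand-in for an employees collection with embedded departments.
// All names besides John Doe and Jane Smith are made-up sample data.
const employees = [
  { firstName: "John", lastName: "Doe", department: { departmentName: "IT", departmentManager: "Jane Smith" } },
  { firstName: "Ann", lastName: "Lee", department: { departmentName: "IT", departmentManager: "Jane Smith" } },
  { firstName: "Raj", lastName: "Patel", department: { departmentName: "HR", departmentManager: "Sam Brown" } },
];

// Hypothetical helper: rewrite the manager in every employee document that
// embeds the given department. Returns how many documents were touched.
function changeDepartmentManager(collection, departmentName, newManager) {
  let modified = 0;
  for (const doc of collection) {
    if (doc.department && doc.department.departmentName === departmentName) {
      doc.department.departmentManager = newManager;
      modified += 1;
    }
  }
  return modified;
}

// A single logical change touches one document per employee in the department.
const modifiedCount = changeDepartmentManager(employees, "IT", "Maria Garcia");
console.log("documents modified:", modifiedCount);
```

The number of writes grows with the number of employees in the department, which is exactly the cost described above.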
When to use
I would recommend using the embedding approach only if the number of documents is small, and the data does not change often, or at all. Otherwise, updating data in the embedded documents becomes unmanageable.
Referencing
An alternative approach to implementing relations in MongoDB is to reference a document from another document. This approach is similar to having a foreign key in a relational database. To take the same employee-department example, if we wanted to use references, the employee document would look like this:
{
  "_id": ObjectId("5bf41179997910230e41153e"),
  "firstName": "John",
  "lastName": "Doe",
  "active": true,
  "department": ObjectId("5fd930b2ca2d7a28b214892d"),
  "dateCreated": "2020-11-26"
}
Using this approach, the department field is a reference to a document in another collection (the departments collection).
Advantages
The biggest advantage to using this approach is that updating anything in the departments collection does not require an update in the employees collection. The changes would be reflected in the employee record, because it is pointing to the department record.
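Here is a small in-memory sketch of that behavior. I am assuming the department document carries the same fields as in the embedding example, and that ids are plain strings rather than ObjectId values. The second employee, her id, the new manager name, and the helper `resolveDepartment` are made up for illustration.

```javascript
// In-memory stand-ins for the two collections (no real MongoDB connection).
const departments = [
  {
    _id: "5fd930b2ca2d7a28b214892d",
    departmentName: "IT",
    departmentManager: "Jane Smith",
    departmentLocation: "6th Floor",
  },
];

const employees = [
  { _id: "5bf41179997910230e41153e", firstName: "John", lastName: "Doe", department: "5fd930b2ca2d7a28b214892d" },
  { _id: "sample-employee-2", firstName: "Ann", lastName: "Lee", department: "5fd930b2ca2d7a28b214892d" },
];

// Hypothetical helper: follow an employee's reference to its department.
function resolveDepartment(employee) {
  return departments.find((d) => d._id === employee.department) || null;
}

// The manager changes: exactly one document is written.
departments[0].departmentManager = "Maria Garcia";

// Every employee that references the department sees the new value.
const managers = employees.map((e) => resolveDepartment(e).departmentManager);
console.log(managers);
```

No employee document is modified here; each one still holds only the id of the department.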
Disadvantages
The only disadvantage here is that if you wanted to retrieve an employee record along with information about the department, you would have to perform a lookup against the departments collection. This lookup adds overhead that can be expensive in terms of performance.
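The extra work of a lookup can be sketched in plain JavaScript as an in-memory join. This is only an illustration of the idea, not the MongoDB API: `lookupDepartments` and the output field name `departmentInfo` are my own hypothetical choices, and the sample data is made up apart from John Doe. In MongoDB itself, this kind of join is usually done with the `$lookup` aggregation stage (with `from`, `localField`, `foreignField` and `as` options), which places the matches in an array field rather than a single object.

```javascript
// In-memory stand-ins for the two collections (no real MongoDB connection).
const departments = [
  { _id: "5fd930b2ca2d7a28b214892d", departmentName: "IT", departmentManager: "Jane Smith" },
];

const employees = [
  { firstName: "John", lastName: "Doe", department: "5fd930b2ca2d7a28b214892d" },
  { firstName: "Raj", lastName: "Patel", department: "unknown-department-id" },
];

// Hypothetical helper: attach the referenced department to each employee.
// Employees whose reference matches nothing get departmentInfo: null.
function lookupDepartments(employeeDocs, departmentDocs) {
  // Index departments by _id so each employee costs one map access.
  const byId = new Map(departmentDocs.map((d) => [d._id, d]));
  return employeeDocs.map((e) => ({
    ...e,
    departmentInfo: byId.get(e.department) || null,
  }));
}

const joined = lookupDepartments(employees, departments);
console.log(joined[0].departmentInfo.departmentName);
```

Compared with the embedded version, there is a second collection to read and a matching step per employee, which is where the overhead comes from.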
When to use
I recommend using this approach for most scenarios. Even though looking up the referenced records is more expensive (performance-wise) than having them embedded, in most cases this cost is negligible compared to the cost of having to update all the embedded records.
Conclusion
Even though MongoDB does not enforce relations, you can create relations in MongoDB by using one of two approaches. Embedding a document inside another document is one way. It is easy to implement, and does not have the overhead of looking up the related documents when retrieving. I recommend using this approach in simple scenarios where the data does not change often. The other approach is to reference the related documents. With referencing, you can modify one collection without having to modify the related documents. The lookup overhead is justified in cases where you update the data frequently.
If you have any questions, please leave a comment here, or use the contact page.