18 Jan 2015

MongoDB; the why

I have heard lots of arguments against MongoDB (NoSQL generally). One wrong assumption that generates this is that MongoDB is a SQL database replacement. It is not. However, there are certain data types and structures that are perfect for a NoSQL database. But even before you jump into making it your choice database for a project, you have to look at your data and be sure it is what you really need.

In MySQL, you can have multiple tables with one-to-many relationship.

articles

id	author_user_id	title	body	date
1	1	MongoDB; the why	I have heard lots of arguments…	2015-01-01 18:01:01

users

id	username	password	avatar
1	kehers	$2a$08$Lg5XF1Tt.X5TGyfb…	u/213649?v=3

comments

id	article_id	user_id	body	date
1	1	3	I disagree. Here is what I think…	2015-01-02 18:01:01

A query to get comments will require a join of the comments and users table:

select u.username, u.avatar, c.body from comments c, users u
where c.article_id={id} and c.user_id=u.id order by c.date desc

This is a luxury you can’t afford in MongoDB. If you will be doing lots of table relationships, MongoDB is not for the project. For simple joins with few tables, say two, you may. But anything more than that will lead to complications.

What MongoDB is great for is a data structure where you have entities that can contain every property they need on their own. Combine that with its great write speed and you get a perfect store for things like logs, events, stats and similar data.

/*
* crawler_logs
*/
{
    title: "HTTP error",
    content: "Failed to access example.com. Curl returns a 400 error.",
    stack: "pinger.js, line 45"
    date: "Sun Jan 11 2015 00:10:16 GMT+0100 (WAT)"
}

/*
* user_events
*/
{
    event: "Login",
    ip: "192.168.2.2",
    user: {
        id: 525e3fb3b218980d35000001,
        username: "opeyemi"
    },
    date: "Sun Jan 11 2015 00:12:42 GMT+0100 (WAT)"
}

As you can see, each of the entries above contain all I need about the entry. I don’t need to pull an additional data from a different collection (what tables are called in MongoDB) to make sense of the entry or make it complete. (I will be representing MongoDB entries as JSON objects because it closely represents how they are stored).

But MongoDB is not only good for logs. Consider a simple contact entity with the properties:

Name
Numbers
Emails

A contact may contain multiple numbers and emails. We can represent this in MySQL with a one-to-many relationship model like this:

contacts

id	name	..
1	Opeyemi O.	..

numbers

id	contact_id	number
1	1	08181019
2	1	08069018

emails

id	contact_id	email
1	1	kehers@gmail.com
2	1	ope@fonebaselabs.com

Well, we can argue it can as well be represented in a single table like this:

contacts

id	name	numebrs	emails
1	Opeyemi O.	08181019,08069018	kehers@gmail.com,ope@fonebaselabs.com

(CRUD operations will not be as easy as with the one-to-many model but find_in_set can really be a helper). However, if we throw in one more contact property - organisations, with title and position columns, a single table with find_in_set won’t save us now. We will have to go back to our one-to-many relationship.

contacts

id	name	..
1	Opeyemi O.	..

numbers

id	contact_id	number
1	1	08181019
2	1	08069018

emails

id	contact_id	email
1	1	kehers@gmail.com
2	1	ope@fonebaselabs.com

organisations

id	contact_id	title	position
1	1	Fonebase Labs	Co-founder
2	1	Life	Learner

There, we have another data structure we can use MongoDB for. With MongoDB, we can represent the model in one collection:

/*
* contacts
*/
{
  name: "Opeyemi O",
  email: ["kehers@gmail.com", "ope@fonebaselabs.com"],
  numbers: ["08181019", "08069018"],
  organisations: [
     { title: "Fonebase Labs", position: "Co-founder"},
     { title: "Life", position: "Learner"}
  ]
}

This simplifies the data structure and makes things a lot easier.

Let’s look at one more instance where using MongoDB makes sense. Imagine you are creating a platform as a service (PaaS) product and users can store data sets or metadata that can be anything - strings, arrays, e.t.c.

User A may have the following data:

mailgun_key: pubkey-02iismi5n5xozcmeyu3-ymqe3f9-0da9
ses_id: AKIAIJAZC3A2OYRSUGVA
ses_key: aXTd5VlLfremkG5UyoB76tnTTo2jB9FrZVywFz

User B may have this:

app_key: pGX1NjfDEUV5i60vvKRjeA
oauth
- token: 16110519-JOpMUsEWcAVSr2ft4jYKrbD2o6K
- secret: B1yH5DPHZ3qHe9y29Ugoa0Dz7iDpWyvuNoMNYJ

And User C this:

title: Callbase
description: Set up a call center in 5 minutes
author: Fonebase labs
keywords: call center, telephony, customer care

Because the expected data from the user is not predefined and the same, using a SQL database for storage will be complex. You can’t just go ahead creating columns for a data structure you have no idea of. With MongoDB however, this fits just perfectly as data down table (collection) rows do not need to have the same columns (name, data type or length).

/*
* metadata
*/
[
  {
    mailgun_key: "pubkey-02iismi5n5xozcmeyu3-ymqe3f9-0da9",
    ses_id: "AKIAIJAZC3A2OYRSUGVA",
    ses_key: "aXTd5VlLfremkG5UyoB76tnTTo2jB9FrZVywFz",
    user: ObjectId(167e3d8d95699af944000001)
  },
  {
    app_key: "pGX1NjfDEUV5i60vvKRjeA",
    oauth: {
      token: "16110519-JOpMUsEWcAVSr2ft4jYKrbD2o6K",
      secret: "B1yH5DPHZ3qHe9y29Ugoa0Dz7iDpWyvuNoMNYJ"
    },
    user: ObjectId(525e3fb3b218980d35000001)
  },
  {
    title: "Callbase",
    description: "Set up a call center in 5 minutes",
    author: "Fonebase labs",
    keywords: ["call center", "telephony", "customer care"],
    user: ObjectId(526fb1fa13d4432c6f000004)
  }
]

As you can see, we can have a single collection (table) with totally different fields (columns).

In conclusion, the key is to understand your data structure and the most efficient way to store it. Where it gets interesting is that you can even use both types of database for a single project. You can store your core data in MySQL and have your logs and events in MongoDB. And sometimes, the project is a small one and all you need is a simple way to store few data like app configuration settings.

My name is Opeyemi Obembe. I build things for web and mobile and write about my experiments. Follow me on Twitter—@kehers.