I spent the weekend reading over all the conversations on Hacker News (effectively summarized in this post) about how MongoDB had failed this one company, and all the follow-on conversation about whether or not to use Mongo as a database solution. Fortunately there are often some very rational people in the comments on Hacker News, and I feel like there is one key takeaway: developers and engineers will always be overcoming challenges no matter what system they select. Each system comes with its own set of problems; some developers just like bitching about those problems more than others :)
Unfortunately for most developers, your application will never get to the size where picking one database over the other will even matter. At Holler.com I went with MongoDB because a couple of friends recommended it and it was relatively easy to code for. I also like its approach to structuring data as opposed to the standard MySQL way that I’ve been using for 10 years. Will it have problems at some point? I hope so! I also hope that we have a team in place who can react to and appropriately resolve the problems (and hopefully prevent them before they occur).
If you decide to select a data store that is not yet proven, there’s a good chance you are going to face some serious hurdles as your system expands. The biggest problem you’ll face is the lack of documentation, since not as many people have overcome the hurdles of scaling with some of the newer NoSQL solutions. Five years ago developers started working out how to effectively scale MySQL, and now it has become common knowledge: how to shard, use master/slave setups, etc. There are even books on the topic.
The problem for many developers, and pretty much all companies, is that there is typically a lack of resources to tackle a problem. When the developer ignores the warning signs that things will eventually fail because the business wants to keep adding features (which sounds like what happened with the anonymous MongoDB hater this weekend), it’s pretty much guaranteed that problems will ensue. The data gets corrupted, servers go down, all hell breaks loose :) It’s really the job of intelligent engineers to figure out how to solve these problems and, more importantly, how to avoid as many of them as possible.
Even the most intelligently designed systems fail at some point. The key is figuring out how to keep those failures from bringing entire systems down. Fortunately most people programming with MongoDB are not developing commercial aircraft, and having a DB get corrupted is not really the end of the world (or somebody’s life). There’s a blog post by Netflix that has been re-shared numerous times over the years because of their strategy of killing systems at random, and it illustrates a great engineering solution to less reliable systems:
We’ve sometimes referred to the Netflix software architecture in AWS as our Rambo Architecture. Each system has to be able to succeed, no matter what, even all on its own. We’re designing each distributed system to expect and tolerate failure from other systems on which it depends.
If our recommendations system is down, we degrade the quality of our responses to our customers, but we still respond. We’ll show popular titles instead of personalized picks. If our search system is intolerably slow, streaming should still work perfectly fine.
One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
Yes, systems break, but in theory, the whole system won’t go down. Their engineers have tested the systems and have figured out how to build reliability on top of an unreliable platform. At the end of the day it’s not about the database you chose alone, but about effectively building a reliable system with the technologies you happened to choose to accomplish the job.
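To make the idea in the Netflix quote a bit more concrete, here is a minimal Python sketch of "degrade, but still respond", with a tiny chaos-style wrapper to exercise the fallback. Everything in it is my own illustration and not Netflix’s code: the names `personalized_picks`, `with_chaos`, `titles_for`, `ServiceDown` and the `POPULAR_TITLES` list are all hypothetical.

```python
import random
from typing import Callable, List

# Hypothetical static fallback, shown when personalization is unavailable.
POPULAR_TITLES: List[str] = ["Popular A", "Popular B", "Popular C"]


class ServiceDown(Exception):
    """Raised when a dependency is unavailable (hypothetical error type)."""


def personalized_picks(user_id: str) -> List[str]:
    # Stand-in for a real recommendations service call.
    return [f"{user_id}-pick-{i}" for i in range(1, 4)]


def with_chaos(
    service: Callable[[str], List[str]],
    failure_rate: float,
    rng: random.Random,
) -> Callable[[str], List[str]]:
    """Wrap a service so it randomly fails, loosely in the spirit of Chaos Monkey."""

    def wrapped(user_id: str) -> List[str]:
        # rng.random() is in [0.0, 1.0), so 1.0 always fails and 0.0 never does.
        if rng.random() < failure_rate:
            raise ServiceDown("killed by chaos wrapper")
        return service(user_id)

    return wrapped


def titles_for(user_id: str, recommender: Callable[[str], List[str]]) -> List[str]:
    """Return personalized picks, degrading to popular titles on failure."""
    try:
        return recommender(user_id)
    except ServiceDown:
        # Lower-quality response, but the user still gets something.
        return POPULAR_TITLES


if __name__ == "__main__":
    flaky = with_chaos(personalized_picks, failure_rate=0.5, rng=random.Random(42))
    for _ in range(5):
        print(titles_for("alice", flaky))
```

The point is less the code than the habit: if the failure path is being triggered all the time in testing, you find out whether the fallback actually works before a real outage does it for you.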
In my personal experience, MongoDB has been great, and using a service like MongoHQ has meant that I can focus less on operations and spend my time building our app. While our system isn’t 100% fault tolerant since I’m the only person programming right now, offloading much of the operations to outside services has made my life a whole lot easier. When the time comes to solve the problems of scaling, I will be ready to tackle them along with any other engineers who are on our team at that time!
As for which database you select for your own startup, my own theory is that picking one with a sizable community is always the way to go. That way, whenever you hit a challenge, there are people available who have probably faced the same one. MongoDB happens to have a very active community, which is why adoption continues to grow. MySQL and other SQL solutions continue to have large communities as well.
No matter what you build though, you’re definitely going to run into problems just like those who have been complaining about MongoDB. With newer technologies you’ll probably face more problems but it doesn’t mean they’re bad solutions. It’s your job to work around them and help make your entire system more reliable!!