(Note: this is being posted with Foursquare’s permission.)
As many of you are aware, Foursquare had a significant outage this
week. The outage was caused by capacity problems on one of the
machines hosting the MongoDB database used for check-ins. This is an account of what happened, why it happened, how it can be prevented, and how 10gen is working to improve MongoDB in light of this outage.