Outage Post Mortem – Foursquare

Link to Original Report

(Note: this is being posted with Foursquare’s permission.)

As many of you are aware, Foursquare had a significant outage this week. The outage was caused by capacity problems on one of the machines hosting the MongoDB database used for check-ins. This is an account of what happened, why it happened, how it can be prevented, and how 10gen is working to improve MongoDB in light of this outage.

