An article from David Mytton, CEO of Server Density, on how they write their postmortems.
When sufficiently elaborate systems begin to scale, it’s only a matter of time before some sort of failure happens.
A post by John Allspaw on how they create blameless postmortems at Etsy.
So: failure happens. This is a foregone conclusion when working with complex systems. But what about those failures that have resulted due to the actions (or lack of action, in some cases) of individuals? What do you do with those careless humans who caused everyone to have a bad day?