Tagged in: Configuration

503 Service Unavailable – Spreedly

Incident Report for Spreedly

A mistake was made last night configuring firewalls in our new rack. While these devices are not yet a part of the service, they are connected to the switches, so the changes prevented packets from reaching places they were intended to go. This was an oversight, and we apologize for causing a disruption.

Update on Christmas Issues – Steam

We’d like to follow up with more information regarding Steam’s troubled Christmas.

What happened

On December 25th, a configuration error resulted in some users seeing Steam Store pages generated for other users. Between 11:50 PST and 13:20 PST, store page requests for about 34k users, which contained sensitive personal information, may have been returned and seen by other users.


Steam Outage: How to Monitor Data Center Connectivity – ThousandEyes

This is not an official Valve post mortem; rather, it is an analysis by ThousandEyes, a company that makes network troubleshooting tools.


Data centers are the factories of the Internet, churning out computation and services that customers demand. Apple is currently converting a 1.3M sq ft factory into a data center. And like factories, data centers rely on critical services to stay running. In the case of a data center, the most important inputs are electricity, cooling and network connectivity. Typically, redundant supplies of each input are available should there be a problem. Let’s examine an outage that occurred yesterday to see the importance of monitoring data center ingress, or network connectivity into the data center.


Today’s Outage Post Mortem – Cloudflare

Today's Outage Post Mortem

This morning at 09:47 UTC CloudFlare effectively dropped off the Internet. The outage affected all of CloudFlare’s services including DNS and any services that rely on our web proxy. During the outage, anyone accessing CloudFlare.com or any site on CloudFlare’s network would have received a DNS error. Pings and Traceroutes to CloudFlare’s network resulted in a “No Route to Host” error.
