For those of you out there who check this tiny lil' blog obsessively, looking for the latest newly-published post that I probably started in the mid-2010s and just finished, you might have noticed some ugly 504 Gateway Timeout errors where there ought to have been embedded Flickr slideshows. The deal is that there was a huge Amazon Web Services outage earlier today (though it's apparently back to normal now), which caused a total Flickr outage since they rely heavily on AWS, and that in turn caused a partial outage here, as in no photos for a few hours. No permanent harm done, from what I can tell, and no lost revenue because I didn't have any in the first place.
I only mention all of this because a core best practice in this exciting modern DevOps universe is to maintain a status blog and write a post on it whenever you have an outage, explaining what happened. Everybody says that explaining is very, very important, and explaining things is basically all I do here, so I figured somebody out there might be expecting an official status post or something. Which would go here, because this blog is its own status blog. The reason this is important is not that you necessarily expect customers to understand the explanation, but that they're apparently flattered that you even tried, and then they can repeat the explanation to other people and sound smart. For bonus points, you can make it an apology that doubles as a job posting, as this outage was minimized on your end thanks to some advanced tools you wrote in-house using the latest and trendiest language of the year, and you're thinking about open-sourcing these tools if only you could hire someone as a maintainer. As it so happens, my, uh, monitoring tool was me trying to find a photo to use as a new MS Teams background during an overlong meeting today. Which, on one hand, detected the outage without any annoying pagers going off, though on the other hand it doesn't scale up very well. As for the detailed explanation, Amazon will probably post one eventually here. When that happens, just imagine that statement plus me nodding along sagely to phrases like "Elastic Kubernetes" and "Flux Capacitor", and that's your official status update from here.
This particular outage annoyed me because I like to insist this humble little blog is a tiny one-person operation, and it's just me here puttering around pursuing various weird and eccentric hobbies and whatnot. And I like the idea that the site at least appears to exist outside of capitalism: No ads, nothing for sale, no sponsored guest posts, no affiliate links, nothing. And then an outage comes along and reminds me and everybody else that this is a reverse Wizard of Oz situation, with the twin corporate monoliths of Google and Amazon hiding behind a curtain & operating all of the actual machinery here. In theory I could probably host everything from home except for the embedded maps here, but that would almost certainly be slower and less secure while also costing more, and doing a bunch of system administration at home as a hobby has never been my idea of a good time.