Black Friday is back, and, as you've probably already noticed, with a considerable vengeance. According to Adobe Analytics data, online spending is predicted to hit $3.7 billion over this holiday season in the U.S, up from $2.9 billion in 2017. But while consumers clamour for deals and businesses reap the rewards, it's important to remember there's a largely hidden plane of software engineering labour.
Without this army of developers, consumers will most likely be hitting their devices in frustration, while business leaders will be missing tough revenue targets - so, as we enter into Black Friday let's pour one out for all those engineers on call and trying their best to keep eCommerce sites on their feet.
Of course, the pain that hits on days like Black Friday and Cyber Monday can be minimised with smart planning and effective decision making long before those sales begin. However, for engineering teams under-resourced and lacking the right tools, that is simply impossible.
This means that software engineers are left in a position where they're treading water, knowing that they're going to be sinking once those big days come around.
It doesn't have to be like this. With smarter leadership and, indeed, more respect for the intensive work engineers put in to make websites and apps actually work, revenue driving platforms can become more secure, resilient and stable.
This is the central argument of chaos engineering platform Gremlin, who we've covered a number of times this year. To coincide with Black Friday the team has put together what they believe is the 'true cost of downtime'. On the one hand this is a good marketing hook for their chaos engineering platform, but, cynicism aside, it's also a good explanation of why the principles of chaos engineering can be so valuable from both a business and developer perspective.
Estimating the annual revenue of some of the biggest companies in the world, Gremlin has been then created an interactive table to demonstrate what the cost of downtime for each of those businesses would be, for the length of time you are on the page.
For 20 minutes downtime, Amazon.com would have lost a staggering $4.4 million. For Walgreens it's more than $80,000.
Gremlin provide some context to all this, saying:
"Enterprise commerce businesses typically rely on a complex microservices architecture, from fulfillment, to website security, ability to scale with holiday traffic, and payment processing - there is a lot that can go wrong and impact revenue, damage customer trust, and consume engineering time. If an ecommerce site isn’t 100% online and performant, it’s losing revenue."
"The holiday season is especially demanding for SREs working in ecommerce. Even the most skilled engineering teams can struggle to keep up with the demands of peak holiday traffic (i.e. Black Friday and Cyber Monday). Just going down for a few seconds can mean thousands in lost revenue, but for some sites, downtime can be exponentially more expensive."
For Gremlin, chaos engineering is clearly the answer to many of the problems days like Black Friday poses. While it might not work for every single organization, it's nevertheless true that failing to pay attention to the value of your applications and websites at an hour by hour level could be incredibly damaging.
With outages on Facebook, WhatsApp, and Instagram happening earlier this week, these problems aren't hidden away - they're in full view of the public. What does remain hidden, however, is the work and stress that goes in to tackling these issues and ensuring things are working as they should be.
Perhaps it's time to start learning the lessons of Black Friday - business revenues will be that little bit healthier, but engineers will also be that little bit happier.