Black Friday/Cyber Monday 2017 – Lessons Learned from Years Past

History has shown that no retail ecommerce site, no matter how prominent, is immune to performance (speed, availability) problems under the weight of holiday web traffic. In 2016, it was Macy’s that had problems – starting on Black Friday morning and extending throughout most of the day. Shoppers attempting to access both their desktop and mobile sites were met with a “temporary shopping jam” page.  In 2015, it was Neiman Marcus, which experienced major outages very early on Black Friday morning followed by intermittent outages on Saturday; the site did not stabilize until Sunday morning.

When these outages happen, we are often asked, “What happened?” In many instances it’s impossible to know the root cause – only the IT teams know that. But we do know that weak links under heavy load are often the culprits, and these can be anything from faulty infrastructure to overlooking very basic web optimization techniques. Here, we’ll explore five common “weak links” from past holiday seasons (as well as other recent peak periods) and how ecommerce organizations can avoid them.

External Third Parties:

Third-party elements – upon which most ecommerce sites depend – represent a huge point of vulnerability. These can be everything from external third-party infrastructure (like the cloud and CDNs) to must-have services (like marketing analytics tags) to “nice to have” features like social media plug-ins. During times of peak traffic, the load on these services grows sharply because each one supports hundreds of ecommerce sites, and if one of them crashes, it can take down all the sites depending on it. One example from last year is Williams-Sonoma, which experienced ongoing problems starting around 9 a.m. EST on Black Friday that were directly attributable to a third-party photo display service. During this time both the desktop and mobile versions of the site experienced load time spikes of upwards of 25 seconds, slow by any standard.

The golden rule for third-party use during the holidays is to go light and use only those services that are truly needed. Once deployed, these services need to be measured and monitored constantly for potential problems, and organizations must always have contingency plans in place (for example, removing a poorly performing component if problems are detected).
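As a rough illustration of such a contingency plan, the Python sketch below flags third-party components whose recent median load time exceeds a budget, so they can be switched off. The component names, the 2-second budget, and the minimum sample count are all assumptions made for the example, not recommendations.

```python
from statistics import median

# Hypothetical values chosen for illustration only.
LOAD_TIME_BUDGET_SECONDS = 2.0
MIN_SAMPLES = 5  # avoid reacting to one or two slow measurements


def components_to_disable(samples_by_component, budget=LOAD_TIME_BUDGET_SECONDS):
    """Return names of third-party components whose median load time is over budget.

    samples_by_component maps a component name to a list of recent
    load-time measurements in seconds.
    """
    over_budget = []
    for name, samples in samples_by_component.items():
        if len(samples) < MIN_SAMPLES:
            continue  # not enough data to make a decision
        if median(samples) > budget:
            over_budget.append(name)
    return sorted(over_budget)


if __name__ == "__main__":
    recent = {
        "photo-display": [0.8, 12.0, 25.0, 24.0, 19.5, 22.1],
        "analytics-tag": [0.3, 0.4, 0.3, 0.5, 0.4],
        "social-plugin": [1.1, 9.0],  # too few samples to judge
    }
    print(components_to_disable(recent))
```

Using the median rather than the mean keeps a single outlier from triggering removal; a real deployment would feed this from its monitoring data and wire the result to a feature flag or tag manager.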

Regional outages are common:

Ecommerce sites often use synthetic monitoring, or “dummy” requests generated from the cloud, to ensure websites and mobile sites are available and downloading at an acceptable speed. The problem is that performance can vary significantly from geography to geography (as a result of varying elements like local and regional ISPs), and many ecommerce sites do not monitor from a wide enough set of vantage points or at sufficiently frequent intervals. This means they may miss localized performance problems.

An example from last year was Walmart, which experienced problems on its desktop site due to an ad tech provider, ultimately leading to ongoing blackouts in Phoenix, starting early in the long holiday weekend. Localized performance issues can be a huge problem if key geographies are impacted, and most ecommerce sites won’t want to take that chance.
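To make the point about vantage points concrete, here is a minimal Python sketch that groups synthetic check results by region and flags any region whose availability falls below a threshold. The region labels and the 95% threshold are assumptions for the example; real monitoring tools expose this data in their own formats.

```python
from collections import defaultdict


def availability_by_region(results):
    """Compute the fraction of successful checks per region.

    results is an iterable of (region, succeeded) tuples, one per synthetic check.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for region, succeeded in results:
        totals[region] += 1
        if succeeded:
            successes[region] += 1
    return {region: successes[region] / totals[region] for region in totals}


def degraded_regions(results, threshold=0.95):
    """Return the regions whose availability is below the (assumed) threshold."""
    per_region = availability_by_region(results)
    return sorted(region for region, value in per_region.items() if value < threshold)


if __name__ == "__main__":
    # Hypothetical check results: one region is failing while the others are healthy.
    checks = (
        [("new-york", True)] * 20
        + [("chicago", True)] * 20
        + [("phoenix", True)] * 8
        + [("phoenix", False)] * 12
    )
    print(degraded_regions(checks))
```

A site-wide average over these checks would still look moderately healthy, which is why the breakdown per region matters.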

Think beyond your homepage:

Measuring and monitoring homepage speed is absolutely necessary, but it’s not enough. What about the key landing pages and conversion paths visitors migrate to once they enter a site? During Amazon Prime Day last year, Amazon did a fantastic job of delivering exceptional homepage performance, even under loads estimated to be 20 times the norm. Amazon actually delivered faster mobile site performance than major competitors that were not having sales that day, and should be commended for this. However, certain areas of Amazon.com did experience problems, namely search, with search processes slowing to 14 seconds at certain points throughout the day.

Well before the holidays, ecommerce sites should identify their most important landing pages and conversion paths and optimize these for performance under load. Real-user monitoring, which complements synthetic monitoring and identifies what users actually do once they enter the site, can help narrow down areas for prioritization as well as demonstrate how performance changes impact user behavior, positively or negatively.
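One possible way to use real-user data for prioritization is sketched below in Python: it ranks pages by their 95th-percentile load time. The page paths, the sample values, and the choice of the 95th percentile are assumptions for illustration; a real analysis would likely also weigh traffic volume and conversion value.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer arithmetic avoids floating-point rounding in the rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def slowest_pages(samples_by_page, pct=95, top=3):
    """Return up to `top` (page, percentile_load_time) pairs, slowest first.

    samples_by_page maps a page path to load times in seconds
    collected from real users.
    """
    ranked = sorted(
        ((percentile(samples, pct), page)
         for page, samples in samples_by_page.items() if samples),
        reverse=True,
    )
    return [(page, value) for value, page in ranked[:top]]


if __name__ == "__main__":
    # Hypothetical real-user samples in seconds.
    rum_samples = {
        "/": [1.0, 1.2, 0.9, 1.1],
        "/search": [2.0, 14.0, 3.0, 2.5],
        "/checkout": [1.5, 1.7, 1.6, 1.4],
    }
    for page, value in slowest_pages(rum_samples, top=2):
        print(page, value)
```

Ranking on a high percentile rather than the average surfaces pages that are slow for a meaningful minority of visitors even when the typical experience looks fine.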

Monitor APIs:

APIs are a fundamental part of the internet fabric, enabling services that may otherwise not be possible. Since API-dependent processes often support customer-facing, revenue-generating (and therefore mission-critical) applications, API monitoring is an absolute must. Like external third-party services, popular APIs can come under major strain during peak traffic periods, slowing down critical processes or, worse yet, breaking altogether.

During Black Friday/Cyber Monday 2015, PayPal, the popular payment processing service, experienced availability issues, with availability dipping to around 30% on Sunday morning, followed by a dip to around 80% for a short time on Monday morning. Translation: roughly 70% of people trying to use the PayPal function during Sunday’s incident could not, and roughly 20% could not during Monday’s incident.

Sometimes APIs are so critical that ecommerce sites can’t simply remove them. But at least sites depending on APIs can measure their performance, validate SLAs and get ahead of any problems through proactive customer communication and collaboration with the API vendor.
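The availability arithmetic and the SLA check can be sketched in a few lines of Python. The 99.9% target below is an assumed example value, not a figure from any vendor's actual SLA.

```python
def availability(check_results):
    """Fraction of successful checks; check_results is a list of booleans."""
    if not check_results:
        raise ValueError("no checks recorded")
    return sum(check_results) / len(check_results)


def sla_breached(check_results, sla_target=0.999):
    """True if measured availability is below the (assumed) SLA target."""
    return availability(check_results) < sla_target


if __name__ == "__main__":
    # Hypothetical window in which 30 of 100 API checks succeeded.
    window = [True] * 30 + [False] * 70
    measured = availability(window)
    print(f"availability: {measured:.0%}, failed: {1 - measured:.0%}")
    print("SLA breached:", sla_breached(window))
```

An availability of 30% means 70% of attempts failed, which is the translation given for the Sunday incident above.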

Get Slim:

As they say in the dieting world, nothing tastes as good as slim feels. In web performance parlance, this means that no piece of content, no matter how rich or compelling, is worth a slowdown during a peak period. Webpage load time, or the time it takes for a shopper to perceive the page has downloaded and is ready for interaction, tends to increase with page weight. Compared to other types of content, large images tend to contribute a disproportionately large amount to a page’s overall weight.

Images must therefore be optimized. Organizations should also minify JavaScript and CSS files and serve only those files relevant to the requesting platform. Serving up CSS, images, and JavaScript designed for a desktop site to mobile users will increase their webpage load time and lead to customer frustration. In addition, organizations should use compression on all text-based content (HTML, XML, JavaScript, CSS, etc.), but avoid applying it to images: most image formats are already compressed, so the gain is minimal and the extra “unzipping” work for browsers creates further delays. These are basic optimization techniques, but it is surprising how often organizations overlook them. In the last Back to School season, we saw one major retailer with an especially large image on their landing page, which caused a significant site slowdown.
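The compression rule above can be expressed as a small decision function. The Python sketch below uses the standard library's gzip module; the list of compressible content types is an assumption for the example, and in practice this decision is usually configured in the web server or CDN rather than written by hand.

```python
import gzip

# Assumed set of text-based types worth compressing (beyond anything under text/).
COMPRESSIBLE_TYPES = {
    "application/javascript",
    "application/json",
    "application/xml",
}


def should_compress(content_type):
    """Compress text-based content; leave images and other types as they are."""
    base = content_type.split(";", 1)[0].strip().lower()
    return base.startswith("text/") or base in COMPRESSIBLE_TYPES


def encode_body(body, content_type):
    """Return (payload, content_encoding), where content_encoding is 'gzip' or None."""
    if should_compress(content_type):
        return gzip.compress(body), "gzip"
    return body, None


if __name__ == "__main__":
    html = b"<li>holiday deal</li>" * 200
    payload, encoding = encode_body(html, "text/html; charset=utf-8")
    print(len(html), len(payload), encoding)

    image_bytes = bytes(range(256)) * 4  # stand-in for already-compressed image data
    payload, encoding = encode_body(image_bytes, "image/jpeg")
    print(len(payload), encoding)
```

Repetitive markup like the HTML here shrinks considerably under gzip, while the image payload is passed through untouched.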

Besides image optimization, organizations should consider techniques such as asynchronous loading, meaning the sequential loading process for various page elements will automatically “skip” over any slowly loading elements, versus having a slow loading element delay the rest of the page. This can help ensure that one slow-loading image does not impede the overall webpage load time. Additionally, if an image is a “nice to have,” though not necessarily a requirement for webpage interaction, it should load below the fold so as to not negatively impact users’ perception of page download speed.

Conclusion

Will 2017’s Black Friday weekend break from tradition and be free of major ecommerce website crashes or slowdowns? No one knows for sure, and there are never any guarantees – but what we do know is that years past offer many useful lessons. The five guidelines above can go a long way in helping organizations maximize their chances for a smooth, seamless, successful holiday ecommerce season.

Mehdi Daoudi is CEO of Catchpoint
