Did a Lack of Resiliency Cost the Patriots a Super Bowl Trip?
This year’s AFC Championship game was one of the most memorable in years. Two of the league’s top teams, led by two quarterbacks destined for the Hall of Fame, slugged it out for the privilege of competing in Super Bowl 50. Denver fans were elated with the result, while Patriots fans are left playing a game of what-if.
One of the what-ifs Patriots fans are talking about is a 20-minute outage of their team’s sideline tablets. During the first half of the game, a problem with a network cable took the Patriots’ access to the league-provided Sideline Viewing app offline. The app is used by coaches and players during the game to review an opponent’s offensive and defensive tendencies or to dissect why a problem occurred. While the Patriots lost access, the Broncos were able to maintain theirs. So the argument goes: did the Broncos have an unfair advantage during the outage, and did that advantage cost the Patriots the victory? That debate may never have a clear answer, but what we can do is look at the problem and learn from it.
The NFL takes great pride in its preparations for every game. It spends a lot of time validating the condition of the field along with all of the logistics that it takes to put on a game. Radio communications are checked to ensure that there is no interference and that no one else is using the same frequencies. Sideline trunks are deployed with radios, backup power supplies, and a wired alternative in case the coaches’ wireless communications fail.
Unfortunately for the Patriots, a single point of failure, one network cable, failed when it mattered most: in a win-or-go-home situation. While backups for power and voice communications were in place, the new Sideline Viewing System did not have complete redundancy.
Sideline Viewing was rolled out in 2014 as a replacement for an ancient black-and-white photo album system. In the past, snapshots of formations and plays were relayed over fibre-optic cables to a printer behind each team’s bench. Runners would assemble the photos in binders and get them to the sideline coaches and players as quickly as possible.
Now, snapshots are relayed wirelessly to tablets on each team’s sideline. This gets information into the coaches’ hands 30 seconds faster than the old photo album system, which is roughly the time between plays. It’s a technology triumph that allows coaches to make quicker adjustments during the game.
We too in IT sometimes get wrapped up in technology triumphs. The cloud is reducing the cost of IT and big data is helping our organizations make nimble decisions that affect efficiency and profitability. What hasn’t changed over time is risk. Everything we do has a risk profile and a cost associated with potential loss.
Many of our IT systems are designed with resiliency to reduce our risk profile. However, the cost of resiliency is sometimes higher than the risk warrants, or a level of resiliency may seem downright silly. I mean, do I really need to wear both a belt and suspenders? Certainly not. Should the NFL have designed network resiliency into its Sideline Viewing System? The league probably thought it had, yet the Patriots and their fans could argue that the single point of failure should be fixed.
We too in IT face similar dilemmas. Should a particular system have redundancy or not? One way of looking at application systems is through the lens of a Business Impact Analysis, or BIA. In a BIA, business units apply a monetary value to downtime. Business management in turn can use the valuations to make investment decisions.
While the BIA has fallen out of widespread use, it is still used in the financial and healthcare industries and in some other regulated environments. Two of the problems with the BIA are that it overstates financial loss and that it fails to consider other factors. On the financial side, the BIA lost credibility because the aggregate loss potential of all the applications often exceeded the revenue of the organization. At the same time, other factors are not considered, such as the impact of a failure on the organization’s brand. It’s hard to make decisions if you can’t trust your numbers.
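To make the overstatement problem concrete, here is a minimal sketch of the arithmetic. Every figure and application name below is hypothetical and chosen only for illustration; none comes from a real BIA. It assumes each business unit reports an hourly loss for its own application in isolation, then checks what those estimates imply when summed across a full year.

```python
# Hypothetical figures for illustration only; none come from a real BIA.
ANNUAL_REVENUE = 50_000_000  # assumed annual revenue, in dollars

# Assumed loss estimates (dollars per hour of downtime), as each
# business unit might report them independently of the others.
hourly_loss_by_app = {
    "order_entry": 4_000,
    "billing": 3_000,
    "warehouse": 2_500,
    "email": 1_500,
}

HOURS_PER_YEAR = 365 * 24


def aggregate_hourly_loss(losses):
    """Sum the per-application hourly loss estimates."""
    return sum(losses.values())


def implied_annual_loss(losses, hours=HOURS_PER_YEAR):
    """Loss implied if every application were down for the given hours."""
    return aggregate_hourly_loss(losses) * hours


total_hourly = aggregate_hourly_loss(hourly_loss_by_app)
implied = implied_annual_loss(hourly_loss_by_app)

# If the implied loss exceeds revenue, the estimates cannot all be right.
overstated = implied > ANNUAL_REVENUE

print(f"Aggregate hourly loss: ${total_hourly:,}")
print(f"Implied annual loss:   ${implied:,}")
print(f"Exceeds revenue:       {overstated}")
```

With these assumed numbers, the summed estimates imply a yearly loss nearly double the organization’s revenue, which is the kind of result that cost the BIA its credibility.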
Replacing the BIA are measurements of maturity aligned to International Organization for Standardization (ISO) standards, along with risk assessments. A maturity analysis helps an organization understand the strengths and weaknesses of its processes and people. The risk analysis, in turn, helps the organization understand the potential impact of each risk and develop strategies for mitigating it.
Do you know your risk profile? Is there a single point of failure lurking somewhere you didn’t know about? It’s often difficult to evaluate the maturity of your own processes and people from the inside, to find the single points of failure in your infrastructure, and to fully understand your risks and the options for mitigating them. Engaging someone outside the organization with significant experience and a proven approach is often the better way to go.
Debate will never definitively answer the question, “Did a Lack of Resiliency Cost the Patriots a Trip to the Super Bowl?” But a detailed risk analysis should have identified the loss potential, and it might have produced either a recommendation to increase resiliency or a plan to mitigate the risk by simply cutting off the Sideline Viewing System to both teams. Not every weakness in resiliency requires spending money. Having risk mitigation options lets business management decide whether to invest, mitigate, or accept the risk. In the case of the Patriots’ loss of Sideline Viewing System communications, money may or may not have needed to be spent. But the option of cutting communications to both teams, while inconvenient, would have provided fairness to both teams and to their fans.