Flying High, Crashing Fast: The Airline Industry IT Problem

Airports are, at best, an inconvenience: a multitude of people occupying the same small space; too-high prices for food and drink; delays upon delays upon delays. Airports are not places in which you want to be stuck; they are places you want to leave. In fact, being stuck in an airport can be a nightmare. Why? Not just because it is a nuisance, but because it is one you have little to no control over. One of the most frustrating air travel-related incidents is a flight cancellation. And as reality teaches us, no traveler is immune.

As recently as last month, United suffered an IT problem that caused the company to halt all domestic flights. This followed a similar incident at United in October. British Airways faced cancellations stemming from technical issues in September, as did Southwest Airlines, with 2,300 flights cancelled over several days. And of course there was the Delta debacle last August that cost the company some 150 million US dollars. These incidents usually share a similar background story: they happen to huge companies that are still operating on outdated, almost obsolete IT systems.

High Altitude, Low Tech

An August article from The Economist sheds some light on why IT continues to be the instigator of such significant problems for the airline industry, claiming that instead of replacing antiquated systems with modern ones that crash less, “airlines are constantly working on upgrading their systems, little by little.” But that is only part of the answer; understanding the bigger picture requires an examination of the different reasons for each outage. While all of the crashes are filed under the generic label “computer failure” or “system malfunction,” each of these episodes had a different cause.

One incident was related to the communications system; another was due to a power outage in the data center. Other causes include a weight-reporting system and a failed network router. These are all problems rooted in technology, but looking closely, one can highlight two main factors. The first is outdated systems that are prone to malfunctions. The second is the general upkeep of a company’s IT infrastructure and the practices of its personnel, with problems ranging from poor-quality hardware to employees who do not follow standard protocols, as in the Delta incident.

What Should They Do?

Owing to their nature, airlines must maintain high availability in their IT systems. This may sound like the most basic or benign concept, but it is crucial to remember the importance of not only having a backup, but actually testing it constantly to see that it works properly. Take Southwest’s case, for example: a backup system was in place and, according to the airline, should have kicked into action when the router malfunctioned, but in the end it didn’t. Something similar happened to Delta: after the incident, a spokesperson publicly questioned “why some of its own critical operations had not switched over to backup systems.” So the problem is not a lack of investment in modern systems, since these companies are obviously making that investment in order to meet strict industry regulations, as well as to avoid publicity nightmares and financial losses. Nor is it a lack of backup systems, because most obviously have them. The problem is that those systems are not working properly. So, what to do?
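To make the idea of "actually testing the backup" concrete, here is a minimal sketch of a failover drill. It is not how any airline implements this; the names Endpoint, pick_active and failover_drill are hypothetical, and the health probes are stand-in functions rather than real network checks. The drill pretends the primary is down and checks that traffic would really land on the backup.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Endpoint:
    """A hypothetical service endpoint with a health probe (assumption)."""
    name: str
    is_healthy: Callable[[], bool]


def pick_active(endpoints: List[Endpoint]) -> Endpoint:
    """Return the first healthy endpoint, in priority order."""
    for endpoint in endpoints:
        if endpoint.is_healthy():
            return endpoint
    raise RuntimeError("no healthy endpoint available")


def failover_drill(primary: Endpoint, backup: Endpoint) -> bool:
    """Simulate a primary outage and report whether the backup takes over."""
    # Replace the primary's probe with one that always reports failure.
    failed_primary = Endpoint(primary.name, lambda: False)
    try:
        return pick_active([failed_primary, backup]).name == backup.name
    except RuntimeError:
        # Neither endpoint answered: the backup did not take over.
        return False


if __name__ == "__main__":
    primary = Endpoint("primary-dc", lambda: True)
    backup = Endpoint("backup-dc", lambda: True)
    print("failover works:", failover_drill(primary, backup))
```

Run on a schedule, a drill like this turns "the backup should kick in" into something that is verified regularly, before an outage rather than during one.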

Systems must be constantly evaluated and updated. Veteran companies (and not only airlines) seem to suffer from an IT infrastructure that is jumbled together, assembled over a long period of time. What is needed, whether built or bought anew, is an infrastructure that is kept whole and as current as possible. Another important step is for airlines to update their IT mentality and adopt an “agile” approach. Airlines cannot plan processes that will take years to implement; they have to shorten every phase of development. Also worth noting is the airlines’ deep focus on all things front end. Yes, the customers are important, and a slick UI combined with a pleasant experience gets the customer “on board” time and again, but there should be equal consideration for the back end.

The Cloud’s Where It’s At

The last important thing that airlines should consider is the cloud. Updating obsolete systems to an agile platform with a reasonable investment in the back end is achievable, and it is made easier in a cloud environment. This is especially important for mission-critical systems: if the backup in Delta’s case had been on the AWS public cloud (Amazon cloud backup system), for example, the problem would likely have been resolved much faster. It should be noted that the cloud can also help with other challenges, such as customer experience or airplane maintenance. Airline companies outside of the US (Lufthansa and Qantas are good examples) are already using the cloud and reaping its benefits. The US juggernauts are the ones lagging behind (save for American Airlines, which does indeed seem to be preparing for a move).


The thing to remember with airline failures like the ones noted above is that there is no “magic solution”, and no “one fix fits all.” Airlines are perfect examples of old, massive companies that have to keep in mind both tiny details relevant to short term success and major concerns that, if not treated, will continue to threaten the smooth running of day-to-day operations. And whether it’s new IT systems, public or private cloud, or improved QA on, well, everything, these companies must remember that stagnating technology is a detriment to their success. Just as they invest in new airplanes and the updating of airport facilities, the same logic should apply to behind-the-scenes operations. Customers of Delta, United, Southwest and others learned the hard way how back-end failures can lead to front-end inconveniences; ones that directly impact the customer’s experience and future purchasing decisions.
