Earlier today Salesforce.com experienced a downtime incident. Specifically, na19 and na20, which serve the greater Boston area, were down most of the morning. As with most public IT problems, there was considerable reaction on Twitter and other social media. This is completely understandable, as downtime incidents are a considerable inconvenience to subscribers. Further, given the breadth of Salesforce.com and the number of companies affected, the natural outlet is to react on social media. At the time of this writing, the cause of the problem was unknown, so it may be premature to blame Salesforce.com. There are a lot of things that can cause this type of downtime. The bigger issue is what this means for the availability of the cloud, and whether companies should worry as more and more of their IT infrastructure is deployed off-premises.
There are two main problems in evaluating the relative reliability of different IT solutions. The first is fear of things people feel they do not control. People, in general, tend to have less fear of things that they feel they have control over. This explains why more people are afraid to fly than to drive, even though statistics indicate that flying is much safer than driving. Unless you are the pilot, you are not in control of the plane. This also explains why people are skeptical of self-driving cars. The second problem is the magnitude of publicity when something goes wrong. When Salesforce.com crashes, there is significant discussion on Twitter and other social media sites. It is a very public incident. However, most internal IT problems, with the exception of security breaches, do not get covered by the press or on Twitter. These two factors can lead to the perception that the cloud, or off-premises computing, is less reliable than traditional on-premises computing.
But perception is not always reality. Companies that are considering moving to the cloud should not rely on the perceived reliability, or lack thereof, of a cloud solution. They need to investigate the actual data on on-premises and off-premises downtime. Additionally, companies need to determine the actual costs of downtime. IT tends to classify too many systems as “mission critical”, so these systems are designed and built for “five 9’s” of reliability, often at considerable cost to the company. Further, most organizations view these systems as so important that they could not even consider moving them to a cloud deployment. In reality, a more prudent financial approach for many systems would be to tolerate some downtime in exchange for lower costs. Additionally, mission critical systems can be moved to the cloud as long as a robust backup plan is in place to address downtime.
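To make the “five 9’s” trade-off concrete, here is a minimal sketch of the arithmetic behind availability targets. It converts an availability percentage into the downtime it allows per year and multiplies that by an hourly cost of downtime. The function names and the cost figure are illustrative assumptions, not numbers from any real system; each company would need to substitute its own.

```python
# Rough arithmetic for availability targets ("nines").
# All names and the cost figure below are hypothetical, for illustration only.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability: float) -> float:
    """Hours per year a system may be down while still meeting the target.

    `availability` is a fraction, e.g. 0.999 for "three 9's".
    """
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Expected yearly cost of downtime if the system exactly meets the target."""
    return allowed_downtime_hours(availability) * cost_per_hour


if __name__ == "__main__":
    # Placeholder value: an assumed cost of one hour of downtime.
    assumed_cost_per_hour = 1_000.0

    targets = [
        ("two 9's", 0.99),
        ("three 9's", 0.999),
        ("four 9's", 0.9999),
        ("five 9's", 0.99999),
    ]
    for label, availability in targets:
        hours = allowed_downtime_hours(availability)
        cost = annual_downtime_cost(availability, assumed_cost_per_hour)
        print(f"{label:>10}: {hours:8.3f} h/year down, cost {cost:10.2f}")
```

Under these assumptions, “five 9’s” allows only about five minutes of downtime a year, while “three 9’s” allows a little under nine hours. Comparing the downtime cost at each level with the cost of building for that level is one way to decide whether a system really needs to be treated as mission critical.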
So in evaluating which systems to move to the cloud and which to keep on-premises, do not be ruled by fear. Keep in mind that the cloud is not materially different from other things that are already outside the direct control of the IT department. Most companies have a third party that hosts their data center. Email has moved to the cloud, and with O365 even the basic office productivity tools are moving outside the direct control of IT. The decision to move a workload outside of the data center should be based on the finances, the needs of the workload, and the service level required. By going through this analysis, companies will be in a better position to optimize their IT operations.