Salesforce services suffered a major outage today following an internal permission change. As a result, customers were unable to access services for several hours.
“This is about our public cloud access service,” Shanmugam Chinnasamy, SVP of Salesforce Sales Cloud Engineering, said during a video call with customers. “From Salesforce we call into AWS and we introduced a change in that permission that caused this error.”
Wide effects on Salesforce cloud services. A graphic displayed during the call showed most if not all of Salesforce’s cloud services were affected.
Chinnasamy said the company’s technology team became aware of the service disruption at 11:03 AM (EST). Because this was an issue with Salesforce communicating with AWS, no customer data was ever at risk of being exposed or compromised, he added.
“We faced issues in our morning around the attachments section (files, notes, etc.) where the error message somehow pointed towards Amazon AWS bad gateway,” one customer wrote in the call’s chat. “Fortunately, it [was] resolved in a couple of hours.”
Customers on the call wanted to know why Salesforce’s trust site wasn’t updated until 11:53 AM. This was because the company first wanted to have a clear picture of what had happened, according to Cuthbert Langley, a member of Salesforce’s tech crisis communications team.
Business hours? The company wasn’t able to answer a question asked by many customers: Why was the update done during business hours?
“We will do root cause analysis and then understand why it was made, why it [happened] during business hours and then we will get back to you,” Chinnasamy said.
Langley said people with more specific questions and issues should contact their sales rep for further guidance.
Why we care. Most core applications used by most marketing organizations are now cloud-based. In the early days of cloud, there was great concern about security of data and about possible outages. Largely those concerns have been put to rest; cloud applications have generally come to be seen as more reliable than their on-prem predecessors. This incident shows that human error can still introduce problems.