6 Things To Know About The Latest Salesforce Outage
Salesforce customers and partners still have questions about last week's outage, the worst service disruption in the company's history. Here's what we know and what we would still like to find out.
Salesforce Status
Salesforce grappled with the worst service disruption in its history on Friday and into the weekend. By late Monday, the outage was mostly resolved, although not all users were fully back up and running by that point.
The outage caused sales agents and marketers around the world to lose access to Salesforce Marketing Cloud for several hours, leading some companies to send employees home early for the weekend.
Customers and partners still have a lot of questions about how a faulty script for Pardot marketing automation software wreaked such widespread havoc.
Here's what we know, and what we would still like to find out.
6. Pardot The Interruption
Salesforce knew it had a problem Friday when customers using Pardot marketing automation software started reporting that all of their users could see and edit all of their company's data on the system.
To stop the permissions failure from exposing more sensitive information, Salesforce cut access to Pardot tools—and the larger Salesforce Marketing Cloud they're part of.
Salesforce soon discovered that a database script error was the root cause.
The cloud giant was able to isolate organizations directly affected by the Pardot problem and restore access to all others by Saturday. The company has yet to explain from a technical perspective why the Pardot failure resulted in the need to pull down the larger marketing platform.
5. Scope Of Failure
The number of reported outages peaked at 3,262 shortly after 1 p.m. ET on Friday.
Marketing Cloud was down for up to 15 hours, but it took longer for many customers to properly restore permissions so users could get back to work.
One Salesforce partner told CRN the failure "appears to be massive."
"Many of my Salesforce connections have reached out and have been impacted in one way or another," that partner said.
Salesforce has not yet characterized the full extent of the disruption in a post-mortem.
4. Where In The World?
A heat map of the Salesforce outage locates the epicenters of disruption in North America and Northern Europe.
California and the East Coast of the U.S. are particularly lit up. In Europe, the impact appears concentrated in Southern England.
Problems that persisted into Monday involved three cloud instances in North America that had been reset but again ended up with overly broad permissions, which required another reset.
It's unclear whether, and why, certain geographies were affected more than others.
3. How To Restore?
For customers directly affected by the faulty Pardot database script, Salesforce at first restored access only to users with a System Administrator profile. Those businesses then had to restore profiles or user permissions themselves.
Salesforce advised organizations with valid backups of their profiles and user permission data to deploy that information directly from a Sandbox copy to the production environment. Customers without a Sandbox copy of their production profiles needed to manually modify the configurations to grant appropriate access to users.
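For admins working through that restore, one practical check is to compare the backed-up permissions against what production currently grants. The sketch below is illustrative only: it assumes the permissions have already been exported into simple profile-to-permission mappings, and the function name, profile names and permission names are all hypothetical. It does not call any Salesforce API.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: profile name -> set of permission names.
PermissionMap = Dict[str, Set[str]]


def diff_permissions(
    backup: PermissionMap, current: PermissionMap
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return, per profile, (excess, missing) permissions relative to the backup.

    excess  = granted in production now but absent from the backup
    missing = present in the backup but not granted in production now
    Profiles whose permissions match the backup are left out of the report.
    """
    report: Dict[str, Tuple[Set[str], Set[str]]] = {}
    for profile in sorted(set(backup) | set(current)):
        expected = backup.get(profile, set())
        actual = current.get(profile, set())
        excess = actual - expected
        missing = expected - actual
        if excess or missing:
            report[profile] = (excess, missing)
    return report


if __name__ == "__main__":
    # Made-up example data, not taken from any real org.
    backup = {
        "Sales Rep": {"read_leads"},
        "Marketing": {"read_leads", "edit_campaigns"},
    }
    current = {
        "Sales Rep": {"read_leads", "modify_all_data"},
        "Marketing": {"read_leads"},
    }
    for profile, (excess, missing) in diff_permissions(backup, current).items():
        print(profile, "excess:", sorted(excess), "missing:", sorted(missing))
```

A profile that shows excess permissions is the kind of overly broad access that triggered the outage, while missing permissions point to users who still cannot get back to work.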
By Monday, automated provisioning to restore permissions had been executed on all production instances, Salesforce said.
A small subset of customers, however, were still experiencing permission problems.
2. Disruption History
The latest outage was almost certainly the worst in the cloud pioneer's history—at least when measured by the number of users impacted.
The last notable Salesforce outage happened in 2016. That service disruption had the added inconvenience of wiping out four hours of data customers entered into their CRMs on May 10 of that year.
Two months earlier, in March 2016, Salesforce customers in Europe saw a CRM disruption lasting up to 10 hours that was caused by a storage problem.
1. The Latest
On Monday, at least three of the more than 100 Salesforce cloud instances in North America were still having problems restoring permissions for some users.
"We're all hands on deck to remediate the issue for all customers," tweeted Salesforce CTO Parker Harris sometime before noon on the West Coast.
Salesforce thought it had fully remediated the outage over the weekend by executing automated provisioning to restore permissions on all production systems.
Following that fix, however, some customers on the small number of still-affected instances had permission levels again improperly set to give users broader access than intended, according to the company's status page.
To stop that, Salesforce initiated another service disruption, and admins again were asked to manually restore permissions.