ArcGIS Online service partial outage - Resolved

On 4th September there was a partial outage of ArcGIS Online which affected some customers. Publishing services, the use of hosted feature services and some other specific functionality were impacted. The issue was restricted to certain parts of the infrastructure and was reported on the ArcGIS Online Health Dashboard.

Unfortunately, some issues still remain which are affecting some customers. We are working with Esri Inc to understand the causes of the problem and to resolve the ongoing issues as quickly as possible.

UPDATE (06/09/2018)

The problem has been resolved. The following statement from Esri Inc provides further information:

What happened:

Between 3:00 AM and 6:45 PM PST on Tuesday, September 4, 2018, ArcGIS Online experienced recurring intermittent problems with a subset of our infrastructure that provides hosted feature services and spatial analysis.

Why did it happen:

The issue was due to a weather/lightning event which caused electrical problems and a shutdown of part of the infrastructure within the Microsoft Azure South Central US region. This infrastructure shutdown resulted in a subset of ArcGIS Online services being unavailable for publishing or viewing.

Our Response:

The ArcGIS Online operations team worked with Microsoft Azure during the event to diagnose the problem and then began work on a swift recovery. During the investigation and recovery, the status of the issue was updated at status.arcgis.com until restoration was verified and tested. In parallel with this activity, we also triggered our Disaster Recovery process for restoring affected services in a different region from database backups. The recovery of the originally affected data center completed first.

Additional notes and plans:

  • Our services already run in High Availability mode, covered by the Azure SLA, and take advantage of redundancy and failover across multiple instances. Once Microsoft has released its root cause analysis, we will investigate and consider additional improvements to strengthen the availability of our current deployments.
  • We will also look at improvements to the communication process to ensure you have the most reliable and up-to-date information possible.