RIM global outage caused by core switch failure; fix under way

11.10.2011
Disruptions to BlackBerry service on Tuesday were caused by a core switch failure within the infrastructure of Research in Motion (RIM), the company said late Tuesday.

A RIM spokesman said service began returning to normal around 2 p.m. ET, although there would be further delays as data backlogs were cleared. It was the second outage, or "delay," as RIM put it, in two days, affecting users in numerous countries.

RIM's system is designed to fail over to a back-up switch, but the failover system "did not function as previously tested," according to a statement issued by RIM at 5 p.m. ET.

When the failover did not function, a backlog of data was generated. The company is working to clear that backlog.

"RIM has failed again at what plagued them in past outages, which is to provide a comprehensive disaster recovery solution," Ken Dulaney, an analyst at Gartner, said after the cause of the outage had been made public.

Dulaney said that while switches can fail, "there should be automatic ways in which the system recovers from this type of event. Any vendor who runs this type of mission critical service must constantly be reviewing disaster recovery solutions."