SNW - Users share disaster recovery nightmares, fixes

03.11.2006
Storage Networking World attendees, some veterans of which remember hurricane winds hitting the hotel where past instances of the Orlando conference were being held, this week traded descriptions of steps they've taken to prepare for disasters.

In an impromptu electronic survey, about 10 percent of the audience said they had considered moving their disaster recovery facility due to concerns about natural disasters, and 11 percent cited natural disasters as a main cause of their downtime.

Few have had to deal with the sort of disaster that Glenn Exline, manager of advanced technology for Computer Sciences Raytheon, faced in managing the computer systems for the 45th Space Wing of the U.S. Air Force, whose IT department Raytheon manages on a consulting basis: In 1993, a Delta rocket blew up 13 seconds off the pad, raining flaming rocket chunks on a server farm and destroying it.

In fact, it was disaster recovery that led the wing, which is responsible for Patrick Air Force Base and Cape Canaveral, both on the Florida Atlantic coast, to implement a SAN four years ago, Exline said. The two bases, about 22 miles apart, replicate data between each other's EMC Corp. Clariion storage arrays using EMC's MirrorView software. In addition, backup tapes are vaulted in an Iron Mountain facility in Orlando, he said.

While both bases are in Florida, and thus vulnerable to hurricanes, the group assumes that one will survive. "In a true worst case, where neither survives, the last thing anyone is going to be worrying about is where Exchange is," Exline said.

In addition to hurricanes, which, as numerous speakers pointed out, at least give you warning, the part of Florida where the bases are located gets more lightning strikes than almost anywhere else in the country: 12 per square kilometer per year, which adds up to 768 strikes per year, Exline said.

The big year for hurricanes for the 45th Space Wing was 2004, when it got hammered by three storms, Exline said. The group not only shut everything down but also pulled the servers and storage out of the racks, moved them upstairs to a windowless room on the third floor, and moved them back when the storm was over. Everything was back up four hours later, he said.

Hurricanes aren't a big problem in Canada, but Hurricane Katrina acted as a wake-up call for the Peterborough, Ontario, school district, especially after the area suffered two floods, said Anthony Brice, manager of technical systems for Kawartha Pine Ridge School. The district set up a 3TB SAN with a Thunder 9570V midrange modular storage array from Hitachi Data Systems Inc. and is working on setting up a wireless replication system with the board of the Peterborough Victoria Catholic school district, which he hopes to have planned by Christmas.

Matt Pittman, director of enterprise systems for Penson Financial Services, said he implemented a disaster recovery plan so that the Dallas company could offer its financial clients recovery in minutes rather than hours. The company ended up choosing Xiotech Corp. for its SAN and CommVault Systems Inc. for its backup and replication software. There is a second site outside Dallas, and the two sites replicate synchronously, he said. In addition, he is considering setting up a third site in a more distant location, such as New York or Montreal, and adding asynchronous replication to that one. The new system has also decreased his backup times by 50 percent, he said.

Ruden McCloskey, a law firm with locations in 10 Florida cities, has also been dealing with disaster recovery, said IT director Ben Weinberger. After four hurricanes in 2004 knocked out power for days, he needed a better solution. Plus, tape backup was taking more than 30 hours, he said.

The organization, which had been set up in a hub-and-spoke configuration with Fort Lauderdale and Tampa acting as hubs, instead set up a network using multiprotocol label switching, including a disaster recovery site in Chicago, using WANSyncHA software from XOSoft Inc., now owned by CA Inc., Weinberger said. Fort Lauderdale can fail over to Chicago and then fail back when the emergency is over, he said.

The system cost less than US$100,000 and had the added benefit of reducing the backup window to less than six hours, Weinberger said. But the real test came in 2005, when Hurricanes Katrina and, in particular, Wilma hit Fort Lauderdale. While all five of the offices in the hurricanes' path were offline for up to five days due to a loss of power, the other offices remained up. Attorneys could either work from their homes, if they had power, or travel to one of the other offices that had power, he said.

In fact, Exline said he found Weinberger's experience so compelling he is going to look at the XOSoft product himself.

And as a reminder to attendees that disasters could strike at any time, Pittman mentioned during his presentation that one of his company's office buildings, in Montreal, had to initiate its business continuity plan just that day due to a fire.