Still Standing

04.11.2008
It's really quite simple: IT companies have Disaster Management and Disaster Recovery plans in place because they are IT savvy. They understand the critical importance of Business Continuity. They realize how imperative a Data Center is and how essential it is to have a secondary facility in a remote location. And because they're IT savvy, they run drills so everyone working in the company knows what to do in the event of an unforeseen incident.

The bomb blast which took place outside the Marriott hotel on the 20th of September this year shook the nation. It also shook the physical infrastructure of the Evacuee Trust Complex (ETC), which housed 23 IT companies. It had, however, no impact on the data running through the operations of those companies. Companies that lost their physical assets already had plans in place that allowed business to continue.

Thanks to Nayatel's Fiber-To-The-User (FTTU) infrastructure running across the capital city of Islamabad and the volunteer efforts of many individuals and non-technology sector companies, most of the companies that had mission critical applications running were up within 48 hours. All of the companies that used to be in the ETC were up and running within 5 days. No data or business was lost. Companies or websites operating on services managed by some of these companies experienced zero downtime.

And they achieved all this in Pakistan.

A facility in the US that went down after Hurricane Katrina took 10 days to come back up. LMKR, a Pakistani operation, came back online in 2 days. That's even faster than the prescribed international best practice for disaster recovery.

In the aftermath of the disaster, buildings in the areas close to the Marriott called upon the expertise of IT companies to better understand what they need to do in order to provide more robust facilities, or even procedures as basic as evacuation plans, to their tenants. Formal agencies are once again running around to find people who can develop a comprehensive disaster management plan or policy without realizing one small, critical fact: you cannot plan for the management of anything overnight.

If you didn't have a policy document in place before, get started on it now. With more industries and corporations making significant contributions to Pakistan's economy, if companies keep installing their automation solutions into existing infrastructure without a backup plan in place, we are going to have a lot more to lose than just a computer system.

"It Won't Happen To Me" Psyche

Hollywood action and the world of special effects have perhaps made decision makers numb to the possibility that something as horrifying as an earthquake can actually bring a building down, or that rains during the monsoon season might actually collapse bridges and block roads. At least the local district governments and the Federal Relief Commission are more inclined to think about developing plans that might help manage the potential damage to physical infrastructure. But even that thought process only seems to happen in a reactive mode as opposed to a proactive one. While you can't expect to guarantee 100% physical safety by envisioning every possible calamity, you can at least attempt to be proactive and work with and facilitate the people or the agencies that know what to do.

Management of virtual property is perhaps still not quite understood by the agencies responsible for disaster management. According to Salman Ansari, CEO of SATC and a Telecom Consultant, outside the realm of the IT and Telecom sector, nobody is thinking about disaster recovery or management. "The Federal Relief Commission head complains that disaster recovery is such a long and complicated process that takes a lot of time, however even he is only talking about the 'search and rescue disaster recovery'. They haven't even started talking about DR for the intellectual property or data management of a business."

Most professionals are of the opinion that if no public entity comes forward to set up a formalized process, then the IT companies that already practice Business Continuity every day should.

"There is no plan in place," continues Salman. "In this case, all the work that enabled companies involved people's goodness of heart - everything was volunteered on an ad hoc basis." There is no question that companies within the IT industry, regardless of their size, practice disaster management as part of their regular business, so there is no threat of data loss or interruption of business. Because these companies realize that they would not survive the losses if they were unprepared with their own infrastructure, they invest in the proactive measures that ensure the integrity and longevity of the intellectual property that runs through their networks.

But as long as businesses and industries don't realize the critical importance of their own networks and data, the demand will never gain enough momentum to pressure DR into a policy document. Because of the globalization phenomenon, industries expand to different physical sites, business operations set up branch offices, and companies that represent international brands need effective communication technologies that enable constant coordination. Khurram Rahat, Country Manager of Teradata Pakistan, recently made a wonderful comment about the role technology plays in any organization: "The technology sector for any country or market is comparable to the heart that pumps the blood through a living organism. You simply cannot function without it."

Consider this hypothetical example: a food and beverage company has its processing plant in one part of the city, outsources its transportation to another company and carries its inventory in yet another part of the city. The data from all these locations is consolidated into one central database residing on a physical server which can be accessed by everyone involved in the business operations. Should any unforeseen circumstance arise in any one of the physical sites outlined in the example above, it will slow down production until the issue can be resolved. However, in the event that anything happens to the location that hosts the server, the coordination of the company's hundreds of processes, which also drives everyday business decisions, will stop. And the stoppage will be more than just a temporary one. If no disaster recovery site has been proactively planned into the business continuity plan of the organization, you can say goodbye to any competitive edge the company may have garnered with the system in place.

"CBR's website was down soon after the blast for quite some time. There should have been no reason for it to be down. These sites should have disaster recovery planned. Take the State Bank of Pakistan as an example -- it has a very strong DR plan in place, but then that's just one entity. This procedure has to be part of an industry policy or a corporate strategy. And I cannot emphasize enough as to how critically important DR strategies are for any business of any size which has any kind of technology running through it."

But the question to ask is why the industry has to wait for a government body to initiate the policy, or why it should involve one at all. The stakes for any formal public agency, along with the culture and environment it functions in, are very different from those of private industry. It would be comparable to allowing private companies to plan the core business functions of a government body. The mismatch in that case would be very evident. Private enterprise is very commerce driven. On the other hand, a monopolistic environment, such as the one governments operate in, has no competition. Despite globalization, the core function of the government or any ministry is to facilitate activity and protect the interests of its citizens. It is the competition driving private enterprise that makes integration with technology so critical.

A strong and robust IT infrastructure provides companies with their competitive edge. If business processes are more efficient, companies can compete on price and forward the benefit to customers. Granted that the government has to be involved at the national level, but why delay the process further?

There is no doubt about the importance that Salman is emphasizing: "You have marketing and finance and sales that are usually in the forefront of any business. The IT departments are the ones that drive the knowledge economy." And if we don't take care of it, it's going to hurt.

Nadeem Malik of InfoTech says, "Pakistan isn't the only place where disaster occurs. We've all read about bombings in Madrid or London, for example. But you see, these bombings had virtually no impact on their economy because they had plans in place. People knew what to do in the case of an unpredicted event. But instances like these influence the morale of the people of any country. Being able to minimize any lasting physical damage helps them to overcome any sense of loss they may have. Pakistan, with all the right technology in place, is unable to have a plan in place for disaster recovery."

The Blueprint to be Up and Running

Call centers, offshore companies and all IT enterprises cannot afford to be down. Had this tragic incident occurred in any city apart from Islamabad, companies could not have simply plugged into a robust FTTU network of the kind Nayatel already has ready, and getting back up would have been a far more cumbersome job. According to industry analysts, the recovery would have been impossible.

All the management efforts in the post-blast period relied on personal initiative. Wahaj Us Siraj, the CEO of Nayatel, mobilized his engineers very quickly to assist the companies with anything they required. NUST opened its campus to house some of the call center operations, and other parts of Islamabad did the same. The PSEB and PASHA got the IT companies some of the support needed, but everyone else who should have been creating that policy document was already too late.

But there are a lot of critical lessons to be taken from the speed with which companies were able to mobilize in this instance. Pakistan has all the isolated ingredients required to develop disaster management plans according to international best practices.

A passive infrastructure is already in place -- it just needs to be tweaked to be redundant and have processes in place which will enable it to be flexible to meet the urgent requirements of any business in any part of the country. You have various companies working on the fiber solution. With quick facilitation by the PTA and other agencies, this process can be hastened to perhaps empower private infrastructure providers to work closer with companies in the event of any unforeseen instance.

Companies need to have their backup facilities identified and ready to be able to manage the switch from their primary location to their secondary facility with no advance notice. Not a branch office. A DR site. The intensely populated large cities still have room to plan this deployment. Where the government can play a role is to help negotiate contract terms on behalf of the companies. Building requirements are an infrastructure issue that has to be resolved even for the existing IT parks. Take a look at the quality of the IT park in Karachi and you'll realize what this age-old argument is all about. Commercial buildings don't care to subsidize rates for commercial companies. After all, they too are running a business. If the government wishes to play its role, it needs to help with mobilization. Assist with the backup electricity. The electricity situation, which will hopefully get better over the next few months, will hamper what progress companies are able to make from any site, secondary or primary.
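
The switch from a primary location to a DR site can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `Site` class, the `choose_active_site` function and the site names are hypothetical and do not describe any company's actual tooling.

```python
# Minimal sketch of a primary-to-DR switch. The Site class, the function
# and the site names are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Site:
    name: str
    role: str           # "primary" or "dr" (a dedicated DR site, not a branch office)
    reachable: bool     # outcome of the latest health check
    data_in_sync: bool  # replica is current enough to take over


def choose_active_site(sites: Sequence[Site]) -> Optional[Site]:
    """Return the first usable site in priority order (primary listed first)."""
    for site in sites:
        if site.reachable and site.data_in_sync:
            return site
    return None  # nothing usable: the continuity plan has a gap


if __name__ == "__main__":
    sites = [
        Site("main-office", "primary", reachable=False, data_in_sync=True),
        Site("dr-site", "dr", reachable=True, data_in_sync=True),
    ]
    active = choose_active_site(sites)
    print(active.name if active else "no usable site")
```

The point of the sketch is that the decision is mechanical only if the secondary site already exists and its data is current; neither condition can be arranged after the incident.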

Have a plan in place. A consortium has to come together. A consortium of two kinds of stakeholders: those whose businesses are directly impacted by the lack of a national DR initiative, and those who have the ability to provide support facilities. Members of various private enterprises would make up the first set, and representatives of universities with large campuses, network providers and companies that plan, develop and deploy infrastructure would make up the other. Because most of the IT-enabled projects running in the government are planned or managed by private consultants anyway, you need to have those representatives sitting on the consortium. But on the whole, for this initiative to see any kind of daylight, it has to be industry driven.

To assess your own readiness, you need to have the courage to consider the following hypothetical situation: if there were 5 people at the site of a devastating incident, do you know what tasks the other 4 people will be responsible for? If your mind begins to ponder over possible responses, compare your answers with the answers someone else might come up with. If your responses for the 5 critical measures and responsibilities differ, then we've got a big problem because that just highlights that we don't have the blueprint telling us what is going to happen in the event of a disaster.
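
The readiness test above amounts to comparing two people's accounts of who does what. A rough sketch follows; the task names, the people and the `differing_tasks` helper are all hypothetical and serve only to make the comparison concrete.

```python
# Hypothetical sketch: compare two accounts of who is responsible for what.
from typing import Dict, Optional, Tuple


def differing_tasks(
    account_a: Dict[str, str], account_b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every task whose owner differs between the two accounts."""
    all_tasks = set(account_a) | set(account_b)
    return {
        task: (account_a.get(task), account_b.get(task))
        for task in sorted(all_tasks)
        if account_a.get(task) != account_b.get(task)
    }


if __name__ == "__main__":
    # Illustrative answers from two colleagues; names and tasks are made up.
    mine = {"head count": "Person 2", "call emergency services": "Person 1"}
    yours = {"head count": "Person 3", "call emergency services": "Person 1"}
    for task, owners in differing_tasks(mine, yours).items():
        print(f"No agreed owner for '{task}': {owners[0]} vs {owners[1]}")
```

An empty result means the two accounts agree; anything else is a task for which no shared blueprint exists.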

To borrow the famous line from Ghostbusters, "Who you gonna call?" in the event of an emergency and, more importantly, will they know what to do?

Many people wrote in saying that in the aftermath of the crisis at the Marriott, there were more volunteers than formal agencies. This could simply be an eyewitness account, but the reality of the situation is this: most of the work done in the recent past with respect to utilizing technology has been done by volunteers. Remember the number of private enterprises that played critical roles in coordinating relief efforts for the 2005 earthquake? How many young kids set up websites to coordinate and pool information for better, more efficient management of resources? If volunteers have the ingenuity and the creativity, then they need to be part of the plan.

You Don't Have to Look Too Far: LMKR and Business Continuity

Most companies operating in the ETC building conduct business across the globe, providing offshore services. Uninterrupted service is imperative for the survival of these companies. Service companies were left with communication breakdowns, loss of valuable business-critical data and, in a few instances, no alternate operational facilities. LMKR was one of the companies that fell victim to the fateful episode and was faced with the challenge of preserving its business.

LMKR, a petroleum technology company specializing in Data Management, Processing, Interpretation, Consulting Services and Software Solutions, was founded in 1994. Since its inception, the company has been providing high quality services and turnkey solutions to a large number of clients. The customer base includes multinationals, national oil companies, public sector organizations, banking and financial institutions, the pharmaceuticals industry, the telecommunication sector and donor agencies. Currently, LMKR offers its services through three functional divisions, namely Geotechnical Services, Information Management and Information Technology.

Waqar Janjua, the CTO of LMKR, says, "We got our critical infrastructure back up in 20 hours, the medium and low in 40 hours. The standard best practice for DR is 48 hours in total." Waqar and his team ran regular drills to role-play what the exact process would be in the event of a disaster. "Every three months, we ran drills, some announced, some unannounced. I think the fact that we have this mode hard coded within our own practice, it helped for us to react faster."
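
The comparison Waqar draws can be written out as simple arithmetic. The sketch below is illustrative only: the tier names and the two helper functions are hypothetical, and the figures are the ones quoted above (20 hours for critical systems, 40 hours for the medium and low tiers, against a 48-hour benchmark).

```python
# Illustrative sketch: compare the quoted recovery times against the 48-hour
# benchmark cited in the article. Tier names and helpers are hypothetical.

BENCHMARK_HOURS = 48  # "standard best practice for DR" as quoted

# Hours until each tier was back up, as quoted by LMKR's CTO.
recovery_hours = {
    "critical": 20,
    "medium_and_low": 40,
}


def within_target(hours_taken: float, target_hours: float = BENCHMARK_HOURS) -> bool:
    """Return True if recovery finished within the target window."""
    return hours_taken <= target_hours


def headroom(hours_taken: float, target_hours: float = BENCHMARK_HOURS) -> float:
    """Hours left to spare before the target would have been missed."""
    return target_hours - hours_taken


if __name__ == "__main__":
    for tier, hours in recovery_hours.items():
        status = "within" if within_target(hours) else "over"
        print(f"{tier}: {hours}h, {status} target, {headroom(hours)}h to spare")
```

On the quoted figures, both tiers come in under the benchmark, with 28 and 8 hours to spare respectively.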

Sidra Irshad, team lead for Network Applications explains, "There is a great emotional reaction for something like this. It has a lasting impact on all of us. However since we had gone through the drills, it was second nature for us to continue our work." And Sidra and Waqar both make an interesting point. There is no amount of planning that can help you manage the grief, but you can prepare for how you're going to manage the business. Since the team was focused on getting the critical applications back up, they knew what to do. "The system was already in place," reiterates Waqar.

Waqar explains that they had replicated the mission critical infrastructure at a secondary, disaster recovery site. "We practiced everything from how long it would take for us to get here, and begin work from here. All our team had to do was to reach these premises and continue their work. As a result, we had zero data loss."

Without even sharing too many details, it is evident that LMKR not only had a plan in place, but went through the motions every so often to ensure that everyone was on the same page. If you don't practice the blueprint you have developed, there will always be confusion in roles, processes, replication of tasks and inefficient management. The company's data was backed up in real time and with the FTTU infrastructure that Nayatel had them plugged into, there was no loss.

LMKR's quick turnaround time in terms of training and developing fresh graduates into world class geoscientists and geophysicists equipped with domain-specific software and IT skills is widely acknowledged in the industry. With more than 500 employees operating in Pakistan, LMKR is one of the largest recruiters of petroleum and IT professionals in the country.

In the last three years LMKR has increased its operations with a 35-50% growth rate. Currently LMKR is operating in more than 17 countries through its operational headquarters in Islamabad and branch offices in Mauritius, Dubai, Houston, Trinidad and Kuala Lumpur.

Atif Siddiqui, Program Manager of the Information Management Division that handles Oil and Gas, was actually at the Marriott hotel when the blast happened and experienced the shock first hand. "When the blast happened, the initial reaction was complete shock and fear. We couldn't see anything and it was a tough situation to be in. I saw a tremendous amount of discipline in people during the evacuation process from the hotel."

Atif talks about the change that an incident like this brings into one's life. "You don't know what is going to happen in the next moment. You can plan for the future but you really don't know what is going to happen." After evacuating from the hotel, Atif returned to work almost at once. "The incident happened on Saturday and the entire company was back at work almost immediately. We planned how to remove our hardware from our offices at the ETC."

LMKR is known for its inclusive culture. Every team member played a role in the post-trauma situation. Their input mattered. One of the more dangerous backlashes of any incident like this is that the belief people have in a country begins to fail. Asif Ahmed, Project Manager of LMKR's Business Continuity facility, explains, "This is a natural reaction regardless of where an incident like this occurs but we have tried to play our role in continuing our business operations in Pakistan." The company was already using the DR site to run other projects and, as Asif explains, "We've had this site for a while but the focus on this site has increased after our ISO 27001 certification. We shifted our business critical operations to this site way back then."

The company's operations are managed through a state-of-the-art data warehousing facility, which is the third of its kind in the world. "A secure backup of this facility is maintained in an alternate location allowing us to keep this system online 24/7," continues Atif.

Keeping in view the sensitive data managed by the company, adequate backup facilities and business continuity measures have always been the key to LMKR's uninterrupted service delivery model. The catastrophe hit LMKR's main offshore center in Islamabad which has a workforce of over 400.

Against the backdrop of such a scenario, critical business operations and core ICT infrastructure need to be restored without delay. Being an ISO- and CMMi-certified organization, LMKR periodically carries out disaster recovery drills to deal with unexpected events and calamities. The company's comprehensive business continuity plan provides for pre-emptive measures and practices that ensure minimum downtime and little or no interruption in the continuity of services.

Atif and Shabana, the husband-wife team that runs LMKR, were concerned about how their team members would cope with the situation. Atif Khan, CEO of LMKR says, "Our facility is such that there is always someone working in the office at the ETC. That was perhaps our biggest worry and thankfully, our team was not present at the time." The 60,000 square foot office that the company had was completely shattered in the blast.

As part of its business continuity plan, LMKR has a number of business continuity centers and backup facilities in various geographical regions out of which 2 are based in Islamabad. All vital services and data were migrated to alternate locations and critical operations were recommenced from these secondary facilities. The core disaster recovery team was quick to respond to the situation and had core business activities and supporting services functional in a record 41 hours.

This unfortunate event has indeed brought out the best in companies like LMKR in terms of the leadership, commitment, competence and unity shown by each and every member of the organization. A combination of best practices, preemptive measures and human devotion has allowed them to sail through this period without sacrificing business continuity and quality.

We keep referring to international best practices. Here we have a company that is running its entire operations on those best practices and doing a better job than most international organizations have managed to. You don't have to look far to find a better lesson to learn from.