Killing risk, unifying data protection

27.02.2007
The most successful companies are those that have been able to reinvent themselves and adapt to changing user demands and competitive landscapes. In IT, it is equally important to question and reevaluate why we do the things we do. Often, practices have been adopted at a tactical level to address a particular need with little consideration for other related requirements. Over time, this approach can lead to a hodge-podge of solutions that may partially overlap but lack unified direction or management.

In data protection, this is evident in the variety of techniques applied within an organization -- nightly backup, snapshot, mirroring, database dumps, database log-shipping, host-based replication, storage array-based replication -- to name a few. I'm not suggesting that these approaches serve no valid purpose, or even that they cannot all be applied effectively within a given environment. However, employing multiple tactical solutions without an overall unified data protection strategy can be both wasteful and risky.

One way to begin to form a unified strategy is to consider the range of data protection risks that need to be addressed. Some are obvious, while others may be less so. Among the more common are the following:

-- Physical device failure -- Loss of a storage element or connectivity to that element. This is typically addressed by redundancy, e.g., RAID, multipath I/O, etc.

-- Detectable logical data loss -- Accidental deletion or corruption. Commonly addressed by point-in-time data copies, e.g., backup, snapshot, database dump, split mirror.

-- Site loss -- Large-scale data recovery because of a site outage. Remediated through disaster recovery planning, including recovery from off-site media or various kinds of replication.

Some less obvious risk sources are the following:

-- Undetected data loss -- What is the likelihood of data loss (accidental deletions, improper data modifications, latent corruption) going undetected for weeks or months? Does the organization have a means to protect against this type of loss?

-- Interdependency risk -- Inability to recover because data is not synchronized among or within applications. This area is not given sufficient attention, and it can add hours or even days of time and effort to a recovery. Has this risk been considered?

Developing a comprehensive profile of risks and aligning it to a common framework of policies and practices can eliminate redundant effort and close exposures that may exist today.
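
One way to picture such a risk profile is as a simple coverage table that maps each technique in use to the risks it addresses. The Python sketch below is only an illustration under stated assumptions: the inventory is hypothetical, the technique-to-risk pairings follow the examples listed above, and the function names and the overlap threshold are invented for this example. A real profile would come from an organization's own assessment.

```python
from collections import defaultdict

# The five risk categories discussed in this article.
RISKS = [
    "physical device failure",
    "detectable logical data loss",
    "site loss",
    "undetected data loss",
    "interdependency risk",
]

# Hypothetical inventory (an assumption for illustration):
# each technique in use, mapped to the risks it is assumed to address.
INVENTORY = {
    "RAID": {"physical device failure"},
    "multipath I/O": {"physical device failure"},
    "nightly backup": {"detectable logical data loss"},
    "snapshot": {"detectable logical data loss"},
    "database dump": {"detectable logical data loss"},
    "storage array-based replication": {"site loss"},
}


def coverage_profile(risks, inventory):
    """Return {risk: sorted list of techniques that address it}."""
    covered_by = defaultdict(list)
    for technique, addressed in inventory.items():
        for risk in addressed:
            covered_by[risk].append(technique)
    return {risk: sorted(covered_by.get(risk, [])) for risk in risks}


def gaps(profile):
    """Risks that no technique in the inventory addresses."""
    return [risk for risk, techniques in profile.items() if not techniques]


def overlaps(profile, threshold=2):
    """Risks addressed by more than `threshold` techniques (candidates for review)."""
    return {r: t for r, t in profile.items() if len(t) > threshold}


if __name__ == "__main__":
    profile = coverage_profile(RISKS, INVENTORY)
    for risk, techniques in profile.items():
        print(f"{risk}: {', '.join(techniques) or 'NOT COVERED'}")
    print("Gaps:", gaps(profile))
    print("Overlaps to review:", overlaps(profile))
```

With this made-up inventory, the two less obvious risks show up as gaps, and detectable logical data loss shows up as covered three times over. An overlap is not automatically waste, since several techniques may each serve a valid purpose, but it marks a place where a unified strategy should ask whether every copy is still needed.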

Jim Damoulakis is chief technology officer of GlassHouse Technologies Inc., a leading provider of independent storage services. He can be reached at jimd@glasshouse.com