Maintaining backup health amid constant change

27.06.2006
Backup is degrading. No, I'm not trying to imply that there is something ignoble or shameful about the activity -- some of my best friends are backup administrators! I mean that in an active environment, backup is in a continual state of change, and a system that was performing optimally three to six months ago is likely to be performing substantially less well today. As servers and applications are added to an environment and data on primary storage grows, the load on backup grows disproportionately, and without focused attention, the backup environment will inevitably degrade.

The decay can be gradual, and therefore may not be noticed when administrators are focused on the day-to-day operational aspects of backup, particularly when they lack comprehensive backup reporting to track trends. As a result, small problems can go unnoticed until they become large. For this reason, it is important to develop a regular routine -- a backup health calendar -- to check various aspects of the backup environment's health at appropriate intervals: daily, weekly, monthly, quarterly and annually.

Here are a few examples of each:

Daily

-- Review successful and unsuccessful backups.

-- Resolve failures before the next backup window.

-- Check tape drive status.

-- Check media status.

-- Confirm catalog backups.

-- Review and act on requests: restores, change management, etc.

-- Send vault tapes off-site.

Weekly

-- Check scratch-tape inventory.

-- Check library slots available.

-- Check the catalog tape inventory.

-- Review the off-site catalog pool.

-- Review backup server log files.

-- Publish weekly metrics reports.

Monthly

-- Check backup catalog disk space usage.

-- Review and apply any new software patches as required.

-- Perform random restore testing.

-- Publish trending metrics reports.

Quarterly

-- Review off-site tape storage authorization.

-- Review and apply any new software version releases as appropriate.

-- Perform high-level capacity planning review of backup storage resources.

-- Review performance statistics for each backup server.

Annually

-- Inventory and audit off-site tape storage.

-- Audit backup software licenses for compliance.

-- Review and update backup/recovery policies for application environments as appropriate.

-- Perform detailed capacity-planning and resource-utilization analysis.

-- Apply resource utilization metrics to budget request and forecasting.
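
Several of the items above (weekly metrics, monthly trending reports) exist to catch gradual decay before it becomes a large problem. As a minimal sketch of that idea, the Python below compares the average backup success rate over the most recent week with the average over the preceding 30 days. The function name, the window lengths, the 2% alert threshold and the sample data are all assumptions made for illustration; real figures would come from whatever reporting the backup product provides.

```python
from statistics import mean


def success_rate_drop(daily_rates, recent_days=7, baseline_days=30):
    """Return how far the recent average success rate sits below the baseline.

    daily_rates: daily success rates between 0.0 and 1.0, oldest first.
    A positive result means recent backups succeed less often than before.
    """
    if len(daily_rates) < recent_days + baseline_days:
        raise ValueError("not enough history to compare")
    recent = daily_rates[-recent_days:]
    baseline = daily_rates[-(recent_days + baseline_days):-recent_days]
    return mean(baseline) - mean(recent)


# Hypothetical history: 30 days at 98% success followed by 7 days at 93%.
history = [0.98] * 30 + [0.93] * 7
drop = success_rate_drop(history)
if drop > 0.02:  # alert threshold chosen arbitrarily for this sketch
    print(f"Success rate is down {drop:.1%} against the 30-day baseline")
```

A slide of a few points would be easy to miss when looking only at each night's results, which is why the comparison is made against a longer baseline instead of the previous day.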

Specifics will vary within a given environment, but the key is to reserve some time to focus on the less urgent, but still very important, aspects of the backup operation.
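
Because the specifics vary, one option is to keep the calendar as data and let a small script list what is due on a given day. The Python sketch below is illustrative only: the task lists are abridged from the calendar above, and the scheduling rules (weekly checks on Mondays, monthly checks on the 1st, quarterly checks on the 1st of January, April, July and October, annual checks on 1 January) are assumptions, not recommendations from this column.

```python
from datetime import date

# Abridged task lists; a real calendar would carry the full set for the site.
HEALTH_CALENDAR = {
    "daily": ["Review successful and unsuccessful backups", "Check tape drive status"],
    "weekly": ["Check scratch-tape inventory", "Review backup server log files"],
    "monthly": ["Check backup catalog disk space usage", "Perform random restore testing"],
    "quarterly": ["Review performance statistics for each backup server"],
    "annually": ["Audit backup software licenses for compliance"],
}


def intervals_due(day):
    """Return the calendar intervals whose checks fall on `day` (assumed rules)."""
    due = ["daily"]
    if day.weekday() == 0:  # Monday
        due.append("weekly")
    if day.day == 1:
        due.append("monthly")
        if day.month in (1, 4, 7, 10):
            due.append("quarterly")
        if day.month == 1:
            due.append("annually")
    return due


def tasks_due(day):
    """Flatten the tasks for every interval due on `day` into one list."""
    return [task for interval in intervals_due(day) for task in HEALTH_CALENDAR[interval]]


# 3 July 2006 was a Monday, so the daily and weekly checks are listed.
for task in tasks_due(date(2006, 7, 3)):
    print("--", task)
```

Keeping the tasks in a plain dictionary means the lists can be adjusted for a particular environment without touching the scheduling logic.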

Jim Damoulakis is chief technology officer of GlassHouse Technologies Inc., a leading provider of independent storage services. He can be reached at jimd@glasshouse.com