Tips, checklist to deploy tape virtualization

12.07.2006
Virtual tape libraries (VTLs) have become a popular storage topic for IBM mainframes as well as Linux, Unix and Windows environments. VTLs are a building block for deploying tiered storage and for supporting disk-to-disk data protection (3DP), including backup, archiving, compliance retention, business continuance and disaster recovery. Among other things, VTLs enable and provide the following:

- Improved performance of backup, restore, archiving and other tape-based processing

- Reduced or eliminated manual handling of tape or optical media

- Coexistence with existing backup and archiving software, procedures and policies

- A phased transition from tape to disk-based processing with minimum disruption

So let's spend a few minutes looking at some issues, problems and hurdles to be aware of when deploying a VTL solution. VTLs are available from many different vendors in a variety of products and feature sets. Some are positioned as software-based, while others are more encompassing solutions combining hardware, software, storage and management tools and services. Keep in mind the following things to avoid problems and surprises with VTL deployments:

1. If the VTL system is being sold or marketed as a software product, you should consider where the software will run and who supplies and integrates the server, software and storage as well as provides ongoing support.

2. What happens when the VTL becomes full or runs out of capacity? While a VTL provides virtual tape emulation, I'm not aware of any VTLs yet (besides in architecture slides) that provide unlimited real-world virtual capacity, although I'm aware of several products that have exceptional scaling capabilities to dynamically add storage. Likewise, there are vendors that can proactively manage and reduce the amount of storage consumed, including by leveraging differencing, compaction, factoring, deduplication and single-instance image capabilities.

3. Some VTLs provide extensive interoperability and emulation capabilities to coexist with existing software configurations and settings. Look into what emulation is provided by a VTL as well as how extensive and robust the emulation is compared to your needs and requirements. Keep in mind that some of your applications or procedures may have hard-coded references that may prove sticky and thus need to be accounted for and addressed.

4. How will you be using your VTL -- will you be using it primarily as a large disk buffer and staging pool with as much data as possible kept on disk, or will you look to actively leverage tape and move data off of disk as soon as possible?

5. What format is the data in, and how accessible is it, once it is stored on a VTL? Is the data stored as a file representing a tar ball or proprietary backup save set format, or is it stored in a format that lends itself to rapid access? For example, some vendors create relatively small container files, while others map more closely to a 1-to-1 correlation between physical and virtual tapes.

6. What's involved in making a clone copy of a virtual tape along with exporting or moving a virtual tape to a physical tape for storage or transport, and what audit trails and logs exist?

7. Will any special host software or agents have to be installed, and what if any changes need to be made to your monitoring and notification systems? If you have not already done so, look into backup and data protection monitoring and analysis tools such as those from Aptare, Bocada Inc. and WysDM Software Inc. to help gauge how your backups are performing.

8. Who has access to and can allocate virtual tapes, virtual tape drives and libraries?

9. How are firmware and software updates handled, and what are the software maintenance fees?

10. How fast can a job access a particular file, and can multiple streams access the same virtual tape?
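To make item 3 more concrete, here is a minimal Python sketch of one way to look for hard-coded tape device references in backup scripts before moving to a VTL. The device-name patterns (Linux /dev/st0 and /dev/nst0, Solaris and AIX /dev/rmt names, Windows \\.\Tape0), the script suffixes and the function names are illustrative assumptions, not an exhaustive list or any vendor's tool; adjust them to the platforms and naming conventions in your own environment.

```python
import re
from pathlib import Path

# Assumed, illustrative patterns for common tape device names.
TAPE_DEVICE_PATTERN = re.compile(
    r"/dev/n?st\d+[a-z]*"      # Linux SCSI tape, e.g. /dev/st0, /dev/nst0
    r"|/dev/rmt/?\d+[a-z]*"    # Solaris /dev/rmt/0cbn, AIX /dev/rmt0
    r"|\\\\\.\\Tape\d+",       # Windows \\.\Tape0
    re.IGNORECASE,
)


def find_tape_references(text):
    """Return (line_number, device_name) pairs for each match in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TAPE_DEVICE_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_tree(root, suffixes=(".sh", ".ksh", ".pl", ".bat", ".cmd")):
    """Scan script files under root; return {path: [(line, device), ...]}."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            hits = find_tape_references(path.read_text(errors="replace"))
            if hits:
                results[str(path)] = hits
    return results
```

Calling scan_tree on a directory of backup scripts returns the files and line numbers that name a tape device directly. Each hit is a candidate sticky point to review against the emulation the VTL provides.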

A premise of VTLs is to emulate and coexist with existing software and procedures, including backup scripts, to mask the complexity of underlying hardware and software. Hence, the move to a VTL system should integrate seamlessly into your environment, and the fewer disruptions and changes the better. That said, identify what and where you may have "sticky" points in your environment for future consideration, or address those points now. Sticky points include custom scripts or application-specific linkages to hardware and software that inhibit your ability to leverage new technologies, resulting in additional complexity. Keep the following 10 questions in mind when considering VTLs for deployment into your environment.

1. How does a VTL integrate into your business continuity and disaster recovery plans locally as well as at a remote hot or cold site?

2. What storage devices do you want or need to be attached to the VTL?

3. How much storage capacity will you need in the future, and how will the VTL support that growth?

4. What backup, archiving, compliance and other data protection software is supported by the VTL?

5. Can the VTL support some form of differencing or single-instance repository capabilities?

6. Will you perform the integration of hardware and software, or are you looking for a turnkey solution?

7. What level of redundancy and resiliency do you need in a VTL solution, and are upgrades disruptive?

8. Do you need advanced security capabilities, including encryption of virtual tapes?

9. What tape device and tape library emulation do you need supported for your environment?

10. What changes will be needed to your environment's software, procedures and policies to use a VTL?
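For questions 3 and 5, a rough back-of-the-envelope calculation can help frame the conversation with vendors. The Python sketch below is a simplified model under stated assumptions: constant daily backup volume, a fixed retention period, a single average data reduction ratio (for example from deduplication or single-instance storage) and compound annual growth. The function and parameter names are hypothetical, and real reduction ratios vary widely with data type and backup policy, so treat the results as a starting point, not a sizing guarantee.

```python
def required_capacity_tb(daily_ingest_tb, retention_days,
                         reduction_ratio=1.0, annual_growth=0.0, years=0):
    """Estimate disk capacity (TB) needed to hold retention_days of backups.

    reduction_ratio is the assumed average data reduction (4.0 means 4:1).
    annual_growth is the assumed yearly growth in daily ingest (0.5 = 50%).
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return (daily_ingest_tb * (1 + annual_growth) ** years
            * retention_days / reduction_ratio)


def days_until_full(usable_tb, daily_ingest_tb,
                    reduction_ratio=1.0, used_tb=0.0):
    """Estimate days until the disk pool fills, assuming nothing expires
    or migrates to physical tape in the meantime."""
    if daily_ingest_tb <= 0 or reduction_ratio <= 0:
        raise ValueError("daily_ingest_tb and reduction_ratio must be positive")
    free_tb = usable_tb - used_tb
    if free_tb <= 0:
        return 0.0
    return free_tb / (daily_ingest_tb / reduction_ratio)


# Example: 2 TB/day, 30-day retention, assumed 4:1 reduction.
print(required_capacity_tb(2.0, 30, reduction_ratio=4.0))            # 15.0
# Same workload after two years of assumed 50% annual growth.
print(required_capacity_tb(2.0, 30, 4.0, annual_growth=0.5, years=2))  # 33.75
# A 20 TB pool with 5 TB already used, same ingest and reduction.
print(days_until_full(20.0, 2.0, reduction_ratio=4.0, used_tb=5.0))  # 30.0
```

Comparing the growth-adjusted figure with the capacity a VTL can scale to shows whether you are likely to hit the "VTL is full" scenario within the planning horizon.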

VTLs combine virtualization (emulation) and disk-based backup to address various needs. When the most applicable solution for your requirements is properly implemented, it can provide many benefits to your organization. Proper planning and working closely with vendors and/or business partners can reduce the number of surprises and disruptions associated with deploying a VTL into your environment. As with any technology, virtualization should work for you, not the other way around, so it should not increase your workload or add complexity.

Greg Schulz is founder and senior analyst of the StorageIO group and author of Resilient Storage Networks (Digital Press, 2004).