|
Virtual tape
Since the dawn of the digital computer age, long-term data storage and backup have been the province of a single primary medium: magnetic tape. Tape has compelling advantages. It's inexpensive to operate and buy, and even cheaper to store, whether it exists on reels, inside cartridges or as part of an automated tape library system. Tape also has the benefit of separating the portable and inexpensive storage medium from the larger, more costly recording machinery. The introduction of tape made it possible to back up everything, keep copies off-site and restore older or deleted files as needed.
In comparison, hard drive storage combined the machine and the medium into a single piece of hardware, gaining speed and simplifying access. But it also drove up storage costs and was for years simply not economical for backup use. Tape persisted as the storage medium of choice, even though it suffered from poor performance and the need for sequential, not random, access to stored data. Tape wasn't very fast, and for the most part, operations had to be run in batch mode, often overnight.
In the early 1970s, IBM predicted the death of tape as a backup medium, and since then, others have continued to echo that sentiment. That hasn't happened yet, and it's not at all obvious that it will. But the amount of data being stored and processed continues to grow exponentially, and while ever-larger tape formats continue to emerge, the time needed to perform regular backups is also growing.
Finally, the economics of backup changed radically as hard drive storage became far cheaper. Not only are new hard drives cheap, capacious, physically smaller and increasingly reliable, but they operate much faster and offer online storage at off-line prices and with no waiting. A 250GB hard drive today costs less per gigabyte than the digital linear tape cartridges for a relatively recent tape library. Although tapes are still much more portable than RAID arrays, it's now practical to replace tape with disk for primary backups to boost speed, improve reliability and eliminate delays in loading and searching for needed data.
One logical response to this technological change was for enterprise IT to shift to hard-drive-based backup systems. But this approach required a surprising amount of work to convert existing systems, policies and procedures. Enterprise backup teams are used to fine-tuning backup environments and applications by adding custom scripts and workflows to manage thousands of individual tapes both on- and off-site. Even positive change will be disruptive in this setting, so IT managers are rightly concerned about the effects of disk-based backups on their systems and scheduling. The better answer, at least for now, turns out to be a game of "Let's Pretend."
With virtual tape, even though we're backing up direct to disk, we pretend we're dealing with tape. Data is backed up to the disk subsystem through what's called a virtual tape library: software that emulates the properties of tape. By making the disks look like tape, the virtual system lets IT use its existing tape-based scheduling procedures and practices, scripts and workflows; the only difference is that backup data is stored on a different set of devices. This is such a simplification as to be nearly simple-minded, but it allows IT to expand its capabilities with little or no effort, and it gets away from the need to handle, rotate and store near-line tapes. The net effect is that virtual tape makes both backups and restores faster, more reliable and cheaper.
Virtual tape libraries emulate industry-standard physical tape drives and libraries, presenting themselves as tape to all of the common backup software applications. A backup media server sends backup streams to a virtual tape library, which writes the data sequentially (that is, in native tape format) to disk storage. Through this bit of hocus-pocus, the virtual tape library appears to the system as another automated tape library, but the fact that data is being written to disk means backup jobs are completed significantly faster, often by a factor of 10 or more.
Virtual systems emulate tape operations even to the point of assigning bar codes to the virtual tape "reels" or "cartridges" used by the backup software.
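The emulation described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not how any particular product works: the class names `VirtualCartridge` and `VirtualTapeLibrary`, the `VT0001`-style bar code format and the use of an in-memory buffer in place of a real disk file are all hypothetical. What it tries to show is the core idea: the backup software sees tape-like behavior (append-only writes, rewind, strictly sequential reads, bar-coded cartridges) while the bytes actually land on disk-style storage.

```python
import io
import itertools


class VirtualCartridge:
    """Hypothetical virtual tape cartridge backed by disk-style storage."""

    def __init__(self, barcode, backing=None):
        self.barcode = barcode
        # io.BytesIO stands in for a file on the disk subsystem (assumption).
        self._backing = backing if backing is not None else io.BytesIO()
        self._block_lengths = []  # length of each block, in write order
        self._next_block = 0      # read position, counted in blocks

    def write_block(self, data):
        """Append a block at the end of the tape; nothing is overwritten."""
        self._backing.seek(0, io.SEEK_END)
        self._backing.write(data)
        self._block_lengths.append(len(data))

    def rewind(self):
        """Move the read position back to the start of the tape."""
        self._next_block = 0

    def read_block(self):
        """Return the next block in sequence, or None at end of tape."""
        if self._next_block >= len(self._block_lengths):
            return None
        offset = sum(self._block_lengths[:self._next_block])
        self._backing.seek(offset)
        data = self._backing.read(self._block_lengths[self._next_block])
        self._next_block += 1
        return data


class VirtualTapeLibrary:
    """Hypothetical library that hands out bar-coded virtual cartridges."""

    def __init__(self, prefix="VT"):
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._slots = {}  # barcode -> cartridge

    def new_cartridge(self):
        # The bar code format (prefix plus four digits) is an assumption.
        barcode = f"{self._prefix}{next(self._counter):04d}"
        cartridge = VirtualCartridge(barcode)
        self._slots[barcode] = cartridge
        return cartridge

    def load(self, barcode):
        """'Mount' a cartridge by bar code, rewound and ready to read."""
        cartridge = self._slots[barcode]
        cartridge.rewind()
        return cartridge
```

A backup script written for physical tape would use this the same way it uses a real library: request a cartridge, write blocks in order, and later load the cartridge by bar code and read the blocks back from the beginning. There is no random access in the interface, even though the underlying disk could provide it.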
Virtual tape isn't necessarily the entire answer to backup. It still doesn't address the requirements of off-site storage and disaster recovery, but it can be used with a hierarchical storage management system in which data is moved to slower and increasingly less expensive storage media as it is used less. Virtual tape may also be included as part of a storage-area network, with a single virtual tape server managing less-used or archived data for many networked computers.
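The hierarchical storage management idea, in which data moves to slower and cheaper media as it is used less, can be sketched as a simple placement rule. The tier names, the 30-day and 180-day thresholds and the function name `choose_tier` are assumptions made for illustration; real systems apply their own policies.

```python
from datetime import datetime, timedelta

# Hypothetical tiers, fastest and most expensive first.
# Each entry is (tier name, longest idle time allowed on that tier).
TIERS = [
    ("primary disk", timedelta(days=30)),
    ("virtual tape", timedelta(days=180)),
]
# Anything idle longer than the last threshold goes here (assumption).
ARCHIVE_TIER = "physical tape"


def choose_tier(last_access, now):
    """Pick a storage tier based on how long the data has been idle."""
    idle = now - last_access
    for name, max_idle in TIERS:
        if idle <= max_idle:
            return name
    return ARCHIVE_TIER


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days in (5, 90, 400):
        print(days, "days idle ->", choose_tier(now - timedelta(days=days), now))
```

In this sketch virtual tape sits in the middle of the hierarchy: it holds data that is no longer active enough for primary disk but may still need to be restored quickly, while the oldest data falls through to the cheapest tier.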
|