News | October 7, 2009

Storage Intervention: Deduplication Helps VARs Tackle Hardware Over-Consumption

By Randy Cochran, VP Channel Sales, Symantec

Customers today have more data to manage than ever before, and IT infrastructures have become more complex as storage and bandwidth consumption increases. In this environment, customers are actively looking for ways to accommodate expanding data volumes without adding new hardware, optimize their existing infrastructure to control costs and improve return on IT investments, and reduce complexity to ease management. With data deduplication, solution providers can help customers eliminate duplicate backup data, which reduces the amount of data they must store and archive and, in turn, enables them to make better use of the storage they already own.

Better yet, solution providers have a widening range of deduplication approaches to offer their customers. These solutions move deduplication closer to information sources so that duplicate data is reduced everywhere. With flexible engines that can run on a client and within a backup application media server, a single interface from which to manage the entire infrastructure, and support for virtualized environments as well as a variety of hardware appliances, these deduplication offerings provide measurable benefits that solution providers can deliver to customers.

Reducing Data
The most sophisticated deduplication technologies work by identifying every unique file and unique data segment in a customer's environment, centralizing that information, and then storing only a single copy of each. With deduplication, customers can store up to 10 times as much backup data per terabyte of disk, and sometimes more, giving organizations a way to manage more data with less hardware. Deduplication also gives customers the option of eliminating tape backup in the data center, along with the associated cost, risk, and hassle, while automating disk-to-disk backup and scaling its performance.
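To make the "store only a single copy" idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's implementation: the `DedupStore` class, the fixed 4 KB segment size, and the use of SHA-256 fingerprints are all assumptions chosen for simplicity, and commercial products often use variable-size segments instead.

```python
import hashlib

# Assumed fixed segment size in bytes (hypothetical; real products vary).
SEGMENT_SIZE = 4096


class DedupStore:
    """Toy deduplicated store: each unique segment is kept exactly once."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes (one copy each)
        self.backups = {}   # backup name -> ordered list of fingerprints

    def write(self, name, data):
        """Split data into segments and store only segments not seen before."""
        recipe = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # setdefault keeps the existing copy if the segment is a duplicate.
            self.segments.setdefault(fingerprint, segment)
            recipe.append(fingerprint)
        self.backups[name] = recipe

    def read(self, name):
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.segments[fp] for fp in self.backups[name])

    def logical_bytes(self):
        """Total size of all backups as the customer sees them."""
        return sum(len(self.segments[fp])
                   for recipe in self.backups.values() for fp in recipe)

    def stored_bytes(self):
        """Bytes actually held on disk after deduplication."""
        return sum(len(segment) for segment in self.segments.values())


store = DedupStore()
store.write("monday", b"A" * 8192 + b"B" * 4096)
store.write("tuesday", b"A" * 8192 + b"C" * 4096)  # mostly unchanged data
print(store.logical_bytes(), store.stored_bytes())
```

In this toy run, two backups that share most of their content occupy only the space of their three unique segments, and either backup can still be restored in full from the list of fingerprints recorded for it.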

Remote offices that must centralize all their data in a single data center and perform backups over existing network connections will find deduplication technologies invaluable. Transferring high volumes of backup data over slow WAN connections can be troublesome and inefficient. When a deduplication agent installed on a client at the remote site deduplicates data locally at the source, traffic is significantly reduced because only unique segments of data are transferred over the network.
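The source-level exchange described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `MediaServer` class, the `client_backup` function, and the 4 KB segment size are hypothetical names and values, and the "network" is simulated by ordinary method calls. The client sends fingerprints first, and only the segments the server reports as missing are sent in full.

```python
import hashlib

# Assumed fixed segment size in bytes (hypothetical).
SEGMENT_SIZE = 4096


def fingerprint(segment):
    """Short, fixed-length identifier for a segment's content."""
    return hashlib.sha256(segment).hexdigest()


class MediaServer:
    """Stands in for the data center side; holds one copy of each segment."""

    def __init__(self):
        self.segments = {}

    def missing(self, fingerprints):
        """Tell the client which fingerprints the server has never seen."""
        return {fp for fp in fingerprints if fp not in self.segments}

    def upload(self, segments_by_fingerprint):
        self.segments.update(segments_by_fingerprint)


def client_backup(data, server):
    """Back up data and return the segment bytes sent over the simulated WAN.

    The small fingerprint exchange itself is not counted in the total.
    """
    local = {}
    for offset in range(0, len(data), SEGMENT_SIZE):
        segment = data[offset:offset + SEGMENT_SIZE]
        local[fingerprint(segment)] = segment  # duplicates collapse locally
    needed = server.missing(local.keys())
    payload = {fp: local[fp] for fp in needed}
    server.upload(payload)
    return sum(len(segment) for segment in payload.values())


server = MediaServer()
first = client_backup(b"A" * 8192 + b"B" * 4096, server)
second = client_backup(b"A" * 8192 + b"C" * 4096, server)
print(first, second)
```

In this sketch the second backup transfers only the one segment that changed, which is the effect that makes source-level deduplication attractive over slow links.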

In the data center, deduplication occurs at the backup media server level. As a result, all data transferred to that media server is deduplicated, and only unique data segments are stored on it. Customers can also make deduplication part of their disaster recovery strategy by using solutions that integrate with clustering technologies to replicate deduplicated data to a secondary storage pool and to provide automatic node failover and redundant storage paths.

One of the most compelling uses of deduplication technology is with email and content archiving tools. When deduplication is integrated into such tools, the solution moves messages, files, and other content directly out of applications to a deduplicated archive. This helps reduce the archive footprint, which some archiving tools further minimize with optimized single instance storage. Optimized single instance storage allows the archive to share commonly repeated data between archived items such as email and SharePoint documents, thereby shrinking the footprint down to only one copy in the archive, regardless of source.
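As a rough illustration of single instance storage, the Python sketch below archives the same document from two different sources and keeps one copy. The `SingleInstanceArchive` class, its method names, and the source labels are hypothetical, and hashing whole items with SHA-256 is an assumption made for brevity rather than a description of how any particular archiving tool works.

```python
import hashlib


class SingleInstanceArchive:
    """Toy archive that keeps one copy of identical content from any source."""

    def __init__(self):
        self.blobs = {}       # content hash -> bytes, stored once
        self.references = {}  # content hash -> list of (source, item id)

    def archive(self, source, item_id, content):
        """Record the item; store its content only if it is new."""
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)
        self.references.setdefault(digest, []).append((source, item_id))
        return digest


archive = SingleInstanceArchive()
report = b"Quarterly results, final version"
archive.archive("email", "message-1001", report)      # sent as an attachment
archive.archive("sharepoint", "doc-42", report)       # also saved to a site
print(len(archive.blobs))
```

Both archived items remain individually retrievable through their references, while the shared content occupies space in the archive only once.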

Reducing Infrastructure
With customers rapidly embracing virtualization as a way to reduce infrastructure, solution providers can also use deduplication to help those customers improve the return on their server virtualization investments. The most advanced deduplication tools typically offer two approaches for backing up and deduplicating virtual environments such as VMware. One option is to run the deduplication client at the source level in a VMware guest and back up the data directly from the guest to the deduplication environment. Another option is off-host deduplication, in which a snapshot is moved from the ESX Server to a backup media server where deduplication is performed.

In fact, deduplication can help customers overcome a variety of challenges associated with backing up their virtual environments. For example, in a typical virtual environment, memory and CPU utilization are virtualized but I/O is not, which creates a bottleneck when large amounts of data are transferred during backup. Customers can use deduplication to address this challenge by significantly reducing the data set and, in turn, the amount of data that must be transferred over the network.

Because data sets do not become smaller when they are virtualized, customers must still back up the same amount of data in a virtual environment as they would if that same data were in a physical environment. By using deduplication to reduce their virtualized data set, customers will also reduce the size of the backups of their VMware image files, thereby making their virtualized environment more efficient.

Reducing Complexity
Of course, unless customers can manage their environment, they will likely be unwilling to invest in even the most promising data deduplication technologies. Consequently, an effective deduplication solution must enable organizations to manage their entire deduplication environment, including remote offices and the data center running physical and virtual systems as well as popular deduplication appliances, from a single interface.

As customers continue to struggle to accommodate more data and meet more stringent legal and performance requirements without adding unnecessary complexity or cost, data deduplication is emerging as a highly effective solution. Data deduplication enables customers to use their existing storage infrastructure more efficiently, delay purchases of additional hardware, and avoid out-of-control storage costs while improving their return on server virtualization investments. By helping customers identify and put in place a deduplication infrastructure that can be managed across their entire environment, solution providers have a powerful way to deliver customer value now and in the future.

SOURCE: Business Solutions Magazine