6 deduplication, How it works, Getting deduplication running on the vls – HP 9000 Virtual Library System User Manual


6 Deduplication

Deduplication stores only a single copy of each data block on a device. Duplicate
information is removed, allowing you to store more data in a given amount of space
and restore data using lower bandwidth links. The HP virtual library system uses
Accelerated deduplication.
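To make the single-copy idea concrete, the following is a minimal Python sketch of generic hash-based block deduplication. It is not HP Accelerated deduplication, which works by object-level differencing; the function names `dedup_store` and `restore`, the fixed 4-byte blocks, and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib


def dedup_store(blocks):
    """Store each distinct block once and record how to rebuild the stream."""
    store = {}    # digest -> block bytes (one copy per unique block)
    recipe = []   # ordered digests needed to rebuild the original stream
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original byte stream from the single stored copies."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical input: five 4-byte blocks, three of them identical.
blocks = [b"AAAA", b"BBBB", b"AAAA", b"CCCC", b"AAAA"]
store, recipe = dedup_store(blocks)

original_bytes = sum(len(b) for b in blocks)
stored_bytes = sum(len(b) for b in store.values())
print(f"original: {original_bytes} bytes, stored: {stored_bytes} bytes")
```

In this toy input the three identical blocks are stored once, so less space is consumed while the full stream can still be restored.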

This section describes deduplication, including how to get deduplication running on your
system, configure it, and view reports.

NOTE:

See the HP VLS Solutions Guide for more detailed information.

How It Works

HP Accelerated deduplication compares the most recent version of a backup to the previous version
using object-level differencing code. It places pointers in the earlier version that identify duplicated
content in the new version. Deduplication then eliminates the redundant data in the earlier version
while retaining the complete, new version. You can improve deduplication performance by
adding nodes.

NOTE:

Deduplication takes place after the data has been written to the virtual backup tapes.
Therefore, any data backed up to compression-enabled virtual tape drives (with either
software or hardware compression) is compressed before it is deduplicated.

The following is an overview of the deduplication process. See the HP VLS Solutions Guide for
more detailed information.

1. When a backup runs, a data grooming exercise is performed on the fly. Using meta-data
attached by the backup application, data grooming maps the content or “objects” of the
backup, and assembles a content database. This process has minimal performance impact.

2. After the scheduled backups have completed, the content database is used to “delta-difference”
(compare) objects in current and previous backups from the same hosts. There are different
levels of comparison. For example, files may be compared using a strong hashing function,
while other objects may be compared at a byte level.

3. When duplicate data is found in an older backup, it is replaced by a pointer to the most recent
copy of the same data. Because the most recent backup is a full version, you achieve the
fastest possible restores.

4. Space reclamation occurs when duplicate data from previous backups is removed from the
disk. This can take some time, but results in previously consumed capacity being returned to
a free pool on the device.
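The four steps above can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the VLS implementation: backups are modeled as dictionaries of object name to bytes, every object is compared by SHA-256 digest only, and the names `Pointer`, `groom`, `delta_difference`, `replace_with_pointers`, and `restore` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Placeholder left in an older backup, pointing at the newer copy."""
    backup_id: str
    object_name: str


def groom(backup):
    """Step 1: build a content database (object name -> digest)."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in backup.items()}


def delta_difference(old_db, new_db):
    """Step 2: list objects whose content is identical in both backups."""
    return [name for name, digest in old_db.items()
            if new_db.get(name) == digest]


def replace_with_pointers(old_backup, new_id, duplicates):
    """Steps 3 and 4: swap duplicates in the OLDER backup for pointers.

    Returns the number of bytes freed from the older backup.
    """
    reclaimed = 0
    for name in duplicates:
        reclaimed += len(old_backup[name])
        old_backup[name] = Pointer(new_id, name)
    return reclaimed


def restore(backup, backups):
    """Rebuild a backup, following pointers into newer backups as needed."""
    restored = {}
    for name, value in backup.items():
        while isinstance(value, Pointer):
            value = backups[value.backup_id][value.object_name]
        restored[name] = value
    return restored


# Hypothetical data: two backups of the same host on consecutive days.
monday = {"a.txt": b"hello", "b.txt": b"world", "c.txt": b"old"}
tuesday = {"a.txt": b"hello", "b.txt": b"world!", "c.txt": b"old"}
backups = {"mon": monday, "tue": tuesday}
monday_before = dict(monday)

duplicates = delta_difference(groom(monday), groom(tuesday))
reclaimed = replace_with_pointers(monday, "tue", duplicates)
print(f"duplicates: {duplicates}, bytes reclaimed: {reclaimed}")
```

Note that the newest backup (`tuesday`) never contains pointers, so restoring it needs no lookups, while the older backup is still fully recoverable by following its pointers.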

Getting Deduplication Running on the VLS

This section explains how to get deduplication running on your VLS system including some
considerations for setting up the system, installing the firmware, and installing the deduplication
licenses.

Considerations

To make the most of the deduplication benefits, review these considerations before setting it up on
your VLS system:

Virtual cartridge sizing — The system cannot deduplicate versions of a backup that are on
the same cartridge; the versions are not deduplicated until a new version is written to a different
virtual cartridge. Therefore, you want the cartridges to be sized big enough to contain an
