Data removal process – Symantec NETBACKUP 7 User Manual

Page 118


The PureDisk plug-in reads the backup image and separates the image into
files.

The PureDisk plug-in separates files into segments and calculates the
fingerprint for each file and segment.

The plug-in compares each fingerprint against the local fingerprint cache. If
the fingerprint is not in the cache, the plug-in asks the engine to verify
whether the fingerprint already exists.

If the fingerprint does not exist, the segment is sent to the engine. If the
fingerprint exists, the segment is not sent.
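The lookup order described above (local cache first, then the engine, then send only if the fingerprint is unknown) can be sketched as follows. This is an illustrative model only, not NetBackup code: the `Engine` class, `backup_file`, `split_segments`, and the 4-byte `SEGMENT_SIZE` are hypothetical names and values chosen for the example.

```python
import hashlib

SEGMENT_SIZE = 4  # hypothetical; kept tiny so the example is easy to trace


class Engine:
    """Hypothetical stand-in for the engine: stores segments by fingerprint."""

    def __init__(self):
        self.segments = {}

    def has_fingerprint(self, fp):
        return fp in self.segments

    def store(self, fp, data):
        self.segments[fp] = data


def fingerprint(data):
    # Assumption: plain MD5 is used here; collision handling is not modeled.
    return hashlib.md5(data).hexdigest()


def split_segments(data, size=SEGMENT_SIZE):
    # Fixed-size segmentation, chosen only for simplicity.
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup_file(data, cache, engine):
    """Deduplicate one file; return how many segments were sent to the engine."""
    sent = 0
    for seg in split_segments(data):
        fp = fingerprint(seg)
        if fp in cache:
            continue  # known in the local cache: nothing to ask or send
        if not engine.has_fingerprint(fp):
            engine.store(fp, seg)  # unknown everywhere: send the segment
            sent += 1
        cache.add(fp)  # remember the fingerprint for later lookups
    return sent
```

In this model, a second client with an empty cache still avoids resending segments that the engine already holds, because the engine lookup catches what the local cache misses.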

The fingerprint calculations are based on the MD5 algorithm. However, any
segments that have different content but the same MD5 hash key get different
fingerprints, so an MD5 collision cannot cause NetBackup to treat two
different segments as the same segment.
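The manual does not say how distinct fingerprints are derived when two segments share an MD5 hash. One possible scheme, shown purely as an assumption, appends a numeric suffix to the hash and compares content whenever the hash matches. The function `assign_fingerprint`, the `store` mapping, and the `hash_fn` parameter are hypothetical and are not part of any NetBackup interface.

```python
import hashlib


def md5_hex(data):
    return hashlib.md5(data).hexdigest()


def assign_fingerprint(data, store, hash_fn=md5_hex):
    """Return a fingerprint that is unique per distinct content.

    store maps fingerprint -> content already stored (hypothetical model).
    """
    base = hash_fn(data)
    n = 0
    while True:
        fp = f"{base}-{n}"
        existing = store.get(fp)
        if existing is None:
            store[fp] = data  # new content: claim this fingerprint
            return fp
        if existing == data:
            return fp  # identical content: reuse the fingerprint
        n += 1  # same hash, different content: try the next suffix
```

Because real MD5 collisions are hard to produce on demand, the scheme can be exercised by passing a deliberately weak `hash_fn`, for example `lambda d: "x"`, which forces every segment to collide.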

Data removal process

The following list describes the data removal process for expired backup images:

NetBackup removes the image record from the NetBackup catalog.

NetBackup directs the NetBackup Deduplication Manager to remove the image.

The deduplication manager immediately removes the image entry and adds a
removal request for the image to the database transaction queue.

From this point on, the image is no longer accessible.

When the queue is next processed, the NetBackup Deduplication Engine
executes the removal request. The engine also generates removal requests for
the underlying data segments.

At the next queue processing run after that, the NetBackup Deduplication
Engine executes the removal requests for the segments.

Storage is reclaimed after two queue processing runs; that is, in one day. However,
data segments of the removed image may still be in use by other images.
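The two-run removal sequence can be modeled with a simple queue. This is a sketch under stated assumptions, not the product's implementation: the `DedupStore` class, its reference counts, and the `pending` map are hypothetical, and the rule that a segment is reclaimed only when no other image references it is inferred from the statement that segments may still be in use by other images.

```python
from collections import deque


class DedupStore:
    """Hypothetical model of the image entries, segments, and transaction queue."""

    def __init__(self):
        self.images = {}    # accessible images: image id -> segment fingerprints
        self.pending = {}   # removed images waiting for queue processing
        self.refcount = {}  # fingerprint -> number of images that use it
        self.segments = {}  # fingerprint -> stored data
        self.queue = deque()

    def add_image(self, image_id, segs):
        self.images[image_id] = list(segs)
        for fp, data in segs.items():
            self.segments[fp] = data
            self.refcount[fp] = self.refcount.get(fp, 0) + 1

    def remove_image(self, image_id):
        # The image entry disappears immediately; the real work is queued.
        self.pending[image_id] = self.images.pop(image_id)
        self.queue.append(("image", image_id))

    def process_queue(self):
        # Handle only the requests present when the run starts, so requests
        # generated during this run wait for the next run.
        for _ in range(len(self.queue)):
            kind, key = self.queue.popleft()
            if kind == "image":
                for fp in self.pending.pop(key):
                    self.refcount[fp] -= 1
                    self.queue.append(("segment", fp))
            elif self.refcount.get(key) == 0:
                # No image uses this segment any more: reclaim its storage.
                del self.segments[key]
                del self.refcount[key]
```

In this model, removing an image that shares a segment with another image reclaims only the unshared segments, and only after the second call to `process_queue`.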

If you manually delete an image that has expired within the previous 24 hours,
the data becomes garbage. It remains on disk until removed by the next garbage
collection process.

See “About maintenance processing” on page 90.

