Complete guide to backup deduplication
A comprehensive collection of articles, videos and more, hand-picked by our editors
Does deduplication change the way we back up and restore data?
Data deduplication reduces the amount of storage space required for backups. This can benefit data protection by enabling faster and more frequent backups, faster restorations and potentially longer retention times, within the limits of regulatory compliance requirements and corporate policy.
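The storage savings come from keeping only one copy of each unique block of data and letting every backup reference that copy. A minimal Python sketch of this idea -- the `ChunkStore` class, its fixed chunk size and its method names are all hypothetical, not part of any real backup product:

```python
import hashlib

class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=64 * 1024):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> chunk bytes (unique copies only)

    def write(self, data):
        """Split data into fixed-size chunks; return a 'recipe' of digests."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only unseen chunks
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its chunk recipe."""
        return b"".join(self.chunks[d] for d in recipe)
```

Writing the same file twice adds no new chunks the second time; only the small recipe of digests grows. Production deduplication engines typically use variable-size, content-defined chunking rather than the fixed-size split shown here.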
Deduplication can affect the backup application itself, depending on its approach. Backup tools that operate on block storage should work as is, because deduplicated data is preserved on the target storage device. Backup tools that operate on file storage, by contrast, generally "undo" the deduplication -- requiring significantly more space on the target storage device -- unless the tool specifically supports Windows Server 2012 R2 data deduplication. Windows Server Backup, for example, fully supports deduplication, and IT administrators can restore a complete volume or individual folders from the backup.
Remember that deduplication doesn't operate on system and boot volumes, remote drives, encrypted files (because the data is already uniquely scrambled) or files smaller than 32 KB. This content is backed up and restored just like any conventional file.
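The exclusions above can be expressed as a simple eligibility check. This is a hypothetical helper for illustration only -- the function name and parameters are assumptions, not an actual Windows Server API:

```python
def is_dedup_candidate(size_bytes, is_encrypted,
                       on_system_or_boot_volume, is_remote):
    """Return True if a file is eligible for deduplication,
    mirroring the exclusions described in the article."""
    if on_system_or_boot_volume or is_remote:
        return False          # system/boot volumes and remote drives are skipped
    if is_encrypted:
        return False          # already uniquely scrambled; chunks won't repeat
    if size_bytes < 32 * 1024:
        return False          # files smaller than 32 KB are skipped
    return True
```

Files that fail the check are simply backed up and restored as conventional files, as the article notes.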
Deduplication periodically runs so-called garbage collection processes to recover storage chunks no longer in use. It's best to run a backup after a garbage collection process to ensure that any changes to freed storage are captured in the backup and not allowed to age out.
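Garbage collection in a deduplicated store amounts to a mark-and-sweep pass: mark every chunk still referenced by a live backup, then discard the rest. A minimal sketch under the assumptions of the toy chunk store above (the function and its arguments are hypothetical):

```python
def garbage_collect(chunks, live_recipes):
    """Remove chunks not referenced by any live backup recipe.

    chunks: dict mapping digest -> chunk bytes
    live_recipes: list of recipes (lists of digests) still retained
    Returns the number of chunks reclaimed.
    """
    # Mark: gather every digest still referenced by a retained backup.
    live = {digest for recipe in live_recipes for digest in recipe}
    # Sweep: delete chunks no recipe points to.
    dead = [digest for digest in chunks if digest not in live]
    for digest in dead:
        del chunks[digest]
    return len(dead)
```

Because freed space only materializes after this pass runs, backing up right after garbage collection captures the post-reclamation state, as the article advises.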
Data deduplication is an important means of improving storage efficiency, lowering storage costs and speeding data protection processes. But its effectiveness and performance vary with the workload and the deduplication setup. IT administrators should take the time to benchmark each storage volume before and after deduplication is applied to gauge any performance penalty, and then adjust scheduling and other options to optimize server and workload performance. Backup and restoration processes should also be tested in advance to understand the storage needs of deduplicated data and to allow for updates or patches to the data protection tool that enhance storage use for deduplicated backups.