Hyper-V – Optimise Storage by Implementing Data Deduplication
Hi guys, a different kind of post today. I want to show you how I’ve implemented data deduplication within my Hyper-V lab to optimise my storage drives. In case you’re unfamiliar with data deduplication, it’s a technology that reduces the space used on an NTFS or ReFS volume by locating duplicate chunks of data across files and storing only one copy of each, giving us more storage space for labs!
Understanding the requirements and limits of Data Deduplication
- System or boot volumes are not supported
- Volumes must be formatted with NTFS or ReFS
- Volumes must be attached to the server and must appear as non-removable drives
- Volumes can reside on shared storage
Typical Savings
Scenario | Content | Typical Savings (%) |
---|---|---|
User Documents | Office documents, photos, music, videos, etc. | 30-50 |
Deployment Shares | Software, cab, symbols, fonts, etc. | 70-80 |
Virtualisation | ISOs, VMDKs, VHDs, etc. | 80-95 |
General File Share | All of the above | 50-60 |
Usage types
Usage Types | Ideal Workloads |
---|---|
Default | General-purpose file shares: – Departmental Shares – Work Folders – Folder Redirection |
Hyper-V | Virtualised Desktop Infrastructure (VDI) Servers |
Backup | Virtualised Backup Applications: – Data Protection Manager (DPM), as an example |
Implementation
Installing the Features
On my Hyper-V host, I opened Server Manager and selected Manage, Add Roles and Features:

Navigate through until you get to Server Roles, expand File and Storage Services and then File and iSCSI Services, and select Data Deduplication:

Navigate through again and finally, select Install:

No restart is required.
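If you prefer the command line, the feature can also be installed from PowerShell instead of the wizard. A minimal sketch, assuming an elevated PowerShell session on the Hyper-V host:

```powershell
# Install the Data Deduplication role service (no reboot expected).
Install-WindowsFeature -Name FS-Data-Deduplication

# Confirm the feature now shows as installed.
Get-WindowsFeature -Name FS-Data-Deduplication
```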
Configuring Data Deduplication
Refresh the Server Manager console and navigate to File and Storage Services, Volumes. Find the volume that you wish to enable deduplication on (in my case, it’s drive D:\), right-click it and select Configure Data Deduplication:

As we’re looking to use dedup on Hyper-V, we need to select Virtual Desktop Infrastructure (VDI) Server. I have set the file age threshold to 0 days, so files are deduplicated regardless of how recently they changed. Now open Set Deduplication Schedule:

This is my schedule, however, pick what suits you:

That’s it for the automatic scheduling. Let’s have a look at how to trigger a job manually, monitor it and see what the results are.
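Before moving on: if you’d rather script the configuration than click through Server Manager, the same settings can be applied from PowerShell. This is a minimal sketch, assuming the volume is D: as in my lab and that the Hyper-V usage type corresponds to the VDI option chosen in the GUI. Run it from an elevated prompt and adjust the drive letter to suit.

```powershell
# Enable deduplication on D: with the Hyper-V (VDI) usage type.
Enable-DedupVolume -Volume "D:" -UsageType HyperV

# Set the file age threshold to 0 days, matching the GUI setting above.
Set-DedupVolume -Volume "D:" -MinimumFileAgeDays 0

# List the deduplication schedules currently defined on the host.
Get-DedupSchedule
```

Get-DedupSchedule only reads the existing schedules, so it is a safe way to confirm what the Set Deduplication Schedule dialog created.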
Manual Triggering and Monitoring
To trigger a deduplication job manually once the above options are in place, open PowerShell as an administrator and run the following. Note that -Memory 20 lets the job use up to 20 percent of the host’s RAM, so be careful here:
Start-DedupJob -Volume "D:" -Type Optimization -Memory 20

How to monitor a job:
Get-DedupJob

How to cancel a job:
Stop-DedupJob -Volume D:
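Once a job has finished, the savings can be checked from PowerShell as well as from Server Manager. A short sketch, again assuming the D: volume from my lab:

```powershell
# Show the deduplication status for D:, including the space saved
# and the number of optimised files.
Get-DedupStatus -Volume "D:" | Format-List

# Show a compact summary of the saved space and savings rate.
Get-DedupVolume -Volume "D:" | Select-Object Volume, SavedSpace, SavingsRate
```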
Results
Before
Some of my VHDX files before dedup was enabled:

After
Same folder:

Also, within the Server Manager console, you can see that I have saved just over 1.0TB across two drives, both of which contain Hyper-V files:

Until next time!