Deduplication and Compression Considerations

Compression and deduplication are NAKIVO Backup & Replication features that help you optimize the storage space used for your backups.

What is Compression?

Modern businesses running virtual environments generate large volumes of backup data, which must be stored reliably while using as little space as possible. One of the most efficient ways to reduce storage consumption is backup compression.

NAKIVO Backup & Replication has a built-in data compression feature that reduces file size by re-encoding the file data with fewer bits than the original file uses. The process is algorithmic: the software scans the data for repeated patterns and replaces instances of these patterns with shorter codes that indicate where the patterns were found.
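
The effect of repeated patterns on compression can be illustrated with a short sketch. It uses Python's standard zlib module as a stand-in, because the algorithm used by NAKIVO Backup & Replication is not described here; the sample data and variable names are assumptions made for illustration only.

```python
import os
import zlib

# Highly repetitive input: one 32-byte pattern repeated 4096 times (128 KiB).
repetitive = b"NAKIVO-backup-block-0123456789ab" * 4096

# Random input of the same size: contains almost no repeated patterns.
random_data = os.urandom(len(repetitive))

# zlib is only a stand-in for whatever algorithm the product uses.
compressed_repetitive = zlib.compress(repetitive)
compressed_random = zlib.compress(random_data)

print("repetitive:", len(repetitive), "->", len(compressed_repetitive), "bytes")
print("random:    ", len(random_data), "->", len(compressed_random), "bytes")

# Compression is lossless: decompressing returns the original bytes.
restored = zlib.decompress(compressed_repetitive)
```

The repetitive input shrinks to a small fraction of its size, while the random input barely shrinks at all. This is why space savings depend so heavily on the source data.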

Compression Levels

When creating a Backup Repository, you can select one of three compression levels:

  • Fast. This is the lowest compression level. It consumes the least CPU, and in most cases the space savings are still sufficient.

  • Medium. This compression level requires more CPU than the Fast level, but it saves more space.

  • Best. This is the highest compression level, and it uses an advanced compression algorithm. Best compression is usually slower and uses more CPU than the Medium level, but it saves much more space. It is set as the default option.
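
The trade-off between levels can be sketched with zlib, which also exposes numeric compression levels. The mapping of Fast, Medium, and Best to zlib levels 1, 6, and 9 is a hypothetical analogy and not how the product is known to be configured; the sample data is likewise made up.

```python
import zlib

# Hypothetical mapping of the three repository levels to zlib levels.
LEVELS = {"Fast": 1, "Medium": 6, "Best": 9}


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes of `data` for each level."""
    return {name: len(zlib.compress(data, level)) for name, level in LEVELS.items()}


# Made-up, moderately compressible sample: structured text records.
sample = b"".join(
    f"record {i}: status=ok size={i * 37 % 1000}\n".encode() for i in range(20000)
)

sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name:<6} {size} bytes ({size / len(sample):.1%} of original)")
```

Higher levels spend more CPU time searching for matches, which typically yields a smaller output on data like this.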

Important

The compression option can't be changed once the Backup Repository is created.

Enabled vs Disabled Compression

In a virtual environment, it's recommended that you use data compression when performing backups. It allows you to store more data and reduce storage expenses. NAKIVO Backup & Replication applies compression along with other data reduction techniques, which together help achieve up to 10X storage savings.

Example

Assume that you need to back up 48.4 GB of raw data. The figures below show how much space you can save with the Fast and Best compression levels. Note that compression depends heavily on the source data, so space savings may vary significantly.

Compression level | Time          | Transferred RAW data, GB | Backup size, GB | Approximate space savings, GB | Approximate space savings, %
Fast              | 17 min 53 sec | 48.4                     | 20.2            | 29.0                          | 59
Best              | 33 min 58 sec | 48.4                     | 15.5            | 33.0                          | 68
Disabled          | 20 min 27 sec | 48.4                     | 48.4            | 0.0                           | 0.0
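
The savings columns follow from the raw and backup sizes. The sketch below shows the arithmetic; the function name is hypothetical. Because the published figures are approximate, a direct calculation can differ slightly from them (for example, about 28.2 GB rather than 29.0 GB for the Fast level).

```python
def space_savings(raw_gb: float, backup_gb: float) -> tuple:
    """Return (saved GB, saved percent) for one backup, rounded to one decimal."""
    saved_gb = raw_gb - backup_gb
    saved_pct = saved_gb / raw_gb * 100
    return round(saved_gb, 1), round(saved_pct, 1)


# Raw and backup sizes taken from the example figures above.
results = {
    "Fast": space_savings(48.4, 20.2),
    "Best": space_savings(48.4, 15.5),
    "Disabled": space_savings(48.4, 48.4),
}

for level, (saved_gb, saved_pct) in results.items():
    print(f"{level:<8} saved {saved_gb} GB ({saved_pct}%)")
```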

Important

To achieve maximum effectiveness, exclude swap files and partitions during processing.

To see the storage savings after a backup job is completed, view the Backup Repository details.

What is Deduplication?

When you run a backup job with NAKIVO Backup & Replication, the data deduplication feature compares new data blocks with those already stored in the Backup Repository. Duplicate data blocks are not copied; instead, a reference to the existing data block is created.

Deduplication helps you to save data storage space and reduce network load since duplicates of previously backed-up data aren’t transferred over the network.
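
The idea of storing each block once and keeping references to it can be sketched as follows. This is a simplified illustration and not the product's implementation: the class name, the 4 KiB block size, and the use of SHA-256 to identify blocks are all assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


class DedupStore:
    """Toy block store: each unique block is kept once, backups hold references."""

    def __init__(self):
        self.blocks = {}   # digest -> block bytes (stored once)
        self.backups = {}  # backup name -> ordered list of digests

    def backup(self, name: str, data: bytes) -> int:
        """Store `data` under `name`; return how many new blocks were written."""
        refs = []
        new_blocks = 0
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block  # first time this block is seen
                new_blocks += 1
            refs.append(digest)              # duplicates only add a reference
        self.backups[name] = refs
        return new_blocks

    def restore(self, name: str) -> bytes:
        """Rebuild the original data by following the stored references."""
        return b"".join(self.blocks[d] for d in self.backups[name])


# Two made-up VMs that share the same 8-block "OS image" plus one unique block.
os_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
vm1 = os_image + b"A" * BLOCK_SIZE
vm2 = os_image + b"B" * BLOCK_SIZE

store = DedupStore()
new_for_vm1 = store.backup("vm1", vm1)
new_for_vm2 = store.backup("vm2", vm2)
print(new_for_vm1, new_for_vm2, len(store.blocks))
```

The second backup writes only the one block that differs, yet both backups can still be restored in full.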

Note

To prevent conflicts with deduplication appliances, deduplication is available only for forever-incremental Backup Repositories.

Enabled vs Disabled Deduplication

You can enable or disable deduplication when creating a Backup Repository. However, keep in mind that the deduplication setting can't be changed once the Backup Repository is created.

Owing to the target post-processing deduplication strategy of NAKIVO Backup & Replication, you can reduce backup size by up to 10 times.

Example

Assume that you have 10 VMs running Windows Server 2016. The minimum disk space requirement for installing this OS is 32 GB of free disk space. This means that the total size of the VM backups will be at least 320 GB (without applications and databases). When you deploy more than one VM with the same system, you use a template, so the backups contain 10 sets of duplicate data blocks. Therefore, with deduplication enabled, you should get a 1:10 storage space saving ratio. In general, storage savings ranging from 1:5 to 1:10 are considered good. Such efficient storage use lets you store more recovery points per VM backup. It can also reduce your storage and other related expenses.
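
The arithmetic behind this example can be written out as a short sketch. It assumes the idealized case in which all 10 system images are block-for-block identical, so only one copy is stored; the function name is hypothetical and real-world ratios will be lower.

```python
def dedup_estimate(vm_count: int, os_size_gb: float) -> tuple:
    """Return (total GB without dedup, GB stored with ideal dedup, reduction factor)."""
    total_gb = vm_count * os_size_gb
    stored_gb = os_size_gb  # idealized: identical images are stored only once
    return total_gb, stored_gb, total_gb / stored_gb


total_gb, stored_gb, factor = dedup_estimate(vm_count=10, os_size_gb=32)
print(f"{total_gb} GB of backups stored in {stored_gb} GB (1:{factor:.0f})")
```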