Advanced Bandwidth Throttling

NAKIVO Backup & Replication is designed to transfer data at the maximum available speed so that VM backup, replication, and recovery jobs complete as quickly as possible. However, if you run data protection jobs during business hours, your LAN or WAN risks being overloaded. This can affect application performance and degrade the user experience (for example, email messages taking too long to send or websites loading slowly). NAKIVO Backup & Replication addresses this issue with the flexible Advanced Bandwidth Throttling feature. With Advanced Bandwidth Throttling, you can set limits for your data protection jobs and make sure they don't take more bandwidth than you can afford to allocate.

Advanced Bandwidth Throttling allows you to set global rules that limit the data transfer speeds of your backup processes. Such rules can apply to different jobs and on different schedules. For instance, you can create a global rule preventing your backup jobs from consuming more than 50 MByte/s during business hours, but leave the bandwidth unrestricted for Sunday backups. You can also create bandwidth throttling rules on a per-job basis, if you want to have more granular control over the whole process. Individual limits override global rules, sparing you the need to adjust the global rule for every job.

The Advanced Bandwidth Throttling feature of NAKIVO Backup & Replication is an effective means of optimizing backup operations and controlling your network traffic. With global and individual limits on data transfer speeds, the feature can help you ensure the performance of your business applications is never affected by backup workloads – even if you have little bandwidth to spare. With bandwidth rules, usage of LAN/WAN bandwidth by NAKIVO Backup & Replication jobs may be restricted to a specific amount. For more information, refer to the following sections:

About Bandwidth Rules

A bandwidth rule specifies the amount of bandwidth that can be used by one job, by multiple jobs, or by all applicable jobs. When a job containing multiple VMs starts running with a bandwidth rule active, the rule divides the bandwidth between the job's tasks.

Bandwidth rules are applicable to the following types of NAKIVO Backup & Replication jobs:

  • Backup Job

  • Backup Copy Job

  • Replication Job (except for Amazon EC2)

  • Recovery Job

  • Replica Failback (except for Amazon EC2)

Bandwidth rules may be always active, active on schedule, or disabled. Refer to Bandwidth Throttling for more details.

A bandwidth rule can be:

  • Global Rule – a bandwidth rule applied to all applicable Jobs.

  • Per Job Rule – a bandwidth rule only applied to specific Jobs.

Per Job Rules have higher priority than Global Rules. When both a Per Job Rule and a Global Rule are active for the same job, the Per Job Rule is applied to that job.

Up to 100 bandwidth rules can be created and enforced at the level of a single-tenant product or of a tenant of a multi-tenant product. Bandwidth rules are stored at the Director, applied at the Transporter level, and enforced when processing starts for a specific job object that falls within the limits of the current rules. Bandwidth rules can be enabled or disabled individually.

When enabled, the rule can limit bandwidth of JODTs that are covered by this rule.

Note

Job Object Data Transfer (JODT) is the step in the processing of a single job object that transfers the data of the job object from the source endpoint to the target endpoint.

If a JODT is running and a rule that applies to this JODT is created, the JODT gets the bandwidth allowed by the rule (for example, 10 Mbit/s).

If a JODT is running and a rule that applies to this JODT is enabled, the JODT gets the bandwidth allowed by the rule (for example, 10 Mbit/s).

If a JODT is running with limited bandwidth, the rule that applies to this JODT is disabled, and no other rules apply to this JODT, the JODT gets unlimited bandwidth.

If a JODT is running with limited bandwidth and another JODT covered by the same rule is started, the bandwidth allowed by the rule is split equally between the JODTs (for example, 5 Mbit/s for the first JODT and 5 Mbit/s for the second JODT).

If a JODT is running with limited bandwidth and another JODT covered by the same rule is started, the second JODT starts only after it gets its portion of the bandwidth allowed by the rule.

If two JODTs are running with limited bandwidth and one of them completes, fails, or is stopped, the bandwidth assigned to that JODT is freed, and the remaining JODT gets the entire bandwidth allowed by the rule.
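The equal-split behavior described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not product code: the function name jodt_shares and the JODT labels are hypothetical, and the sketch assumes that a share is simply the rule limit divided by the number of running JODTs covered by the rule.

```python
from typing import Dict, List


def jodt_shares(rule_limit_mbit: float, running_jodts: List[str]) -> Dict[str, float]:
    """Split a rule's limit equally between the JODTs it currently covers.

    Hypothetical helper: models only the equal split described in the text.
    """
    if not running_jodts:
        return {}
    share = rule_limit_mbit / len(running_jodts)
    return {name: share for name in running_jodts}


# One JODT running under a 10 Mbit/s rule gets the whole limit.
one = jodt_shares(10, ["jodt-1"])

# A second JODT covered by the same rule starts: the limit is split equally.
two = jodt_shares(10, ["jodt-1", "jodt-2"])

# The first JODT completes: the remaining JODT gets the entire limit again.
after_finish = jodt_shares(10, ["jodt-2"])
```

Recomputing the shares whenever a JODT starts or finishes reproduces the 10, then 5 and 5, then 10 Mbit/s sequence from the examples above.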


If a JODT is running and more than one rule that applies to this JODT is created, the bandwidth rule with the lowest bandwidth allocation is applied.

If there are multiple global rules and no per job bandwidth rules, the global rule with the lowest bandwidth allocation is applied.
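The precedence between rules can be expressed as a small selection function. The sketch below is one reading of the text, not the product's implementation: the Rule class and effective_rule function are hypothetical names, and it assumes that per job rules are considered first and that the lowest limit wins within the chosen group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical model of a bandwidth rule that applies to a given job."""
    name: str
    limit_mbit: float
    per_job: bool
    enabled: bool = True


def effective_rule(rules: List[Rule]) -> Optional[Rule]:
    """Pick the rule that governs a job from the rules that apply to it."""
    active = [r for r in rules if r.enabled]
    per_job = [r for r in active if r.per_job]
    # Assumption: per job rules take priority; otherwise fall back to global rules.
    candidates = per_job or active
    if not candidates:
        return None  # no active rule: bandwidth is unlimited
    # The rule with the lowest bandwidth allocation is applied.
    return min(candidates, key=lambda r: r.limit_mbit)


rules = [
    Rule("global-50", 50, per_job=False),
    Rule("global-20", 20, per_job=False),
    Rule("job-80", 80, per_job=True),
]
chosen_with_per_job = effective_rule(rules)       # the per job rule
chosen_global_only = effective_rule(rules[:2])    # the lowest global rule
```

Note that in this reading the per job rule is chosen even though its limit (80 Mbit/s) is higher than both global limits, because per job rules have higher priority.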

Distributing Bandwidth Between Tasks

To illustrate how bandwidth is distributed between tasks, consider an example with a single 30 Mbit/s bandwidth rule that is used by Job A, Job B, and Job C.

Job A, which has 7 VMs with one disk each (7 tasks in total), starts running with the 30 Mbit/s bandwidth rule active, as follows:

1. The bandwidth amount is split into 30 chunks of 1 Mbit/s each.

2. The Transporter used by Job A can run at most 4 concurrent tasks, so Tasks A1, A2, A3, and A4 are selected for processing by the Transporter.

Note
The Transporter can process a limited number of concurrent tasks.

3. The product starts distributing bandwidth chunks to the tasks one by one. Each of the 4 tasks first receives an equal share of 7 chunks.

4. The remaining 2 chunks are distributed from the start of the queue, so that Tasks A1 and A2 receive one extra chunk each.

5. Tasks A1, A2, A3, and A4 start running.

6. When Task A1 finishes execution, it frees 8 x 1 Mbit/s chunks.

7. Task A5 starts execution, using the 8 available chunks.

8. When Tasks A2 and A3 finish execution, they free 15 x 1 Mbit/s chunks.

9. Tasks A6 and A7 start running, using the 15 available chunks: 8 chunks are allocated to Task A6 and 7 chunks to Task A7.
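The chunk arithmetic in the Job A steps can be reproduced with a short Python sketch. The function name distribute_chunks is hypothetical, and the sketch assumes only what the steps state: chunks are handed out evenly, and any remainder goes to the tasks at the start of the queue.

```python
from typing import Dict, List


def distribute_chunks(total_chunks: int, tasks: List[str]) -> Dict[str, int]:
    """Distribute 1 Mbit/s chunks between tasks, in queue order.

    Every task gets the same base number of chunks; leftover chunks go
    one each to the tasks at the start of the queue.
    """
    if not tasks:
        return {}
    base, extra = divmod(total_chunks, len(tasks))
    return {task: base + (1 if i < extra else 0) for i, task in enumerate(tasks)}


# Steps 3-4: 30 chunks across the 4 concurrent tasks.
first_wave = distribute_chunks(30, ["A1", "A2", "A3", "A4"])

# Step 9: the 15 chunks freed by A2 and A3 go to A6 and A7.
last_wave = distribute_chunks(15, ["A6", "A7"])
```

Here first_wave gives 8 chunks to A1 and A2 and 7 chunks to A3 and A4, and last_wave gives 8 chunks to A6 and 7 to A7, matching the walkthrough.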

 

At this point, the bandwidth rule limit changes from 30 Mbit/s to 80 Mbit/s.

Job B, which consists of two VMs with one disk each (2 tasks in total), starts running with the 80 Mbit/s bandwidth rule active, and the Transporter distributes bandwidth as follows:

1. The bandwidth amount is split into 80 chunks of 1 Mbit/s each.

2. The Transporter used by Job B can run at most 10 concurrent tasks.

3. However, 30 Mbit/s (30 chunks) are already being used by Job A tasks. Because a chunk cannot be partially assigned, these 30 of the 80 chunks cannot be used at the moment, so only 50 chunks are available.

4. The product starts distributing bandwidth chunks to the tasks one by one, and Tasks B1 and B2 are allocated 25 chunks each.

5. Tasks A4 and A5 finish execution and free 15 x 1 Mbit/s chunks, but there are no queued tasks, so this bandwidth is left idle.

At this point, the bandwidth rule limit changes back to 30 Mbit/s.

 

The bandwidth rule is now activated for another job, Job C, which consists of one VM with one disk, so the Transporter starts distributing the bandwidth as follows:

1. The bandwidth amount of 30 Mbit/s is split into 30 chunks of 1 Mbit/s each.

2. The Transporter used by Job C can run at most 10 concurrent tasks.

3. The currently running tasks occupy 65 Mbit/s of the bandwidth, that is, 65 x 1 Mbit/s chunks, which is 35 Mbit/s over the limit. Therefore, there is no free bandwidth for Task C1.

4. Tasks A6 and A7 finish execution and free 15 Mbit/s of the bandwidth. However, the bandwidth rule limit is still exceeded by 20 Mbit/s, so there is still no free bandwidth to start Task C1.

5. Task B1 finishes execution and frees 25 Mbit/s of the bandwidth. Task B2 is now using 25 Mbit/s under the rule, and 5 Mbit/s is available.

6. Task C1 is assigned 5 x 1 Mbit/s chunks and starts execution.

Note
Jobs and tasks may wait for a long time until bandwidth is available for them to start.
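The waiting behavior in the Job B and Job C walkthroughs comes down to one calculation: the free bandwidth is the current rule limit minus the chunks already held by running tasks, and it never goes below zero. The Python sketch below replays that arithmetic; the function name free_chunks is hypothetical, and the per-task allocations are the ones given in the steps above.

```python
from typing import Dict


def free_chunks(limit_mbit: int, allocated: Dict[str, int]) -> int:
    """Return how many 1 Mbit/s chunks a new task could receive right now.

    Running tasks keep their chunks when the limit is lowered, so usage
    can exceed the limit; in that case nothing is free.
    """
    return max(0, limit_mbit - sum(allocated.values()))


# Job B starts under the 80 Mbit/s limit while Job A tasks hold 30 chunks.
job_a_running = {"A4": 7, "A5": 8, "A6": 8, "A7": 7}
free_for_job_b = free_chunks(80, job_a_running)

# The limit is back to 30 Mbit/s; running tasks hold 65 chunks.
running = {"A6": 8, "A7": 7, "B1": 25, "B2": 25}
free_at_c1_arrival = free_chunks(30, running)

# Tasks A6 and A7 finish: 50 chunks are still in use, 20 over the limit.
del running["A6"], running["A7"]
free_after_job_a = free_chunks(30, running)

# Task B1 finishes: only B2's 25 chunks remain in use.
del running["B1"]
free_after_b1 = free_chunks(30, running)
```

In this replay, Task C1 can start only at the last step, when 5 chunks become free, which is why jobs and tasks may wait a long time after a rule limit is lowered.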