Configuring and optimizing proxy servers
Proxy servers are the workhorses of Veeam Backup & Replication v10: they do the heavy lifting of processing tasks for backup and restore jobs. When you set up Veeam, ensure that the proxy servers are configured according to best practices:
- https://bp.veeam.com/vbr/VBP/2_Design_Structures/D_Veeam_Components/D_backup_proxies/vmware_proxies.html
- https://helpcenter.veeam.com/docs/backup/vsphere/backup_proxy.html?ver=100
When you decide to deploy a proxy server, Veeam Backup & Replication will install two components on the server:
- Veeam Installer Service: This is used to check the server and upgrade software as required.
- Veeam Data Mover: This is the processing engine for the proxy server and performs all the required tasks.
Veeam Backup & Replication proxy servers use what we call a Transport Mode to retrieve data during backup. Three standard modes are available, and they are listed in order, starting with the most efficient method:
- Direct Storage access: The proxy is placed in the same network as your storage arrays and retrieves data directly from them. This method covers two transport modes – Direct SAN access and Direct NFS access. Because the proxy reads from storage directly, the backup load is offloaded from the hypervisor.
- Virtual Appliance: This mode attaches the VMDK files of the VM to the proxy server, typically called HotAdd mode, to back up the server data.
- Network: This mode is the least efficient, and is used when the "Failover to network mode if primary mode fails, or is unavailable" option is selected. It moves the data through your network stack, so a 10 GbE or faster network is recommended rather than 1 GbE.
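The preference order above, with Network as the fallback, can be sketched in a few lines of Python. This is only an illustration of the ordering described in this section, not how Veeam selects a mode internally; the names TransportMode and pick_transport_mode are hypothetical.

```python
from enum import IntEnum


class TransportMode(IntEnum):
    # Lower value = more efficient, following the order of the list above.
    DIRECT_STORAGE = 1
    VIRTUAL_APPLIANCE = 2
    NETWORK = 3


def pick_transport_mode(usable_primary_modes, failover_to_network=True):
    """Return the most efficient usable mode (hypothetical helper).

    usable_primary_modes: the primary modes the proxy can actually use.
    failover_to_network: models the failover-to-network option.
    Returns None when no primary mode is usable and failover is disabled.
    """
    candidates = sorted(
        m for m in usable_primary_modes if m != TransportMode.NETWORK
    )
    if candidates:
        return candidates[0]
    return TransportMode.NETWORK if failover_to_network else None


if __name__ == "__main__":
    usable = {TransportMode.VIRTUAL_APPLIANCE, TransportMode.DIRECT_STORAGE}
    print(pick_transport_mode(usable).name)
    print(pick_transport_mode(set()).name)
```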
In addition to the standard transport modes provided natively for VMware environments, Veeam provides two storage-specific options: Backup from Storage Snapshots, and Direct NFS (introduced above as part of Direct Storage access). These provide transport options for NFS systems and for storage systems that integrate with Veeam.
Refer to the integration with storage systems guide: https://helpcenter.veeam.com/docs/backup/vsphere/storage_integration.html?ver=100.
Along with the transport modes, there are specific tasks that the proxy server performs:
- Retrieving the VM data from storage
- Compressing
- Deduplicating
- Encrypting
- Sending the data to the backup repository server (backup job) or another backup proxy server (replication job)
Veeam proxy servers leverage what is known as VADP (VMware vStorage APIs for Data Protection) when using all transport modes other than Backup from Storage Snapshots and Direct NFS.
The following are things you should consider in relation to your proxy servers:
- Operating System: Software vendors generally recommend the latest release, so if you choose Windows, use Windows Server 2019, and if you choose Linux, use the newest release of your chosen distribution (for example, Ubuntu 20.04.1 LTS). Note that Linux VMware backup proxies support only HotAdd mode in Veeam Backup & Replication v10.
- Proxy Placement: Depending on the transport mode for the server, you will need to place it as close as possible to the servers you want to back up, such as on a specific host in VMware, a physical server, or a blade enclosure. The closer to the source data, the better!
- Proxy Sizing: This can be tricky to determine and depends on whether the server is physical or virtual. Veeam proxy servers complete what are called Tasks. A task is the processing of one virtual disk for a VM or one physical disk for a server. Veeam recommends one physical core or one vCPU, as well as 2 GB of RAM, per task.
Veeam has a formula used to calculate the required resources for a proxy server:
- D = source data in MB
- W = backup window in seconds
- T = throughput in MB/s = D / W
- CR = change rate, as a decimal fraction (for example, 0.1 for 10%)
- CF = cores required for full backup = T / 100
- CI = cores required for incremental backup = (T * CR) / 25
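The formula above translates directly into a small function. The following is a minimal sketch, assuming the change rate is passed as a decimal fraction; the function name proxy_cores is hypothetical and chosen only for illustration.

```python
def proxy_cores(source_mb, window_s, change_rate):
    """Apply the sizing formula from this section (hypothetical helper).

    source_mb:   D, source data in MB
    window_s:    W, backup window in seconds
    change_rate: CR, as a decimal fraction (0.1 means 10%)
    Returns (T, CF, CI) as unrounded floats.
    """
    throughput = source_mb / window_s              # T = D / W
    cores_full = throughput / 100                  # CF = T / 100
    cores_incr = (throughput * change_rate) / 25   # CI = (T * CR) / 25
    return throughput, cores_full, cores_incr


if __name__ == "__main__":
    t, cf, ci = proxy_cores(source_mb=1_000_000, window_s=10_000, change_rate=0.1)
    print(f"T={t:.2f} MB/s, CF={cf:.2f} cores, CI={ci:.2f} cores")
```

The function returns unrounded values so that the caller can decide how to round them to whole cores.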
Based on these requirements, we can use a sample of data to perform the calculations:
- 500 virtual machines
- 200 TB of data
- 8-hour backup window
- 10% change rate
Using these numbers, we first calculate the throughput required for the data that will be ingested by the backups:
- D = 200 TB * 1024 * 1024 = 209,715,200 MB
- W = 8 hours * 3600 seconds = 28,800 seconds
- T = 209,715,200 / 28,800 = 7,281.78 MB/s
We then take this throughput and calculate the number of CPU cores needed to run both full and incremental backups within your defined SLA:
- CF = 7,281.78 / 100 = 72.8, or approximately 73 cores
- CI = (7,281.78 * 0.1) / 25 = 29.1, or approximately 29 cores
Based on our calculations, and considering that each task requires 2 GB of RAM, you would need a virtual server with 73 vCPUs and 146 GB of RAM. This may seem like a very large server, but keep in mind that it is based on the sample data. Your own figures may be much smaller, or possibly larger, depending on your dataset.
Should you decide to use a physical server as a proxy, you should have a server with two 10-core CPUs. In the case of our sample data, two physical servers are what you require. If you are using virtual servers for proxies, then the best practice is to configure them with a maximum of 8 vCPUs each and add as many as required for your environment – in this case, we would need nine servers.
Should you want to size things based on incremental backups only, your requirements are less than half of those for a full backup – 29 vCPUs and 58 GB of RAM.
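To make the step from core counts to RAM and proxy counts explicit, here is a minimal sketch that starts from the 73 and 29 vCPU figures above. The constant and function names are hypothetical. The 2 GB per task and 8 vCPUs per virtual proxy values come from this section.

```python
RAM_GB_PER_TASK = 2   # per-task RAM recommendation from this section
VCPUS_PER_PROXY = 8   # suggested maximum for a virtual proxy


def summarize(cores):
    """Return (RAM in GB, full 8-vCPU proxies, leftover vCPUs) for a core count."""
    ram_gb = cores * RAM_GB_PER_TASK
    full_proxies, leftover_vcpus = divmod(cores, VCPUS_PER_PROXY)
    return ram_gb, full_proxies, leftover_vcpus


if __name__ == "__main__":
    for label, cores in (("full", 73), ("incremental", 29)):
        ram_gb, proxies, leftover = summarize(cores)
        print(
            f"{label}: {cores} vCPUs, {ram_gb} GB RAM, "
            f"{proxies} proxies of {VCPUS_PER_PROXY} vCPUs, "
            f"{leftover} vCPU(s) left over"
        )
```

For the full-backup figure, nine 8-vCPU proxies provide 72 of the 73 vCPUs. Whether to absorb the remaining vCPU or add a tenth proxy is a judgment call for your environment.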
There are limitations for proxy servers that you need to be aware of when it comes to job processing and performance. As noted above, a proxy server performs tasks, each of which is assigned CPU resources. Concurrent task processing depends on the resources available in your infrastructure and the number of proxy servers you have deployed. When you add a proxy server to Veeam Backup & Replication, the Max concurrent tasks option correlates to the number of CPUs assigned to it.
The task limits can be viewed at the following link: https://helpcenter.veeam.com/docs/backup/vsphere/limiting_tasks.html?ver=100.
Important note
Job performance depends on how many tasks a proxy server can run at once. As an example, if you had a proxy server with 8 CPUs and you added 2 virtual machines for backup, one with 4 disks and the other with 6 disks, only 8 of the 10 disks would be processed in parallel. The remaining 2 disks would have to wait for resources before being backed up, because tasks are assigned per virtual disk of a VM during the backup process.
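The arithmetic in this note can be checked with a short sketch. It assumes one task per virtual disk and one task slot per CPU, as described above. The function name schedule_disks is hypothetical, and the sketch does not model how Veeam orders or queues the waiting disks.

```python
def schedule_disks(max_concurrent_tasks, disks_per_vm):
    """Return (disks processed in parallel, disks left waiting).

    max_concurrent_tasks: task slots on the proxy (one per CPU here).
    disks_per_vm: list with the number of virtual disks of each VM.
    """
    total_disks = sum(disks_per_vm)
    running = min(total_disks, max_concurrent_tasks)
    waiting = total_disks - running
    return running, waiting


if __name__ == "__main__":
    running, waiting = schedule_disks(8, [4, 6])
    print(f"{running} disks in parallel, {waiting} waiting")
```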
You should now be able to size the CPUs and RAM of your proxy servers correctly, and understand proxy placement and how a proxy processes tasks. Proxy servers send data to repository servers, which is the focus of the next section.