Will French
on 29 June 2024
Introduction
Enterprise applications and services often rely on load balancers to ensure reliability, availability, and performance by distributing traffic across multiple servers. Many load balancers provide capabilities to manage the asymmetric cryptographic handshakes that are required to establish connections for TLS, which is a protocol used to secure HTTPS traffic. TLS operations rely on compute-intensive ciphers that can consume significant CPU resources, especially when the load balancer is managing a large volume of requests. In this post, we demonstrate that it’s possible to significantly reduce CPU utilisation of HAProxy, a popular open source load balancer, by offloading these operations to IntelⓇ QuickAssist Technology (IntelⓇ QAT) in Ubuntu 24.04 LTS. The added benefits include:
- Handling larger volumes of traffic
- Energy savings
- Freeing up the CPU to handle other work
IntelⓇ QAT support on Ubuntu 24.04 LTS
The IntelⓇ QAT accelerator is designed to offload a specific set of operations from the CPU, namely cryptographic and data compression workloads. Given IntelⓇ QAT’s role as an accelerator, it should not be surprising that one benefit it provides is improved raw performance. Another benefit is that offloading encryption and compression operations to an IntelⓇ QAT device can reduce CPU utilisation, freeing up the CPU for other tasks. This blog post focuses on reducing CPU utilisation, which leads to lower power consumption and lower overall cost, as fewer cores may be required to run a scaled-out workload.
Ubuntu 24.04 LTS marks the first Ubuntu release with official IntelⓇ QAT support available through packages in the Ubuntu archive. The IntelⓇ QAT software stack consists of both kernel space and user space components. The kernel space component comes in the form of a kernel driver which is distributed through the 6.8 stable kernel. The user space component, meanwhile, is included through a set of packages in the Ubuntu archive which can be installed using apt. The table below summarises the relevant packages for the use case described in this blog post:
Conveniently, the first four packages are all dependencies of the final package: IntelⓇ QuickAssist Technology OpenSSL* Engine. Thus, all of these components can be installed by running:
sudo apt -y install qatengine
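After installation, it can be useful to confirm that the pieces are visible to the system. The following is a minimal sketch rather than an official procedure: the helper name check_qat is hypothetical, it assumes the loaded kernel module names contain "qat", and it assumes the OpenSSL engine is registered under the ID qatengine (the name used in the HAProxy configuration later in this post). It only reads state and changes nothing.

```shell
#!/bin/sh
# Sketch: report whether the QAT package, kernel modules, and OpenSSL engine
# are visible. check_qat is a hypothetical helper name, not part of any package.
check_qat() {
    # Is the qatengine package installed according to dpkg?
    if dpkg -s qatengine >/dev/null 2>&1; then
        echo "package qatengine: installed"
    else
        echo "package qatengine: not installed"
    fi

    # Are any QAT kernel modules loaded? (assumption: module names contain "qat")
    if grep -q qat /proc/modules 2>/dev/null; then
        echo "kernel modules: loaded"
    else
        echo "kernel modules: none found"
    fi

    # Can OpenSSL load and initialise the engine? With -t, "openssl engine"
    # prints "[ available ]" when the engine initialises successfully.
    if openssl engine -t qatengine 2>/dev/null | grep -q '\[ available \]'; then
        echo "openssl engine qatengine: available"
    else
        echo "openssl engine qatengine: not available"
    fi
}

check_qat
```

On a machine without an IntelⓇ QAT device, the package may still be installed while the kernel module and engine checks report nothing found.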
Reducing CPU usage of the HAProxy load balancer
To demonstrate the value of IntelⓇ QAT on Ubuntu 24.04 LTS, in this section, we show the benefit of integrating it with HAProxy. HAProxy is designed to distribute network traffic across a fleet of backend servers in an optimal manner. Among HAProxy’s many features is its ability to handle TLS handshakes on behalf of a group of web servers accepting requests over HTTPS. TLS handshakes require a set of cryptographic operations (e.g., key exchange) that can be offloaded to IntelⓇ QAT, freeing up the CPU to perform other tasks.
To highlight these CPU savings, we benchmarked HAProxy running on 4th Gen IntelⓇ XeonⓇ Platinum 8488C processors equipped with built-in IntelⓇ QAT devices on Ubuntu 24.04 LTS. All tools and libraries (including those listed in the table above) were installed using apt from the Ubuntu archive, with the exception of HAProxy itself, which was built from the recent 3.0 release with SSL Engine support enabled. To enable IntelⓇ QAT support at runtime, HAProxy was configured with the global directives ssl-engine qatengine algo ALL and ssl-mode-async.
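To make the runtime configuration concrete, here is a minimal sketch of an HAProxy configuration that includes the two directives named above. Only ssl-engine qatengine algo ALL, ssl-mode-async, and the eight worker threads come from this post; the file name, certificate path, port, timeouts, and backend address are hypothetical placeholders, and this is not the configuration used for the benchmarks. It also assumes an HAProxy binary built with SSL Engine support, as described above.

```shell
#!/bin/sh
# Sketch: write a minimal HAProxy config with QAT offload enabled, then
# validate it if haproxy is installed. All paths and addresses are hypothetical.
CFG="${CFG:-./haproxy-qat.cfg}"

cat > "$CFG" <<'EOF'
global
    # Offload all supported algorithms to the QAT OpenSSL engine (from the post)
    ssl-engine qatengine algo ALL
    # Process TLS operations asynchronously so offloaded work does not block
    ssl-mode-async
    # Eight worker threads, matching the benchmark description
    nbthread 8

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend https_in
    # Hypothetical PEM bundle containing the certificate and private key
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    default_backend web_servers

backend web_servers
    # Hypothetical backend web server address
    server web1 127.0.0.1:8080
EOF

# "haproxy -c" checks the configuration without starting the proxy.
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$CFG" || echo "validation failed (expected if the placeholder paths do not exist)"
fi
```

The check step is skipped when haproxy is not on the PATH, so the script can be used simply to generate a starting point to edit.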
We ran the benchmarks on a single system, using the ApacheBench tool as the HTTPS client and Apache web servers as the backend servers. The web servers were configured to return an empty response, and for cryptographic operations we used TLSv1.2 with the ECDHE-RSA-AES256-GCM-SHA384 cipher and an RSA 2048-bit certificate. All benchmarks used sixteen ApacheBench threads, eight HAProxy worker threads, and sixteen Apache web server threads. CPU pinning was enabled, and two threads were assigned per physical core for each component of the benchmark (ApacheBench, HAProxy, and Apache web server). To approximate a real-world scenario, the benchmarks were run using network namespaces with a simulated latency of 50 ms, jitter of 4 ms, and packet loss of 0.1%.
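The network conditions and client load described above could be approximated along the following lines. This is a sketch under stated assumptions, not the scripts used for the benchmarks: the namespace name, interface names, IP addresses, and request count are hypothetical, the impairment is applied with tc netem on the client side only, and the sixteen ApacheBench threads are modelled as a single ab process with a concurrency of sixteen. By default the script only prints the commands; setting DRY_RUN=0 and running as root would execute them.

```shell
#!/bin/sh
# Sketch: client namespace with simulated latency, jitter, and loss, plus an
# ApacheBench run. DRY_RUN=1 (the default) only prints each command.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical names and addresses
NS=clients
HOST_IF=veth-host
NS_IF=veth-clients
HOST_IP=10.0.0.1
NS_IP=10.0.0.2

# Create the client namespace and connect it to the host with a veth pair
run ip netns add "$NS"
run ip link add "$HOST_IF" type veth peer name "$NS_IF"
run ip link set "$NS_IF" netns "$NS"
run ip addr add "$HOST_IP/24" dev "$HOST_IF"
run ip link set "$HOST_IF" up
run ip netns exec "$NS" ip addr add "$NS_IP/24" dev "$NS_IF"
run ip netns exec "$NS" ip link set "$NS_IF" up

# 50 ms delay with 4 ms jitter and 0.1% packet loss on client egress
run ip netns exec "$NS" tc qdisc add dev "$NS_IF" root netem delay 50ms 4ms loss 0.1%

# 16 concurrent clients; without -k each request opens a new connection,
# so every request performs a TLS handshake. -n 10000 is an arbitrary count.
run ip netns exec "$NS" ab -n 10000 -c 16 -f TLS1.2 \
    -Z ECDHE-RSA-AES256-GCM-SHA384 "https://$HOST_IP/"
```

Because netem shapes only outgoing traffic on the interface it is attached to, this sketch delays packets in one direction; a symmetric setup would add a matching qdisc on the host side.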
Figure 1. The HAProxy CPU usage and TLS Handshake performance (measured in connections per second) resulting from different HTTPS client loads. Larger values in the x direction indicate a greater stress on the CPU, while larger values in the y direction indicate better performance.
To showcase the impact of IntelⓇ QAT offloading, we measured CPU usage and connections per second while the HAProxy load balancer was under different loads from the ApacheBench clients. The results are presented in Figure 1 and show clear benefits for IntelⓇ QAT offloading compared to native OpenSSL (version 3.0.13). IntelⓇ QAT consists not only of a hardware component (QATHW) but also of a software component (QATSW) that leverages cryptographic libraries highly optimised for modern Intel CPUs using multi-buffer processing and the IntelⓇ AVX-512 instruction set. Figure 1 shows a clear benefit from using QATSW, which significantly outperforms native OpenSSL. Stated another way, QATSW achieves the same performance as native OpenSSL but with lower CPU usage, and therefore lower energy consumption and operational costs.
Improved CPU efficiency (i.e., lower CPU utilisation) and higher performance are achieved when a single IntelⓇ QAT device is configured for cryptographic operations. The IntelⓇ QAT OpenSSL* Engine package supports QATHW and QATSW coexistence (referred to as QATHW+SW in Figure 1) by optimally distributing work between the two paths. This QATHW+SW behaviour is enabled in the qatengine apt package in Ubuntu 24.04 LTS. For more details on how to configure and tune IntelⓇ QAT devices for your use case, refer to the IntelⓇ QAT Configuration and Tuning Guide. While this blog post focuses on relative performance, additional configuration and tuning may yield even higher raw performance.
Conclusion
In this post, we showed that IntelⓇ QAT can be used in Ubuntu 24.04 LTS to offload compute-intensive workloads, maximising CPU efficiency and driving cost savings. The data also show that IntelⓇ QAT can be used to accelerate an application’s performance. Our benchmarks focused on HAProxy, but many other workloads can expect similar benefits, provided they rely on a high proportion of cryptographic and/or compression operations. Finally, the availability and ease of installation of IntelⓇ QAT packages and libraries in Ubuntu 24.04 LTS make it an attractive platform for integrating, testing, and deploying applications with IntelⓇ QAT support. If you have an IntelⓇ QAT device of your own that you would like to test, you can download Ubuntu 24.04 LTS now and use it for free from the Ubuntu Server Download page.