A common question from our customers is whether graphics processing units (GPUs) really improve the remote end-user experience in virtual desktop environments. The answer is yes: they do in many use cases, for both on-premises and cloud setups. One reason is that the central processing unit (CPU) load on a remoting host can be reduced significantly by offloading the graphics calculations used in many modern Windows applications to a GPU. Another reason is that GPUs can do video encoding and decoding much more efficiently than any CPU. Simply put, offloading graphics calculations and video codecs to a GPU improves the remote end-user experience. This article is an introduction to how GPUs work and why this is beneficial for remoting environments.
First, it’s important to understand what a CPU does. A CPU is “in charge” of a computer system. It manages the system’s activities (like component communication, disk queue, and memory retrieval) and it executes instructions received from a computer program. Simply put, it:
- receives input from a program (or from memory)
- performs arithmetic and logical calculations on that data per the instructions given
- produces output and either stores it back in memory or displays it on the screen
There are a lot of parts to a CPU and we won’t go into a deep dive here, but to understand GPUs it helps to be familiar with some CPU architecture. A CPU has a certain number of cores (units that process instructions at the same time). Each CPU core is made up of a control unit, some arithmetic logic units (ALUs), and some fast memory (registers and cache) for short-term storage. Data that needs to be kept longer lives in main memory (DRAM), which sits outside the core.
The control unit retrieves data from memory. The ALU performs arithmetic and logical operations on the data, and the control unit moves the results back into memory. The ALU is versatile in that it can perform a large set of arithmetic and logical operations. On a multi-core processor, each core has a certain number of ALUs and each ALU can execute one calculation at a time. So, in this simplified model, a processor with 4 cores and 8 ALUs per core can execute up to 32 calculations at one time.
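The arithmetic in this example can be sketched in a few lines of Python. The function name and its parameters are our own illustrative assumptions, and the result is only a theoretical upper bound under the simplified model described here; real processors rarely keep every ALU busy.

```python
def peak_parallel_calculations(cores: int, alus_per_core: int) -> int:
    """Upper bound on simultaneous calculations in the simplified model:
    every ALU on every core executes one calculation at a time."""
    if cores < 1 or alus_per_core < 1:
        raise ValueError("cores and alus_per_core must be positive")
    return cores * alus_per_core


# The example from the text: 4 cores with 8 ALUs each.
print(peak_parallel_calculations(4, 8))  # prints 32
```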
A GPU, on the other hand, is much more specialized: its main job is to calculate the color of individual pixels in complex 2D or 3D scenes that are described by simple graphics objects, such as triangles, rectangles, circles, lines and fonts. The graphics chips you find on modern GPUs include hundreds or even thousands of very simple processing units called shaders (GPU manufacturers refer to shaders as “cores”, “CUDA cores”, or “Shader Processors (SPs)”). But unlike a multi-purpose CPU core, a GPU shader cannot execute code independently of other supporting components. Shaders are highly optimized for a fairly small set of graphics calculations, which they can perform very fast and with very low energy consumption. Thanks to the number of shaders on board, the GPU can process a large number of lightweight threads in parallel.
The process of calculating the color of individual pixels in complex 2D or 3D scenes is called rasterizing or rendering.
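To make rasterizing a little more concrete, here is a minimal sketch of how a single triangle could be turned into pixels. It assumes a tiny 8 x 8 pixel grid and uses a common "edge function" coverage test; the function names are hypothetical and this is not how any particular GPU or driver implements it. The nested loops visit the pixels one after another, whereas a GPU spreads this kind of independent per-pixel work across many units in parallel.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: tells which side of the edge A->B the point P is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize_triangle(v0, v1, v2, width, height):
    """Return a height x width grid with 1 where a pixel center is covered."""
    grid = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            # Covered if all three edge tests agree in sign (either winding).
            all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
            grid[y][x] = 1 if (all_non_negative or all_non_positive) else 0
    return grid


# Draw a right triangle as text: '#' is a covered pixel, '.' is empty.
for row in rasterize_triangle((0, 0), (8, 0), (0, 8), 8, 8):
    print("".join("#" if covered else "." for covered in row))
```

Each pixel's result depends only on its own coordinates and the triangle, which is what makes the work easy to run in parallel.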
The pixel color data is temporarily stored in a special graphics memory called the frame buffer. The frame buffer represents the physical pixels on one or more computer screens. As soon as the entire scene in the frame buffer is complete, a pixel stream of the rendered scene is sent to the screen.
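As a rough illustration of how much memory one frame occupies, the sketch below multiplies the pixel count by the storage per pixel. The Full HD resolution and the 4 bytes per pixel (32-bit color) are assumptions chosen for the example, not figures from any specific GPU, and real frame buffers often hold more than one frame.

```python
def frame_buffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Memory needed to hold one uncompressed frame."""
    return width * height * bytes_per_pixel


# Assumed example: one 1920 x 1080 screen at 32-bit color.
size = frame_buffer_bytes(1920, 1080)
print(f"{size} bytes ({size / 2**20:.1f} MiB)")  # prints 8294400 bytes (7.9 MiB)
```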
To create the illusion of movement for the human eye, a frame is displayed on the screen and repeatedly replaced by a slightly advanced frame. Stitching frames together in this way is the digital equivalent of stop motion animation (think Wallace and Gromit “Claymation” movies).
The same is true when dragging an application window (with graphic elements) across the screen.
In the past, rendering took a lot of time. When the first Star Wars movie was produced, a computer-generated imagery (CGI) effect took hours or even days per frame. Now, a typical gaming PC can render these kinds of effects in real time. The update frequency on the screen (or in the frame buffer) is measured in frames per second (FPS). The scene refresh cycle can range from once every couple of seconds to hundreds of times per second, depending on the complexity and dynamics of the scene and on the screen resolution. In general:
- 24 or 30 FPS is used in the film industry
- 30 to 60 FPS is regarded as good enough for most virtual desktop environments
- 120 FPS is the sweet spot for gaming and virtual reality
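Another way to read these frame rates is as a time budget: the higher the FPS, the less time is available to render each frame. A minimal sketch, with a helper name of our own choosing:

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


for fps in (24, 30, 60, 120):
    print(f"{fps:>3} FPS -> {frame_budget_ms(fps):.2f} ms per frame")
```

At 30 FPS there are about 33 ms to produce each frame; at 120 FPS that shrinks to roughly 8 ms.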
GPU and CPU Work Together
So, the CPU chugs away, and when it gets some instructions pertaining to graphics, it acts as a broker, constantly offloading graphics calculations to hundreds or thousands of shaders on a dedicated GPU, which calculates pixel colors from graphics instructions in parallel. The faster that frames get rendered and output to the screen, the better the end-user experience.
GPU Benefits to Multi-user Environments
In a virtual desktop environment, the pixel stream doesn't go to a physical monitor. Instead, the pixel stream is encoded into a video stream and redirected to the remoting client. Most modern GPUs include helper units for video encoding and decoding (codecs). This allows for efficient hardware-accelerated encoding or decoding of videos using the H.264 and H.265 algorithms, also known as MPEG-4 AVC (Advanced Video Coding) and HEVC (High Efficiency Video Coding), respectively.
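To see why the pixel stream has to be encoded before it is sent to a remoting client, it helps to estimate the size of the raw, uncompressed stream. The sketch below assumes a 1920 x 1080 session at 30 FPS with 24 bits per pixel; these numbers and the function name are illustrative assumptions only.

```python
def raw_stream_mbit_per_s(width: int, height: int, fps: float,
                          bits_per_pixel: int = 24) -> float:
    """Bandwidth of an uncompressed pixel stream in megabits per second."""
    return width * height * bits_per_pixel * fps / 1_000_000


# Assumed example: one Full HD session at 30 frames per second.
print(f"{raw_stream_mbit_per_s(1920, 1080, 30):.0f} Mbit/s")  # prints 1493 Mbit/s
```

Roughly 1.5 Gbit/s for a single session is more than most connections to remote users can carry, so the stream is compressed with a video codec first. A compressed stream is typically a small fraction of that size.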
This is not only beneficial when playing back videos on a Windows desktop, but also for encoding the remoting protocol data stream. Most modern remoting protocols can use H.264 streams for screen content redirection. A CPU can easily be saturated by high-definition video streams. Multi-user environments benefit from the fact that the CPU can offload video encoding and graphics rendering to the GPU, which reduces the CPU load substantially and gives user sessions additional CPU resources for other jobs. This can noticeably enhance the remote end-user experience.
The Growing Necessity of GPUs
Applications built on top of graphics application programming interfaces (APIs) tell the shaders what to do. In today's Windows world, DirectX and OpenGL are the most popular 3D graphics APIs. Both were designed to offload their graphics instructions to a graphics processor if an adequate graphics card and a corresponding driver are present.
Computer games and CAD/CAM applications are common applications that use graphics APIs, but they are no longer the only ones – graphics API usage has become mainstream for thousands of Windows applications like the Window Manager (yes, the graphical shell), Microsoft Office, Internet Explorer and Google Chrome. This is why Windows performance is massively improved when a GPU is present. In addition, some DirectX and most OpenGL commands will simply not work if no physical GPU is present.
We have told you how a CPU works and how it can use a GPU as a helper for graphics-specific tasks. You also now know that a GPU can decrease the overall load on the CPU, because it renders graphics efficiently and in tandem with the CPU. Finally, more and more mainstream business applications are using graphics APIs and taking advantage of a GPU if it's present. All this shows that adding a GPU to a remoting host will improve the end-user experience in many cases. And we contend that adding a GPU to virtual desktop environments will not be optional in the near future: for many use cases it will become mandatory.
Stay tuned for our next post where we will use the REX Analytics framework to demonstrate the benefits of GPUs in a remoting environment.