Difference between double buffer and triple buffer in Direct3D12

2024-10-16 (Last Modified: 2025-02-16)

Previously, we made it possible to set the refresh rate freely. This time, we will make it possible to set the drawing fps freely.

To do so, we need to change the double buffer to a triple buffer, eliminate the VSync wait, and execute Present.

Roughly speaking, the differences are as follows.

Double buffer
- A method of switching between two buffers, a front buffer and a back buffer. VSync wait is required to avoid tearing.
Triple buffer
- A method of switching between three buffers (one front buffer and two back buffers). Tearing can be avoided without a VSync wait.

In Direct3D12, the internal processing is hard to follow because Present takes care of most of it for you, and a mistake in the settings can cause an exception or tearing. Tearing is a phenomenon in which different frames are displayed on the upper and lower parts of the screen, making the image appear to flicker.

First of all, Present seems to behave differently for a double buffer and a triple buffer. (This is according to Dr. ChatGPT, so it may be wrong.)

Present for double buffer when VSync is enabled
- Wait for drawing to complete, then Swap buffer, then wait for VSync
Present for triple buffer when VSync is disabled
- Don’t wait for drawing completion, just Swap buffer

It is difficult to understand, but the important thing to remember is the following:

Even if you swap a buffer, the display only starts showing that buffer at the next VSync timing.

It is sometimes written that tearing occurs even with triple buffers, but tearing is never caused by the timing of the buffer swap. Tearing is caused by mistakenly drawing to the buffer that is currently set as the front buffer.

In other words, if tearing occurs, the buffer to be drawn is not being properly managed.
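The claim above can be illustrated with a toy model that does not touch Direct3D12 at all. This is only a sketch under stated assumptions: `Buffer`, `RenderFrame`, `ScanOut`, and `IsTorn` are hypothetical names invented for this example, and a "buffer" is just four rows that each remember which frame last wrote them.

```cpp
#include <array>
#include <cstddef>

// Hypothetical toy model: a "buffer" is a few rows, and each row stores
// the number of the frame that last wrote it.
constexpr std::size_t kRows = 4;
using Buffer = std::array<int, kRows>;

// Stand-in for rendering: every row of the target receives the frame number.
void RenderFrame(Buffer& target, int frameNumber)
{
    target.fill(frameNumber);
}

// Scan out `front` from top to bottom. When scanout reaches `overwriteAtRow`,
// the renderer (incorrectly) draws `newFrame` into the same buffer.
// Pass kRows to model a renderer that never touches the front buffer.
Buffer ScanOut(Buffer& front, std::size_t overwriteAtRow, int newFrame)
{
    Buffer shown{};
    for (std::size_t row = 0; row < kRows; ++row) {
        if (row == overwriteAtRow) {
            RenderFrame(front, newFrame);
        }
        shown[row] = front[row];
    }
    return shown;
}

// The displayed image is torn if its rows come from more than one frame.
bool IsTorn(const Buffer& shown)
{
    for (int value : shown) {
        if (value != shown[0]) {
            return true;
        }
    }
    return false;
}
```

In this model, overwriting the front buffer halfway through scanout produces an image whose upper rows belong to one frame and lower rows to another, while a renderer that leaves the front buffer alone never produces a torn image.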

In Direct3D12, the code to work with either a double buffer or a triple buffer is as follows.

First, make sure to wait for the previous frame to be drawn before the Present.

    void SwapBuffer()
    {
        if (mDrawing)
        {
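            // Wait for the previous frame's drawing to complete (without signaling a new fence)
            WaitForFence(false);

            // Present: sync interval 0 returns without waiting for VSync (triple buffer),
            // 1 waits for the next VSync (double buffer)
            int vsync = FRAME_BUFFER_COUNT == 3 ? 0 : 1;
            swapChain->Present(vsync, 0);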
        }
        mDrawing = false;
    }

    void IncrementFence()
    {
        fenceValue++;
        commandQueue->Signal(fence.Get(), fenceValue);
    }

    void WaitForFence(bool increment = true)
    {
        if (increment)
        {
            IncrementFence();
        }

        // Block only if the GPU has not yet reached fenceValue
        if (fence->GetCompletedValue() < fenceValue) {
            fence->SetEventOnCompletion(fenceValue, fenceEvent);
            WaitForSingleObject(fenceEvent, INFINITE);
        }
    }

This SwapBuffer is called at the start of drawing. Also, at the end of drawing, a fence is signaled to mark the point at which drawing is complete.

    void Draw()
    {
        // At the timing of the next VSync, the drawing result of the previous frame is shown on the display.
        SwapBuffer();

        // Start of drawing
        mDrawing = true;
        int buffer_index = swapChain->GetCurrentBackBufferIndex();

        // This is where the command list is generated.

        // Command List Execution
        commandList->Close();
        ID3D12CommandList* commandLists[] = { commandList.Get() };
        commandQueue->ExecuteCommandLists(_countof(commandLists), commandLists);

        // Signal the fence to mark the end of this frame's drawing
        IncrementFence();
    }

It is important to note the following:

- Executing Present updates the index returned by swapChain->GetCurrentBackBufferIndex().
- From the perspective of CPU and GPU parallelization, it is desirable to call Present just before GetCurrentBackBufferIndex is called.
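To make the first point concrete, here is a minimal sketch of how the back buffer index moves. It assumes the index simply advances by one on every Present and wraps around at the buffer count; `SwapChainModel` is a hypothetical stand-in written for this example, not the real swap chain.

```cpp
// Hypothetical stand-in for a swap chain; it only tracks the back buffer index.
class SwapChainModel
{
public:
    explicit SwapChainModel(unsigned bufferCount) : bufferCount_(bufferCount) {}

    // Assumption of this model: presenting hands the current back buffer over
    // for display and advances to the next buffer in the ring.
    void Present() { backIndex_ = (backIndex_ + 1) % bufferCount_; }

    unsigned GetCurrentBackBufferIndex() const { return backIndex_; }

private:
    unsigned bufferCount_;
    unsigned backIndex_ = 0;
};
```

With three buffers the model yields the indices 0, 1, 2, 0, and so on, and with two buffers 0, 1, 0. This is why an index read before Present refers to the buffer that was just drawn, not the one to draw into next.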

Let me explain a little about the fence. A fence is a mechanism for knowing how far the GPU has progressed through its commands. It is put on the command queue together with a simple ID (the fence value). When the GPU reaches it, the GPU notifies the CPU, along with that ID, that all processing queued before the fence has completed. In other words, when the GPU reaches the fence, the event registered by the CPU with SetEventOnCompletion is simply signaled.

The purpose of the fence is to wait for the drawing process up to the fence to complete, so be careful not to confuse waiting for VSync with waiting for drawing.
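The ordering guarantee described above can be sketched with a single-threaded model. Everything here is an assumption made for illustration: `FenceModel`, `QueueModel`, and `WaitForFenceModel` are hypothetical names, and the "GPU" is just a loop that consumes queued commands in order.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical fence: it only remembers the last value the modeled GPU reached.
struct FenceModel
{
    std::uint64_t completed = 0;
};

// Hypothetical command queue holding draw work and fence signals in submission order.
class QueueModel
{
public:
    void ExecuteDraw() { commands_.push_back({false, 0}); }
    void Signal(std::uint64_t value) { commands_.push_back({true, value}); }

    // Let the modeled GPU consume one command. Returns false when the queue is empty.
    bool Step(FenceModel& fence)
    {
        if (commands_.empty()) {
            return false;
        }
        const Command command = commands_.front();
        commands_.pop_front();
        if (command.isSignal) {
            fence.completed = command.value;  // everything queued before this is done
        } else {
            ++drawsFinished_;
        }
        return true;
    }

    int DrawsFinished() const { return drawsFinished_; }

private:
    struct Command
    {
        bool isSignal;
        std::uint64_t value;
    };

    std::deque<Command> commands_;
    int drawsFinished_ = 0;
};

// CPU-side wait: keep the modeled GPU running until the fence reaches `value`.
void WaitForFenceModel(QueueModel& queue, FenceModel& fence, std::uint64_t value)
{
    while (fence.completed < value && queue.Step(fence)) {
    }
}
```

Waiting for fence value 1 in this model returns as soon as the first draw has finished, even if a second draw is still queued. Nothing in it refers to VSync, which is the distinction the paragraph above draws.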

Now that we can freely set fps, we should talk about whether it is safe to ignore the refresh rate and freely set fps.

For example, what happens if you draw at 100 fps on a display with a refresh rate of 60 Hz? The GPU draws 100 images per second, but the display only shows 60. In other words, 40 images are never shown on the screen and are wasted. However, this does not mean that there is no point: the CPU processes the UI and the simulation 100 times per second, so in games where you have to hit buttons repeatedly, it will make a difference in response time. If you want to eliminate the unnecessary rendering, you can run only the CPU processing at 100 fps and render just 60 images per second. Running the CPU and GPU at different, asynchronous fps requires more complex processing and more memory, so the difficulty level is a little higher.
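The arithmetic in this example can be checked with a small simulation of one second. This is only a sketch: `LoopStats` and `SimulateOneSecond` are hypothetical names, both rates are assumed to be exact integers, and real timing jitter is ignored.

```cpp
// Hypothetical result record for one simulated second.
struct LoopStats
{
    int updates;                      // CPU-side UI/simulation steps
    int renders;                      // images the display can actually show
    int unshownIfRenderedEveryUpdate; // images wasted when every update is also rendered
};

// Simulate one second on an integer time base of updateHz * refreshHz ticks,
// so that both periods are whole numbers of ticks.
LoopStats SimulateOneSecond(int updateHz, int refreshHz)
{
    const int ticksPerSecond = updateHz * refreshHz;
    LoopStats stats{0, 0, 0};
    for (int tick = 0; tick < ticksPerSecond; ++tick) {
        // The update period is ticksPerSecond / updateHz == refreshHz ticks.
        if (tick % refreshHz == 0) {
            ++stats.updates;
        }
        // The refresh period is ticksPerSecond / refreshHz == updateHz ticks.
        if (tick % updateHz == 0) {
            ++stats.renders;
        }
    }
    const int difference = stats.updates - stats.renders;
    stats.unshownIfRenderedEveryUpdate = difference > 0 ? difference : 0;
    return stats;
}
```

Calling `SimulateOneSecond(100, 60)` counts 100 updates and 60 renders, leaving 40 images that would go unshown if every update were rendered.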