DXGI 1.3 introduced the concept of a waitable swap chain, which allows a program to block waiting for a vblank before a frame starts getting generated rather than having to block waiting for a vblank after a frame is finished being generated. The basic concept is easy to understand but I had a lot of questions about the details of how it worked that were not at all obvious to me from any documentation that I could find.
Disclaimer: This post is not written from a place of confident knowledge. Instead, I am documenting my current understanding based on some frustrating trial and error but it is quite possible that I have some details wrong. I am writing this in case it helps others to save time (I am surprised that I haven’t found others asking similar questions, although maybe I just haven’t been using the correct search terms), but caveat emptor.
Some official documentation on waitable swap chains:
- https://learn.microsoft.com/en-us/windows/uwp/gaming/reduce-latency-with-dxgi-1-3-swap-chains
- https://learn.microsoft.com/en-us/windows/win32/api/dxgi1_3/nf-dxgi1_3-idxgiswapchain2-getframelatencywaitableobject
Summary
The following is a list of things that I think I have discovered but that were not obvious to me:
- The waitable object must be (successfully) waited on for every call to Present(), but it is initialized with some additional inherent waits equal to the value provided to IDXGISwapChain2::SetMaximumFrameLatency()
  - I believe that any outstanding required waits are canceled by passing DXGI_PRESENT_RESTART to Present() (meaning that it is ok to not wait on the waitable object before calling Present() with that flag, and after a successful call to Present() with that flag there is only a single time that the waitable object must be waited for regardless of the state of the waitable object before that call)
- The waitable object behaves like a FIFO (first-in-first-out) queue, and so waiting for it to signal is waiting for the oldest queued Present() call to actually be executed
- The waitable object behaves like a Windows event, meaning that there is not a chance of missing the vblank signal if application code doesn’t wait soon enough. If an attempt to wait is made before the vblank then the wait will block, but an attempt can be made any time after the vblank (which should hopefully return immediately), and successfully waiting on the waitable object is what clears/resets a particular queued Present() call.
- When a swap chain is created with a waitable object, Present() not only doesn’t block (which the documentation makes clear) but also doesn’t fail if the present queue is exceeded (and, specifically, doesn’t return DXGI_ERROR_WAS_STILL_DRAWING even if DXGI_PRESENT_DO_NOT_WAIT is specified as a flag).
That last point made things especially difficult for me: The application is solely responsible for tracking the state of the present queue and this is generally fine since the waitable swap chain gives you the tools to do that, but the problem is that the documentation doesn’t mention the details that I list above. It is hard to know how to track the state of the present queue without knowing the behavior that one is tracking 🐔🥚.
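To make the bookkeeping described in the list above concrete, here is a minimal sketch of application-side tracking in portable C++. It only encodes my current beliefs from the summary (initial waits equal to the maximum frame latency, one wait owed per successful Present(), and DXGI_PRESENT_RESTART leaving exactly one wait owed). The type and member names (PendingWaitTracker, waitsOwed, and so on) are hypothetical and are not part of DXGI; a real application would call these hooks around its own Present() and wait calls.

```cpp
#include <cassert>

// Hypothetical application-side bookkeeping; none of these names are DXGI APIs.
// It counts how many waits on the waitable object can still succeed.
struct PendingWaitTracker {
    int waitsOwed;

    // Assumption from the post: the waitable object starts with as many
    // inherent waits as the value given to SetMaximumFrameLatency().
    explicit PendingWaitTracker(int maxFrameLatency)
        : waitsOwed(maxFrameLatency) {}

    // Call after Present() returns success.
    void OnPresentSucceeded(bool usedRestartFlag) {
        if (usedRestartFlag) {
            // Assumption from the post (not confirmed by documentation):
            // DXGI_PRESENT_RESTART cancels outstanding waits, so only the
            // wait for this present remains.
            waitsOwed = 1;
        } else {
            ++waitsOwed;
        }
    }

    // Call after a wait on the waitable object returns successfully.
    void OnWaitSucceeded() {
        if (waitsOwed > 0) {
            --waitsOwed;
        }
    }

    // True when another wait would never be satisfied without a new Present().
    bool NextWaitWouldBlockForever() const { return waitsOwed == 0; }
};
```

With a maximum frame latency of 2, for example, the tracker starts at 2 waits owed, drops to 1 after one successful wait, and returns to 2 after the next ordinary Present(). If the restart assumption turns out to be wrong, only OnPresentSucceeded() needs to change.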
Open questions that I don’t yet know the answers to:
- What happens if too many Present() calls are made and the present queue is exceeded?
  - (I do know that you still have to successfully wait for the waitable object to signal for every Present() call that returns success. I don’t know, however, what actually gets presented and when.)
- What happens when 0 is provided as the sync interval to Present()?
  - This, at least, is clearly discoverable. I just haven’t put in the time to do tests.
- What happens when a value greater than 1 is provided as the sync interval to Present()?
  - I know how I would expect this to work, but I haven’t put in the time to verify this. With some very cursory tests things seemed to mostly behave as I would expect, but it wasn’t entirely clear and more testing on my part would be required to verify.
Below I will go into a few more details.
Paired Present() and Waits
This behavior is actually described on an Intel webpage: https://www.intel.com/content/www/us/en/developer/articles/code-sample/sample-application-for-direct3d-12-flip-model-swap-chains.html
“Conceptually, the waitable object can be thought of as a semaphore which is initialized to the Maximum Frame Latency, and signaled whenever a present is removed from the Present Queue.”
Embarrassingly, I had read that sentence several times as the entire page has valuable information but it unfortunately didn’t sink in. It was only in retrospect after I had figured it out for myself that I understood what it was saying.
The way that I ended up stumbling on the answer myself was to test how many times the waitable object can be waited on before it blocks and doesn’t return. Although I haven’t discovered any way of doing the opposite (i.e. detecting when the present queue is full), waiting for a wait to block at least gives a pretty clear way of testing how many Present() calls are “queued” from the perspective of the waitable object.
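The semaphore description and the counting test can be illustrated with a small model in portable C++. This is only a sketch of the concept, not DXGI code: the class WaitableSwapChainModel and all of its members are hypothetical, TryWait() stands in for a zero-timeout wait (in a real application that would be something like WaitForSingleObjectEx() with a timeout of 0 on the handle from GetFrameLatencyWaitableObject()), and RetireOldest() stands in for a vblank removing the oldest present from the queue. What the real swap chain does when the queue is exceeded is one of my open questions, so the model does not try to represent that.

```cpp
#include <cassert>
#include <deque>

// Hypothetical model of the waitable object as a counting semaphore.
// None of these names exist in DXGI.
class WaitableSwapChainModel {
public:
    // The count starts at the maximum frame latency, per the Intel description.
    explicit WaitableSwapChainModel(int maxFrameLatency)
        : available_(maxFrameLatency) {}

    // Models a zero-timeout wait: consumes one signal if one is available,
    // otherwise reports that a real (infinite) wait would block.
    bool TryWait() {
        if (available_ == 0) {
            return false;
        }
        --available_;
        return true;
    }

    // Models Present(): the frame is queued; the call never blocks or fails.
    void Present(int frameId) { queue_.push_back(frameId); }

    // Models a vblank removing the oldest queued present (FIFO order),
    // which signals the waitable object once.
    bool RetireOldest() {
        if (queue_.empty()) {
            return false;
        }
        queue_.pop_front();
        ++available_;
        return true;
    }

    // The test described above: wait repeatedly until a wait would block,
    // and report how many waits succeeded.
    int CountWaitsBeforeBlocking() {
        int succeeded = 0;
        while (TryWait()) {
            ++succeeded;
        }
        return succeeded;
    }

private:
    int available_;          // signals that can currently be waited on
    std::deque<int> queue_;  // presents that have not been retired yet
};
```

In this model, a freshly created object with a maximum frame latency of 3 allows exactly 3 waits before blocking. After that, two Present() calls alone allow no further waits, and each retired present allows one more.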
I don’t understand why it gets initialized to the SetMaximumFrameLatency() value, however. It’s not really clear to me what specific vblanks are actually being waited on in this case (are they specific vblanks that get queued somehow, or is there special code that detects that they are dummy pre-queued frames and just waits for the next one?). Also, in my code at least I end up wanting to do my own initial Present() and wait to initialize some timing (which I need to do with my own calls because that’s how the DXGI frame statistics work), and so these pre-queued things just get in the way. It seems like the decision to do this was probably made because otherwise the obvious pattern of waiting before the first frame would have blocked forever, but I guess I would have preferred this to be more explicitly documented (rather than just urging programmers to remember to wait before the first frame without explaining why).
I also ran into problems trying to figure out how to discard a queued present using DXGI_PRESENT_RESTART. I wanted to do this when I detected that the call to Present() didn’t return until after a vblank, and so I knew that it missed the deadline and was going to take a future frame’s spot. It was surprisingly tricky for me to figure out how to actually make this work, though, since the waitable object has some implicit behavior. It kind of seems to me that DXGI_PRESENT_RESTART does not reset the waitable object’s internal count of waits (meaning that you can still wait for the number of successful Present() calls before it blocks indefinitely), which was frustrating because I didn’t know how to clear this out (the only way to clear it is to successfully wait, but successfully waiting meant I was delayed until the next vblank, which is exactly what I was trying to avoid). Eventually, though, I just pretended that it was cleared and then everything seemed to work (meaning that the successful returns from waits happen when I would want them to). It’s not clear to me what is actually happening, and whether my pretending the problem doesn’t exist will eventually come back to bite me.