So, Early Z is the name of my blog, but what exactly is early z?

Early z rejection is an optimization technique that lets the GPU skip executing the pixel shader when it can determine ahead of time that the given pixel would be discarded by depth testing. To do so, the GPU performs depth testing and updates the depth buffer before running the pixel shader. The opposite scenario, where depth testing and depth buffer updates happen after the pixel shader, is called late z.
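To make the difference concrete, here is a minimal CPU-side sketch of one pixel receiving several fragments. It is not how any real GPU is implemented: Fragment, PixelResult, shade, drawLateZ and drawEarlyZ are hypothetical names made up for this illustration, and it assumes a LESS depth comparison with the depth buffer cleared to 1.0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fragment: one candidate covering a single pixel.
struct Fragment {
    float depth;  // smaller means closer to the camera
    int   color;  // stand-in for whatever the pixel shader would output
};

// Hypothetical per-pixel state after drawing.
struct PixelResult {
    int         color = 0;
    float       depth = 1.0f;    // depth buffer cleared to the far plane
    std::size_t shaderRuns = 0;  // how many times the "pixel shader" ran
};

// Stand-in for an expensive pixel shader.
inline int shade(const Fragment& f) { return f.color; }

// Late z: shade every fragment, then depth test and update the depth buffer.
inline PixelResult drawLateZ(const std::vector<Fragment>& frags) {
    PixelResult r;
    for (const Fragment& f : frags) {
        const int c = shade(f);
        ++r.shaderRuns;
        if (f.depth < r.depth) {
            r.depth = f.depth;
            r.color = c;
        }
    }
    return r;
}

// Early z: depth test and update first, shade only the fragments that pass.
inline PixelResult drawEarlyZ(const std::vector<Fragment>& frags) {
    PixelResult r;
    for (const Fragment& f : frags) {
        if (!(f.depth < r.depth)) continue;  // rejected before shading
        r.depth = f.depth;
        r.color = shade(f);
        ++r.shaderRuns;
    }
    return r;
}
```

Both functions produce the same final color and depth; the only difference is how often shade is called, which is the whole point of the optimization.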

Early z is very interesting because, in scenes with a lot of overdraw, it lets the GPU reject a very large number of occluded pixels before paying the cost of shading them.

To use early z, the GPU needs to be sure that pixels that would fail the depth test after the vertex shader, when only the geometry has been processed, will not be visible at the end of the pipeline. For example, if the pixel shader modifies the z value of the pixel it is shading, it could make a previously occluded pixel visible.

Let’s list the ways in which occluded pixels (geometry-wise) could end up visible:

  • Depth testing is disabled (whatever is drawn last is visible)
  • Depth testing is enabled but the depth comparison function is set to ALWAYS (depth test always passes)
  • Alpha blending is enabled (a pixel that fails the depth test may be visible through another pixel)
  • The pixel shader modifies the z value of the pixel (writes to SV_Depth)
  • The pixel shader modifies the stencil buffer (stencil test passes or writes to SV_StencilRef)
  • The pixel shader modifies the MSAA coverage mask (SV_Coverage)
  • The pixel shader kills the pixel using the discard statement
  • The pixel shader performs unordered access operations (a pixel shader that fails early depth testing but writes to a UAV still needs to be executed)
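
One way to read the list above is as a single predicate over the current pipeline state. The sketch below is only an illustration of that reading: PipelineState and canUseEarlyZ are hypothetical names that do not belong to any real API, and actual drivers and hardware make this decision in their own ways.

```cpp
// Hypothetical snapshot of the state relevant to the decision.
// Field names are illustrative only and mirror the bullets above.
struct PipelineState {
    bool depthTestEnabled     = true;
    bool depthFuncAlways      = false;  // comparison function set to ALWAYS
    bool alphaBlendEnabled    = false;
    bool shaderWritesDepth    = false;  // SV_Depth
    bool shaderWritesStencil  = false;  // SV_StencilRef
    bool shaderWritesCoverage = false;  // SV_Coverage
    bool shaderUsesDiscard    = false;
    bool shaderWritesUav      = false;  // unordered access writes
};

// Returns true only when none of the listed hazards is present.
inline bool canUseEarlyZ(const PipelineState& s) {
    if (!s.depthTestEnabled) return false;
    if (s.depthFuncAlways) return false;
    if (s.alphaBlendEnabled) return false;
    if (s.shaderWritesDepth) return false;
    if (s.shaderWritesStencil) return false;
    if (s.shaderWritesCoverage) return false;
    if (s.shaderUsesDiscard) return false;
    if (s.shaderWritesUav) return false;
    return true;
}
```

A default-constructed PipelineState (opaque geometry, plain depth test) passes the check, and flipping any single hazard flag makes it fail.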

Please note that a pixel shader can force early z to be turned on using the [earlydepthstencil] HLSL attribute, but results may be undefined if some of the features mentioned earlier are used.

With early z on, it is useful to render a scene in a way that will cause as many pixels as possible to be rejected. This article from Intel shows two such techniques.
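
As a rough illustration of why draw order matters, the sketch below counts how many fragments of a single pixel survive an early depth test when submitted in a given order. Drawing front to back is one commonly used ordering; it is shown here as an assumption on my part, not necessarily as one of the techniques from the linked article. The helper names shadedWithEarlyZ and sortedFrontToBack are made up for this example, and a LESS comparison with a depth buffer cleared to 1.0 is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts the fragments that pass an early depth test, in submission order,
// for one pixel. Each passing fragment would be shaded.
inline std::size_t shadedWithEarlyZ(const std::vector<float>& depths) {
    float nearest = 1.0f;  // depth buffer cleared to the far plane
    std::size_t shaded = 0;
    for (float d : depths) {
        if (d < nearest) {  // passes the depth test, so it gets shaded
            nearest = d;
            ++shaded;
        }
    }
    return shaded;
}

// Returns a copy of the fragments ordered nearest first (front to back).
inline std::vector<float> sortedFrontToBack(std::vector<float> depths) {
    std::sort(depths.begin(), depths.end());
    return depths;
}
```

With four overlapping fragments submitted back to front, every one of them passes the test and gets shaded; submitted front to back, only the nearest one does.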

Additional Resources

A very nice article from Fabian Giesen going more in depth (no pun intended).
DirectX System-Value Semantics.
More info on SV_Coverage, a seldom documented feature.