While working on dual paraboloid point light shadows, at first I used two R32 textures for the shadow maps, one map per hemisphere of the point light.
Here’s a quick look at how the point light shadows currently look in a deferred rendering setup, using a small 7-tap PCF filter and a 512×512 shadow map. The model is Crytek’s Sponza.
I wanted to save some texture bandwidth, so I tried to put one shadow map in the R channel of an R16G16 texture and the other shadow map in the G channel. However, you cannot use ColorWriteChannels on a non-blendable texture format in XNA (and HalfVector2 textures are not blendable).
So I tried using two channels of an A2R10G10B10 texture instead (i.e. HdrBlendable on the Xbox). This worked, but it didn’t give me the bandwidth savings I was hoping for. It turns out that on the Xbox this format is stored as 32bpp in the EDRAM, but it resolves to HalfVector4 (R16G16B16A16), 64bpp, when copied to system memory. Because the format doubles in size on resolve, I ended up using the same amount of texture bandwidth as with the two R32 textures.
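To put numbers on this, here is a rough sketch of the resolve sizes for the three layouts discussed in this post. It is plain Python rather than anything from the renderer, the helper name `resolve_bytes` is made up for illustration, and it only counts raw texel data (no padding or tiling overhead).

```python
# Bytes moved when resolving shadow maps from EDRAM to system memory.
# Only raw texel data is counted; any padding or tiling is ignored.

def resolve_bytes(width, height, bits_per_pixel, texture_count=1):
    """Total bytes resolved for `texture_count` textures of the given size."""
    return width * height * bits_per_pixel // 8 * texture_count

SIZE = 512  # shadow map resolution used in this post

# One R32 texture per hemisphere.
two_r32 = resolve_bytes(SIZE, SIZE, 32, texture_count=2)
# One HdrBlendable texture, which resolves as 64bpp (HalfVector4).
hdr_blendable = resolve_bytes(SIZE, SIZE, 64)
# One 32bpp, 8-bit-per-channel texture holding both hemispheres.
packed_rgba8 = resolve_bytes(SIZE, SIZE, 32)

for name, size in [("two R32", two_r32),
                   ("HdrBlendable", hdr_blendable),
                   ("packed 32bpp", packed_rgba8)]:
    print(f"{name}: {size // 1024} KiB")
```

The first two layouts come out identical, and the packed 32bpp layout resolves half as much data.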
Instead, I’m experimenting with packing two float values into a 32bpp texture with 8 bits per channel. The cons are:
- Packing/unpacking costs time in the pixel shader.
- The blend state must be switched to mask the color channels when moving on to drawing the next shadow map.
- Lower precision (although this is not a problem at all if the point lights are small enough and the position of the fixed decimal point is chosen well).
The pros are:
- Half the texture bandwidth used to resolve the shadow map from the EDRAM to system memory.
- Avoid having to switch render targets when drawing the next shadow map.
- Save a little bit of texture bandwidth when reading from the shadow map (when drawing the light), because some lookups will use both channels at the same time. Mostly the R channel will have totally separate lookup locations from the G channel since the channels represent opposite hemispheres, but in some places (along the seams?) caching will save a little bit of bandwidth.
Whether using two textures or one texture with packing is better depends mostly on how much of a bottleneck texture bandwidth is vs fill rate.
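For reference, the packing idea can be sketched outside the shader. This is a minimal Python illustration, not the actual pixel shader code, and it makes some assumptions of its own: each depth is already normalized to [0, 1] by the light's range, it is stored as 16-bit fixed point, and the first hemisphere gets the R and G channels while the second gets B and A. The function names are hypothetical.

```python
# Pack two normalized depths into one RGBA texel with 8 bits per channel.
# Assumed layout: (R, G) = first hemisphere, (B, A) = second hemisphere.

def pack_depth(depth):
    """Quantize a depth in [0, 1] to 16 bits and split it into two bytes."""
    clamped = min(max(depth, 0.0), 1.0)
    quantized = round(clamped * 65535)
    return quantized >> 8, quantized & 0xFF  # (high byte, low byte)

def unpack_depth(high, low):
    """Rebuild the normalized depth from its two bytes."""
    return ((high << 8) | low) / 65535

def pack_texel(front_depth, back_depth):
    """Store both hemispheres' depths in a single RGBA texel."""
    r, g = pack_depth(front_depth)
    b, a = pack_depth(back_depth)
    return r, g, b, a

def unpack_texel(texel):
    """Recover (front_depth, back_depth) from an RGBA texel."""
    r, g, b, a = texel
    return unpack_depth(r, g), unpack_depth(b, a)

texel = pack_texel(0.25, 0.75)
front, back = unpack_texel(texel)
print(texel, front, back)
```

On the GPU the channels hold byte/255 values rather than integers, and each hemisphere's pass would write only its own two channels with the other two masked off, which is where the blend state switch in the cons list comes from.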
There is yet another option.
For small shadow-casting point lights, 8 bits of depth precision might be enough. Then up to four shadow maps can be stored in a single texture, without any extra ALU cost for encoding. This might allow for many small shadow-casting point lights.
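To get a feel for whether 8 bits is enough, here is a small Python sketch of the depth resolution for a given light range, plus how one of the four maps could be read back from a texel. It assumes depth is stored linearly over the light's range, and the function names and the example range of 4 units are made up for illustration.

```python
# Four 8-bit shadow maps sharing one RGBA texel, one map per channel.
# Assumption: depth is stored linearly from 0 to the light's range.

def depth_step(light_range, bits=8):
    """Smallest depth difference that can be told apart at this bit depth."""
    return light_range / (2 ** bits - 1)

def quantize(depth, light_range, bits=8):
    """Convert a world-space depth to the integer stored in one channel."""
    levels = 2 ** bits - 1
    normalized = min(max(depth / light_range, 0.0), 1.0)
    return round(normalized * levels)

def read_channel(texel, map_index):
    """Select one map's value, like a dot product with a one-hot mask."""
    mask = [1 if i == map_index else 0 for i in range(4)]
    return sum(c * m for c, m in zip(texel, mask))

LIGHT_RANGE = 4.0  # hypothetical range of a small point light, in world units

texel = tuple(quantize(d, LIGHT_RANGE) for d in (1.0, 2.0, 3.0, 4.0))
print("depth step:", depth_step(LIGHT_RANGE))
print("map 0 value:", read_channel(texel, 0))
```

With a range of 4 units the depth step is about 0.016 units, so the shadow bias has to be at least that large. The smaller the light, the finer the step.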