HDRBlendable Xbox 360 performance and dual paraboloid point light shadow optimization.

While working on dual paraboloid point light shadows, at first I used two R32 textures for the shadow maps, one map per hemisphere of the point light.

Here’s a quick glimpse of what the point light shadows currently look like in a deferred rendering setup, using a small 7-tap PCF filter and a 512×512 shadow map. The model is Crytek’s Sponza.

Point light shadows using dual paraboloid shadow mapping.
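
For anyone unfamiliar with the technique, here is a minimal sketch of the usual dual paraboloid lookup, written in Python rather than shader code so it is easy to run. It is not the code from my renderer. The function name, the "front"/"back" labels, and the convention that +Z is the front hemisphere are all assumptions made for illustration.

```python
import math

def paraboloid_uv(direction):
    """Map a light-space direction to (hemisphere, u, v) with u, v in [0, 1].

    Assumes +Z is the axis of the front paraboloid (an illustrative
    convention, not something fixed by the technique).
    """
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / length, y / length, z / length

    # Pick the hemisphere, then divide by (1 + |z|) so that the whole
    # hemisphere lands inside the unit disc.
    if z >= 0.0:
        hemisphere, denom = "front", 1.0 + z
    else:
        hemisphere, denom = "back", 1.0 - z
    px, py = x / denom, y / denom

    # Remap from [-1, 1] to [0, 1] texture coordinates.
    return hemisphere, px * 0.5 + 0.5, py * 0.5 + 0.5
```

A direction straight down the axis lands in the centre of its map, and a direction perpendicular to the axis lands on the rim, which is where the two maps meet.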

I wanted to save some texture bandwidth, so I tried putting one shadow map in the R channel of an R16G16 texture and the other in the G channel. However, you cannot use ColorWriteChannels on a non-blendable texture format in XNA (and HalfSingle textures are not blendable).

So I tried using two channels of an A2R10G10B10 texture instead (i.e. HDRBlendable on the Xbox). This worked, but it didn’t give me the bandwidth savings I was hoping for. It turns out that on the Xbox this format is stored as 32bpp in the EDRAM, but resolves to HalfVector4 (R16G16B16A16), 64bpp, when copied to system memory. Because the format doubles in size on resolve, I ended up using the same amount of texture bandwidth as before.
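
To put rough numbers on that, here is a small back-of-the-envelope calculation. It assumes the 512×512 shadow map size mentioned earlier and counts only raw surface size, ignoring tiling, padding and caching. The helper name is made up for this sketch.

```python
def surface_bytes(width, height, bits_per_pixel):
    """Raw size of an uncompressed surface in bytes (no padding or tiling)."""
    return width * height * bits_per_pixel // 8

SIZE = 512  # shadow map resolution used in the screenshot above

two_r32 = 2 * surface_bytes(SIZE, SIZE, 32)   # two separate 32bpp maps
hdr_in_edram = surface_bytes(SIZE, SIZE, 32)  # HDRBlendable while in EDRAM
hdr_resolved = surface_bytes(SIZE, SIZE, 64)  # same surface after resolve
packed_rgba8 = surface_bytes(SIZE, SIZE, 32)  # one 8-bit/channel texture

for name, size in [("two R32 maps", two_r32),
                   ("HDRBlendable in EDRAM", hdr_in_edram),
                   ("HDRBlendable resolved", hdr_resolved),
                   ("one 32bpp packed map", packed_rgba8)]:
    print(f"{name}: {size / (1024 * 1024):.1f} MiB")
```

Under these assumptions the resolved HDRBlendable surface is exactly as large as the two R32 maps combined, while a single 32bpp texture would be half that.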

Instead, I’m experimenting with packing two float values into a 32bpp, 8-bit-per-channel texture. The cons are:

  • Packing/unpacking costs time in the pixel shader.
  • Must switch blend state to mask the color channels when switching to drawing the next shadow map.
  • Lower precision (although this is not a problem at all if the point lights are small enough and the position of the fixed decimal point is chosen well).

The pros are:

  • Half the texture bandwidth used to resolve the shadow map from the EDRAM to system memory.
  • Avoid having to switch render targets when drawing the next shadow map.
  • Save a little bit of texture bandwidth when reading from the shadow map (when drawing the light), because some lookups will use both channels at the same time. Mostly the R channel will have totally separate lookup locations from the G channel since the channels represent opposite hemispheres, but in some places (along the seams?) caching will save a little bit of bandwidth.
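
As a rough illustration of the packing idea, here is a CPU-side sketch in Python that splits each depth into two 8-bit channels (16 bits of fixed point per shadow map). This is one possible layout, not necessarily the one I will end up using. The function names are hypothetical, and putting the first hemisphere in R/G and the second in B/A is an assumption.

```python
def pack_depth16(depth):
    """Quantize a depth in [0, 1] to 16 bits and split it into two bytes."""
    clamped = min(max(depth, 0.0), 1.0)
    q = int(round(clamped * 65535.0))
    return q >> 8, q & 0xFF  # (high byte, low byte)

def unpack_depth16(high, low):
    """Rebuild the depth in [0, 1] from its two bytes."""
    return ((high << 8) | low) / 65535.0

def pack_texel(front_depth, back_depth):
    """Assumed layout: front hemisphere in R/G, back hemisphere in B/A."""
    r, g = pack_depth16(front_depth)
    b, a = pack_depth16(back_depth)
    return r, g, b, a

def unpack_texel(texel):
    r, g, b, a = texel
    return unpack_depth16(r, g), unpack_depth16(b, a)
```

With a layout like this, drawing the first shadow map would write only R and G, and drawing the second would write only B and A, which is where the color write mask switch in the cons list comes from.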

Whether using two textures or one texture with packing is better depends mostly on how much of a bottleneck texture bandwidth is vs fill rate.

There is yet another option.

For small shadow-casting point lights, 8 bits of depth precision might be enough. Then up to four shadow maps can be stored in a single texture, one per channel, without any extra encoding ALU cost. This might allow for many small shadow-casting point lights.
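
To get a feel for whether 8 bits is enough, here is a quick sketch that estimates the size of one depth step. It assumes depth is stored linearly as distance divided by the light radius, which is an assumption for this example. The function names are made up.

```python
def depth_step(light_radius, bits):
    """World-space size of one depth step for linear depth over the radius."""
    return light_radius / (2 ** bits - 1)

def quantize8(depth):
    """Quantize a depth in [0, 1] to a single 8-bit channel value."""
    return int(round(min(max(depth, 0.0), 1.0) * 255.0))

def pack_four(depths):
    """Store four independent shadow map depths in one RGBA texel."""
    r, g, b, a = (quantize8(d) for d in depths)
    return r, g, b, a

for radius in (1.0, 5.0, 20.0):
    print(f"radius {radius}: 8-bit step {depth_step(radius, 8):.4f}, "
          f"16-bit step {depth_step(radius, 16):.6f}")
```

Under that assumption, a light with a radius of 5 units gets depth steps of about 0.02 units with 8 bits, so the shadow bias has to be at least that large. Whether that is acceptable depends on the scale of the scene.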
