Thursday, December 11, 2008

Pinching Cascading Shadow Maps

The battle with cascading shadow maps (CSM) is always one for resolution.  We could use 25 layers of CSM, but that would defeat the whole purpose of CSM.  Ad hoc shadow maps deliver really good, precise shadows, but with two drawbacks:
  • They are horribly dependent on the size of the objects in your world.  For small objects they produce crisp shadows - for big ones they produce muck.
  • Odds are the number of objects floating around (cars, buildings, etc.) is several orders of magnitude larger than the number of CSM layers you might use.  I get good results with 8 CSM layers, and can probably reduce that with careful optimization.  I usually have more than 8 buildings on screen.  (That means a lot of thrash as we create, then use each shadow map, with GL setup each time.)
Yesterday I found (by misunderstanding the NVIDIA white paper) a way to squeeze a little bit more detail out of my CSM layers.

For each "layer" (that is, a distance-wise partition of the user's frustum that gets a separate shadow map) we normally calculate the shadow map's bounding box around the corners of the user's view sub-frustum (for that layer).

But that's really a much bigger bounding box than we need.  For the price of an iteration over the scene graph* we can calculate a smaller bounding box that is "pinched in" to the edge of the content we need to draw.
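
A minimal sketch of the pinch, assuming every box has already been transformed into the sun's (light) space and is axis-aligned there. The names `Box`, `overlaps` and `pinch` are made up for this illustration and are not from any engine. It also assumes `frustum_box` already reaches far enough toward the sun to include anything that could cast a shadow into the layer.

```python
from typing import Iterable, Optional, Tuple

# An axis-aligned box in light space: ((min_x, min_y, min_z), (max_x, max_y, max_z)).
Box = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes intersect on all three axes."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))


def pinch(frustum_box: Box, content: Iterable[Box]) -> Optional[Box]:
    """Shrink frustum_box to the content that overlaps it.

    Returns None when no content touches the layer.
    """
    lo = [float("inf")] * 3
    hi = [float("-inf")] * 3
    found = False
    for box in content:
        if not overlaps(frustum_box, box):
            continue  # this object cannot affect the layer
        found = True
        for i in range(3):
            lo[i] = min(lo[i], box[0][i])
            hi[i] = max(hi[i], box[1][i])
    if not found:
        return None
    # Never grow past the original box: clamp the content bounds to it.
    lo = [max(lo[i], frustum_box[0][i]) for i in range(3)]
    hi = [min(hi[i], frustum_box[1][i]) for i in range(3)]
    return (tuple(lo), tuple(hi))
```

The clamp at the end is why an object bigger than the layer (a multi-kilometer ground patch, say) gives no pinch at all: its bounds are cut back to the original frustum box.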

Is this a win?  It depends on:
  • The size of your scenery content.  You won't get a pinch in that's much smaller than the smallest indivisible entities in your world.
  • The overall shape of your scenery content - if it really fills the frustum, no win.
Evaluating this on scenery, I am seeing:
  • Very little benefit for the nearest layers.  They are usually so small (a few dozen meters) that they include much larger scenery entities.  (Our ground patches can be several kilometers, so no win.)  But for the far layers, we might reduce the bounding box by 25-50% in some dimensions.  A 50% reduction is almost like doubling your texture resolution along that axis!
  • The shape of our content is a win.  Since the world is sort of flat and 2-dimensional, usually at least one axis of the bounding box (depending on the sun's angle) is horribly wasted.  That's where we get a win.
Example: the user's camera is on the ground, facing north.  The sun is directly overhead.  Now the user's frustum indicates that, in the farther layers, content that is significantly below or above the ground would be visible.  (We don't have occlusion culling here.)

But in practice, there is no scenery below the ground.  So we can "pinch in" the far clip plane of the sun's camera (which is really far from the sun, just in case anything below the surface of the earth is visible), bringing that far clip plane all the way up to the lowest ground point. Similarly, if we're not shadowing clouds (they are handled separately) the near clip plane can be pushed down to the tallest building.

This makes the far layers much, much more useful.  Normally if the layer is 10 km away at 60 degrees FOV, the bounding box for the shadow map is going to have 10000 meters from its near to far plane.  If we "pinch in", we can reduce this "depth of field" to the difference between the lowest and highest scenery items, which might only be 100 or 200 meters.
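
To put rough numbers on that, here is a small sketch. It assumes the 60 degrees is the vertical field of view and that the sun camera uses an orthographic projection, so depth values are spread linearly between the near and far planes. `slice_extent` is a hypothetical helper name.

```python
import math


def slice_extent(distance_m: float, fov_deg: float) -> float:
    """Height of the view frustum's cross-section at the given distance."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


# With the sun overhead, the frustum's height is the sun camera's depth range.
unpinched_m = slice_extent(10_000.0, 60.0)  # on the order of 10 km
pinched_m = 200.0                           # lowest to highest scenery item

# With linear (orthographic) depth, the same depth buffer covers a range
# this many times smaller, so each depth step is that much finer.
gain = unpinched_m / pinched_m
print(f"{unpinched_m:.0f} m -> {pinched_m:.0f} m ({gain:.0f}x finer depth steps)")
```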

(This is of course a best-case scenario...put a mountain in there and angle the sun a bit and the win is much more modest.)

As a side effect, the scene graph traversal lets us completely eliminate layers that contain no content - I am finding I drop at least one layer that way.

EDIT: the numbers above are misleading.  They hold when shadowing only 3-d content, but if the terrain mesh is included (it is much larger, and its granularity is coarser), the savings all vanish.

* Lazy evaluation can make this a lot faster - simply skip whole sub-trees of the scene graph that are already entirely inside the "pinched" box.
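
A sketch of that lazy traversal, under the assumption that each scene graph node stores a light-space bounding box that encloses all of its children. `Node` and `pinch_lazy` are hypothetical names for this illustration, not a real scene graph API.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                      # min corner of this node's bounds
    hi: Vec3                      # max corner; bounds enclose all children
    children: List["Node"] = field(default_factory=list)


def pinch_lazy(root: Node, clip_lo: Vec3, clip_hi: Vec3):
    """Return (pinched box or None, number of nodes visited)."""
    acc_lo = [math.inf] * 3
    acc_hi = [-math.inf] * 3
    visited = 0
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        # Entirely outside the layer's box: nothing to contribute.
        if any(node.hi[i] < clip_lo[i] or node.lo[i] > clip_hi[i] for i in range(3)):
            continue
        # Clamp the node to the layer's box; the pinch never grows past it.
        lo = [max(node.lo[i], clip_lo[i]) for i in range(3)]
        hi = [min(node.hi[i], clip_hi[i]) for i in range(3)]
        # Already inside the pinched box so far: skip the whole sub-tree.
        if all(acc_lo[i] <= lo[i] and hi[i] <= acc_hi[i] for i in range(3)):
            continue
        if node.children:
            stack.extend(reversed(node.children))  # visit in stored order
        else:
            for i in range(3):
                acc_lo[i] = min(acc_lo[i], lo[i])
                acc_hi[i] = max(acc_hi[i], hi[i])
    if acc_lo[0] > acc_hi[0]:
        return None, visited  # no content in this layer at all
    return (tuple(acc_lo), tuple(acc_hi)), visited
```

How much this saves depends on traversal order: the earlier the big items are seen, the more sub-trees fall inside the box and get skipped. A `None` result is also the signal that a layer holds no content and can be dropped.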
