Index Cube Shadow Mapping in OpenGL
using nVidia Register Combiners
by Ronald Frazier

In my last two articles, I discussed various real-time lighting techniques. The most recent one also introduced two techniques for creating shadows, shadow volumes and depth cube shadow mapping, along with the advantages and disadvantages of each. In this article I will introduce a third shadowing technique, called index cube shadow mapping.

It should be noted that in the images contained here and in the sample application, no form of back face rejection shadowing was used. This means that surfaces facing away from the light are still lit. While this is unrealistic, it was done for two reasons. The first reason is to make the application simpler for demonstration purposes. The second reason is to prevent the viewer from confusing shadows generated by the shadow mapping with those generated by back face rejection.

Basics of Index Cube Shadow Mapping
Index cube shadow mapping is actually very similar to depth cube shadow mapping. For each pixel, both techniques compare the polygon visible to the camera with the polygon visible to the light, and if they aren't the same polygon, the pixel should be shadowed. The difference is that in depth shadow mapping the basis of comparison is the distance, or depth, from the light to the polygon pixel, while in index shadow mapping the basis of comparison is an index value that is unique to each polygon.

Determining the Polygon Index
The first thing we need to decide is how to determine the index value for a given polygon. For this we have three options. The first is to use an RGB color value to represent the index. This theoretically gives us approximately 16.7 million indices to use in a scene. While this should be adequate for any scene rendered on current hardware, the disadvantage is that it requires 3 bytes for each pixel in our index cube map. The second option is to use a luminance value (or an alpha value) as our index value. The advantage here is that each pixel in the index cube map requires only 1 byte. However, this limits us to 256 unique indices, which is not a lot of polygons for a scene. The third option is to use luminance and alpha values together, requiring only 2 bytes but giving us 65,536 indices. While the third option does seem like a valid middle ground, and perhaps the best choice for many applications, for this article we will go straight to the top and use the RGB option.
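As a rough illustration of the three options, the sketch below packs a polygon index into each pixel format. The function names and the byte order (red or luminance holding the lowest byte) are assumptions made for this example and are not taken from the sample application.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct indices each format can represent. */
#define INDEX_COUNT_LUMINANCE       (1u << 8)   /* 1 byte:  256        */
#define INDEX_COUNT_LUMINANCE_ALPHA (1u << 16)  /* 2 bytes: 65,536     */
#define INDEX_COUNT_RGB             (1u << 24)  /* 3 bytes: 16,777,216 */

/* Option 1: RGB index. Red holds the lowest byte (assumed order). */
static void index_to_rgb(uint32_t index, uint8_t rgb[3])
{
    assert(index < INDEX_COUNT_RGB);
    rgb[0] = (uint8_t)(index & 0xFFu);
    rgb[1] = (uint8_t)((index >> 8) & 0xFFu);
    rgb[2] = (uint8_t)((index >> 16) & 0xFFu);
}

/* Option 2: a single luminance (or alpha) byte. */
static uint8_t index_to_luminance(uint32_t index)
{
    assert(index < INDEX_COUNT_LUMINANCE);
    return (uint8_t)index;
}

/* Option 3: luminance holds the low byte, alpha the high byte (assumed order). */
static void index_to_luminance_alpha(uint32_t index, uint8_t la[2])
{
    assert(index < INDEX_COUNT_LUMINANCE_ALPHA);
    la[0] = (uint8_t)(index & 0xFFu);
    la[1] = (uint8_t)((index >> 8) & 0xFFu);
}
```

With this layout, index 0x030201 becomes red 1, green 2, blue 3.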

Creating an Index Cube Map
Now that we know how to determine the index value for each polygon, the next thing we need to do is build the index cube map. To do so, we use the same technique as we did with depth cube maps (render six 90° views from the light's viewpoint). However, this time, instead of calculating per-pixel depth, we just flat shade each polygon in the color that represents its index value. It is important to reset the index value each time we render the scene and to increment it for each polygon as we render, so that a given polygon receives the same index in every pass. The following image shows the index cube map for a sample scene.


Index Cube Map

There are two things to note about this image. The first is that, because the scene contains a small number of polygons, everything appears in a shade of red (since red shades occur at the low end of our index range). In a complex enough scene, we would begin to see a variety of colors (representing the entire range of indices). The second thing to note is that the brightness and contrast of the image have been modified to make it easier to see each polygon.
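The index bookkeeping for one rendering pass might look like the sketch below. It is a minimal illustration, not the sample application's code: `IndexColor`, `index_to_color`, and `assign_pass_colors` are hypothetical names, the step is one per polygon for simplicity, and the actual drawing (flat shading each polygon in its color) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } IndexColor;

/* Pack a 24-bit index into RGB; red holds the lowest byte (assumed order). */
static IndexColor index_to_color(uint32_t index)
{
    IndexColor c;
    assert(index < (1u << 24));
    c.r = (uint8_t)(index & 0xFFu);
    c.g = (uint8_t)((index >> 8) & 0xFFu);
    c.b = (uint8_t)((index >> 16) & 0xFFu);
    return c;
}

/* One pass over the scene: reset the index, then hand out one value per
   polygon in scene order. A renderer would flat shade polygon i with
   colors[i] instead of storing it. */
static void assign_pass_colors(IndexColor *colors, size_t polygon_count)
{
    uint32_t index = 0;  /* reset at the start of every pass */
    size_t i;
    assert(colors != NULL);
    for (i = 0; i < polygon_count; ++i) {
        colors[i] = index_to_color(index);
        ++index;         /* advance once per polygon */
    }
}
```

Because the index is reset at the start of every pass and the polygons are visited in the same order, each of the six cube face passes and the later camera pass assign identical colors to a given polygon.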

Rendering Shadows to the Stencil Buffer
As with depth cube shadow mapping, the next step is to re-render the scene from the camera's point of view. While doing so, we "project" the index cube map out onto the scene. For each pixel, we then have 2 indices to compare, one for the camera and one for the light. If the 2 indices are not the same, then we conclude that the current pixel should be shadowed. Otherwise, the pixel should be lit.

In order to compare the two index values, we need to configure the register combiners to output an alpha value of 0 if the index values match, or a value greater than 0 if they don't. To do this, we break the process into two steps. The first step is to take the difference of the two colors on a component-by-component basis. Next, if the red, green, or blue difference is non-zero, we need to output an alpha value greater than 0. If all three difference components are 0, the resulting alpha value should be 0. Since we only care whether the alpha value is zero or non-zero, and not what the non-zero value actually is, this makes things easy on us. The first thought would be to take the absolute value of each difference component, add them together, and use the resulting value as the final alpha output. The closest thing the register combiners offer is a dot product. Since 0*0 = 0, and since any non-zero number squared is positive, the dot product works perfectly for this. So we take the dot product of the index difference with itself and use it as our output alpha value. The code below shows how to configure the register combiners to calculate this difference, assuming that the current polygon index is stored in the primary color and the index cube map is bound to texture unit 0.

//setup the register combiners
glCombinerParameteriNV(GL_NUM_GENERAL_COMBINERS_NV, 2);

//calculate the difference between the reference index value (primary color RGB)
//and the index value from the index cube map (tex0 RGB)
//scale the result by four (this helps prevent rounding error when the LSBs are truncated)
//since we only care whether there is a difference (and not the actual difference) scaling won't hurt
//output this to spare0 RGB

glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_A_NV, GL_PRIMARY_COLOR_NV, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_B_NV, GL_ZERO, GL_UNSIGNED_INVERT_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_C_NV, GL_TEXTURE0_ARB, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER0_NV, GL_RGB, GL_VARIABLE_D_NV, GL_ZERO, GL_EXPAND_NORMAL_NV, GL_RGB);
glCombinerOutputNV(GL_COMBINER0_NV, GL_RGB, GL_DISCARD_NV, GL_DISCARD_NV, GL_SPARE0_NV, GL_SCALE_BY_FOUR_NV, GL_NONE, GL_FALSE, GL_FALSE, GL_FALSE);

//ignore the combiner 0 Alpha
glCombinerOutputNV(GL_COMBINER0_NV, GL_ALPHA, GL_DISCARD_NV, GL_DISCARD_NV, GL_DISCARD_NV, GL_NONE, GL_NONE, GL_FALSE, GL_FALSE, GL_FALSE);

//take the dot product of the difference (spare0 RGB) with itself,
//this gives us the total difference
//again, scale by four to help prevent LSB truncation errors
//output to spare1 RGB

glCombinerInputNV(GL_COMBINER1_NV, GL_RGB, GL_VARIABLE_A_NV, GL_SPARE0_NV, GL_SIGNED_IDENTITY_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER1_NV, GL_RGB, GL_VARIABLE_B_NV, GL_SPARE0_NV, GL_SIGNED_IDENTITY_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER1_NV, GL_RGB, GL_VARIABLE_C_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glCombinerInputNV(GL_COMBINER1_NV, GL_RGB, GL_VARIABLE_D_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glCombinerOutputNV(GL_COMBINER1_NV, GL_RGB, GL_SPARE1_NV, GL_DISCARD_NV, GL_DISCARD_NV, GL_SCALE_BY_FOUR_NV, GL_NONE, GL_TRUE, GL_FALSE, GL_FALSE);

//ignore the combiner 1 Alpha
glCombinerOutputNV(GL_COMBINER1_NV, GL_ALPHA, GL_DISCARD_NV, GL_DISCARD_NV, GL_DISCARD_NV, GL_NONE, GL_NONE, GL_FALSE, GL_FALSE, GL_FALSE);

//output zero as final RGB
//output difference as Alpha (if Alpha > 0, pixel is shadowed, else it's unshadowed)

glFinalCombinerInputNV(GL_VARIABLE_A_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glFinalCombinerInputNV(GL_VARIABLE_B_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glFinalCombinerInputNV(GL_VARIABLE_C_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glFinalCombinerInputNV(GL_VARIABLE_D_NV, GL_ZERO, GL_UNSIGNED_IDENTITY_NV, GL_RGB);
glFinalCombinerInputNV(GL_VARIABLE_G_NV, GL_SPARE1_NV, GL_UNSIGNED_IDENTITY_NV, GL_BLUE);

glEnable(GL_REGISTER_COMBINERS_NV);
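
For reference, the arithmetic that this combiner setup performs can be written out on the CPU. The sketch below is a floating-point approximation only: it assumes colors in the range [0, 1], that general combiner outputs are clamped to [-1, 1], and that the final alpha is clamped to [0, 1]. It ignores the limited precision of the real hardware, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static float clamp_range(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Combiner 0: d = 4 * (reference - from_map), per component.
   Combiner 1: 4 * dot(d, d).
   Final combiner: that value becomes the output alpha. */
static float shadow_alpha(const float reference[3], const float from_map[3])
{
    float dot = 0.0f;
    int i;
    assert(reference != NULL && from_map != NULL);
    for (i = 0; i < 3; ++i) {
        float d = clamp_range(4.0f * (reference[i] - from_map[i]), -1.0f, 1.0f);
        dot += d * d;
    }
    return clamp_range(4.0f * dot, 0.0f, 1.0f);
}

/* A pixel is shadowed when the two indices differ, i.e. when alpha > 0. */
static int is_shadowed(const float reference[3], const float from_map[3])
{
    return shadow_alpha(reference, from_map) > 0.0f;
}
```

Identical colors give an alpha of exactly zero, and a difference in any single channel gives a positive alpha regardless of its sign.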

There is one important thing to consider about the register combiner calculation. Due to limited precision in current versions of the register combiners, when the difference between two index values is very small, squaring the values in the second step causes the result to be truncated to zero. As a result, a pixel that should be shadowed ends up lit. To combat this problem we do two things. First, when we increment the index value, we always increment by 2 steps. The only negative effect is that our index range shrinks from over 16 million indices to approximately 2 million (because each of the 8-bit components must be scaled by 2, we end up with three 7-bit values, or only 21 bits of precision). However, this should still be more than adequate for quite some time. Second, we scale all intermediate values by four. This moves all important values out of the bits that would get truncated in later calculations.
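
The effect of truncation can be explored with a very simplified integer model. The sketch below assumes that every value is a multiple of 1/255 and that the dot product is truncated to that granularity after the output scale is applied. Real register combiners may round or store intermediate values differently, so treat the numbers as illustrative only; `quantized_alpha` is a hypothetical helper.

```c
#include <assert.h>

#define MAX_LEVEL 255L  /* one step = 1/255 (assumed precision) */

static long clamp_long(long v, long lo, long hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* dr, dg, db: per-channel index differences, in 8-bit steps.
   scale: output scale used by both combiners (1, 2, or 4).
   Returns the resulting alpha in steps of 1/255. */
static int quantized_alpha(int dr, int dg, int db, int scale)
{
    long d[3];
    long dot = 0;
    int i;
    assert(scale == 1 || scale == 2 || scale == 4);
    d[0] = dr; d[1] = dg; d[2] = db;
    for (i = 0; i < 3; ++i) {
        /* combiner 0: scaled difference, clamped to [-1, 1] */
        long s = clamp_long(d[i] * scale, -MAX_LEVEL, MAX_LEVEL);
        dot += s * s;  /* in units of 1/255^2 */
    }
    /* combiner 1: scale the dot product, then truncate to 1/255 units */
    return (int)clamp_long((dot * scale) / MAX_LEVEL, 0, MAX_LEVEL);
}
```

Under these assumptions a one-step difference with scaling by four produces 64/255 of a step, which truncates to zero, while a two-step difference produces 256/255 of a step, just enough to survive. Without the scaling, a two-step difference would also be lost.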

I should also note here that I am currently testing this application on a GeForce 2. While it should be safe to assume that future cards will have more precision and will thus work fine with this algorithm, I cannot guarantee that the original GeForce card has the same precision in the register combiners. If it does have less precision, it may be necessary to increment index values by 3 or 4 steps instead of 2.

Limitations, Optimizations, and Alternatives
As with every technique, there are some problems with index cube shadow mapping. The first problem is that, just like depth shadow mapping, shadow resolution is limited to the size of the index cube map and that shadow resolution also degrades as the distance from the light increases. The only way to combat both of these problems is to increase the size of the index cube map. However, this increases video memory consumption and also decreases performance. The only solution to these problems is the inevitable increase of memory, bandwidth, and performance in future hardware.
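
To put a rough number on how resolution degrades with distance, the sketch below estimates the width that one cube map texel covers on a surface facing the light. It assumes the surface lies at the center of a cube face and is perpendicular to the face axis; toward the face edges and on slanted surfaces the footprint is larger. The function name is hypothetical.

```c
#include <assert.h>

/* Each cube face covers a 90 degree field of view, so at a given distance
   it spans a width of 2 * distance, shared among face_size texels. */
static double texel_footprint(double distance, int face_size)
{
    assert(distance >= 0.0 && face_size > 0);
    return 2.0 * distance / (double)face_size;
}
```

For example, with 256 x 256 faces a surface 8 units from the light gets one texel per 0.0625 units. Doubling the distance doubles the footprint, and doubling the face size halves it at four times the memory cost.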

The next problem with index cube mapping is also related to index cube map size limitations. As a polygon gets represented with fewer and fewer pixels in the index cube map, errors start to crop up around the edges of polygons due to sampling error. While this problem also exists in depth cube shadow mapping, it is far less noticeable there. One reason is that in depth cube shadow mapping the error only occurs at the edges of objects, where there is an extreme difference in depth from one pixel to the next. Across the surface of an object, there is only a smooth and gradual change in depth, even where polygons meet. In index shadow mapping, however, the algorithm actually makes a distinction between individual polygons, so we see these types of error at every polygon joint. The following image demonstrates the problem.


Errors at polygon joints

One way to solve this last problem is to use only a single index value for all polygons of a single object, thus incrementing the index on a per object basis rather than per polygon. The advantage of this is that it also makes the luminance or luminance + alpha indexing more viable alternatives to RGB indexing (since we would need fewer index values for the scene). The downside to this technique is that we then lose the ability to have a non-convex object cast a shadow upon itself. Notice how in the image below (which uses per object indexing instead of per polygon indexing), the torus doesn't shadow itself as it does in the image above.


No errors at polygon joints, but no self shadowing either

I should mention that even though there are still some shading errors in the above image, they should mostly be covered up in typical applications once you add in diffuse lighting calculations (which will darken surfaces that face away from the light). So per object indexing seems to work if you have convex objects or can tolerate non-convex objects not casting shadows on themselves.
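
The difference between the two indexing schemes comes down to where the index is incremented. The sketch below is a minimal illustration with hypothetical names: it takes the number of polygons in each object and writes out the index each polygon would receive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* polygon_counts[o] is the number of polygons in object o.
   out receives one index per polygon, in scene order.
   Returns the number of distinct indices used. */
static uint32_t assign_indices(const size_t *polygon_counts,
                               size_t object_count,
                               int per_object,
                               uint32_t *out)
{
    uint32_t index = 0;
    size_t o, p, n = 0;
    assert(polygon_counts != NULL && out != NULL);
    for (o = 0; o < object_count; ++o) {
        for (p = 0; p < polygon_counts[o]; ++p) {
            out[n++] = index;
            if (!per_object) {
                ++index;  /* per polygon: every polygon is unique */
            }
        }
        if (per_object) {
            ++index;      /* per object: all polygons share one value */
        }
    }
    return index;
}
```

For a scene with two objects of 2 and 3 polygons, per polygon indexing uses five indices while per object indexing uses only two, which is what makes the 1 byte and 2 byte index formats more practical.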

Another alternative would be to break an object up into sections, and use a different index for each section. However, this would only reduce the errors slightly, not eliminate them.

Finally, another problem, which is a variation of the above problem, would show up when using very small polygons. If the polygons got small enough, they might disappear from the index cube map completely.

Conclusion
Now that we have examined index cube shadow mapping, and now that we have seen its problems, one question begs to be asked: should we even bother with index cube shadow mapping? That depends. In general, it seems to me that depth cube shadow mapping and shadow volumes are more viable techniques than index cube shadow mapping. While I am pretty confident that with some additional effort we could reduce the errors at polygon boundaries, my gut instinct is that we wouldn't be able to reduce them enough to make the results any better than depth cube shadow mapping. Also, depth cube shadow mapping has the potential that the depth values could be used in some type of special effects application.

Does this mean index cube shadow mapping is useless? Not necessarily. First of all, it should be noted that with this technique, we only require 1 texture unit to perform the shadowing, whereas depth cube shadow mapping requires at least 2 (more if we want increased precision). In some situations this could be an advantage. Another thing to note about index cube shadow mapping is that the shadowing errors around polygon edges look almost cartoonish. Perhaps there could be some application for this effect in cartoon rendering engines. Finally, the per polygon indexing might also have some application as a type of selection buffer.

In all, I would prefer to think of index cube shadow mapping as just another specialized tool in a bag of programming tricks. It might not be the best tool for everyday use, but someday it just might come in handy.

Source Code and Executable
The source code, compiled executable, and necessary image files for the demo application can be downloaded here

References

  1. Photorealistic Lighting Effects Using Per-pixel Multitexturing Operations, Jason L. Mitchell and Faisal Qaisi, 2000
  2. Improving Shadows and Reflections via the Stencil Buffer, Mark Kilgard, 1999
  3. Shadow Mapping with Today’s OpenGL Hardware, Mark Kilgard, 2000
  4. Texture Compositing With Register Combiners, John Spitzer, 2000
  5. Cube Maps, Sim Dietrich, 2000