ATI Tootle

My list of subscribed blogs failed to notify me of the release of an interesting tool by ATI called Tootle, and of the paper on which it is based. (Well, your “blogroll” is better than mine, because now you’ve learned about it from me.) It rearranges mesh indices to optimize both for vertex cache hit rate AND for early-Z hit rate of the resulting pixels. It does so by splitting the mesh into several “roughly planar” patches, doing traditional vertex cache order optimization within each patch, and ordering the patches to minimize overdraw.
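To make “vertex cache hit rate” concrete, here is a minimal sketch of how it is commonly measured: run the index list through a simulated cache and count misses per triangle (often called ACMR). The FIFO policy, the cache size of 16, and the function name are assumptions for illustration; they are not taken from Tootle or the paper.

```python
from collections import deque

def misses_per_triangle(indices, cache_size=16):
    """Simulate a FIFO post-transform vertex cache over a triangle list.

    cache_size and the FIFO replacement policy are illustrative
    assumptions, not parameters taken from Tootle.
    """
    triangle_count = len(indices) // 3
    if triangle_count == 0:
        raise ValueError("need at least one triangle")
    # A deque with maxlen evicts the oldest entry when a new one is appended.
    cache = deque(maxlen=cache_size)
    misses = 0
    for vertex in indices:
        if vertex not in cache:
            misses += 1
            cache.append(vertex)
    return misses / triangle_count

# Two triangles sharing an edge: 4 unique vertices, all fit in the cache.
print(misses_per_triangle([0, 1, 2, 2, 1, 3]))
```

A lower value means more vertices were reused from the cache; reordering indices changes this number without changing the mesh.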

They do it, however, at what seems to be an atrocious preprocessing cost. For example, for a mesh split into 32 patches (the low end of the range they cite), they do on the order of 32*32*162 renders of portions of the mesh to determine patch vs. patch occlusion via hardware occlusion culling. It’s not surprising they don’t mention preprocessing times in their results: rendering roughly 160K frames should take on the order of minutes, even if you’re being clever.
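The arithmetic behind that estimate can be sketched as follows. The throughput of 1,000 renders per second is a made-up figure chosen only to show the order of magnitude, and the function names are hypothetical.

```python
def pairwise_render_count(num_patches, renders_per_pair):
    """Renders needed if every ordered patch pair is tested."""
    return num_patches * num_patches * renders_per_pair

def estimated_minutes(render_count, renders_per_second):
    """Wall-clock estimate; renders_per_second is an assumed throughput."""
    return render_count / renders_per_second / 60.0

renders = pairwise_render_count(32, 162)
print(renders)  # 165888

# Assumed (hypothetical) throughput of 1000 small renders per second.
print(estimated_minutes(renders, 1000.0))
```

Because the count grows with the square of the patch count, doubling the number of patches quadruples the work.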

    It’s not quite as bad as minutes. We do bounding-box overlap tests for each cluster pair to reject the ones that don’t overlap before rendering them. This gets rid of about 90% of the pairs, and rendering the rest is pretty quick (only a few seconds for small cluster counts).
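The rejection step described in this comment could look roughly like the sketch below. The comment does not say in which space the boxes are compared or how they are built, so this assumes plain axis-aligned boxes around each cluster's points, in whatever dimension the points have; all names are hypothetical.

```python
from itertools import combinations

def bounding_box(points):
    """Axis-aligned box (mins, maxs) around a non-empty list of points."""
    axes = list(zip(*points))
    return tuple(min(a) for a in axes), tuple(max(a) for a in axes)

def boxes_overlap(box_a, box_b):
    """True if the two boxes intersect (touching counts as overlap)."""
    (min_a, max_a), (min_b, max_b) = box_a, box_b
    return all(lo_a <= hi_b and lo_b <= hi_a
               for lo_a, hi_a, lo_b, hi_b in zip(min_a, max_a, min_b, max_b))

def candidate_pairs(clusters):
    """Cluster index pairs that survive the cheap overlap test."""
    boxes = [bounding_box(c) for c in clusters]
    return [(i, j) for i, j in combinations(range(len(boxes)), 2)
            if boxes_overlap(boxes[i], boxes[j])]

clusters = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.5, 0.5), (2.0, 2.0)],
    [(5.0, 5.0), (6.0, 6.0)],
]
print(candidate_pairs(clusters))  # [(0, 1)]
```

Only the surviving pairs would then go on to the expensive render-based occlusion measurement.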

    I’ll admit that things start to get hairy as the number of clusters goes up. When it gets to a few hundred, we actually use a software renderer to measure overdraw, which does take minutes. Believe it or not, this is much faster than using HW occlusion queries (by a factor of 10 for 1000 clusters).


