"I hear there's rumors on the internets"
- George W. Bush

Custom alpha to coverage
Sunday, January 25, 2009

In DX10.1 you can write a custom sample mask to an SV_Coverage output. This nice little feature hasn't exactly received a lot of media coverage (haha!). Basically it's a uint where each bit tells whether the output will be written to the corresponding sample in the multisample render target. For instance, if you set it to 0x3 the output will be written to samples 0 and 1, and the rest of the samples are left unmodified.
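To make the bit layout concrete, here is a small CPU-side C++ sketch (not shader code) that lists which samples a given mask would touch. The function name `coveredSamples` and its parameters are made up for illustration; they are not part of any D3D API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the indices of the samples that a coverage
// mask would write to, for a render target with `sampleCount` samples.
// Bit i of the mask corresponds to sample i.
std::vector<int> coveredSamples(std::uint32_t mask, int sampleCount) {
    std::vector<int> samples;
    for (int i = 0; i < sampleCount && i < 32; ++i) {
        if (mask & (1u << i)) {
            samples.push_back(i);  // sample i receives the output
        }
    }
    return samples;
}
```

With a 4x multisampled target, `coveredSamples(0x3, 4)` yields samples 0 and 1, which matches the 0x3 case described above; samples 2 and 3 keep whatever they already contained.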

What can you use it for? The most obvious thing is to create a custom alpha-to-coverage. Alpha-to-coverage simply converts the output alpha into a sample mask. If you can provide a better sample mask than the hardware, you'll get better quality. And quite frankly, the hardware implementations of alpha-to-coverage haven't exactly impressed us with their quality. You can often see very obvious and repetitive dither patterns.

So I made a simple test with a pseudo-random value based on screen-space position. The left image is the standard alpha-to-coverage on an HD 3870x2, and on the right my custom alpha-to-coverage.
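The general idea can be sketched on the CPU in C++. This is not the shader from the test: the hash in `pixelNoise`, the function names, and the choice to fill samples from bit 0 upward are all assumptions made for illustration. In an actual DX10.1 pixel shader the resulting mask would be written to the SV_Coverage output.

```cpp
#include <cstdint>

// Hypothetical screen-space hash, returning a pseudo-random value in [0, 1).
// Any reasonable per-pixel hash would do; this one is an assumption.
float pixelNoise(std::uint32_t x, std::uint32_t y) {
    std::uint32_t h = x * 374761393u + y * 668265263u;
    h = (h ^ (h >> 13)) * 1274126177u;
    h ^= h >> 16;
    // Keep 16 bits so the result is exactly representable and stays below 1.
    return static_cast<float>(h >> 16) * (1.0f / 65536.0f);
}

// Converts alpha into a coverage mask for `sampleCount` samples (1..32).
std::uint32_t alphaToCoverage(float alpha, std::uint32_t x, std::uint32_t y,
                              int sampleCount) {
    // Dither: add per-pixel noise before truncating, so fractional coverage
    // averages out to roughly alpha across neighbouring pixels.
    float scaled = alpha * static_cast<float>(sampleCount) + pixelNoise(x, y);
    int covered = static_cast<int>(scaled);

    // Clamp to the valid range of sample counts.
    if (covered < 0) covered = 0;
    if (covered > sampleCount) covered = sampleCount;

    // Set the lowest `covered` bits; avoid shifting by 32, which is undefined.
    return covered >= 32 ? 0xFFFFFFFFu : (1u << covered) - 1u;
}
```

With 4 samples, an alpha of 0.5 always covers two samples, while an alpha of 0.3 covers one sample in most pixels and two in the rest, depending on the noise value at that pixel. A fuller version would probably also vary which samples are picked per pixel rather than always filling from sample 0.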




Vilém Otte
Thursday, January 29, 2009

Nice idea. I haven't tried it yet, but imagine that you have some objects pretty far from your position - wouldn't it look more noisy?
Also, could you tell us how the performance compares with and without custom alpha-to-coverage?

Thursday, January 29, 2009

Alpha-to-coverage already looks kinda noisy at a distance, so it doesn't change much in that respect. It is slightly noisier right now, but that could probably be fixed by adjusting the pseudo-random function. Alternatively one could tighten it like I did in my alpha-to-coverage demo. At least for the minification case that would probably be a good idea.

Performance-wise it's of course a bit worse. In this test case I'm only outputting a textured quad, and standard alpha-to-coverage takes 1.09ms vs 1.99ms for this technique. The alpha-to-coverage case is a single-instruction shader whereas this technique requires 10. It should be mentioned that it's kind of a worst-case shader as well, since all 10 of those instruction slots only use a single scalar, so the ALUs run at 1/5 of the theoretical max on my 3870x2. In a full-blown shader it would probably not add much to the total instruction count.

Friday, January 30, 2009

I like that AMD added SV_Coverage output to GLSL as well (gl_Coverage).

Dr Black Adder
Friday, October 14, 2011

A very nice trick, it's just been used in BF3 for this exact purpose. :-)