"You should never have your best trousers on when you turn out to fight for freedom and truth."
- Henrik Ibsen
More pages: 1 2
EDRAM
Saturday, May 29, 2010 | Permalink

It's funny how sometimes as technology develops what was originally a good can become a bad idea. EDRAM for video cards is such a thing. AFAIK no video card for the PC ever had it, but it's been used occasionally in some game consoles, most recently in the Xbox 360. However, I recall back when bitboys were trying to enter the PC video card industry using EDRAM based designs. Unfortunately they never succeeded and none of their products ever saw the light of the day. Had they been able to produce one at the time though, chances are it would have worked very well. However, how people render the frames today and back in the DX7 era is very different, which makes EDRAM far less ideal today. Back in the DX7 days you'd probably render to the backbuffer almost all the time, and given no shaders and as many ROP as there were pipelines the biggest bottleneck was generally bandwidth to the backbuffer. That's hardly the case anymore. Today shaders are long, bottlenecks are usually ALU or texture fetches, and even if you end up being limited at the backend you're normally not bandwidth bound but ROP bound.

Having worked a fair amount with the Xbox 360 the last 2.5 years I find that EDRAM mostly is standing in the way, and rarely providing any benefit. Unfortunately, you can't just render "normally" to a buffer in memory if you so prefer, nope, you always have to render to the EDRAM. Once you're done with your rendering the results have to be resolved to the video memory. So even if we assume we are never ROP bound and EDRAM gets to shine with its awesome bandwidth, it really would not buy us much. Each render target operation is immediately followed by a resolve operation copying the data to video memory. During this copying phase the GPU is busy just copying rather than rendering. If the rendering was targetting a video memory buffer to begin those writes to memory would be nicely interleaved with the rendering work that the GPU does and no resolve would be necessary, so once the rendering is done all that's needed is to flush whatever data is residing in the backend cache to memory and you're done.

Sadly it's not just that it doesn't really provide so much of a benefit as it might look on paper, but it also alters the rendering model that we are all familiar with and adds a bunch of new restrictions. Because EDRAM is still quite expensive in hardware it's not something we get an awful lot of. The Xbox 360 has 10MB. But if you render to the typical 1280x720 resolution with 2xAA, that's 14MB needed for the color and depth buffer. So this is generally solved by "tiling", which means you render for instance to the top of the screen first, then resolve, and then the bottom, and resolve. The DirectX9 for Xbox helps out a bit here to let you do this stuff quite automatically by entering a tiling section of the rendering, which is then submitted twice to the hardware, or how many times necessary depending on how many tiles are require for your render target configuration. Sounds fine huh? Well, until you want to squeeze in another render target operation somewhere in that stream. Say you want to apply some SSAO. You need the complete depth buffer for opaque stuff and then apply before rendering transparent stuff. SSAO can be quite expensive, so instead of using oodles of samples you probably want to take a few, then blur the result to half-res, and then apply that. Well, to blur you need to switch render target, which breaks the model. In order for everything to work you need to first resolve everything, do your SSAO passes, then copy that back to EDRAM again and enter a new tiling section. This is of course way too costly so nobody bothers doing that kind of stuff, but instead just try to live with the limitations imposed. So one may attempt to just resolving the depth-buffer without switching render target and then apply SSAO in one pass. Unfortunately, not even this is ideal. The problem is that when the top tile enters this code only the top tiles has been rendered, so the depth buffer texture it will use for the effect is incomplete. So when it samples the neighborhood around pixels close to the edge it will sample over the edge and get data from the previous frame. This often results in visible seams when in motion. It's common to copy the backbuffer for refraction effects. In many games on the Xbox 360 you'll see visible seems when traveling on water for this reason.

For the next generation consoles chances are we want 1080p, full HDR, at least 4xMSAA and probably many want additional buffers for deferred rendering or other techniques. I don't think it will be possible to embed enough EDRAM to fit all for many games, so if you're designing a future console now and are thinking of using EDRAM, please don't. Or at least let us render directly to memory. Or only let the EDRAM work as a large cache or something if you really want it.

Name

Comment

Enter the code below



mark
Saturday, July 10, 2010

here an intel report:
http://visual-computing.intel-research.net/publications/mlaa.pdf

simon
Tuesday, July 13, 2010

u can also run anti-aliasing on the CPU with the xbox360, u can use the VMX128 to accelerate filtering and move anti-aliasing fully onto the CPU.

the CPU PPC970/VMX128 is better at doing DSP tasks than the xbox360/PS3 GPU's

Slagh
Monday, July 26, 2010

One of the advantages of EDRAM is when reading and writing the 'same' render target/texture - on 360 it's different ram - you retain the previous state (the last resolve) of your texture in RAM and you can do what you like in EDRAM. On systems without EDRAM you must swap buffers, and possibly begin with a 'restore' of the previous pixels if it's a partial update. Funnily enough this 'restore' is another one of the downsides to EDRAM when updating render target content (when you can't limit it to a sub-rect resolve).

mark
Tuesday, July 27, 2010

EDRAM simply isn't future proof, the xbox720 won't have it.
they actually put it in to resolve USB3 patent issues with NEC.

More pages: 1 2