"In C we had to code our own bugs. In C++ we can inherit them."
More pages: 1 2
GPU vs. CPU
Wednesday, June 23, 2010 | Permalink

Via Rage3D I came across this Nvidia blog post, which I found somewhat amusing. After a brief look at the actual paper, though, I give Intel a bit more credit than the Nvidia spin suggests. Even then, the Intel paper concludes that the previous-generation GPU is 2.5x faster on average.

Anyway, I don't find the GPU vs. CPU war all that interesting, because my prediction is that we will still need both paradigms around. Neither model is going to "win", so I don't think Intel needs to be so defensive, nor do I believe Nvidia's prediction that "the piece of hardware that runs sequential code will shrink to a tiny dot swimming in an ocean of ALUs" (I forget the exact wording, but it was something like that). The reason I don't buy Nvidia's prediction is Amdahl's law. At least in games, there will always be some sort of critical path through the game update code where each step needs input from the previous steps. Just slapping on more cores will not make that part much faster, and switching to the Larrabee model for CPUs is likely to make it slower, even if you gain an order of magnitude more raw throughput.

I believe the model for future CPUs is something like what the PS3 has, with one main CPU and six smaller, throughput-oriented SPUs. Even in the future we will need at least one, but preferably two or three, cores optimized for quickly crunching through sequential code, then a larger number of tiny throughput-oriented cores next to them for parallel but fairly independent tasks, and then the GPU for graphics and a number of other embarrassingly parallel tasks. I don't think the GPU and CPU will meet anytime soon, although with more and more programmable GPUs, and then stuff like Fusion, I could imagine that the GPU and the SPUs might merge at some point. I'm not convinced of that yet, though.
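
To put some rough numbers on that, here's a minimal sketch of Amdahl's law: if a fraction p of the frame parallelizes perfectly across n cores, the best possible speedup is 1 / ((1 - p) + p / n). The 90% figure below is just a made-up example, not a measurement from any real game:

// Minimal sketch of Amdahl's law: speedup = 1 / ((1 - p) + p / n),
// where p is the parallel fraction of the work and n is the core count.
// The 0.9 parallel fraction below is an assumption for illustration only.
#include <cstdio>

static double amdahl(double parallelFraction, int cores)
{
    return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
}

int main()
{
    const int coreCounts[] = { 1, 2, 4, 8, 16, 64, 256 };
    for (int n : coreCounts)
    {
        // Even with 90% of the frame perfectly parallel, the serial 10%
        // caps the speedup at 10x no matter how many cores you add.
        printf("%4d cores: %.2fx speedup\n", n, amdahl(0.9, n));
    }
    return 0;
}

Even with 256 cores the serial 10% caps the whole thing at under 10x, which is exactly why the sequential part still matters.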

Bryan McNett
Wednesday, June 23, 2010

My suspicion is that, as flops/watt continues to improve for many tiny slow cores, there will eventually come a generation of mobile computers in which the sequential CPU no longer earns its keep and is left out. Amdahl's law always applies, but the common belief that popular apps require serial execution is a myth, IMO.

The driving force behind this is flops/watt. So it should happen to mobile computers first, then videogame consoles or servers, and finally to desktops. But the process may take decades, and a majority of developers will resist it.

Kayru
Thursday, June 24, 2010

The console way definitely seems like the way forward: the GPU is just another processor in the system, ideally with shared memory and a very low-level API, like libgcm. But that's probably too much to ask...

Overlord
Friday, June 25, 2010

Personally, I think the sooner x86 CPUs get a few vector processors for each core the better, but then again the reason we still don't have that setup is x86 itself.
I would much rather see them invent a new type of processor, one with multiple types of processor cores: a few general-purpose cores, a bunch of vector cores, and a lot of SPPUs (single-purpose processing units, really tiny cores that can only do a few things).

And when it comes to flops/watt, such a processor, if done right and you only use what's needed, would fit the mobile market pretty well, as you could often just use a smaller core for the same job.

sqrt[-1]
Wednesday, June 30, 2010

Just bring on a huge grid of memristors and each application can configure the chip hardware how it wants!

(I actually have no idea - I was just amazed at some of the predictions in http://www.youtube.com/watch?v=bKGhvKyjgLY "Finding the Missing Memristor".)

kyle
Wednesday, June 30, 2010

re Overlord:
Something like Itanium?
Imo, we are stuck with x86 forever ...

Overlord
Wednesday, June 30, 2010

re kyle:
Not even close. The Itanium doesn't have a many-core architecture (aside from the models with 2-4 cores); what it does have is multiple sub-cores that can issue 3 or so instructions at once, but it can't use them in parallel.
In essence, they tried to make the Itanium faster by making it execute large groups of instructions at once, something the newest Intel processors solve in a simpler way by combining multiple instructions into a single internal operation that is not part of the x86 set.
Either way, this makes the Itanium large and complicated. Perhaps it's good for some things, but for most it's better to keep it simple; I bet you could fit something like 50 vector cores in the space of a single Itanium core, and that has to count for something.

mark
Saturday, July 10, 2010

The advantage of smaller processing units is that you can disable them, which is ideal for a desktop/server device.

The multi-full-core model will cause a slowdown due to a saturated bus, so the future is in a Cell-like platform where everything is managed to get the best overall throughput at the lowest possible cost (including energy cost).

This task management will be completely transparent, like what IBM showed on the Cell (which isn't ideal for gaming, and servers are more important than desktops).
