"There is a theory which states that if ever anybody discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another theory which states that this has already happened."
- Douglas Adams
More pages: 1 2 3
MetaBalls demo updated
Wednesday, October 19, 2005 | Permalink

I recently got myself a new dual-core Athlon64 3800+, so naturally I had to try taking advantage of the extra processing power. Most of my demos are GPU limited rather than CPU limited, but MetaBalls is an exception to that rule, so I added threading to it to improve performance. Not everything could be parallelized though, so the gain is fairly moderate, about 15-20%. Another reason for the moderate increase is that the bottleneck apparently shifted from computation to cache/memory. The gain is larger, about 25%, when running the slower FPU path than when running the 3DNow path.
This CPU also supports SSE3, so naturally I threw in an SSE3 path as well. 3DNow is still a tiny bit faster though.
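
As a rough illustration of the kind of threading described above, here is a minimal sketch of splitting a metaball field evaluation across worker threads by giving each thread its own range of grid slices. This is not the demo's actual code: the `Ball` layout, the function names (`fieldAt`, `fillSlices`, `evaluateField`), the r^2/d^2 field function and the use of `std::thread` are all assumptions made for the example.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical ball layout: center plus squared radius.
struct Ball { float x, y, z, r2; };

// Sum of r^2 / d^2 contributions from all balls at one sample point.
static float fieldAt(const std::vector<Ball>& balls, float x, float y, float z) {
    float sum = 0.0f;
    for (const Ball& b : balls) {
        const float dx = x - b.x, dy = y - b.y, dz = z - b.z;
        sum += b.r2 / (dx * dx + dy * dy + dz * dz + 1e-6f);  // small bias avoids division by zero
    }
    return sum;
}

// Fill slices [z0, z1) of an n*n*n grid that spans [-1, 1] on each axis.
// Each thread writes a disjoint part of 'field', so no locking is needed.
static void fillSlices(const std::vector<Ball>& balls, std::vector<float>& field,
                       int n, int z0, int z1) {
    const float step = 2.0f / static_cast<float>(n - 1);
    for (int z = z0; z < z1; ++z)
        for (int y = 0; y < n; ++y)
            for (int x = 0; x < n; ++x)
                field[(static_cast<std::size_t>(z) * n + y) * n + x] =
                    fieldAt(balls, -1.0f + x * step, -1.0f + y * step, -1.0f + z * step);
}

// Evaluate the whole grid (n >= 2) using 'threadCount' worker threads.
std::vector<float> evaluateField(const std::vector<Ball>& balls, int n, int threadCount) {
    std::vector<float> field(static_cast<std::size_t>(n) * n * n);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        const int z0 = n * t / threadCount;        // first slice for this thread
        const int z1 = n * (t + 1) / threadCount;  // one past its last slice
        workers.emplace_back(fillSlices, std::cref(balls), std::ref(field), n, z0, z1);
    }
    for (std::thread& w : workers) w.join();       // wait until every slice is filled
    return field;
}
```

Calling `evaluateField(balls, 32, 2)` would use two threads, one per core on a dual-core CPU. Only the field evaluation is split here; any stage that has to run on a single thread afterwards limits the overall gain, which fits the moderate speedup reported above.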




Thursday, October 20, 2005

Hey nice! Congratulations on your new CPU! I want one of those myself *drool*

Anyhow, does this make any difference on my "now-outdated-underpar-performing" A64 3400+?


Friday, October 21, 2005

The coolest thing would be to compute them entirely on the GPU (via raytracing).

Looks like this: http://jmb.mine.nu/~kma/x/metaballs_xvid.avi

You know who!
Friday, October 21, 2005

Very interesting piece of code. I get about a 13% speed increase on my 2.8 GHz P4 HT. What's weird is that the FPU path is faster than SSE (117 fps vs. 104 with 1 thread, and 132 vs. 117 with 2 threads). Is the new VS2005 optimizing that much? I recompiled the demo in VS2003 and the results are noticeably slower. 1 thread FPU/SSE: 54/97; 2 threads are even slower: 43/93. Does anybody know why that is?
Btw the old demo ran at 80 fps on FPU and 115 on SSE.

Friday, October 21, 2005

Saturday, October 22, 2005

Thanks for the explanation. I had all optimizations turned off when I recompiled it, so now it runs at 75 fps on FPU and 113 on SSE (1 thread), and 122 on SSE (2 threads). FPU with 2 threads is still slower: 53 fps. Maybe it's just because I only have HT. I need a new CPU
Btw Humus, I wonder how fast the demo runs on your new Athlon64.

Saturday, October 22, 2005

Nice to see that, Humus! It seems that dual-core processors are the future... but they are too expensive for me right now

Will there be more indoor demos at some point? Perhaps radiosity or realtime ambient occlusion could be a nice challenge for you
