"There's a saying that states: 'The road to hell is paved with good intentions'. There's hardly anything that better describes communism than that. All good hopes and dreams of justice and equality have ended in oppression and dictatorship."
- Peter Eriksson, Green Party of Sweden

Shader programming tips #3
Monday, February 9, 2009 | Permalink

Multiplying a vector by a matrix is one of the most common tasks you do in graphics programming. A full matrix multiplication is generally four float4 vector instructions. Depending on whether you have a row major or column major matrix, and whether you multiply the vector from the left or right, the result is either a DP4-DP4-DP4-DP4 or MUL-MAD-MAD-MAD sequence.

In the vast majority of the cases the w component of the vector is 1.0, and in this case you can optimize it down to three instructions. For this to work, declare your matrix as row_major. If you previously was passing a column major matrix you'll need to transpose it before passing it to the shader. You need a matrix that works with mul(vertex, matrix) when declared as row_major. Then you can do something like this to accomplish the transformation in three MAD instructions:
float4 pos;

pos = view_proj[2] * vertex.z + view_proj[3];
pos += view_proj[0] * vertex.x;
pos += view_proj[1] * vertex.y;


It should be mentioned that vs_3_0 has a read port limitation, and since the first line is using two different constant registers, HLSL will put a MOV instruction in there as well. But the hardware can be more flexible (for instance ATI cards are). In vs_4_0 there's no such limitation and HLSL will generate a three instruction sequence.

[ 0 comments ]