Matrix multiplication in graphics APIs is ridiculously confusing. People are often confused about the right order of multiplying their matrices, and about row-major, column-major, pre-multiplication, post-multiplication, row vectors and column vectors, and transposing.
I plan for this to be the last resource you’ll ever need to check.
Why do we use matrices?
If matrices are so confusing, why do we even use them in the first place? Our goal with matrices in graphics is to transform objects in space, or to transform space itself.
If we imagine a cube with some X, Y, and Z coordinates, we can translate that cube along the X axis by 5 units by adding a constant 5:
x' = x + 5
If we wanted to scale the cube by 2, we can multiply by 2:
x' = 2x
These are both very simple forms of transformations. So why go more complicated and use matrices? Well, let’s introduce rotation. Scaling and translating happen along a single axis, while rotation happens within a plane. Rotating a point around the Y axis changes the point’s values on the X and Z axis.
That means that to transform a point in a way that lets us scale, rotate, and translate it in a generic way, we need a formula that looks something like this:
x' = 1x + 7y + 9z + 1
y' = 4x + 1y + 6z + 0
z' = 3x + 8y + 2z + 3
If we assume that all of our equations will always be of the form Ax + By + Cz + D, then we can eliminate all of the “fluff” there and end up with a bag of 12 numbers:
As you might imagine, this “bag of numbers” is called a matrix.
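To make the “bag of numbers” concrete, here’s a small plain-Python sketch (the `M` and `transform` names are mine, not part of any graphics API) that applies those 12 coefficients to a point:

```python
# The "bag of 12 numbers" above, one row of coefficients (A, B, C, D) per output.
M = [
    [1, 7, 9, 1],  # x' = 1x + 7y + 9z + 1
    [4, 1, 6, 0],  # y' = 4x + 1y + 6z + 0
    [3, 8, 2, 3],  # z' = 3x + 8y + 2z + 3
]

def transform(m, x, y, z):
    # Evaluate Ax + By + Cz + D for each row of coefficients.
    return tuple(a * x + b * y + c * z + d for a, b, c, d in m)

print(transform(M, 1, 0, 0))  # (2, 4, 6)
```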
Combining Transformations
One interesting fact that isn’t at all obvious is that transforms compose differently depending on the order. Let’s go back to our simple example of scaling and translation using simple equations:
x' = 2x // Scale x by 2
x' = x + 5 // Translate x by 5
If we want to combine the two, the answer changes depending on whether we scale before translating or scale after translating. Scaling before translating means that we substitute the scaling equation into the translation one:
x' = 2x + 5 // Scaling by 2, then translating by 5
While scaling after translating means that we substitute the translation equation into the scaling one:
x' = 2(x + 5) // Translating by 5, then scaling by 2
And we can further simplify that “translating, then scaling” equation by expanding terms to help see that this result is indeed very different.
x' = 2x + 10 // Translating by 5, then scaling by 2
As long as we reduce our formula to a standard polynomial expression like Ax + B, we can take those coefficients and put them directly in a matrix. In fact, doing this “substitution and simplify” will give us the exact same coefficients as if we multiplied two matrices together.
Matrix Fact #1: The standard algorithm for matrix multiplication is nothing more than a “compressed” form of writing two systems of linear equations, substituting one into the other, and simplifying all the way down.
Matrix multiplication being non-commutative is just a reflection of this fact: the order in which you substitute equations into each other matters.
If it helps clear things up, you can think of matrix multiplication as something more along the lines of function application, rather than the multiplication of two scalars. If we imagine “translate by 5” as being function F and our “scale by 2” as some function G, then as we’ve just shown, F(G(x)) is not the same thing as G(F(x)).
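As a quick sanity check, here’s the same non-commutativity in plain Python, with the two transforms written as the functions F and G from above (a sketch; this code is mine, not from any library):

```python
F = lambda x: x + 5  # "translate by 5"
G = lambda x: 2 * x  # "scale by 2"

# Scale first, then translate: 2x + 5
print(F(G(3)))  # 11
# Translate first, then scale: 2(x + 5) = 2x + 10
print(G(F(3)))  # 16
```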
With a full set of x, y, and z equations, you can still write it out, substitute, and simplify; however, it quickly becomes tedious for even just a handful of equations:
x' = 1x + 7y + 9z + 1
y' = 4x + 1y + 6z + 0
z' = 3x + 8y + 2z + 3
x'' = 2x' + 1y' + 3z' + 10
y'' = 1x' + 9y' + 4z' + 12
z'' = 3x' + 9y' + 6z' + 3
x'' = 2(1x + 7y + 9z + 1) + 1(4x + 1y + 6z + 0) + 3(3x + 8y + 2z + 3) + 10
y'' = 1(1x + 7y + 9z + 1) + 9(4x + 1y + 6z + 0) + 4(3x + 8y + 2z + 3) + 12
z'' = 3(1x + 7y + 9z + 1) + 9(4x + 1y + 6z + 0) + 6(3x + 8y + 2z + 3) + 3
If you took this and simplified it all the way down, you’d end up with the exact same set of coefficients that you would have gotten had you done matrix multiplication instead. But as you can imagine, as equations get more and more complex, and as we stack more and more transforms on top of each other, matrix multiplication is a lot easier than writing equations out and substituting.
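If you’d like to convince yourself numerically, here’s a plain-Python sketch (helper names are mine) that composes the two systems above as matrices, with a (0, 0, 0, 1) row appended so translation fits in, and checks the composite against applying the systems one after another:

```python
def matmul(a, b):
    # Across times down: result[i][j] is row i of a dotted with column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(m, x, y, z):
    # Treat the point as homogeneous (x, y, z, 1); return the first 3 outputs.
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m[:3])

# The two systems above, each with a (0, 0, 0, 1) row appended.
first  = [[1, 7, 9, 1], [4, 1, 6, 0], [3, 8, 2, 3], [0, 0, 0, 1]]
second = [[2, 1, 3, 10], [1, 9, 4, 12], [3, 9, 6, 3], [0, 0, 0, 1]]

# "Substitute first into second, then simplify" == multiply the matrices.
composite = matmul(second, first)
p = (2, -1, 3)
assert apply(composite, *p) == apply(second, *apply(first, *p))
```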
If you were working with a set of transformations for an extended period of time, you would have invented matrices too!
Matrix/Matrix Multiplication
Let’s cover the easy case first, from a mathematical perspective, before we start talking about code and memory layouts. It’s well-established that matrix/matrix multiplication is not commutative; that is, matrix A times matrix B does not give the same result as matrix B times matrix A. However, matrix/matrix multiplication has exactly one formula, and it’s fairly easy to remember: it’s always across times down. What do I mean by that?
Let’s multiply two 4×4 matrices together. On the left is A, and on the right is B. To get the result at any individual slot in the result, we take elements going across on the left, multiply them together with elements going down on the right, and then add those all up. In other words, each element of the result is a dot product of a row from matrix A with a column from matrix B.
Whether you’re using Vulkan or DirectX, GLSL or HLSL, row-major or column-major, matrix math always respects this formula. Across times down. If there’s one thing to remember from this post, it’s that. Everything else is a corollary of that simple fact.
Matrix Fact #2: All matrices are multiplied as across times down no matter what shading language or graphics API you use.
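Here’s across times down written out as a tiny plain-Python routine (a sketch of the standard algorithm, not any particular API):

```python
def matmul(a, b):
    # "Across times down": result[i][j] is the dot product of row i of a
    # (going across) with column j of b (going down).
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
print(matmul(B, A))  # [[23, 34], [31, 46]] -- not commutative!
```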
What if the dimensions are different?
Let’s try to multiply a 2×4 matrix against a 4×3 matrix. Note that in standard mathematical lingo, a 2×4 matrix is a matrix with 2 rows and 4 columns.
Matrix Fact #3: The convention in mathematics is that an MxN matrix is made up of M rows and N columns.
Nonetheless, we can multiply a 2×4 matrix against a 4×3 matrix, resulting in a 2×3 matrix. Across times down.
However, if we try to multiply a 4×2 matrix against a 3×4 matrix, we run into an issue. We cannot do across times down without running off the end of one of the matrices.
This is an error. It is only possible to multiply matrices together if their “inner” dimensions match: when multiplying matrices A and B, the number of columns in A must match the number of rows in B.
Matrix/Vector Multiplication
Let’s stretch the limit here. What happens when we try to multiply a 4×4 matrix against a 4×1 matrix?
Across times down works just fine! Shading languages like HLSL and GLSL both use this formula when multiplying a matrix against a vector — they turn the vector on its side, becoming a “column vector”, or a 4×1 matrix, so that the two can multiply together.
What happens when we multiply a vector against a matrix? You might expect this to be an error, as it would be illegal to multiply a 4×1 matrix against a 4×4 matrix. However, shading languages like HLSL and GLSL do something somewhat unexpected here: when multiplying a vector times a matrix, they will arrange the same exact vector as a 1×4 “row vector” matrix!
Note that vectors by themselves don’t have an inherent direction; it’s only by “arranging” them into 1xN or Nx1 matrices that they become “column vectors” or “row vectors”.
Matrix Fact #4:
- Multiplying a matrix M against a vector v is equivalent to multiplying M against a new Nx1 matrix made up of v.
- Multiplying a vector v against a matrix M is equivalent to multiplying a new 1xN matrix made up of v, against our matrix M.
- An Nx1 matrix is called a “column vector” since it’s tall and skinny, while a 1xN matrix is called a “row vector” since it’s short and wide.
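A small plain-Python sketch of Matrix Fact #4, reusing the across-times-down routine (names are mine):

```python
def matmul(a, b):
    # Across times down.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

M = [[1, 2, 0],
     [0, 1, 0],
     [0, 0, 1]]
v = [3, 4, 5]

# M * v: treat v as a 3x1 column vector.
col = matmul(M, [[x] for x in v])
print(col)  # [[11], [4], [5]]

# v * M: treat v as a 1x3 row vector. Note the different result!
row = matmul([v], M)
print(row)  # [[3, 10, 5]]
```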
Multiplying a matrix by a vector or a vector by a matrix is sometimes called pre-multiplication or post-multiplication, but these are unclear phrases; sometimes pre-multiplication refers to the vector being on the left-hand side, and sometimes it refers to the matrix being on the left-hand side.
For clarity, whenever the vector is on the left-hand side of a matrix/vector multiplication, I prefer to call it “row-vector multiplication”, and whenever it’s on the right-hand side, I prefer to call it “column-vector multiplication”.
Useful Identities
We can take any matrix and transpose it. This effectively swaps rows and columns; what was a row is now a column, and vice versa.
Swapping a matrix like this is helpful for any number of reasons, but an important fact to remember is that multiplying a matrix A against a matrix B is equivalent to transposing both matrices, swapping the multiplication order, and then transposing the result. Note that transposing a matrix swaps its dimensions, so the transpose of e.g. a 3×4 matrix is a 4×3 matrix. Combined with the across times down rule, it should be easy to verify for yourself that the calculations are the same.
Matrix Fact #5: Given matrices A and B, `A * B` is the same as `transpose(transpose(B) * transpose(A))`.
Since vectors can become either row-vectors or column-vectors based on their usage, this means that they “automatically transpose” themselves in shading languages. So effectively, `A * v` simplifies down to be the same as `v * transpose(A)`, and `v * A` can be written as `transpose(A) * v`.
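Both identities are easy to spot-check in plain Python (a sketch with my own helpers):

```python
def matmul(a, b):
    # Across times down.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    # Rows become columns and vice versa.
    return [list(col) for col in zip(*m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
# Matrix Fact #5: A * B == transpose(transpose(B) * transpose(A))
assert matmul(A, B) == transpose(matmul(transpose(B), transpose(A)))

# The "automatic transpose" of vectors: A * v as a column vector holds the
# same numbers as v * transpose(A) as a row vector.
v = [9, 10]
as_column = matmul(A, [[x] for x in v])  # A * v
as_row = matmul([v], transpose(A))       # v * transpose(A)
assert [x for (x,) in as_column] == as_row[0]
```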
Row-Major and Column-Major
So far we’ve only talked about things from the mathematical perspective. Let’s talk about the computer side of things. A 3×4 matrix consists of 12 numbers, which we’ve stored in an array. How can we unpack this flat array of numbers into a matrix? We have two options: row-major matrix packing, and column-major matrix packing.
Row-major matrix packing is probably what seems the most obvious to a programmer: we unpack each number first from left to right, then from top to bottom. One way to imagine this is that we’ve divided up the 12 numbers from our array into 3 “row vectors” stacked on top of each other. This is why it’s called “row-major”.
Conversely, column-major matrix packing runs first top to bottom, then from left to right. We can visualize this packing order as being equivalent to four consecutive “column vectors” packed left to right, giving us the name “column-major”. Note that both row-major and column-major matrix packing still give us matrices with the same shape and dimensions; it only changes how the array of data is interpreted.
Matrix Fact #6: Row-major and column-major are not properties of matrices by themselves and do not affect matrix multiplication; that is always across times down. Row-major and column-major are simply about the memory storage order of a matrix when loading from buffers and storing back to buffers.
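A quick plain-Python sketch of the two packing orders, unpacking the same flat 12-number buffer into a mathematical 3×4 matrix both ways (names are mine):

```python
data = list(range(12))  # a flat buffer of 12 numbers

# Row-major: consecutive numbers fill a row at a time (3 rows of 4).
row_major = [data[r * 4:(r + 1) * 4] for r in range(3)]
# Column-major: consecutive numbers fill a column at a time (4 columns of 3).
col_major = [[data[c * 3 + r] for c in range(4)] for r in range(3)]

# Same 3x4 shape either way; only the interpretation of the buffer differs.
print(row_major)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(col_major)  # [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]
```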
Shading languages can control the memory packing of matrices through different means.
In HLSL, there are two different ways to affect the packing order of matrices:
- You can use the `pack_matrix` pragma to affect packing.
- You can use the `/Zpc` or `/Zpr` command line arguments to set the default matrix packing for the file.
In GLSL, you can use the `row_major` and `column_major` layout options on structures or individual fields.
In both HLSL and GLSL, column-major packing is the default, if no other overrides are specified.
Also note that since transposing a matrix swaps the rows and columns, it’s another way to handle the packing differences; this is what the `transpose` parameter to the `glUniformMatrixNxMv` functions does. Calling `glUniformMatrix4x3fv` with the transpose parameter set to `false` will treat your 12 numbers as 4 column-vectors, while with the transpose parameter set to `true` it will treat your 12 numbers as 3 row-vectors.
Matrix Fact #7: Transposing a matrix is effectively equivalent to changing its packing order (I’m being a bit wishy-washy here).
Shading Language Differences
While GLSL and HLSL are both similar in how they work with matrices, there are still a few things we need to cover.
The first is matrix type naming. To declare a 3×4 matrix, one with 3 rows and 4 columns:
- In HLSL, the type is `float3x4`. HLSL chooses to name it as floatRxC, as is traditional.
- In GLSL, the type is `mat4x3`. GLSL unfortunately chooses to name it as matCxR, which clashes with existing mathematical practice.
Next up is the syntax for matrix multiplication and elementwise multiplication, sometimes called the Hadamard product. The Hadamard product only exists for two matrices of the same dimension, and is just the individual elements in each “slot” multiplied together.
- HLSL uses `mul(A, B)` for standard matrix multiplication, and `A * B` for the Hadamard product.
- GLSL uses `A * B` for standard matrix multiplication, and `matrixCompMult(A, B)` for the Hadamard product.
When constructing matrices inside a shader, HLSL and GLSL act differently.
- HLSL’s matrix constructor works by being supplied full row vectors at a time. For a 3×4 matrix, if you pass 12 numbers, it will first construct 3 row vectors out of each consecutive set of 4 numbers, and then stack them on top of each other. You can also pass 3 `float4` row vectors directly. This happens regardless of whatever `pack_matrix` pragma is set, or any compile arguments.
- GLSL’s matrix constructor works by being supplied full column vectors at a time. For a 3×4 matrix, if you pass 12 numbers, it will first construct 4 column vectors out of each consecutive set of 3 numbers, and then pack them side by side from left to right. You can also pass 4 `vec3` column vectors directly. This happens regardless of whether the matrix is tagged with a `row_major` or `column_major` layout.
When indexing into matrices inside a shader, HLSL and GLSL act differently.
- In HLSL, indexing into a matrix with `matrix[0][2]` will return the value in the 0th row and 2nd column. `matrix[3]` will return the 3rd row as a vector. As above, this happens independently of any pack settings and command line arguments.
- In GLSL, indexing into a matrix with `matrix[0][2]` will return the value in the 0th column and 2nd row. `matrix[3]` will return the 3rd column as a vector. As above, this happens independently of any layout settings on the matrix.
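As a hypothetical plain-Python sketch (not real shader code; the names are mine), here’s the index-order difference over the same flat column-major buffer:

```python
# The same flat, column-major buffer of a 4x4 matrix, indexed with each
# language's convention.
data = list(range(16))
N = 4  # a 4x4 matrix

def hlsl_style(row, col):   # HLSL: matrix[row][col], row index first
    return data[col * N + row]

def glsl_style(col, row):   # GLSL: matrix[col][row], column index first
    return data[col * N + row]

# Both reach the same element; only the index order differs.
assert hlsl_style(1, 2) == glsl_style(2, 1) == data[2 * N + 1]
```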
These three differences, plus other more historical artifacts, have led many to believe that HLSL (and DirectX) are row-major, while GLSL (and OpenGL) are column-major. While they have different preferences in the corner cases of indexing and constructors, matrix multiplication works the exact same way in HLSL and GLSL, and both support either mode for packing and unpacking.
Matrix Fact #8: There are some subtle differences that affect the behavior of matrix types between HLSL and GLSL, but they don’t change how multiplication works.
Space Transformations and Associativity
Our goal with matrices in computer graphics is to transform objects between different spaces. If we have a point in world space, and we would like to have a version of that point in view space, we transform the point by what’s commonly called a view matrix. I’m going to use column-vector multiplication here for example’s sake:
vec3 view_space_P = view_matrix * world_space_P;
The matrix is in charge of taking our point and transforming it into a new space. In computer graphics, we often wrangle many spaces, and want to build large transformation chains. For instance, a common viewing transformation looks something like this:
vec3 clip_space_P = projection_matrix * view_matrix * model_matrix * model_space_P;
Here, we are chaining together a set of transformations to our point P: the model matrix transforms from model-space to world-space, the view matrix transforms from world-space to view-space, and the projection matrix transforms from view-space to clip-space.
(And for anyone following along: yes, putting a `vec3` into a `projection_matrix` is not what a standard transformation looks like. I simply didn’t wish to distract this discussion with homogeneous coordinates and `vec4` for the purposes of this article.)
One interesting fact is that while matrix multiplication is not commutative, it is associative: that is, `(A * B) * C` is the same as `A * (B * C)`. That means that we can think of the transformation in two different, identical ways:

- We transform our vector `model_space_P` by the `model_matrix` to get a new vector in world-space, then transform that vector by the `view_matrix` to get a new vector in view-space, then transform that vector by the `projection_matrix` to get a new vector in clip-space. That is, `projection_matrix * (view_matrix * (model_matrix * model_space_P))`.
- We multiply the matrices together: `projection_matrix * view_matrix` results in a new matrix that transforms from world-space directly to clip-space, and `projection_matrix * view_matrix * model_matrix` results in a new matrix that transforms from model-space directly to clip-space. That is, `((projection_matrix * view_matrix) * model_matrix) * model_space_P`.
Another interesting fact is that when multiplying matrices together, we never grow the size of the matrix: multiplying two 4×4 matrices always results in a new 4×4 matrix. This means that no matter how many space transforms we end up wanting to do, whether we want to transform between 3 spaces or between 500 spaces, we can always collapse that space transformation into a single 4×4 matrix.
But note that this new matrix now depends on which order you’re intending to do the resulting multiplication! That is, a new matrix `M = projection_matrix * view_matrix * model_matrix` needs to be used as `M * v`. Your composition order needs to match your usage order.
Matrix Fact #9: Multiplying two matrices together builds a new matrix that combines the relevant space transformations. As long as you keep the order consistent, you can combine as many matrices as you want together.
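Here’s associativity spot-checked in plain Python, with toy 2×2 stand-ins for the three matrices (a sketch; the names and values are mine):

```python
def matmul(a, b):
    # Across times down.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Toy 2x2 stand-ins for the projection/view/model matrices.
projection = [[1, 2], [0, 1]]
view = [[3, 0], [1, 1]]
model = [[1, 1], [2, 0]]
v = [[5], [7]]  # a column vector

# One transform at a time: P * (V * (M * v))
one_at_a_time = matmul(projection, matmul(view, matmul(model, v)))
# Collapse the chain into a single matrix first: ((P * V) * M) * v
collapsed = matmul(matmul(matmul(projection, view), model), v)
assert one_at_a_time == collapsed
```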
Matrix Multiplication Ordering in Practice
Enough theory. Let’s talk some practice. Like above, we have a standard transformation sequence, and we would like to transform a given position vector P. We now have two choices:
- Column-vector multiplication, where P is on the right-hand side:

vec3 clip_space_P = projection_matrix * view_matrix * model_matrix * model_space_P;

- Row-vector multiplication, where P is on the left-hand side:

vec3 clip_space_P = model_space_P * model_matrix * view_matrix * projection_matrix;
Now, remember that `A * v` is the same as `v * transpose(A)`, because v changes whether it is a row-vector or column-vector based on usage. This means that for these two to represent the same calculation, the matrices must be transposed between the top and bottom lines. And since swapping the packing order is roughly equivalent to transposing, we can make the top and bottom lines compute the same thing by changing the packing order.
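You can spot-check this in plain Python: the column-vector chain and the row-vector chain with every matrix transposed produce the same numbers (toy 2×2 stand-ins; the names and values are mine):

```python
def matmul(a, b):
    # Across times down.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

P = [[1, 2], [0, 1]]  # toy stand-ins for projection/view/model
V = [[3, 0], [1, 1]]
M = [[1, 1], [2, 0]]
vec = [4, 5]

# Column-vector form: P * V * M * v, with v as a column.
column_form = matmul(P, matmul(V, matmul(M, [[x] for x in vec])))
# Row-vector form, every matrix transposed: v * Mt * Vt * Pt, with v as a row.
row_form = matmul(matmul(matmul([vec], transpose(M)), transpose(V)), transpose(P))
assert [x for (x,) in column_form] == row_form[0]
```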
This is where the other half of the confusion comes from: historical convention is that DirectX and HLSL codebases tend to use row-vector multiplication, while OpenGL and GLSL codebases tend to use column-vector multiplication. This is sometimes expressed online as “DirectX is row-major” and “OpenGL is column-major”; however, there is nothing inherent about this in the graphics APIs or shading languages, just convention set through years of inertia in sample code and existing codebases.
In practice, the combination of column-vector/row-vector multiplication, column-major/row-major packing, and the existence of `transpose` means that you can often make two mistakes that cancel each other out, or just fiddle around with flipping multiplication order and inserting transposes until things work.
However, there are some ways to make your life easier. When dealing with matrices, try to consider the space your data starts in and the space transformations you wish to make. The general rule is that when using column-vector multiplication, with the vector on the right-hand side, we want a series of space-from-space matrices where the starting space is on the right-hand side, and the resulting space is on the left-hand side. Some even prefer to name their matrices this way.
With P on the right-hand side (column-vector multiplication), you want to phrase it as a chain of “A-from-B” transformations:
vec3 clip_space_P = clip_from_view_space * view_from_world_space * world_from_model_space * model_space_P;
And with P on the left-hand side (row-vector multiplication), you want to phrase it as a chain of “B-to-A” transformations:
vec3 clip_space_P = model_space_P * model_to_world_space * world_to_view_space * view_to_clip_space;
As long as you know your spaces, and pick a convention and stick to it, you should be good to go.
Inverses
OK, we’ve figured out how to transform space one way, but what if we want to go backwards? What if we have a point in world-space, and we want it in model-space? That’s what an inverse matrix is for. If we have a matrix that takes us from model-space to world-space, its inverse will take us from world-space back to model-space.
When using an inverse matrix, you should still multiply in the same order, but the space transformations will be backwards.
vec3 world_space_P = world_from_model_space * model_space_P;
vec3 model_space_P = inverse(world_from_model_space) * world_space_P;
Matrix Fact #10: The inverse matrix applies the opposite space transform, but doesn’t change the multiplication order at all.
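A plain-Python sketch of Matrix Fact #10, using a pure translation so the inverse can be written by hand (names are mine):

```python
def matmul(a, b):
    # Across times down.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translate(tx, ty, tz):
    # A 4x4 translation matrix in homogeneous coordinates.
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

world_from_model = translate(5, 0, 0)
model_from_world = translate(-5, 0, 0)  # the inverse of a pure translation

model_space_P = [[1], [2], [3], [1]]  # homogeneous column vector
world_space_P = matmul(world_from_model, model_space_P)
assert world_space_P == [[6], [2], [3], [1]]
# Same multiplication order, opposite space transform:
assert matmul(model_from_world, world_space_P) == model_space_P
```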
Which one should I use?
Between column-vector and row-vector multiplication, and column-major and row-major packing, we find ourselves with four choices and no obvious preference. At some level, this is a choice similar to spaces versus tabs, or picking your favorite up vector. That said, my preference is column-vector multiplication and row-major packing, and my rationale is as follows:

- Column-vector multiplication, the convention with P being a column-vector on the right-hand side, is the more traditional convention that you will see spelled out in papers from mathematics. I prefer it for this reason; if you only have space in your brain for one convention, I think it simplifies the load to only consider this one. That said, given my experience working in the games industry, it is certainly the less standard convention here. Additionally, it maps more naturally to the function-application mental model: it’s easier to imagine `A * B * C * v` as `A(B(C(v)))`.
- Row-major packing I prefer because it allows you to pack affine transform matrices more efficiently: for our model and view matrices, the last row of the matrix will be (0, 0, 0, 1). By removing this row, we can fit the remaining 3 rows into three `float4`s, which are a natural thing for a GPU to pack. For instance, with GLSL’s (admittedly outdated) std140 packing mode, we save 16 bytes of storage packing a 3×4 matrix as row-major over packing a 3×4 matrix as column-major.
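For the curious, here’s the byte counting behind that claim, assuming std140’s rule that vec3 and vec4 array elements each occupy a 16-byte slot (a back-of-the-envelope sketch, not an API query):

```python
SLOT = 16  # std140: vec3 and vec4 array elements each occupy a 16-byte slot

row_major_3x4 = 3 * SLOT  # three float4 rows
col_major_3x4 = 4 * SLOT  # four vec3 columns, each padded to a full slot
print(col_major_3x4 - row_major_3x4)  # 16 bytes saved per matrix
```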
However, more than anything, please try to be consistent, and please document your conventions somewhere. Nothing makes a graphics programmer happier than having your matrix conventions consistently obeyed and clearly documented.
Quick Reference Table
|  | HLSL | GLSL | Notes |
| --- | --- | --- | --- |
| Multiplication Order | Across times down | Across times down |  |
| Matrix Type Name | `floatRxC` | `matCxR` |  |
| Matrix/Matrix Multiplication | `mul(A, B)` | `A * B` |  |
| Matrix/Vector Multiplication | `mul(A, v)` | `A * v` | Treats v as a column vector. |
| Vector/Matrix Multiplication | `mul(v, A)` | `v * A` | Treats v as a row vector. |
| Hadamard Multiplication | `A * B` | `matrixCompMult(A, B)` | Only works for two matrices of the exact same size. |
| Default Matrix Packing | Column-major packing | Column-major packing |  |
| Changing Matrix Packing | `#pragma pack_matrix`, or the `/Zpc` and `/Zpr` command line arguments | `layout(row_major)` or `layout(column_major)` |  |
| Constructors | `float2x2(row0, row1)` | `mat2x2(col0, col1)` | The element-wise constructor acts like you took consecutive numbers and “bunched” them up into vectors, then called the vector constructor. |
| Indexing | `m[row_index][col_index]` | `m[col_index][row_index]` | This is sometimes called “column-major” or “row-major”, but that is a misconception. Majority is just about the packing order; it does not change the indices. |
> Matrix Fact #4: Given matrices A and B, A * B is the same as transpose(B) * transpose(A).
It’s the transpose of transpose(B) * transpose(A), no?
Thank you for the catch. I was a little bit less than rigorous here, and have updated the post to correct my mistake.
This article is fantastic, thank you so much!
I was actually floating the idea of writing something similar, but with an additional twist: since changing either vector orientation or matrix storage order is effectively equivalent to first performing a transpose on the matrix, the four possible combinations of these properties actually collapse down into two. Either you match them (e.g. row vectors and row-major storage) or you mismatch (e.g. column vectors and row-major storage), or in other words either your axis and translation vectors inside the matrix are contiguous in memory or they are not.
I have not been able to fully explore this idea, so I don’t know whether it falls apart at some point, but so far it has helped me understand some cases (especially when in existing code different matrix definitions are just memcpy’d into each other and somehow work, following the match/mismatch model helps me analyze without having to consider 4 different cases). Just some food for thought if you’re interested.
Great post! I wrote one in a similar vein a couple of years ago to get all this out of my head too 😅 It doesn’t cover as much of the mathematical side of things but might be interesting for reference – https://tomhultonharrop.com/mathematics/matrix/2022/12/26/columnrowmajor.html