What gives?
I remember being very frustrated when trying to learn about matrix math in 3D graphics software.
I wondered why matrix multiplication seemed "backwards" in certain programs.
You read an OpenGL or Blender tutorial that says "It's easy! Think of matrix multiplications as applying transformations from right-to-left!"
Then you see a DirectX or Maya tutorial that says something contradictory: "It's easy! Think of matrix multiplications as applying transformations from left-to-right!"
Here's why
There are different conventions for representing transformation matrices. The authors of various 3D packages decide to use one convention over another. It really just boils down to that.
I've come up with two core ideas that explain why matrix math appears different and confusing:
- Element order: Row-major vs Column-major order
- Transform style: Pre-multiply row-vector ($y=xA$) vs post-multiply column-vector ($y=Ax$)
Element order: Row-major vs Column-major order
From the Wikipedia article: https://en.wikipedia.org/wiki/Row-_and_column-major_order
Row-major and column-major order answer the question: "How do we describe a matrix as a flat list of numbers?"
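Given the 4x4 matrix:
$$A = \begin{bmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 10 \\ 0 & 0 & 1 & 15 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
We can write it as row-major: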
float A[16] = {1,0,0,5, 0,1,0,10, 0,0,1,15, 0,0,0,1};
Or as column-major:
float A[16] = {1,0,0,0, 0,1,0,0, 0,0,1,0, 5,10,15,1};
When we talk about row-major and column-major order, it can be in the context of the memory layout of the particular matrix data structure or how it's indexed. Alternatively, it can be how it's initialized from code. I find the latter more useful in most contexts.
Note that element ordering virtually never describes a visual representation of a matrix. If you're pretty-printing a matrix or you see one in a user interface and it's arranged in rows and columns, then they're rows and columns!
With row-major or column-major order, the math doesn't change. It only affects the way the matrix is represented as a flat list.
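As a minimal sketch of the idea, the snippet below flattens the 4x4 matrix from this section both ways and rebuilds it again. The helper names (`to_row_major`, `from_column_major`, and so on) are made up for illustration and don't belong to any particular library.

```python
# The 4x4 matrix from this section, written as a list of rows.
M = [
    [1, 0, 0, 5],
    [0, 1, 0, 10],
    [0, 0, 1, 15],
    [0, 0, 0, 1],
]

def to_row_major(m):
    """Walk each row left to right, starting with the top row."""
    return [m[r][c] for r in range(4) for c in range(4)]

def to_column_major(m):
    """Walk each column top to bottom, starting with the leftmost column."""
    return [m[r][c] for c in range(4) for r in range(4)]

def from_row_major(flat):
    """Rebuild the list of rows from a row-major flat list."""
    return [[flat[r * 4 + c] for c in range(4)] for r in range(4)]

def from_column_major(flat):
    """Rebuild the list of rows from a column-major flat list."""
    return [[flat[c * 4 + r] for c in range(4)] for r in range(4)]

row_major = to_row_major(M)
column_major = to_column_major(M)

print(row_major)     # the translation 5, 10, 15 is spread through the list
print(column_major)  # the translation 5, 10, 15 sits together near the end
```

Both flat lists decode back to the same matrix, which is the sense in which the math doesn't change.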
Transform style: Pre-multiply row-vector ($y=xA$) vs post-multiply column-vector ($y=Ax$)
We use the notation $y=xA$ and $y=Ax$. $y$ describes an output vector, $x$ describes an input vector, and $A$ describes a transformation matrix. Please do not confuse $x$ and $y$ with the x and y axes.
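When we say $y=xA$, the operation looks like this when working with a 4x4 matrix and $\mathbb{R}^3$ vectors (the vector is extended with a homogeneous coordinate of 1):
$$\begin{bmatrix} y_1 & y_2 & y_3 & 1 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} & 0 \\ a_{21} & a_{22} & a_{23} & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ t_1 & t_2 & t_3 & 1 \end{bmatrix}$$
When we say $y=Ax$, the operation looks like this:
$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{21} & a_{31} & t_1 \\ a_{12} & a_{22} & a_{32} & t_2 \\ a_{13} & a_{23} & a_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ 1 \end{bmatrix}$$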
Both of these operations yield equivalent vectors. The only difference is the first is a row vector, and the second is a column vector.
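To make the equivalence concrete, here is a small sketch in plain Python. The helpers `vec_mat`, `mat_vec` and `transpose` are hypothetical names written for this example, and the translation by (5, 10, 15) is borrowed from the element-order section.

```python
def vec_mat(x, A):
    """y = xA: treat x as a row vector on the left of A."""
    return [sum(x[k] * A[k][j] for k in range(4)) for j in range(4)]

def mat_vec(A, x):
    """y = Ax: treat x as a column vector on the right of A."""
    return [sum(A[i][k] * x[k] for k in range(4)) for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

# Translation by (5, 10, 15), built for the y = Ax style:
# the translation lives in the last column.
A_col = [
    [1, 0, 0, 5],
    [0, 1, 0, 10],
    [0, 0, 1, 15],
    [0, 0, 0, 1],
]
# The y = xA style uses the transposed matrix:
# the translation lives in the last row.
A_row = transpose(A_col)

x = [1, 2, 3, 1]  # the point (1, 2, 3) with a homogeneous 1 appended

y_col = mat_vec(A_col, x)
y_row = vec_mat(x, A_row)
print(y_col, y_row)  # same four numbers either way
```

The two results hold the same numbers; only the matrix handed to each style differs, by a transpose.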
Pros of $y=Ax$:
- The most common style in mathematics and math textbooks.
- Order is the same as function composition. If we use functions $T$ and $U$ to describe transformations, then $T \circ U$ corresponds to the product $(T_{matrix})(U_{matrix})$.
Pros of $y=xA$:
- Might be more computationally performant if also using row-major ordering.
- Order is left-to-right. You read transformations in the order they are applied, from local to world.
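The reading order can be sketched with a rotation followed by a translation. Everything here is illustrative: the helper names are invented, the rotation is a 90-degree turn about Z (chosen so the numbers stay whole), and the translation is (10, 0, 0).

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

def mat_vec(A, x):  # y = Ax
    return [sum(A[i][k] * x[k] for k in range(4)) for i in range(4)]

def vec_mat(x, A):  # y = xA
    return [sum(x[k] * A[k][j] for k in range(4)) for j in range(4)]

# Matrices built for the y = Ax style.
R_col = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # rotate 90 deg about Z
T_col = [[1, 0, 0, 10], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # translate by (10, 0, 0)

# The same transforms built for the y = xA style are the transposes.
R_row, T_row = transpose(R_col), transpose(T_col)

p = [1, 0, 0, 1]  # the point (1, 0, 0)

# "Rotate, then translate" in each style:
y_col = mat_vec(matmul(T_col, R_col), p)  # y = T R x, read right-to-left
y_row = vec_mat(p, matmul(R_row, T_row))  # y = x R T, read left-to-right

# Swapping the factors gives "translate, then rotate" instead.
y_swapped = mat_vec(matmul(R_col, T_col), p)

print(y_col, y_row, y_swapped)
```

In both styles the rotation sits next to the vector because it is applied first; the styles differ only in which side the vector is on.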
How do you test how a program does matrix math?
We need to figure out both element order and transform style.
Mini-quiz: We perform a translation of (10,20,30). What's the element order of the following matrix?
float A[16] = {1,0,0,10, 0,1,0,20, 0,0,1,30, 0,0,0,1};
a) Row-major, b) Column-major, or c) Not enough information
Answer below:
...
...
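If you guessed c, then you're correct. There is not enough information. This matrix could be either row-major or column-major depending on the transform style:
Row-major and post-multiply column-vector ($y=Ax$):
$$\begin{bmatrix} 1 & 0 & 0 & 10 \\ 0 & 1 & 0 & 20 \\ 0 & 0 & 1 & 30 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Column-major and pre-multiply row-vector ($y=xA$):
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 10 & 20 & 30 & 1 \end{bmatrix}$$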
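The quiz answer can be checked with a rough sketch that tries all four combinations of element order and transform style on the same flat list. The functions `unflatten` and `transform` are hypothetical helpers, not any program's API.

```python
flat = [1, 0, 0, 10, 0, 1, 0, 20, 0, 0, 1, 30, 0, 0, 0, 1]

def unflatten(flat, order):
    """Interpret 16 numbers as a 4x4 list of rows under the given element order."""
    if order == "row-major":
        return [[flat[r * 4 + c] for c in range(4)] for r in range(4)]
    return [[flat[c * 4 + r] for c in range(4)] for r in range(4)]

def transform(A, point, style):
    """Apply A to a 3D point under the given transform style."""
    x = list(point) + [1]  # append the homogeneous coordinate
    if style == "y=Ax":
        return [sum(A[i][k] * x[k] for k in range(4)) for i in range(4)]
    return [sum(x[k] * A[k][j] for k in range(4)) for j in range(4)]

results = {}
for order in ("row-major", "column-major"):
    for style in ("y=Ax", "y=xA"):
        results[(order, style)] = transform(unflatten(flat, order), (1, 2, 3), style)

for combo, y in results.items():
    print(combo, y)
```

Two of the four combinations move the point (1, 2, 3) to (11, 22, 33), so the flat list alone cannot tell you which element order is in use. The other two leave x, y and z alone and push the translation values into the fourth component.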
It's often difficult to find documentation that formally describes this for most 3D graphics software (if it exists at all). So one might resort to finding it out on their own.
I've used a few tricks:
- Testing matrix multiplication and seeing what it does, e.g. by multiplying a translation with a rotation
- Typing out a matrix by hand and seeing what it does, e.g. if I make the 4th element "10", does it translate X by 10?
$y=xA$ and $y=Ax$ don't affect basic matrix operations
These two styles only prescribe how transform matrices are constructed. Given known matrices, they don't affect the basic matrix operations or the order of their arguments.
For example, the order of a matrix multiplication appears to swap between programs only because the matrices themselves are not the same: a transform matrix built for one style is the transpose of the one built for the other.
No matter which style a program uses, the following operations should remain identical across them.
- Transpose: $A^T$
- Matrix multiplication: $AB$
- Inverse: $A^{-1}$
- Determinant: $\det A$
- Minor: $M_{i,j}$
- Cofactor: $C_{i,j}$
- ... and pretty much all of them
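As a short worked step, here is why the multiplication order seems to swap even though the operation itself is unchanged. The symbols $R$ (applied first) and $T$ (applied second) are hypothetical matrices built for the $y=Ax$ style, and the only fact used is the transpose identity $(AB)^T = B^T A^T$:

$$
y = T R x \quad\Longleftrightarrow\quad y^T = (T R x)^T = x^T R^T T^T
$$

A program using the $y=xA$ style holds $R^T$ and $T^T$ rather than $R$ and $T$, so it multiplies them in the opposite order, while matrix multiplication is defined identically in both.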
Appendix: a table of matrix math conventions for popular programs
The results of this table are all my own independent research. I've tried and tested the code in the table. Feel free to fact-check!
| Application | Matrix initialization order* | Transform style | Language | Example (translate by x=10, y=20, z=30) |
| --- | --- | --- | --- | --- |
| Blender | Row-major | y=Ax | Python | |
| Maya | Row-major | y=xA | MEL, Python | |
| Houdini | Row-major | y=xA | VEX | |
| Cinema 4D | Column-major* | y=Ax | Python | |
| Unreal Engine | Row-major | y=xA | C++ | |
| Unity | Column-major | y=Ax | C# | |
| CSS | Column-major | y=Ax | CSS | |
| GLSL | Column-major | y=Ax (by convention) | GLSL | |
| HLSL | Row-major | y=xA (by convention) | HLSL | |
| glm | Column-major | y=Ax | C++ | |
| DirectXMath | Row-major | y=xA | C++ | |
| Eigen | Row-major | y=Ax | C++ | |

* Cinema 4D: The translation vector is the first column, and we effectively multiply with the column vector $\begin{bmatrix}1,x,y,z\end{bmatrix}^T$. See: https://developers.maxon.net/docs/Cinema4DPythonSDK/html/manuals/data_algorithms/classic_api/matrix.html#matrixfundamental
* Note: Initialization order describes how a matrix is created from scratch. This describes the element order for numbers passed to a class constructor, function, array, or inline syntax. It's usually the same as storage order, but not always (Eigen is an example of this).
The work is not shown, but "Transform style" is based on other indications such as built-in transform functions and the order in which transforms are applied (e.g. "translate", "rotate", "scale"). GLSL and HLSL are "by convention", because they don't include transform functions and they allow for both $y=Ax$ and $y=xA$.