Understanding matrix multiplication

In my last blog on orthogonal projection, I shared how visualizing mathematical concepts helps us understand them better. So, I thought of doing the same with matrix multiplication. Matrix multiplication is one of the easier topics in linear algebra; however, it is crucial to understand what it means. Why? Well, you will find out at the end of this blog. So, today, we are going to talk about matrix multiplication and visualize the product to understand what it does.

A brief introduction to matrix multiplication

Matrix multiplication is a mathematical operation that takes two matrices as input and produces a third matrix as output. Each entry of the output is obtained by multiplying the elements of a row of the first matrix by the corresponding elements of a column of the second matrix, and then summing these products.

The resulting matrix has dimensions that depend on the dimensions of the two input matrices. Specifically, if the first matrix has dimensions m x n (m rows, n columns) and the second matrix has dimensions n x p (n rows, p columns), then the resulting matrix will have dimensions m x p (m rows, p columns).
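To make the row-by-column rule and the dimension rule concrete, here is a minimal sketch in plain Python. The function name `matmul` and the matrices `X` and `Y` are illustrative choices of mine, not something defined elsewhere in this blog.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    m, n = len(a), len(a[0])
    n_b, p = len(b), len(b[0])
    # The inner dimensions must agree: (m x n) times (n x p).
    if n != n_b:
        raise ValueError(f"cannot multiply {m}x{n} by {n_b}x{p}")
    # Entry (i, j) is the sum over k of a[i][k] * b[k][j].
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


# A 2x3 matrix times a 3x2 matrix gives a 2x2 matrix.
X = [[1, 2, 3],
     [4, 5, 6]]
Y = [[7, 8],
     [9, 10],
     [11, 12]]
product = matmul(X, Y)
print(product)  # [[58, 64], [139, 154]]
```

Swapping the order to `matmul(Y, X)` is also allowed here (3 x 2 times 2 x 3) but gives a 3 x 3 matrix, while `matmul(X, X)` raises an error because the inner dimensions do not match.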

Understanding its significance through an example

Matrix multiplication (each entry of the product is the dot product of a row with a column) is one of the most commonly used operations on matrices. If you are unfamiliar with the steps of matrix multiplication, take a look at this explanation (it is easy and would take about 15 minutes to understand). Say we have a matrix A whose columns are the vectors u and v:

\(
A = \left[\begin{array}{cc}
1 & -1 \\ 1 & 1
\end{array}\right]\)

This matrix is illustrated in a 2-D space as follows:

[Figure: the vectors of matrix A plotted in 2-D space]

We have another matrix R which is given as:

\(
R = \left[\begin{array}{cc}
0 & -1 \\ 1 & 0
\end{array}\right]
\)

Now let us go ahead and multiply these two matrices, that is,

\(
R \cdot A = \left[\begin{array}{cc}
0 & -1 \\ 1 & 0
\end{array}\right] \cdot \left[\begin{array}{cc}
1 & -1 \\ 1 & 1
\end{array}\right] = \left[\begin{array}{cc}
-1 & -1 \\ 1 & -1
\end{array}\right]\)
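
For readers who want to check the arithmetic, the product can be written out entry by entry. Each entry is a row of R multiplied element-wise with a column of A and then summed:

\(
R \cdot A = \left[\begin{array}{cc}
(0)(1) + (-1)(1) & (0)(-1) + (-1)(1) \\ (1)(1) + (0)(1) & (1)(-1) + (0)(1)
\end{array}\right] = \left[\begin{array}{cc}
-1 & -1 \\ 1 & -1
\end{array}\right]
\)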

If we plot this new matrix in a 2-D space, we get the following result:

[Figure: the vectors of the product R · A plotted in 2-D space]

Notice any difference between the first plot and the second plot? The set of vectors is rotated by 90° counterclockwise. This shows that the matrix A is transformed by the matrix R, which, by the way, is a rotation matrix. The standard form of a rotation matrix is given by:

\(\left[\begin{array}{cc}
\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta
\end{array}\right]\)

Because we want to rotate the vectors by 90°, setting \(\theta = 90^\circ\) gives us:

\(\left[\begin{array}{cc}
0 & -1 \\ 1 & 0
\end{array}\right]\)
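
If you would like to try other angles, the sketch below builds the rotation matrix from \(\theta\) and applies it to a vector. It is only an illustration: the helper names `rotation_matrix` and `rotate` are my own, and the entries are rounded because the cosine of 90° comes out as a tiny floating-point number rather than an exact zero.

```python
import math


def rotation_matrix(theta_degrees):
    """Return the 2x2 counterclockwise rotation matrix for the given angle."""
    theta = math.radians(theta_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def rotate(vector, theta_degrees):
    """Rotate a 2-D vector (x, y) counterclockwise by the given angle."""
    r = rotation_matrix(theta_degrees)
    x, y = vector
    return (r[0][0] * x + r[0][1] * y,
            r[1][0] * x + r[1][1] * y)


# Rounding removes the floating-point noise in cos(90 degrees).
R90 = [[round(entry) for entry in row] for row in rotation_matrix(90)]
print(R90)  # [[0, -1], [1, 0]]

# Rotating the first column of A, (1, 1), gives approximately (-1, 1),
# which is the first column of the product R . A.
u_rotated = rotate((1, 1), 90)
```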

Hence, when we multiply a matrix by another, we are mapping one set of basis vectors from one vector space into another. One thing to notice is that we calculated the product \(R \cdot A\) and not \(A \cdot R\). This is because we read the order of transformations from right to left, and in general matrix multiplication is not commutative, so the order matters. Here, the matrix \(A\) goes through the rotation transformation \(R\), hence \(R \cdot A\).
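
A quick way to see why the order matters is to multiply in both orders. One caveat: the particular A used in this blog is itself a scaled rotation (\(\sqrt{2}\) times a rotation by 45°), so it happens to give the same result either way. The sketch below therefore also uses a shear matrix `S`, which is a hypothetical example of mine and not part of the original example.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


R = [[0, -1], [1, 0]]  # rotation by 90 degrees, from the text
A = [[1, -1], [1, 1]]  # matrix A, from the text
S = [[1, 1], [0, 1]]   # hypothetical shear matrix, not from the text

RA, AR = matmul2(R, A), matmul2(A, R)
RS, SR = matmul2(R, S), matmul2(S, R)

print(RA == AR)  # True: A is a scaled rotation, so it commutes with R
print(RS, SR)    # [[0, -1], [1, 1]] [[1, -1], [1, 0]]
```

Shearing and then rotating (`RS`) is not the same as rotating and then shearing (`SR`), which is why we have to be careful about which matrix sits on the right.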

In the example above, we transformed the vectors in matrix A by using the rotation matrix R. I encourage you to go ahead and try some examples yourself (you can use this site to plot the resulting vectors).

Cool. So why is it important to understand what it means?

Matrix multiplication means transforming a given matrix with the help of a transformation matrix. Under this transformation, most vectors get knocked off their original span and end up on a new one. However, certain vectors stay on their original span; they only get scaled up or down during the transformation (have a look at this article on span if you are unfamiliar with the term). These vectors are called eigenvectors, and the factors by which they get scaled are called eigenvalues. Eigenvectors and eigenvalues play a crucial role in many areas, such as Principal Component Analysis (PCA), image compression, and noise reduction.
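
To get a feel for "staying on the same span", here is a small sketch that checks whether a matrix merely scales a given vector. The matrix `M` and the helper names `apply` and `scale_factor` are assumptions of mine chosen for illustration. This is a check for a candidate vector, not a method for finding eigenvectors.

```python
def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (tuple of numbers)."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


def scale_factor(matrix, vector):
    """Return the factor by which the matrix scales the vector,
    or None if the vector is knocked off its span."""
    out = apply(matrix, vector)
    # Use the first nonzero component to get a candidate factor.
    for v, o in zip(vector, out):
        if v != 0:
            factor = o / v
            break
    else:
        return None  # the zero vector is not an eigenvector
    # The vector stays on its span only if every component scales alike.
    same_span = all(abs(o - factor * v) < 1e-9 for v, o in zip(vector, out))
    return factor if same_span else None


M = [[2, 1], [1, 2]]   # hypothetical symmetric matrix, not from the text
R = [[0, -1], [1, 0]]  # rotation by 90 degrees, from the text

print(scale_factor(M, (1, 1)))   # 3.0  -> eigenvector with eigenvalue 3
print(scale_factor(M, (1, -1)))  # 1.0  -> eigenvector with eigenvalue 1
print(scale_factor(M, (1, 0)))   # None -> knocked off its span
print(scale_factor(R, (1, 0)))   # None -> a 90 degree rotation moves it
```

The last line hints at why the rotation matrix R has no real eigenvectors: a 90° rotation turns every nonzero vector off its own span.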

Conclusion

In this blog, we saw that matrix multiplication essentially means transforming one matrix using another. I also gave a brief introduction to eigenvalues and eigenvectors. In the next blog, we will look at these terms in detail.


This blog is a part of my journey to understand Mathematics on Machine Learning. Other blogs that fall under this journey are:

