An intuitive understanding of Eigenvalues and Eigenvectors

“The essence of mathematics is not to make simple things complicated, but to make complicated things simple.”
– S. Gudder

In a previous post on matrix multiplication, I discussed the geometrical significance of the dot product of matrices and how it represents a linear mapping of vectors. We also saw that most vectors are thrown off their span as a result of the linear mapping. Some vectors, however, remain on their original span and only get scaled by some factor. This idea about the span and linear mapping of vectors is the foundation for eigenvalues and eigenvectors.


In this blog, we will understand what eigenvalues and eigenvectors are. We will see how to calculate them and, more significantly, why they are such popular topics in linear algebra.

Let’s get started!

What are Eigenvalues and Eigenvectors?

As I mentioned in the blog “Understanding Matrix Multiplication”, calculating a dot product between matrices means, in geometrical terms, that we are transforming one matrix with the help of the other (the transformation matrix). While some vectors land on a new span, others maintain their original span; they just get scaled up or down. Whether a vector keeps its span depends on the matrix responsible for the linear transformation. Hence, it would be interesting to know the characteristics that this particular matrix holds.

Eigenvalues and eigenvectors represent the characteristics of a matrix. The vectors that maintain their original span under a linear transformation are called eigenvectors, and the amounts by which they get scaled are the eigenvalues. Every eigenvector of a matrix has a corresponding eigenvalue. By calculating these two entities, we can determine which vectors will remain on their span. But first, I would like you to see what these eigenvectors and eigenvalues are actually doing, and later we will proceed to understand how to calculate them. This will help you understand why you do what you do.
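Formally, this behaviour is captured in a single equation: a non-zero vector \(v\) is an eigenvector of a matrix \(A\) with eigenvalue \(\lambda\) if

\(A \cdot v = \lambda \cdot v\)

that is, transforming \(v\) by \(A\) has exactly the same effect as scaling \(v\) by \(\lambda\).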

Understanding the significance of eigenvalues and eigenvectors

Consider a simple matrix \(A \in \mathbb{R}^{2 \times 2}\):

\(A =
\left[\begin{matrix}
1 & 2 \\ 2 & 1
\end{matrix}\right]
\)

The eigenvectors and their corresponding eigenvalues for the above matrix are:

\(
v_1 = \left(\begin{matrix}-1 \\ 1 \end{matrix}\right), \lambda_1 = -1 \\
v_2 = \left(\begin{matrix}1 \\ 1 \end{matrix}\right), \lambda_2 = 3
\)

This says that during any linear transformation performed by this matrix, all the vectors lying in the span of \(v_1\) will get scaled by \(-1\) (the negative sign means that the direction of the vector flips). Similarly, all the vectors lying in the span of \(v_2\) will get scaled by a factor of \(3\).
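We can quickly verify this behaviour with NumPy (a minimal sketch; the variable names are my own):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

v1 = np.array([-1, 1])  # eigenvector with eigenvalue -1
v2 = np.array([1, 1])   # eigenvector with eigenvalue 3

print(A @ v1)  # [ 1 -1], i.e. -1 * v1
print(A @ v2)  # [3 3],   i.e.  3 * v2
```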

To understand this, consider that this matrix \(A\) linearly maps a matrix \(B\), which is given as:

\(
B = \left[\begin{array}{cc}
-2 & 5 \\ 2 & 5
\end{array}\right]\)

Notice that both column vectors of matrix \(B\) lie in the spans of the eigenvectors of \(A\): the first column of \(B\) lies in the span of eigenvector \(v_1\) and the second in the span of \(v_2\). Recall that because these vectors lie in the spans of the eigenvectors of the transformation matrix, they will not change their span after the transformation but will simply get scaled by their respective eigenvalues. So, our resultant matrix should be:

\(
\left[\begin{array}{cc}
2 & 15 \\ -2 & 15
\end{array}\right]\)

This can be confirmed by describing the transformation process mathematically as:

\(A \cdot B = \left[\begin{array}{cc}
1 & 2 \\ 2 & 1
\end{array}\right] \cdot \left[\begin{array}{cc}
-2 & 5 \\ 2 & 5
\end{array}\right] =
\left[\begin{array}{cc}
2 & 15 \\ -2 & 15
\end{array}\right]\)

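If you would like to reproduce the transformation yourself, here is a minimal NumPy sketch (variable names are mine):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])
B = np.array([[-2, 5],
              [ 2, 5]])

# Each column of B lies in the span of an eigenvector of A, so the
# product only scales the columns: the first by -1, the second by 3.
print(A @ B)
# [[ 2 15]
#  [-2 15]]
```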

Had the vectors in matrix \(B\) not been in the span of any of the eigenvectors of \(A\), they would have been knocked off their original span during the transformation, and the resulting vectors would land somewhere outside that span.

To summarize, eigenvectors tell us which vectors will stay on their span even after a linear transformation, and eigenvalues give their scaling factors. Next, we will see a step-by-step procedure to calculate them.

How to calculate eigenvectors and eigenvalues

The method we are going to focus on here is called eigendecomposition. This method works only on square matrices, like the matrix \(A\) we saw above. There is another method called Singular Value Decomposition (SVD), but it is outside the scope of this blog. If you are still interested, click this link to learn about SVD.

As an example, we will consider the same matrix \(A\) as above and implement the eigendecomposition method on this. Eigendecomposition consists of the following steps:

  • Calculating the characteristic polynomial
  • Solving the polynomial to obtain the eigenvalues
  • Calculating the eigenvectors using the eigenvalues

Step 1: Calculating the characteristic polynomial

The characteristic polynomial is defined as \(\det(A - \lambda I)\), where \(A\) is the matrix, \(\lambda\) is an eigenvalue, and \(I\) is the identity matrix. For the given matrix, \(\det(A - \lambda I)\) yields:

\(\left|\begin{array}{cc}
1-\lambda & 2 \\ 2 & 1-\lambda
\end{array}\right|\)

Calculating the determinant of the above, the characteristic polynomial we obtain is:

\(\begin{equation}
\lambda^2 - 2\lambda - 3
\end{equation}\)

Step 2: Solving the characteristic polynomial to obtain the eigenvalues

The characteristic polynomial above is nothing but a quadratic. Setting it to zero gives the characteristic equation, which we solve for \(\lambda\):

\(\begin{equation}
\lambda^2 - 2\lambda - 3 = 0
\end{equation}\)

Solving the equation, we get the roots \(\lambda = 3\) and \(\lambda = -1\). These are our eigenvalues. Now, we will find their corresponding eigenvectors.
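If you would rather let the computer do the factoring, NumPy can find the roots of a polynomial from its coefficients (a small sketch; the coefficients below belong to \(\lambda^2 - 2\lambda - 3\)):

```python
import numpy as np

# Coefficients of the characteristic polynomial, highest degree first
coefficients = [1, -2, -3]

print(np.roots(coefficients))  # [ 3. -1.]
```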

Step 3: Calculating eigenvectors using the eigenvalues

To find the eigenvectors, we substitute each eigenvalue into the equation \((A - \lambda I) \cdot x = 0\) and solve for \(x\). Let us start with \(\lambda = 3\).

For \(\lambda = 3\):

\(\left(\begin{array}{cc}
1-\lambda & 2 \\ 2 & 1-\lambda
\end{array}\right) \cdot \left(\begin{array}{c}
x_1 \\ x_2
\end{array}\right) = 0 \)

\(\Rightarrow\left(\begin{array}{cc}
-2 & 2 \\ 2 & -2
\end{array}\right) \cdot \left(\begin{array}{c}
x_1 \\ x_2
\end{array}\right) = 0\)

Expanding the matrix product, we get:

\(\begin{equation}
-2x_1 + 2x_2 = 0 \\ 2x_1 - 2x_2 = 0
\end{equation}\)

Both equations reduce to \(x_1 = x_2\), so any vector with equal components will do; choosing \(x_1 = x_2 = 1\) gives the eigenvector \((1, 1)^T\).

Doing the same for \(\lambda = -1\):

\(\left(\begin{array}{cc}
1-\lambda & 2 \\ 2 & 1-\lambda
\end{array}\right) \cdot \left(\begin{array}{c}
x_1 \\ x_2
\end{array}\right) = 0 \)

\(\Rightarrow\left(\begin{array}{cc}
2 & 2 \\ 2 & 2
\end{array}\right) \cdot \left(\begin{array}{c}
x_1 \\ x_2
\end{array}\right) = 0\)

The resulting equations are:

\(\begin{equation}
2x_1 + 2x_2 = 0 \\ 2x_1 + 2x_2 = 0
\end{equation}\)

Here both equations reduce to \(x_1 = -x_2\); choosing \(x_2 = 1\) gives the eigenvector \((-1, 1)^T\).

Thus we can say that the matrix \(A\) has eigenvectors \((1, 1)^T\) and \((-1, 1)^T\) with eigenvalues \(3\) and \(-1\) respectively.
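As a sanity check, NumPy computes the whole decomposition in one call. Note that np.linalg.eig returns eigenvectors normalized to unit length, so they appear as scaled versions of the ones we derived by hand (and the order of the results may differ on your setup):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

eigenvalues, eigenvectors = np.linalg.eig(A)

print(eigenvalues)   # [ 3. -1.]
print(eigenvectors)  # columns are unit-length eigenvectors
# [[ 0.70710678 -0.70710678]
#  [ 0.70710678  0.70710678]]
```

The column \((0.7071, 0.7071)^T\) is simply our \((1, 1)^T\) scaled to length 1, and likewise \((-0.7071, 0.7071)^T\) is a scaled \((-1, 1)^T\).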

Calculating eigenvectors and eigenvalues for a 3×3 matrix follows the same procedure; the only difference lies in the characteristic polynomial. The dimension of the matrix determines the degree of the polynomial, so a 3×3 matrix yields a cubic equation with up to three eigenvalues.
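To see this, NumPy can hand you the coefficients of the characteristic polynomial of a larger matrix directly (a small sketch; the 3×3 matrix is an arbitrary example of my own):

```python
import numpy as np

# An arbitrary symmetric 3x3 example matrix
C = np.array([[2, 0, 0],
              [0, 3, 4],
              [0, 4, 9]])

coeffs = np.poly(C)      # coefficients of det(C - λI), highest degree first
print(coeffs)            # [  1. -14.  35. -22.] -- a cubic, as expected
print(np.roots(coeffs))  # the three eigenvalues, roughly [11. 2. 1.]
```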

Applications of Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors have many applications in various fields such as physics, engineering, computer science, and economics. Some examples are Principal Component Analysis (PCA), structural engineering, quantum mechanics, and image processing. However, I would like to share my own experience of the practical application of the concept.

During my studies in automobile engineering, we had to carry out brief research on fiberglass. We were given a piece of it to examine, and we learned about its strengths and limitations. As part of this, we were given a shear matrix that contained the forces to be applied to the test sample. From the shear matrix, we were able to determine the axis of maximum deformation by calculating the eigenvectors and eigenvalues.

Conclusion

In this blog, we understood how to calculate eigenvalues and eigenvectors, but more importantly, we understood what they mean. Many entities in the real world can be expressed in terms of matrices, and this “eigen-stuff” can help us understand their very nature, just like my experience with the shear matrix. I would like you to go ahead and try your own small Python exercises with matrices. As an example, you can take any grayscale image, as it is two-dimensional and would make for nice practice. Feel free to get in touch and share your results with me on Instagram: @machinelearningsite.

If you would like to stay updated with small posts on programming and machine learning, do follow me on social media.

If you enjoyed my blog, do consider subscribing to my free, monthly newsletter so you do not miss the interesting articles in the future.
