Lately, I have been working on a project that involves sending images, captured by an industrial camera, from one station to another. The task itself is trivial. However, if the images are sent in raw form, it can demand a significant amount of computing power from the processor. To make this efficient, it is good practice to compress the images before transferring them to the client stations. By compression, I do not mean reducing the image resolution. Image compression is the process of reducing the file size by encoding the image in as few bits as possible. The aim is to lower the size of the file while retaining as much image quality as possible.
In this blog, we will go through the process of compressing images in Python. There will be some math involved (it’s unavoidable), and some programming as well, but all in all, it will be fun to learn. So open up your favorite IDE (notepad?!), grab a cup of coffee, and let’s start.
Mathematical Background
One of the most common techniques for image compression is SVD or Singular Value Decomposition. Singular Value Decomposition is a powerful linear algebra technique that decomposes a matrix into three simpler matrices, allowing us to represent and analyze the matrix in a more compact and informative way.
For a given m×n matrix \(A\), the SVD is represented as: \(A = U \Sigma V^T\) where:
- \(U\) is an m×m orthogonal matrix whose columns are the left singular vectors.
- \(\Sigma\) is an m×n rectangular diagonal matrix containing the singular values of \(A\) in decreasing order.
- \(V^T\) is an n×n orthogonal matrix whose rows are the right singular vectors.
In the context of image compression, the SVD is applied to the image matrix \(A\) (representing pixel intensities) to identify dominant patterns and compress the information while retaining essential features.
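To make the decomposition concrete before we move to real images, here is a minimal sketch on a small random matrix standing in for a tiny grayscale image. The matrix size, the random seed and the helper name rank_k_approx are assumptions chosen for illustration only. It checks that the three factors multiply back to \(A\), and that keeping only the first few singular values gives a lower-rank approximation of the same shape.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))  # stand-in for a tiny 6x4 grayscale image

# Reduced SVD: U is 6x4, s holds the 4 singular values, Vt is 4x4 (already transposed)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the three factors back together recovers A
A_full = U @ np.diag(s) @ Vt
print("exact reconstruction:", np.allclose(A, A_full))


def rank_k_approx(U, s, Vt, k):
    """Keep only the k largest singular values and their vectors (hypothetical helper)."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


# Same shape as A, but built from only 2 singular values
A_2 = rank_k_approx(U, s, Vt, 2)
print("shape:", A_2.shape, "rank:", np.linalg.matrix_rank(A_2))
```

The singular values come back sorted from largest to smallest, which is why simply slicing off the first k entries keeps the most important patterns.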
For more information on SVD, I recommend checking out this article and this video. We will now implement this concept in Python and gain a better understanding of it.
Algorithm for Image Compression
Now that we have some idea of the math behind image compression, let us have a look at the code:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

def compress_image(image_path, k):
    # Load image
    img = Image.open(image_path)

    # Convert image to numpy array (normalized to [0, 1])
    img_array = np.array(img)
    img_array = img_array / 255
    row, col, _ = img_array.shape

    # Split the image into its colour channels
    image_red = img_array[:, :, 0]
    image_green = img_array[:, :, 1]
    image_blue = img_array[:, :, 2]

    original_bytes = img_array.nbytes
    print("The space (in bytes) needed to store this image is", original_bytes)

    # SVD of each channel; note that the third return value is V^T, not V
    U_r, d_r, V_r = np.linalg.svd(image_red, full_matrices=True)
    U_g, d_g, V_g = np.linalg.svd(image_green, full_matrices=True)
    U_b, d_b, V_b = np.linalg.svd(image_blue, full_matrices=True)

    # Number of singular values to keep: a fraction k of min(row, col), at least 1.
    # Using the same r for U, d and V^T keeps the shapes compatible for any image size.
    r = max(1, round(k * d_r.shape[0]))

    # Keep only the first r singular vectors and values of each channel
    U_r_k, d_r_k, V_r_k = U_r[:, :r], d_r[:r], V_r[:r, :]
    U_g_k, d_g_k, V_g_k = U_g[:, :r], d_g[:r], V_g[:r, :]
    U_b_k, d_b_k, V_b_k = U_b[:, :r], d_b[:r], V_b[:r, :]

    compressed_bytes = sum(matrix.nbytes for matrix in
                           [U_r_k, d_r_k, V_r_k, U_g_k, d_g_k, V_g_k, U_b_k, d_b_k, V_b_k])
    print("The compressed matrices that we store now have total size (in bytes):", compressed_bytes)

    # Multiply the truncated matrices back together, channel by channel
    image_red_approx = np.dot(U_r_k, np.dot(np.diag(d_r_k), V_r_k))
    image_green_approx = np.dot(U_g_k, np.dot(np.diag(d_g_k), V_g_k))
    image_blue_approx = np.dot(U_b_k, np.dot(np.diag(d_b_k), V_b_k))

    # Recombine the channels into one image
    image_reconstructed = np.zeros((row, col, 3))
    image_reconstructed[:, :, 0] = image_red_approx
    image_reconstructed[:, :, 1] = image_green_approx
    image_reconstructed[:, :, 2] = image_blue_approx

    # The approximation can fall slightly outside [0, 1], so clamp it
    image_reconstructed = np.clip(image_reconstructed, 0, 1)
    print(image_reconstructed.nbytes)
    return image_reconstructed

# Example usage
input_image = "data/1.png"
output_image = "compressed_output.png"
compression_ratio_1 = 0.1  # keep the top 10% of singular values
compression_ratio_2 = 1    # keep all singular values (no truncation)
compressed_image_1 = compress_image(input_image, compression_ratio_1)
compressed_image_2 = compress_image(input_image, compression_ratio_2)
plt.imsave(output_image, compressed_image_1)
plt.subplot(2, 1, 1)
plt.imshow(compressed_image_1)
plt.subplot(2, 1, 2)
plt.imshow(compressed_image_2)
plt.show()
The code seems overwhelming at first glance, but it is simple to understand. The main logic lies in the compress_image() function, so let us go through it step by step.
Breaking down the Code
Let us understand the initial lines of the code.
def compress_image(image_path, k):
    img = Image.open(image_path)

    # Convert image to numpy array (normalized to [0, 1])
    img_array = np.array(img)
    img_array = img_array / 255
    row, col, _ = img_array.shape

    image_red = img_array[:, :, 0]
    image_green = img_array[:, :, 1]
    image_blue = img_array[:, :, 2]

    original_bytes = img_array.nbytes
    print("The space (in bytes) needed to store this image is", original_bytes)
Initially, the input image is loaded using Image.open(image_path). Next, we convert the Image object into a numpy array and split the RGB channels into individual matrices, namely image_red, image_green and image_blue. To compare memory use before and after compression, we record the bytes occupied by the original image array using img_array.nbytes. We will do the same later for the compressed image. In my case, the original image array occupies 6,291,456 bytes. Note that this is measured after dividing by 255, so the array holds 64-bit floats rather than 8-bit integers. With the image split into individual RGB arrays, we now come to the significant part of the code.
U_r, d_r, V_r = np.linalg.svd(image_red, full_matrices=True)
U_g, d_g, V_g = np.linalg.svd(image_green, full_matrices=True)
U_b, d_b, V_b = np.linalg.svd(image_blue, full_matrices=True)
We use the Singular Value Decomposition method for image compression. For this, numpy provides the function np.linalg.svd(), which calculates the SVD of a matrix. We calculate the SVD of the red channel matrix, resulting in U_r, d_r, and V_r. Here d_r is a 1-D array of singular values sorted from largest to smallest, and V_r is returned already transposed, so it corresponds to \(V^T\) in the formula above. We do the same for the other channels, namely the green and the blue channel matrices.
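As a side note, np.linalg.svd() can return either the full decomposition or a reduced one, controlled by the full_matrices argument. The sketch below uses a random 300×200 array as a hypothetical channel (the size, seed and rank are assumptions for illustration) and suggests that, once the matrices are truncated, both variants lead to the same approximation; the reduced form simply avoids computing columns of U that are never used.

```python
import numpy as np

rng = np.random.default_rng(1)
channel = rng.random((300, 200))  # hypothetical 300x200 colour channel

# Full SVD: U is 300x300; reduced SVD: U is only 300x200
U_full, d_full, Vt_full = np.linalg.svd(channel, full_matrices=True)
U_red, d_red, Vt_red = np.linalg.svd(channel, full_matrices=False)

print("full:   ", U_full.shape, d_full.shape, Vt_full.shape)
print("reduced:", U_red.shape, d_red.shape, Vt_red.shape)

# After truncating to the first r components, both give the same approximation
r = 20
approx_full = U_full[:, :r] @ np.diag(d_full[:r]) @ Vt_full[:r, :]
approx_red = U_red[:, :r] @ np.diag(d_red[:r]) @ Vt_red[:r, :]
print("same approximation:", np.allclose(approx_full, approx_red))
```

For large images the reduced form can save a noticeable amount of memory and time when the image is far from square.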
# Number of singular values to keep: a fraction k of min(row, col), at least 1.
# Using the same r for U, d and V^T keeps the shapes compatible for any image size.
r = max(1, round(k * d_r.shape[0]))

U_r_k, d_r_k, V_r_k = U_r[:, :r], d_r[:r], V_r[:r, :]
U_g_k, d_g_k, V_g_k = U_g[:, :r], d_g[:r], V_g[:r, :]
U_b_k, d_b_k, V_b_k = U_b[:, :r], d_b[:r], V_b[:r, :]

compressed_bytes = sum(matrix.nbytes for matrix in
                       [U_r_k, d_r_k, V_r_k, U_g_k, d_g_k, V_g_k, U_b_k, d_b_k, V_b_k])
print("The compressed matrices that we store now have total size (in bytes):", compressed_bytes)
Now it is time to trim the matrices. If you scroll back up, you will see the second argument that we pass into the function, k. This is the fraction of singular values to keep, given as a number between 0 and 1: only the largest k × 100% of the singular values, together with their singular vectors, are retained. For instance, if we pass 0.5, the code keeps the largest 50% of the values and trims off the rest. Finally, we check the total memory needed to store the trimmed matrices and keep this value in compressed_bytes. In my case, the memory used to store the compressed matrices is 1,254,600 bytes.
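The byte counts quoted so far can be checked by hand. The sketch below assumes a 512×512 RGB image stored as 64-bit floats and k = 0.1, which is one configuration consistent with both figures; the actual image size is not stated here, and truncated_svd_bytes is a helper name made up for this illustration.

```python
def truncated_svd_bytes(rows, cols, k, channels=3, itemsize=8):
    """Bytes needed to store the truncated U, d and V^T of every channel.

    Hypothetical helper: itemsize=8 assumes float64 values.
    """
    r = round(k * min(rows, cols))                        # singular values kept
    per_channel = (rows * r + r + r * cols) * itemsize    # U_k + d_k + V^T_k
    return r, channels * per_channel


rows, cols, k = 512, 512, 0.1  # assumed image size and fraction kept

r, compressed = truncated_svd_bytes(rows, cols, k)
original_float64 = rows * cols * 3 * 8  # the normalized float64 array
original_uint8 = rows * cols * 3        # the same pixels as 8-bit integers

print("singular values kept per channel:", r)
print("truncated matrices (bytes):", compressed)
print("float64 image array (bytes):", original_float64)
print("uint8 image array (bytes):", original_uint8)
```

Under these assumptions, the same pixels stored as 8-bit integers would occupy 786,432 bytes, so the saving is measured against the float64 array rather than against the raw 8-bit image. Storing the truncated matrices in a smaller data type would be one way to shrink them further.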
Thus, the original image array needed 6,291,456 bytes, whereas the trimmed matrices need 1,254,600 bytes. That is a memory reduction by a factor of about 5. However, we cannot return the image in this decomposed form as the function output. Hence we need to multiply these trimmed matrices back together and build the compressed image as a whole.
# Multiply the truncated matrices back together, channel by channel
image_red_approx = np.dot(U_r_k, np.dot(np.diag(d_r_k), V_r_k))
image_green_approx = np.dot(U_g_k, np.dot(np.diag(d_g_k), V_g_k))
image_blue_approx = np.dot(U_b_k, np.dot(np.diag(d_b_k), V_b_k))

# Recombine the channels into one image
image_reconstructed = np.zeros((row, col, 3))
image_reconstructed[:, :, 0] = image_red_approx
image_reconstructed[:, :, 1] = image_green_approx
image_reconstructed[:, :, 2] = image_blue_approx

# The approximation can fall slightly outside [0, 1], so clamp it
image_reconstructed = np.clip(image_reconstructed, 0, 1)
print(image_reconstructed.nbytes)
return image_reconstructed
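Besides comparing the images by eye, one way to put a number on the quality loss is the relative error between a channel and its approximation. The following is a minimal sketch on a random array rather than a real photo; the array size, the ranks tried and the helper name relative_error are assumptions for illustration. It also checks a known property of SVD truncation (the Eckart-Young theorem): the Frobenius-norm error equals the square root of the sum of the squared singular values that were discarded.

```python
import numpy as np


def relative_error(original, approx):
    """Frobenius-norm error relative to the original (hypothetical helper)."""
    return np.linalg.norm(original - approx) / np.linalg.norm(original)


rng = np.random.default_rng(2)
channel = rng.random((64, 48))  # stand-in for one colour channel
U, d, Vt = np.linalg.svd(channel, full_matrices=False)

errors = {}
for r in (4, 16, 48):
    approx = U[:, :r] @ np.diag(d[:r]) @ Vt[:r, :]
    errors[r] = relative_error(channel, approx)

    # Error predicted from the discarded singular values alone
    predicted = np.sqrt(np.sum(d[r:] ** 2)) / np.linalg.norm(channel)
    print(f"r={r:2d}  measured={errors[r]:.6f}  predicted={predicted:.6f}")
```

Random noise compresses poorly, so real photographs will typically show a much faster drop in error as r grows. The useful part is the pattern: the error can be estimated from the singular values before any image is rebuilt.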
Here’s a comparison of original image vs compressed image:

Summary
In this blog, we saw how images can be compressed using the Singular Value Decomposition technique from linear algebra, with the Numpy library of Python doing the heavy lifting. The input image is first split into individual colour channels, and every channel is decomposed using SVD. The decomposed matrices are trimmed to keep only the largest singular values, then multiplied back together and recombined, which ultimately results in the compressed image as we saw above in the image example.