20 Aug 2024

What the heck is NumPy?

When I was learning NumPy, I found it difficult to understand its practical use because most resources explained how to call its functions but rarely showed real-life examples.

In this post, I'll explain NumPy in a simple, practical way and show you why it's important in machine learning.

Everything is a number

Computers don't see images, videos, audio, or text like we do. While we can easily recognize a dog, cat, or person in a picture, computers need everything to be converted into numbers. Tools like NumPy let us represent an image as an array of numbers that a computer can work with.

NumPy isn’t just for images; it’s incredibly versatile! Its core data structure is the array, so you can also use it with text, audio, and video once they are represented as numbers. But to keep things simple, let's focus on images for this post.

Let me demonstrate how this works with a simple example.

Take this image of the Iraqi flag as an example. It’s 1024 pixels wide and 683 pixels tall, and because this particular PNG stores transparency, it has four channels: Red (R), Green (G), Blue (B), and Alpha (A) for transparency.

Iraqi Flag

To a computer, an image is simply a bunch of numbers. So, let’s convert this image into a format that the computer can understand.

First things first, you'll need to install NumPy, Pillow, and Matplotlib to get started. Pillow is a powerful and widely used image processing library for Python, and Matplotlib will let us display the result:

pip3 install numpy pillow matplotlib

Now load the image and convert it to a NumPy array:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Load the image using Pillow
image = Image.open('iraqi_flag.png')

# Convert the image to a NumPy array
image_array = np.array(image)

image_array

Here is the output:

array(
    [
        [
            [205,  15,  29, 255],
            [205,  15,  29, 255],
            [205,  15,  29, 255],
            ...,
            [  0,   0,   0,  52],
            [  0,   0,   0,  44],
            [  0,   0,   0,  11]
        ]
    ], 
dtype=uint8)

Too many numbers! 🥴

Let's print out the last pixel of the last row:

image_array[-1][-1]
[ 0  0  0 11]
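
To make the indexing concrete, here is a minimal sketch on a tiny hand-made array. The `tiny` array is a hypothetical stand-in for the flag (2 rows of 3 RGBA pixels), not the real image data. It shows that `[-1][-1]` first picks the last row and then the last pixel in that row, and that `[-1, -1]` does the same thing in one step.

```python
import numpy as np

# Hypothetical stand-in for the flag: 2 rows x 3 columns of RGBA pixels.
tiny = np.array(
    [
        [[205, 15, 29, 255], [205, 15, 29, 255], [205, 15, 29, 255]],
        [[0, 0, 0, 52], [0, 0, 0, 44], [0, 0, 0, 11]],
    ],
    dtype=np.uint8,
)

last_row = tiny[-1]        # every pixel in the bottom row
last_pixel = tiny[-1][-1]  # last pixel of the bottom row
same_pixel = tiny[-1, -1]  # equivalent single-step indexing
alpha = tiny[-1, -1, 3]    # the alpha value of that pixel

print(last_row.shape)  # (3, 4)
print(last_pixel)      # [ 0  0  0 11]
print(alpha)           # 11
```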

Now, if you want to flip the flag upside-down, you can easily do that using NumPy’s flip function:

image_flipped = np.flip(image_array, axis=0)
image_flipped

With axis=0, flip reverses the order of the rows, so the last row of pixels (the one that ends with [ 0 0 0 11]) becomes the first row.

Let's check one more thing to see what I mean:

print(image_array[-1][-1])   # last pixel of the last row
print(image_flipped[0][-1])  # the same pixel, now at the end of the first row

As you can see, both lines print the same pixel, because the rows are now in reverse order. If we use matplotlib to plot the image, here’s what we get:

plt.imshow(image_flipped)
plt.show()  # needed when running as a script rather than in a notebook

There's a lot more to flip than I can cover here. For instance, if you remove axis=0, you'll get a different result because NumPy then flips along every axis by default: rows, columns, and even the color channels. To learn more about flip and its various options, you can check out the official NumPy documentation.
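
The difference between the axis options is easier to see on a tiny array. The sketch below uses a hypothetical 2 x 2 RGBA image (`tiny`, made up for illustration) and compares `axis=0`, `axis=1`, and no axis at all.

```python
import numpy as np

# Hypothetical 2x2 RGBA image with a distinct value per pixel.
tiny = np.array(
    [
        [[1, 1, 1, 255], [2, 2, 2, 255]],
        [[3, 3, 3, 255], [4, 4, 4, 255]],
    ],
    dtype=np.uint8,
)

upside_down = np.flip(tiny, axis=0)  # reverse rows (top <-> bottom)
mirrored = np.flip(tiny, axis=1)     # reverse columns (left <-> right)
all_axes = np.flip(tiny)             # reverse rows, columns, and channels

print(upside_down[0, 0])  # [  3   3   3 255]
print(mirrored[0, 0])     # [  2   2   2 255]
print(all_axes[0, 0])     # [255   4   4   4]
```

Note the last line: flipping every axis also reverses the channel order, so RGBA becomes ABGR. For images that is rarely what you want, which is why passing an explicit axis is usually the safer choice.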

Iraqi Flag Flipped

Next, let’s print out the shape of the array:

image_array.shape

Output:

(683, 1024, 4)
  • 683: The number of rows (height).
  • 1024: The number of columns (width).
  • 4: The number of channels or depth.
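
As a small sketch of how the shape can be used, the snippet below unpacks it into named variables and pulls out a single channel. It assumes a stand-in array `image_like` filled with zeros that merely has the same shape as the flag; it does not contain the real pixels.

```python
import numpy as np

# Stand-in array with the same shape as the flag (all zeros, not the real pixels).
image_like = np.zeros((683, 1024, 4), dtype=np.uint8)

height, width, channels = image_like.shape
alpha_channel = image_like[:, :, 3]  # one alpha value per pixel

print(height, width, channels)  # 683 1024 4
print(alpha_channel.shape)      # (683, 1024)
print(image_like.size)          # 2797568 numbers in total
```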

Finally, let’s print out the number of dimensions of the array:

image_array.ndim

Output:

3

ndim is an attribute of a NumPy array that holds its number of dimensions (or axes). In other words, it tells you how many levels of nesting the array has, which is also how many indices you need to reach a single number. For our image that's three: row, column, and channel.
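
A quick way to build intuition for ndim is to compare arrays of different depth. This sketch assumes two stand-in arrays filled with zeros (`grayscale` and `rgba` are illustrative names, not data from the flag).

```python
import numpy as np

grayscale = np.zeros((683, 1024), dtype=np.uint8)  # one value per pixel
rgba = np.zeros((683, 1024, 4), dtype=np.uint8)    # four values per pixel

print(grayscale.ndim)                # 2
print(rgba.ndim)                     # 3
print(rgba.ndim == len(rgba.shape))  # True

pixel_value = rgba[0, 0, 3]  # three indices reach a single number
print(pixel_value.ndim)      # 0
```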

Now you get it: at its heart, an image is just a bunch of numbers, and NumPy is perfect for handling that kind of data.

Thanks for reading.

All rights reserved to Ahmad Mayahi