2 Feb 2025

Matrix Multiplication

Matrix multiplication is just a way of combining two tables of numbers (matrices) to create a new table.

Each row of the first matrix is paired with each column of the second matrix, like a group interview where every candidate (row) meets every interviewer (column). Each meeting produces one number in the new table.


The Smallest Example

Let’s take the simplest case: a 2×2 matrix multiplied by another 2×2 matrix.

Imagine this as matching salespeople with products. Each row of A is a salesperson, and each column of B is a product.

import numpy as np

A = np.array([[1, 2], 
              [3, 4]])  # Salespeople data

B = np.array([[5, 6], 
              [7, 8]])  # Product demand

C = np.matmul(A, B)
print(C)

What’s Happening?

Each row of A is paired with each column of B: multiply the matching entries, then add the products.

Row from A    Column from B    Result
[1, 2]        [5, 7]           1*5 + 2*7 = 19
[1, 2]        [6, 8]           1*6 + 2*8 = 22
[3, 4]        [5, 7]           3*5 + 4*7 = 43
[3, 4]        [6, 8]           3*6 + 4*8 = 50

So the final matrix is:

[[19, 22]
 [43, 50]]
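
To make the row-by-column rule concrete, here is a minimal sketch that computes the same product with plain Python loops and compares it with np.matmul. The helper name matmul_by_hand is made up for this illustration and is not part of NumPy; it assumes both inputs are 2D arrays.

```python
import numpy as np

def matmul_by_hand(A, B):
    """Multiply two 2D arrays with explicit loops (illustrative helper, not a NumPy API)."""
    n_rows, n_inner = A.shape
    n_inner_b, n_cols = B.shape
    assert n_inner == n_inner_b, "columns of A must equal rows of B"
    C = np.zeros((n_rows, n_cols), dtype=A.dtype)
    for i in range(n_rows):            # pick a row of A
        for j in range(n_cols):        # pick a column of B
            for k in range(n_inner):   # multiply matching entries and add them up
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

C_by_hand = matmul_by_hand(A, B)
print(C_by_hand)
print(np.array_equal(C_by_hand, np.matmul(A, B)))  # True
```

The three nested loops are slow compared with np.matmul, so this is only meant for reading, not for real workloads.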

💡 Think of it as:

  • The first row in A is one salesperson's performance.
  • The first column in B is one product's demand.
  • The multiplication tells us how well they fit together.

Different Shapes - Why Does Shape Matter?

You cannot multiply just any two matrices: the number of columns in the first must equal the number of rows in the second.

Let’s try a 2×3 matrix multiplied by a 3×2 matrix:

A = np.array([[1, 2, 3], 
              [4, 5, 6]])  # Shape (2, 3)

B = np.array([[7, 8], 
              [9, 10], 
              [11, 12]])  # Shape (3, 2)

C = np.matmul(A, B)
print(C)

What’s Happening?

  • A has 2 rows, 3 columns.
  • B has 3 rows, 2 columns.
  • Since A has 3 columns and B has 3 rows, we can multiply them! The result takes its row count from A and its column count from B, so its shape is (2, 2).

Row from A    Column from B    Result
[1, 2, 3]     [7, 9, 11]       1*7 + 2*9 + 3*11 = 58
[1, 2, 3]     [8, 10, 12]      1*8 + 2*10 + 3*12 = 64
[4, 5, 6]     [7, 9, 11]       4*7 + 5*9 + 6*11 = 139
[4, 5, 6]     [8, 10, 12]      4*8 + 5*10 + 6*12 = 154

Final result:

[[ 58,  64]
 [139, 154]]

💡 Think of it as:

  • A classroom of students (rows) taking three subjects (columns).
  • A grading system (matrix B) that converts subject scores into final grades.
  • The multiplication converts raw scores into final results.
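
The shape rule can be checked before multiplying. The sketch below uses a small hypothetical helper, can_multiply (not a NumPy function), and also shows that np.matmul raises a ValueError when the inner sizes disagree. The arrays are filled with ones because only their shapes matter here.

```python
import numpy as np

def can_multiply(A, B):
    """Return True when A's column count equals B's row count (hypothetical helper)."""
    return A.shape[1] == B.shape[0]

A = np.ones((2, 3))
B = np.ones((3, 2))
bad = np.ones((2, 2))   # A has 3 columns but this matrix has only 2 rows

print(can_multiply(A, B))       # True
print(np.matmul(A, B).shape)    # (2, 2)
print(can_multiply(A, bad))     # False

try:
    np.matmul(A, bad)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("NumPy refused:", err)
```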

Vector-Matrix Multiplication (A Special Case)

A vector is just a single row or column, like a student's grades.

If we have a 2×2 matrix and a vector (1D array of 2 numbers):

A = np.array([[1, 2], 
              [3, 4]])  # Shape (2,2)

v = np.array([5, 6])  # Shape (2,)

C = np.matmul(A, v)
print(C)

What’s Happening?

  • The vector meets every row in A.
  • For each row, matching elements are multiplied and the products are added.

Row from A    Vector v    Result
[1, 2]        [5, 6]      1*5 + 2*6 = 17
[3, 4]        [5, 6]      3*5 + 4*6 = 39

Final result:

[17, 39]

💡 Think of it as:

  • A restaurant rating system where each row is a restaurant, and the vector holds your personal weight for each criterion (two in this example, say food and service).
  • The multiplication computes your overall score for each restaurant.
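
Order matters with vectors too. As a small sketch using the same A and v as above: np.matmul treats a 1D array on the right as a column and a 1D array on the left as a row, so the two products generally differ.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
v = np.array([5, 6])

right = np.matmul(A, v)   # v acts as a column: one dot product per row of A
left = np.matmul(v, A)    # v acts as a row: one dot product per column of A

print(right)                     # [17 39]
print(left)                      # [23 34]
print(right.shape, left.shape)   # (2,) (2,)
```

In both cases the result is again a 1D array of 2 numbers.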

Batch Multiplication (For AI & Deep Learning)

Imagine you’re running multiple experiments at once. Instead of multiplying one pair of matrices at a time, we stack the matrices into a batch and multiply every pair in a single call.

import torch

A = torch.rand(2, 3, 4)  # Two (3×4) matrices
B = torch.rand(2, 4, 5)  # Two (4×5) matrices

C = torch.matmul(A, B)  # Output shape (2, 3, 5)
print(C.shape)

🔍 What’s happening?

  • A holds 2 matrices, each 3×4.
  • B holds 2 matrices, each 4×5.
  • PyTorch multiplies the first matrix in A by the first in B and the second by the second, all in a single call, giving 2 matrices of shape 3×5.

💡 Think of it as:

  • Processing multiple images at the same time in deep learning.
  • Applying multiple transformations to different datasets at once.
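
A quick way to convince yourself that a batched call is the same as a loop over pairs is to compare the two. The sketch below uses NumPy instead of PyTorch (a swap made only to keep the example dependency-light); np.matmul applies the same rule of treating the leading dimension as the batch. The random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3, 4))   # batch of two 3×4 matrices
B = rng.random((2, 4, 5))   # batch of two 4×5 matrices

batched = np.matmul(A, B)   # every pair multiplied in one call
looped = np.stack([np.matmul(A[i], B[i]) for i in range(A.shape[0])])

print(batched.shape)                  # (2, 3, 5)
print(np.allclose(batched, looped))   # True
```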

Final Summary

  • Matrix multiplication pairs each row of one table with each column of another: multiply the matching entries and add them up.
  • Shapes matter: the number of columns in the first matrix must equal the number of rows in the second.
  • Vectors (1D arrays) can multiply matrices too, like getting a final score from weighted criteria.
  • Batch multiplication is used in AI to run many matrix multiplications in a single call.

All rights reserved to Ahmad Mayahi