— 2 Feb 2025

Matrix Multiplication

Matrix multiplication is just a way of combining two tables of numbers (matrices) to create a new table.

Each row of the first matrix talks to each column of the second matrix, like a group interview where every candidate (row) meets every interviewer (column).

The Smallest Example

Let’s take the simplest case: a 2×2 matrix multiplied by another 2×2 matrix.

Imagine this as matching salespeople with products. Each row of A is a salesperson, and each column of B is a product.

import numpy as np

A = np.array([[1, 2], 
              [3, 4]])  # Salespeople data

B = np.array([[5, 6], 
              [7, 8]])  # Product demand

C = np.matmul(A, B)
print(C)

What’s Happening?

Each row from A mixes with each column from B:

Row from A	Column from B	Result
`[1, 2]`	`[5, 7]`	`1*5 + 2*7 = 19`
`[1, 2]`	`[6, 8]`	`1*6 + 2*8 = 22`
`[3, 4]`	`[5, 7]`	`3*5 + 4*7 = 43`
`[3, 4]`	`[6, 8]`	`3*6 + 4*8 = 50`

So the final matrix is:

[[19, 22]
 [43, 50]]
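
To make the row-meets-column idea concrete, here is a minimal sketch that does the same multiplication with plain Python loops instead of NumPy. The helper name `matmul_loops` is made up for this illustration; it is not a library function.

```python
# Hypothetical helper (not part of NumPy): multiply two matrices
# stored as plain lists of lists.
def matmul_loops(A, B):
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    # Every row of A must have as many entries as B has rows.
    assert all(len(row) == n_inner for row in A), "columns of A must match rows of B"
    C = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):            # pick a row of A
        for j in range(n_cols):        # pick a column of B
            for k in range(n_inner):   # walk along both, multiply and add
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul_loops(A, B)
print(C)  # [[19, 22], [43, 50]]
```

The innermost loop is the "multiply and add" step from the table; the two outer loops just make sure every row meets every column.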

💡 Think of it as:

The first row in A is one salesperson's performance.
The first column in B is one product's demand.
The multiplication tells us how well they fit together.

Different Shapes - Why Does Shape Matter?

You CANNOT multiply just any two matrices: the number of columns in the first must equal the number of rows in the second. An (m×n) matrix times an (n×p) matrix gives an (m×p) result.

Let’s try a 2×3 matrix multiplied by a 3×2 matrix:

A = np.array([[1, 2, 3], 
              [4, 5, 6]])  # Shape (2, 3)

B = np.array([[7, 8], 
              [9, 10], 
              [11, 12]])  # Shape (3, 2)

C = np.matmul(A, B)
print(C)

What’s Happening?

A has 2 rows, 3 columns.
B has 3 rows, 2 columns.
Since A has 3 columns and B has 3 rows, we can multiply them! The result has one row per row of A and one column per column of B, so its shape is (2, 2).

Row from A	Column from B	Result
`[1, 2, 3]`	`[7, 9, 11]`	`1*7 + 2*9 + 3*11 = 58`
`[1, 2, 3]`	`[8, 10, 12]`	`1*8 + 2*10 + 3*12 = 64`
`[4, 5, 6]`	`[7, 9, 11]`	`4*7 + 5*9 + 6*11 = 139`
`[4, 5, 6]`	`[8, 10, 12]`	`4*8 + 5*10 + 6*12 = 154`

Final result:

[[ 58,  64]
 [139, 154]]
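
A quick way to get a feel for the shape rule is to try a few combinations and see which ones NumPy accepts. The sketch below reuses the same A and B; the flag `mismatch_raised` is just a name chosen for this example.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])     # shape (3, 2)

# (2, 3) times (3, 2): inner sizes match (3 == 3), result is (2, 2)
print(np.matmul(A, B).shape)   # (2, 2)

# (3, 2) times (2, 3) also works, but the result has a different shape
print(np.matmul(B, A).shape)   # (3, 3)

# (2, 3) times (2, 3): inner sizes differ (3 != 2), so NumPy refuses
try:
    np.matmul(A, A)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Cannot multiply:", err)
```

Notice that swapping the order changes the result shape, so A times B and B times A are generally not the same thing.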

💡 Think of it as:

A classroom of students (rows) taking three subjects (columns).
A grading system (matrix B) that weights the three subject scores into two final grades.
The multiplication converts raw scores into final results.

Vector-Matrix Multiplication (A Special Case)

A vector is just a single row or column, like a student's grades.

Take a 2×2 matrix and a vector (a 1D array of 2 numbers):

A = np.array([[1, 2], 
              [3, 4]])  # Shape (2,2)

v = np.array([5, 6])  # Shape (2,)

C = np.matmul(A, v)
print(C)

What’s Happening?

The vector meets every row in A.
Each row's elements are multiplied by the matching elements of v, and the products are added up.

Row from A	Vector v	Result
`[1, 2]`	`[5, 6]`	`1*5 + 2*6 = 17`
`[3, 4]`	`[5, 6]`	`3*5 + 4*6 = 39`

Final result:

[17, 39]
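
One detail worth seeing in code: with a 1D vector, the side you put it on matters. The sketch below uses the same A and v; the names `right` and `left` are only for this illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
v = np.array([5, 6])

# v on the right: each ROW of A is combined with v
right = np.matmul(A, v)
print(right)   # [17 39]

# v on the left: v is combined with each COLUMN of A
left = np.matmul(v, A)
print(left)    # [23 34]
```

With v on the left, the sums are `5*1 + 6*3 = 23` and `5*2 + 6*4 = 34`, which is why the two results differ.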

💡 Think of it as:

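A restaurant rating system where each row holds one restaurant's scores and the vector holds your personal weight for each criterion. The example above has two criteria (say, food and service); adding ambiance would mean a third column in A and a third entry in v.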
The multiplication computes your overall score for each restaurant.
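
Here is a small sketch of the restaurant idea with three criteria. All the numbers are invented for illustration, and the names `ratings`, `weights`, and `scores` are hypothetical.

```python
import numpy as np

# Made-up scores: one row per restaurant,
# columns are food, service, ambiance.
ratings = np.array([[8, 6, 7],
                    [5, 9, 4]])   # shape (2, 3)

# Made-up personal weights: food matters most, ambiance least.
weights = np.array([3, 2, 1])     # shape (3,)

# One weighted total per restaurant
scores = np.matmul(ratings, weights)
print(scores)   # [43 37]
```

The first restaurant scores `8*3 + 6*2 + 7*1 = 43` and the second `5*3 + 9*2 + 4*1 = 37`, so with these weights the first one wins.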

Batch Multiplication (For AI & Deep Learning)

Imagine you’re running multiple experiments at once. Instead of multiplying one pair of matrices at a time, we multiply a whole batch of pairs in a single call.

import torch

A = torch.rand(2, 3, 4)  # Two (3×4) matrices
B = torch.rand(2, 4, 5)  # Two (4×5) matrices

C = torch.matmul(A, B)  # Output shape (2, 3, 5)
print(C.shape)

🔍 What’s happening?

A holds 2 matrices, each of shape (3×4).
B holds 2 matrices, each of shape (4×5).
Instead of multiplying them one by one, PyTorch pairs them up (first with first, second with second) and does them all at once!
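
To check that batching really is the same as doing each pair separately, here is a minimal sketch. It uses NumPy rather than PyTorch (a swap made for this example), since `np.matmul` treats leading dimensions as a batch in the same way. The names `batched` and `looped` are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded so the run is repeatable
A = rng.random((2, 3, 4))        # two 3×4 matrices
B = rng.random((2, 4, 5))        # two 4×5 matrices

# One call handles the whole batch
batched = np.matmul(A, B)

# The same work done one pair at a time, then stacked back together
looped = np.stack([np.matmul(A[i], B[i]) for i in range(A.shape[0])])

print(batched.shape)                  # (2, 3, 5)
print(np.allclose(batched, looped))   # True
```

The results match; the batched call just saves you from writing the loop, and the library can run the pairs more efficiently.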

💡 Think of it as:

Processing multiple images at the same time in deep learning.
Applying multiple transformations to different datasets at once.

Final Summary

✅ Matrix multiplication is just taking rows from one table and mixing them with columns from another.
✅ Different shapes just mean you need to match columns of the first to rows of the second.
✅ Vectors (1D arrays) can multiply matrices too, like getting a final score from weighted criteria.
✅ Batch multiplication is used in AI to process multiple calculations at once.