Matrix Multiplication
Matrix multiplication is just a way of combining two tables of numbers (matrices) to create a new table.
Each row of the first matrix talks to each column of the second matrix, like a group interview where every candidate (row) meets every interviewer (column).
The Smallest Example
Let’s take the simplest case: a 2×2 matrix multiplied by another 2×2 matrix.
Imagine this as matching salespeople with products. Each row is a salesperson, and each column is a product.
import numpy as np
A = np.array([[1, 2],
              [3, 4]])  # Salespeople data
B = np.array([[5, 6],
              [7, 8]])  # Product demand
C = np.matmul(A, B)
print(C)
What’s Happening?
Each row from A mixes with each column from B:
Row from A | Column from B | Result |
---|---|---|
[1, 2] | [5, 7] | 1*5 + 2*7 = 19 |
[1, 2] | [6, 8] | 1*6 + 2*8 = 22 |
[3, 4] | [5, 7] | 3*5 + 4*7 = 43 |
[3, 4] | [6, 8] | 3*6 + 4*8 = 50 |
So the final matrix is:
[[19, 22]
 [43, 50]]
💡 Think of it as:
- The first row in A is one salesperson's performance.
- The first column in B is one product's demand.
- The multiplication tells us how well they fit together.
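The row-meets-column rule from the table above can be spelled out with plain Python loops. This is only a minimal sketch for illustration: the helper name `matmul_by_hand` is hypothetical, and in practice `np.matmul` does the same job much faster.

```python
def matmul_by_hand(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    # Every row of `a` must have as many entries as `b` has rows.
    if any(len(row) != n_inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    result = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):            # pick a row from a
        for j in range(n_cols):        # pick a column from b
            for k in range(n_inner):   # multiply matching entries, add them up
                result[i][j] += a[i][k] * b[k][j]
    return result


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul_by_hand(A, B)
print(C)  # [[19, 22], [43, 50]]
```

The three nested loops mirror the three steps in the table: choose a row, choose a column, then multiply pairwise and sum.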
Different Shapes - Why Does Shape Matter?
You CANNOT multiply just any two matrices. The number of columns in the first matrix must equal the number of rows in the second.
Let’s try a 2×3 matrix multiplied by a 3×2 matrix:
A = np.array([[1, 2, 3],
              [4, 5, 6]])  # Shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])  # Shape (3, 2)
C = np.matmul(A, B)
print(C)
What’s Happening?
- A has 2 rows, 3 columns.
- B has 3 rows, 2 columns.
- Since A has 3 columns and B has 3 rows, we can multiply them! The result takes its row count from A and its column count from B, so it has shape (2, 2).
Row from A | Column from B | Result |
---|---|---|
[1, 2, 3] | [7, 9, 11] | 1*7 + 2*9 + 3*11 = 58 |
[1, 2, 3] | [8, 10, 12] | 1*8 + 2*10 + 3*12 = 64 |
[4, 5, 6] | [7, 9, 11] | 4*7 + 5*9 + 6*11 = 139 |
[4, 5, 6] | [8, 10, 12] | 4*8 + 5*10 + 6*12 = 154 |
Final result:
[[ 58,  64]
 [139, 154]]
💡 Think of it as:
- A classroom of students (rows) taking three subjects (columns).
- A grading system (matrix B) that converts subject scores into final grades.
- The multiplication converts raw scores into final results.
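The shape rule can be checked before multiplying. The sketch below uses a hypothetical helper, `result_shape`, that applies the rule to 2-D arrays, and then shows that `np.matmul` raises a `ValueError` when the inner dimensions disagree. The arrays of ones are placeholders; only their shapes matter here.

```python
import numpy as np


def result_shape(a, b):
    """Return the shape of a times b for 2-D arrays, or None if they cannot be multiplied."""
    if a.shape[1] != b.shape[0]:
        return None
    return (a.shape[0], b.shape[1])


A = np.ones((2, 3))
B = np.ones((3, 2))

print(result_shape(A, B))  # (2, 2): 3 columns match 3 rows
print(result_shape(A, A))  # None: 3 columns do not match 2 rows

try:
    np.matmul(A, A)  # (2, 3) times (2, 3) is not allowed
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```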
Vector-Matrix Multiplication (A Special Case)
A vector is just a single row or column, like a student's grades.
If we have a 2×2 matrix and a vector (1D array of 2 numbers):
A = np.array([[1, 2],
              [3, 4]])  # Shape (2, 2)
v = np.array([5, 6]) # Shape (2,)
C = np.matmul(A, v)
print(C)
What’s Happening?
- The vector meets every row in A.
- It multiplies each element and adds the results.
Row from A | Vector v | Result |
---|---|---|
[1, 2] | [5, 6] | 1*5 + 2*6 = 17 |
[3, 4] | [5, 6] | 3*5 + 4*6 = 39 |
Final result:
[17, 39]
💡 Think of it as:
- A restaurant rating system where each row is a restaurant and each column is a criterion (two here, say food and service). The vector holds your personal weight for each criterion.
- The multiplication computes your overall score for each restaurant.
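The restaurant analogy can be made concrete. This is a minimal sketch with made-up ratings and weights for two criteria (food and service), chosen so the shapes match the 2×2 example; none of these numbers come from real data. It also shows that the order of the arguments matters: with a 1D vector first, NumPy treats it as a row and combines it with the columns of the matrix instead.

```python
import numpy as np

# Hypothetical ratings: one row per restaurant, columns are (food, service).
ratings = np.array([[4.0, 2.0],
                    [2.0, 4.0]])
# Hypothetical personal weights for (food, service).
weights = np.array([0.75, 0.25])

scores = np.matmul(ratings, weights)  # one weighted score per restaurant
print(scores)  # [3.5 2.5]

# Order matters: compare matrix-times-vector with vector-times-matrix.
A = np.array([[1, 2],
              [3, 4]])
v = np.array([5, 6])
print(np.matmul(A, v))  # [17 39]  (v meets each row of A)
print(np.matmul(v, A))  # [23 34]  (v meets each column of A)
```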
Batch Multiplication (For AI & Deep Learning)
Imagine you’re running multiple experiments at once. Instead of multiplying one pair of matrices at a time, we multiply a whole batch of pairs in a single call.
import torch
A = torch.rand(2, 3, 4) # Two (3×4) matrices
B = torch.rand(2, 4, 5) # Two (4×5) matrices
C = torch.matmul(A, B) # Output shape (2, 3, 5)
print(C.shape)
🔍 What’s happening?
- A holds 2 matrices of shape (3×4).
- B holds 2 matrices of shape (4×5).
- Instead of multiplying them one by one, PyTorch multiplies the first matrix in A by the first in B and the second by the second, all at once, giving 2 matrices of shape (3×5).
💡 Think of it as:
- Processing multiple images at the same time in deep learning.
- Applying multiple transformations to different datasets at once.
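One way to convince yourself that batching is just "many ordinary multiplications at once" is to compare the batched result against an explicit loop. The sketch below uses NumPy rather than PyTorch as a stand-in, on the assumption that only NumPy is installed; `np.matmul` treats the leading dimension as a batch in the same way `torch.matmul` does for these shapes. The random inputs are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3, 4))  # two 3x4 matrices
B = rng.random((2, 4, 5))  # two 4x5 matrices

# One call multiplies A[0] by B[0] and A[1] by B[1].
batched = np.matmul(A, B)

# The same work done one pair at a time, then stacked back together.
looped = np.stack([np.matmul(A[i], B[i]) for i in range(A.shape[0])])

print(batched.shape)                 # (2, 3, 5)
print(np.allclose(batched, looped))  # True
```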
Final Summary
✅ Matrix multiplication is just taking rows from one table and mixing them with columns from another.
✅ Different shapes work as long as the number of columns in the first matrix equals the number of rows in the second.
✅ Vectors (1D arrays) can multiply matrices too, like getting a final score from weighted criteria.
✅ Batch multiplication is used in AI to process multiple calculations at once.