Linear Algebra

The mathematics of vectors, matrices, and linear transformations. Linear algebra is the backbone of modern computation, powering everything from computer graphics and machine learning to quantum mechanics and economic modeling.

Introduction to Linear Algebra

Linear algebra is one of the most widely applicable branches of mathematics. While algebra deals with single unknown quantities, linear algebra deals with collections of unknowns organized into vectors and matrices, and it studies the linear relationships between them.

The word "linear" refers to straight lines and flat planes — the simplest geometric objects. A linear equation in variables x₁, x₂, …, xₙ has the form:

a₁x₁ + a₂x₂ + … + aₙxₙ = b

where a₁, a₂, …, aₙ and b are constants. There are no squares, cubes, products of variables, or other nonlinear terms — just variables multiplied by constants and added together.

Linear algebra asks — and answers — fundamental questions such as:

  • Does a system of linear equations have a solution? If so, is it unique?
  • How can we efficiently represent and solve systems with hundreds or millions of unknowns?
  • What happens when we apply a linear transformation to a geometric object?
  • What are the "natural directions" associated with a matrix (eigenvectors)?

Linear algebra is arguably the single most useful branch of mathematics for the 21st century. It underpins machine learning, data science, computer graphics, signal processing, quantum computing, optimization, and much more. Mastering it will open doors across science and engineering.

The subject was developed over centuries, with key contributions from mathematicians including Carl Friedrich Gauss (elimination methods), Arthur Cayley (matrix theory), Hermann Grassmann (vector spaces), and David Hilbert (abstract spaces). Today, linear algebra is typically the first mathematics course that moves from calculation to abstraction and proof.

Vectors

A vector is a mathematical object that has both magnitude (length) and direction. Geometrically, a vector is an arrow pointing from one location to another. Algebraically, a vector is an ordered list of numbers called components.

Notation and Representation

A vector in ℝⁿ (n-dimensional real space) is written as:

v = (v₁, v₂, …, vₙ) or v = [v₁, v₂, …, vₙ]ᵀ

For example, the vector v = (3, 4) in ℝ² represents a point or an arrow from the origin to the point (3, 4) in the plane.

Common notations for vectors include boldface (v), an arrow above the letter (v⃗), or an underline. We'll use boldface throughout this lesson.

Vector Addition

Vectors of the same dimension are added component by component:

u + v = (u₁ + v₁, u₂ + v₂, …, uₙ + vₙ)

Example: Add (2, 5, -1) and (3, -2, 4)

(2, 5, -1) + (3, -2, 4) = (2+3, 5+(-2), -1+4) = (5, 3, 3)

Geometrically, adding two vectors corresponds to placing them tip-to-tail and drawing the arrow from the start of the first to the end of the second. Equivalently, u + v is the diagonal of the parallelogram with sides u and v (the parallelogram rule).

Scalar Multiplication

Multiplying a vector by a scalar (a real number) scales every component:

cv = (cv₁, cv₂, …, cvₙ)

Example: Compute 3 · (2, -1, 4)

3 · (2, -1, 4) = (3·2, 3·(-1), 3·4) = (6, -3, 12)

If c > 1, the vector stretches. If 0 < c < 1, it shrinks. If c < 0, it reverses direction and scales by |c|.
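
These component-wise rules translate directly into code. Here is a minimal sketch in plain Python (the function names are ours, not a standard library API):

```python
def vec_add(u, v):
    """Add two vectors of the same dimension component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [ui + vi for ui, vi in zip(u, v)]

def scalar_mul(c, v):
    """Scale every component of v by the scalar c."""
    return [c * vi for vi in v]

print(vec_add([2, 5, -1], [3, -2, 4]))  # [5, 3, 3], matching the example above
print(scalar_mul(3, [2, -1, 4]))        # [6, -3, 12]
```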

Magnitude (Length) of a Vector

The magnitude (or norm) of a vector v = (v₁, v₂, …, vₙ) is:

‖v‖ = √(v₁² + v₂² + … + vₙ²)

Example: Find the magnitude of (3, 4)

‖(3, 4)‖ = √(3² + 4²) = √(9 + 16) = √25 = 5

Unit Vectors

A unit vector has magnitude 1. To find the unit vector in the direction of v, divide by its magnitude:

û = v / ‖v‖

Example: Find the unit vector in the direction of (3, 4)

û = (3, 4) / 5 = (3/5, 4/5) = (0.6, 0.8)

Check: ‖(0.6, 0.8)‖ = √(0.36 + 0.64) = √1 = 1 ✓

The standard unit vectors in ℝ³ are i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1).
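
The magnitude formula and normalization can be sketched in a few lines of Python (illustrative helper names, assuming a nonzero input to `unit`):

```python
import math

def norm(v):
    """Euclidean magnitude: the square root of the sum of squared components."""
    return math.sqrt(sum(vi * vi for vi in v))

def unit(v):
    """Unit vector in the direction of a nonzero vector v."""
    n = norm(v)
    return [vi / n for vi in v]

print(norm([3, 4]))  # 5.0
print(unit([3, 4]))  # [0.6, 0.8]
```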

Dot Product (Scalar Product)

The dot product of two vectors is a scalar quantity defined as:

u · v = u₁v₁ + u₂v₂ + … + uₙvₙ

The dot product can also be expressed geometrically:

u · v = ‖u‖ ‖v‖ cos θ

where θ is the angle between the two vectors.

Example: Compute the dot product of (1, 2, 3) and (4, -5, 6)

(1)(4) + (2)(-5) + (3)(6) = 4 - 10 + 18 = 12

Two vectors are orthogonal (perpendicular) if and only if their dot product equals zero: u · v = 0. This is one of the most important tests in linear algebra.

Finding the Angle Between Vectors

Rearranging the geometric dot product formula gives:

cos θ = (u · v) / (‖u‖ ‖v‖)

Example: Find the angle between (1, 0) and (1, 1)

u · v = (1)(1) + (0)(1) = 1

‖u‖ = 1, ‖v‖ = √2

cos θ = 1 / (1 · √2) = 1/√2

θ = arccos(1/√2) = 45° (π/4 radians)
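
Both the dot product and the angle formula are straightforward to compute. A small Python sketch (our own helper names, not a library API):

```python
import math

def dot(u, v):
    """Sum of products of corresponding components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def angle(u, v):
    """Angle between u and v in radians, from cos θ = (u·v)/(‖u‖‖v‖)."""
    norms = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return math.acos(dot(u, v) / norms)

print(dot([1, 2, 3], [4, -5, 6]))           # 12
print(math.degrees(angle([1, 0], [1, 1])))  # ≈ 45.0
```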

Cross Product (Vector Product)

The cross product is defined only in ℝ³ and produces a new vector that is perpendicular to both input vectors:

u × v = (u₂v₃ - u₃v₂, u₃v₁ - u₁v₃, u₁v₂ - u₂v₁)

Example: Compute (1, 2, 3) × (4, 5, 6)

First component: (2)(6) - (3)(5) = 12 - 15 = -3

Second component: (3)(4) - (1)(6) = 12 - 6 = 6

Third component: (1)(5) - (2)(4) = 5 - 8 = -3

u × v = (-3, 6, -3)

Verify orthogonality:

(-3, 6, -3) · (1, 2, 3) = -3 + 12 - 9 = 0 ✓

(-3, 6, -3) · (4, 5, 6) = -12 + 30 - 18 = 0 ✓

The magnitude of the cross product equals the area of the parallelogram formed by the two vectors:

‖u × v‖ = ‖u‖ ‖v‖ sin θ

The cross product obeys the right-hand rule: if you curl the fingers of your right hand from u toward v, your thumb points in the direction of u × v. Note that the cross product is anti-commutative: u × v = −(v × u).
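
The component formula is easy to encode and to check against the orthogonality property. A minimal sketch in Python:

```python
def cross(u, v):
    """Cross product in R^3, using the component formula above."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

w = cross([1, 2, 3], [4, 5, 6])
print(w)  # [-3, 6, -3]
# Orthogonality check: the dot product with each input vector is zero
print(sum(a * b for a, b in zip(w, [1, 2, 3])))  # 0
```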

Matrices

A matrix is a rectangular array of numbers arranged in rows and columns. An m × n matrix has m rows and n columns. Matrices are the central data structure of linear algebra and provide a compact way to represent systems of equations, transformations, and data.

Matrix Notation

A general m × n matrix A is written as:

A = [a₁₁ a₁₂ … a₁ₙ]
    [a₂₁ a₂₂ … a₂ₙ]
    [ ⋮ ⋮ ⋱ ⋮ ]
    [aₘ₁ aₘ₂ … aₘₙ]

The entry in row i and column j is denoted aᵢⱼ.

Types of Matrices

  • Row vector: A 1 × n matrix (a single row)
  • Column vector: An m × 1 matrix (a single column)
  • Square matrix: An n × n matrix (equal rows and columns)
  • Zero matrix: All entries are 0
  • Identity matrix (I): A square matrix with 1s on the main diagonal and 0s elsewhere
  • Diagonal matrix: A square matrix where all non-diagonal entries are 0
  • Symmetric matrix: A square matrix where A = Aᵀ (i.e., aᵢⱼ = aⱼᵢ for all i, j)
  • Upper triangular: All entries below the main diagonal are 0
  • Lower triangular: All entries above the main diagonal are 0

Matrix Addition and Scalar Multiplication

Matrices of the same dimensions are added entry by entry, and scalar multiplication scales every entry:

(A + B)ᵢⱼ = aᵢⱼ + bᵢⱼ
(cA)ᵢⱼ = c · aᵢⱼ

Matrix Multiplication

The product AB is defined when the number of columns in A equals the number of rows in B. If A is m × n and B is n × p, then AB is m × p:

(AB)ᵢⱼ = Σₖ aᵢₖ bₖⱼ = aᵢ₁b₁ⱼ + aᵢ₂b₂ⱼ + … + aᵢₙbₙⱼ

Each entry of AB is the dot product of the corresponding row of A with the corresponding column of B.

Example: Matrix Multiplication

Let A = [1 2] and B = [5 6]

        [3 4]        [7 8]

AB:

(AB)₁₁ = (1)(5) + (2)(7) = 5 + 14 = 19

(AB)₁₂ = (1)(6) + (2)(8) = 6 + 16 = 22

(AB)₂₁ = (3)(5) + (4)(7) = 15 + 28 = 43

(AB)₂₂ = (3)(6) + (4)(8) = 18 + 32 = 50

AB = [19 22]

     [43 50]

Matrix multiplication is NOT commutative: in general, AB ≠ BA. Even when both products are defined, they usually give different results. However, matrix multiplication is associative: (AB)C = A(BC).
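
The row-times-column rule can be written directly as nested sums. A sketch in plain Python, with matrices stored as lists of rows (the function name is ours):

```python
def mat_mul(A, B):
    """Multiply an m×n matrix by an n×p matrix (matrices as lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("inner dimensions must agree")
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
print(mat_mul(A, B))  # [[19, 22], [43, 50]]
print(mat_mul(B, A))  # [[23, 34], [31, 46]] — AB ≠ BA
```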

Transpose

The transpose of an m × n matrix A, written Aᵀ, is the n × m matrix obtained by swapping rows and columns:

(Aᵀ)ᵢⱼ = aⱼᵢ

Example: Find the Transpose

A = [1 2 3]

    [4 5 6]

Aᵀ = [1 4]

     [2 5]

     [3 6]

Key transpose properties:

  • (Aᵀ)ᵀ = A
  • (A + B)ᵀ = Aᵀ + Bᵀ
  • (cA)ᵀ = cAᵀ
  • (AB)ᵀ = BᵀAᵀ (note the reversal of order!)
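
With matrices as lists of rows, transposition is a one-liner in Python, and the property (Aᵀ)ᵀ = A is easy to confirm:

```python
def transpose(A):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3], [4, 5, 6]]
print(transpose(A))             # [[1, 4], [2, 5], [3, 6]]
print(transpose(transpose(A)))  # (Aᵀ)ᵀ = A
```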

Systems of Linear Equations

A system of linear equations is a collection of one or more linear equations involving the same set of variables. Linear algebra provides systematic methods for solving such systems efficiently, even when they involve thousands of variables.

Matrix Form of a System

Any system of linear equations can be written in matrix form as:

Ax = b

where A is the coefficient matrix, x is the vector of unknowns, and b is the vector of constants.

Example: Writing a System in Matrix Form

The system:

  2x + 3y = 7

  x − y = 1

In matrix form:

[2 3] [x] [7]

[1 -1] [y] = [1]

Augmented Matrix

To solve a system, we form the augmented matrix [A | b] by appending the constants column to the coefficient matrix:

[A | b] = [2 3 | 7]
          [1 -1 | 1]

Gaussian Elimination

Gaussian elimination is the fundamental algorithm for solving systems of linear equations. It uses three elementary row operations to systematically reduce the augmented matrix:

  1. Row swap: Interchange two rows (Rᵢ ↔ Rⱼ)
  2. Row scaling: Multiply a row by a nonzero constant (Rᵢ → cRᵢ)
  3. Row replacement: Add a multiple of one row to another (Rᵢ → Rᵢ + cRⱼ)

Example: Solve using Gaussian Elimination

System:

  x + y + z = 6

  2x + 3y + z = 14

  x + 2y + 3z = 16

Step 1: Form augmented matrix:

[1 1 1 | 6]

[2 3 1 | 14]

[1 2 3 | 16]

Step 2: R₂ → R₂ - 2R₁:

[1 1 1 | 6]

[0 1 -1 | 2]

[1 2 3 | 16]

Step 3: R₃ → R₃ - R₁:

[1 1 1 | 6]

[0 1 -1 | 2]

[0 1 2 | 10]

Step 4: R₃ → R₃ - R₂:

[1 1 1 | 6]

[0 1 -1 | 2]

[0 0 3 | 8]

Step 5: Back-substitute:

From R₃: 3z = 8 → z = 8/3

From R₂: y - z = 2 → y = 2 + 8/3 = 14/3

From R₁: x + y + z = 6 → x = 6 - 14/3 - 8/3 = 6 - 22/3 = -4/3

Solution: x = -4/3, y = 14/3, z = 8/3
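
The elimination-then-back-substitution procedure above can be sketched in Python. This is an illustrative implementation (our own function name), which also swaps in the largest available pivot at each step — a standard refinement for numerical stability not needed in the hand calculation:

```python
def gauss_solve(aug):
    """Solve a square system from its augmented matrix [A | b] by forward
    elimination with partial pivoting, then back-substitution.
    Assumes a unique solution exists."""
    n = len(aug)
    M = [row[:] for row in aug]  # work on a copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):  # eliminate entries below the pivot
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):   # back-substitute from the last row up
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

print(gauss_solve([[1, 1, 1, 6], [2, 3, 1, 14], [1, 2, 3, 16]]))
# ≈ [-4/3, 14/3, 8/3], matching the worked example
```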

Row Echelon Form (REF)

A matrix is in row echelon form if:

  1. All rows consisting entirely of zeros are at the bottom
  2. The first nonzero entry (called the pivot) in each nonzero row is to the right of the pivot in the row above
  3. All entries below a pivot are zero

Reduced Row Echelon Form (RREF)

A matrix is in reduced row echelon form if it satisfies all REF conditions plus:

  1. Each pivot is 1
  2. Each pivot is the only nonzero entry in its column

Example: RREF

The matrix:

[1 0 0 | 2]

[0 1 0 | 3]

[0 0 1 | 5]

is in RREF. The solution is immediately readable: x = 2, y = 3, z = 5.

Every matrix has a unique reduced row echelon form. This is extremely useful because it means the RREF gives a canonical description of the solution set. The process of converting REF to RREF is called Gauss-Jordan elimination.

Types of Solutions

A system of linear equations has exactly one of three possibilities:

  • One unique solution: Every variable is a pivot variable (the system is consistent and determined)
  • Infinitely many solutions: The system is consistent but has free variables (underdetermined)
  • No solution: The system is inconsistent — a row of the form [0 0 … 0 | b] with b ≠ 0 appears

Determinants

The determinant is a scalar value associated with every square matrix. It encodes important information about the matrix: whether it's invertible, how it scales area/volume, and more. The determinant of matrix A is written det(A) or |A|.

2 × 2 Determinant

For a 2 × 2 matrix:

det [a b] = ad - bc
    [c d]

Example: 2 × 2 Determinant

det [3 7] = (3)(2) - (7)(1) = 6 - 7 = -1

    [1 2]

3 × 3 Determinant (Cofactor Expansion)

For a 3 × 3 matrix, we expand along the first row using cofactor expansion:

det [a b c]
    [d e f] = a(ei - fh) - b(di - fg) + c(dh - eg)
    [g h i]

Example: 3 × 3 Determinant

Find det(A) where:

A = [2 1 -1]

    [3 0 2]

    [1 4 -3]

Expanding along the first row:

det(A) = 2 · det[0 2; 4 -3] − 1 · det[3 2; 1 -3] + (−1) · det[3 0; 1 4]

= 2[(0)(-3) - (2)(4)] - 1[(3)(-3) - (2)(1)] + (-1)[(3)(4) - (0)(1)]

= 2[0 - 8] - 1[-9 - 2] + (-1)[12 - 0]

= 2(-8) - 1(-11) + (-1)(12)

= -16 + 11 - 12

= -17
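
Cofactor expansion along the first row generalizes to any size via recursion on minors. A sketch in Python (fine for small matrices — the cost grows factorially, so practical code uses elimination instead):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[3, 7], [1, 2]]))                     # -1
print(det([[2, 1, -1], [3, 0, 2], [1, 4, -3]]))  # -17
```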

Properties of Determinants

  • det(I) = 1 — the identity matrix has determinant 1
  • det(Aᵀ) = det(A) — transposing doesn't change the determinant
  • det(AB) = det(A) · det(B) — determinants multiply
  • det(cA) = cⁿ det(A) — for an n × n matrix A
  • Swapping two rows changes the sign of the determinant
  • Multiplying a row by c multiplies the determinant by c
  • Adding a multiple of one row to another does not change the determinant
  • If any row (or column) is all zeros, the determinant is 0
  • If two rows (or columns) are identical, the determinant is 0

Geometrically, the absolute value of the determinant of a 2 × 2 matrix gives the area of the parallelogram formed by its column vectors. For a 3 × 3 matrix, it gives the volume of the parallelepiped. A negative determinant indicates that the transformation reverses orientation.

Determinant and Invertibility

A square matrix A is invertible (nonsingular) if and only if det(A) ≠ 0. If det(A) = 0, the matrix is singular — it squashes space into a lower dimension, losing information irreversibly.

Matrix Inverse

An n × n matrix A is invertible if there exists a matrix A⁻¹ such that:

AA⁻¹ = A⁻¹A = I

where I is the n × n identity matrix. The inverse "undoes" the action of A.

Inverse of a 2 × 2 Matrix

If A = [a b; c d] and det(A) = ad - bc ≠ 0, then:

A⁻¹ = (1/det(A)) [ d -b]
                 [-c a]

Example: Find the Inverse of a 2 × 2 Matrix

A = [4 7]

    [2 6]

det(A) = (4)(6) - (7)(2) = 24 - 14 = 10

A⁻¹ = (1/10) [ 6 -7] = [ 0.6 -0.7]

             [-2 4]   [-0.2 0.4]

Verify:

AA⁻¹ = [4 7] [ 0.6 -0.7] = [4(0.6)+7(-0.2) 4(-0.7)+7(0.4)]

       [2 6] [-0.2 0.4]   [2(0.6)+6(-0.2) 2(-0.7)+6(0.4)]

= [2.4-1.4 -2.8+2.8] = [1 0] ✓

  [1.2-1.2 -1.4+2.4]   [0 1]
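
The 2 × 2 adjugate formula is short enough to code directly. A Python sketch (our own function name), raising an error in the singular case:

```python
def inv2(A):
    """Inverse of a 2×2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

print(inv2([[4, 7], [2, 6]]))  # [[0.6, -0.7], [-0.2, 0.4]]
```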

Finding the Inverse Using Row Reduction

For larger matrices, augment A with the identity matrix and row reduce to RREF:

[A | I] → row operations → [I | A⁻¹]

If the left side reduces to I, the right side is A⁻¹. If the left side cannot be reduced to I (you get a row of zeros), then A is not invertible.

Example: Finding a 3 × 3 Inverse by Row Reduction

Find the inverse of:

A = [1 0 1]

    [0 1 1]

    [1 1 0]

Form [A | I]:

[1 0 1 | 1 0 0]

[0 1 1 | 0 1 0]

[1 1 0 | 0 0 1]

R₃ → R₃ - R₁:

[1 0 1 | 1 0 0]

[0 1 1 | 0 1 0]

[0 1 -1 | -1 0 1]

R₃ → R₃ - R₂:

[1 0 1 | 1 0 0]

[0 1 1 | 0 1 0]

[0 0 -2 | -1 -1 1]

R₃ → R₃ / (-2):

[1 0 1 | 1 0 0]

[0 1 1 | 0 1 0]

[0 0 1 | 1/2 1/2 -1/2]

R₁ → R₁ - R₃ and R₂ → R₂ - R₃:

[1 0 0 | 1/2 -1/2 1/2]

[0 1 0 | -1/2 1/2 1/2]

[0 0 1 | 1/2 1/2 -1/2]

A⁻¹ = (1/2) [ 1 -1 1]

            [-1 1 1]

            [ 1 1 -1]

Properties of the Inverse

  • (A⁻¹)⁻¹ = A
  • (AB)⁻¹ = B⁻¹A⁻¹ (note the reversal!)
  • (Aᵀ)⁻¹ = (A⁻¹)ᵀ
  • det(A⁻¹) = 1/det(A)
  • (cA)⁻¹ = (1/c)A⁻¹

Cramer's Rule

Cramer's Rule provides an explicit formula for the solution of a system Ax = b when A is invertible. For each variable xᵢ:

xᵢ = det(Aᵢ) / det(A)

where Aᵢ is the matrix formed by replacing the i-th column of A with b.

Example: Solve Using Cramer's Rule

System:

  3x + 2y = 16

  x + 5y = 18

A = [3 2], b = [16]

    [1 5]        [18]

det(A) = (3)(5) - (2)(1) = 15 - 2 = 13

A₁ (replace column 1 with b):

det [16 2] = (16)(5) - (2)(18) = 80 - 36 = 44

    [18 5]

A₂ (replace column 2 with b):

det [3 16] = (3)(18) - (16)(1) = 54 - 16 = 38

    [1 18]

x = 44/13, y = 38/13

Solution: x = 44/13 ≈ 3.385, y = 38/13 ≈ 2.923

Cramer's Rule is elegant and useful for small systems or theoretical work, but it's computationally expensive for large systems. In practice, Gaussian elimination is far more efficient for systems with more than a few variables.
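
For the 2 × 2 case, Cramer's Rule is a few lines of Python (illustrative helper names). Each unknown is a ratio of determinants:

```python
def det2(A):
    """Determinant of a 2×2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def cramer2(A, b):
    """Solve a 2×2 system Ax = b by Cramer's Rule: each xᵢ is the
    determinant of A with column i replaced by b, divided by det(A)."""
    d = det2(A)
    if d == 0:
        raise ValueError("Cramer's Rule requires an invertible matrix")
    A1 = [[b[0], A[0][1]], [b[1], A[1][1]]]  # column 1 replaced by b
    A2 = [[A[0][0], b[0]], [A[1][0], b[1]]]  # column 2 replaced by b
    return [det2(A1) / d, det2(A2) / d]

print(cramer2([[3, 2], [1, 5]], [16, 18]))  # [44/13, 38/13] ≈ [3.385, 2.923]
```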

Vector Spaces

A vector space (over ℝ) is a set V together with two operations — addition and scalar multiplication — that satisfy ten axioms. Vector spaces abstract the familiar properties of arrows in the plane to much more general settings.

Axioms of a Vector Space

A set V with operations + (addition) and · (scalar multiplication) is a vector space if for all u, v, w ∈ V and scalars c, d ∈ ℝ:

  1. u + v ∈ V (closure under addition)
  2. u + v = v + u (commutativity)
  3. (u + v) + w = u + (v + w) (associativity of addition)
  4. There exists a zero vector 0 such that v + 0 = v
  5. For each v, there exists -v such that v + (-v) = 0
  6. cv ∈ V (closure under scalar multiplication)
  7. c(u + v) = cu + cv (distributivity over vector addition)
  8. (c + d)v = cv + dv (distributivity over scalar addition)
  9. c(dv) = (cd)v (associativity of scalar multiplication)
  10. 1v = v (multiplicative identity)

Common examples of vector spaces include:

  • ℝⁿ — the set of all n-tuples of real numbers
  • Pₙ — the set of all polynomials of degree at most n
  • M₂ₓ₂ — the set of all 2 × 2 matrices with real entries
  • C[a,b] — the set of all continuous functions on [a,b]

Subspaces

A subspace of a vector space V is a nonempty subset W ⊆ V that is itself a vector space under the same operations. To verify W is a subspace, check three conditions:

  1. The zero vector is in W
  2. W is closed under addition: if u, v ∈ W, then u + v ∈ W
  3. W is closed under scalar multiplication: if v ∈ W and c ∈ ℝ, then cv ∈ W

Example: Subspace Verification

Is W = {(x, y, z) ∈ ℝ³ : x + y + z = 0} a subspace of ℝ³?

Zero vector: (0, 0, 0) → 0 + 0 + 0 = 0 ✓

Addition: If (x₁, y₁, z₁) and (x₂, y₂, z₂) are in W, then:

  (x₁+x₂) + (y₁+y₂) + (z₁+z₂) = (x₁+y₁+z₁) + (x₂+y₂+z₂) = 0 + 0 = 0 ✓

Scalar multiplication: If (x, y, z) ∈ W and c ∈ ℝ, then:

  cx + cy + cz = c(x + y + z) = c(0) = 0 ✓

W is a subspace of ℝ³.

Linear Combinations and Span

A linear combination of vectors v₁, v₂, …, vₖ is any sum of the form:

c₁v₁ + c₂v₂ + … + cₖvₖ

where c₁, c₂, …, cₖ are scalars.

The span of a set of vectors is the set of all possible linear combinations of those vectors. It is the "smallest" subspace containing all the given vectors:

span{v₁, v₂, …, vₖ} = {c₁v₁ + c₂v₂ + … + cₖvₖ : c₁, …, cₖ ∈ ℝ}

Example: Span in ℝ²

Let v₁ = (1, 0) and v₂ = (0, 1).

span{v₁, v₂} = {c₁(1,0) + c₂(0,1)} = {(c₁, c₂) : c₁, c₂ ∈ ℝ} = ℝ²

These two vectors span all of ℝ² because any point (a, b) can be written as a(1,0) + b(0,1).

Linear Independence

A set of vectors {v₁, v₂, …, vₖ} is linearly independent if the only solution to:

c₁v₁ + c₂v₂ + … + cₖvₖ = 0

is c₁ = c₂ = … = cₖ = 0. Otherwise, the set is linearly dependent, meaning at least one vector can be expressed as a linear combination of the others.

Example: Testing Linear Independence

Are v₁ = (1, 2, 0), v₂ = (0, 1, 1), v₃ = (1, 0, -2) linearly independent?

We need to determine if c₁(1,2,0) + c₂(0,1,1) + c₃(1,0,-2) = (0,0,0) has only the trivial solution.

This gives the system:

  c₁ + c₃ = 0

  2c₁ + c₂ = 0

  c₂ - 2c₃ = 0

Form the augmented matrix and row reduce:

[1 0 1 | 0]

[2 1 0 | 0]

[0 1 -2 | 0]

R₂ → R₂ - 2R₁:

[1 0 1 | 0]

[0 1 -2 | 0]

[0 1 -2 | 0]

R₃ → R₃ - R₂:

[1 0 1 | 0]

[0 1 -2 | 0]

[0 0 0 | 0]

There's a free variable (c₃), so the system has nontrivial solutions. The vectors are linearly dependent.

Setting c₃ = 1: c₁ = -1, c₂ = 2, so v₃ = v₁ - 2v₂.
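
The independence test amounts to a rank computation: vectors are independent exactly when the rank equals the number of vectors. A sketch in Python using simple Gauss-Jordan elimination (helper names are ours; the tolerance handles floating-point round-off):

```python
def rank(rows, tol=1e-12):
    """Rank via Gauss-Jordan elimination on a copy of the rows."""
    M = [[float(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if abs(M[r][col]) > tol), None)
        if piv is None:
            continue  # no pivot in this column
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(len(M)):
            if r != rk:
                f = M[r][col] / M[rk][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def independent(vectors):
    """Vectors are linearly independent iff the rank equals their count."""
    return rank(vectors) == len(vectors)

print(independent([[1, 2, 0], [0, 1, 1], [1, 0, -2]]))  # False: v₃ = v₁ - 2v₂
```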

Basis and Dimension

A basis for a vector space V is a set of vectors that is:

  1. Linearly independent
  2. Spans V

A basis is a minimal spanning set — every vector in V can be written uniquely as a linear combination of the basis vectors.

The dimension of a vector space is the number of vectors in any basis (all bases have the same size).

Example: Standard Basis for ℝ³

The standard basis for ℝ³ is:

e₁ = (1, 0, 0), e₂ = (0, 1, 0), e₃ = (0, 0, 1)

These are linearly independent and span all of ℝ³, so dim(ℝ³) = 3.

However, there are infinitely many other valid bases, such as:

{(1, 1, 0), (1, 0, 1), (0, 1, 1)}

Any set of three linearly independent vectors in ℝ³ forms a basis.

Key dimension facts: dim(ℝⁿ) = n, dim(Pₙ) = n + 1 (the basis is {1, x, x², …, xⁿ}), and dim(M₂ₓ₂) = 4. The Rank-Nullity Theorem states that for any linear transformation T: V → W, dim(V) = rank(T) + nullity(T).

Linear Transformations

A linear transformation is a function T: V → W between two vector spaces that preserves the operations of addition and scalar multiplication:

T(u + v) = T(u) + T(v)
T(cv) = cT(v)

Equivalently, a function is linear if and only if T(c₁v₁ + c₂v₂) = c₁T(v₁) + c₂T(v₂) for all vectors and scalars.

Linear transformations include familiar operations like:

  • Rotations — spinning vectors around the origin
  • Reflections — flipping vectors across a line or plane
  • Scaling — stretching or compressing vectors
  • Projections — "flattening" vectors onto a subspace
  • Differentiation — T(f) = f' is a linear transformation on the space of polynomials

Example: Verifying Linearity

Is T(x, y) = (2x + y, x - 3y) a linear transformation?

Let u = (x₁, y₁) and v = (x₂, y₂):

T(u + v) = T(x₁+x₂, y₁+y₂) = (2(x₁+x₂) + (y₁+y₂), (x₁+x₂) - 3(y₁+y₂))

= (2x₁+y₁ + 2x₂+y₂, x₁-3y₁ + x₂-3y₂)

= (2x₁+y₁, x₁-3y₁) + (2x₂+y₂, x₂-3y₂)

= T(u) + T(v) ✓

T(cu) = T(cx₁, cy₁) = (2cx₁+cy₁, cx₁-3cy₁) = c(2x₁+y₁, x₁-3y₁) = cT(u) ✓

Yes, T is linear.

Matrix Representation

Every linear transformation T: ℝⁿ → ℝᵐ can be represented as multiplication by an m × n matrix A:

T(x) = Ax

The matrix A is found by applying T to each standard basis vector and using the results as columns:

A = [T(e₁) | T(e₂) | … | T(eₙ)]

Example: Finding the Matrix of a Linear Transformation

Find the matrix for T(x, y) = (2x + y, x - 3y).

T(e₁) = T(1, 0) = (2(1)+0, 1-3(0)) = (2, 1)

T(e₂) = T(0, 1) = (2(0)+1, 0-3(1)) = (1, -3)

A = [2 1]

    [1 -3]

Verify: A(3, 2)ᵀ = [2 1][3] = [2(3)+1(2)] = [8]

                 [1 -3][2]   [1(3)-3(2)]   [-3]

T(3, 2) = (2(3)+2, 3-3(2)) = (8, -3) ✓
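
The recipe — apply T to each standard basis vector, collect the images as columns — works for any linear map on ℝⁿ and can be sketched generically in Python (the function name is ours):

```python
def matrix_of(T, n):
    """Matrix of a linear map T: R^n → R^m, built column by column from
    the images of the standard basis vectors."""
    cols = [T([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

T = lambda v: [2 * v[0] + v[1], v[0] - 3 * v[1]]
print(matrix_of(T, 2))  # [[2, 1], [1, -3]]
```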

Common Transformation Matrices in ℝ²

Here are standard transformation matrices for geometric operations:

  • Rotation by angle θ: [cos θ -sin θ; sin θ cos θ]
  • Reflection across x-axis: [1 0; 0 -1]
  • Reflection across y-axis: [-1 0; 0 1]
  • Reflection across y = x: [0 1; 1 0]
  • Scaling by factors a, b: [a 0; 0 b]
  • Horizontal shear: [1 k; 0 1]
  • Projection onto x-axis: [1 0; 0 0]

Kernel and Image

Two fundamental subspaces are associated with every linear transformation T: V → W:

The kernel (or null space) of T is the set of all vectors that T maps to zero:

ker(T) = {v ∈ V : T(v) = 0}

The image (or range) of T is the set of all possible output vectors:

im(T) = {T(v) : v ∈ V}

Example: Finding the Kernel

Find the kernel of T(x, y, z) = (x + y, y + z).

The matrix is A = [1 1 0]

                  [0 1 1]

Solve Ax = 0:

[1 1 0 | 0]

[0 1 1 | 0]

R₁ → R₁ - R₂:

[1 0 -1 | 0]

[0 1 1 | 0]

From RREF: x = z, y = -z. Setting z = t:

ker(T) = {t(1, -1, 1) : t ∈ ℝ}

The kernel is a line in ℝ³ through the origin. The nullity is 1.

A linear transformation T is one-to-one (injective) if and only if ker(T) = {0}. It is onto (surjective) if im(T) = W. The Rank-Nullity Theorem guarantees: dim(V) = dim(ker(T)) + dim(im(T)).

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors are among the most important concepts in linear algebra. They reveal the "natural" behavior of a linear transformation — the directions that are merely scaled, not rotated.

Definition

Let A be an n × n matrix. A nonzero vector v is an eigenvector of A if:

Av = λv

for some scalar λ. The scalar λ is called the corresponding eigenvalue. In other words, multiplying v by A simply scales v by the factor λ — the direction of v is preserved (or reversed if λ < 0).

The Characteristic Equation

To find eigenvalues, we rearrange Av = λv:

(A - λI)v = 0

For a nonzero solution v to exist, the matrix (A - λI) must be singular, which means:

det(A - λI) = 0

This is the characteristic equation. The left side, when expanded, is a polynomial of degree n in λ called the characteristic polynomial.

Example: Find Eigenvalues and Eigenvectors of a 2 × 2 Matrix

A = [4 1]

    [2 3]

Step 1: Find eigenvalues.

det(A - λI) = det [4-λ 1 ] = (4-λ)(3-λ) - (1)(2)

                  [ 2 3-λ]

= λ² - 7λ + 12 - 2

= λ² - 7λ + 10

= (λ - 5)(λ - 2) = 0

Eigenvalues: λ₁ = 5 and λ₂ = 2

Step 2: Find eigenvectors for λ₁ = 5.

(A - 5I)v = 0:

[-1 1] [x] = [0]

[ 2 -2] [y] [0]

From R₁: -x + y = 0, so y = x.

Eigenvector: v₁ = t(1, 1) for any t ≠ 0. We often choose v₁ = (1, 1).

Step 3: Find eigenvectors for λ₂ = 2.

(A - 2I)v = 0:

[2 1] [x] = [0]

[2 1] [y]   [0]

From R₁: 2x + y = 0, so y = -2x.

Eigenvector: v₂ = t(1, -2) for any t ≠ 0. We often choose v₂ = (1, -2).

Verify: A(1,1)ᵀ = [4+1, 2+3]ᵀ = [5, 5]ᵀ = 5(1,1)ᵀ ✓

Verify: A(1,-2)ᵀ = [4-2, 2-6]ᵀ = [2, -4]ᵀ = 2(1,-2)ᵀ ✓
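
For a 2 × 2 matrix the characteristic polynomial is λ² − tr(A)·λ + det(A), so the eigenvalues come straight from the quadratic formula. A Python sketch (our own function name; assumes real roots):

```python
import math

def eig2(A):
    """Real eigenvalues of a 2×2 matrix from its characteristic polynomial
    λ² − tr(A)·λ + det(A) = 0 (assumes a nonnegative discriminant)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)
    return sorted([(tr - disc) / 2, (tr + disc) / 2])

print(eig2([[4, 1], [2, 3]]))  # [2.0, 5.0]
```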

Example: Eigenvalues of a 3 × 3 Matrix

A = [2 0 0]

    [0 3 1]

    [0 1 3]

det(A - λI) = det [2-λ 0 0 ]

                  [ 0 3-λ 1 ]

                  [ 0 1 3-λ]

Expanding along the first column (since it has two zeros):

= (2-λ)[(3-λ)(3-λ) - (1)(1)]

= (2-λ)(9 - 6λ + λ² - 1)

= (2-λ)(λ² - 6λ + 8)

= (2-λ)(λ-2)(λ-4)

= -(λ-2)²(λ-4)

Eigenvalues: λ₁ = 2 (multiplicity 2) and λ₂ = 4 (multiplicity 1)

Key Properties of Eigenvalues

  • The sum of eigenvalues equals the trace of A: λ₁ + λ₂ + … + λₙ = tr(A) = a₁₁ + a₂₂ + … + aₙₙ
  • The product of eigenvalues equals the determinant: λ₁ · λ₂ · … · λₙ = det(A)
  • A is singular (non-invertible) if and only if 0 is an eigenvalue
  • Symmetric matrices always have real eigenvalues
  • Eigenvectors corresponding to distinct eigenvalues are linearly independent
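The trace and determinant properties in the list above are easy to verify numerically; a quick sketch using the earlier 2 × 2 example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals = np.linalg.eigvals(A)

# Sum of eigenvalues equals the trace: 5 + 2 = 4 + 3 = 7
assert np.isclose(eigvals.sum(), np.trace(A))

# Product of eigenvalues equals the determinant: 5 * 2 = 4*3 - 1*2 = 10
assert np.isclose(eigvals.prod(), np.linalg.det(A))
```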

Diagonalization

An n × n matrix A is diagonalizable if it can be written as:

A = PDP⁻¹

where D is a diagonal matrix of eigenvalues and P is the matrix whose columns are the corresponding eigenvectors. A matrix is diagonalizable if and only if it has n linearly independent eigenvectors.

Example: Diagonalize the Matrix

From our earlier example, A = [4 1; 2 3] with eigenvalues λ₁ = 5, λ₂ = 2 and eigenvectors v₁ = (1, 1), v₂ = (1, -2).

P = [1 1]    D = [5 0]

    [1 -2]        [0 2]

Then A = PDP⁻¹.

This is incredibly useful! For example, computing Aⁿ becomes easy:

Aⁿ = PDⁿP⁻¹ = P [5ⁿ 0 ] P⁻¹

                [ 0 2ⁿ]

This converts an expensive matrix power into simply raising scalars to powers!
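A short NumPy check of this idea, reusing the P and D built above:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
P = np.array([[1.0,  1.0],
              [1.0, -2.0]])   # eigenvector columns (1,1) and (1,-2)
D = np.diag([5.0, 2.0])       # matching eigenvalues on the diagonal

n = 10
# A^n = P D^n P^{-1}: only the scalars 5 and 2 are raised to the power n
An = P @ np.diag([5.0**n, 2.0**n]) @ np.linalg.inv(P)

# Agrees with repeated matrix multiplication
assert np.allclose(An, np.linalg.matrix_power(A, n))
```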

Diagonalization is one of the most powerful tools in linear algebra. It simplifies matrix computations dramatically and has deep applications in differential equations, Markov chains, quantum mechanics, and principal component analysis (PCA). Not every matrix is diagonalizable, but symmetric matrices always are.

Applications of Linear Algebra

Linear algebra is not just an abstract mathematical subject — it is the computational engine behind many of the technologies and scientific methods that shape the modern world. Here are some of its most impactful applications.

Computer Graphics

Every image, animation, and 3D model you see on a screen is rendered using linear algebra. Transformations of objects — rotation, scaling, translation, and perspective projection — are all represented as matrix operations.

In 3D graphics, homogeneous coordinates extend ℝ³ to ℝ⁴ so that translations (which are not linear) can also be represented as matrix multiplication. A point (x, y, z) becomes (x, y, z, 1), and all transformations become 4 × 4 matrices:

Translation: [1 0 0 tₓ ]    Scaling: [sₓ 0  0  0]
             [0 1 0 tᵧ ]             [0  sᵧ 0  0]
             [0 0 1 t_z]             [0  0 s_z 0]
             [0 0 0 1  ]             [0  0  0  1]

Composing multiple transformations is simply multiplying matrices together. The GPU in your computer is essentially a massively parallel linear algebra engine.
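To make the composition concrete, here is a small sketch that builds 4 × 4 translation and scaling matrices in homogeneous coordinates and chains them by matrix multiplication (the helper names are illustrative, not from a graphics library):

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def scaling(sx, sy, sz):
    """4x4 homogeneous scaling matrix."""
    return np.diag([sx, sy, sz, 1.0])

# Compose: the rightmost matrix acts first (scale, then translate)
M = translation(1, 2, 3) @ scaling(2, 2, 2)

p = np.array([1.0, 1.0, 1.0, 1.0])   # point (1, 1, 1) in homogeneous form
q = M @ p
print(q[:3])   # (1,1,1) scaled to (2,2,2), then translated to (3,4,5)
```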

Example: 2D Rotation

Rotate the point (3, 1) by 90° counterclockwise.

Rotation matrix: R = [cos 90° -sin 90°] = [0 -1]

                     [sin 90°  cos 90°]   [1  0]

[0 -1] [3] = [(0)(3)+(-1)(1)] = [-1]

[1  0] [1]   [(1)(3)+(0)(1)]    [ 3]

The rotated point is (-1, 3).
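The same rotation in NumPy, for any angle θ:

```python
import numpy as np

theta = np.pi / 2   # 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

p = np.array([3.0, 1.0])
print(R @ p)   # (-1, 3), up to floating-point roundoff
```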

Data Science and Machine Learning

Data science uses linear algebra at every level. Datasets are naturally represented as matrices (rows = samples, columns = features), and most machine learning algorithms are built on linear algebraic operations:

  • Linear Regression: Finds the best-fit line (or hyperplane) by minimizing the squared error. The solution of the famous normal equations is x̂ = (AᵀA)⁻¹Aᵀb.
  • Principal Component Analysis (PCA): Uses eigenvalues and eigenvectors of the covariance matrix to identify the most important directions in high-dimensional data, enabling dimensionality reduction.
  • Neural Networks: Every layer in a neural network is essentially a linear transformation (matrix multiplication) followed by a nonlinear activation function.
  • Recommender Systems: Services like Netflix and Spotify use matrix factorization to predict user preferences.
If you want to work in data science or AI, linear algebra is non-negotiable. Understanding matrix operations, eigendecomposition, and the SVD (Singular Value Decomposition) gives you deep insight into how algorithms actually work — not just how to call library functions.

Google PageRank

One of the most famous applications of linear algebra is Google's original PageRank algorithm, which ranked web pages by importance. The key insight is modeling the web as a Markov chain:

  1. Represent the web as a directed graph where pages are nodes and links are edges
  2. Construct a transition matrix M where Mᵢⱼ = 1/Lⱼ if page j links to page i (Lⱼ = number of outgoing links from j)
  3. The PageRank vector r is the eigenvector of M corresponding to eigenvalue 1:
Mr = r

In practice, a damping factor d ≈ 0.85 is used:

r = (1-d)/N · 1 + d · Mr

This is solved iteratively (power iteration), and pages with higher PageRank values appear higher in search results.

Example: Simplified PageRank

Consider 3 pages: A links to B and C, B links to C, C links to A.

Transition matrix (each column sums to 1):

M = [ 0  0  1]  (A receives from C)

    [1/2 0  0]  (B receives from A)

    [1/2 1  0]  (C receives from A and B)

Starting with r₀ = (1/3, 1/3, 1/3) and iterating rₖ₊₁ = Mrₖ converges to the steady-state PageRank vector.
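The power iteration described above is a few lines of NumPy. A sketch for the 3-page example (without the damping factor, for simplicity):

```python
import numpy as np

# Column-stochastic transition matrix for the 3-page example
M = np.array([[0.0, 0.0, 1.0],    # A receives from C
              [0.5, 0.0, 0.0],    # B receives from A
              [0.5, 1.0, 0.0]])   # C receives from A and B

r = np.full(3, 1/3)               # start uniform
for _ in range(100):              # power iteration: r_{k+1} = M r_k
    r = M @ r

# r is (approximately) the eigenvector of M for eigenvalue 1
assert np.allclose(M @ r, r)
print(r)   # steady state: A and C tie for first, B last
```

For this graph the iteration converges to r = (0.4, 0.2, 0.4): pages A and C share the top rank because C funnels all of its weight to A, while B receives only half of A's.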

Least Squares Approximation

When a system Ax = b has no exact solution (more equations than unknowns, i.e., the system is overdetermined), we seek the least squares solution — the vector that minimizes the total squared error ‖Ax - b‖².

The solution satisfies the normal equations:

AᵀAx̂ = Aᵀb

If AᵀA is invertible:

x̂ = (AᵀA)⁻¹Aᵀb

Example: Fitting a Line to Data

Find the best-fit line y = c₀ + c₁x for the data points (1, 2), (2, 3), (3, 6), (4, 8).

Set up the system Ac = b, where c = (c₀, c₁):

A = [1 1]  b = [2]

    [1 2]       [3]

    [1 3]       [6]

    [1 4]       [8]

AᵀA = [1 1 1 1][1 1] = [ 4 10]

       [1 2 3 4][1 2]   [10 30]

                [1 3]

                [1 4]

Aᵀb = [1 1 1 1][2] = [ 19]

       [1 2 3 4][3]   [ 58]

                [6]

                [8]

Solving [ 4 10 | 19] → c₀ = -0.5, c₁ = 2.1

        [10 30 | 58]

Best-fit line: y = -0.5 + 2.1x
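The normal-equation solution is a one-liner in NumPy, and `np.linalg.lstsq` gives the same answer by a more numerically stable route:

```python
import numpy as np

# Design matrix for y = c0 + c1*x at x = 1, 2, 3, 4
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([2.0, 3.0, 6.0, 8.0])

# Solve the normal equations (A^T A) c = A^T b
c = np.linalg.solve(A.T @ A, A.T @ b)
print(c)   # [-0.5, 2.1]

# np.linalg.lstsq computes the same least-squares solution
c2, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(c, c2)
```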


Quantum Mechanics

Quantum mechanics is fundamentally built on linear algebra. The state of a quantum system is represented as a vector in a complex vector space (Hilbert space), and physical observables (like energy, momentum, position) are represented as Hermitian matrices. Measurements correspond to eigenvalues, and the system's state collapses to the corresponding eigenvector after measurement.

Economics and Network Analysis

Linear algebra appears in economics through input-output models (Leontief model), where the relationships between industries are captured in a matrix. In network analysis, matrices encode connections between nodes, and their eigenvalues reveal properties like connectivity, clustering, and influence.

Linear algebra truly lives up to its reputation as the "mathematics of the 21st century." Whether you pursue pure mathematics, engineering, computer science, physics, economics, or data science, the concepts you've learned here — vectors, matrices, transformations, eigenvalues — will be among the most frequently used tools in your mathematical toolkit.

Orthogonality and Projections

Two vectors are orthogonal (perpendicular) if their dot product is zero. Orthogonality is one of the most powerful concepts in linear algebra — it simplifies computations, enables decompositions, and underpins algorithms from signal processing to machine learning.

Orthogonal and Orthonormal Sets

A set of vectors {v₁, v₂, …, vₖ} is:

  • Orthogonal if vᵢ · vⱼ = 0 for all i ≠ j
  • Orthonormal if additionally ‖vᵢ‖ = 1 for all i

Orthonormal bases are especially convenient: to express any vector x in terms of orthonormal basis {u₁, …, uₙ}, you simply compute:

x = (x · u₁)u₁ + (x · u₂)u₂ + ⋯ + (x · uₙ)uₙ

No matrix inversion required!

Orthogonal Projection

The projection of vector b onto a subspace W is the closest vector in W to b. If W = span(a), the projection is:

proj_a(b) = (a · b) / (a · a) · a

The error (or residual) e = b − proj_a(b) is orthogonal to a. This is the geometric foundation of least-squares regression.

Example: Project b onto a

Let a = (1, 2, 2), b = (3, 1, 0).

a · b = 3 + 2 + 0 = 5

a · a = 1 + 4 + 4 = 9

proj = (5/9)(1, 2, 2) = (5/9, 10/9, 10/9)
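The projection formula, and the orthogonality of the residual, can be verified with a few lines of NumPy:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 1.0, 0.0])

# proj_a(b) = (a . b) / (a . a) * a
proj = (a @ b) / (a @ a) * a
print(proj)   # (5/9, 10/9, 10/9)

# The residual e = b - proj is orthogonal to a
e = b - proj
assert np.isclose(e @ a, 0.0)
```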

The Gram-Schmidt Process

Given any linearly independent set {v₁, …, vₖ}, the Gram-Schmidt process produces an orthonormal set {u₁, …, uₖ} spanning the same subspace:

w₁ = v₁,    u₁ = w₁ / ‖w₁‖
w₂ = v₂ − (v₂ · u₁)u₁,    u₂ = w₂ / ‖w₂‖
w₃ = v₃ − (v₃ · u₁)u₁ − (v₃ · u₂)u₂,    u₃ = w₃ / ‖w₃‖

Each step subtracts the projections onto all previously computed orthonormal vectors, then normalizes.
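The process translates almost line for line into code. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = v.astype(float)
        for u in basis:                 # subtract projections onto earlier u's
            w = w - (w @ u) * u
        basis.append(w / np.linalg.norm(w))   # normalize
    return basis

u1, u2 = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                       np.array([1.0, 0.0, 1.0])])
assert np.isclose(u1 @ u2, 0.0)             # orthogonal
assert np.isclose(np.linalg.norm(u1), 1.0)  # unit length
```

Subtracting each projection from the running vector w (rather than from the original v) is the "modified" Gram-Schmidt variant, which behaves better in floating-point arithmetic.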

QR Decomposition

Gram-Schmidt applied to the columns of a matrix A produces the QR decomposition:

A = QR

where Q is an orthogonal matrix (Qᵀ = Q⁻¹) and R is upper triangular. QR decomposition is numerically stable and is the basis of modern algorithms for computing eigenvalues (the QR algorithm).
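NumPy computes the (reduced) QR decomposition directly; a quick check of its defining properties:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Reduced QR: Q has orthonormal columns, R is upper triangular
Q, R = np.linalg.qr(A)

assert np.allclose(Q.T @ Q, np.eye(2))   # orthonormal columns
assert np.allclose(np.triu(R), R)        # upper triangular
assert np.allclose(Q @ R, A)             # factorization reconstructs A
```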

Orthogonal Matrices

A square matrix Q is orthogonal if QᵀQ = QQᵀ = I, meaning Q⁻¹ = Qᵀ. Orthogonal matrices preserve lengths and angles — they represent rotations and reflections. Key properties:

  • ‖Qx‖ = ‖x‖ for all x (length-preserving)
  • det(Q) = ±1
  • Eigenvalues have absolute value 1
  • Product of orthogonal matrices is orthogonal

Singular Value Decomposition (SVD)

The Singular Value Decomposition is arguably the most important matrix factorization in applied mathematics. Every matrix — of any shape, any rank — has an SVD.

The SVD Theorem

Any m × n matrix A can be factored as:
A = UΣVᵀ
  • U (m × m): Orthogonal matrix — columns are the left singular vectors (eigenvectors of AAᵀ)
  • Σ (m × n): Diagonal matrix of singular values σ₁ ≥ σ₂ ≥ ⋯ ≥ 0 (square roots of eigenvalues of AᵀA)
  • Vᵀ (n × n): Orthogonal matrix — rows are the right singular vectors (eigenvectors of AᵀA)

Geometric Interpretation

Every linear transformation can be decomposed into three steps:

  1. Vᵀ: Rotate the input space (align with principal axes)
  2. Σ: Scale along each axis by the singular values
  3. U: Rotate the output space

The singular values tell you how much the matrix stretches in each direction. The rank of A equals the number of nonzero singular values.

Example: SVD of a 2 × 2 Matrix

A = [3 0]

    [0 2]

This diagonal matrix is already its own SVD: U = I, Σ = A, V = I.

Singular values: σ₁ = 3, σ₂ = 2.

For a non-diagonal matrix like [2 1; 1 2], the SVD reveals the principal stretching directions, which are the eigenvectors of AᵀA = [5 4; 4 5] with eigenvalues 9 and 1, giving singular values σ₁ = 3 and σ₂ = 1.
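These claims are easy to confirm with `np.linalg.svd`:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

U, s, Vt = np.linalg.svd(A)
print(s)   # singular values: 3 and 1

# Singular values are square roots of the eigenvalues of A^T A (here 9 and 1)
assert np.allclose(s**2, sorted(np.linalg.eigvals(A.T @ A).real, reverse=True))

# The factorization reconstructs A
assert np.allclose(U @ np.diag(s) @ Vt, A)
```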

Low-Rank Approximation (Eckart-Young Theorem)

The SVD provides the best rank-k approximation of a matrix. Keep only the k largest singular values and their corresponding vectors:

A_k = σ₁u₁v₁ᵀ + σ₂u₂v₂ᵀ + ⋯ + σₖuₖvₖᵀ

A_k is the closest rank-k matrix to A (in both Frobenius and spectral norms). This is the mathematical foundation of:

  • Image compression: Store only the top k singular values/vectors instead of the full image matrix
  • Dimensionality reduction (PCA): Principal Component Analysis is SVD applied to centered data
  • Latent Semantic Analysis: Discover hidden topics in text document collections
  • Recommender systems: Netflix Prize-winning algorithms use SVD for collaborative filtering
  • Noise reduction: Small singular values often correspond to noise; truncating them denoises the data
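A sketch of rank-k truncation on a random matrix, illustrating the Eckart-Young guarantee that the spectral-norm error of the best rank-k approximation is exactly σₖ₊₁:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 6))   # arbitrary test matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
# Best rank-k approximation: keep the k largest singular triplets
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

assert np.linalg.matrix_rank(A_k) == k
# Eckart-Young: the spectral-norm error equals the (k+1)-th singular value
assert np.isclose(np.linalg.norm(A - A_k, ord=2), s[k])
```

For image compression the idea is identical: A is the pixel matrix, and storing U[:, :k], s[:k], and Vt[:k, :] takes k(m + n + 1) numbers instead of mn.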

The Pseudoinverse

For any matrix A (even non-square or singular), the Moore-Penrose pseudoinverse A⁺ can be computed via SVD:

If A = UΣVᵀ, then A⁺ = VΣ⁺Uᵀ

where Σ⁺ is formed by taking the reciprocal of each nonzero singular value. The pseudoinverse gives the least-squares solution to Ax = b when no exact solution exists.
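Building A⁺ from the SVD and checking it against NumPy's `pinv`, reusing the line-fitting data from the least-squares example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([2.0, 3.0, 6.0, 8.0])

# A+ = V Sigma+ U^T: reciprocate each nonzero singular value
# (here A has full column rank, so every singular value is nonzero)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

# Matches NumPy's built-in pseudoinverse
assert np.allclose(A_pinv, np.linalg.pinv(A))

# A+ b is the least-squares solution: the best-fit line y = -0.5 + 2.1x
assert np.allclose(A_pinv @ b, [-0.5, 2.1])
```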

The SVD connects many fundamental concepts: eigenvalues (σᵢ² are eigenvalues of AᵀA), rank (number of nonzero σᵢ), null space (right singular vectors for zero σᵢ), condition number (σ_max / σ_min), and the fundamental theorem of linear algebra (four fundamental subspaces). Mastering SVD gives you a unified view of the entire subject.