Introduction to Positive Definite Matrices
A positive definite matrix is a symmetric matrix with all positive eigenvalues. These matrices appear naturally in optimization (as Hessian matrices of convex functions), statistics (as covariance matrices), physics (as inertia tensors), and machine learning (as kernel matrices).
Positive definiteness guarantees that the associated quadratic form $x^TAx$ is positive for all nonzero vectors $x$, which has profound implications for stability, convergence, and optimization properties. Understanding positive definite matrices is essential for modern applied mathematics and data science.
A $2 \times 2$ positive definite matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ corresponds to an elliptic paraboloid via the quadratic form $f(x,y) = \begin{bmatrix} x & y \end{bmatrix} A \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + cy^2$.
The surface $z = f(x,y)$ opens upward and has a unique global minimum at the origin. This geometric interpretation extends to higher dimensions.
Formal Definitions and Equivalent Characterizations
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if one of the following equivalent conditions holds:
- Quadratic form condition: $x^TAx > 0$ for all nonzero $x \in \mathbb{R}^n$
- Eigenvalue condition: All eigenvalues $\lambda_i$ of $A$ are positive
- Sylvester's criterion: All leading principal minors are positive
- Cholesky decomposition: $A = LL^T$ where $L$ is lower triangular with positive diagonal entries
- Inner product condition: $\langle x, y \rangle_A = x^TAy$ defines an inner product on $\mathbb{R}^n$
If $x^TAx \geq 0$ for all $x$, we say $A$ is positive semidefinite (allowing eigenvalues to be zero).
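To illustrate the eigenvalue condition numerically, here is a minimal sketch (the function name is ours) using NumPy's symmetric eigenvalue solver:

```python
import numpy as np

def is_positive_definite_eig(A, tol=1e-12):
    """Check positive definiteness of a symmetric matrix via its eigenvalues."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False  # definiteness is defined here only for symmetric matrices
    # eigvalsh exploits symmetry and returns real eigenvalues in ascending order
    return np.linalg.eigvalsh(A)[0] > tol

print(is_positive_definite_eig([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))  # True
print(is_positive_definite_eig([[1, 2], [2, 1]]))  # False (eigenvalues 3 and -1)
```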
Key Properties of Positive Definite Matrices
1. Invertibility
All positive definite matrices are invertible. Since all eigenvalues are positive, 0 is not an eigenvalue, so $\det(A) = \prod \lambda_i > 0$.
2. Stability under Addition
If $A$ and $B$ are positive definite, then $A + B$ is positive definite. This follows from $x^T(A+B)x = x^TAx + x^TBx > 0$.
3. Congruence Transformations
If $A$ is positive definite and $C$ is invertible, then $C^TAC$ is positive definite. This preserves definiteness under change of basis.
4. Principal Submatrices
All principal submatrices of a positive definite matrix are positive definite. In particular, all diagonal entries are positive.
Sylvester's Criterion: Testing for Positive Definiteness
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if and only if all its leading principal minors are positive: $\det(A_k) > 0$ for $k = 1, 2, \ldots, n$, where $A_k$ is the $k \times k$ submatrix in the upper-left corner of $A$.
Consider $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{bmatrix}$. Check Sylvester's criterion:
- First minor: $\det([2]) = 2 > 0$ ✓
- Second minor: $\det\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} = 6 - 1 = 5 > 0$ ✓
- Third minor: $\det(A) = 2(6-1) - 1(2-0) + 0 = 10 - 2 = 8 > 0$ ✓
All leading principal minors are positive, so $A$ is positive definite.
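The same check can be scripted; here is a small sketch that evaluates the leading principal minors of this matrix with NumPy (fine for small matrices, though a Cholesky-based test is preferred at scale):

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])

# Sylvester's criterion: every leading principal minor must be positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(minors)                      # approximately [2.0, 5.0, 8.0]
print(all(m > 0 for m in minors))  # True
```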
Cholesky Decomposition: The Factorization of Choice
Every positive definite matrix $A \in \mathbb{R}^{n \times n}$ can be uniquely factored as $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal entries. This decomposition is approximately twice as efficient as LU decomposition for solving linear systems $Ax = b$.
For i = 1 to n:
    L[i,i] = sqrt(A[i,i] - sum(L[i,k]² for k = 1 to i-1))
    For j = i+1 to n:
        L[j,i] = (A[j,i] - sum(L[j,k]*L[i,k] for k = 1 to i-1)) / L[i,i]
This algorithm requires about $\frac{n^3}{3}$ operations, compared to $\frac{2n^3}{3}$ for LU decomposition.
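A direct translation of the pseudocode into runnable Python (0-based indexing; a minimal sketch without pivoting or input validation, so in practice one would call `np.linalg.cholesky`):

```python
import numpy as np

def cholesky(A):
    """Return lower-triangular L with A = L @ L.T for symmetric positive definite A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        # Diagonal entry: the argument of sqrt is positive iff A is positive definite
        L[i, i] = np.sqrt(A[i, i] - np.dot(L[i, :i], L[i, :i]))
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.dot(L[j, :i], L[i, :i])) / L[i, i]
    return L
```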
Factor $A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 5 & 3 \\ 2 & 3 & 6 \end{bmatrix}$:
Step 1: $L_{11} = \sqrt{4} = 2$
Step 2: $L_{21} = 2/2 = 1$, $L_{31} = 2/2 = 1$
Step 3: $L_{22} = \sqrt{5 - 1^2} = \sqrt{4} = 2$
Step 4: $L_{32} = (3 - 1 \times 1)/2 = 1$
Step 5: $L_{33} = \sqrt{6 - 1^2 - 1^2} = \sqrt{4} = 2$
Thus $L = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 1 & 1 & 2 \end{bmatrix}$ and $A = LL^T$.
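A quick numerical confirmation of this worked example, using NumPy's built-in factorization:

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])

L = np.linalg.cholesky(A)        # lower-triangular factor
print(L)                         # [[2. 0. 0.], [1. 2. 0.], [1. 1. 2.]]
print(np.allclose(L @ L.T, A))   # True
```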
Connection to Inner Product Spaces and Cauchy-Schwarz
Positive definite matrices are intimately connected to the theory of inner product spaces. For any positive definite matrix $A$, we can define an inner product $\langle x, y \rangle_A = x^TAy$.
This inner product satisfies all the axioms of an inner product space:
- Symmetry: $\langle x, y \rangle_A = \langle y, x \rangle_A$ (since $A$ is symmetric)
- Linearity: $\langle ax + by, z \rangle_A = a\langle x, z \rangle_A + b\langle y, z \rangle_A$
- Positive definiteness: $\langle x, x \rangle_A > 0$ for $x \neq 0$
The Cauchy-Schwarz inequality holds for this inner product: $|\langle x, y \rangle_A| \leq \|x\|_A \|y\|_A$, where $\|x\|_A = \sqrt{x^TAx}$ is the induced norm.
This connection shows that positive definite matrices naturally induce geometric structure on $\mathbb{R}^n$.
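As a concrete sketch (the matrix $A$ and helper names are ours), the following verifies the Cauchy-Schwarz inequality in the $A$-inner product on random vectors:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])  # positive definite (see the Sylvester check above)

def inner(x, y):
    return x @ A @ y          # <x, y>_A = x^T A y

def norm(x):
    return np.sqrt(inner(x, x))

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
print(abs(inner(x, y)) <= norm(x) * norm(y))  # True: |<x,y>_A| <= ||x||_A ||y||_A
```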
Applications of Positive Definite Matrices
In optimization, positive definite matrices characterize convex functions:
- Second derivative test: A twice-differentiable function $f: \mathbb{R}^n \to \mathbb{R}$ is strictly convex on a convex set if its Hessian matrix $Hf(x)$ is positive definite at every point of that set
- Newton's method: Uses the inverse of the Hessian (or approximation) for quadratic convergence
- Quadratic programming: Minimize $\frac{1}{2}x^TAx + b^Tx$ subject to constraints, where $A$ is positive definite
- Trust region methods: Approximate $f$ by a quadratic model with positive definite Hessian
Positive definiteness ensures that critical points are local minima and optimization algorithms converge.
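To illustrate, here is a minimal sketch (the particular $A$, $b$, and variable names are ours) of a single Newton step on the quadratic $f(x) = \frac{1}{2}x^TAx + b^Tx$: because the Hessian $A$ is positive definite, the step solves a well-posed linear system and lands exactly on the unique minimizer.

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])        # positive definite Hessian of the quadratic model
b = np.array([1., -2., 0.5])

x0 = np.zeros(3)
grad = A @ x0 + b                    # gradient of f(x) = 0.5 x^T A x + b^T x
x1 = x0 - np.linalg.solve(A, grad)   # Newton step: solve A d = grad, never form A^{-1}
print(np.allclose(A @ x1 + b, 0))    # True: one step reaches the unique minimizer
```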
In statistics, covariance matrices are always positive semidefinite:
- Multivariate normal distribution: $X \sim N(\mu, \Sigma)$, where $\Sigma$ is a positive definite covariance matrix
- Principal Component Analysis (PCA): Eigen decomposition of covariance matrix to find directions of maximum variance
- Correlation matrices: Special covariance matrices with 1's on diagonal, always positive semidefinite
- Mahalanobis distance: $d(x,y) = \sqrt{(x-y)^T\Sigma^{-1}(x-y)}$ measures distance normalized by covariance
Positive definiteness ensures that Gaussian distributions are non-degenerate and Mahalanobis distance is well-defined.
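As a small sketch (the covariance matrix below is made up for illustration), the Mahalanobis distance can be computed by solving a linear system with $\Sigma$ rather than forming $\Sigma^{-1}$ explicitly:

```python
import numpy as np

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])   # illustrative positive definite covariance
x = np.array([1.0, 2.0])
y = np.array([0.0, 0.0])

d = x - y
# Solve Sigma z = d instead of computing Sigma^{-1}
dist = np.sqrt(d @ np.linalg.solve(Sigma, d))
print(dist)  # Mahalanobis distance between x and y
```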
In machine learning, positive definite kernels generate positive definite matrices:
- Kernel matrices: For kernel $K$, the Gram matrix $K_{ij} = K(x_i, x_j)$ is positive semidefinite
- Support Vector Machines: Use positive definite kernels to map data to higher-dimensional feature spaces
- Gaussian Processes: Covariance functions must be positive definite to ensure valid probability distributions
- Reproducing Kernel Hilbert Spaces (RKHS): Positive definite kernels define inner products in infinite-dimensional spaces
Mercer's theorem characterizes which kernels produce positive definite matrices.
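A quick sketch, assuming the Gaussian (RBF) kernel $K(x, x') = \exp(-\|x - x'\|^2 / (2\sigma^2))$, that builds a Gram matrix on random points and confirms its eigenvalues are nonnegative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))   # 20 points in R^3
sigma = 1.0

# Pairwise squared distances and the Gaussian (RBF) Gram matrix
sq_dists = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
K = np.exp(-sq_dists / (2 * sigma**2))

eigs = np.linalg.eigvalsh(K)
print(eigs.min() > -1e-10)  # True: all eigenvalues nonnegative (up to round-off)
```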
Numerical Considerations and Stability
Numerical Stability
Cholesky decomposition is numerically stable for positive definite matrices and requires no pivoting for stability; a diagonally pivoted variant is useful for positive semidefinite or ill-conditioned matrices.
Condition Number
The condition number $\kappa(A) = \lambda_{\max}/\lambda_{\min}$ measures sensitivity. Positive definite matrices often have better conditioning than general matrices.
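For a symmetric positive definite matrix, the 2-norm condition number reported by NumPy coincides with this eigenvalue ratio:

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])

eigs = np.linalg.eigvalsh(A)
print(eigs[-1] / eigs[0])     # lambda_max / lambda_min
print(np.linalg.cond(A, 2))   # agrees for symmetric positive definite A
```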
Positive Definite Approximations
For matrices that are not positive definite (e.g., due to numerical error), we can compute a nearby positive definite approximation by truncating (clipping) the negative eigenvalues.
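A minimal sketch of the eigenvalue-truncation idea (the helper name and clipping floor are ours; production code typically uses more careful projection methods):

```python
import numpy as np

def nearest_positive_definite(A, eps=1e-10):
    """Push a matrix toward positive definiteness by clipping its eigenvalues."""
    S = (A + A.T) / 2                        # work with the symmetric part
    eigvals, eigvecs = np.linalg.eigh(S)
    eigvals = np.clip(eigvals, eps, None)    # raise any nonpositive eigenvalue to eps
    return eigvecs @ np.diag(eigvals) @ eigvecs.T

B = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])              # correlation-like matrix that is not PSD
print(np.linalg.eigvalsh(B))                 # contains a negative eigenvalue
print(np.linalg.eigvalsh(nearest_positive_definite(B)).min() > 0)  # True
```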
Comparison with Related Matrix Classes
| Matrix Class | Definition | Key Properties |
|---|---|---|
| Positive Definite | $x^TAx > 0$ for all $x \neq 0$ | All eigenvalues positive, invertible, Cholesky decomposition exists |
| Positive Semidefinite | $x^TAx \geq 0$ for all $x$ | Eigenvalues nonnegative, may be singular, no unique Cholesky |
| Negative Definite | $x^TAx < 0$ for all $x \neq 0$ | All eigenvalues negative, $-A$ is positive definite |
| Indefinite | $x^TAx$ takes both signs | Has both positive and negative eigenvalues, saddle points |
Connection to the Viral Limit Problem
The viral limit problem from our limit calculator involves inner products that can be represented by positive definite matrices:
Consider sequences $\{h_p\}, \{b_p\}, \{z_p\}$ in $\mathbb{R}^n$ equipped with the inner product $\langle x, y \rangle_A = x^TAy$, where $A$ is positive definite. Assuming $\|h_p\|_A = \sqrt{h_p^TAh_p} \to 1$, the Cauchy-Schwarz inequality gives $|\langle h_p, b_p \rangle_A| \leq \|h_p\|_A \|b_p\|_A$, which bounds the inner products that appear in the limit.
The positive definiteness of $A$ ensures the induced norm is well-defined and Cauchy-Schwarz applies. For the complete solution, see our dedicated solution page.
Advanced Topics: Infinite-Dimensional Generalizations
The concept of positive definiteness extends to operators on Hilbert spaces:
Let $H$ be a Hilbert space. A bounded linear operator $T: H \to H$ is positive definite if there exists $c > 0$ such that $\langle Tx, x \rangle \geq c\|x\|^2$ for all $x \in H$.
This is the infinite-dimensional analog of positive definiteness. Important examples include elliptic differential operators like $-\Delta$ with Dirichlet boundary conditions.
Ready to Apply Positive Definite Matrix Theory?
Use our limit calculator to solve problems involving inner products, or explore Cauchy-Schwarz inequality and Hilbert spaces for deeper theory.
Historical Development and Importance
The theory of positive definite matrices has evolved through several mathematical traditions:
- 19th Century: Quadratic forms studied by Gauss, Cauchy, and Sylvester
- 1907: Toeplitz introduces Toeplitz matrices and studies positive definiteness
- 1924: Cholesky's decomposition algorithm is published posthumously (he had developed it before his death in 1918)
- 1930s: von Neumann uses positive definite operators in quantum mechanics
- 1940s: Numerical linear algebra develops efficient Cholesky algorithms
- 1960s: Convex optimization theory formalizes role of positive definite Hessians
- 1990s: Kernel methods in machine learning rely on positive definite kernel matrices
- Today: Applications in optimization, statistics, machine learning, and quantum computing
Positive Definite Matrices FAQ
Can a non-symmetric matrix be positive definite?
Typically, positive definiteness is defined only for symmetric/Hermitian matrices. For non-symmetric $A$, we usually consider $\frac{1}{2}(A + A^T)$, the symmetric part. Some authors define $A$ as positive definite if its symmetric part is positive definite.
What is the difference between positive definite and positive semidefinite?
Positive definite requires $x^TAx > 0$ for all $x \neq 0$ (strict inequality), while positive semidefinite allows $x^TAx = 0$ for some $x \neq 0$. Equivalently, positive definite matrices have all eigenvalues positive, while positive semidefinite matrices have nonnegative eigenvalues (some may be zero).
How do I check if a matrix is positive definite numerically?
Three practical methods: 1) Attempt a Cholesky decomposition - if it succeeds, the matrix is positive definite; 2) Compute the eigenvalues and check that all are positive; 3) Check Sylvester's criterion by verifying that all leading principal minors are positive. For large matrices, Cholesky is the most efficient.
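A sketch of the first method (the helper name is ours; it assumes the input is already symmetric):

```python
import numpy as np

def is_positive_definite(A):
    """Cheapest practical test: a symmetric matrix is positive definite iff Cholesky succeeds."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[2., 1.], [1., 3.]])))  # True
print(is_positive_definite(np.array([[1., 2.], [2., 1.]])))  # False
```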
Why are covariance matrices always positive semidefinite?
For random vector $X$ with mean $\mu$, the covariance matrix $\Sigma = E[(X-\mu)(X-\mu)^T]$. For any vector $v$, $v^T\Sigma v = E[(v^T(X-\mu))^2] \geq 0$, since it's the expectation of a square. Thus $\Sigma$ is positive semidefinite. It's positive definite if no linear combination of components is constant.