Introduction to Positive Definite Matrices
A positive definite matrix is a symmetric matrix with all positive eigenvalues. These matrices appear naturally in optimization (as Hessian matrices of convex functions), statistics (as covariance matrices), physics (as inertia tensors), and machine learning (as kernel matrices).
Positive definiteness guarantees that the associated quadratic form $x^TAx$ is positive for all nonzero vectors $x$, which has profound implications for stability, convergence, and optimization properties. Understanding positive definite matrices is essential for modern applied mathematics and data science.
A $2 \times 2$ positive definite matrix $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ corresponds to an elliptic paraboloid via the quadratic form $f(x,y) = \begin{bmatrix} x & y \end{bmatrix} A \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + cy^2$.
The surface $z = f(x,y)$ opens upward and has a unique global minimum at the origin. This geometric interpretation extends to higher dimensions.
Formal Definitions and Equivalent Characterizations
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if one of the following equivalent conditions holds:
- Quadratic form condition: $x^TAx > 0$ for all nonzero $x \in \mathbb{R}^n$
- Eigenvalue condition: All eigenvalues $\lambda_i$ of $A$ are positive
- Sylvester's criterion: All leading principal minors are positive
- Cholesky decomposition: $A = LL^T$ where $L$ is lower triangular with positive diagonal entries
- Inner product condition: $\langle x, y \rangle_A = x^TAy$ defines an inner product on $\mathbb{R}^n$
If $x^TAx \geq 0$ for all $x$, we say $A$ is positive semidefinite (allowing eigenvalues to be zero).
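To illustrate the eigenvalue condition numerically, here is a minimal sketch (the function name is ours) using NumPy's symmetric eigenvalue solver:

```python
import numpy as np

def is_positive_definite_eig(A, tol=1e-12):
    """Check positive definiteness of a symmetric matrix via its eigenvalues."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False  # definiteness is defined here only for symmetric matrices
    # eigvalsh exploits symmetry and returns real eigenvalues in ascending order
    return np.linalg.eigvalsh(A)[0] > tol

print(is_positive_definite_eig([[2, 1, 0], [1, 3, 1], [0, 1, 2]]))  # True
print(is_positive_definite_eig([[1, 2], [2, 1]]))  # False (eigenvalues 3 and -1)
```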
Key Properties of Positive Definite Matrices
1. Invertibility
All positive definite matrices are invertible. Since all eigenvalues are positive, 0 is not an eigenvalue, so $\det(A) = \prod \lambda_i > 0$.
2. Stability under Addition
If $A$ and $B$ are positive definite, then $A + B$ is positive definite. This follows from $x^T(A+B)x = x^TAx + x^TBx > 0$.
3. Congruence Transformations
If $A$ is positive definite and $C$ is invertible, then $C^TAC$ is positive definite. This preserves definiteness under change of basis.
4. Principal Submatrices
All principal submatrices of a positive definite matrix are positive definite. In particular, all diagonal entries are positive.
Sylvester's Criterion: Testing for Positive Definiteness
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if and only if all its leading principal minors are positive: $\det(A_k) > 0$ for $k = 1, 2, \ldots, n$, where $A_k$ is the $k \times k$ submatrix in the upper-left corner of $A$.
Consider $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{bmatrix}$. Check Sylvester's criterion:
- First minor: $\det([2]) = 2 > 0$ ✓
- Second minor: $\det\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} = 6 - 1 = 5 > 0$ ✓
- Third minor: $\det(A) = 2(6-1) - 1(2-0) + 0 = 10 - 2 = 8 > 0$ ✓
All leading principal minors are positive, so $A$ is positive definite.
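The same check can be scripted; here is a small sketch that evaluates the leading principal minors of this matrix with NumPy (fine for small matrices, though a Cholesky-based test is preferred at scale):

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])

# Sylvester's criterion: every leading principal minor must be positive
minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(minors)                      # approximately [2.0, 5.0, 8.0]
print(all(m > 0 for m in minors))  # True
```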
Cholesky Decomposition: The Factorization of Choice
Every positive definite matrix $A \in \mathbb{R}^{n \times n}$ can be uniquely factored as $A = LL^T$, where $L$ is a lower triangular matrix with positive diagonal entries. This decomposition is approximately twice as efficient as LU decomposition for solving linear systems $Ax = b$.
For i = 1 to n:
    L[i,i] = sqrt(A[i,i] - sum(L[i,k]² for k = 1 to i-1))
    For j = i+1 to n:
        L[j,i] = (A[j,i] - sum(L[j,k]*L[i,k] for k = 1 to i-1)) / L[i,i]
This algorithm requires about $\frac{n^3}{3}$ operations, compared to $\frac{2n^3}{3}$ for LU decomposition.
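A direct translation of the pseudocode into runnable Python (0-based indexing; a minimal sketch without pivoting or input validation, so in practice one would call `np.linalg.cholesky`):

```python
import numpy as np

def cholesky(A):
    """Return lower-triangular L with A = L @ L.T for symmetric positive definite A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        # Diagonal entry: the argument of sqrt is positive iff A is positive definite
        L[i, i] = np.sqrt(A[i, i] - np.dot(L[i, :i], L[i, :i]))
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.dot(L[j, :i], L[i, :i])) / L[i, i]
    return L
```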
Factor $A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 5 & 3 \\ 2 & 3 & 6 \end{bmatrix}$:
Step 1: $L_{11} = \sqrt{4} = 2$
Step 2: $L_{21} = 2/2 = 1$, $L_{31} = 2/2 = 1$
Step 3: $L_{22} = \sqrt{5 - 1^2} = \sqrt{4} = 2$
Step 4: $L_{32} = (3 - 1 \times 1)/2 = 1$
Step 5: $L_{33} = \sqrt{6 - 1^2 - 1^2} = \sqrt{4} = 2$
Thus $L = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 1 & 1 & 2 \end{bmatrix}$ and $A = LL^T$.
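A quick numerical confirmation of this worked example, using NumPy's built-in factorization:

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])

L = np.linalg.cholesky(A)        # lower-triangular factor
print(L)                         # [[2. 0. 0.], [1. 2. 0.], [1. 1. 2.]]
print(np.allclose(L @ L.T, A))   # True
```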
Connection to Inner Product Spaces and Cauchy-Schwarz
Positive definite matrices are intimately connected to the theory of inner product spaces. For any positive definite matrix $A$, we can define an inner product $\langle x, y \rangle_A = x^TAy$.
This inner product satisfies all the axioms of an inner product space:
- Symmetry: $\langle x, y \rangle_A = \langle y, x \rangle_A$ (since $A$ is symmetric)
- Linearity: $\langle ax + by, z \rangle_A = a\langle x, z \rangle_A + b\langle y, z \rangle_A$
- Positive definiteness: $\langle x, x \rangle_A > 0$ for $x \neq 0$
The Cauchy-Schwarz inequality holds for this inner product: $|\langle x, y \rangle_A| \leq \|x\|_A \|y\|_A$, where $\|x\|_A = \sqrt{x^TAx}$ is the induced norm.
This connection shows that positive definite matrices naturally induce geometric structure on $\mathbb{R}^n$.
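As a concrete sketch (the matrix $A$ and helper names are ours), the following verifies the Cauchy-Schwarz inequality in the $A$-inner product on random vectors:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])  # positive definite (see the Sylvester check above)

def inner(x, y):
    return x @ A @ y          # <x, y>_A = x^T A y

def norm(x):
    return np.sqrt(inner(x, x))

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
print(abs(inner(x, y)) <= norm(x) * norm(y))  # True: |<x,y>_A| <= ||x||_A ||y||_A
```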
Applications of Positive Definite Matrices
In optimization, positive definite matrices characterize convex functions:
- Second derivative test: A twice-differentiable function $f: \mathbb{R}^n \to \mathbb{R}$ is strictly convex on a convex set if its Hessian matrix $Hf(x)$ is positive definite at every point of that set
- Newton's method: Uses the inverse of the Hessian (or approximation) for quadratic convergence
- Quadratic programming: Minimize $\frac{1}{2}x^TAx + b^Tx$ subject to constraints, where $A$ is positive definite
- Trust region methods: Approximate $f$ by a quadratic model with positive definite Hessian
Positive definiteness ensures that critical points are local minima and optimization algorithms converge.
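To illustrate, here is a minimal sketch (the particular $A$, $b$, and variable names are ours) of a single Newton step on the quadratic $f(x) = \frac{1}{2}x^TAx + b^Tx$: because the Hessian $A$ is positive definite, the step solves a well-posed linear system and lands exactly on the unique minimizer.

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])        # positive definite Hessian of the quadratic model
b = np.array([1., -2., 0.5])

x0 = np.zeros(3)
grad = A @ x0 + b                    # gradient of f(x) = 0.5 x^T A x + b^T x
x1 = x0 - np.linalg.solve(A, grad)   # Newton step: solve A d = grad, never form A^{-1}
print(np.allclose(A @ x1 + b, 0))    # True: one step reaches the unique minimizer
```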
In statistics, covariance matrices are always positive semidefinite:
- Multivariate normal distribution: $X \sim N(\mu, \Sigma)$, where $\Sigma$ is a positive definite covariance matrix
- Principal Component Analysis (PCA): Eigen decomposition of covariance matrix to find directions of maximum variance
- Correlation matrices: Special covariance matrices with 1's on diagonal, always positive semidefinite
- Mahalanobis distance: $d(x,y) = \sqrt{(x-y)^T\Sigma^{-1}(x-y)}$ measures distance normalized by covariance
Positive definiteness ensures that Gaussian distributions are non-degenerate and Mahalanobis distance is well-defined.
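As a small sketch (the covariance matrix below is made up for illustration), the Mahalanobis distance can be computed by solving a linear system with $\Sigma$ rather than forming $\Sigma^{-1}$ explicitly:

```python
import numpy as np

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])   # illustrative positive definite covariance
x = np.array([1.0, 2.0])
y = np.array([0.0, 0.0])

d = x - y
# Solve Sigma z = d instead of computing Sigma^{-1}
dist = np.sqrt(d @ np.linalg.solve(Sigma, d))
print(dist)  # Mahalanobis distance between x and y
```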
In machine learning, positive definite kernels generate positive definite matrices:
- Kernel matrices: For kernel $K$, the Gram matrix $K_{ij} = K(x_i, x_j)$ is positive semidefinite
- Support Vector Machines: Use positive definite kernels to map data to higher-dimensional feature spaces
- Gaussian Processes: Covariance functions must be positive definite to ensure valid probability distributions
- Reproducing Kernel Hilbert Spaces (RKHS): Positive definite kernels define inner products in infinite-dimensional spaces
Mercer's theorem characterizes which kernels produce positive definite matrices.
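A quick sketch, assuming the Gaussian (RBF) kernel $K(x, x') = \exp(-\|x - x'\|^2 / (2\sigma^2))$, that builds a Gram matrix on random points and confirms its eigenvalues are nonnegative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))   # 20 points in R^3
sigma = 1.0

# Pairwise squared distances and the Gaussian (RBF) Gram matrix
sq_dists = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
K = np.exp(-sq_dists / (2 * sigma**2))

eigs = np.linalg.eigvalsh(K)
print(eigs.min() > -1e-10)  # True: all eigenvalues nonnegative (up to round-off)
```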
Numerical Considerations and Stability
Numerical Stability
Cholesky decomposition is numerically stable for positive definite matrices and requires no pivoting for stability; a diagonally pivoted variant is useful for positive semidefinite or ill-conditioned matrices.
Condition Number
The condition number $\kappa(A) = \lambda_{\max}/\lambda_{\min}$ measures sensitivity. Positive definite matrices often have better conditioning than general matrices.
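For a symmetric positive definite matrix, the 2-norm condition number reported by NumPy coincides with this eigenvalue ratio:

```python
import numpy as np

A = np.array([[4., 2., 2.],
              [2., 5., 3.],
              [2., 3., 6.]])

eigs = np.linalg.eigvalsh(A)
print(eigs[-1] / eigs[0])     # lambda_max / lambda_min
print(np.linalg.cond(A, 2))   # agrees for symmetric positive definite A
```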
Positive Definite Approximations
For matrices that are not positive definite (e.g., due to numerical error), we can compute a nearby positive definite approximation by truncating (clipping) the negative eigenvalues.
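A minimal sketch of the eigenvalue-truncation idea (the helper name and clipping floor are ours; production code typically uses more careful projection methods):

```python
import numpy as np

def nearest_positive_definite(A, eps=1e-10):
    """Push a matrix toward positive definiteness by clipping its eigenvalues."""
    S = (A + A.T) / 2                        # work with the symmetric part
    eigvals, eigvecs = np.linalg.eigh(S)
    eigvals = np.clip(eigvals, eps, None)    # raise any nonpositive eigenvalue to eps
    return eigvecs @ np.diag(eigvals) @ eigvecs.T

B = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])              # correlation-like matrix that is not PSD
print(np.linalg.eigvalsh(B))                 # contains a negative eigenvalue
print(np.linalg.eigvalsh(nearest_positive_definite(B)).min() > 0)  # True
```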
Comparison with Related Matrix Classes
| Matrix Class | Definition | Key Properties |
|---|---|---|
| Positive Definite | $x^TAx > 0$ for all $x \neq 0$ | All eigenvalues positive, invertible, Cholesky decomposition exists |
| Positive Semidefinite | $x^TAx \geq 0$ for all $x$ | Eigenvalues nonnegative, may be singular, no unique Cholesky |
| Negative Definite | $x^TAx < 0$ for all $x \neq 0$ | All eigenvalues negative, $-A$ is positive definite |
| Indefinite | $x^TAx$ takes both signs | Has both positive and negative eigenvalues, saddle points |
Connection to the Viral Limit Problem
The viral limit problem from our limit calculator involves inner products that can be represented by positive definite matrices:
Consider sequences $\{h_p\}, \{b_p\}, \{z_p\}$ in $\mathbb{R}^n$ equipped with the inner product $\langle x, y \rangle_A = x^TAy$, where $A$ is positive definite. Assuming $\|h_p\|_A = \sqrt{h_p^TAh_p} \to 1$, the Cauchy-Schwarz inequality gives $|\langle h_p, b_p \rangle_A| \leq \|h_p\|_A \|b_p\|_A$, which bounds the inner products that appear in the limit.
The positive definiteness of $A$ ensures the induced norm is well-defined and Cauchy-Schwarz applies. For the complete solution, see our dedicated solution page.
Advanced Topics: Infinite-Dimensional Generalizations
The concept of positive definiteness extends to operators on Hilbert spaces:
Let $H$ be a Hilbert space. A bounded linear operator $T: H \to H$ is positive definite if there exists $c > 0$ such that $\langle Tx, x \rangle \geq c\|x\|^2$ for all $x \in H$.
This is the infinite-dimensional analog of positive definiteness. Important examples include elliptic differential operators like $-\Delta$ with Dirichlet boundary conditions.
Ready to Apply Positive Definite Matrix Theory?
Use our limit calculator to solve problems involving inner products, or explore Cauchy-Schwarz inequality and Hilbert spaces for deeper theory.
Historical Development and Importance
The theory of positive definite matrices has evolved through several mathematical traditions:
- 19th Century: Quadratic forms studied by Gauss, Cauchy, and Sylvester
- 1907: Toeplitz introduces Toeplitz matrices and studies positive definiteness
- 1924: Cholesky's decomposition algorithm is published posthumously (he had developed it before his death in 1918)
- 1930s: von Neumann uses positive definite operators in quantum mechanics
- 1940s: Numerical linear algebra develops efficient Cholesky algorithms
- 1960s: Convex optimization theory formalizes role of positive definite Hessians
- 1990s: Kernel methods in machine learning rely on positive definite kernel matrices
- Today: Applications in optimization, statistics, machine learning, and quantum computing
Positive Definite Matrices FAQ
Can a non-symmetric matrix be positive definite?
Typically, positive definiteness is defined only for symmetric/Hermitian matrices. For non-symmetric $A$, we usually consider $\frac{1}{2}(A + A^T)$, the symmetric part. Some authors define $A$ as positive definite if its symmetric part is positive definite.
What is the difference between positive definite and positive semidefinite?
Positive definite requires $x^TAx > 0$ for all $x \neq 0$ (strict inequality), while positive semidefinite allows $x^TAx = 0$ for some $x \neq 0$. Equivalently, positive definite matrices have all eigenvalues positive, while positive semidefinite matrices have nonnegative eigenvalues (some may be zero).
How do I check if a matrix is positive definite numerically?
Three practical methods: 1) Attempt a Cholesky decomposition - if it succeeds, the matrix is positive definite; 2) Compute the eigenvalues and check that all are positive; 3) Check Sylvester's criterion by verifying that all leading principal minors are positive. For large matrices, Cholesky is the most efficient.
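A sketch of the first method (the helper name is ours; it assumes the input is already symmetric):

```python
import numpy as np

def is_positive_definite(A):
    """Cheapest practical test: a symmetric matrix is positive definite iff Cholesky succeeds."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[2., 1.], [1., 3.]])))  # True
print(is_positive_definite(np.array([[1., 2.], [2., 1.]])))  # False
```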
Why are covariance matrices always positive semidefinite?
For random vector $X$ with mean $\mu$, the covariance matrix $\Sigma = E[(X-\mu)(X-\mu)^T]$. For any vector $v$, $v^T\Sigma v = E[(v^T(X-\mu))^2] \geq 0$, since it's the expectation of a square. Thus $\Sigma$ is positive semidefinite. It's positive definite if no linear combination of components is constant.