
Matrix Norms
A matrix norm is a norm on the vector space $\mathbb{F}^{m \times n}$, where $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$ denotes the field. Thus, it is a mapping from the vector space to $\mathbb{R}$ that satisfies the following properties of norms:
For all scalars $\alpha \in \mathbb{F}$ and all matrices $\boldsymbol{A}, \boldsymbol{B} \in \mathbb{F}^{m \times n}$, a norm
- is absolutely homogeneous: $\lVert\alpha\boldsymbol{A}\rVert = \lvert\alpha\rvert \lVert\boldsymbol{A}\rVert$;
- is sub-additive, i.e., satisfies the triangle inequality: $\lVert\boldsymbol{A} + \boldsymbol{B}\rVert \le \lVert\boldsymbol{A}\rVert + \lVert\boldsymbol{B}\rVert$;
- is positive-definite: $\lVert\boldsymbol{A}\rVert \ge 0$, and $\lVert\boldsymbol{A}\rVert = 0$ iff $\boldsymbol{A} = \boldsymbol{0}$.
There are several types of matrix norms that satisfy the properties above; we list a few of them below.
Induced by Vector Norms
The most general vector-induced $(p, q)$-norm of an $m \times n$ matrix $\boldsymbol{A}$ is defined as
$$\lVert\boldsymbol{A}\rVert_{p,q} = \max_{\boldsymbol{x} \neq \boldsymbol{0}} \frac{\lVert\boldsymbol{A}\boldsymbol{x}\rVert_q}{\lVert\boldsymbol{x}\rVert_p},$$
where $\lVert\cdot\rVert_p$ and $\lVert\cdot\rVert_q$ are vector norms. When $q = p$, the resulting matrix norm is called the $\ell_p$ norm for simplicity.
Three noteworthy and widely used examples are $p = 1$, $2$, and $\infty$.
- $p = 1$: $\lVert\boldsymbol{A}\rVert_1 = \max_j \sum_i \lvert a_{ij}\rvert$, which is the maximum absolute column sum.
- $p = \infty$: $\lVert\boldsymbol{A}\rVert_\infty = \max_i \sum_j \lvert a_{ij}\rvert$, which is the maximum absolute row sum.
- $p = 2$: $\lVert\boldsymbol{A}\rVert_2 = \sigma_{\max}(\boldsymbol{A})$, which is the largest singular value of $\boldsymbol{A}$.
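The three formulas above can be verified numerically. Below is a minimal sketch using numpy (the matrix is an arbitrary example chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

# l1 induced norm: maximum absolute column sum -> max(4, 6) = 6
norm1 = max(np.sum(np.abs(A), axis=0))
# l_inf induced norm: maximum absolute row sum -> max(3, 7) = 7
norm_inf = max(np.sum(np.abs(A), axis=1))
# l2 induced (spectral) norm: largest singular value
norm2 = np.linalg.svd(A, compute_uv=False)[0]

# Cross-check against numpy's built-in induced norms
assert np.isclose(norm1, np.linalg.norm(A, 1))
assert np.isclose(norm_inf, np.linalg.norm(A, np.inf))
assert np.isclose(norm2, np.linalg.norm(A, 2))
```

Note that `np.linalg.norm` with `ord` equal to `1`, `2`, or `np.inf` computes exactly these induced norms for a 2-D array.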
Entrywise
The most famous entrywise matrix norm is the Frobenius norm: $\lVert\boldsymbol{A}\rVert_F = (\sum_i \sum_j \lvert a_{ij}\rvert^2)^{1/2}$. An important inequality relating the $\ell_2$ norm to the Frobenius norm states that $\lVert\boldsymbol{A}\rVert_2 \le \lVert\boldsymbol{A}\rVert_F$.
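A quick numerical check of the entrywise definition and of the inequality $\lVert\boldsymbol{A}\rVert_2 \le \lVert\boldsymbol{A}\rVert_F$, sketched on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# Entrywise definition of the Frobenius norm
fro = np.sqrt(np.sum(np.abs(A) ** 2))
# Spectral (l2 induced) norm: largest singular value
spectral = np.linalg.norm(A, 2)

assert np.isclose(fro, np.linalg.norm(A, 'fro'))
assert spectral <= fro + 1e-12   # ||A||_2 <= ||A||_F
```

The inequality holds because $\lVert\boldsymbol{A}\rVert_F^2$ is the sum of all squared singular values, while $\lVert\boldsymbol{A}\rVert_2^2$ is only the largest one.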
References
- Wikipedia: https://en.wikipedia.org/wiki/Matrix_norm.
Matrix Derivatives
We first introduce some direct extensions of the scalar derivative.
Then, following the right-hand formula above, we have
where $\boldsymbol{f}, \boldsymbol{g} \in \mathbb{R}^{m}$ and $\boldsymbol{x} \in \mathbb{R}^n$. This uses the definition
It is weird that some materials directly define
The right way is to define
Also, the notation for the derivative of a matrix w.r.t. a vector, such as $\frac{\partial \boldsymbol{A}}{\partial \boldsymbol{x}}$, is weird. I don't understand why it keeps showing up in different materials.
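As a sanity check on vector derivatives of this kind, the product rule for a scalar $\boldsymbol{f}^\top\boldsymbol{g}$ can be verified numerically. The sketch below assumes numerator-layout Jacobians, $J_h[i, j] = \partial h_i / \partial x_j$ (the note's own convention may differ), and uses two arbitrary example maps:

```python
import numpy as np

def f(x):  # example map R^3 -> R^2 (hypothetical, for illustration)
    return np.array([x[0] * x[1], np.sin(x[2])])

def g(x):  # example map R^3 -> R^2 (hypothetical, for illustration)
    return np.array([x[2] ** 2, x[0] + x[1]])

def jacobian(h, x, eps=1e-6):
    """Central finite-difference Jacobian, J[i, j] = dh_i/dx_j."""
    m, n = h(x).size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (h(x + e) - h(x - e)) / (2 * eps)
    return J

x = np.array([0.5, -1.2, 0.7])
# Gradient of the scalar f^T g, computed directly by finite differences
grad_fd = jacobian(lambda z: np.array([f(z) @ g(z)]), x)[0]
# Product rule: d(f^T g)/dx = J_f^T g + J_g^T f
grad_rule = jacobian(f, x).T @ g(x) + jacobian(g, x).T @ f(x)

assert np.allclose(grad_fd, grad_rule, atol=1e-6)
```

Checking identities like this numerically is a useful habit whenever layout conventions differ between references.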
References
- Wikipedia: https://en.wikipedia.org/wiki/Matrix_calculus.
- Appendix C of Pattern Recognition and Machine Learning.
- Lecture notes of Introduction to System Engineering, Prof. Jianming Hu, Department of Automation, Tsinghua University, Spring 2018.