Inequalities and Number Theory
posted: 01-Aug-2025 & updated: 03-Aug-2025
\[% \newcommand{\algA}{\algk{A}} \newcommand{\algC}{\algk{C}} \newcommand{\bigtimes}{\times} \newcommand{\compl}[1]{\tilde{#1}} \newcommand{\complexes}{\mathbb{C}} \newcommand{\dom}{\mathop{\bf dom {}}} \newcommand{\ereals}{\reals\cup\{-\infty,\infty\}} \newcommand{\field}{\mathbb{F}} \newcommand{\integers}{\mathbb{Z}} \newcommand{\lbdseqk}[1]{\seqk{\lambda}{#1}} \newcommand{\meas}[3]{({#1}, {#2}, {#3})} \newcommand{\measu}[2]{({#1}, {#2})} \newcommand{\meast}[3]{\left({#1}, {#2}, {#3}\right)} \newcommand{\naturals}{\mathbb{N}} \newcommand{\nuseqk}[1]{\seqk{\nu}{#1}} \newcommand{\pair}[2]{\langle {#1}, {#2}\rangle} \newcommand{\rationals}{\mathbb{Q}} \newcommand{\reals}{\mathbb{R}} \newcommand{\seq}[1]{\left\langle{#1}\right\rangle} \newcommand{\powerset}{\mathcal{P}} \newcommand{\pprealk}[1]{\reals_{++}^{#1}} \newcommand{\ppreals}{\mathbb{R}_{++}} \newcommand{\prealk}[1]{\reals_{+}^{#1}} \newcommand{\preals}{\mathbb{R}_+} \newcommand{\tXJ}{\topos{X}{J}} % \newcommand{\relint}{\mathop{\bf relint {}}} \newcommand{\boundary}{\mathop{\bf bd {}}} \newcommand{\subsetset}[1]{\mathcal{#1}} \newcommand{\Tr}{\mathcal{\bf Tr}} \newcommand{\symset}[1]{\mathbf{S}^{#1}} \newcommand{\possemidefset}[1]{\mathbf{S}_+^{#1}} \newcommand{\posdefset}[1]{\mathbf{S}_{++}^{#1}} \newcommand{\ones}{\mathbf{1}} \newcommand{\Prob}{\mathop{\bf Prob {}}} \newcommand{\prob}[1]{\Prob\left\{#1\right\}} \newcommand{\Expect}{\mathop{\bf E {}}} \newcommand{\Var}{\mathop{\bf Var{}}} \newcommand{\Mod}[1]{\;(\text{mod}\;#1)} \newcommand{\ball}[2]{B(#1,#2)} \newcommand{\generates}[1]{\langle {#1} \rangle} \newcommand{\isomorph}{\approx} \newcommand{\isomorph}{\approx} \newcommand{\nullspace}{\mathcalfont{N}} \newcommand{\range}{\mathcalfont{R}} \newcommand{\diag}{\mathop{\bf diag {}}} \newcommand{\rank}{\mathop{\bf rank {}}} \newcommand{\Ker}{\mathop{\mathrm{Ker} {}}} \newcommand{\Map}{\mathop{\mathrm{Map} {}}} \newcommand{\End}{\mathop{\mathrm{End} {}}} \newcommand{\Img}{\mathop{\mathrm{Im} {}}} \newcommand{\Aut}{\mathop{\mathrm{Aut} {}}} \newcommand{\Gal}{\mathop{\mathrm{Gal} {}}} \newcommand{\Irr}{\mathop{\mathrm{Irr} {}}} \newcommand{\arginf}{\mathop{\mathrm{arginf}}} \newcommand{\argsup}{\mathop{\mathrm{argsup}}} \newcommand{\argmin}{\mathop{\mathrm{argmin}}} \newcommand{\ev}{\mathop{\mathrm{ev} {}}} \newcommand{\affinehull}{\mathop{\bf aff {}}} \newcommand{\cvxhull}{\mathop{\bf Conv {}}} \newcommand{\epi}{\mathop{\bf epi {}}} \newcommand{\injhomeo}{\hookrightarrow} \newcommand{\perm}[1]{\text{Perm}(#1)} \newcommand{\aut}[1]{\text{Aut}(#1)} \newcommand{\ideal}[1]{\mathfrak{#1}} \newcommand{\bigset}[2]{\left\{#1\left|{#2}\right.\right\}} \newcommand{\bigsetl}[2]{\left\{\left.{#1}\right|{#2}\right\}} \newcommand{\primefield}[1]{\field_{#1}} \newcommand{\dimext}[2]{[#1:{#2}]} \newcommand{\restrict}[2]{#1|{#2}} \newcommand{\algclosure}[1]{#1^\mathrm{a}} \newcommand{\finitefield}[2]{\field_{#1^{#2}}} \newcommand{\frobmap}[2]{\varphi_{#1,{#2}}} % %\newcommand{\algfontmode}{} % %\ifdefined\algfontmode %\newcommand\mathalgfont[1]{\mathcal{#1}} %\newcommand\mathcalfont[1]{\mathscr{#1}} %\else \newcommand\mathalgfont[1]{\mathscr{#1}} \newcommand\mathcalfont[1]{\mathcal{#1}} %\fi % %\def\DeltaSirDir{yes} %\newcommand\sdirletter[2]{\ifthenelse{\equal{\DeltaSirDir}{yes}}{\ensuremath{\Delta #1}}{\ensuremath{#2}}} \newcommand{\sdirletter}[2]{\Delta #1} \newcommand{\sdirlbd}{\sdirletter{\lambda}{\Delta \lambda}} \newcommand{\sdir}{\sdirletter{x}{v}} \newcommand{\seqk}[2]{#1^{(#2)}} \newcommand{\seqscr}[3]{\seq{#1}_{#2}^{#3}} \newcommand{\xseqk}[1]{\seqk{x}{#1}} \newcommand{\sdirk}[1]{\seqk{\sdir}{#1}} \newcommand{\sdiry}{\sdirletter{y}{\Delta y}} \newcommand{\slen}{t} \newcommand{\slenk}[1]{\seqk{\slen}{#1}} \newcommand{\ntsdir}{\sdir_\mathrm{nt}} \newcommand{\pdsdir}{\sdir_\mathrm{pd}} \newcommand{\sdirnu}{\sdirletter{\nu}{w}} \newcommand{\pdsdirnu}{\sdirnu_\mathrm{pd}} \newcommand{\pdsdiry}{\sdiry_\mathrm{pd}} \newcommand\pdsdirlbd{\sdirlbd_\mathrm{pd}} % \newcommand{\normal}{\mathcalfont{N}} % \newcommand{\algk}[1]{\mathalgfont{#1}} \newcommand{\collk}[1]{\mathcalfont{#1}} \newcommand{\classk}[1]{\collk{#1}} \newcommand{\indexedcol}[1]{\{#1\}} \newcommand{\rel}{\mathbf{R}} \newcommand{\relxy}[2]{#1\;\rel\;{#2}} \newcommand{\innerp}[2]{\langle{#1},{#2}\rangle} \newcommand{\innerpt}[2]{\left\langle{#1},{#2}\right\rangle} \newcommand{\closure}[1]{\overline{#1}} \newcommand{\support}{\mathbf{support}} \newcommand{\set}[2]{\{#1|#2\}} \newcommand{\metrics}[2]{\langle {#1}, {#2}\rangle} \newcommand{\interior}[1]{#1^\circ} \newcommand{\topol}[1]{\mathfrak{#1}} \newcommand{\topos}[2]{\langle {#1}, \topol{#2}\rangle} % topological space % \newcommand{\alg}{\algk{A}} \newcommand{\algB}{\algk{B}} \newcommand{\algF}{\algk{F}} \newcommand{\algR}{\algk{R}} \newcommand{\algX}{\algk{X}} \newcommand{\algY}{\algk{Y}} % \newcommand\coll{\collk{C}} \newcommand\collB{\collk{B}} \newcommand\collF{\collk{F}} \newcommand\collG{\collk{G}} \newcommand{\tJ}{\topol{J}} \newcommand{\tS}{\topol{S}} \newcommand\openconv{\collk{U}} % \newenvironment{my-matrix}[1]{\begin{bmatrix}}{\end{bmatrix}} \newcommand{\colvectwo}[2]{\begin{my-matrix}{c}{#1}\\{#2}\end{my-matrix}} \newcommand{\colvecthree}[3]{\begin{my-matrix}{c}{#1}\\{#2}\\{#3}\end{my-matrix}} \newcommand{\rowvecthree}[3]{\begin{bmatrix}{#1}&{#2}&{#3}\end{bmatrix}} \newcommand{\mattwotwo}[4]{\begin{bmatrix}{#1}&{#2}\\{#3}&{#4}\end{bmatrix}} % \newcommand\optfdk[2]{#1^\mathrm{#2}} \newcommand\tildeoptfdk[2]{\tilde{#1}^\mathrm{#2}} \newcommand\fobj{\optfdk{f}{obj}} \newcommand\fie{\optfdk{f}{ie}} \newcommand\feq{\optfdk{f}{eq}} \newcommand\tildefobj{\tildeoptfdk{f}{obj}} \newcommand\tildefie{\tildeoptfdk{f}{ie}} \newcommand\tildefeq{\tildeoptfdk{f}{eq}} \newcommand\xdomain{\mathcalfont{X}} \newcommand\xobj{\optfdk{\xdomain}{obj}} \newcommand\xie{\optfdk{\xdomain}{ie}} \newcommand\xeq{\optfdk{\xdomain}{eq}} \newcommand\optdomain{\mathcalfont{D}} \newcommand\optfeasset{\mathcalfont{F}} % \newcommand{\bigpropercone}{\mathcalfont{K}} % \newcommand{\prescript}[3]{\;^{#1}{#3}} % %\]Introduction
Preamble
Notations
-
sets of numbers
- $\naturals$ - set of natural numbers
- $\integers$ - set of integers
- $\integers_+$ - set of nonnegative integers
- $\rationals$ - set of rational numbers
- $\reals$ - set of real numbers
- $\preals$ - set of nonnegative real numbers
- $\ppreals$ - set of positive real numbers
- $\complexes$ - set of complex numbers
-
sequences $\seq{x_i}$ and the like
- finite $\seq{x_i}_{i=1}^n$, infinite $\seq{x_i}_{i=1}^\infty$ - use $\seq{x_i}$ whenever unambiguously understood
- similarly for other operations, e.g., $\sum x_i$, $\prod x_i$, $\cup A_i$, $\cap A_i$, $\bigtimes A_i$
- similarly for integrals, e.g., $\int f$ for $\int_{-\infty}^\infty f$
-
sets
- $\compl{A}$ - complement of $A$
- $A\sim B$ - $A\cap \compl{B}$
- $A\Delta B$ - $(A\cap \compl{B}) \cup (\compl{A} \cap B)$
- $\powerset(A)$ - set of all subsets of $A$
-
sets in metric vector spaces
- $\closure{A}$ - closure of set $A$
- $\interior{A}$ - interior of set $A$
- $\relint A$ - relative interior of set $A$
- $\boundary A$ - boundary of set $A$
-
set algebra
- $\sigma(\subsetset{A})$ - $\sigma$-algebra generated by $\subsetset{A}$, i.e., smallest $\sigma$-algebra containing $\subsetset{A}$
-
norms in $\reals^n$
- $\|x\|_p$ ($p\geq1$) - $p$-norm of $x\in\reals^n$, i.e., $(|x_1|^p + \cdots + |x_n|^p)^{1/p}$
- e.g., $\|x\|_2$ - Euclidean norm
-
matrices and vectors
- $a_{i}$ - $i$-th entry of vector $a$
- $A_{ij}$ - entry of matrix $A$ at position $(i,j)$, i.e., entry in $i$-th row and $j$-th column
- $\Tr(A)$ - trace of $A \in\reals^{n\times n}$, i.e., $A_{1,1}+ \cdots + A_{n,n}$
-
symmetric, positive definite, and positive semi-definite matrices
- $\symset{n}\subset \reals^{n\times n}$ - set of symmetric matrices
- $\possemidefset{n}\subset \symset{n}$ - set of positive semi-definite matrices; $A\succeq0 \Leftrightarrow A \in \possemidefset{n}$
- $\posdefset{n}\subset \symset{n}$ - set of positive definite matrices; $A\succ0 \Leftrightarrow A \in \posdefset{n}$
-
sometimes,
use Python script-like notations
(with serious abuse of mathematical notations)
-
use $f:\reals\to\reals$ as if it were $f:\reals^n \to \reals^n$,
e.g.,
$$
\exp(x) = (\exp(x_1), \ldots, \exp(x_n)) \quad \mbox{for } x\in\reals^n
$$
and
$$
\log(x) = (\log(x_1), \ldots, \log(x_n)) \quad \mbox{for } x\in\ppreals^n
$$
which corresponds to Python code
numpy.exp(x)
ornumpy.log(x)
wherex
is instance ofnumpy.ndarray
, i.e.,numpy
array -
use $\sum x$ to mean $\ones^T x$ for $x\in\reals^n$,
i.e.
$$
\sum x = x_1 + \cdots + x_n
$$
which corresponds to Python code
x.sum()
wherex
isnumpy
array -
use $x/y$ for $x,y\in\reals^n$ to mean
$$
\rowvecthree{x_1/y_1}{\cdots}{x_n/y_n}^T
$$
which corresponds to Python code
x / y
wherex
andy
are $1$-dnumpy
arrays -
use $X/Y$ for $X,Y\in\reals^{m\times n}$ to mean
$$
\begin{my-matrix}{cccc}
X_{1,1}/Y_{1,1} & X_{1,2}/Y_{1,2} & \cdots & X_{1,n}/Y_{1,n}
\\
X_{2,1}/Y_{2,1} & X_{2,2}/Y_{2,2} & \cdots & X_{2,n}/Y_{2,n}
\\
\vdots & \vdots & \ddots & \vdots
\\
X_{m,1}/Y_{m,1} & X_{m,2}/Y_{m,2} & \cdots & X_{m,n}/Y_{m,n}
\end{my-matrix}
$$
which corresponds to Python code
X / Y
whereX
andY
are $2$-dnumpy
arrays
-
use $f:\reals\to\reals$ as if it were $f:\reals^n \to \reals^n$,
e.g.,
$$
\exp(x) = (\exp(x_1), \ldots, \exp(x_n)) \quad \mbox{for } x\in\reals^n
$$
and
$$
\log(x) = (\log(x_1), \ldots, \log(x_n)) \quad \mbox{for } x\in\ppreals^n
$$
which corresponds to Python code
Some definitions
Some conventions
-
(for some subjects) use following conventions
- $0\cdot \infty = \infty \cdot 0 = 0$
- $(\forall x\in\ppreals)(x\cdot \infty = \infty \cdot x = \infty)$
- $\infty \cdot \infty = \infty$
Algebra
Inequalities
Jensen's inequality
- strictly convex function: for any $x\neq y$ and $0< \alpha <1$ () $$ \alpha f(x) + (1-\alpha) f(y) > f(\alpha x + (1-\alpha) y) $$
- convex function: for any $x, y$ and $0< \alpha <1$ () $$ \alpha f(x) + (1-\alpha) f(y) \geq f(\alpha x + (1-\alpha) y) $$
- if $f$ is strictly convex, equality holds if and only if $x_1=\cdots=x_n$
Jensen's inequality - for random variables
- discrete random variable interpretation of Jensen's inequality in summation form - assume $\Prob(X=x_i) = \alpha_i$, then $$ \Expect f(X) = \alpha_1 f(x_1) + \cdots + \alpha_n f(x_n) \geq f(\alpha_1 x_1 + \cdots + \alpha_n x_n) = f\left(\Expect X\right) $$
- true for any random variables $X$
Proof for $n=3$
- for any $x,y,z$ and $\alpha,\beta,\gamma>0$ with $\alpha + \beta + \gamma = 1$ $$ \begin{eqnarray*} \alpha f(x) + \beta f(y) + \gamma f(z) &=& (\alpha+\beta)\left(\frac{\alpha}{\alpha+\beta} f(x) + \frac{\beta}{\alpha + \beta} f(y)\right) + \gamma f(z) \\ &\geq& (\alpha+\beta)f\left(\frac{\alpha}{\alpha+\beta} x + \frac{\beta}{\alpha + \beta} y\right) + \gamma f(z) \\ &\geq& f\left((\alpha+\beta)\left(\frac{\alpha}{\alpha+\beta} x + \frac{\beta}{\alpha + \beta} y\right) + \gamma z \right) \\ &=& f(\alpha x + \beta y + \gamma z ) \end{eqnarray*} $$
Proof for all $n$
-
use mathematical induction
- assume that Jensen's inequality holds for $1\leq n\leq m$
- for distinct $x_i$ and $\alpha_i>0$ ($1\leq i\leq m+1$) with $\alpha_1 + \cdots + \alpha_{m+1} = 1$ $$ \begin{eqnarray*} \sum^{m+1}_{i=1} \alpha_i f(x_i) &=& \left(\sum^m_{j=1} \alpha_j\right) \sum^m_{i=1} \left(\frac{\alpha_i}{\sum^m_{j=1} \alpha_j} f(x_i)\right) + \alpha_{m+1} f(x_{m+1}) \\ &\geq& \left(\sum^m_{j=1} \alpha_j\right) f\left(\sum^m_{i=1} \left(\frac{\alpha_i}{\sum^m_{j=1} \alpha_j} x_i\right)\right) + \alpha_{m+1} f(x_{m+1}) \\ &=& \left(\sum^m_{j=1} \alpha_j\right) f\left(\frac{1}{\sum^m_{j=1} \alpha_j}\sum^m_{i=1} {\alpha_i}{} x_i\right) + \alpha_{m+1} f(x_{m+1}) \\ &\geq& f\left( \sum^m_{i=1} \alpha_i x_i + \alpha_{m+1} x_{m+1}\right) = f\left( \sum^{m+1}_{i=1} \alpha_i x_i \right) \end{eqnarray*} $$
1st and 2nd order conditions for convexity
- 1st order condition (assuming differentiable $f:\reals\to\reals$) - $f$ is strictly convex if and only if for any $x\neq y$ $$ f(y) > f(x) + f'(x)(y-x) $$
-
2nd order condition (assuming twice-differentiable $f:\reals\to\reals$)
- if $f''(x)>0$, $f$ is strictly convex
- $f$ is convex if and only if for any $x$ $$ f''(x) \geq 0 $$
Jensen's inequality examples
- $f(x)=x^2$ is strictly convex $$ \frac{a^2 + b^2}{2} \geq \left(\frac{a+b}{2}\right)^2 $$
- $f(x)=x^4$ is strictly convex $$ \frac{a^4 + b^4}{2} \geq \left(\frac{a+b}{2}\right)^4 $$
- $f(x)=\exp(x)$ is strictly convex $$ \frac{\exp(a) + \exp(b)}{2} \geq \exp\left(\frac{a+b}{2}\right) $$
- equality holds if and only if $a=b$ for all inequalities
1st and 2nd order conditions for convexity - vector version
- 1st order condition (assuming differentiable $f:\reals^n\to\reals$) - $f$ is strict convex if and only if for any $x,y$ $$ f(y) > f(x) + \nabla f(x)^T (y-x) $$ where $\nabla f(x) \in\reals^{n}$ with $\nabla f(x)_{i} = \partial f(x) / \partial x_i$
-
2nd order condition (assuming twice-differentiable $f:\reals^n\to\reals$)
- if $\nabla^2 f(x) \succ 0$, $f$ is strictly convex
- $f$ is convex if and only if for any $x$ $$ \nabla^2 f(x)\succeq 0 $$
Jensen's inequality examples - vector version
- assume $f:\reals^n\to\reals$
-
$f(x)=\|x\|_2 = \sqrt{\sum x_i^2}$ is strictly convex
$$
(\|a\|_2 + 2\|b\|_2 )/3
\geq
\left\|(a+2b)/3\right\|_2
$$
- equality holds if and only if $a=b\in\reals^n$
-
$f(x)=\|x\|_p = \left(\sum |x_i|^p\right)^{1/p}$ ($p>1$) is strictly convex
$$
\frac{1}{k}
\left(\sum_{i=1}^k\|x^{(i)}\|_p \right)
\geq
\left\|\frac{1}{k}\sum_{i=1}^k x^{(i)}\right\|_p
$$
- equality holds if and only if $x^{(1)}=\cdots=x^{(k)}\in\reals^n$
AM $\geq$ GM
-
for all $a,b>0$
$$
\frac{a + b}{2} \geq \sqrt{ab}
$$
- equality holds if and only if $a=b$
- below most general form holds
- let's prove these incrementally (for rational $\alpha_i$)
Proof of AM $\geq$ GM - simplest case
- use fact that $x^2\geq0$ for any $x\in\reals$
-
for any $a,b>0$
$$
\begin{eqnarray*}
&&
(\sqrt{a}-\sqrt{b})^2 \geq 0
\\
&\Leftrightarrow&
a^2 - 2\sqrt{ab} + b^2 \geq 0
\\
&\Leftrightarrow&
a + b \geq 2\sqrt{ab}
\\
&\Leftrightarrow&
\frac{a + b}{2} \geq \sqrt{ab}
\end{eqnarray*}
$$
- equality holds if and only if $a=b$
Proof of AM $\geq$ GM - when $n=4$ and $n=8$
-
for any $a,b,c,d>0$
$$
\frac{a+b+c+d}{4}
\geq
\frac{2\sqrt{ab} + 2\sqrt{cd}}{4}
=
\frac{\sqrt{ab} + \sqrt{cd}}{2}
\geq
\sqrt{\sqrt{ab} \sqrt{cd}}
=
\sqrt[4]{abcd}
$$
- equality holds if and only if $a=b$ and $c=d$ and $ab=cd$ if and only if $a=b=c=d$
-
likewise, for $a_1,\ldots,a_8>0$
$$
\begin{eqnarray*}
\frac{a_1+\cdots+a_8}{8}
&\geq&
\frac{\sqrt{a_1a_2} + \sqrt{a_3a_4} + \sqrt{a_5a_6} + \sqrt{a_7a_8}}{4}
\\
&\geq&
\sqrt[4]{\sqrt{a_1a_2} \sqrt{a_3a_4} \sqrt{a_5a_6} \sqrt{a_7a_8}}
\\
&=&
\sqrt[8]{a_1\cdots a_8}
\end{eqnarray*}
$$
- equality holds if and only if $a_1=\cdots=a_8$
Proof of AM $\geq$ GM - when $n=2^m$
-
generalized to cases $n=2^m$
$$
\left(\sum_{a=1}^{2^m} a_i\right) / 2^m\geq \left({\prod_{a=1}^{2^m} a_i}\right)^{1/2^m}
$$
- equality holds if and only if $a_1=\cdots=a_{2^m}$
- can be proved by mathematical induction
Proof of AM $\geq$ GM - when $n=3$
-
proof for $n=3$
$$
\begin{eqnarray*}
&&
\frac{a+b+c}{3} = \frac{a + b + c + (a+b+c)/3}{4}
\geq \sqrt[4]{abc(a+b+c)/3}
\\
&\Rightarrow&
\left(\frac{a+b+c}{3}\right)^4 \geq {abc(a+b+c)/3}
\\
&\Leftrightarrow&
\left(\frac{a+b+c}{3}\right)^3 \geq abc
\\
&\Leftrightarrow&
\frac{a+b+c}{3} \geq \sqrt[3]{abc}
\end{eqnarray*}
$$
- equality holds if and only if $a=b=c=(a+b+c)/3$ if and only if $a=b=c$
Proof of AM $\geq$ GM - for all integers
- for any integer $n\neq 2^m$
-
for $m$ such that $2^m>n$
$$
\begin{eqnarray*}
&&
\frac{a_1+\cdots+a_n}{n} = \frac{a_1 + \cdots + a_n + (2^m-n) (a_1+\cdots+a_n) /n}{2^m}
\\
&&
\geq
\sqrt[2^m]{a_1\cdots a_n \cdot ((a_1 + \cdots + a_n)/n)^{2^m-n}}
\\
&\Leftrightarrow&
\left(\frac{a_1+\cdots+a_n}{n}\right)^{2^m}
\geq
{a_1\cdots a_n \cdot \left(\frac{a_1 + \cdots + a_n}{n}\right)^{2^m-n}}
\\
&\Leftrightarrow&
\left(\frac{a_1+\cdots+a_n}{n}\right)^{n}
\geq
{a_1\cdots a_n}
\\
&\Leftrightarrow&
\frac{a_1+\cdots+a_n}{n}
\geq
\sqrt[n]{a_1\cdots a_n}
\end{eqnarray*}
$$
- equality holds if and only if $a_1=\cdots=a_n$
Proof of AM $\geq$ GM - rational $\alpha_i$
- given $n$ positive rational $\alpha_i$, we can find $n$ natural numbers $q_i$ such that $$ \alpha_i = \frac{q_i}{ N} $$ where $q_1+\cdots+q_n=N$
-
for any $n$ positive $a_i\in\reals$ and positive $n$ $\alpha_i\in\rationals$ with $\alpha_1+\cdots+\alpha_n=1$
$$
\alpha_1 a_1 + \cdots + \alpha_n a_n
= \frac{q_1 a_1 + \cdots + q_n a_n}{N}
\geq \sqrt[N]{a_1^{q_1}\cdots a_n^{q_n}}
= a_1^{\alpha_1}\cdots a_n^{\alpha_n}
$$
- equality holds if and only if $a_1=\cdots=a_n$
Proof of AM $\geq$ GM - real $\alpha_i$
- exist $n$ rational sequences $\{ \beta_{i,1}, \beta_{i,2}, \ldots\}$ ($1\leq i\leq n$) such that $$ \begin{eqnarray*} && \beta_{1,j}+\cdots+\beta_{n,j}=1 \ \forall \ j\geq1 \\ && \lim_{j\to\infty} \beta_{i,j} = \alpha_i \ \forall \ 1\leq i\leq n \end{eqnarray*} $$
- for all $j$ $$ \beta_{1,j} a_1 + \cdots + \beta_{n,j} a_n \geq a_1^{\beta_{1,j}}\cdots a_n^{\beta_{n,j}} $$ hence $$ \begin{eqnarray*} && \lim_{j\to\infty} \left(\beta_{1,j} a_1 + \cdots + \beta_{n,j} a_n \right) \geq \lim_{j\to\infty} a_1^{\beta_{1,j}}\cdots a_n^{\beta_{n,j}} \\ &\Leftrightarrow& \alpha_1 a_1 + \cdots + \alpha_n a_n \geq a_1^{\alpha_1}\cdots a_n^{\alpha_n} \end{eqnarray*} $$
- cannot prove equality condition from above proof method
Proof of AM $\geq$ GM using Jensen's inequality
- $(-\log)$ is strictly convex function because $$ \frac{d^2}{dx^2} \left(-\log(x)\right) = \frac{d}{dx} \left(-\frac{1}{x} \right) = \frac{1}{x^2} > 0 $$
- Jensen's inequality implies for $a_i >0$, $\alpha_i >0$ with $\sum \alpha_i = 1$ $$ \begin{eqnarray*} -\log\left(\prod a_i^{\alpha_i}\right) = -\sum \log\left( a_i^{\alpha_i}\right) = \sum \alpha_i (-\log(a_i)) \geq -\log \left(\sum \alpha_i a_i\right) \end{eqnarray*} $$
- $(-\log)$ strictly monotonically decreases, hence $\prod a_i^{\alpha_i} \leq \sum \alpha_i a_i$, having just proved $$ \alpha_1 a_1 + \cdots + \alpha_n a_n \geq a_1^{\alpha_1}\cdots a_n^{\alpha_n} $$ where equality if and only if $a_i$ are equal (by Jensen's inequality's equality condition)
Cauchy-Schwarz inequality
-
middle school proof
$$
\begin{eqnarray*}
&&\sum (t a_i + b_i)^2 \geq 0 \ \forall\ t \in \reals
\\
&\Leftrightarrow&
t^2 \sum a_i^2 + 2t \sum a_ib_i + \sum b_i^2 \geq 0 \ \forall\ t \in \reals
\\
&\Leftrightarrow&
\Delta = \left(\sum a_ib_i \right)^2 - \sum a_i^2 \sum b_i^2 \leq 0
\end{eqnarray*}
$$
- equality holds if and only if $\exists t\in\reals$, $t a_i + b_i=0$ for all $1\leq i\leq n$
Cauchy-Schwarz inequality - another proof
-
$x\geq0$ for any $x\in\reals$, hence
$$
\begin{eqnarray*}
&&
\sum_i \sum_j (a_ib_j - a_jb_i)^2 \geq0
\\
&\Leftrightarrow&
\sum_i \sum_j (a_i^2b_j^2 - 2a_ia_jb_ib_j + a_j^2b_i^2) \geq0
\\
&\Leftrightarrow&
\sum_i \sum_j a_i^2b_j^2 + \sum_i \sum_j a_j^2b_i^2 -2 \sum_i \sum_j a_ia_jb_ib_j \geq 0
\\
&\Leftrightarrow&
2 \sum_i a_i^2 \sum_j b_j^2 - 2 \sum_i a_ib_i \sum_j a_jb_j \geq 0
\\
&\Leftrightarrow&
\sum_i a_i^2 \sum_j b_j^2 - \left(\sum_i a_ib_i\right)^2 \geq0
\end{eqnarray*}
$$
- equality holds if and only if $a_ib_j=a_jb_i$ for all $1\leq i,j\leq n$
Cauchy-Schwarz inequality - still another proof
- for any $x,y\in\reals$ and $\alpha,\beta>0$ with $\alpha + \beta = 1$ $$ \begin{eqnarray*} && (\alpha x - \beta y)^2 = \alpha^2 x^2 + \beta^2 y^2 - 2\alpha \beta xy \\ && = \alpha(1-\beta) x^2 + (1-\alpha)\beta y^2 - 2\alpha \beta xy \geq 0 \\ &\Leftrightarrow& \alpha x^2 + \beta y^2 \geq \alpha \beta x^2 + \alpha \beta y^2 + 2\alpha \beta xy = \alpha \beta (x+y)^2 \\ &\Leftrightarrow& x^2 / \alpha + y^2 / \beta \geq (x+y)^2 \end{eqnarray*} $$
- plug in $x=a_i$, $y=b_i$, $\alpha = A/(A+B)$, $\beta=B/(A+B)$ where $A = \sqrt{\sum a_i^2}$, $B = \sqrt{\sum b_i^2}$ $$ \begin{eqnarray*} && \sum (a_i^2 / \alpha + b_i^2 / \beta) \geq \sum (a_i+b_i)^2 \Leftrightarrow (A+B)^2 \geq A^2 + B^2 + 2 \sum a_i b_i \\ &\Leftrightarrow& AB \geq \sum a_i b_i \Leftrightarrow A^2B^2 \geq \left(\sum a_i b_i\right)^2 \Leftrightarrow {\sum a_i^2}{\sum b_i^2} \geq \left(\sum a_i b_i \right)^2 \end{eqnarray*} $$
Cauchy-Schwarz inequality - proof using determinant
-
almost the same proof as first one - but using $2$-by-$2$ matrix determinant
$$
\begin{eqnarray*}
&&\sum (x a_i + y b_i )^2 \geq 0 \ \forall\ x,y \in \reals
\\
&\Leftrightarrow&
x^2 \sum a_i^2 + 2xy \sum a_ib_i + y^2\sum b_i^2 \geq 0 \ \forall \ x, y \in \reals
\\
&\Leftrightarrow&
\begin{my-matrix}{cc}
x & y
\end{my-matrix}
\begin{my-matrix}{cc}
\sum a_i^2 & \sum a_ib_i
\\
\sum a_ib_i & \sum b_i^2
\end{my-matrix}
\begin{my-matrix}{c}
x \\ y
\end{my-matrix}
\geq 0
\ \forall \ x, y \in \reals
\\
\\
&\Leftrightarrow&
\left|
\begin{array}{cc}
\sum a_i^2 & \sum a_ib_i
\\
\sum a_ib_i & \sum b_i^2
\end{array}
\right|
\geq 0
\Leftrightarrow
\sum a_i^2 \sum b_i^2 - \left(\sum a_ib_i \right)^2 \geq0
\end{eqnarray*}
$$
- equality holds if and only if $$ \left( \exists x,y\in\reals \mbox{ with } xy\neq0 \right) \left( xa_i + yb_i=0\ \ \forall 1\leq i\leq n \right) $$
- allows beautiful generalization of Cauchy-Schwarz inequality
Cauchy-Schwarz inequality - generalization
- want to say something like $\sum_{i=1}^n (x a_i + y b_i + z c_i + w d_i + \cdots)^2$
-
run out of alphabets - use double subscripts
$$
\begin{eqnarray*}
&&
\sum_{i=1}^n (x_1 A_{1,i} + x_2 A_{2,i} + \cdots + x_m A_{m,i})^2 \geq 0 \ \forall\ x_i \in \reals
\\
&\Leftrightarrow&
\sum_{i=1}^n (x^T a_i)^2
=
\sum_{i=1}^n x^T a_ia_i^T x
=
x^T \left(\sum_{i=1}^n a_ia_i^T\right) x \geq 0 \ \forall\ x \in \reals^m
\\
&\Leftrightarrow&
\left|
\begin{array}{cccc}
\sum_{i=1}^n A_{1,i}^2 & \sum_{i=1}^n A_{1,i} A_{2,i} & \cdots & \sum_{i=1}^n A_{1,i} A_{m,i}
\\
\sum_{i=1}^n A_{1,i}A_{2,i} & \sum_{i=1}^n A_{2,i}^2 & \cdots & \sum_{i=1}^n A_{2,i} A_{m,i}
\\
\vdots & \vdots & \ddots & \vdots
\\
\sum_{i=1}^n A_{1,i}A_{m,i} & \sum_{i=1}^n A_{2,i}A_{m,i} & \cdots & \sum_{i=1}^n A_{m,i}^2
\end{array}
\right|
\geq 0
\end{eqnarray*}
$$
- where $a_i = \begin{my-matrix}{ccc} A_{1,i} &\cdots & A_{m,i}\end{my-matrix}^T \in\reals^m$
- equality holds if and only if $\exists x\neq0\in\reals^m$, $x^Ta_i =0$ for all $1\leq i\leq n$
Cauchy-Schwarz inequality - three series of variables
-
let $m=3$
$$
\begin{eqnarray*}
&&
\begin{my-matrix}{ccc}
\sum a_{i}^2 & \sum a_{i} b_{i} & \sum a_{i} c_{i}
\\
\sum a_{i}b_{i} & \sum b_{i}^2 & \sum b_{i} c_{i}
\\
\sum a_{i}c_{i} & \sum b_{i}c_{i} & \sum c_{i}^2
\end{my-matrix}
\succeq 0
\\
&\Rightarrow&
\sum a_i^2 \sum b_i^2 \sum c_i^2 + 2 \sum a_ib_i \sum b_ic_i \sum c_ia_i
\\
&&
\geq \sum a_i^2 \left(\sum b_i c_i\right)^2 + \sum b_i^2 \left(\sum a_i c_i\right)^2 + \sum c_i^2 \left(\sum a_i b_i\right)^2
\end{eqnarray*}
$$
- equality holds if and only if $\exists x,y,z\in\reals$, $xa_i + yb_i + zc_i=0$ for all $1\leq i\leq n$
-
questions for you
- what does this mean?
- any real-world applications?
Cauchy-Schwarz inequality - extensions
- note that all these can be further generalized as in page~\pageref{page:Cauchy-Schwarz inequality - generalization}
Number Theory - Queen of Mathematics
Integers
-
integers ($\integers$)
-
$\ldots -2, -1, 0, 1, 2, \ldots$
- first defined by Bertrand Russell
-
algebraic structure - commutative ring
- addition, multiplication defined, but divison not defined
- addition, multiplication are associative
- multiplication distributive over addition
- addition, multiplication are commutative
-
natural numbers ($\naturals$)
- $1, 2, \ldots$
Division and prime numbers
- divisors for $n\in\naturals$ $$ \set{d\in\naturals}{ d \mbox{ divides } n} $$
-
prime numbers
- $p$ is primes if $1$ and $p$ are only divisors
Fundamental theorem of arithmetic
Elementary quantities
- greatest common divisor (gcd) (of $a$ and $b$) $$ \gcd(a,b) = \max \set{d}{d\mbox{ divides both }a \mbox{ and } b} $$
- least common multiple (lcm) (of $a$ and $b$) $$ \mbox{lcm}(a,b) = \min \set{m}{\mbox{both } a \mbox{ and } b \mbox{ divides }m} $$
- $a$ and $b$ coprime, relatively prime, mutually prime $\Leftrightarrow$ $\gcd(a,b)=1$
Are there infinite number of prime numbers?
- yes!
-
proof
- assume there only exist finite number of prime numbers, e.g., $p_1 < p_2 < \cdots <p_n$
- but then, $p_1 \cdot p_2 \cdots p_n + 1$ is prime, but which is greater than $p_n$, hence contradiction
Integers modulo $n$
-
$a\equiv b\Mod{n}$ and $c\equiv d\Mod{n}$ imply
- $a+c\equiv b+d \Mod{n}$
- $ac\equiv bd \Mod{n}$
Euler's theorem
- e.g., $\varphi(12) = \varphi(2^2\cdot 3^1) = 1\cdot2^1\cdot 2\cdot3^0 = 4$, $\varphi(10) = \varphi(2^1\cdot5^1) = 1\cdot2^0\cdot 4\cdot 5^0 =4$
- e.g., $5^4 \equiv 1 \Mod{12}$ whereas $4^4 \equiv 4 \neq 1 \Mod{12}$
- Euler's theorem underlies RSA cryptosystem, which is pervasively used in internet communication