Proofs of Karamata inequality and Muirhead inequality

The main goal of this post is to prove the fore mentioned inequalities, via separation theorem of functional analysis. The main result would be a proof of the result of Birkhorff:

If \displaystyle (x_1,x_2,\cdots,x_n) \succ (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n), (y_1,y_2,\cdots,y_n) \in \mathbb{R}^n, then there exists non-negative numbers a_{\sigma} (\sigma \in S_n) with \sum_{\sigma \in S_n} a_{\sigma} = 1, such that

\displaystyle \sum_{\sigma \in S_n} a_{\sigma} \left( x_{\sigma(1)}, x_{\sigma(2)},\cdots,x_{\sigma(n)} \right) = (y_1, y_2, \cdots, y_n) .

1. Definition and tools

Let \textbf{x} = (x_1,x_2,\cdots,x_n) \succ \textbf{y} = (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n)be arrays in \mathbb{R}^n. We say that \textbf{x} majorizes \textbf{y}, in symbol \textbf{x} \succ \textbf{y}, if

  1. x_1 \ge x_2 \ge \cdots \ge x_n and y_1 \ge y_2 \ge \cdots \ge y_n
  2. x_1 + \cdots x_k \ge y_1 + \cdots y_k for all 1 \leq k \leq n-1
  3. x_1 + \cdots + x_n = y_1 + \cdots + y_n

Examples:

  1. (10, 0, 0) \succ (4, 3, 3)
  2. (n, 0, \cdots , 0) \succ (1, 1, \cdots , 1) (both arrays have n terms)

To prove the theorems in this note, we will need some theorems listed here:

  • Strong Separation Theorem

Let A,B be disjoint, nonempty convex subsets of a topological vector space X. If A is compact, B is closed and X is locally convex, then there is f \in X^* such that

\displaystyle sup\{ \textrm{Re} f(x): x \in A \} < inf \{ \textrm{Re} f(y): y \in B \}.

 

  • Rearrangement Inequality

Let x_1 \ge \cdots \ge x_n, y_1 \ge \cdots \ge y_n be two sequences of real numbers. Then

\displaystyle \sum_{i=1}^n x_iy_i = \textrm{max}_{\sigma, \tau \in S_n} \sum_{i=1}^n x_{\sigma(i)}y_{\tau(i)}

\displaystyle \sum_{i=1}^n x_iy_{n-i} = \textrm{min}_{\sigma, \tau \in S_n} \sum_{i=1}^n x_{\sigma(i)}y_{\tau(i)}

  • Jensen’s inequality

 Let f:\mathbb{R} \rightarrow \mathbb{R} be a convex function. Then for real numbers a_1, \cdots , a_n \ge 0 such that a_1 + \cdots + a_n = 1 and x_1, \cdots, x_n \in \mathbb{R}, we have

\sum_{i=1}^n a_i f(x_i) \ge f \left(\sum_{i=1}^n a_ix_i \right)

  • Weighted AM-GM inequality

for real numbers a_1, \cdots , a_n \ge 0 such that a_1 + \cdots + a_n = 1 and x_1, \cdots, x_n \ge 0, we have

\displaystyle \sum_{i=1}^n a_ix_i \ge \prod_{i=1}^n x_i^{a_i}

2. Main Theorem

Theorem (Birkhoff 1):

If (x_1,x_2,\cdots,x_n) \succ (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n), (y_1,y_2,\cdots,y_n) \in \mathbb{R}^n, then there exists non-negative numbers a_{\sigma} (\sigma \in S_n) with \sum_{\sigma \in S_n} a_{\sigma} = 1, such that

\displaystyle \sum_{\sigma \in S_n} a_{\sigma} \left( x_{\sigma(1)}, x_{\sigma(2)},\cdots,x_{\sigma(n)} \right) = (y_1, y_2, \cdots, y_n) .

Geometrically, this means that the sequences majorized by (x_1,x_2,\cdots,x_n) \in \mathbb{R}^n are exactly the convex hull formed by points with coordinates being all the permutations of (x_1,x_2,\cdots,x_n).

Proof of Birkhoff’s theorem 2

Let \textbf{x} = (x_1,x_2,\cdots,x_n) \succ \textbf{y} = (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n)be arrays in \mathbb{R}^n. Assume that \textbf{x} \succ \textbf{y}, but \textbf{y} does not lie in the convex hull S of the points (x_{\sigma (1)}, \cdots, x_{\sigma (n)}) for all \sigma \in S_n. By separation theorem (take A to be S and B to be \textbf{y}), we can find a linear functional f \in \left(\mathbb{R}^n\right)^* such that

\displaystyle sup\{ \textrm{Re} f(x): x \in A \} < inf \{ \textrm{Re} f(y): y \in B \}

Note that a linear functional of \mathbb{R}^n is of the form f(x_1,\cdots, x_n) = c_1x_1+ \cdots + c_nx_n for some constants c_1, \cdots ,c_n. By adding a constant c if needed, we see that

\displaystyle \textrm{max}_{\sigma \in S_n} \{ c_1 x_{\sigma ( 1 )} + \cdots + c_n x_{\sigma ( n )} + c \} < 0 < c_1y_1 + \cdots + c_ny_n + c

 for some constants c, c_1, \cdots, c_n. The constant c can be removed from this inequality.

Let d_1 \ge d_2 \ge \cdots d_n be an rearrangement of c_1, \cdots, c_n. By rearrangement inequailty, we have

\displaystyle d_1y_1 + \cdots + d_ny_n \\ \ge c_1y_1 + \cdots c_ny_n \\ \ge d_1x_1 + \cdots + d_nx_n.

The last step is because

\displaystyle d_1x_1 + \cdots + d_nx_n = {max}_{\sigma \in S_n} \{ c_1x_{\sigma(1)} + \cdots c_n x_{\sigma(n)} \}

by rearrangement inequality.

This is a contradiction, since

\displaystyle d_1x_1 + \cdots + d_nx_n \\ = d_n(x_1 + \cdots + x_n) + (d_{n-1} - d_n)(x_1 + \cdots + x_{n-1}) + \cdots d_1x_1\\ \ge d_n(y_1 + \cdots + y_n) + (d_{n-1} - d_n)(y_1 + \cdots + y_{n-1}) + \cdots d_1y_1 \\ = d_1y_1 + \cdots + d_ny_n

3. Application 1: Karamata inequality (Majorization inequality)

Theorem (Karamata)

If (x_1,x_2,\cdots,x_n) \succ (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n), (y_1,y_2,\cdots,y_n) \in \mathbb{R}^n, then for any convex function f: \mathbb{R} \rightarrow \mathbb{R}, we have

\displaystyle f(x_1) + \cdots + f(x_n) \ge f(y_1) + \cdots + f(y_n)

Proof

By Birkhoff’s theorem, we have

\displaystyle \sum_{\sigma \in S_n} a_{\sigma} \left( x_{\sigma(1)}, x_{\sigma(2)},\cdots,x_{\sigma(n)} \right) = (y_1, y_2, \cdots, y_n) .

By Jensen’s inequality,

\displaystyle f(y_1) + \cdots + f(y_n) \\ = f(\sum_{\sigma \in S_n} a_{\sigma} x_{\sigma(1)}) + \cdots + f(\sum_{\sigma \in S_n} a_{\sigma} x_{\sigma(n)}) \\ \leq \sum_{\sigma \in S_n} a_{\sigma} \left(f(x_{\sigma(1)}) + \cdots f(x_{\sigma(n)})\right) \\ = f(x_1) + \cdots + f(x_n))

3. Application 2: Muirhead inequality

Theorem (Muirhead)

If (x_1,x_2,\cdots,x_n) \succ (y_1,y_2,\cdots,y_n), where the arrays (x_1,x_2,\cdots,x_n), (y_1,y_2,\cdots,y_n) \in \mathbb{R}^n, then for any array of positive numbers (e_1, \cdots, e_n), we have

\displaystyle \sum_{\tau \in S_n} e_{\tau ( 1)}^{x_1} \cdots e_{\tau (n)}^{x_n} \ge \sum_{\tau \in S_n} e_{\tau ( 1)}^{y_1} \cdots e_{\tau (n)}^{y_n}

Proof

By Birkhoff’s theorem, we have

\displaystyle \sum_{\sigma \in S_n} a_{\sigma} \left( x_{\sigma(1)}, x_{\sigma(2)},\cdots,x_{\sigma(n)} \right) = (y_1, y_2, \cdots, y_n). Then by weighted AM-GM inequality,

\displaystyle e_{\tau ( 1)}^{y_1} \cdots e_{\tau (n)}^{y_n} \\ = e_{\tau ( 1 )}^{\sum_{\sigma \in S_n} a_{\sigma} x_{\sigma(1)}} \cdots e_{\tau (n)}^{\sum_{\sigma \in S_n} a_{\sigma} x_{\sigma(n)}} \\ = \prod_{\sigma \in S_n} \left(e_{\tau(1)}^{x_{\sigma(1)}} \cdots e_{\tau(n)}^{x_{\sigma(n)}}\right)^{a_{\sigma}} \\ \leq \sum_{\sigma \in S_n} a_{\sigma} e_{\tau(1)}^{x_{\sigma(1)}} \cdots e_{\tau(n)}^{x_{\sigma(n)}}

Summing over all the possible \tau \in S_n, we have the desired inequality.

References

1. Albert W. Marshall and Ingram Olkin, Inequalities: Theory of Majorization and Its Applications. Academic Press, 1979.
2. Fedor Petrov, “Majorization Sequences”, http://www.mathlinks.ro/viewtopic.php?search_id=1697861374&t=5849

2 Responses to “Proofs of Karamata inequality and Muirhead inequality”

  1. Mike Steele Says:

    This is a lovely exposition. Thanks for posting it. Cheers, M.

  2. Muirheadの不等式と具体例 Says:

    […] 厳密にやるには結構難しいみたいです。 Muirheadの不等式の厳密な証明はこちら(英語です) […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: