Weierstrass Approximation Theorem

Theorem.

The set of polynomials is dense in $C^{0}([0, 1], \mathbb{R})$, i.e. for any continuous $f : [0, 1] \to \mathbb{R}$ and any $ε > 0$, there exists a polynomial $p(x)$ such that $|f(x) - p(x)| < ε$ for all $x \in [0, 1]$.

Remark.

Proving this theorem with just the tools of analysis is possible, but quite cumbersome. Borrowing some ideas from probability makes the proof much simpler and provides better motivations for the seemingly arbitrary tools used.

Proof.

Let $f : [0, 1] \to \mathbb{R}$ be an arbitrary continuous function. We start by observing that because $f$ is a continuous function on a compact domain, it must be uniformly continuous (continuous on a compact domain implies uniformly continuous). Recall that $f$ is uniformly continuous if for all $ε > 0$, there exists a single $δ > 0$ such that any two points of $[0, 1]$ at distance less than $δ$ have images at distance less than $ε$, i.e.

$$ |s - t| < δ \implies |f(s) - f(t)| < ε . $$

Uniform continuity is one of the main engines of the proof.
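As a quick illustration (the function $t^{2}$ is our own choice here, not part of the proof), take $f(t) = t^{2}$ on $[0, 1]$. For any $s, t \in [0, 1]$,

$$ |f(s) - f(t)| = |s + t| \, |s - t| \leq 2 \, |s - t| , $$

so $δ = \frac{ε}{2}$ works for every pair of points at once. The same $δ$ serves the whole interval, which is exactly what "uniform" means.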

The first tool we borrow from probability is the binomial random variable. Let $X \sim \operatorname{Binomial}(n, x)$ where $x$ is some arbitrary real number in $[0, 1]$. Recall that $P(X = k) = \binom{n}{k} x^{k} (1 - x)^{n - k}$, $E\left(\frac{X}{n}\right) = x$, $\operatorname{Var}\left(\frac{X}{n}\right) = \frac{1}{n^{2}} \operatorname{Var}(X) = \frac{n x (1 - x)}{n^{2}} = \frac{x (1 - x)}{n}$, and that Chebyshev’s Inequality states that, for any $l > 0$ (and $0 < x < 1$, so that the variance is positive),

$$ P\left( \left| \frac{X}{n} - E\left(\frac{X}{n}\right) \right| \geq l \sqrt{\operatorname{Var}\left(\frac{X}{n}\right)} \right) = P\left( \left| \frac{X}{n} - x \right| \geq l \sqrt{\frac{x (1 - x)}{n}} \right) \leq \frac{1}{l^{2}} . $$

Chebyshev’s Inequality is the second tool we borrow from probability and is the other main engine of the proof.

Let us define the following polynomial (a.k.a. the Bernstein polynomial):

$$ b_{n}(x) := \sum_{k=0}^{n} P(X = k) \, f\left(\frac{k}{n}\right) = \sum_{k=0}^{n} \frac{n!}{k! \, (n - k)!} \, x^{k} (1 - x)^{n - k} f\left(\frac{k}{n}\right) . $$

This is a finite sum of polynomials in $x$ with constant coefficients, so it is indeed a polynomial in $x$ (of degree at most $n$). We will show that for all $ε > 0$, there exists $n$ large enough such that for all $x \in [0, 1]$, $|f(x) - b_{n}(x)| < ε$.
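To get a feel for the claim before proving it, here is a minimal Python sketch that evaluates $b_{n}(x)$ directly from its definition and measures the largest error on a grid. The function names (`bernstein`, `sup_error`, `kink`), the test function $|t - \frac{1}{2}|$ and the grid size are illustrative assumptions, and a grid maximum only approximates the true supremum.

```python
import math


def bernstein(f, n, x):
    """Evaluate b_n(x) = sum_k C(n, k) x^k (1 - x)^(n - k) f(k / n)."""
    return sum(
        math.comb(n, k) * x**k * (1 - x) ** (n - k) * f(k / n)
        for k in range(n + 1)
    )


def sup_error(f, n, grid_size=201):
    """Largest |f(x) - b_n(x)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


def kink(t):
    """Continuous but not differentiable at 1/2 (illustrative test function)."""
    return abs(t - 0.5)


if __name__ == "__main__":
    # The grid error should shrink as n grows.
    for n in (10, 40, 160):
        print(n, sup_error(kink, n))
```

As a sanity check, for $f(t) = t^{2}$ one can verify by hand that $b_{n}(x) = E\left(\left(\frac{X}{n}\right)^{2}\right) = x^{2} + \frac{x (1 - x)}{n}$, so `bernstein(lambda t: t * t, 10, 0.3)` should agree with $0.111$ up to rounding. The direct formula is only suitable for moderate $n$; for very large $n$ the binomial coefficients no longer fit in a float.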

We proceed by examining $∣ f (x) - b_{n} (x) ∣ = ∣ b_{n} (x) - f (x) ∣$ :

$$ |b_{n}(x) - f(x)| = \left| \sum_{k=0}^{n} P(X = k) \, f\left(\frac{k}{n}\right) - f(x) \sum_{k=0}^{n} P(X = k) \right| $$

where we used the fact that $\sum_{k = 0}^{n} P (X = k) = 1$ (since probabilities are normalized). Then,

$$ \left| \sum_{k=0}^{n} P(X = k) \, f\left(\frac{k}{n}\right) - f(x) \sum_{k=0}^{n} P(X = k) \right| = \left| \sum_{k=0}^{n} P(X = k) \left( f\left(\frac{k}{n}\right) - f(x) \right) \right| , $$

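and by $n$ applications of the Triangle Inequality, together with $P(X = k) \geq 0$,

$$ \left| \sum_{k=0}^{n} P(X = k) \left( f\left(\frac{k}{n}\right) - f(x) \right) \right| \leq \sum_{k=0}^{n} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| . $$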

In order to use our two inequalities (one from uniform continuity and the other from Chebyshev), we split the sum into two. We do this by splitting the indices $\{0, 1, \dots, n\}$ into two sets based on some parameter $η > 0$ that we will set later:

$$ S_{l} = \left\{ k \in \{0, 1, \dots, n\} : \left| \frac{k}{n} - x \right| < η \right\} $$

and

$$ S_{g} = \left\{ k \in \{0, 1, \dots, n\} : \left| \frac{k}{n} - x \right| \geq η \right\} . $$

Then, by the triangle inequality,

$$ \left| \sum_{k=0}^{n} P(X = k) \left( f\left(\frac{k}{n}\right) - f(x) \right) \right| \leq \sum_{k \in S_{l}} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| + \sum_{k \in S_{g}} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| . $$

We will bound both terms above by $\frac{ε}{2}$ .
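To make the split concrete, the following sketch builds the two index sets for one choice of $n$, $x$ and $η$ and computes the two partial sums. The values, the test function `math.sqrt` and the helper names `pmf` and `split_sums` are arbitrary illustrative assumptions; the point is only that the "near" and "far" sums together dominate $|b_{n}(x) - f(x)|$.

```python
import math


def pmf(n, k, x):
    """P(X = k) for X ~ Binomial(n, x)."""
    return math.comb(n, k) * x**k * (1 - x) ** (n - k)


def split_sums(f, n, x, eta):
    """Sums of P(X = k) |f(k/n) - f(x)| over S_l (near) and S_g (far)."""
    near = far = 0.0
    for k in range(n + 1):
        term = pmf(n, k, x) * abs(f(k / n) - f(x))
        if abs(k / n - x) < eta:
            near += term  # k belongs to S_l
        else:
            far += term  # k belongs to S_g
    return near, far


if __name__ == "__main__":
    f = math.sqrt  # illustrative continuous function on [0, 1]
    n, x, eta = 50, 0.3, 0.1  # illustrative values
    near, far = split_sums(f, n, x, eta)
    b_n = sum(pmf(n, k, x) * f(k / n) for k in range(n + 1))
    print("near:", near, "far:", far, "|b_n(x) - f(x)|:", abs(b_n - f(x)))
```

The near sum is controlled by how little $f$ varies within distance $η$ of $x$, while the far sum is controlled by how little probability mass lies outside that window.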

For the first term, we leverage uniform continuity. We know that given $\frac{ε}{2} > 0$, there exists $δ > 0$ such that for any $x \in [0, 1]$ and any $k \in \{0, 1, \dots, n\}$ (which means $\frac{k}{n} \in [0, 1]$), $\left| \frac{k}{n} - x \right| < δ \implies \left| f\left(\frac{k}{n}\right) - f(x) \right| < \frac{ε}{2}$. We will set our parameter $η$ to the $δ$ obtained here. Then,

$$ \sum_{k \in S_{l}} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| \leq \frac{ε}{2} \sum_{k \in S_{l}} P(X = k) \leq \frac{ε}{2} , $$

where we used the fact that $\sum_{k \in S_{l}} P (X = k) \leq 1$ .

For the second term, we will leverage Chebyshev’s Inequality. We first observe that since $f$ is continuous on a compact domain, its image is compact and hence bounded, i.e. there exists $M > 0$ such that $|f(x)| \leq M$ for all $x \in [0, 1]$. Thus, since $\left| f\left(\frac{k}{n}\right) - f(x) \right| \leq \left| f\left(\frac{k}{n}\right) \right| + |f(x)| \leq 2M$,

$$ \sum_{k \in S_{g}} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| \leq 2M \sum_{k \in S_{g}} P(X = k) . $$

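We then observe that $\sum_{k \in S_{g}} P(X = k) = P\left( \left| \frac{X}{n} - x \right| \geq η \right)$. This is because $S_{g}$ is exactly the set of outcomes $k$ in the sample space such that $\left| \frac{k}{n} - x \right| \geq η$. If $η = l \sqrt{\operatorname{Var}\left(\frac{X}{n}\right)}$, then $\frac{1}{l^{2}} = \frac{\operatorname{Var}(\frac{X}{n})}{η^{2}}$. (If $x \in \{0, 1\}$, the variance is zero, $\frac{X}{n} = x$ with probability one, and the bound below holds trivially.) So, by Chebyshev,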

$$ 2M \sum_{k \in S_{g}} P(X = k) = 2M \cdot P\left( \left| \frac{X}{n} - x \right| \geq η \right) \leq 2M \cdot \frac{1}{l^{2}} = 2M \cdot \frac{\operatorname{Var}(\frac{X}{n})}{η^{2}} = 2M \cdot \frac{x (1 - x)}{n η^{2}} . $$

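Observe that $x (1 - x) \leq \frac{1}{4}$ for all $x \in [0, 1]$, so $\frac{x (1 - x)}{n η^{2}} \leq \frac{1}{4 n η^{2}} \to 0$ as $n \to \infty$, uniformly in $x$. So, for large enough $n$ (chosen independently of $x$), $\frac{x (1 - x)}{n η^{2}} < \frac{ε}{4M}$, and therefore

$$ \sum_{k \in S_{g}} P(X = k) \left| f\left(\frac{k}{n}\right) - f(x) \right| < 2M \cdot \frac{ε}{4M} = \frac{ε}{2} . $$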
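The Chebyshev step can be checked numerically. The sketch below sums the binomial probabilities over $S_{g}$ and compares the result with $\frac{x (1 - x)}{n η^{2}}$. The helper names and the sample values of $n$, $x$ and $η$ are illustrative assumptions; $η = 0.123$ is chosen so that no $\frac{k}{n}$ lands exactly on the boundary, where floating-point rounding could misclassify an index.

```python
import math


def pmf(n, k, x):
    """P(X = k) for X ~ Binomial(n, x)."""
    return math.comb(n, k) * x**k * (1 - x) ** (n - k)


def tail_probability(n, x, eta):
    """P(|X/n - x| >= eta), by direct summation over the indices in S_g."""
    return sum(pmf(n, k, x) for k in range(n + 1) if abs(k / n - x) >= eta)


def chebyshev_bound(n, x, eta):
    """Var(X/n) / eta^2 = x (1 - x) / (n eta^2)."""
    return x * (1 - x) / (n * eta**2)


if __name__ == "__main__":
    eta = 0.123  # illustrative value
    for n in (25, 100, 400):
        for x in (0.1, 0.5, 0.9):
            print(n, x, tail_probability(n, x, eta), chebyshev_bound(n, x, eta))
```

In runs like this the actual tail probability is typically far below the Chebyshev bound. The proof does not need a sharp estimate, only one that tends to zero as $n$ grows.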

We have successfully bounded both terms we had set out to bound. Thus, we have obtained our desired bound:

$$ \left| \sum_{k=0}^{n} P(X = k) \left( f\left(\frac{k}{n}\right) - f(x) \right) \right| < \frac{ε}{2} + \frac{ε}{2} = ε . \quad \square $$
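The proof is constructive, and it can be turned into an explicit choice of $n$. The sketch below does so under an extra assumption that is not part of the theorem: $f$ is Lipschitz with constant $L$, so that $δ = \frac{ε}{2L}$ works in the uniform continuity step. Using $x (1 - x) \leq \frac{1}{4}$, the Chebyshev term is below $\frac{ε}{2}$ as soon as $n > \frac{M}{ε η^{2}}$ with $η = δ$. All function names are hypothetical, and the probabilities are computed in log space only to avoid overflow for large $n$.

```python
import math


def pmf(n, k, x):
    """P(X = k) for X ~ Binomial(n, x), computed in log space."""
    if x == 0.0:
        return 1.0 if k == 0 else 0.0
    if x == 1.0:
        return 1.0 if k == n else 0.0
    log_p = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(x) + (n - k) * math.log(1 - x)
    )
    return math.exp(log_p)


def bernstein(f, n, x):
    """b_n(x) = sum_k P(X = k) f(k / n)."""
    return sum(pmf(n, k, x) * f(k / n) for k in range(n + 1))


def n_from_proof(eps, lipschitz, bound):
    """An integer n with n > M / (eps * eta^2), where eta = eps / (2 L).

    Assumes f is L-Lipschitz and |f| <= M on [0, 1]. The result is subject
    to floating-point rounding when the quotient is very close to an integer.
    """
    eta = eps / (2 * lipschitz)
    return math.floor(bound / (eps * eta**2)) + 1


def sup_error(f, n, grid_size=101):
    """Largest |f(x) - b_n(x)| over an evenly spaced grid in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


def kink(t):
    """|t - 1/2|: Lipschitz with L = 1 and bounded by M = 1/2 on [0, 1]."""
    return abs(t - 0.5)


if __name__ == "__main__":
    eps = 0.25  # illustrative tolerance
    n = n_from_proof(eps, lipschitz=1.0, bound=0.5)
    print("n =", n, "grid error =", sup_error(kink, n), "target =", eps)
```

The $n$ produced this way is usually much larger than necessary, because Chebyshev’s Inequality is a crude bound. It guarantees the error estimate but says little about the true rate of convergence.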

References: Koralov, L.; Sinai, Y. (2007). "Probabilistic proof of the Weierstrass theorem". Theory of probability and random processes (2nd ed.). Springer. p. 29.
