Nature of Continuity for Measure & Probability Theory – Part I


There are many textbooks, posts, videos and papers about continuity. However, it is hard to find a cohesive introduction to the concept of continuity aimed at what is needed for the basics of measure and probability theory. The textbook by R. M. Dudley [2] is a thorough introduction, which we recommend to experienced readers for this purpose.

This blog post strives to provide an overview and an easy-to-read introduction to continuity for probability and measure theory by focusing on the motivation behind the definitions and statements. Many examples will illustrate the discussed objects and their logical relationships.

Understanding continuity at a point requires solid knowledge of Inner Products, Norms and Metrics (i.e. distance functions), since continuity is about ‘small’ changes of function values relative to small changes of the corresponding arguments. Distance functions, combined with limits (and the underlying topology), provide the means to make precise how ‘small’ is to be interpreted.

Required Knowledge:

We restrict our considerations to the Euclidean metric space (\mathbb{R}^d, L_2) with the standard metric L_2(v,w):= \left(\sum_{i=1}^d{|v_i-w_i|^2} \right)^{\frac{1}{2}}. Please refer to Inner Products, Norms and Metrics for further details.

Before we actually start, let us think about what we want to achieve with continuous functions. Why is this type of function so important, not only from a purely theoretical but also from a very practical perspective?

The following heuristic tries to explain that in simple terms. Afterwards, we will start to introduce it formally.

A function is continuous at x if a small change in the value f(x) can be achieved by a sufficiently small change in the argument x.

Driving a car by operating the steering wheel might serve as a heuristic example of a continuous function f:D \rightarrow R. Consider the rotation of the steering wheel as the domain D of the function f that translates this input into the change in direction (the range R) of the vehicle. You would like f to be continuous, since a small change in direction should be achievable by a small change in the rotation angle of the steering wheel.

Ultimately, continuity is all about the controlled behavior of the function values relative to the elements of the domain. Even though this might not be a mathematically precise definition of continuity, it might help to understand and remember the actual definition of continuous functions better.

Be aware that it is the behavior of the function values (the range) that we want to control, since this is, by design, what should behave in a continuous manner.

Keep the following points in your mind:

  • Continuity is all about controlled behavior of the function values;
  • Different types of continuity basically just require different types of controlled behavior. For instance, one can ask for a controlled behavior at a specific point in the domain or for the entire function;
  • Convergent (and Cauchy) sequences are closely interlinked with continuity. Refer to Limits & Topological Spaces for further details;

In the following, we consider the concept of continuity in 1-dimensional metric spaces. This will serve as the basis for the second part of this series, where we will also consider multi-dimensional metric spaces. Nonetheless, we are going to introduce the different concepts in a general way, such that it can be used in one and multi-dimensional metric spaces.

Let us start with the simplest form of continuity.

Continuity at a Point

Keep the heuristic outlined above in mind when reading the following definition.

Definition 2.1 (Continuous at a Point):
Let (X, d_X) and (Y, d_Y) be metric spaces and let f:X \rightarrow Y be a function from X to Y. The function f is said to be continuous at a point x_0 in X if for every \epsilon >0 there is a \delta >0, such that

(1)   \begin{align*} d_X(x_0, x)<\delta \Rightarrow d_Y(f(x_0), f(x))< \epsilon \end{align*}

If f is continuous at every point of a subset B of X, we say f is continuous on B. If f is continuous on its domain, we say that f is continuous.

If x_0 is a point in the domain D of the function f, where f is not continuous, we say that f is discontinuous at x_0, or that f has a discontinuity at x_0.


Definition 2.1 reflects the idea that points close to x_0 are mapped by f to points sufficiently close to f(x_0). That is, f behaves in a controlled manner. Keeping this heuristic in mind it is also clear why the formulation “for all \epsilon>0” makes sense — if the function f should behave in a controlled manner the corresponding range of the function needs to behave controlled in relation to its arguments. Hence, for every \epsilon>0 a corresponding \delta needs to exist as outlined in the definition.

We can also use balls to provide an equivalent formulation.

A function f: D \rightarrow Y is continuous at x_0 if and only if, for every \epsilon >0, there is a \delta >0 such that f(B(x_0, \delta) \cap D) \subseteq B(f(x_0), \epsilon).


We can even go further and formulate continuity in terms of neighborhoods:

A function f with domain D\subseteq X is continuous at x_0\in D if and only if, for every neighborhood V of f(x_0), there is a neighborhood U of x_0 such that f(U \cap D) \subseteq V.

We can use one of the equivalent formulations of continuity to double-check whether a function is continuous at a point.

Check for Continuity at a point:

One has to conduct several steps to double-check whether a function f is continuous at a given point x_0 \in X:

  1. Set y_0 :=f(x_0);
  2. Consider an arbitrary ball B(y_0, \epsilon) \subseteq Y around the image point if you want to prove that f is continuous at x_0. If you think that f is not continuous, try to find a suitable ball to contradict the definition in the next step;
  3. Check whether there is a corresponding ball B(x_0, \delta) in the domain such that f(x) \in B(y_0, \epsilon) for every x \in B(x_0, \delta) \cap D. If for some fixed \epsilon>0 no such ball exists, then f is not continuous at x_0.
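The three steps can be sketched numerically. The following is a heuristic search over a few candidate radii, not a proof; the helper find_delta, the sample offsets and the test function x \mapsto x^2 at x_0=1 are our own choices for illustration:

```python
# Heuristic check of continuity at a point: given f, x0 and eps,
# search for a delta such that |x - x0| < delta implies |f(x) - f(x0)| < eps.
def find_delta(f, x0, eps, candidates=(1.0, 0.5, 0.1, 0.05, 0.01, 0.001)):
    y0 = f(x0)                        # step 1: image point y0 = f(x0)
    for delta in candidates:          # step 3: try shrinking balls B(x0, delta)
        xs = [x0 + t * delta for t in (-0.999, -0.5, -0.1, 0.1, 0.5, 0.999)]
        # step 2: do all sampled images stay inside B(y0, eps)?
        if all(abs(f(x) - y0) < eps for x in xs):
            return delta
    return None  # no candidate radius worked -- evidence of a discontinuity

print(find_delta(lambda x: x * x, x0=1.0, eps=0.1))             # a delta is found
print(find_delta(lambda x: 1 if x >= 0 else 0, 0.0, eps=0.25))  # None (Heaviside)
```

A successful search only suggests continuity (finitely many points are sampled); a failed search over all candidates is merely evidence of a discontinuity, not a proof.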

Another way of better understanding the definition is to consider what it means if a function f is NOT continuous at a point x_0\in D \subseteq X.
According to Definition 2.1 this would mean that there must be an \epsilon >0 (sometimes called the ‘loser’ \epsilon) with the property that, for each \delta>0, there is an x\in D such that

    \begin{align*} d(x_0, x) < \delta \quad \text{and} \quad d(f(x_0), f(x))\geq \epsilon. \end{align*}

Let us consider some examples to illustrate the definition of continuity at a point further.

Example 2.1 (Constant Function):

The constant function \widehat{c}:\mathbb{R} \rightarrow \mathbb{R} with x \mapsto \widehat{c}(x)=c is continuous at every point of its domain \mathbb{R}. For the following argument, let A \subseteq \mathbb{R} be a bounded subset.

Graph of constant Function along with an \epsilon tube around

Let \epsilon>0 and let B(c, \epsilon) be an arbitrary ball around the only image point c\in\mathbb{R}. We can choose \delta>0 to be an arbitrarily large positive real number, e.g. \delta:=\sup\{|x-x_0| : x, x_0 \in A\}; then the following holds true:

    \begin{align*}    |x-x_0|<\delta   \Rightarrow |f(x)-f(x_0)| = |c-c| = 0 < \epsilon\end{align*}

for any x, x_0\in A. Given that any point of the domain is always mapped to c, the distance between the corresponding image points is always zero.

If we pick a smaller \delta instead, the implication above still holds true, since the conclusion |f(x)-f(x_0)| = 0 < \epsilon is true for every pair of points anyway.

Hence, the function \widehat{c} is continuous on its domain.


Note that this type of continuity of f at a point is by definition a local property:
If x_0 is an isolated point of the domain D (i.e. a point of D which is not an accumulation point of D), then every function f defined at x_0 is continuous at x_0. The reason is quite simple: for a sufficiently small \delta there is only one x\in D satisfying d(x, x_0)<\delta, namely x=x_0, and d(f(x_0), f(x_0))=0.

The following example shows why jumps are usually not compatible with continuity.

Example 2.2 (Heaviside Function):

Let us consider the so-called Heaviside Function h:\mathbb{R} \rightarrow \mathbb{R} defined via

(2)   \begin{align*}x \mapsto h(x):= \begin{cases} 1 & x\geq 0 \\ 0 & x<0 \end{cases}.\end{align*}

Constant functions are continuous, as shown in Example 2.1. In addition, continuity is a local property, which is why we can restrict our focus to the point x_0:=0, where the graph jumps from 0 to 1.

Graph of Heaviside Function

The set B(h(x_0)=1, \epsilon:=\frac{1}{4}) = ]\frac{3}{4}, \frac{5}{4}[ is a ball around h(x_0)=h(0)=1. Note that the image set of h only contains 0 and 1.

Graph of Heaviside Function along with \epsilon– and \delta ball showing non-continuity at x=0

There cannot be a ball A:=B(x_0=0, \delta) in the domain such that h(A) \subseteq ]\frac{3}{4}, \frac{5}{4}[, since every such ball contains a negative x\in A that is mapped to 0 by h. Clearly, 0\notin B(1, \epsilon:=\frac{1}{4}) = ]\frac{3}{4}, \frac{5}{4}[, which is why the Heaviside function is not continuous at zero but continuous everywhere else.
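This argument can be sketched in a few lines of code (the candidate radii are our own choices): for \epsilon = \frac{1}{4}, every ball B(0, \delta) contains a negative point whose image 0 lies outside ]\frac{3}{4}, \frac{5}{4}[.

```python
# Heaviside function from Example 2.2.
def h(x):
    return 1 if x >= 0 else 0

eps = 0.25  # ball ]3/4, 5/4[ around h(0) = 1
# For every candidate delta, the point x = -delta/2 lies in B(0, delta),
# yet |h(x) - h(0)| = |0 - 1| = 1 >= eps: the epsilon-delta condition fails.
for delta in (1.0, 0.1, 0.01, 1e-6):
    x = -delta / 2
    assert abs(x - 0) < delta and abs(h(x) - h(0)) >= eps
print("no delta works for eps = 1/4: h is discontinuous at 0")
```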


You may want to have a look at the second part of the following video by 3Blue1Brown, where limits and the \epsilon-\delta definition are explained.

Theorem 2.1 (Limits & Continuity):
Let f: X \rightarrow Y be a function from one metric space (X, d_X) to the metric space (Y, d_Y). Then f is continuous at x_0\in X if, and only if, for every sequence (x_n)_{n\in \mathbb{N}} in X convergent to x_0, the corresponding sequence (f(x_n)) in Y converges to f(x_0), i.e.

(3)   \begin{align*}   f(\lim_{n\rightarrow \infty}{x_n}) = \lim_{n \rightarrow \infty}{f(x_n)} \end{align*}


Hence, continuity can also be interpreted as convergence-preserving.

Let f be continuous at x_0\in X and let (x_n) \subset X be a sequence converging to x_0. Let \epsilon>0. Due to the continuity, there exists a \delta>0 such that

    \begin{align*} d_X(x, x_0) < \delta \Rightarrow d_Y(f(x), f(x_0))<\epsilon. \end{align*}

Since x_n \rightarrow x_0, there is an N\in\mathbb{N} with d_X(x_n, x_0)<\delta for all n\geq N. Hence, d_Y(f(x_n), f(x_0))<\epsilon for all n\geq N, which means that f(x_n) converges to f(x_0).

Let us now assume that for every convergent sequence x_n \rightarrow x_0 the following holds:

(4)   \begin{align*}      \lim_{n \rightarrow \infty}{f(x_n)} = f(x_0). \end{align*}

Let us further assume that f is not continuous at x_0. Then there exists an \epsilon > 0 such that for every \delta>0 we have f(B(x_0, \delta)) \not\subseteq B(f(x_0), \epsilon). In particular, for \delta_n := \frac{1}{n}, n\in \mathbb{N}, we can pick an x_n \in B(x_0, \frac{1}{n}) with d_Y(f(x_n), f(x_0)) \geq \epsilon. The sequence (x_n) converges to x_0, but (f(x_n)) does not converge to f(x_0), since all of its terms stay outside the ball B(f(x_0), \epsilon). This contradicts the initial assumption (4) and proves the assertion.
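The sequential criterion is easy to probe numerically (the sample functions and the sequence x_n = -1/n are our own choices): a continuous function maps the convergent sequence to a convergent sequence, while the Heaviside function of Example 2.2 does not.

```python
# Sequential criterion: f continuous at x0  <=>  x_n -> x0 implies f(x_n) -> f(x0).
def f(x):                      # continuous at 0
    return x * x

def h(x):                      # Heaviside function: discontinuous at 0
    return 1 if x >= 0 else 0

xs = [-1.0 / n for n in range(1, 10001)]        # x_n -> 0 from the left

# f preserves the limit: f(x_n) approaches f(0) = 0 ...
assert abs(f(xs[-1]) - f(0)) < 1e-6
# ... while h does not: h(x_n) = 0 for every n, although h(0) = 1.
assert all(h(x) == 0 for x in xs) and h(0) == 1
```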


At last, we will also look into global properties of continuous functions.

Theorem 2.2 (Continuity & Open Sets):
Let f: X \rightarrow Y be a function from one metric space (X, d_X) to the metric space (Y, d_Y). Then f is continuous on X if, and only if, f^{-1}(V) is open in X for every open set V in Y.


Under a continuous function, preimages of open sets are therefore open. This should not be a big surprise, since limits can also be expressed using open sets (e.g. balls).

Suppose f is continuous on X and V is an open set in Y. We have to show that every point of f^{-1}(V) is an interior point of f^{-1}(V). So suppose x_0\in f^{-1}(V), i.e. f(x_0)\in V. Since V is open, there exists an \epsilon >0 such that y\in V whenever d_Y(f(x_0), y)<\epsilon. Applying the continuity of f at x_0, we know that there exists a \delta>0 such that d_Y(f(x), f(x_0))<\epsilon whenever d_X(x, x_0)<\delta. Hence, x\in f^{-1}(V) as soon as d_X(x, x_0)<\delta, i.e. B(x_0, \delta) \subseteq f^{-1}(V), so x_0 is an interior point.

Conversely, suppose that f^{-1}(V) is open in X for every open set V in Y. Fix x_0\in X and \epsilon>0, and let V be the set of all y\in Y such that d_Y(y, f(x_0))<\epsilon. Since V is open, f^{-1}(V) is also open, and it contains x_0. Hence, there exists a \delta>0 such that B(x_0, \delta) \subseteq f^{-1}(V), i.e. d_Y(f(x), f(x_0))<\epsilon whenever d_X(x, x_0)<\delta.

This completes the proof.
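Theorem 2.2 also certifies discontinuity: under the Heaviside function of Example 2.2, the preimage of the open interval ]\frac{3}{4}, \frac{5}{4}[ is [0, \infty), which is not open. A small membership-test sketch (the sample points are our own choices):

```python
# h^{-1}( ]3/4, 5/4[ ) = [0, infinity): 0 belongs to the preimage, but no point
# to the left of 0 does, so 0 is not an interior point of the preimage.
# The preimage of an open set is not open, hence h is not continuous (Theorem 2.2).
def h(x):
    return 1 if x >= 0 else 0

def in_preimage(x):
    return 0.75 < h(x) < 1.25

assert in_preimage(0)                    # 0 lies in the preimage ...
assert not any(in_preimage(-10.0 ** -k)  # ... but no point just left of 0 does
               for k in range(1, 12))
```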


One-Sided Continuity

The following video introduces continuity at a point on the real line by using limits (from the left and right). Hence, it might serve as a nice warm-up for this section.

Continuity at a point by Khan Academy

In this section, let us restrict ourselves to the metric space (\mathbb{R}, |\cdot|).

Definition 3.1 (Left- & Right-Continuous)
Let f:D\rightarrow \mathbb{R} and x_0\in D. Let D_+ := D \cap [x_0, \infty). If f is continuous at x_0 as a function on D_+, we say it is right-continuous at x_0.

Let f:D\rightarrow \mathbb{R} and x_0\in D. Let D_- := D \cap (-\infty, x_0]. If f is continuous at x_0 as a function on D_-, we say it is left-continuous at x_0.


We can characterize both notions via one-sided limits: left- and right-continuity at x_0 are equivalent to

    \begin{align*} \forall (x_n) \subseteq D, \ x_n \leq x_0 & \text{ with } x_n \nearrow x_0 \\ & \Rightarrow \lim_{n \rightarrow \infty}{f(x_n)} = f(x_0) \end{align*}

and

    \begin{align*} \forall (x_n) \subseteq D, \ x_n \geq x_0 & \text{ with } x_n \searrow x_0 \\  & \Rightarrow \lim_{n \rightarrow \infty}{f(x_n)} = f(x_0), \end{align*}

respectively.

In a 1-dimensional vector space such as \mathbb{R}, there are only two directions from which to approach an element x_0\in \mathbb{R}.


In a 2-dimensional space, however, it is possible to approach a point from infinitely many directions, namely from any possible angle \theta\in [0,2\pi].


Continuity in multiple dimensions will be treated in the second part of this series. Hence, let us get back to the 1-dimensional metric space \mathbb{R} with |\cdot| as distance function.

Example 3.1 (Signum Function and One-Sided Continuity):

Let us consider the so-called Signum Function \text{sgn}:\mathbb{R} \rightarrow \mathbb{R} defined via

(5)   \begin{align*}x \mapsto \text{sgn}(x):= \begin{cases} 1 & x> 0 \\ 0 & x=0 \\ -1 & x<0 \end{cases}.\end{align*}

The domain of the function is the real line and the corresponding graph is shown as follows.


The \text{sgn} function does not have a limit at x=0, because if you approach 0 from the right the value is 1, while if you approach from the left it is -1. We then write \lim_{x \searrow 0}{\text{sgn}(x)}=1 and \lim_{x \nearrow 0}{\text{sgn}(x)}=-1. The actual value at x_0=0 is, however, \text{sgn}(0)=0.

Note that the convention \text{sgn}(0):=0 makes the signum function continuous from neither side at 0; a different convention (e.g. \text{sgn}(0):=1) would give us continuity from one side, but no convention can give us both.
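The two one-sided limits can be approximated along sequences (the sequences 1/n and -1/n are our own choices):

```python
# One-sided limits of sgn at 0, approximated along sequences approaching
# from the right (x_n = 1/n, decreasing to 0) and from the left (x_n = -1/n).
def sgn(x):
    return (x > 0) - (x < 0)    # returns 1, 0 or -1

right = [sgn(1.0 / n) for n in range(1, 100)]
left = [sgn(-1.0 / n) for n in range(1, 100)]

assert all(v == 1 for v in right)    # limit from the right is 1
assert all(v == -1 for v in left)    # limit from the left is -1
assert sgn(0) == 0                   # the value at 0 matches neither limit
```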


Example 3.1 illustrated that there might be different types of discontinuities. There are three kinds of discontinuities at a point x_0\in \mathbb{R}:

  1. Removable Discontinuity:
    Both one-sided limits \lim_{x \nearrow x_0}{f(x)} and \lim_{x \searrow x_0}{f(x)} exist, are finite and are equal, but either f is undefined at x_0 or \lim_{x \rightarrow x_0}{f(x)} \neq f(x_0). The discontinuity can be removed by (re-)defining f(x_0) to be this common limit.
  2. Jump or Step Discontinuity:
    Both one-sided limits exist and are finite, but they are not equal. It is not possible to re-define the function at x_0 so that it becomes continuous there.
  3. Infinite or Essential Discontinuity:
    At least one one-sided limit does not exist or is infinite.

Please also refer to the classification of singularities, which concerns the differentiability of (complex) functions. Both classifications are closely related.

Let us illustrate this classification of discontinuities by looking at specific examples.

Example 3.2 (Classification of Discontinuities):

a) Consider the function

    \begin{align*}x \mapsto f(x):=\frac{x^2+2x-3}{x-1}=\frac{(x-1)(3+x)}{(x-1)}\end{align*}

which is not defined at x_0=1 as illustrated in the next graph.


Evidently, this discontinuity is a removable one, since we can simply extend or change the definition of f such that f(1):=4.

b) Let us now re-consider the Heaviside Function of Example 2.2, which has a jump discontinuity at x_0=0. It has \lim_{x\nearrow 0}{h(x)}=0 and \lim_{x\searrow 0}{h(x)}=1 as its left- and right-sided limits. In its graph we can clearly see the jump.

c) An essential or infinite discontinuity is of a very different kind. Consider the following well-known function

    \begin{align*}     x \mapsto f(x):= \frac{1}{x}\end{align*}

and its graph


The limit from the left, \lim_{x\nearrow 0}{\frac{1}{x}}=-\infty, and the limit from the right, \lim_{x\searrow 0}{\frac{1}{x}}=+\infty, are both infinite. Hence, it is an essential or infinite discontinuity.
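The three cases can be told apart by estimating both one-sided limits. The following rough sketch (the step size, tolerance and cutoff are our own, fairly arbitrary choices) reproduces the classification for the three examples above:

```python
# Estimate both one-sided limits at x0 and classify the discontinuity:
# finite and (almost) equal -> removable, finite but different -> jump,
# otherwise -> essential/infinite.
def classify(f, x0, h=1e-7, tol=1e-3, big=1e6):
    left, right = f(x0 - h), f(x0 + h)   # crude one-sided limit estimates
    if abs(left) > big or abs(right) > big:
        return "essential"
    return "removable" if abs(left - right) < tol else "jump"

assert classify(lambda x: (x * x + 2 * x - 3) / (x - 1), 1) == "removable"
assert classify(lambda x: 1 if x >= 0 else 0, 0) == "jump"   # Heaviside
assert classify(lambda x: 1 / x, 0) == "essential"
```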


Note that monotone functions can only have countably many jump discontinuities.

The following theorem outlines the connection between continuity at a point and one-sided continuity at the same point.

Theorem 3.1 (Left- & Right-Continuous & Continuity)
The function f:D \rightarrow \mathbb{R} is continuous at x_0 if and only if it is both right- and left-continuous at x_0\in D.

‘\Rightarrow‘ If the function is continuous at x_0, then for all (x_n) \subseteq D with x_n\rightarrow x_0 it follows that f(x_n) \rightarrow f(x_0). This holds in particular for sequences with x_n \geq x_0 and for sequences with x_n \leq x_0. Hence, right- and left-continuity follow.

‘\Leftarrow‘ Assume that f is left- and right-continuous at x_0.
Let (x_n) \subseteq D be a sequence with x_n \rightarrow x_0 and suppose, for contradiction, that f(x_n) does not converge to f(x_0). Then there must be an \epsilon >0 such that for every N\in \mathbb{N} there is an n\geq N with

|f(x_n)-f(x_0)| \geq \epsilon.

In other words, there is no N such that all terms f(x_n) with n\geq N are arbitrarily close to the value f(x_0). This implies that there are infinitely many indices n with x_n\in D_+, or infinitely many with x_n\in D_-, satisfying |f(x_n)-f(x_0)| \geq \epsilon. In either case, there exists a subsequence (x_{n_k}), k\in \mathbb{N}, with all terms in one of D_+ or D_-, such that |f(x_{n_k}) - f(x_0)| \geq \epsilon for all k. Such a subsequence violates the assumption that f is left- and right-continuous at x_0. Hence, f must be continuous at x_0.

The following is an alternative and much more elegant proof of the reverse direction.

‘\Leftarrow‘ Let \epsilon>0. By the left- and right-continuity of f at x_0, there are positive numbers \delta_-, \delta_+ >0 such that d(f(x), f(x_0))<\epsilon for all x\in D \cap (x_0-\delta_-, x_0] and all x\in D \cap [x_0, x_0+\delta_+). Set \delta := \min\{\delta_-, \delta_+\}. Then d(f(x), f(x_0))< \epsilon for all x\in D \cap (x_0-\delta, x_0+\delta). Therefore, f is continuous at x_0.


Let f:D \rightarrow Y be continuous on D and let x_0\in X be a limit point of D. If D is not closed, then x_0 may not be in D, in which case f is not defined at x_0. In the following, we consider whether f(x_0) can be defined so that f becomes continuous on D \cup \{x_0\}.

If such an extension exists, then, for any sequence (x_n) in D converging to x_0, the corresponding sequence (f(x_n)) converges to y_0:=f(x_0). Thus, for a (not necessarily continuous) function f:D \rightarrow Y and a limit point x_0 of D, we define

    \begin{align*}\lim_{x \rightarrow x_0}{f(x)} = y_0\end{align*}

provided that for each convergent sequence x_n\rightarrow x_0 in D, the corresponding sequence (f(x_n)) converges to y_0 in Y.

Proposition 3.1 (Neighborhoods and Converging Functions)
The following are equivalent:
(i) \lim_{x \rightarrow x_0}{f(x)} = y_0;
(ii) For each neighborhood V of y_0 in Y, there is a neighborhood U of x_0 in X such that f(U \cap D) \subseteq V.

Proof: ‘(i) \Rightarrow (ii)’ Suppose, for contradiction, that there is a neighborhood V of y_0 in Y such that f(U \cap D) \nsubseteq V for each neighborhood U of x_0 in X. In particular, for the open balls B(x_0, \frac{1}{n}), n \in \mathbb{N}, we have

    \begin{align*} f \left(B(x_0, \frac{1}{n}) \cap D \right) \cap V^C \neq \emptyset, \end{align*}

where V^C denotes the complement of V. Hence, we can choose x_n \in B(x_0, \frac{1}{n}) \cap D with f(x_n) \in V^C. The resulting sequence lies in D and converges to x_0, but no term of (f(x_n)) is contained in V, so f(x_n) cannot converge to y_0. This contradicts (i).

‘(ii) \Rightarrow (i)’ Let x_n be a sequence in D such that x_n \rightarrow x_0 in X, and V a neighborhood of y_0 in Y. By hypothesis, there is some neighborhood U of x_0 such that f(U \cap D) \subseteq V. Since x_n converges to x_0, there is some N\in \mathbb{N} such that x_n\in U for all n\geq N. Thus, the image sequence f(x_n) is contained in V for all n \geq N. This means that f(x_n) \rightarrow y_0.


Uniform Continuity

Suppose f:D\rightarrow Y and D\subseteq X with (X, d_X) and (Y, d_Y) metric spaces. Assume that f is continuous on its domain D: for any point x_0 \in D and any \epsilon > 0, there is a corresponding \delta > 0, such that

    \begin{align*}\ d_X(x, x_0)< \delta \ \Rightarrow \ d_Y(f(x), f(x_0)) < \epsilon, \end{align*}

for any x\in D.

In general, \delta depends on the chosen \epsilon and the point x_0\in D as we can see in the following chart of the function x \mapsto x^2 for the domain [0, \infty).


Even though \delta_1=\delta_2=\delta_3= \ldots = \delta_n = 1, the corresponding \epsilon_i tubes with i\in \{1,2,3, \ldots, n\} range from \epsilon_1 = 4-1=3 to \epsilon_3=16-9=7 in the illustrative graph above. This in turn means that, for a given \epsilon, no single \delta does the job on the entire positive real line. However, if the domain of x \mapsto x^2 is bounded, we can find a \delta depending only on a given \epsilon, such that a uniform notion of continuity is ensured. For more details please refer to Example 4.1 d) and Example 4.2.
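For x \mapsto x^2 this dependence can be made explicit: for x_0 \geq 0, the worst case within the ball B(x_0, \delta) is x = x_0 + \delta, so the largest admissible \delta solves (x_0+\delta)^2 - x_0^2 = \epsilon, i.e. \delta(x_0) = \sqrt{x_0^2+\epsilon} - x_0. A small sketch of this derived formula (our own derivation, for illustration):

```python
import math

# For f(x) = x^2, eps > 0 and x0 >= 0, the largest delta with
# |x - x0| < delta  =>  |x^2 - x0^2| < eps comes from the worst case
# x = x0 + delta:  (x0 + delta)^2 - x0^2 = eps  =>  delta = sqrt(x0^2 + eps) - x0.
def largest_delta(x0, eps):
    return math.sqrt(x0 * x0 + eps) - x0

deltas = [largest_delta(x0, eps=1.0) for x0 in (0, 1, 2, 3, 10)]
# The admissible delta strictly shrinks as x0 grows: no single delta serves
# the whole half line, so x -> x^2 is not uniformly continuous on [0, infinity).
assert all(a > b for a, b in zip(deltas, deltas[1:]))
```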

Visual explanation of uniform continuity by Steve Stein

In general, we therefore cannot expect that for a fixed \epsilon the same value of \delta will serve equally well for every point x_0\in D. This can happen, however, and when it does, the function possesses the following property.

Definition 4.1 (Uniformly Continuous):
Let f:X \rightarrow Y be a function from one metric space (X, d_X) to another (Y, d_Y). Then f is said to be uniformly continuous on a subset D of X if for every \epsilon > 0 there is a \delta >0 (depending only on \epsilon), such that

    \begin{align*}  d_X(x, x_0) < \delta  \quad \Rightarrow \quad d_Y(f(x), f(x_0)) < \epsilon. \end{align*}

for all x, x_0\in D. Note that x_0 is not fixed.


Example 4.1 (Uniform Continuity):

a) The function f:D \rightarrow \mathbb{R} with D:=(0, 1]

    \begin{align*}     x \mapsto f(x):= \frac{1}{x} \end{align*}

is continuous but not uniformly continuous on its domain.
Since f(x)=\frac{1}{x} is the restriction of a rational function, it is certainly continuous.

Set \epsilon:=10 and suppose we could find a 0<\delta<1 to satisfy the definition of uniform continuity. Taking x=\delta and x_0=\frac{\delta}{11}, we obtain |x-x_0|<\delta \Leftrightarrow |\delta - \frac{\delta}{11}|<\delta and

    \begin{align*}       |f(x)- f(x_0)| = |\frac{11}{\delta} - \frac{1}{\delta}|=\frac{10}{\delta} > 10 =\epsilon. \end{align*}

Hence, for these two points we would always have |f(x)- f(x_0)|>10, contradicting the definition of uniform continuity.

b) Let us continue Example 2.1 and re-consider the class of the constant functions. A constant function on a bounded subset A\subset \mathbb{R} is uniformly continuous since we can pick one \delta that works for all x\in A as outlined in Example 2.1.

c) Let f:(\mathbb{R}, |\cdot|) \rightarrow (\mathbb{R}, |\cdot|) be defined by f(x):=x. This function is continuous and even uniformly continuous on the standard metric space (\mathbb{R}, |\cdot|). To prove this, let \epsilon>0 and set \delta:=\epsilon (depending only on \epsilon). Then

    \begin{align*}         & |x - x_0| < \delta = \epsilon  \\         \Rightarrow & |f(x) - f(x_0)| = |x - x_0| < \epsilon.\end{align*}

for all x, x_0\in \mathbb{R}. Note that x_0 is not fixed.

d) Let f:(\mathbb{R}, |\cdot|) \rightarrow (\mathbb{R}, |\cdot|) defined by f(x):=x^2. At the beginning of this section, we illustrated that the quadratic function x \mapsto x^2 is not uniformly continuous on the entire real line. Now, we formally prove it: Suppose \epsilon=1 and suppose that x \mapsto x^2 is uniformly continuous. For all \delta>0 and x_0=x+ \frac{\delta}{2}, we would find that

    \begin{align*}     |x^2 - \left(x+\frac{\delta}{2} \right)^2| < 1 \end{align*}

for any real x. However, this would imply

    \begin{align*}     |x\delta + \frac{\delta^2}{4}| < 1 \end{align*}

which is a contradiction since we can choose x large.

We will see that the same function is uniformly continuous on specific (bounded) subsets.


Uniform continuity on a set A implies continuity on A. The converse is also true if the set A is compact.

Theorem 4.1 (Heine, Continuous Functions on Compact Sets)
Suppose that f:X\rightarrow Y is a function from a metric space (X, d_X) to another (Y, d_Y). Let A \subseteq X be a compact set and assume that f is continuous on A. Then f is uniformly continuous on A.
That is, continuous functions on compact sets are uniformly continuous.

Proof: Let \epsilon>0 be given. Then each point a\in A has associated with it a ball B(a, r), with r depending on a, such that

    \begin{align*} d_Y(f(x), f(a)) &< \frac{\epsilon}{2}   \end{align*}

whenever x\in B(a, r) \cap A.

Consider the collection of balls B(a, \frac{r}{2}) for each a\in A with radius \frac{r}{2}. These open balls cover A and, since A is compact, a finite number m\in \mathbb{N} of them also cover A, say

    \begin{align*} A \subseteq \bigcup_{k=1}^{m}{B\left(a_k;\frac{r_k}{2}\right)}. \end{align*}

In any ball of twice the radius, B(a_k, r_k), we have

    \begin{align*} d(f(x), f(a_k)) &< \frac{\epsilon}{2} \end{align*}

whenever x\in B(a_k, r_k)\cap A. Let \delta be the smallest of the numbers \frac{r_1}{2}, \ldots, \frac{r_m}{2}. We show that this \delta works for the definition of uniform continuity. For this purpose, consider two points of A, say x and x_0 with d_X(x, x_0)<\delta. By the above discussion there is some ball B(a_k, \frac{r_k}{2}) containing x, so d_Y(f(x), f(a_k)) < \frac{\epsilon}{2}. By the triangle inequality we have

    \begin{align*} & d_X( x_0, a_{k} ) \\ & \leq d_X( x_0, x ) + d_X( x, a_k ) \\ & < \delta + \frac{r_k}{2} \\ & \leq \frac{r_k}{2} + \frac{r_k}{2} = r_k. \end{align*}

Hence, x_0\in B(a_k, r_k) \cap A, so we also have d_Y(f(x_0), f(a_k))<\frac{\epsilon}{2}. Using the triangle inequality once more we find

    \begin{align*} & d_Y( f(x), f(x_0) )  \\ & \leq d_Y( f(x), f(a_k) ) + d_Y( f(a_k), f(x_0) ) \\ & < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \end{align*}


In general, the function f:\mathbb{R}\rightarrow \mathbb{R} defined by x \mapsto x^2 is not uniformly continuous. However, if we restrict the function to a bounded closed subset the situation is different.

Example 4.2 (Uniform Continuity on Compact Sets):

The function f:D:=[0, 1] \rightarrow \mathbb{R}

    \begin{align*} x \mapsto f(x):= x^2 \end{align*}

is continuous and uniformly continuous on its bounded and closed domain D. To prove this, observe that

    \begin{align*} |f(x)-f(x_0)| = |x^2 - x_0^2| = |(x-x_0)(x+x_0)| \leq 2|x-x_0| \end{align*}

for x, x_0 \in D=[0,1]. Thus, if |x-x_0|<\delta, then |f(x)-f(x_0)|<2\delta. If \epsilon>0 is given we only need to take \delta:=\epsilon/2 to guarantee that |f(x)-f(x_0)|<\epsilon for every pair x, x_0 with |x-x_0|<\delta. This shows that f is uniformly continuous on D=[0,1].
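The choice \delta:=\epsilon/2 can be sanity-checked on a sample grid (grid and sample points are our own choices; this checks the estimate, it does not replace the proof):

```python
# On [0, 1] we have |x^2 - x0^2| = |x - x0| * |x + x0| <= 2 |x - x0|,
# so delta = eps / 2 works uniformly.  Check the bound on a sample grid.
eps = 0.1
delta = eps / 2
grid = [i / 1000 for i in range(1001)]            # sample points of [0, 1]
for x0 in (0.0, 0.25, 0.5, 0.75, 1.0):
    for x in grid:
        if abs(x - x0) < delta:
            assert abs(x * x - x0 * x0) < eps     # the uniform bound holds
print("delta = eps / 2 verified on the sample grid")
```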


Lipschitz & Hölder Continuity

The next type of continuity is named after the German mathematician Rudolf Lipschitz.

Definition 5.1 (Lipschitz Continuity):
Let f:X \rightarrow Y be a function from one metric space (X, d_X) to another (Y, d_Y). Then f is said to be Lipschitz continuous on D\subseteq X if there exists a positive real number L (not depending on x or x_0) such that

(6)   \begin{align*} d_Y(f(x), f(x_0)) < L \cdot d_X(x, x_0) \end{align*}

whenever x, x_0\in D and x\neq x_0.


Why does the constant L need to be positive? The distance on the left-hand side of (6) is non-negative, so a negative L would make the inequality impossible whenever x\neq x_0. For a similar reason, L cannot be zero: the strict inequality would then force the left-hand side to be negative, which is impossible.

The so-called Lipschitz condition (6) is also important in other areas of Analysis (e.g. differential equations) and Measure Theory (e.g. geometric measure theory or nonlinear expectations).

A Lipschitz continuous function f on D is (uniformly) continuous: Let f be Lipschitz continuous, which means that there is an L> 0 such that |f(x)-f(x_0)| < L |x-x_0| for all x, x_0\in D with x\neq x_0. Let \epsilon >0 and set \delta:= \frac{\epsilon}{L}. Then for any x, x_0\in D with 0<|x-x_0|<\delta = \frac{\epsilon}{L} it follows that |f(x)-f(x_0)| < L |x-x_0| < L \frac{\epsilon}{L} = \epsilon (for x=x_0 there is nothing to show). Note that \delta does not depend on x_0.


Let us consider two very simple examples.

Example 5.1 (Lipschitz Continuity):

a) The identity f: \mathbb{R} \rightarrow \mathbb{R} defined by x \mapsto x is Lipschitz continuous on \mathbb{R}. To prove this, observe that

    \begin{align*} |f(x)-f(x_0)| = |x-x_0|< L \cdot |x-x_0| \end{align*}

which holds true for any constant L>1.

b) The absolute value function f(x):= |x| is Lipschitz continuous on \mathbb{R}. To prove this, observe that

    \begin{align*} |f(x)-f(x_0)| = ||x|-|x_0|| \leq |x-x_0| \end{align*}

due to the reverse triangle inequality. Hence, we can choose L=1 as Lipschitz constant. Note, however, that the absolute value function is not differentiable at x_0=0.


Lipschitz continuity in (\mathbb{R}, |\cdot|) means that

    \begin{align*} \left|\frac{f(x)-f(x_0)}{x-x_0}\right| \leq L, \end{align*}

i.e. the absolute slope of any secant line is bounded by L.
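This secant-slope bound is easy to probe for the functions of Example 5.1 (random sample pairs, our own choice):

```python
import random

# Lipschitz continuity bounds every secant slope |f(x) - f(x0)| / |x - x0|.
# Probe the bound L = 1 for the absolute value function on random pairs.
random.seed(0)

def max_secant_slope(f, pairs):
    return max(abs(f(x) - f(x0)) / abs(x - x0) for x, x0 in pairs)

pairs = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
pairs = [(x, x0) for x, x0 in pairs if x != x0]   # avoid division by zero

assert max_secant_slope(abs, pairs) <= 1.0 + 1e-12          # |.|: L = 1 suffices
assert max_secant_slope(lambda x: x, pairs) <= 1.0 + 1e-12  # identity: L = 1
```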


A generalization of Lipschitz continuity is named after another German mathematician, Otto Hölder.

Definition 5.2 (Hölder Continuity):
Let f:X \rightarrow Y be a function from one metric space (X, d_X) to another (Y, d_Y). Then f is said to satisfy a Hölder condition of order \alpha on D \subseteq X if there exist a positive real number L (not depending on x or x_0) and an exponent \alpha>0 such that

    \begin{align*}d_Y(f(x), f(x_0)) < L \cdot d_X(x, x_0)^\alpha\end{align*}

whenever x, x_0\in D and x\neq x_0.


Lipschitz continuous functions are Hölder continuous with exponent \alpha=1, so Hölder continuity is indeed a generalization of Lipschitz continuity. The converse fails: Hölder continuous functions are in general not Lipschitz continuous, as Example 5.2 shows.

A Hölder continuous function f:D\rightarrow \mathbb{R} with constant L and exponent \alpha >0 is (uniformly) continuous: let \epsilon > 0 and set \delta:= (\frac{\epsilon}{L})^{\frac{1}{\alpha}} (dependent only on \epsilon); then for all x, x_0\in D with 0<|x-x_0|<\delta we get

    \begin{align*}  |f(x)- f(x_0)| & < L \cdot |x-x_0|^\alpha \\                 & < L \delta^\alpha = \epsilon. \end{align*}

Note that \delta does not depend on x_0.


Example 5.2 (Hölder Continuity):

The root function f: [0,1] \rightarrow \mathbb{R} defined by x \mapsto \sqrt{x} is Hölder continuous with exponent \alpha=\frac{1}{2} but not Lipschitz continuous on its domain. Let x, x_0\in [0,1] and without loss of generality x \leq x_0; then

    \begin{align*} \sqrt{x}\sqrt{x_0} \geq x  & \Rightarrow |x-x_0|=x_0-x \geq x_0-2\sqrt{x}\sqrt{x_0}+x = (\sqrt{x_0}-\sqrt{x})^2\\                            & \Rightarrow |\sqrt{x}-\sqrt{x_0}| \leq |x-x_0|^{\frac{1}{2}}. \end{align*}

The second claim, that the root function is not Lipschitz continuous, can be seen as follows: Assume there is a Lipschitz constant L>0 such that |\sqrt{x}-\sqrt{x_0}| \leq L |x-x_0| for all x, x_0\in [0,1]. Choosing x_0=0, we get \sqrt{x} \leq Lx and thus 1 \leq L\sqrt{x} for all 0<x\leq 1, which fails for x < \frac{1}{L^2}. This contradicts the existence of L.
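Both claims are visible numerically (the points approaching 0 are our own choices): the secant slopes |√x − √0|/|x − 0| blow up near 0, while the Hölder quotient with exponent 1/2 stays bounded by 1.

```python
import math

# Compare, at pairs (0, x) with x -> 0, the secant slope of the square root
# with its Hoelder quotient of exponent 1/2:
#   slope(x)    = |sqrt(x) - sqrt(0)| / |x - 0|       = 1 / sqrt(x)  (unbounded)
#   quotient(x) = |sqrt(x) - sqrt(0)| / |x - 0|**0.5  = 1            (bounded)
xs = [10.0 ** -k for k in range(1, 9)]

slopes = [math.sqrt(x) / x for x in xs]
quotients = [math.sqrt(x) / x ** 0.5 for x in xs]

assert slopes[-1] > 1e3                          # no Lipschitz constant suffices
assert all(q <= 1.0 + 1e-12 for q in quotients)  # Hoelder exponent 1/2 works
```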



[1] Rudin, W. (1976). Principles of Mathematical Analysis. 3rd ed. New York: McGraw-Hill (International Series in Pure and Applied Mathematics).

[2] Dudley, R.M. (2002). Real Analysis and Probability. Cambridge; New York: Cambridge University Press (Cambridge Studies in Advanced Mathematics, 74).

[3] Apostol, T.M. (1974). Mathematical Analysis. 2nd ed. Reading, Mass.: Addison-Wesley (Addison-Wesley Series in Mathematics).