
Introduction

The quantification and even the definition of ‘risk’ are hard problems. Questions like the following are therefore, in general, hard to answer:

What is the risk of investing in the Euro Stoxx 50?
What is the risk that there will be a war between Taiwan and China?
What is the default risk of Apple Inc.?
…

In this post, we recap the general properties of a risk measure as derived in the seminal paper “Coherent Measures of Risk” [1] by Philippe Artzner, Freddy Delbaen, Jean-Marc Eber and David Heath. We will focus on outlining the heuristic behind the concept of coherent risk measures by connecting it to practice and providing illustrative examples. To this end, we follow mainly the structure of the outstanding book [2] by Föllmer & Schied. In addition, we will also extend the basic idea of coherent to convex risk measures as suggested by Föllmer & Schied [2].

The focus is on the general properties of measures of risk, their applications and their connections to other areas of math (finance). Concrete risk measures such as the Worst-Case Risk Measure, Value at Risk or Expected Shortfall will serve as examples but are not the main focus of this post.

Note that, if we define a concrete risk measure such as the Value at Risk, then the term “risk” is implicitly defined by the conducted calculation.

Monetary Risk Measures

The main objective of a risk measure $\rho$ is to quantify the risk of any financial position $X$ . By applying a risk measure, we are then able to ‘compare’ the riskiness of different financial positions. The precise meaning of a financial position is left unspecified but may include assets, liabilities, and any other kind of financial instrument such as derivatives.

Let $\Omega$ be a fixed set of scenarios that are needed for the valuation of a financial position $X$ . The set $\Omega$ is also called the set of states of nature or states of the world. Refer also to the post about the notation of decision problems.

One scenario $\omega\in \Omega$ might be represented by the cash flows of the financial position rolled out on a timeline that stretches from inception $t=0$ until the end $T$ of the risk horizon. These cash flows depend on the specific scenario $\omega\in \Omega$ and can also be related to indicators such as the interest rate environment, specific macro-economic developments or foreign exchange rates.

A financial position is defined by a mapping $X:\Omega \rightarrow \mathbb{R}$ , where $X(\omega)$ is the discounted net worth of the position at the end of the risk horizon if the scenario $\omega\in \Omega$ is realized. Positive values of $X$ denote profits while negative values denote losses accrued over $T$ .

That is, $X$ assigns to every possible scenario $\omega \in \Omega$ the corresponding discounted net worth $X(\omega)$ of the financial position, and $\rho(X)$ condenses this whole profile into a single risk figure. Both the net worth $X(\omega)$ under a given scenario $\omega\in \Omega$ and the associated risk $\rho(X)$ also depend on the reporting date.

Example 2.1 (Present Value of a Coupon Bond)
Let us assume we have invested mn€ 100 in one bond with a 2% coupon and a maturity of 5 years. That is, the financial position $X$ comprises only the single bond and we receive the annual interest payment of mn€ 2 for five consecutive years. If the counterparty does not default, we also receive the notional at maturity.

The scenario $\omega$ is represented by the development of the issuer-specific interest rate levels as reflected in column “Rate” of Tab. 1. Note that this issuer-specific interest rate for that particular bond comprises all types of risk, including the general interest rate risk, the credit spread risk as well as the liquidity situation of the corresponding market.

Due to the fact that the reference interest rates for all terms are below the issuer-specific 2% coupon, the bond is valued above 100.
The present value of $101.1\% \cdot 100 \text{ mn€} = 101.1 \text{ mn€}$ is also implicitly given in Tab. 1 by the bond price denoted in %. Since we are ultimately interested in the net present value (i.e. the P&L), we need to deduct the initial price of $100\%$ that we paid for the bond. Hence, $X(\omega)= (101.1\% - 100\%) \cdot 100 \text{ mn€} = 1.1 \text{ mn€}$ .

**Tab. 1**: Exemplary calculation of present value of a single bond position in column PV denoted in %

The mapping $\omega \mapsto X(\omega) \in\mathbb{R}$ connects the scenarios of the basic set $\Omega$ with their related net present values. It does not really matter what exactly $\Omega$ looks like. It can comprise the risk factors that are needed for the valuation or some macroeconomic scenarios for which we know how they will impact the net worth of the position.

$\square$
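The valuation in Example 2.1 can be sketched in a few lines of Python. Since Tab. 1 is not reproduced here, the zero rates below are hypothetical placeholders, so the result will not match the 101.1% of the example. Annual compounding and the function names `bond_price_pct` and `net_worth_mn` are assumptions made for illustration only.

```python
def bond_price_pct(coupon_pct, zero_rates):
    """Price in % of notional of an annual-coupon bullet bond.

    zero_rates[k] is the (hypothetical) annually compounded rate used to
    discount the cash flow paid at the end of year k + 1.
    """
    maturity = len(zero_rates)
    price = 0.0
    for year, rate in enumerate(zero_rates, start=1):
        # Coupon every year, plus the notional of 100% at maturity.
        cash_flow = coupon_pct + (100.0 if year == maturity else 0.0)
        price += cash_flow / (1.0 + rate) ** year
    return price


def net_worth_mn(price_pct, paid_pct=100.0, notional_mn=100.0):
    """Discounted P&L X(omega) in mn EUR for one scenario omega."""
    return (price_pct - paid_pct) / 100.0 * notional_mn


# One scenario omega: hypothetical issuer-specific rates, all below the 2% coupon.
scenario_rates = [0.015, 0.016, 0.017, 0.018, 0.018]
price = bond_price_pct(2.0, scenario_rates)
x_omega = net_worth_mn(price)
print(f"price = {price:.2f}%, X(omega) = {x_omega:.2f} mn EUR")
```

Each choice of `scenario_rates` plays the role of one scenario $\omega$ , and the returned number is the corresponding value $X(\omega)$ . If all rates equal the coupon, the bond prices at par and the net worth is zero.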

Recall that a functional is a mapping from a (vector) space into a field of scalars. Hereby, the field of scalars is the real line with the usual addition and multiplication, denoted by $(\mathbb{R}, +, \cdot)$ .

The collection of all financial positions, denoted by $\mathcal{X}$ , therefore forms a vector space of functions equipped with the usual pointwise operations of a function space. The main task of this function space is to provide the environment for the valuation of all financial positions as of a given reporting date.

If $n=|\Omega|\in \mathbb{N}$ , the set of all financial positions $\mathcal{X}$ can also be identified with $\mathbb{R}^n$ . In this case, every financial position is determined by its $n$ function values, which means that we can define a bijective mapping between the financial positions and the vectors in $\mathbb{R}^n$ .

A financial position $X\in \mathcal{X}$ also needs to be bounded, and all constant functions need to be contained in $\mathcal{X}$ . The latter requirement reflects the existence of financial positions that are (assumed to be) not subject to fluctuations and can therefore be considered as risk-free. The boundedness of any financial position is also reasonable since we live in a finite world and thus every valuation of a real-world asset also needs to be finite. Refer to the famous St. Petersburg paradox in this context.

Our objective is to quantify the risk of any financial position $X\in \mathcal{X}$ by a risk measure $\rho:\mathcal{X} \rightarrow \mathbb{R}$ defined by $X \mapsto \rho(X)\in \mathbb{R}$ . This means that we need to ‘compare’ the valuations of the financial positions with each other using an order relation and think about the implications for risk.

Definition 2.1 (Dominance & Order of Functions)
Let $\mathcal{X}$ be a function space and $X, Y\in \mathcal{X}$ . We say that $X$ is dominated by $Y$ , or $Y$ dominates $X$ , if $X(\omega)\leq Y(\omega)$ for all scenarios $\omega\in \Omega$ .

$\square$

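The binary relation ‘ $\leq$ ’ as outlined in Definition 2.1 on the function space $\mathcal{X}$ is a partial order. It is, in general, not a total order: as soon as $\Omega$ contains at least two scenarios, there are positions $X$ and $Y$ where $X$ is better in one scenario and $Y$ is better in another, so that neither dominates the other. Note that the comparison actually takes place in $\mathbb{R}$ , scenario by scenario, employing the canonical or usual order on the real line.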
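For a finite scenario set, positions are plain vectors and Definition 2.1 becomes a component-wise comparison. The following minimal sketch (the function name `dominated_by` and all numbers are made up for illustration) also shows that two positions need not be comparable at all.

```python
def dominated_by(x, y):
    """True if X(omega) <= Y(omega) in every scenario, i.e. Y dominates X."""
    if len(x) != len(y):
        raise ValueError("positions must live on the same scenario set")
    return all(a <= b for a, b in zip(x, y))


# Hypothetical net worths in three scenarios.
x = [-2.0, 0.5, 1.0]
y = [-1.0, 0.5, 3.0]   # at least as good as x in every scenario
z = [-3.0, 2.0, 1.0]   # worse than x in scenario 1, better in scenario 2

print(dominated_by(x, y))                      # y dominates x
print(dominated_by(x, z), dominated_by(z, x))  # neither dominates the other
```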

Definition 2.2 (Monetary Measure)
A mapping $\rho: \mathcal{X} \rightarrow \mathbb{R}$ is called a monetary measure of risk if it satisfies the following conditions for all $X, Y\in \mathcal{X}$ .
(i) Monotonicity: If $X \leq Y$ , then $\rho(X) \geq \rho(Y)$ ;
(ii) Cash or Translation Invariance: If $c \in \mathbb{R}$ , then $\rho(X+c)=\rho(X)-c$ .

$\square$

Note that no probability measure has been used in this definition. We are simply working on a function space with certain properties. That is, $X+c$ means that we add an arbitrary function and a constant function together using the vector addition as defined on the function space.

By convention, risk is expressed as a positive figure while chance is expressed as a negative figure.

The economic meaning of the properties (i) and (ii) is as follows:

(i) Monotonicity:
The risk $\rho(Y)$ of the position $Y$ is not greater than the risk $\rho(X)$ if the payoff profile $Y$ is at least as large as $X$ in every scenario;
(ii) Cash or Translation Invariance:
Cash is considered to be risk-free since banks deposit it at central banks at the end of each business day. Adding a cash amount $c$ to a position therefore reduces its risk by exactly $c$ . In addition, $\rho(X)$ can be interpreted as some sort of capital requirement (imposed by a regulator). Common Equity Tier 1 Capital needs to be in some cash-like form.

Cash invariance implies

$\rho(X+\rho(X)) = \rho(X) - \rho(X) = 0$

since $\rho(X)\in \mathbb{R}$ as part of the argument of the function can be considered as a constant. That is, sufficient cash/capital is added to neutralize the risk $\rho(X)$ completely. We can also conclude that

$\rho(c)= \rho(0+c) = \rho(0) -c$

for all $c\in \mathbb{R}$ . That is, a cash amount $c\in [0,\infty)$ reduces the risk by the same amount, as we have just seen above. If $\rho(0)=0$ , this simplifies to $\rho(c)=-c$ .

Once the capital $\rho(X)$ has been added to the position $X$ , we also say that the risk has become acceptable (to supervisory authorities, for example).

For most purposes it wouldn’t be a loss of generality to assume that a given monetary risk measure satisfies the condition of

Normalization: $\rho(0) = 0$ .

Since the argument of the monetary measure $\rho$ represents the discounted net worth, the zero vector $0$ describes a position that yields neither a profit nor a loss, no matter which scenario is realized. Normalization states that such a position carries no risk.
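The properties above can be checked numerically on a finite scenario set. The sketch below uses minus the worst net worth as one simple candidate for $\rho$ (this is the worst-case risk measure that is defined formally later in this post). The helper names and the numbers are hypothetical.

```python
def rho_worst_case(x):
    """Illustrative monetary measure: minus the worst net worth."""
    return -min(x)


def add_cash(x, c):
    """Position X + c: the same cash amount c in every scenario."""
    return [v + c for v in x]


x = [-4.0, 1.0, 2.5]   # hypothetical net worths per scenario
y = [-3.0, 1.0, 4.0]   # dominates x scenario by scenario
c = 1.5

# (i) Monotonicity: x <= y implies rho(x) >= rho(y).
assert rho_worst_case(x) >= rho_worst_case(y)
# (ii) Cash invariance: rho(x + c) = rho(x) - c.
assert rho_worst_case(add_cash(x, c)) == rho_worst_case(x) - c
# Adding the capital rho(x) neutralizes the risk completely.
assert rho_worst_case(add_cash(x, rho_worst_case(x))) == 0
# Normalization: the zero position carries no risk.
assert rho_worst_case([0.0, 0.0, 0.0]) == 0

print(rho_worst_case(x), rho_worst_case(y))  # 4.0 3.0
```

Such a check on a handful of positions is of course no proof, but it makes the sign conventions tangible: a loss of 4 in the worst scenario becomes a positive risk figure of 4.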

Convex & Coherent Risk Measures

Why should a risk measure be convex and what does it actually mean?

The convexity property of a monetary measure of risk restricts the risk values of sub-portfolios with respect to the corresponding overarching portfolio. Before we come to convex risk measures, let us recall what convex sets and functions are.

Definition 3.1 (Convex Set & Convex Function)
A set $S \subseteq \mathbb{R}^n$ is convex if for all $x,y\in S$ and $\lambda\in [0,1]$ , we have

$\begin{align*} \lambda x + (1-\lambda)y \in S. \end{align*}$

A point of the form $\lambda x + (1-\lambda)y$ , $\lambda\in [0,1]$ , is called a convex combination of $x$ and $y$ .

A function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is convex if its domain is a convex set and for all $x,y\in \mathbb{R}^n$ , and $\lambda\in [0,1]$ , we have

$\begin{align*} f(\lambda x + (1-\lambda)y) \leq \lambda f(x) + (1-\lambda) f(y). \end{align*}$

$\square$

If we take any two points $x, y$ , then $f$ evaluated at any convex combination of these two points should be no larger than the same convex combination of $f(x)$ and $f(y)$ . Geometrically, the line segment (i.e. the chord) connecting $(x, f (x))$ and $(y, f (y))$ must lie on or above the graph of $f$ . The convex combinations of the two points $(x, f (x))$ and $(y, f (y))$ make up exactly this chord: the combinations with $\lambda \in \{0,1\}$ form its end-points $(x, f (x))$ and $(y, f (y))$ , while the remaining convex combinations generate the points in between.

Example 3.1 (Quadratic Function)
The function $x \mapsto f(x)=x^2$ defined on the domain $[-5,5]$ is convex, which can directly be seen geometrically.

In the sketched graph of $f$ , we can see several example chords that connect two points on the graph. All of these chords sit above the graph of $f$ as required by the interpretation of the convexity property.

In order to illustrate the convexity property, let us fix the two points $(-3, 9)$ and $(2, 4)$ on the graph and consider the chord between them. Convexity now tells us that $f(\frac{1}{3} \cdot (-3) + \frac{2}{3} \cdot 2)=f(\frac{1}{3})=\frac{1}{9}$ (green point) is less than or equal to $\frac{1}{3} \cdot f(-3)+\frac{2}{3} \cdot f(2)=\frac{17}{3}$ (blue point) at $x=\frac{1}{3} \cdot (-3) + \frac{2}{3} \cdot 2=\frac{1}{3}$ .

The convex combination of the function values therefore sits above the graph on the chord while the convex combination of the arguments will be mapped on the graph of $f$ .
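The arithmetic of this example can be reproduced exactly with rational numbers. The following sketch uses Python's `fractions` module; the variable names are chosen for illustration only.

```python
from fractions import Fraction


def f(x):
    return x * x


x, y = Fraction(-3), Fraction(2)
lam = Fraction(1, 3)

point = lam * x + (1 - lam) * y           # convex combination of the arguments
on_graph = f(point)                       # value on the graph of f
on_chord = lam * f(x) + (1 - lam) * f(y)  # value on the chord

print(point, on_graph, on_chord)  # 1/3 1/9 17/3
assert on_graph <= on_chord

# The same inequality holds for every weight on a grid of lambdas.
for k in range(11):
    w = Fraction(k, 10)
    assert f(w * x + (1 - w) * y) <= w * f(x) + (1 - w) * f(y)
```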

Let us now prove algebraically that the function $f$ is convex on $[-5,5]$ by applying the definitions of $f$ and of convexity:

$\begin{align*} & f (\lambda x + (1-\lambda)y) = (\lambda x + (1-\lambda)y)^2 \\ &= \lambda^2 x^2+ (1-\lambda)^2 y^2 + 2\lambda x \cdot (1-\lambda)y \end{align*}$

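For all $x, y$ we have $(x-y)^2\geq 0$ and thus $x^2+y^2-2xy\geq 0$ . By adding $2xy$ , we get the inequality $2xy \leq x^2+y^2$ . Since $\lambda(1-\lambda)\geq 0$ for $\lambda\in[0,1]$ , this can be applied to the mixed term:

$\begin{align*} &f(\lambda x + (1-\lambda)y) \\ &= \lambda^2 x^2+ (1-\lambda)^2 y^2 + 2\lambda x (1-\lambda)y \\ &= \lambda^2 x^2+ (1-\lambda)^2 y^2 + \lambda (1-\lambda) \cdot 2xy \\ &\leq \lambda^2 x^2+ (1-\lambda)^2 y^2 + \lambda (1-\lambda) (x^2+y^2) \\ &= \lambda x^2 \left(\lambda + (1-\lambda)\right) + (1-\lambda) y^2 \left((1-\lambda) + \lambda\right) \\ &= \lambda f(x) + (1-\lambda) f(y). \end{align*}$

The inequality is strict whenever $x\neq y$ and $\lambda\in(0,1)$ .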

$\square$

Let us now apply convex functions to monetary risk measures.

Definition 3.1 (Convex Risk Measure)
A monetary risk measure $\rho: \mathcal{X} \rightarrow \mathbb{R}$ is called a convex measure of risk if the following holds true for all $X, Y\in \mathcal{X}$ and all real $0\leq \lambda \leq 1$ .
(iii) Convexity: $\rho(\lambda X +(1-\lambda)Y) \ \leq \ \lambda \rho(X)+(1-\lambda)\rho(Y)$ .

$\square$

The entire financial position is represented by $Z:=\lambda X+ (1-\lambda) Y$ . Let us also consider the isolated sub-portfolios $\lambda X$ and $(1-\lambda) Y$ , where the fractions $\lambda$ and $(1-\lambda)$ of the available resources have been invested into $X$ and $Y$ , respectively.

The convexity property (iii) states that the risk $\rho(Z)$ of a portfolio is not greater than the weighted sum of the risks of its constituents. That is, diversification in a given portfolio $Z$ does not increase the risk: $\rho(Z)$ is not greater than $\lambda \rho(X)+(1-\lambda)\rho(Y)$ .

In general, this assumption makes sense since diversification should not increase risk. Just think about the idiom “Do not put all your eggs in one basket”.

Before we dive further into convex risk measures, let us recall the concept of linear cones. We will see that there is a close connection between coherent risk measures and convex cones.

Definition 3.2 (Linear or Convex Cone)
Let $\mathcal{X}$ be a real vector space. A non-empty subset $C\subseteq \mathcal{X}$ is called a cone if it is closed under multiplication by non-negative scalars, i.e. if $\lambda C \subseteq C$ for each scalar $\lambda\geq 0$ . A cone that is also a convex set is called a convex cone.

$\square$

Let us have a look at simple examples.

Example 3.3 (Cones)
(a) The non-negative number tuples in the quarter plane $C:=\{(x,y) | \ x, y\geq 0\} \subsetneq \mathbb{R}^2$ form a cone. Every point $(a,b)\in C$ can be scaled with an arbitrary $\lambda\geq 0$ , and all scaled points remain in the quarter plane $C$ .

For the sake of simplicity, the highlighted area is finite but the actual area of $C$ , of course, is infinite.

(b) Any wedge which extends to infinity from the origin is a cone in $\mathbb{R}^2$ .

Note that the sketch is simplified since the highlighted area is finite even though the wedge $C$ is not bounded as indicated by the dashed line.

(c) Given any finite number of vectors $v_1, \ldots, v_n \in V$ in a real vector space $V$ , the set of all conical combinations

$\begin{align*} C:=\left\{ \sum_{i=1}^{n}{ \lambda_i v_i } \ \middle| \ \lambda_i \geq 0 \right\} \end{align*}$

forms a convex set and a cone.

(d) In any function space, the set $C:=\{f| f\geq 0\}$ is a cone since the function $\lambda f \geq 0$ for all $\lambda \geq 0$ . So, $\lambda f\in C$ . Let us consider the scaling effect with scalars $\lambda \in \{0.5,2,3,4\}$ on the positive function $x \mapsto x^2$ that is sketched in dark blue.

$\square$
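Parts (a) and (c) of the example can be made concrete with a small sketch. The helper names `in_quarter_plane` and `conical_combination` are hypothetical and the points are chosen arbitrarily.

```python
def in_quarter_plane(point):
    """Membership test for C = {(x, y) : x >= 0 and y >= 0}."""
    x, y = point
    return x >= 0 and y >= 0


def conical_combination(vectors, weights):
    """Sum of weights[i] * vectors[i] with non-negative weights."""
    if any(w < 0 for w in weights):
        raise ValueError("conical combinations need non-negative weights")
    dim = len(vectors[0])
    return tuple(sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim))


p = (1.0, 2.5)
scaled = [tuple(lam * coord for coord in p) for lam in (0.0, 0.5, 2.0, 10.0)]
print(all(in_quarter_plane(q) for q in scaled))   # scaling never leaves C

# The generators (1, 0) and (0, 1) reach every point of the quarter plane.
print(conical_combination([(1.0, 0.0), (0.0, 1.0)], [1.0, 2.5]))

# A point outside C stays outside under positive scaling.
print(in_quarter_plane((-1.0, 2.0)), in_quarter_plane((-3.0, 6.0)))
```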

For more details and further definitions and theorems about linear cones please refer to [5].

The analogue for functions of being closed under multiplication by non-negative scalars is called positive homogeneity.

Definition 3.3 (Coherent Risk Measure)
A convex risk measure $\rho: \mathcal{X} \rightarrow \mathbb{R}$ is called a coherent measure of risk if it satisfies
(iv) Positive Homogeneity:
$\rho(\lambda X) = \lambda \rho(X)$ for all $\lambda\in [0, \infty)$ .

$\square$

If a monetary measure of risk $\rho$ is positively homogeneous, then it is normalized since $\rho(0)=\rho(0\cdot X)=0\cdot \rho(X)=0$ .

In addition, property (iv) tells us that the risk grows by the same proportion $\lambda\in [0,\infty)$ if we scale our portfolio by $\lambda$ . Loosely speaking, by doubling a portfolio the corresponding risk will also be doubled.

The assumption (iv) might be questionable since risk might not increase linearly when we scale up the portfolio. A possible cause for a more than proportional increase in risk is a decreasing degree of liquidity. Please note that Artzner et al. [1] were aware of this model limitation of coherent risk measures:

Of course, this assumes that markets at date $T$ are liquid; if they are not, more complicated models are required, where we can distinguish the risks of a position and of a future net worth, since, with illiquid markets, the mapping from the former to the latter may not be linear.
— Section 2.2 in [1]

Under the assumption of positive homogeneity, convexity is equivalent to

(v) Subadditivity: $\rho(X +Y) \leq \rho(X)+\rho(Y)$ .

To see this, let us consider a convex risk measure $\rho:\mathcal{X} \rightarrow \mathbb{R}$ that is also positive homogeneous. By setting $\lambda:=\frac{1}{2}$ , we can derive

$\begin{align*} \rho(X+Y) &= 2 \cdot \rho\left(\frac{1}{2}X+\frac{1}{2}Y \right)\\ &\leq 2\cdot \left(\frac{1}{2}\rho(X)+\frac{1}{2}\rho(Y) \right)\\ &= \rho(X)+\rho(Y), \end{align*}$

which shows the subadditivity simply by applying the homogeneity property (first line) and the convexity property (second line). The choice of $\lambda=0.5$ gives a portfolio with an equal share of the financial positions $X$ and $Y$ , which is then scaled up by the factor 2 to the required size of 1 unit for both financial positions. Conversely, subadditivity and positive homogeneity imply convexity since $\rho(\lambda X+(1-\lambda)Y) \leq \rho(\lambda X)+\rho((1-\lambda)Y) = \lambda\rho(X)+(1-\lambda)\rho(Y)$ for $\lambda\in[0,1]$ .

$\square$

Subadditivity reflects the idea that risk cannot be increased by diversification and that risk can be scaled up proportionally with the size of a portfolio. If separate risk limits are assigned to separate ‘(trading) desks’, then the risk of the aggregate position is bounded by the sum of the individual risk limits.
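The properties (iii), (iv) and (v) can be spot-checked on random positions. The sketch below again uses minus the worst net worth as the risk measure (the worst-case measure defined formally later in this post). The tolerance, the random ranges and the two "desks" at the end are assumptions made for illustration.

```python
import random


def rho_worst_case(x):
    return -min(x)


def combine(a, x, b, y):
    """Position a*X + b*Y, scenario by scenario."""
    return [a * u + b * v for u, v in zip(x, y)]


rng = random.Random(0)
tol = 1e-9
for _ in range(1000):
    x = [rng.uniform(-10, 10) for _ in range(5)]
    y = [rng.uniform(-10, 10) for _ in range(5)]
    lam = rng.random()
    scale = rng.uniform(0, 5)

    # (iii) Convexity
    assert rho_worst_case(combine(lam, x, 1 - lam, y)) <= (
        lam * rho_worst_case(x) + (1 - lam) * rho_worst_case(y) + tol
    )
    # (iv) Positive homogeneity
    assert abs(rho_worst_case([scale * v for v in x]) - scale * rho_worst_case(x)) <= tol
    # (v) Subadditivity
    assert rho_worst_case(combine(1, x, 1, y)) <= rho_worst_case(x) + rho_worst_case(y) + tol

# Two desks whose worst cases occur in different scenarios.
desk_a = [-5.0, 2.0]
desk_b = [3.0, -4.0]
aggregate = combine(1, desk_a, 1, desk_b)
print(rho_worst_case(desk_a), rho_worst_case(desk_b), rho_worst_case(aggregate))  # 5.0 4.0 2.0
```

In the desk example, the aggregate risk of 2 stays well below the sum 5 + 4 of the individual risk figures, which is the situation described for separate risk limits.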

Acceptance Sets & Risk Measures

Let us now put ourselves into the shoes of a supervisory authority. Regulators such as EBA, ECB, PRA, etc. want to prevent or mitigate systemic risks to the financial system as a whole.

Regulated financial institutions usually need to determine and reserve capital in a particular process to ensure the soundness of the regulated financial system. In Europe, this process is called the Internal Capital Adequacy Assessment Process, or ICAAP for short. The main purpose of this process is to calculate the capital requirement of the aggregated risk positions for a financial institution. If this capital is reserved for that particular purpose, the financial position is considered to be acceptable from a regulator’s and/or an investor’s point of view.

Definition 4.1 (Acceptance Set)
A financial position $X$ with respect to a monetary measure $\rho$ is said to be acceptable if $\rho(X)\leq 0$ and not acceptable otherwise. A monetary measure $\rho$ induces the class

$\begin{align*} \mathcal{A}_{\rho} := \{ X \in \mathcal{X} | \rho(X) \leq 0 \} \end{align*}$

of positions, which are acceptable in the sense that they do not require any additional capital. The class $\mathcal{A}_\rho$ will be called the acceptance set of $\rho$ .

$\square$

That is, an acceptable financial position needs to comprise the capital requirements in some form of cash. Let us consider a simple example.

Example 4.1 (Acceptable and Unacceptable Position)
Assume that a financial institution holds a risky position $X$ and $\rho$ is a corresponding coherent risk measure. The position $X$ entails an unacceptable risk of $c:=\rho(X)>0$ . If we add the cash amount $c$ as a capital reserve to the financial position $X$ , then we obtain an acceptable position $Y=(X+c)\in \mathcal{A}_\rho$ since $\rho(X + c)=\rho(X + \rho(X))=\rho(X)-\rho(X)=0$ .

$\square$

The following two propositions summarize the relationship between monetary measures of risk and their acceptance sets.

Proposition 4.1 (Monetary Measure & Acceptance Sets)
Suppose that $\rho$ is a monetary measure of risk with acceptance set $\mathcal{A}_\rho$ .

(a) $\mathcal{A}_\rho$ is non-empty, and satisfies the following two conditions:

(1) $\begin{align*} \inf\{c\in \mathbb{R} | c\in \mathcal{A}_\rho\} &> -\infty \end{align*}$

(2) $\begin{align*} X \in \mathcal{A}_\rho, Y\in \mathcal{X} \text{ and } Y &\geq X \Rightarrow Y\in \mathcal{A}_\rho \end{align*}$

(3) For $X \in \mathcal{A}_\rho$ and $Y\in \mathcal{X}$ , the set

$\begin{align*} \{ \lambda\in [0,1] \ | \ \lambda X+(1-\lambda)Y \in \mathcal{A}_\rho \} \end{align*}$

is closed in $[0,1]$ .

(b) The monetary measure of risk $\rho$ can be recovered from $\mathcal{A}:=\mathcal{A}_\rho$ via

(4) $\begin{align*} \rho_\mathcal{A}(X) := \inf\{\ c \in \mathbb{R} \ | \ c+X \in \mathcal{A} \ \}. \end{align*}$

Proof. (a) We show that $\rho_\mathcal{A}$ takes only finite values. To this end, fix some $Y$ in the non-empty set $\mathcal{A}$ . For a given $X\in \mathcal{X}$ , there exists a finite number $c$ with $c+X>Y$ , because $X$ and $Y$ are both bounded. Then

$\begin{align*} \rho_\mathcal{A}(X) - c = \rho_\mathcal{A}(c+X) \leq \rho_\mathcal{A}(Y) \leq 0 \end{align*}$

and hence $\rho_\mathcal{A}(X) \leq c < \infty$ . Note that (1) is equivalent to $\rho_\mathcal{A}(0)>-\infty$ . To show that $\rho_\mathcal{A}(X)> -\infty$ for arbitrary $X\in \mathcal{X}$ , we take $c'$ such that $c'+X \leq 0$ and conclude by monotonicity and cash invariance that $\rho_\mathcal{A}(X) \geq \rho_\mathcal{A}(0)+c' > -\infty$ .

For property (2), recall that the risk value of an acceptable position $X$ fulfills the condition $\rho(X)\leq 0$ . If $Y\in \mathcal{X}$ and $X \leq Y$ then $\rho(Y)\leq \rho(X) \leq 0$ due to monotonicity. Hence, $Y$ also needs to be acceptable.
For property (3), the function $\lambda \mapsto \rho(\lambda X+(1-\lambda)Y)$ for $\lambda\in [0,1]$ is continuous by Lemma 4.1 such that

$\begin{align*} \{ \lambda \in [0,1] : \rho(\lambda X+(1-\lambda) Y)\leq 0\} \end{align*}$

is closed as the preimage of the closed set $(-\infty, 0]$ . Refer to Theorem 5.1 and 5.5 for further details.

(b) Applying the cash translation invariance, we can derive the following chain of equations.

$\begin{align*} \rho_\mathcal{A}(X) &= \inf\{c\in \mathbb{R} | c + X \in \mathcal{A}\} \\ & = \inf\{c\in \mathbb{R} | \rho(c + X)\leq 0\} \\ & = \inf\{c\in \mathbb{R} | \rho(X)\leq c\} \\ & = \rho(X). \end{align*}$

If $\rho(X)=c^* \in \mathbb{R}$ then $c^*$ has to be equal to $\inf \{c\in \mathbb{R} | \rho(X)\leq c\}$ since any smaller value would not yield an acceptable position and any bigger value would not be the infimum.

$\square$

Implication (2) in Proposition 4.1 means the following: if a financial position $X$ is acceptable and we consider another financial position $Y\geq X$ , whose net worth is greater than or equal to that of $X$ in every scenario, then $Y$ also needs to be acceptable. After all, $Y \geq X$ means that an investor is at least as well off with $Y$ as with $X$ no matter which scenario is realized, such that the risk can only be lower.

As mentioned in (b) of the last proposition, we can take a given class $\mathcal{A} \subseteq \mathcal{X}$ of acceptable positions and define the risk as the minimal amount $c\in \mathbb{R}$ for which the position $(X+c)$ becomes acceptable.

Definition 4.2 (Capital Requirement)
The minimal amount $\rho_\mathcal{A}(X)$ of cash, as defined in (4), that needs to be added to the position $X$ to make it acceptable is called capital requirement and the map $\rho_{\mathcal{A}}:\mathcal{X} \rightarrow \mathbb{R}$ with $X \mapsto \rho_{\mathcal{A}}(X)$ is called capital requirement measure.

$\square$
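Formula (4) suggests a direct way to compute a capital requirement when only a membership test for the acceptance set is available. The following is a minimal sketch under stated assumptions: the scenario set is finite, the search bracket `[lo, hi]` is wide enough, and the acceptance set satisfies property (2), so that bisection is valid. The function names are hypothetical.

```python
def capital_requirement(x, is_acceptable, lo=-1e6, hi=1e6, tol=1e-9):
    """Approximate inf{c : X + c is acceptable} by bisection.

    Relies on property (2): once X + c is acceptable, so is X + c' for c' >= c.
    The bracket [lo, hi] is an assumption; it must contain the true value.
    """
    if is_acceptable([v + lo for v in x]) or not is_acceptable([v + hi for v in x]):
        raise ValueError("bracket [lo, hi] does not contain the capital requirement")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_acceptable([v + mid for v in x]):
            hi = mid   # mid is enough capital, try less
        else:
            lo = mid   # mid is not enough capital
    return hi


def non_negative(x):
    """Acceptance set of positions without a loss in any scenario."""
    return min(x) >= 0


x = [-3.0, 0.5, 2.0]   # hypothetical net worths per scenario
print(round(capital_requirement(x, non_negative), 6))  # 3.0
```

With the acceptance set of non-negative positions, the capital requirement equals minus the worst net worth, here 3. Other acceptance sets can be plugged in through the `is_acceptable` argument.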

The following proposition will show that the capital requirement measure is a monetary measure of risk.

Proposition 4.2 (Convex Risk Measure)
Suppose that $\mathcal{A}$ is a non-empty subset of $\mathcal{X}$ which satisfies (1) and (2). Then the functional $\rho_\mathcal{A}$ has the following properties:

(a) $\rho_\mathcal{A}$ is a monetary measure of risk.

(b) If $\mathcal{A}$ is a convex set, then $\rho_\mathcal{A}$ is a convex measure of risk.

(c) If $\mathcal{A}$ is a cone, then $\rho_\mathcal{A}$ is positively homogeneous. In particular, $\rho_\mathcal{A}$ is a coherent measure of risk if $\mathcal{A}$ is a convex cone.

(d) The monetary measure of risk $\rho$ is convex if and only if $\mathcal{A}_\rho$ is convex.

(e) The monetary measure of risk $\rho$ is positively homogeneous if and only if $\mathcal{A}_\rho$ is a cone. In particular, $\rho$ is coherent if and only if $\mathcal{A}_\rho$ is a convex cone.

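Proof. (a) First of all, consider that (1) as well as (2) are assumed to be valid. Cash invariance follows directly from definition (4) by substituting $c'=c+m$ for $m\in \mathbb{R}$ :

$\begin{align*} \rho_\mathcal{A}(X+m) &= \inf\{c\in \mathbb{R} \ | \ c + m + X \in \mathcal{A}\} \\ & = \inf\{c'-m \ | \ c'\in \mathbb{R}, \ c' + X \in \mathcal{A}\} \\ & = \rho_\mathcal{A}(X)-m. \end{align*}$

The monotonicity follows from (2): if $X\leq Y$ and $c+X\in \mathcal{A}$ , then $c+Y \geq c+X$ is acceptable as well, and hence $\rho_\mathcal{A}(Y)\leq \rho_\mathcal{A}(X)$ . Finally, $\rho_\mathcal{A}$ takes only finite values; this is the argument given in the proof of part (a) of Proposition 4.1.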

(b) Suppose that $X, Y\in \mathcal{X}$ and that $c,d\in \mathbb{R}$ are such that $c+X, d+Y\in \mathcal{A}$ . If $\lambda\in [0,1]$ , then the convexity of $\mathcal{A}$ implies that $\lambda(c+X)+(1-\lambda)(d+Y)\in \mathcal{A}$ . Hence, by the cash invariance of $\rho_\mathcal{A}$ ,

$\begin{align*} 0 &\geq \rho_\mathcal{A}(\ \lambda(c+X)+(1-\lambda)(d+Y) \ ) \\ &= \rho_\mathcal{A}(\lambda X + (1-\lambda) Y) - (\lambda c + (1-\lambda) d) \end{align*}$

and the convexity of $\rho_\mathcal{A}$ follows by taking the infimum over all such $c$ and $d$ .

(c) As in (b), we obtain that $\rho_\mathcal{A}(\lambda X) \leq \lambda \rho_\mathcal{A}(X)$ for $\lambda\geq 0$ if $\mathcal{A}$ is a cone. To prove the converse inequality, let $c< \rho_\mathcal{A}(X)$ . Then $c+X\notin \mathcal{A}$ and hence $\lambda c +\lambda X\notin \mathcal{A}$ for $\lambda> 0$ . Thus, $\lambda c \leq \rho_\mathcal{A}(\lambda X)$ , and (c) follows by letting $c$ increase to $\rho_\mathcal{A}(X)$ .

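(d) If $\rho$ is convex, then $\mathcal{A}_\rho$ is clearly convex. The converse follows from (b) since $\rho=\rho_{\mathcal{A}_\rho}$ by part (b) of Proposition 4.1. In addition, a class $\mathcal{A}$ that satisfies the closure property (3) coincides with the acceptance set $\mathcal{A}_{\rho_\mathcal{A}}$ of its capital requirement measure: the inclusion $\mathcal{A} \subseteq \mathcal{A}_{\rho_\mathcal{A}}$ is obvious. For the converse inclusion, we have to show that $X\notin \mathcal{A}$ implies that $\rho_\mathcal{A}(X)>0$ . To this end, take $c> ||X||=\sup_\omega|X(\omega)|$ . By (3), there exists an $\epsilon \in (0,1)$ such that $\epsilon c + (1-\epsilon)X \notin \mathcal{A}$ . Thus,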

$\begin{align*} 0 \leq \rho_\mathcal{A}( \epsilon c + (1-\epsilon) X ) = \rho_\mathcal{A}((1-\epsilon) X ) - \epsilon c. \end{align*}$

Since $\rho_\mathcal{A}$ is a monetary measure of risk, Lemma 4.1 shows that

$\begin{align*} |\rho_\mathcal{A}( (1-\epsilon) X ) - \rho_\mathcal{A}(X)| \leq \epsilon ||X||. \end{align*}$

Hence,

$\begin{align*} \rho_\mathcal{A}(X) &\geq \rho_\mathcal{A}((1-\epsilon)X) - \epsilon ||X|| \\ &\geq \epsilon (c-||X||) >0. \end{align*}$

(e) Clearly, positive homogeneity of $\rho$ implies that $\mathcal{A}_\rho$ is a cone. The converse follows from (c) since $\rho=\rho_{\mathcal{A}_\rho}$ by part (b) of Proposition 4.1.

$\square$

Let us now study whether a monetary risk measure can be considered continuous.

Lemma 4.1 (Lipschitz Continuity of Monetary Measure)
Any monetary measure of risk $\rho$ is Lipschitz continuous with respect to the supremum norm $||X||_{\infty}:=\sup\{|X(\omega)|\ : \ \omega \in \Omega\}$ .
That is,

$\begin{align*} |\rho(X)-\rho(Y)| &\leq || X - Y ||_{\infty} \\ &= \sup\{|(X-Y)(\omega)|\ : \ \omega \in \Omega\} \end{align*}$

Proof. Since $X-Y \leq ||X-Y||_{\infty}$ holds in every scenario, we can deduce the following by applying the monetary measure $\rho$ .

$\begin{align*} X-Y &\leq ||X-Y||_{\infty} \\ \Rightarrow X &\leq Y + ||X - Y||_{\infty} \\ \Rightarrow \rho(X) &\geq \rho(Y + || X - Y ||_{\infty}) \\ \Rightarrow \rho(X) &\geq \rho(Y) - || X - Y ||_{\infty} \\ \Rightarrow \rho(Y)-\rho(X) &\leq || X - Y ||_{\infty} \end{align*}$

Note that the switch from $\leq$ to $\geq$ is caused by applying the monetary measure and its monotonicity property. Afterwards the cash invariance property is applied and the inequality is rearranged.

Repeating the same argument on $-(X-Y) \leq ||X-Y||_{\infty}$ yields $\rho(X)-\rho(Y)\leq ||X-Y||_{\infty}$ and thus the assertion.

$\square$
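The Lipschitz bound of Lemma 4.1 can be spot-checked for two simple monetary measures on a finite scenario set: minus the worst net worth and minus the average net worth. Both candidates, the equal weights in the average and the random test positions are assumptions made for this sketch.

```python
import random


def rho_worst_case(x):
    return -min(x)


def rho_mean(x):
    """Minus the equally weighted average net worth."""
    return -sum(x) / len(x)


def sup_norm_distance(x, y):
    return max(abs(u - v) for u, v in zip(x, y))


rng = random.Random(1)
largest_ratio = {"worst case": 0.0, "mean": 0.0}
for _ in range(10_000):
    x = [rng.uniform(-10, 10) for _ in range(6)]
    y = [rng.uniform(-10, 10) for _ in range(6)]
    dist = sup_norm_distance(x, y)
    if dist == 0:
        continue
    for name, rho in (("worst case", rho_worst_case), ("mean", rho_mean)):
        gap = abs(rho(x) - rho(y))
        assert gap <= dist + 1e-9   # Lipschitz bound with constant 1
        largest_ratio[name] = max(largest_ratio[name], gap / dist)

print(largest_ratio)
```

The recorded ratios never exceed 1, in line with the Lipschitz constant 1 of the lemma.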

Lemma 4.1 implies the existence of a unique extension of $\rho$ on $\mathcal{X}$ . Therefore, we can define the expectation operator $\mathbb{E}_Q$ with respect to a finitely additive measure $Q$ of total mass 1. We are going to study this connection in the upcoming Part II of this series in detail.

Notice that a coherent risk measure corresponds to the upper expectations.

Popular Risk Measures

Let us start with a very easy example and then introduce the undoubtedly most famous monetary measure of risk – the value at risk measure.

Worst-Case Risk Measure

The following risk measure just takes the worst case for a financial position.

Definition 5.1 (Worst-Case Risk Measure)
The worst-case risk measure $\rho_{\text{max}}$ is defined by

$\begin{align*} \rho_{\text{max}}(X) := -\inf_{\omega \in \Omega}{X(\omega)} \quad \forall X\in \mathcal{X}. \end{align*}$

$\square$

The value $\rho_{\text{max}}(X)$ is the least upper bound for the potential loss that can occur in any scenario. That is, the infimum is applied since we need to ensure that the worst case of all losses across all scenarios $\omega\in \Omega$ is taken. Finally, the result is multiplied by $-1$ since we want to turn the losses, which are denoted by negative numbers, into positive risk figures (to match the convention).

The corresponding acceptance set equals

$\begin{align*} \mathcal{A} &= \{ X\in \mathcal{X} | \rho_{\text{max}}(X) \leq 0 \} \\ &= \{ X\in \mathcal{X} | -\inf_{\omega \in \Omega}{X(\omega)} \leq 0 \}\\ &= \{ X\in \mathcal{X} | \inf_{\omega \in \Omega}{X(\omega)} \geq 0 \}. \end{align*}$

Given that $\mathcal{X}$ is a function (vector) space, the acceptance set $\mathcal{A}$ is given by a convex cone of all non-negative functions – refer to (d) of Example 3.3.

According to (e) of Proposition 4.2, $\rho_{\text{max}}$ is a coherent measure of risk. Let us check the properties anyway:

The function $\rho_{\text{max}}$ is monotone: if $Y$ dominates $X$ , then $Y(\omega) \geq X(\omega)$ for all $\omega\in \Omega$ and we can conclude that

$\begin{align*} \rho_{\text{max}}(Y) \leq \rho_{\text{max}}(X). \end{align*}$

The cash translation invariance follows also from the rules of the infimum:

$\begin{align*} \rho_{\text{max}}(X+c) &= -\inf_{\omega\in \Omega}{(X(\omega)+c)} \\ &= -(\inf_{\omega\in \Omega}{X(\omega)}+\inf_{\omega\in \Omega}{c}) \\ &= -(\inf_{\omega\in \Omega}{X(\omega)}+c) \\ &= -\inf_{\omega\in \Omega}{X(\omega)}-c \\ &= \rho_{\text{max}}(X)-c. \end{align*}$

Hence, it is a monetary measure of risk. The risk measure $\rho_{\text{max}}$ is also positively homogeneous according to the properties of the infimum on the real line, and it is subadditive since $\inf_{\omega}(X(\omega)+Y(\omega)) \geq \inf_{\omega}X(\omega)+\inf_{\omega}Y(\omega)$ .

It is the most conservative measure of risk in the sense that any normalized monetary risk measure $\rho$ on $\mathcal{X}$ satisfies, by monotonicity and cash invariance,

$\begin{align*} \rho(X) \leq \rho(\inf_{\omega\in \Omega}{ X(\omega)}) = \rho_{\text{max}}(X) \end{align*}$

for all $X\in \mathcal{X}$ .

Note that $\rho_{\text{max}}$ can be represented in the form

$\begin{align*} \rho_{\text{max}}(X) = \sup_{Q\in \mathcal{Q}}{E_Q[-X]}, \end{align*}$

where $\mathcal{Q}$ is the class of all probability measures on $(\Omega, \mathcal{F})$ .
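On a finite scenario set, a probability measure $Q$ is just a vector of non-negative weights that sum to one, and the representation above can be explored numerically. In this sketch the helper names, the random sampling of $Q$ and the example position are assumptions made for illustration.

```python
import random


def rho_max(x):
    return -min(x)


def expected_loss(x, q):
    """E_Q[-X] for a probability vector q on a finite scenario set."""
    return -sum(p * v for p, v in zip(q, x))


def random_probability_vector(n, rng):
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(3)
x = [-4.0, 1.0, 2.5, -0.5]   # hypothetical net worths per scenario

# No probability measure Q produces more than the worst-case risk ...
for _ in range(1000):
    q = random_probability_vector(len(x), rng)
    assert expected_loss(x, q) <= rho_max(x) + 1e-12

# ... and the point mass on the worst scenario attains it.
worst = x.index(min(x))
dirac = [1.0 if i == worst else 0.0 for i in range(len(x))]
print(expected_loss(x, dirac), rho_max(x))  # 4.0 4.0
```

Every fixed $Q$ yields a normalized monetary measure $X \mapsto E_Q[-X]$ , so the loop also illustrates why $\rho_{\text{max}}$ is the most conservative choice.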

Value at Risk

Up to now, we have only worked on function spaces, not assuming any probability measure. Value at risk (VaR) captures the concept of the maximum loss that a financial position may experience over the risk horizon $T>0$ up to an assigned level of confidence. We therefore assume that a probability measure $P$ on $(\Omega, \mathcal{F})$ is given and that our financial positions $X\in \mathcal{X}$ are random variables with cumulative distribution function $F_X$ . Positive values of $X$ denote profits while negative values denote losses accrued over $T$ . $\Omega$ is the basic set and $\mathcal{F}$ a $\sigma$ -algebra on $\Omega$ with respect to which all $X\in \mathcal{X}$ are measurable.

In plain English, the “value at risk” of a random variable $X$ on $(\Omega, \mathcal{F}, P)$ is a real number $c\in \mathbb{R}$ such that, after adding $c$ to the position, the probability of a loss (i.e. $X+c<0$ ) is bounded by a given level $\lambda\in (0,1)$ . Usually the confidence level $1-\lambda$ is set to one of the values in $\{0.95, 0.99, 0.999\}$ , i.e. $\lambda \in \{0.05, 0.01, 0.001\}$ . That is,

$\begin{align*} P[X + c < 0] \leq \lambda. \end{align*}$

Let us define the corresponding monetary risk measure.

Definition 5.2 (Value at Risk)
The value at risk (measure) $\rho_{\text{VaR}_{\lambda}}$ at level $\lambda\in (0,1)$ is defined by

$\begin{align*} \rho_{\text{VaR}_{\lambda}}(X)&:=\inf\{c\in \mathbb{R} \ | \ P[X+c<0] \leq \lambda \}\\ &= F^{-1}_{-X}(1-\lambda), \end{align*}$

where $F^{-1}_{-X}$ denotes the (lower) quantile function of the loss $-X$ . That is, the value at risk is the $(1-\lambda)$ -quantile of the loss distribution under $P$ .

$\square$

Given that the VaR is a loss measure, the quantile needs to be chosen accordingly. On the one hand, if losses are denoted by negative figures, then we need to restrict events such as $X+ c < 0$ by requiring $P[X+c<0] \leq \lambda$ . This is exactly the situation that we have here according to the definitions of section 2.

On the other hand, if losses are denoted by positive figures, events like $X+c>0$ need to be restricted by requiring $P[X+c>0] \leq \lambda$ .
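
For a position with finitely many outcomes, the infimum in Definition 5.2 can be computed directly by walking through the sorted outcomes. The following Python sketch uses the convention of this post (losses are negative); the distribution `pnl` and the function names are hypothetical, and exact fractions are used only to avoid rounding effects.

```python
from fractions import Fraction as F

def loss_probability(outcomes, c):
    """P[X + c < 0] for a discrete X given as (value, probability) pairs."""
    return sum(prob for value, prob in outcomes if value + c < 0)

def value_at_risk(outcomes, lam):
    """inf{c : P[X + c < 0] <= lam} for a discrete X and lam in (0, 1)."""
    cumulative = 0
    for value, prob in sorted(outcomes):
        cumulative += prob
        # First outcome at which the cumulative probability exceeds lam:
        # any c below -value would leave more than lam of loss probability.
        if cumulative > lam:
            return -value
    raise ValueError("probabilities must sum to 1 and lam must lie in (0, 1)")

# Hypothetical profit-and-loss distribution (assumption): losses are negative.
pnl = [
    (F(-10), F(2, 100)),
    (F(-2), F(8, 100)),
    (F(1), F(50, 100)),
    (F(3), F(40, 100)),
]

var_95 = value_at_risk(pnl, F(5, 100))  # lam = 0.05
var_99 = value_at_risk(pnl, F(1, 100))  # lam = 0.01
print(var_95, var_99)
```

For this distribution the sketch yields a value at risk of 2 at $\lambda=0.05$ and of 10 at $\lambda=0.01$ ; `loss_probability` can be used to confirm that the condition holds at these amounts and fails for any smaller amount.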

Value at risk is the smallest amount of capital $c\in \mathbb{R}$ which, if added to $X$ as a risk-free investment, keeps the probability of a negative outcome at or below the level $\lambda$ . Note that $c$ may be negative, in which case capital can be withdrawn from the position. Let us illustrate that using a simple example.

If, for instance, cash of $c=2$ is added to a profit-and-loss profile, the entire profile is shifted to the right by $c$ . The corresponding risk distribution, however, is shifted by $-c$ to the left as sketched in the following graph.

By definition or convention, risk is reflected by positive and chance by negative figures.

Example 5.1 (VaR of Risk-Free Investment)

Let us calculate the value at risk of a risk-free investment $X:=d>0$ . We assume that holding an amount of cash $d$ is risk-free, for instance. If we insert the deterministic pay-off profile into the definition of the value at risk, we obtain the following:

$\begin{align*} &\rho_{\text{VaR}}(X=d)\\ &=\inf\{c\in \mathbb{R} \ | \ P[d+c<0] \leq \lambda \} \end{align*}$

We need to figure out when the condition $P[d+c<0] \leq \lambda$ holds true. Since the random variable is deterministic, the event $\{d+c<0\}$ has probability either $0$ or $1$ . If we set $c$ to a value smaller than $-d$ , i.e. $c < -d$ , then $d+c < 0$ holds true for all $\omega \in \Omega$ , such that we have $P[d+c<0]=1$ . Thus, $1=P[d+c<0] \leq \lambda$ with $\lambda \in (0,1)$ cannot be true.

If $c \geq -d$ , on the other hand, then $P[d+c<0]=0 \leq \lambda$ , so the condition is fulfilled. Taking the infimum $\inf{[-d, \infty) }$ results in $\rho_{\text{VaR}}(X=d)=-d$ .

$\square$
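
The case distinction of Example 5.1 can be replayed by scanning candidate amounts $c$ and keeping those that satisfy the condition. This is a minimal Python sketch under assumed values for $d$ and $\lambda$ ; the integer grid is a simplification that works here because $-d$ lies on the grid.

```python
def loss_probability_constant(d, c):
    """P[d + c < 0] for a deterministic pay-off d: either 0 or 1."""
    return 1.0 if d + c < 0 else 0.0

d = 5        # hypothetical amount of cash (assumption)
lam = 0.05   # hypothetical level (assumption)

# Scan candidate capital amounts c on an integer grid from -10 to 10.
grid = range(-10, 11)
feasible = [c for c in grid if loss_probability_constant(d, c) <= lam]

# The smallest feasible amount is the value at risk of the cash position.
var_of_cash = min(feasible)
print(var_of_cash)
```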

If $Y$ dominates $X$ , i.e. $Y(\omega) \geq X(\omega)$ for all $\omega \in \Omega$ , then the financial position $Y$ provides at least as much return as $X$ , no matter which scenario occurs. Hence, for a fixed $\lambda\in (0,1)$ and an arbitrary $c\in \mathbb{R}$ , we conclude the monotonicity of the value at risk:

$\begin{align*} &\qquad X(\omega) \leq Y(\omega) \quad \forall \omega\in \Omega \\ &\Rightarrow \{\omega | X(\omega) < -c\} \supseteq \{\omega | Y(\omega) < -c\} \\ &\Rightarrow P[X+c<0] \geq P[Y+c<0] \\ &\Rightarrow \{c | P[X+c < 0 ] \leq \lambda \} \subseteq \\ &\qquad \{c | P[Y+c < 0] \leq \lambda \} \\ &\Rightarrow \rho_{\text{VaR}}(X) \geq \rho_{\text{VaR}}(Y). \end{align*}$

Since the profit and loss profile of $X$ is never better than that of $Y$ , we need to add at least as much cash to cover potential losses of $X$ as for $Y$ . Every amount $c$ that suffices for $X$ also suffices for $Y$ . Hence, the set $\{c | P[X+c < 0 ] \leq \lambda \}$ is a subset of $\{c | P[Y+c < 0 ] \leq \lambda \}$ , and the infimum over the smaller set is at least as large, which proves the monotonicity.

To prove the cash translation invariance, we add a deterministic amount of cash $d\in \mathbb{R}$ to $X$ and substitute $c':=c+d$ :

$\begin{align*} &\rho_{\text{VaR}}(X+d) \\ &=\inf\{c \in \mathbb{R} \ | \ P[X+d+c<0] \leq \lambda \}\\ &=\inf\{c'-d \in \mathbb{R} \ | \ P[X+c'<0] \leq \lambda \}\\ &=\inf\{c' \in \mathbb{R} \ | \ P[X+c'<0] \leq \lambda \} - d \\ &=\rho_{\text{VaR}}(X)-d. \end{align*}$

Thereby, we only used the rules of the infimum. The result is consistent with Example 5.1, where we have shown that $\rho_{\text{VaR}}(d)=-d$ for an amount of cash $d>0$ .
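
Monotonicity and cash translation invariance can also be spot-checked on a small scenario set. The following Python sketch assumes four hypothetical scenarios with probabilities `probs` and two positions `x` and `y`, where `y` dominates `x` scenario by scenario; all numbers are illustrative.

```python
from fractions import Fraction as F

def value_at_risk(outcomes, lam):
    """inf{c : P[X + c < 0] <= lam} for (value, probability) pairs."""
    cumulative = 0
    for value, prob in sorted(outcomes):
        cumulative += prob
        if cumulative > lam:
            return -value
    raise ValueError("probabilities must sum to 1 and lam must lie in (0, 1)")

# Hypothetical scenario probabilities and pay-offs (assumptions).
probs = [F(3, 100), F(7, 100), F(40, 100), F(50, 100)]
x = [F(-8), F(-1), F(2), F(4)]
y = [F(-6), F(0), F(2), F(5)]   # y(omega) >= x(omega) in every scenario
lam = F(5, 100)
d = F(3)                        # deterministic amount of cash

var_x = value_at_risk(list(zip(x, probs)), lam)
var_y = value_at_risk(list(zip(y, probs)), lam)

# Adding cash d shifts every pay-off of x by d.
var_x_plus_d = value_at_risk([(value + d, p) for value, p in zip(x, probs)], lam)

print(var_x, var_y, var_x_plus_d)
```

In this example the dominated position `x` requires more capital than `y` , and adding `d` lowers the value at risk of `x` by exactly `d` .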

Hence, the following proposition follows.

Proposition 5.1 (VaR is a Monetary Risk Measure)
Cash translation invariance, monotonicity, and positive homogeneity hold for a VaR measure. That is, VaR is a monetary measure of risk.

$\square$

The risk measure $\rho_{\text{VaR}}$ is, however, not subadditive as the following example shows.

Example 5.2 (VaR not subadditive)
Consider an investment into two defaultable zero bonds $X_1$ and $X_2$ , each with return $\tilde{r_i}>r$ , $i\in \{1,2\}$ , where $r\geq 0$ is the return of a ‘riskless’ investment. We furthermore assume that the two bonds default independently of each other and that both bonds have the same default probability $\text{PD}\in (0,1)$ . Refer to Binomial- and Poisson-Mixture Models for further details.

The discounted net gain of an investment $N>0$ in one of the two bonds is given by

$\begin{align*} X_i := \begin{cases} -N & \text{ in case of default } \\ \frac{N (\tilde{r}-r)}{1+r} & \text{ otherwise} \end{cases}. \end{align*}$

That is, we assume a loss given default of $100\%$ for all $X_i$ such that the investment $N$ will be lost completely in case of a default. According to Proposition 4.1, the risk of $X_i$ equals the infimum of all $c\in \mathbb{R}$ for which $X_i + c$ is acceptable, i.e. $\rho_{\text{VaR}}(X_i + c)\leq 0$ . For both bonds, we have

$\begin{align*} &P\left[ X_i + \left( -\frac{N (\tilde{r}-r)}{1+r} \right) < 0 \right] \\ & = P[\text{Default of } X_i ] \\ & = \text{PD}. \end{align*}$

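This means that, provided $\text{PD} \leq \lambda$ , both positions $X_i$ are acceptable if we add the negative amount $-\frac{N (\tilde{r_i}-r)}{1+r}$ of cash for $i\in \{1,2\}$ to each financial position (i.e. portfolio), that is, if we withdraw $\frac{N (\tilde{r_i}-r)}{1+r}$ . Any smaller amount would lead to a negative outcome even without a default. That is, $\rho_{\text{VaR}}(X_1)=-\frac{N (\tilde{r_1}-r)}{1+r}$ and $\rho_{\text{VaR}}(X_2)=-\frac{N (\tilde{r_2}-r)}{1+r}$ .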

Now, consider the financial position $Y:= (X_1+X_2)/2$ , which comprises the first two portfolios but equally weighted by $\frac{N}{2}$ . According to (4) of Binomial- and Poisson-Mixture Models, the probability that $k=1$ out of the $m=2$ bonds default can be calculated as follows:

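$\begin{align*} P[\text{Default of exactly one bond}] = \binom{2}{1} \text{PD}^1(1-\text{PD})^{2-1} = 2\,\text{PD}(1-\text{PD}). \end{align*}$

Adding the probability $\text{PD}^2$ that both bonds default and employing the same argument as above (with a common return $\tilde{r}$ for both bonds), we obtain

$\begin{align*} &P\left[ Y + \left( -\frac{N (\tilde{r}-r)}{1+r} \right) < 0 \right] \\ & = P[\text{Default of } X_1 \text{ or } X_2 ] \\ & = 2\,\text{PD}(1-\text{PD}) + \text{PD}^2 \\ & = \text{PD}(2-\text{PD}). \end{align*}$

If $\lambda$ is chosen such that $\text{PD} \leq \lambda < \text{PD}(2-\text{PD})$ , the amount $-\frac{N (\tilde{r}-r)}{1+r}$ suffices for each single bond but no longer for $Y$ . Hence $\rho_{\text{VaR}}(Y) > \rho_{\text{VaR}}(X_i)$ , and by positive homogeneity $\rho_{\text{VaR}}(X_1+X_2) = 2\rho_{\text{VaR}}(Y) > \rho_{\text{VaR}}(X_1)+\rho_{\text{VaR}}(X_2)$ , which contradicts subadditivity.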

$\square$
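
Example 5.2 can be replayed with concrete numbers. The following Python sketch assumes hypothetical parameters ( $N=100$ , $r=0$ , a common return $\tilde{r}=5\%$ , $\text{PD}=0.9\%$ and $\lambda=1\%$ ), chosen such that $\text{PD} \leq \lambda < \text{PD}(2-\text{PD})$ ; the helper `value_at_risk` is an illustrative implementation for discrete distributions.

```python
from fractions import Fraction as F
from itertools import product

def value_at_risk(outcomes, lam):
    """inf{c : P[X + c < 0] <= lam} for (value, probability) pairs."""
    cumulative = 0
    for value, prob in sorted(outcomes):
        cumulative += prob
        if cumulative > lam:
            return -value
    raise ValueError("probabilities must sum to 1 and lam must lie in (0, 1)")

# Hypothetical parameters (assumptions, not taken from the text).
N = F(100)             # invested notional
r = F(0)               # riskless return
r_tilde = F(5, 100)    # common return of both bonds
pd = F(9, 1000)        # default probability
lam = F(1, 100)        # VaR level

gain = N * (r_tilde - r) / (1 + r)    # discounted net gain without default
bond = [(-N, pd), (gain, 1 - pd)]     # distribution of X_1 and of X_2

# Independent defaults: joint probabilities are products of the marginals.
total = [(a + b, pa * pb) for (a, pa), (b, pb) in product(bond, bond)]
diversified = [(value / 2, prob) for value, prob in total]  # Y = (X_1 + X_2) / 2

var_single = value_at_risk(bond, lam)
var_total = value_at_risk(total, lam)
var_diversified = value_at_risk(diversified, lam)

print(var_single, var_total, var_diversified)
```

With these numbers each single bond has a negative value at risk, whereas the combined position has a positive one that exceeds the sum of the individual values, so subadditivity fails and diversification is penalized.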

Another weakness of the VaR is that it does not properly reflect unlikely but catastrophic risks in the relevant tail of the distribution. A natural remedy for not considering unlikely loss events would be to consider the average VaR values beyond some confidence level. This leads us to a risk measure called Expected Shortfall or Average Value at Risk.

Expected Shortfall

The risk measure for market risk according to the Fundamental Review of the Trading Book is the Expected Shortfall (ES). It overcomes the main weaknesses of the VaR.

Definition 5.3 (Expected Shortfall)
The Expected Shortfall (ES) at level $\lambda\in (0,1]$ of a position $X\in \mathcal{X}$ is given by

$\begin{align*} \rho_{\text{ES}_\lambda}(X) := \frac{1}{\lambda} \cdot \int_{0}^{\lambda}{ \rho_{\text{VaR}_{\theta}} (X) \ \text{d}\theta } \end{align*}$

The ES is also called Average VaR and Conditional VaR.

$\square$

The ES inherits monotonicity, cash translation invariance and positive homogeneity from the VaR, since the integral is linear and monotone in the integrand. For the proof that the ES is also subadditive, we refer to the paper “Seven Proofs for the Subadditivity of Expected Shortfall” by Embrechts and Wang.
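
For a discrete distribution, $\theta \mapsto \rho_{\text{VaR}_{\theta}}(X)$ is a step function, so the integral in Definition 5.3 becomes a finite sum. The following Python sketch computes it for two hypothetical independent defaultable bonds (notional 100, net gain 5, default probability $0.9\%$ , level $\lambda=1\%$ ; all values are assumptions) and compares the ES of the sum with the sum of the individual ES values.

```python
from fractions import Fraction as F
from itertools import product

def expected_shortfall(outcomes, lam):
    """(1 / lam) * integral of VaR_theta over (0, lam) for a discrete X."""
    integral = 0
    cumulative = 0
    for value, prob in sorted(outcomes):
        # VaR_theta equals -value for theta in [cumulative, cumulative + prob).
        width = min(cumulative + prob, lam) - cumulative
        if width <= 0:
            break
        integral += -value * width
        cumulative += prob
    return integral / lam

# Hypothetical bond parameters (assumptions).
pd = F(9, 1000)
lam = F(1, 100)
bond = [(F(-100), pd), (F(5), 1 - pd)]

# Sum of two independent bonds: joint probabilities multiply.
total = [(a + b, pa * pb) for (a, pa), (b, pb) in product(bond, bond)]

es_single = expected_shortfall(bond, lam)
es_total = expected_shortfall(total, lam)

print(float(es_single), float(es_total))
```

In this example the ES of the combined position stays below the sum of the two individual ES values, in line with subadditivity; a single example of course does not replace the proof.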
