Binomial- and Poisson-Mixture Model

Introduction

The assumption of independent and identically distributed random variables, short i.i.d., is quite handy since it simplifies many statistical methods. The strong law of large numbers as well as the central limit theorem, for instance, assume independent variables. Benefiting from the independence while still being able to incorporate dependency structures between random variables would therefore be very useful. And indeed, there is a way to sneak correlations into a given setting of i.i.d. random variables. One such technique is the mixture model, for instance the binomial- and the Poisson-mixture model, which are both studied in this post. The following is mainly based on [1].

Let (\Omega, \mathcal{A}, \mathbb{P}) be a suitable probability space and assume that we are interested in a set of m\in \mathbb{N} random variables L_1, \ldots, L_m.

Bernoulli-Mixture Model

A Bernoulli trial (or binomial trial) X \sim B(p) is a random experiment with exactly two possible outcomes, which we call “success” and “failure”. The “success” outcome, often represented by 1, occurs with probability p\in[0,1], while the “failure” outcome, represented by 0, occurs with the complementary probability 1-p\in [0,1].

One of the simplest stochastic processes, the so-called Bernoulli process, is simply a sequence of independent Bernoulli trials.

Example 1:
Let us assume that L_i represents the creditworthiness of counterparty i\in \{1, \ldots, m\} of a credit portfolio \{ L_1, \ldots, L_m\}. It is natural to model the two states “survival” and “default” by using a Bernoulli process with L_i \sim B(p) for all counterparties i\in \{1, \ldots, m\}. That is, L_i \sim B(p) can take on the following two states:

  • 1 with probability p, if the counterparty goes bankrupt, and,
  • 0 with probability 1-p, if the counterparty survives.

Here, p\in [0,1] is the probability of default of counterparty i, which is usually derived from a corresponding credit rating and suitable historical default rates.
\square

Another popular example of a Bernoulli trial/process is tossing a coin, where one side appears with probability p.

More formally, a Bernoulli process B(p) is a (finite) sequence of independent random variables L_1, \ldots, L_m, with

(1)   \begin{equation*}  L_i = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1-p \end{cases} \end{equation*}

where each L_i is a Bernoulli trial with success probability p.

A Bernoulli process is therefore a sequence of independent, identically distributed Bernoulli trials. Given the discrete nature of the Bernoulli trial, the expectation \mathbb{E}(L_i) as well as the variance \mathbb{V}(L_i) of each L_i are given by

(2)   \begin{align*}  \mathbb{E}(L_i) &= (1 \cdot p) + (0 \cdot (1-p)) = p \\ \mathbb{E}(L_i^2) &= (\mathbb{P}[L_i=1] \cdot 1^2) + (\mathbb{P}[L_i=0] \cdot 0^2) = p \\ \mathbb{V}(L_i) &= \mathbb{E}(L_i^2)- \mathbb{E}(L_i)^2 = p-p^2 = p(1-p). \end{align*}
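
The moments in (2) are easy to verify with a quick simulation. The following sketch draws a large number of Bernoulli trials and compares the empirical mean and variance with p and p(1-p); the choice p=0.3 and the sample size are purely illustrative.

# quick Monte Carlo check of (2); p and the sample size are arbitrary example values
p <- 0.3
x <- rbinom(100000, size = 1, prob = p) # 100,000 Bernoulli trials B(p)
mean(x) # should be close to p = 0.3
var(x)  # should be close to p*(1-p) = 0.21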

In the next sections we will have a closer look at the Bernoulli process, and we will relax the assumption of independence between the trials.

Sum of i.i.d. Bernoulli Random Variables with constant Parameter

For the sake of illustration let us assume that all random variables are mutually independent and that they all have the same probability p\in [0,1] assigned. That is, L_1, \ldots, L_m is a Bernoulli process with probability p. In this case the distribution of the sum

(3)   \begin{equation*}  \mathbf{L}= \sum_{i=1}^m{L_i} \end{equation*}

is governed by the binomial distribution B(m, p).

Example 2:
We continue Example 1 and consider the sum \mathbf{L}=\sum_{i=1}^m{L_i} of the corresponding Bernoulli process L_i \sim B(p), which is distributed according to B(m, p). The probability that k out of m counterparties are going to default (“success”) is \mathbb{P} [ \mathbf{L} = k ] = \binom{m}{k} p^k(1-p)^{m-k}.

For instance, if m=100 and p=0.01 then two defaults will occur with probability \binom{100}{2} 0.01^2 (1-0.01)^{100-2} \approx~ 18.49\%.
\square
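
The figure from Example 2 is easy to double-check in R using the built-in binomial functions:

# probability of exactly k = 2 defaults out of m = 100 with p = 0.01
dbinom(2, size = 100, prob = 0.01) # approx. 0.1849
# expected number of defaults and its variance
100 * 0.01        # m*p = 1
100 * 0.01 * 0.99 # m*p*(1-p) = 0.99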

As illustrated, it is easy to calculate the key characteristics of the distribution of \mathbf{L} \sim B(m,p). For instance, the expectation, the variance as well as the probability of a specific value k\in \mathbb{N} are given by:

(4)   \begin{align*}  \mathbb{E}(\mathbf{L}) &=m\cdot p \\ \mathbb{V}(\mathbf{L}) &=m\cdot p \cdot (1-p) \\ \mathbb{P}[\mathbf{L}=k] &= \binom{m}{k} p^k(1-p)^{m-k} \end{align*}

Besides that, it would also be quite simple to determine the expectation \mathbb{E}(\mathbf{L}) as well as the variance \mathbb{V}(\mathbf{L}) directly from (2) by applying the additivity of the expectation and, thanks to the independence, of the variance.

In the next step we are going to incorporate correlation between the variables L_1, \ldots, L_m by using the distribution parameters p_1, \ldots, p_m. Notice that p_i \in [0,1] is a probability.

Sum of Independent Bernoulli Random Variables with Random Parameter

By treating the distribution parameters \mathbf{p}=(p_1, \ldots, p_m) of the outlined Bernoulli process as random variables \mathbf{P}=(P_1, \ldots, P_m), it is possible to incorporate a correlation. That is, we assume that \mathbf{P}=(P_1, \ldots, P_m) follows a joint distribution represented by the distribution function F with support in [0,1]^m. Correspondingly, we assume \mathbf{L} \sim B(m, \mathbf{P}), where \mathbf{P} itself is considered to be a random variable.

This kills two birds with one stone. First, we drop the assumption that all Bernoulli trials have the same probability p. Second, we can incorporate correlation between the m random variables, since the correlation Corr(L_i, L_j) depends only on P_i and P_j.

In more mathematical terms, let

(5)   \begin{align*}  L_i |_{P_i=p_i}  \sim B(p_i),  \hspace{0.25cm} \text{ and }\hspace{0.25cm} (L_i |_{\mathbf{P}=\mathbf{p}})_{i=1, \ldots, m} \text{ independent.} \end{align*}

Conditional on the realization of the parameters p_1, \ldots, p_m, the random variables L_1, \ldots, L_m are still independent. One could, for instance, assume that the components of \mathbf{P} have uniform marginal distributions on [0,1] and are coupled by a Gaussian copula, as in the R code example below.

The joint distribution of (L_1, \ldots, L_m) is given by

(6)   \begin{align*}  \mathbb{P}[L_1=l_1, \ldots, L_m=l_m] = \int_{[0,1]^m}{\prod_{i=1}^m{p_i^{l_i}(1-p_i)^{1-l_i}} \ dF(p_1, \ldots, p_m)} \end{align*}

where l_i \in \{0,1\} (“failure”/“success”). The representation (6) is possible since, conditional on \mathbf{P}=\mathbf{p}, we can leverage the independence.
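
As a quick sanity check, for a single variable (m=1) the representation (6) reduces to \mathbb{P}[L_1=1] = \int_{[0,1]}{p \ dF(p)} = \mathbb{E}(P_1), i.e., the unconditional “success” probability is simply the expected value of the random parameter.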

The covariance between the marginal random variables equals

(7)   \begin{align*} Cov(L_i, L_j) = \mathbb{E}[L_iL_j] - \mathbb{E}[L_i]\mathbb{E}[L_j] = Cov[P_i, P_j] \end{align*}

and therefore the correlation can be determined by

(8)   \begin{align*} Cor(L_i, L_j) = \frac{Cov(P_i, P_j)}{ \sqrt{\mathbb{E}(P_i)(1-\mathbb{E}(P_i))} \sqrt{\mathbb{E}(P_j)(1-\mathbb{E}(P_j))}}. \end{align*}
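
The denominator in (8) is the product of the standard deviations of L_i and L_j: since L_i^2 = L_i, we have \mathbb{E}(L_i^2) = \mathbb{E}(L_i) = \mathbb{E}(P_i) and therefore \mathbb{V}(L_i) = \mathbb{E}(P_i)(1-\mathbb{E}(P_i)).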

That is, the correlation between the two variables L_i and L_j is fully determined by the variables P_i and P_j. We can therefore use this backdoor to sneak in correlation using conditional independence.

The binomial-mixture model is analytically tractable. Nonetheless, we can simulate both sides of equations (7) and (8) to double-check them by example. The following R code is one way to do so.

# Binomial Mixture Model:
# ---------

# Produce dependent uniformly distributed variables by starting with multivariate 
# normal variables. We therefore need the following package
library(mvtnorm)

m <- 20 #nbr of variables
simulation.nbr <- 1000000 #nbr of simulations

# get correlated normal random variables
sigma <- diag(1, nrow = m, ncol = m)
# variable 2 and 3 shall be correlated with 0.5
sigma[2,3] <- 0.5
sigma[3,2] <- 0.5
# variable 12 and 20 shall be correlated with 0.8
sigma[12,20] <- 0.8
sigma[20,12] <- 0.8

# simulate the m-dimensional joint normal distribution
X <- rmvnorm(n=simulation.nbr, mean = rep(0,m), sigma = sigma)
# transform data into correlated prob on [0,1]^m
P <- pnorm(X)

# prepare data structure for variables
L <- matrix(nrow = simulation.nbr, ncol=m)

# simulate the conditional Bernoulli trial for each variable i
for(i in 1:m){
  # L_i = 1 ("default") with probability P[,i], 0 otherwise
  L[,i] <- ifelse(runif(simulation.nbr) < P[,i], 1, 0)
}

# By construction (see (7)) the following covariances should be approximately equal
cov(P[,2],P[,3])
cov(L[,2],L[,3])

cov(P[,12],P[,20])
cov(L[,12],L[,20])

# By construction (see (8)) the following correlations should be approximately equal
cov(P[,2],P[,3])/( sqrt(mean(P[,2])*(1-mean(P[,2]))) * sqrt(mean(P[,3])*(1-mean(P[,3]))))
cor(L[,2],L[,3])

cov(P[,12],P[,20])/( sqrt(mean(P[,12])*(1-mean(P[,12]))) * sqrt(mean(P[,20])*(1-mean(P[,20]))))
cor(L[,12],L[,20])

Please notice that we simulate the conditional Bernoulli trials using the code ifelse(runif(simulation.nbr) < P[,i], 1, 0), where P[,i] already carries the correlation information of the parameter p_i in it.

Poisson-Mixture Model

A similar pattern is also used for Poisson-distributed random variables. Let us therefore assume that our random variables L_1, \ldots, L_m are distributed according to the Poisson distribution with intensity parameter \lambda_i, denoted by L_i \sim Pois(\lambda_i) \forall i\in \{1, \ldots, m\}. The expectation as well as the variance of a Poisson-distributed random variable L_i is given by its intensity parameter, i.e.,

(9)   \begin{align*} \mathbb{E}(L_i)  =  \mathbb{V}(L_i)  = \lambda_i. \end{align*}

The distribution of the sum of independent Poisson-distributed variables \mathbf{L} = \sum_{i=1}^m{L_i} is determined by

(10)   \begin{equation*}  \mathbf{L} \sim Pois \left( \sum_{i=1}^m{\lambda_i}\right). \end{equation*}

Given that we know the distribution of \mathbf{L}, it is easy to derive its characteristics such as the expectation and the variance:

(11)   \begin{align*} \mathbb{E}(\mathbf{L})  =  \mathbb{V}(\mathbf{L})  = \sum_{i=1}^m{\lambda_i} . \end{align*}
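
A quick simulation illustrates (10) and (11); the intensities chosen below are arbitrary example values.

# Monte Carlo check of (10)/(11) with arbitrary example intensities
lambda <- c(0.5, 1.2, 2.3)
L.sum <- rowSums(sapply(lambda, function(l) rpois(100000, l)))
mean(L.sum) # should be close to sum(lambda) = 4
var(L.sum)  # should be close to sum(lambda) = 4
# the empirical distribution should match Pois(sum(lambda)), e.g. at k = 4
mean(L.sum == 4)
dpois(4, lambda = sum(lambda)) # approx. 0.1954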

Just as in the binomial case we introduce correlation between the variables L_i and L_j with i\neq j by assuming that \mathbf{\Lambda} = (\Lambda_1, \ldots, \Lambda_m) is a random vector with distribution function F and support in [0, \infty[^m.

Further we assume that conditionally on a realization \mathbf{\lambda} = (\lambda_1, \ldots, \lambda_m) of \mathbf{\Lambda} the variables L_i|_{\Lambda_i=\lambda_i} with i\in \{1, \ldots, m\} are independent:

(12)   \begin{equation*}  L_i|_{\Lambda_i=\lambda_i} \sim Pois(\lambda_i), \hspace{0.5cm} (L_i|_{\mathbf{\Lambda}=\mathbf{\lambda}})_{i=1, \ldots, m} \text{ independent}. \end{equation*}

The joint distribution of (L_1, \ldots, L_m) is given by

(13)   \begin{align*}  \mathbb{P}[L_1=l_1, \ldots, L_m=l_m] = \int_{[0,\infty[^m}{\prod_{i=1}^m{ e^{-\lambda_i} \frac{\lambda_i^{l_i}}{l_i!} } \ dF(\lambda_1, \ldots, \lambda_m)} \end{align*}

where l_i \in \{0,1,2, \ldots\}.
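
For a single variable (m=1), (13) reduces to the well-known mixed Poisson distribution \mathbb{P}[L_1=l_1] = \int_{[0,\infty[}{e^{-\lambda} \frac{\lambda^{l_1}}{l_1!} \ dF(\lambda)}.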

The covariance between the marginal random variables equals

(14)   \begin{align*} Cov(L_i, L_j) = Cov[\Lambda_i, \Lambda_j] \end{align*}

and therefore the correlation can be determined by

(15)   \begin{align*} Cor(L_i, L_j) = \frac{Cov(\Lambda_i, \Lambda_j)}{ \sqrt{\mathbb{V}(\Lambda_i)+ \mathbb{E}(\Lambda_i)} \sqrt{\mathbb{V}(\Lambda_j)+\mathbb{E}(\Lambda_j)}}. \end{align*}
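
The denominator in (15) follows from the law of total variance: \mathbb{V}(L_i) = \mathbb{E}(\mathbb{V}(L_i|\Lambda_i)) + \mathbb{V}(\mathbb{E}(L_i|\Lambda_i)) = \mathbb{E}(\Lambda_i) + \mathbb{V}(\Lambda_i).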

That is, the correlation between the two variables L_i and L_j is fully determined by the variables \Lambda_i and \Lambda_j. We can therefore use this backdoor to sneak the correlation in using conditional independence.

The Poisson-mixture model is analytically tractable. Nonetheless, we can simulate both sides of equations (14) and (15) to double-check them by example. The following R code is one way to do so.

# Poisson Mixture Model:
# ---------

# Produce dependent variables on positive reals using multivariate discrete 
# random variables and a Gaussian copula. To this end we use the GenOrd package:
library(GenOrd)

m <- 20 #nbr of variables
simulation.nbr <- 1000000

# since we want to have correlated random variables we need a corr matrix
sigma <- diag(1, nrow = m, ncol = m)
# variable 2 and 3 shall be correlated with 0.2
sigma[2,3] <- 0.2
sigma[3,2] <- 0.2
# variable 12 and 20 shall be correlated with 0.7
sigma[12,20] <- 0.7
sigma[20,12] <- 0.7

# to make it reproducible
set.seed(1)

# list of m vectors of cumulative probabilities, each defining the i-th (discrete)
# marginal distribution via its CDF; in our simple example all marginals are the same
marginal.CDF <- vector("list", m)
for(i in 1:m){
  marginal.CDF[[i]] <- c(0.1, 0.3, 0.5, 0.6, 0.75) 
}

# checks the lower and upper bounds of the correlation coefficients.
# in our case the two correlations lie within the thresholds
corrcheck(marginal.CDF)

# create correlated sample with given marginals
Lambda <- ordsample(n = simulation.nbr, marginal = marginal.CDF, Sigma = sigma)

# prepare data structure for the Poisson-distributed variables
# using the realized lambdas from variable Lambda
L <- array(0,dim=c(simulation.nbr*10, m)) 

# simulate the Poisson distribution for each variable i
for(i in 1:m){ # do for all variables L.i
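    # note: lambda = Lambda[,i] has length simulation.nbr and is recycled 10 times
    # across the n = simulation.nbr*10 draws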
    L[,i] <- rpois(n = simulation.nbr*10, lambda = Lambda[,i])
}

# By construction (see (14)) the following covariances should be approximately equal
cov(Lambda[,2],Lambda[,3])
cov(L[,2],L[,3])

cov(Lambda[,12],Lambda[,20])
cov(L[,12],L[,20])

# By construction (see (15)) the following correlations should be approximately equal
cov(Lambda[,2],Lambda[,3])/( sqrt(var(Lambda[,2])+ mean(Lambda[,2])) * sqrt(var(Lambda[,3])+ mean(Lambda[,3])) )
cor(L[,2],L[,3])

cov(Lambda[,12],Lambda[,20])/( sqrt(var(Lambda[,12])+ mean(Lambda[,12])) * sqrt(var(Lambda[,20])+ mean(Lambda[,20])) )
cor(L[,12],L[,20])

Please notice that rpois(n=20, lambda=c(1,500)), for instance, generates 20 random variates by recycling the two different intensity parameters \lambda_1=1 and \lambda_2=500, so that each parameter is used for 10 of the draws. We therefore multiply the variable simulation.nbr by 10, so that each realized intensity in Lambda is reused 10 times, which increases the quality of the approximation.
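
A minimal illustration of this recycling behaviour:

# lambda = c(1, 500) is recycled across the n = 20 draws,
# so draws 1, 3, 5, ... use lambda = 1 and draws 2, 4, 6, ... use lambda = 500
rpois(n = 20, lambda = c(1, 500))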

Literature:
[1] Bluhm, C., Overbeck, L. and Wagner, C. (2010). Introduction to Credit Risk Modeling. 2nd ed. Boca Raton, FL: Chapman & Hall/CRC (Chapman & Hall/CRC Financial Mathematics Series).