# Compound Poisson distribution-discrete example

We present a discrete example of a compound Poisson distribution. A random variable $Y$ has a compound distribution if $Y=X_1+ \cdots +X_N$ where the number of terms $N$ is a discrete random variable whose support is the set of all nonnegative integers (or some appropriate subset) and the random variables $X_i$ are identically distributed (let $X$ be the common distribution). We further assume that the random variables $X_i$ are independent and each $X_i$ is independent of $N$. When $N$ follows the Poisson distribution, $Y$ is said to have a compound Poisson distribution. When the common distribution for the $X_i$ is continuous, $Y$ is a mixed distribution if $P[N=0]$ is nonzero. When the common distribution for the $X_i$ is discrete, $Y$ is a discrete distribution. In this post we present an example of a compound Poisson distribution where the common distribution $X$ is discrete. The compound distribution has a natural insurance interpretation (see the following links).

General Discussion
In general, the distribution function of a compound Poisson random variable $Y$ is a weighted average of the $n$-fold convolutions of the common distribution function of the individual claim amount $X$. The following shows the form of such a distribution function:

$\displaystyle F_Y(y)=\sum \limits_{n=0}^{\infty} F^{*n}(y) P[N=n]$

where $\displaystyle F$ is the common distribution function of the $X_i$ and $F^{*n}$ is the $n$-fold convolution of $F$.

If the distribution of the individual claim $X$ is discrete, we can obtain the probability mass function of $Y$ by convolutions as follows:

$\displaystyle f_Y(y)=P[Y=y]=\sum \limits_{n=0}^{\infty} p^{*n}(y) P[N=n]$

where $\displaystyle p^{*1}(y)=P[X=y]$,
$\displaystyle p^{*n}(y)=P[X_1+X_2+ \cdots +X_n=y]$,
and $\displaystyle p^{*0}(y)=\left\{\begin{matrix}0&\thinspace y \ne 0\\{1}&\thinspace y=0\end{matrix}\right.$

Example
Suppose the number of claims generated by a portfolio of insurance policies over a fixed time period has a Poisson distribution with parameter $\lambda$. Individual claim amounts will be 1 or 2 with probabilities 0.6 and 0.4, respectively. For the compound Poisson aggregate claims $Y=X_1+ \cdots +X_N$, find $P[Y=k]$ for $k=0,1,2,3,4$.

The probability mass function of $N$ is: $\displaystyle f_N(n)=\frac{\lambda^n e^{-\lambda}}{n!}$ where $n=0,1,2, \cdots$. The individual claim amount $X$ is a two-valued discrete random variable. For convenience, we let $p=0.4$ (i.e. we consider $X=2$ a success), so that $X-1$ has a Bernoulli distribution. Then the sum $X_1+ \cdots + X_n$ equals $n$ plus a Binomial$(n,p)$ random variable. Consequently, the $n^{th}$ convolution is given by $\displaystyle p^{*n}(n+j)=\binom{n}{j} p^j (1-p)^{n-j}$ for $j=0,1,\cdots,n$. The following shows $p^{*n}$ for $n=1,2,3,4$.

$\displaystyle p^{*1}(1)=0.6, \thinspace p^{*1}(2)=0.4$

$\displaystyle p^{*2}(2)=\binom{2}{0} (0.4)^0 (0.6)^2=0.36$
$\displaystyle p^{*2}(3)=\binom{2}{1} (0.4)^1 (0.6)^1=0.48$
$\displaystyle p^{*2}(4)=\binom{2}{2} (0.4)^2 (0.6)^0=0.16$

$\displaystyle p^{*3}(3)=\binom{3}{0} (0.4)^0 (0.6)^3=0.216$
$\displaystyle p^{*3}(4)=\binom{3}{1} (0.4)^1 (0.6)^2=0.432$
$\displaystyle p^{*3}(5)=\binom{3}{2} (0.4)^2 (0.6)^1=0.288$
$\displaystyle p^{*3}(6)=\binom{3}{3} (0.4)^3 (0.6)^0=0.064$

$\displaystyle p^{*4}(4)=\binom{4}{0} (0.4)^0 (0.6)^4=0.1296$
$\displaystyle p^{*4}(5)=\binom{4}{1} (0.4)^1 (0.6)^3=0.3456$
$\displaystyle p^{*4}(6)=\binom{4}{2} (0.4)^2 (0.6)^2=0.3456$
$\displaystyle p^{*4}(7)=\binom{4}{3} (0.4)^3 (0.6)^1=0.1536$
$\displaystyle p^{*4}(8)=\binom{4}{4} (0.4)^4 (0.6)^0=0.0256$

Since we are interested in finding $P[Y=y]$ for $y=0,1,2,3,4$, we only need to consider $N=0,1,2,3,4$. The following matrix shows the relevant values of $p^{*n}$. The rows are for $y=0,1,2,3,4$. The columns are $p^{*0}$, $p^{*1}$, $p^{*2}$, $p^{*3}$, $p^{*4}$.

$\displaystyle \begin{pmatrix} 1&0&0&0&0 \\ 0&0.6&0&0&0 \\ 0&0.4&0.36&0&0 \\ 0&0&0.48&0.216&0 \\ 0&0&0.16&0.432&0.1296 \end{pmatrix}$

To obtain the probability mass function of $Y$, we multiply the entries of the column for $p^{*n}$ by $P[N=n]$ and then sum across each row.

$\displaystyle P[Y=0]=e^{-\lambda}$
$\displaystyle P[Y=1]=0.6 \lambda e^{-\lambda}$
$\displaystyle P[Y=2]=0.4 \lambda e^{-\lambda}+0.36 \frac{\lambda^2 e^{-\lambda}}{2}$
$\displaystyle P[Y=3]=0.48 \frac{\lambda^2 e^{-\lambda}}{2}+0.216 \frac{\lambda^3 e^{-\lambda}}{6}$
$\displaystyle P[Y=4]=0.16 \frac{\lambda^2 e^{-\lambda}}{2}+0.432 \frac{\lambda^3 e^{-\lambda}}{6}+0.1296 \frac{\lambda^4 e^{-\lambda}}{24}$
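The five probabilities above can be verified numerically. The following Python sketch builds the convolutions $p^{*n}$ iteratively and assembles the pmf of $Y$; the helper name and the choice $\lambda=2$ are ours, for illustration only.

```python
from math import exp, factorial

def compound_poisson_pmf(lam, claim_pmf, y_max, n_max=50):
    """Truncated pmf of Y = X_1 + ... + X_N with N ~ Poisson(lam).

    claim_pmf is a dict {claim amount: probability}; the Poisson sum
    is truncated at n_max, which is harmless for moderate lam.
    """
    conv = {0: 1.0}                       # p^{*0}: point mass at 0
    pmf = {y: 0.0 for y in range(y_max + 1)}
    for n in range(n_max + 1):
        pn = lam**n * exp(-lam) / factorial(n)     # P[N = n]
        for y, q in conv.items():
            if y <= y_max:
                pmf[y] += q * pn
        nxt = {}                          # p^{*(n+1)} = p^{*n} convolved with claim_pmf
        for y, q in conv.items():
            for x, px in claim_pmf.items():
                nxt[y + x] = nxt.get(y + x, 0.0) + q * px
        conv = nxt
    return pmf

lam = 2.0
pmf = compound_poisson_pmf(lam, {1: 0.6, 2: 0.4}, y_max=4)
# compare with the closed forms displayed above
assert abs(pmf[0] - exp(-lam)) < 1e-12
assert abs(pmf[1] - 0.6*lam*exp(-lam)) < 1e-12
assert abs(pmf[2] - (0.4*lam*exp(-lam) + 0.36*lam**2*exp(-lam)/2)) < 1e-12
assert abs(pmf[3] - (0.48*lam**2*exp(-lam)/2 + 0.216*lam**3*exp(-lam)/6)) < 1e-12
assert abs(pmf[4] - (0.16*lam**2*exp(-lam)/2 + 0.432*lam**3*exp(-lam)/6
                     + 0.1296*lam**4*exp(-lam)/24)) < 1e-12
```

Any discrete claim pmf can be passed in; only the five displayed probabilities depend on the particular $0.6/0.4$ example.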

# Compound Poisson distribution

The compound distribution is a model for describing the aggregate claims arising from a group of independent insureds. Let $N$ be the number of claims generated by a portfolio of insurance policies in a fixed time period. Suppose $X_1$ is the amount of the first claim, $X_2$ is the amount of the second claim and so on. Then $Y=X_1+X_2+ \cdots + X_N$ represents the total aggregate claims generated by this portfolio of policies in the given fixed time period. In order to make this model more tractable, we make the following assumptions:

• $X_1,X_2, \cdots$ are independent and identically distributed.
• Each $X_i$ is independent of the number of claims $N$.

The number of claims $N$ is associated with the claim frequency in the given portfolio of policies. The common distribution of $X_1,X_2, \cdots$ is denoted by $X$. Note that $X$ models the amount of a random claim generated in this portfolio of insurance policies. See these two posts for an introduction to compound distributions (An introduction to compound distributions, Some examples of compound distributions).

When the claim frequency $N$ follows a Poisson distribution with a constant parameter $\lambda$, the aggregate claims $Y$ is said to have a compound Poisson distribution. After a general discussion of the compound Poisson distribution, we discuss the property that an independent sum of compound Poisson distributions is also a compound Poisson distribution. We also present an example to illustrate basic calculations.

Compound Poisson – General Properties

Distribution Function
$\displaystyle F_Y(y)=\sum \limits_{n=0}^{\infty} F^{*n}(y) \frac{\lambda^n e^{-\lambda}}{n!}$

where $\lambda=E[N]$, $F$ is the common distribution function of $X_i$ and $F^{*n}$ is the n-fold convolution of $F$.

Mean and Variance
$\displaystyle E[Y]=E[N] E[X]= \lambda E[X]$

$\displaystyle Var[Y]=\lambda E[X^2]$

Moment Generating Function and Cumulant Generating Function
$\displaystyle M_Y(t)=e^{\lambda (M_X(t)-1)}$

$\displaystyle \Psi_Y(t)=ln M_Y(t)=\lambda (M_X(t)-1)$

Note that the moment generating function of the Poisson $N$ is $M_N(t)=e^{\lambda (e^t - 1)}$. For a compound distribution $Y$ in general, $M_Y(t)=M_N[ln M_X(t)]$.

Skewness
$\displaystyle E[(Y-\mu_Y)^3]=\Psi_Y^{(3)}(0)=\lambda E[X^3]$

$\displaystyle \gamma_Y=\frac{E[(Y-\mu_Y)^3]}{Var[Y]^{\frac{3}{2}}}=\frac{1}{\sqrt{\lambda}} \frac{E[X^3]}{E[X^2]^{\frac{3}{2}}}$
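These moment formulas can be checked by brute force against the discrete example above ($\lambda=2$, claims of 1 or 2 with probabilities 0.6 and 0.4). The sketch below is our own illustration; truncating the Poisson sum at $N \le 60$ is a numerical convenience.

```python
from math import exp, factorial

lam = 2.0
px = {1: 0.6, 2: 0.4}                      # individual claim pmf
EX  = sum(x * p for x, p in px.items())
EX2 = sum(x**2 * p for x, p in px.items())
EX3 = sum(x**3 * p for x, p in px.items())

# brute-force pmf of Y by conditioning on N = 0..60 (the Poisson tail is negligible)
conv, pmf = {0: 1.0}, {}
for n in range(61):
    pn = lam**n * exp(-lam) / factorial(n)
    for y, q in conv.items():
        pmf[y] = pmf.get(y, 0.0) + q * pn
    nxt = {}
    for y, q in conv.items():
        for x, p in px.items():
            nxt[y + x] = nxt.get(y + x, 0.0) + q * p
    conv = nxt

mean = sum(y * p for y, p in pmf.items())
var  = sum((y - mean)**2 * p for y, p in pmf.items())
m3   = sum((y - mean)**3 * p for y, p in pmf.items())
skew = m3 / var**1.5

assert abs(mean - lam * EX) < 1e-8                     # E[Y] = lambda E[X]
assert abs(var - lam * EX2) < 1e-8                     # Var[Y] = lambda E[X^2]
assert abs(skew - EX3 / (lam**0.5 * EX2**1.5)) < 1e-8  # skewness formula above
```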

Independent Sum of Compound Poisson Distributions
First, we state the results. Suppose that $Y_1,Y_2, \cdots, Y_k$ are independent random variables such that each $Y_i$ has a compound Poisson distribution with $\lambda_i$ being the Poisson parameter for the number of claim variable and $F_i$ being the distribution function for the individual claim amount. Then $Y=Y_1+Y_2+ \cdots +Y_k$ has a compound Poisson distribution with:

• the Poisson parameter: $\displaystyle \lambda=\sum \limits_{i=1}^{k} \lambda_i$
• the distribution function: $\displaystyle F_Y(y)=\sum \limits_{i=1}^{k} \frac{\lambda_i}{\lambda} \thinspace F_i(y)$

The above result has an insurance interpretation. Suppose we have $k$ independent blocks of insurance policies such that the aggregate claims $Y_i$ for the $i^{th}$ block has a compound Poisson distribution. Then $Y=Y_1+Y_2+ \cdots +Y_k$ is the aggregate claims for the combined block during the fixed policy period and also has a compound Poisson distribution with the parameters stated in the above two bullet points.

To get a further intuitive understanding about the parameters of the combined block, consider $N_i$ as the Poisson number of claims in the $i^{th}$ block of insurance policies. It is a well known fact in probability theory (see [1]) that the independent sum of Poisson variables is also a Poisson random variable. Thus the total number of claims in the combined block is $N=N_1+N_2+ \cdots +N_k$ and has a Poisson distribution with parameter $\lambda=\lambda_1 + \cdots + \lambda_k$.

How do we describe the distribution of an individual claim amount in the combined insurance block? Given a claim from the combined block, since we do not know which of the constituent blocks it is from, this suggests that an individual claim amount is a mixture of the individual claim amount distributions from the $k$ blocks with mixing weights $\displaystyle \frac{\lambda_1}{\lambda},\frac{\lambda_2}{\lambda}, \cdots, \frac{\lambda_k}{\lambda}$. These mixing weights make intuitive sense. If insurance block $i$ has a higher claim frequency $\lambda_i$, then it is more likely that a randomly selected claim from the combined block comes from block $i$. Of course, this discussion is not a proof. But looking at the insurance model is a helpful way of understanding the independent sum of compound Poisson distributions.

To see why the stated result is true, let $M_i(t)$ be the moment generating function of the individual claim amount in the $i^{th}$ block of policies. Then the mgf of the aggregate claims $Y_i$ is $\displaystyle M_{Y_i}(t)=e^{\lambda_i (M_i(t)-1)}$. Consequently, the mgf of the independent sum $Y=Y_1+ \cdots + Y_k$ is:

$\displaystyle M_Y(t)=\prod \limits_{i=1}^{k} e^{\lambda_i (M_i(t)-1)}= e^{\sum \limits_{i=1}^{k} \lambda_i(M_i(t)-1)} = e^{\lambda \biggl[\sum \limits_{i=1}^{k} \frac{\lambda_i}{\lambda} M_i(t) - 1 \biggr]}$

The mgf of $Y$ has the form of a compound Poisson distribution where the Poisson parameter is $\lambda=\lambda_1 + \cdots + \lambda_k$. Note that the component $\displaystyle \sum \limits_{i=1}^{k} \frac{\lambda_i}{\lambda}M_i(t)$ in the exponent is the mgf of the claim amount distribution. Since it is the weighted average of the individual claim amount mgf’s, this indicates that the distribution function of $Y$ is the mixture of the distribution functions $F_i$.
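For claim amounts on a discrete support, the result can also be verified exactly by convolution. In the sketch below the two blocks ($\lambda_1=1$ with claims of size 1, $\lambda_2=0.5$ with claims of size 2 or 3) are invented for illustration.

```python
from math import exp, factorial

def cp_pmf(lam, claim_pmf, n_max=40):
    """Truncated pmf of a compound Poisson random sum with discrete claims."""
    conv, pmf = {0: 1.0}, {}
    for n in range(n_max + 1):
        pn = lam**n * exp(-lam) / factorial(n)
        for y, q in conv.items():
            pmf[y] = pmf.get(y, 0.0) + q * pn
        nxt = {}
        for y, q in conv.items():
            for x, p in claim_pmf.items():
                nxt[y + x] = nxt.get(y + x, 0.0) + q * p
        conv = nxt
    return pmf

lam1, F1 = 1.0, {1: 1.0}                  # block 1: every claim is 1
lam2, F2 = 0.5, {2: 0.7, 3: 0.3}          # block 2: claims of 2 or 3
lam = lam1 + lam2
# mixed claim pmf with weights lam_i / lam
mix = {1: (lam1/lam) * 1.0, 2: (lam2/lam) * 0.7, 3: (lam2/lam) * 0.3}

p1, p2 = cp_pmf(lam1, F1), cp_pmf(lam2, F2)
psum = {}                                 # pmf of the independent sum Y1 + Y2
for a, qa in p1.items():
    for b, qb in p2.items():
        psum[a + b] = psum.get(a + b, 0.0) + qa * qb

pmix = cp_pmf(lam, mix)                   # compound Poisson(lam) with the mixture
for y in range(15):
    assert abs(psum.get(y, 0.0) - pmix.get(y, 0.0)) < 1e-9
```

The two pmf's agree, as the mgf argument predicts.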

Example
Suppose that an insurance company acquired two portfolios of insurance policies and combined them into a single block. For each portfolio the aggregate claims variable has a compound Poisson distribution. For one of the portfolios, the Poisson parameter is $\lambda_1$ and the individual claim amount has an exponential distribution with parameter $\delta_1$. The corresponding Poisson and exponential parameters for the other portfolio are $\lambda_2$ and $\delta_2$, respectively. Discuss the distribution for the aggregate claims $Y=Y_1+Y_2$ of the combined portfolio.

The aggregate claims $Y$ of the combined portfolio has a compound Poisson distribution with Poisson parameter $\lambda=\lambda_1+\lambda_2$. The amount of a random claim $X$ in the combined portfolio has the following distribution function and density function:

$\displaystyle F_X(x)=\frac{\lambda_1}{\lambda} (1-e^{-\delta_1 x})+\frac{\lambda_2}{\lambda} (1-e^{-\delta_2 x})$

$\displaystyle f_X(x)=\frac{\lambda_1}{\lambda} (\delta_1 \thinspace e^{-\delta_1 x})+\frac{\lambda_2}{\lambda} (\delta_2 \thinspace e^{-\delta_2 x})$

The rest of the discussion mirrors the general discussion earlier in this post.

Distribution Function
As in the general case, $\displaystyle F_Y(y)=\sum \limits_{n=0}^{\infty} F^{*n}(y) \frac{\lambda^n e^{-\lambda}}{n!}$

where $\lambda=\lambda_1 +\lambda_2$, $F=F_X$ and $F^{*n}$ is the n-fold convolution of $F_X$.

Mean and Variance
$\displaystyle E[Y]=\frac{\lambda_1}{\delta_1}+\frac{\lambda_2}{\delta_2}$

$\displaystyle Var[Y]=\frac{2 \lambda_1}{\delta_1^2}+\frac{2 \lambda_2}{\delta_2^2}$

Moment Generating Function and Cumulant Generating Function
To obtain the mgf and cgf of the aggregate claims $Y$, consider $\lambda [M_X(t)-1]$. Note that $M_X(t)$ is the weighted average of the two exponential mgfs of the two portfolios of insurance policies. Thus we have:

$\displaystyle M_X(t)=\frac{\lambda_1}{\lambda} \frac{\delta_1}{\delta_1 - t}+\frac{\lambda_2}{\lambda} \frac{\delta_2}{\delta_2 - t}$

$\displaystyle \lambda [M_X(t)-1]=\frac{\lambda_1 t}{\delta_1 - t}+\frac{\lambda_2 t}{\delta_2 - t}$

$\displaystyle M_Y(t)=e^{\lambda (M_X(t)-1)}=e^{\frac{\lambda_1 t}{\delta_1 - t}+\frac{\lambda_2 t}{\delta_2 - t}}$

$\displaystyle \Psi_Y(t)=\frac{\lambda_1 t}{\delta_1 -t}+\frac{\lambda_2 t}{\delta_2 -t}$

Skewness
Note that $\displaystyle E[(Y-\mu_Y)^3]=\Psi_Y^{(3)}(0)=\frac{6 \lambda_1}{\delta_1^3}+\frac{6 \lambda_2}{\delta_2^3}$

$\displaystyle \gamma_Y=\frac{\frac{6 \lambda_1}{\delta_1^3}+\frac{6 \lambda_2}{\delta_2^3}}{\biggl(\frac{2 \lambda_1}{\delta_1^2}+\frac{2 \lambda_2}{\delta_2^2}\biggr)^{\frac{3}{2}}}$
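A quick numeric sanity check of the cgf identity and the moments above; the parameter values are arbitrary choices of ours.

```python
lam1, lam2, d1, d2 = 2.0, 3.0, 1.5, 0.8    # illustrative parameters
lam = lam1 + lam2

def M_X(t):
    # weighted average of the two exponential mgf's
    return (lam1/lam) * d1/(d1 - t) + (lam2/lam) * d2/(d2 - t)

for t in (0.1, 0.3, 0.5):                  # need t < min(d1, d2)
    lhs = lam * (M_X(t) - 1)               # cgf of the combined block
    rhs = lam1*t/(d1 - t) + lam2*t/(d2 - t)
    assert abs(lhs - rhs) < 1e-9

# mean, variance and skewness of the combined block
EY   = lam1/d1 + lam2/d2
var  = 2*lam1/d1**2 + 2*lam2/d2**2
m3   = 6*lam1/d1**3 + 6*lam2/d2**3
skew = m3 / var**1.5
assert var > 0 and skew > 0
```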

Reference

1. Hogg R. V. and Tanis E. A., Probability and Statistical Inference, Second Edition, Macmillan Publishing Co., New York, 1983.

# Some examples of compound distributions

We present two examples of compound distributions to illustrate the general formulas presented in the previous post (An introduction to compound distributions).

For the examples below, let $N$ be the number of claims generated by either an individual insured or a group of independent insureds. Let $X$ be the individual claim amount. We consider the random sum $Y=X_1+ \cdots + X_N$. We discuss the following properties of the aggregate claims random variable $Y$:

1. The distribution function $F_Y$
2. The mean and higher moments: $E[Y]$ and $E[Y^n]$
3. The variance: $Var[Y]$
4. The moment generating function and cumulant generating function: $M_Y(t)$ and $\Psi_Y(t)$.
5. Skewness: $\gamma_Y$.

Example 1
The number of claims for an individual insurance policy in a policy period is modeled by the binomial distribution with parameters $n=2$ and $p$. The individual claim, when it occurs, is modeled by the exponential distribution with parameter $\lambda$ (i.e. the mean individual claim amount is $\frac{1}{\lambda}$).

The distribution function $F_Y$ is the weighted average of a point mass at $y=0$, the exponential distribution and the Erlang-2 distribution function. For $x \ge 0$, we have:

$\displaystyle F_Y(x)=(1-p)^2+2p(1-p)(1-e^{-\lambda x})+p^2(1-\lambda x e^{-\lambda x}-e^{-\lambda x})$

The mean and variance are as follows:

$\displaystyle E[Y]=E[N] \thinspace E[X]=\frac{2p}{\lambda}$

$\displaystyle Var[Y]=E[N] \thinspace Var[X]+Var[N] \thinspace E[X]^2$

$\displaystyle =\frac{2p}{\lambda^2}+\frac{2p(1-p)}{\lambda^2}=\frac{4p-2p^2}{\lambda^2}$

The following calculates the higher moments:

$\displaystyle E[Y^n]=(1-p)^2 \cdot 0 + 2p(1-p) \frac{n!}{\lambda^n}+p^2 \frac{(n+1)!}{\lambda^n}$

$\displaystyle = \frac{2p(1-p)n!+p^2(n+1)!}{\lambda^n}$

The moment generating function $M_Y(t)=M_N[ln \thinspace M_X(t)]$. So we have:

$\displaystyle M_Y(t)=\biggl(1-p+p \frac{\lambda}{\lambda -t}\biggr)^2$

$\displaystyle =(1-p)^2+2p(1-p) \frac{\lambda}{\lambda -t}+p^2 \biggl(\frac{\lambda}{\lambda -t}\biggr)^2$

Note that $\displaystyle M_N(t)=(1-p+p e^{t})^2$ and $\displaystyle M_X(t)=\frac{\lambda}{\lambda -t}$.

For the cumulant generating function, we have:

$\displaystyle \Psi_Y(t)=ln M_Y(t)=2 ln\biggl(1-p+p \frac{\lambda}{\lambda -t}\biggr)$

For the measure of skewness, we rely on the cumulant generating function. Taking the third derivative of $\Psi_Y(t)$ and evaluating it at $t=0$, we have:

$\displaystyle \Psi_Y^{(3)}(0)=\frac{12p-12p^2+4p^3}{\lambda^3}$

$\displaystyle \gamma_Y=\frac{\Psi_Y^{(3)}(0)}{Var(Y)^{\frac{3}{2}}}=\frac{12p-12p^2+4p^3}{(4p-2p^2)^{\frac{3}{2}}}$
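The closed forms for $\Psi_Y^{(3)}(0)$ and $\gamma_Y$ can be double-checked from the raw moments $E[Y^n]$ computed earlier; the values of $p$ and $\lambda$ below are arbitrary.

```python
from math import factorial

p, lam = 0.3, 2.0                           # illustrative values

def EYn(n):
    # E[Y^n] from the mixture representation above
    return (2*p*(1-p)*factorial(n) + p**2 * factorial(n+1)) / lam**n

mu  = 2*p/lam                               # E[Y]
var = EYn(2) - mu**2
m3  = EYn(3) - 3*mu*EYn(2) + 2*mu**3        # third central moment
assert abs(var - (4*p - 2*p**2)/lam**2) < 1e-12
assert abs(m3 - (12*p - 12*p**2 + 4*p**3)/lam**3) < 1e-12
assert abs(m3/var**1.5 - (12*p - 12*p**2 + 4*p**3)/(4*p - 2*p**2)**1.5) < 1e-9
```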

Example 2
In this example, the number of claims $N$ follows a geometric distribution. The individual claim amount $X$ follows an exponential distribution with parameter $\lambda$.

One of the most interesting facts about this example is the moment generating function. Note that $\displaystyle M_N(t)=\frac{p}{1-(1-p)e^t}$. The following shows the derivation of $M_Y(t)$:

$\displaystyle M_Y(t)=M_N[ln \thinspace M_X(t)]=\frac{p}{1-(1-p) e^{ln M_X(t)}}$

$\displaystyle =\frac{p}{1-(1-p) \frac{\lambda}{\lambda -t}}=\cdots=p+(1-p) \frac{\lambda p}{\lambda p-t}$

The moment generating function is the weighted average of a point mass at $y=0$ and the mgf of an exponential distribution with parameter $\lambda p$. Thus this example of compound geometric distribution is equivalent to a mixture of a point mass and an exponential distribution. We make use of this fact and derive the following basic properties.

Distribution Function
$\displaystyle F_Y(y)=p+(1-p) (1-e^{-\lambda p y})=1-(1-p) e^{-\lambda p y}$ for $y \ge 0$

Density Function
$\displaystyle f_Y(y)=\left\{\begin{matrix}p&\thinspace y=0\\{(1-p) \lambda p e^{-\lambda p y}}&\thinspace 0 < y\end{matrix}\right.$

Mean and Higher Moments
$\displaystyle E[Y]=(1-p) \frac{1}{\lambda p}=\frac{1-p}{p} \frac{1}{\lambda}=E[N] E[X]$

$\displaystyle E[Y^n]=p \cdot 0 + (1-p) \frac{n!}{(\lambda p)^n}=(1-p) \frac{n!}{(\lambda p)^n}$

Variance
$\displaystyle Var[Y]=\frac{2(1-p)}{\lambda^2 p^2}-\frac{(1-p)^2}{\lambda^2 p^2}=\frac{1-p^2}{\lambda^2 p^2}$

Cumulant Generating Function
$\displaystyle \Psi_Y(t)=ln \thinspace M_Y(t)=ln\biggl(p+(1-p) \frac{\lambda p}{\lambda p-t}\biggr)$

Skewness
$\displaystyle E\biggl[\biggl(Y-\mu_Y\biggr)^3\biggr]=\Psi_Y^{(3)}(0)=\frac{2-2p^3}{\lambda^3 p^3}$

$\displaystyle \gamma_Y=\frac{\Psi_Y^{(3)}(0)}{(Var[Y])^{\frac{3}{2}}}=\frac{2-2p^3}{(1-p^2)^{\frac{3}{2}}}$
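The equivalence between the compound geometric mgf and the point-mass/exponential mixture, together with the variance and skewness above, can be checked numerically; $p$ and $\lambda$ are arbitrary.

```python
p, lam = 0.4, 2.0                         # illustrative values; need t < lam*p below

def M_Y(t):
    # M_N evaluated at ln M_X(t), i.e. M_N(s) = p / (1 - (1-p) e^s)
    MX = lam/(lam - t)
    return p/(1 - (1-p)*MX)

for t in (-1.0, 0.1, 0.5):
    mixture = p + (1-p)*lam*p/(lam*p - t)  # point mass + Exp(lam*p) mgf
    assert abs(M_Y(t) - mixture) < 1e-12

# variance and skewness from the mixture raw moments E[Y^n] = (1-p) n!/(lam p)^n
EY  = (1-p)/(lam*p)
EY2 = (1-p)*2/(lam*p)**2
EY3 = (1-p)*6/(lam*p)**3
var = EY2 - EY**2
m3  = EY3 - 3*EY*EY2 + 2*EY**3
assert abs(var - (1 - p**2)/(lam*p)**2) < 1e-12
assert abs(m3 - (2 - 2*p**3)/(lam*p)**3) < 1e-12
```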

# An introduction to compound distributions

Compound distributions have many natural applications. We motivate the notion of compound distributions with an insurance application. In an individual insurance setting, we wish to model the aggregate claims during a fixed policy period for an insurance policy. In this setting, more than one claim is possible. Auto insurance and property and casualty insurance are examples. In a group insurance setting, we wish to model the aggregate claims during a fixed policy period for a group of insureds that are independent. In other words, we discuss distributions that can either model the total claims for an individual insured or a group of independent risks over a fixed period such that the claim frequency is uncertain (no claim, one claim or multiple claims). Note that in a previous post (More insurance examples of mixed distributions), we discussed a specific type of compound distribution with the simplifying assumption of having at most one claim. We now discuss models for aggregate claims where the claim frequency includes the possibility of having multiple claims. We first define the notion of compound distributions. We then discuss some general properties. The calculations discussed here are illustrated in the companion post Some examples of compound distributions.

The random variable $Y$ is said to have a compound distribution if $Y$ is of the following form

$\displaystyle Y=X_1+X_2+\cdots + X_N$

where (1) the number of terms $N$ is uncertain, (2) the random variables $X_i$ are independent and identically distributed (with common distribution $X$) and (3) each $X_i$ is independent of $N$.

The sum $Y$ as defined above is sometimes called a random sum. If $N=0$ is realized, then we have $Y=0$. Even though this is implicit in the definition, we want to call this out for clarity.

In our insurance contexts, the variable $N$ represents the number of claims generated by an individual policy or a group of independent insureds over a policy period. The variable $X_i$ represents the $i^{th}$ claim. Then $Y$ represents the aggregate claims over the fixed policy period.

We discuss the following properties of compound distributions:

1. Distribution function.
2. Mean and higher moments.
3. Variance.
4. Moment generating function and cumulant generating function.
5. Skewness.

The random sum $Y$ is a mixture. Thus many properties such as distribution function, expected value and moment generating function of $Y$ can be expressed as a weighted average of the corresponding items for the basic distributions.

1. Compound Distribution – Distribution Function
By the law of total probability, the distribution function of $Y$ is given by the following:

$\displaystyle F_Y(y)=\sum \limits_{n=0}^{\infty} G_n(y) \thinspace P[N=n]$

where for $n \ge 1$, $G_n(y)$ is the distribution function of the independent sum $X_1+ \cdots + X_n$ and $G_0(y)$ is the distribution function of the point mass at $y=0$.

We can also express $F_Y$ in terms of convolutions:

$\displaystyle F_Y(y)=\sum \limits_{n=0}^{\infty} F^{*n}(y) \thinspace P[N=n]$

where $F$ is the common distribution function for $X_i$ and $F^{*n}$ is the n-fold convolution of $F$.

If the common claim distribution $X$ is discrete, then the aggregate claims $Y$ is discrete. On the other hand, if $X$ is continuous and if $P[N=0]>0$, then the aggregate claims $Y$ will have a mixed distribution, as is often the case in insurance applications.

2. Compound Distribution – Mean and Higher Moments
The mean aggregate claims $E[Y]$ is:

$\displaystyle E[Y]=E[N] \thinspace E[X]$

The expected value of the aggregate claims has a natural interpretation. It is the product of the expected number of claims and the expected individual claim amount. This makes intuitive sense. The following is the derivation:

$\displaystyle E[Y]=E_N[E(Y \lvert N)]= E_N[E(X_1+ \cdots +X_N \lvert N)]$

$\displaystyle =E_N[N \thinspace E(X)]=E[N] \thinspace E[X]$

The higher moments of the aggregate claims $Y$ do not have as intuitively clear a formula as the first moment. However, we can obtain them from first principles.

$\displaystyle E[Y^n]=E_N[E(Y^n \lvert N)]= E_N[E(\lbrace{X_1+ \cdots +X_N}\rbrace^n \lvert N)]$

$\displaystyle = E[Z_1^n] \thinspace P[N=1]+E[Z_2^n] \thinspace P[N=2]+ \cdots$

where $\displaystyle Z_n=X_1+ \cdots +X_n$.

3. Compound Distribution – Variance
The variance of the aggregate claims $Var[Y]$ is:

$\displaystyle Var[Y]=E[N] \thinspace Var[X]+Var[N] \thinspace E[X]^2$

The variance of the aggregate claims also has a natural interpretation. It is the sum of two components such that the first component stems from the variability of the individual claim amount and the second component stems from the variability of the number of claims. The variance of the aggregate claims can be derived by using the total variance formula:

$\displaystyle Var[Y]=E_N[Var(Y \lvert N)]+Var_N[E(Y \lvert N)]$

$\displaystyle =E_N[Var(X_1+ \cdots +X_N \lvert N)]+Var[E(X_1+ \cdots +X_N \lvert N)]$

$\displaystyle =E_N[N \thinspace Var(X)]+Var[N \thinspace E(X)]$

$\displaystyle =E[N] \thinspace Var[X]+Var[N] \thinspace E[X]^2$
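The total variance formula can be verified exactly for a small discrete case; here $N \sim$ Binomial$(2, 0.5)$ and a two-valued claim amount are chosen purely for illustration.

```python
pN = {0: 0.25, 1: 0.50, 2: 0.25}   # N ~ Binomial(2, 0.5)
pX = {1: 0.6, 2: 0.4}              # individual claim amount

EN = sum(n * p for n, p in pN.items())
VN = sum(n*n * p for n, p in pN.items()) - EN**2
EX = sum(x * p for x, p in pX.items())
VX = sum(x*x * p for x, p in pX.items()) - EX**2

# exact pmf of Y = X_1 + ... + X_N by conditioning on N
pY, conv = {}, {0: 1.0}
for n in range(max(pN) + 1):
    for y, q in conv.items():
        pY[y] = pY.get(y, 0.0) + pN[n] * q
    nxt = {}
    for y, q in conv.items():
        for x, p in pX.items():
            nxt[y + x] = nxt.get(y + x, 0.0) + q * p
    conv = nxt

EY = sum(y * p for y, p in pY.items())
VY = sum(y*y * p for y, p in pY.items()) - EY**2
assert abs(EY - EN * EX) < 1e-12                    # E[Y] = E[N] E[X]
assert abs(VY - (EN * VX + VN * EX**2)) < 1e-12     # total variance formula
```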

4. Compound Distribution – Moment Generating Function and Cumulant Generating Function

The moment generating function $M_Y(t)$ is: $\displaystyle M_Y(t)=M_N[ln \thinspace M_X(t)]$ where the function $ln$ is the natural log function. The following is the derivation.

$\displaystyle M_Y(t)=E[e^{tY}]=E_N[E(e^{t(X_1+ \cdots +X_N)} \lvert N)]$

$\displaystyle =E_N[E(e^{tX_1} \cdots e^{tX_N} \lvert N)]$

$\displaystyle =E_N[E(e^{tX_1} \lvert N) \cdots E(e^{tX_N} \lvert N)]=E_N[M_X(t)^N]$

$\displaystyle =E_N[e^{N \thinspace ln M_X(t)}]=M_N[ln \thinspace M_X(t)]$
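The identity $M_Y(t)=M_N[ln \thinspace M_X(t)]$ can be checked numerically on a small discrete example; the distributions below are invented for illustration.

```python
from math import exp, log

pN = {0: 0.25, 1: 0.50, 2: 0.25}   # claim count
pX = {1: 0.6, 2: 0.4}              # claim amount

# exact pmf of Y = X_1 + ... + X_N by conditioning on N
pY, conv = {}, {0: 1.0}
for n in range(max(pN) + 1):
    for y, q in conv.items():
        pY[y] = pY.get(y, 0.0) + pN[n] * q
    nxt = {}
    for y, q in conv.items():
        for x, p in pX.items():
            nxt[y + x] = nxt.get(y + x, 0.0) + q * p
    conv = nxt

def mgf(pmf, t):
    return sum(p * exp(t * v) for v, p in pmf.items())

for t in (-0.5, 0.2, 0.7):
    lhs = mgf(pY, t)                       # M_Y(t) computed directly
    rhs = mgf(pN, log(mgf(pX, t)))         # M_N evaluated at ln M_X(t)
    assert abs(lhs - rhs) < 1e-12
```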

Cumulant Generating Function
For any random variable $Z$, the cumulant generating function of $Z$ is defined as: $\Psi_Z(t)=ln M_Z(t)$. It can be shown that the derivatives of the cumulant generating function at $t=0$ give the mean, the variance and the third central moment. We will use this fact to derive the skewness of the aggregate claims $Y$.

$\displaystyle \Psi_Z^{(k)}(0)=\left\{\begin{matrix}E[Z]&\thinspace k=1\\{Var[Z]=E[(Z-\mu_Z)^2]}&k=2\\{E[(Z-\mu_Z)^3]}&k=3\end{matrix}\right.$

Based on the definition of cumulant generating function, for the aggregate claims $Y$, $M_Y(t)=M_N[\Psi_X(t)]$. Thus we have:

$\displaystyle \Psi_Y(t)=ln M_Y(t)=ln \thinspace M_N[\Psi_X(t)]=\Psi_N[\Psi_X(t)]$

5. Compound Distribution – Skewness
The skewness for any random variable $Z$ is defined as:

$\displaystyle \gamma_Z=E\biggl[\biggl(\frac{Z-\mu_Z}{\sigma_Z}\biggr)^3\biggr]=\sigma_Z^{-3} \thinspace E\biggl[\biggl(Z-\mu_Z\biggr)^3\biggr]$.

Since $\Psi_Z^{(3)}(0)=E[(Z-\mu_Z)^3]$, we have $\gamma_Z=\sigma_Z^{-3} \thinspace \Psi_Z^{(3)}(0)$ and $\Psi_Z^{(3)}(0)= \sigma_Z^3 \thinspace \gamma_Z$.

From Section 4, $\Psi_Y(t)=\Psi_N[\Psi_X(t)]$. Taking the third derivative of $\Psi_Y(t)$ and evaluating at $t=0$, we have:

$\displaystyle \Psi_Y^{(3)}(0)=\gamma_N \thinspace \sigma_N^3 \thinspace \mu_X^3+3 \thinspace \sigma_N^2 \thinspace \mu_X \thinspace \sigma_X^2+\mu_N \thinspace \gamma_X \thinspace \sigma_X^3$

Thus, the following is the skewness of the aggregate claims $Y$:

$\displaystyle \gamma_Y=\frac{\gamma_N \thinspace \sigma_N^3 \thinspace \mu_X^3+3 \thinspace \sigma_N^2 \thinspace \mu_X \thinspace \sigma_X^2+\mu_N \thinspace \gamma_X \thinspace \sigma_X^3}{(\mu_N \thinspace \sigma_X^2+\sigma_N^2 \thinspace \mu_X^2)^{\frac{3}{2}}}$
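The skewness formula can be verified exactly on a small discrete example. Writing $\mu_3(\cdot)$ for the third central moment, note that $\gamma \sigma^3 = \mu_3$, so the numerator above is $\mu_3(N)\mu_X^3+3\sigma_N^2\mu_X\sigma_X^2+\mu_N\mu_3(X)$; the distributions below are our own choices.

```python
pN = {0: 0.25, 1: 0.50, 2: 0.25}   # illustrative claim-count distribution
pX = {1: 0.6, 2: 0.4}              # illustrative claim-amount distribution

def central_moments(pmf):
    mu  = sum(v * p for v, p in pmf.items())
    var = sum((v - mu)**2 * p for v, p in pmf.items())
    m3  = sum((v - mu)**3 * p for v, p in pmf.items())
    return mu, var, m3

# exact pmf of Y = X_1 + ... + X_N
pY, conv = {}, {0: 1.0}
for n in range(max(pN) + 1):
    for y, q in conv.items():
        pY[y] = pY.get(y, 0.0) + pN[n] * q
    nxt = {}
    for y, q in conv.items():
        for x, p in pX.items():
            nxt[y + x] = nxt.get(y + x, 0.0) + q * p
    conv = nxt

muN, varN, m3N = central_moments(pN)
muX, varX, m3X = central_moments(pX)
muY, varY, m3Y = central_moments(pY)

numerator = m3N * muX**3 + 3 * varN * muX * varX + muN * m3X
assert abs(m3Y - numerator) < 1e-12
assert abs(m3Y/varY**1.5 - numerator/(muN*varX + varN*muX**2)**1.5) < 1e-9
```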

Examples
Refer to Some examples of compound distributions for illustrations of the calculations discussed in this post.

# More insurance examples of mixed distributions

Four posts have already been devoted to describing three models for “per loss” insurance payout. These are mixed distributions modeling the amount the insurer pays out for each random loss. They can also be viewed as mixtures. We now turn our attention to the mixed distributions modeling the “per period” payout for an insurance policy. That is, the mixed distributions we describe here are to model the total amount of losses paid out for each insurance policy in a given policy period. This involves the uncertain random losses as well as uncertain claim frequency. In other words, there is a possibility of having no losses. When there are losses in a policy period, the number of losses can be uncertain (there can be only one loss or multiple losses). The links to the previous posts on mixed distributions are found at the end of this post.

The following is the general setting of the insurance problem we discuss in this post.

1. The random variable $X$ is the size of the random loss that is covered in an insurance contract. We assume that $X$ is a continuous random variable. Naturally, the support of $X$ is the set of nonnegative numbers (or some appropriate subset).
2. Let $Z$ be the “per loss” payout paid to the insured by the insurer. The variable $Z$ could reflect coverage modifications such as a deductible and/or policy cap or other policy provisions that are applicable in the insurance contract.
3. Let $N$ be the number of claims in a given policy period. In this post, we assume that $N$ has only two possibilities: $N=0$ or $N=1$. In other words, each policy has at most one claim in a period. Let $p=P[N=1]$.
4. Let $Y$ be the total amount paid to the insured by the insurer during a fixed policy period.

The total claims variable $Y$ is the mixture of $Y \lvert N=0$ and $Y \lvert N=1$. The conditional variable $Y \lvert N=0$ is a point mass representing “no loss”. On the other hand, we assume that $[Y \lvert N=1]=Z$. Thus $Y$ is a mixture of a point mass at the origin and the “per loss” payout variable $Z$.

We first have a general discussion of the stated insurance setting. Then we discuss several different cases based on four coverage modifications that can be applied in the insurance contract. In each case, we illustrate with the exponential distribution. The four cases are:

• Case 1. $Z=X$. There is no coverage modification. The insurer pays the entire loss amount.
• Case 2. The insurance contract has a cap and the cap amount is $m$.
• Case 3. The insurance contract is an excess-of-loss policy. The deductible amount is $d$.
• Case 4. The insurance contract has a deductible $d$ and a policy cap $m$ where $d < m$.

General Discussion
The total payout $Y$ is the mixture of a point mass at $y=0$ and the “per loss” payout $Z$. The following is the distribution $F_Y(y)$:

$\displaystyle F_Y(y)=(1-p) \thinspace F_U(y)+p \thinspace F_Z(y)$ where $\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

Since the distribution of $Y$ is a mixture, we have a wealth of information available for us. For example, the following lists the mean, higher moments, variance, the moment generating function and the skewness.

• $\displaystyle E[Y]=p \thinspace E[Z]$
• $\displaystyle E[Y^n]=p \thinspace E[Z^n]$ for all integers $n>1$
• $\displaystyle Var[Y]=pE[Z^2]-p^2 E[Z]^2$
• $\displaystyle M_Y(t)=(1-p)+p \thinspace M_Z(t)$
• $\displaystyle \gamma_Y=\frac{p \thinspace E[Z^3]-3p^2 \thinspace E[Z] E[Z^2]+2p^3 \thinspace E[Z]^3}{\bigl(p \thinspace E[Z^2]-p^2 E[Z]^2\bigr)^{\frac{3}{2}}}$

The Derivations:
$\displaystyle E[Y]=(1-p) \thinspace 0+p \thinspace E[Z]=p \thinspace E[Z]$

$\displaystyle E[Y^n]=(1-p) \thinspace 0^n+p \thinspace E[Z^n]=p \thinspace E[Z^n]$ for all integers $n>1$

$\displaystyle Var[Y]=E[Y^2]-E[Y]^2=pE[Z^2]-p^2 E[Z]^2$

$\displaystyle M_Y(t)=(1-p) \thinspace e^0+p \thinspace M_Z(t)=(1-p)+p \thinspace M_Z(t)$

Unlike the raw moments and the mgf, the skewness is not a weighted average of the component skewnesses; it is computed from the central moments. Since $\mu_Y=p \thinspace E[Z]$ and $E[Y^n]=p \thinspace E[Z^n]$,

$\displaystyle E[(Y-\mu_Y)^3]=E[Y^3]-3 \mu_Y E[Y^2]+2 \mu_Y^3=p \thinspace E[Z^3]-3p^2 \thinspace E[Z] E[Z^2]+2p^3 \thinspace E[Z]^3$

Dividing by $Var[Y]^{\frac{3}{2}}$ gives $\gamma_Y$.

The following is another way to derive $Var[Y]$ using the total variance formula:

$\displaystyle Var[Y]=E_N[Var(Y \lvert N)]+Var_N[E(Y \lvert N)]$

$\displaystyle =(1-p)0+pVar[Z] + E_N[E(Y \lvert N)^2]-E_N[E(Y \lvert N)]^2$

$\displaystyle =pVar[Z] + (1-p)0^2+pE[Z]^2-p^2 E[Z]^2$

$\displaystyle =pE[Z^2]-p E[Z]^2 +pE[Z]^2-p^2 E[Z]^2$

$\displaystyle =pE[Z^2]-p^2 E[Z]^2$

The above derivations are based on the idea of mixtures. The two conditional variables are $Y \lvert N=0$ and $\lbrace{Y \lvert N=1}\rbrace=Z$. The mixing weights are $P[N=0]$ and $P[N=1]$. For more basic information on distributions that are mixtures, see this post (Basic properties of mixtures).
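The mixture formulas can be checked numerically with a made-up discrete payout $Z$; the point is the mixing weights $1-p$ and $p$, not the particular $Z$.

```python
p = 0.3                               # probability of a claim
pZ = {1: 0.5, 3: 0.5}                 # illustrative "per loss" payout pmf

# pmf of Y: point mass 1-p at 0 mixed with Z
pY = {0: 1 - p}
for z, q in pZ.items():
    pY[z] = pY.get(z, 0.0) + p * q

EZ  = sum(z * q for z, q in pZ.items())
EZ2 = sum(z*z * q for z, q in pZ.items())
EZ3 = sum(z**3 * q for z, q in pZ.items())

EY  = sum(y * q for y, q in pY.items())
VY  = sum(y*y * q for y, q in pY.items()) - EY**2
m3Y = sum((y - EY)**3 * q for y, q in pY.items())

assert abs(EY - p * EZ) < 1e-12                      # E[Y] = p E[Z]
assert abs(VY - (p*EZ2 - p**2 * EZ**2)) < 1e-12      # mixture variance
# third central moment from the raw moments E[Y^n] = p E[Z^n]
assert abs(m3Y - (p*EZ3 - 3*p**2*EZ*EZ2 + 2*p**3*EZ**3)) < 1e-12
```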

We now discuss the four specific cases based on the variations on the coverage modifications that can be placed on the “per loss” variable $Z$.

Case 1
This is the case that the insurance policy has no coverage modification. The insurer pays the entire random loss. Thus $Z=X$. The following is the payout rule of $Y$:

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{Z=X}&\thinspace \text{a loss occurs}\end{matrix}\right.$

This is a mixed distribution consisting of a point mass at the origin (no loss) and the random loss $X$. In this case, the “per loss” variable $Z=X$. Thus $Y$ is a mixture of the following two distributions.

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_Z(x)=\left\{\begin{matrix}0&\thinspace x<0\\{F_X(x)}&\thinspace 0 \le x\end{matrix}\right.$

Case 1 – Distribution Function
The following shows $F_Y$ as a mixture, the explicit rule of $F_Y$ and the density of $Y$.

$\displaystyle F_Y(x)=(1-p) \thinspace F_U(x) + p \thinspace F_Z(x)$.

$\displaystyle F_Y(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1-p+p \thinspace F_X(x)}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle f_Y(x)=\left\{\begin{matrix}1-p&\thinspace x=0\\{p \thinspace f_X(x)}&\thinspace 0 < x\end{matrix}\right.$

Case 1 – Basic Properties
Using basic properties of mixtures stated in the general case, we obtain the following:

$\displaystyle E[Y]=p \thinspace E[X]$

$\displaystyle E[Y^n]=p \thinspace E[X^n]$ for all integers $n>1$

$\displaystyle Var[Y]=p \thinspace E[X^2] - p^2 E[X]^2$

$\displaystyle M_Y(t)=1-p + p \thinspace M_X(t)$

$\displaystyle \gamma_Y=\frac{p \thinspace E[X^3]-3p^2 \thinspace E[X] E[X^2]+2p^3 \thinspace E[X]^3}{\bigl(p \thinspace E[X^2]-p^2 E[X]^2\bigr)^{\frac{3}{2}}}$

Case 1 – Exponential Example
If the unmodified random loss has an exponential distribution, we have the following results:

$\displaystyle E[Y]=\frac{p}{\lambda}$

$\displaystyle E[Y^n]=\frac{p \thinspace n!}{\lambda^n}$ for all integers $n>1$

$\displaystyle Var[Y]=\frac{2p-p^2}{\lambda^2}$

$\displaystyle M_Y(t)=1-p+\frac{p \thinspace \lambda}{\lambda - t}$

$\displaystyle \gamma_Y=\frac{6p-6p^2+2p^3}{(2p-p^2)^{\frac{3}{2}}}$
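A numeric check of the exponential Case 1 moments ($p$ and $\lambda$ are illustrative values): the variance and third central moment follow from the raw moments $E[Y^n]=p \thinspace n!/\lambda^n$.

```python
from math import factorial

p, lam = 0.5, 1.0                     # illustrative values

def EYn(n):
    # raw moments of the mixture: E[Y^n] = p * n! / lam^n
    return p * factorial(n) / lam**n

mu  = EYn(1)
var = EYn(2) - mu**2
m3  = EYn(3) - 3*mu*EYn(2) + 2*mu**3  # third central moment
assert abs(var - (2*p - p**2)/lam**2) < 1e-12
assert abs(m3 - (6*p - 6*p**2 + 2*p**3)/lam**3) < 1e-12
skew = m3 / var**1.5                  # skewness of the mixed distribution
```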

Case 2
This is the case that the insurance policy has a policy cap. The “per loss” payout amount is capped at the amount $m$. The following is the payout rule of $Y$:

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{Z}&\thinspace \text{a loss occurs}\end{matrix}\right.$

$\displaystyle Z=\left\{\begin{matrix}X&\thinspace X<m\\{m}&\thinspace X \ge m\end{matrix}\right.$

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{X}&\thinspace \text{a loss occurs and } X<m\\{m}&\thinspace \text{a loss occurs and } X \ge m\end{matrix}\right.$

Case 2 – Per Loss Variable $Z$
The following lists out the information we need for $Z$. For more information about the “per loss” payout for an insurance contract with a policy cap, see the post An insurance example of a mixed distribution – I.

$\displaystyle F_Z(x)=\left\{\begin{matrix}0&\thinspace x<0\\{F_X(x)}&\thinspace 0 \le x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

$\displaystyle f_Z(x)=\left\{\begin{matrix}f_X(x)&\thinspace 0<x<m\\{1-F_X(m)}&\thinspace x=m\end{matrix}\right.$

$\displaystyle E[Z]=\int_0^{m} x \thinspace f_X(x) \thinspace dx + m \thinspace [1-F_X(m)]$

$\displaystyle E[Z^n]=\int_0^{m} x^n \thinspace f_X(x) \thinspace dx + m^n \thinspace [1-F_X(m)]$ for all integers $n > 1$

$\displaystyle M_Z(t)=\int_0^m e^{tx}f_X(x)dx+e^{tm}[1-F_X(m)]$

$\displaystyle \gamma_Z=\int_0^m \biggl(\frac{z-\mu_Z}{\sigma_Z}\biggr)^3f_X(z)dz+\biggl(\frac{m-\mu_Z}{\sigma_Z}\biggr)^3 [1-F_X(m)]$

Case 2 – Distribution Function
Since $Z$ is a mixture, the distribution of $Y$ is a mixture of a point mass at the origin (no loss) and the mixture $Z$. As in the general case discussed above, the distribution function $F_Y$ is a weighted average of $F_U$ and $F_Z$ where $F_U$ is the distribution function of the point mass at $y=0$. The following shows the distribution function and the density function of $Y$.

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_Y(y)=(1-p) \thinspace F_U(y) + p \thinspace F_Z(y)$.

$\displaystyle F_Y(y)=\left\{\begin{matrix}0&\thinspace y<0\\{1-p+pF_X(y)}&\thinspace 0 \le y<m\\{1}&\thinspace m \le y\end{matrix}\right.$

$\displaystyle f_Y(y)=\left\{\begin{matrix}1-p&\thinspace y=0\\{pf_X(y)}&\thinspace 0<y<m\\{p[1-F_X(m)]}&\thinspace y=m\end{matrix}\right.$

Case 2 – Basic Properties
To obtain the basic properties such as $E[Y]$, $E[Y^2]$, $M_Y(t)$ and $\gamma_Y$, just take the weighted average of the point mass and the “per loss” $Z$ of this case. In other words, they are obtained by weighting the point mass (of no loss) with the “per loss” variable $Z$.

Case 2 – Exponential Example
If the unmodified loss $X$ has an exponential distribution, we have the following results:

$\displaystyle E[Y]=\frac{p}{\lambda}(1-e^{-\lambda m})$

$\displaystyle E[Y^2]=p\biggl(\frac{2}{\lambda^2}-\frac{2m}{\lambda}e^{-\lambda m}-\frac{2}{\lambda^2}e^{-\lambda m}\biggr)$

$\displaystyle Var[Y]=p \thinspace E[Z^2] - p^2 E[Z]^2$

$\displaystyle M_Y(t)=1-p+pM_Z(t)$ where

$\displaystyle M_Z(t)=\int_0^m e^{tx} \lambda e^{-\lambda x}dx+e^{tm} e^{-\lambda m}$

$\displaystyle =\frac{\lambda}{\lambda -t}-\frac{\lambda}{\lambda -t} e^{-(\lambda-t)m}+e^{-(\lambda-t)m}$
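As a numerical cross-check of the Case 2 mean, the sketch below (plain Python, not from the original post; the values of $\lambda$, $p$, $m$ are illustrative) integrates the defining formula for $E[Z]$ and compares $p \thinspace E[Z]$ against the closed form $E[Y]=\frac{p}{\lambda}(1-e^{-\lambda m})$:

```python
import math

# Deterministic check of E[Y] = (p/lam)*(1 - exp(-lam*m)) for Case 2:
# E[Z] = integral_0^m x*lam*exp(-lam*x) dx + m*exp(-lam*m), then E[Y] = p*E[Z].
# lam, p, m are illustrative values, not from the text.
lam, p, m = 2.0, 0.3, 1.5

def trapezoid(f, a, b, steps=100_000):
    h = (b - a) / steps
    return ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, steps))) * h

ez = trapezoid(lambda x: x * lam * math.exp(-lam * x), 0.0, m) \
     + m * math.exp(-lam * m)
ey = p * ez

closed_form = (p / lam) * (1 - math.exp(-lam * m))
print(abs(ey - closed_form) < 1e-8)
```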

Case 3
This is the case where the insurance policy is an excess-of-loss policy. The insurer agrees to pay the insured the amount of the random loss $X$ in excess of a fixed amount $d$. The following is the payout rule of $Y$:

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{Z}&\thinspace \text{a loss occurs}\end{matrix}\right.$

$\displaystyle Z=\left\{\begin{matrix}0&\thinspace X<d\\{X-d}&\thinspace X \ge d\end{matrix}\right.$

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{0}&\thinspace \text{a loss occurs and } X<d\\{X-d}&\thinspace \text{a loss occurs and } X \ge d\end{matrix}\right.$

Case 3 – Per Loss Variable $Z$
The following lists out the information we need for $Z$. For more information about the “per loss” payout for an insurance contract with a deductible, see the post An insurance example of a mixed distribution – II.

$\displaystyle F_Z(y)=\left\{\begin{matrix}0&\thinspace y<0\\{F_X(y+d)}&\thinspace y \ge 0\end{matrix}\right.$

$\displaystyle f_Z(y)=\left\{\begin{matrix}F_X(d)&\thinspace y=0\\{f_X(y+d)}&\thinspace y > 0\end{matrix}\right.$

$\displaystyle E[Z]=\int_0^{\infty} y \thinspace f_X(y+d) \thinspace dy$

$\displaystyle E[Z^n]=\int_0^{\infty} y^n \thinspace f_X(y+d) \thinspace dy$ for all integers $n>1$

$\displaystyle M_Z(t)=F_X(d) e^{0} + \int_0^{\infty} e^{tz}f_X(z+d) dz$

$\displaystyle =F_X(d) + e^{-td} \int_d^{\infty} e^{tw} f_X(w) dw$

$\displaystyle \gamma_Z=F_X(d) \biggl(\frac{0-\mu_Z}{\sigma_Z}\biggr)^3+\int_0^{\infty} \biggl(\frac{z-\mu_Z}{\sigma_Z}\biggr)^3 f_X(z+d) dz$

Case 3 – Distribution Function
Since $Z$ is a mixture, the distribution of $Y$ is a mixture of a point mass at the origin (no loss) and the mixture $Z$. As in the general case discussed above, the distribution function $F_Y$ is a weighted average of $F_U$ and $F_Z$ where $F_U$ is the distribution function of the point mass at $y=0$. The following shows the distribution function and the density function of $Y$.

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_Y(y)=(1-p) \thinspace F_U(y) + p \thinspace F_Z(y)$.

$\displaystyle F_Y(y)=\left\{\begin{matrix}0&\thinspace y<0\\{1-p+pF_X(y+d)}&\thinspace 0 \le y\end{matrix}\right.$

$\displaystyle f_Y(y)=\left\{\begin{matrix}1-p+pF_X(d)&y=0\\{pf_X(y+d)}&\thinspace 0

Note that the point mass of $Y$ is made up of two point masses, one from having no loss and one from having losses less than the deductible.

Case 3 – Basic Properties
The basic properties of $Y$ as a mixture are obtained by applying the general formulas with the specific information about the “per loss” $Z$ in this case. In other words, they are obtained by weighting the point mass (of no loss) with the “per loss” variable $Z$.

Case 3 – Exponential Example
If the unmodified loss $X$ has an exponential distribution, then we have the following results:

$\displaystyle E[Y]=pE[Z]=p \thinspace \frac{e^{-\lambda d}}{\lambda}=p \thinspace e^{-\lambda d} E[X]$

$\displaystyle E[Y^2]=pE[Z^2]=p \thinspace \frac{2e^{-\lambda d}}{\lambda^2}=p \thinspace e^{-\lambda d}E[X^2]$

$\displaystyle Var[Y]=p \thinspace \frac{2e^{-\lambda d}}{\lambda^2}-p^2 \thinspace \frac{e^{-2\lambda d}}{\lambda^2}=pe^{-\lambda d}(2-pe^{-\lambda d})Var[X]$

$\displaystyle M_Y(t)=1-p+pM_Z(t)$ where

$\displaystyle M_Z(t)=1-e^{-\lambda d}+e^{-\lambda d} \frac{\lambda}{\lambda -t}$

$\displaystyle = 1-e^{-\lambda d}+e^{-\lambda d} M_X(t)$
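The memoryless property behind these exponential results can be confirmed numerically. The sketch below (plain Python, not part of the original post; $\lambda$ and $d$ are illustrative) evaluates the moment integrals for the “per loss” variable $Z$ and compares them with $e^{-\lambda d}/\lambda$ and $2e^{-\lambda d}/\lambda^2$:

```python
import math

# Numeric check that E[Z] = exp(-lam*d)/lam and E[Z^2] = 2*exp(-lam*d)/lam^2
# for the excess-of-loss payout Z, which has density f_X(y+d) for y > 0
# plus a point mass F_X(d) at 0.  lam and d are illustrative values.
lam, d = 1.0, 0.5

def trapezoid(f, a, b, steps=200_000):
    h = (b - a) / steps
    return ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, steps))) * h

upper = 50.0  # truncation point; the exponential tail beyond it is negligible
ez  = trapezoid(lambda y: y * lam * math.exp(-lam * (y + d)), 0.0, upper)
ez2 = trapezoid(lambda y: y ** 2 * lam * math.exp(-lam * (y + d)), 0.0, upper)

print(abs(ez - math.exp(-lam * d) / lam) < 1e-5)
print(abs(ez2 - 2 * math.exp(-lam * d) / lam ** 2) < 1e-5)
```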

Case 4
This is the case where the insurance policy has both a policy cap and a deductible. The “per loss” payout amount is capped at the amount $m$ and is positive only when the loss is in excess of the deductible $d$. The following is the payout rule of $Y$:

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{Z}&\thinspace \text{a loss occurs}\end{matrix}\right.$

$\displaystyle Z=\left\{\begin{matrix}0&\thinspace X<d\\{X-d}&\thinspace d \le X<d+m\\{m}&\thinspace X \ge d+m\end{matrix}\right.$

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace \text{no loss occurs}\\{0}&\thinspace \text{a loss and } X<d\\{X-d}&\thinspace \text{a loss and } d \le X<d+m\\{m}&\thinspace \text{a loss and } X \ge d+m\end{matrix}\right.$

Case 4 – Per Loss Variable $Z$
The following lists out the information we need for $Z$. For more information about the “per loss” payout for an insurance contract with a deductible and a policy cap, see the post An insurance example of a mixed distribution – III.

$\displaystyle F_Z(y)=\left\{\begin{matrix}0&\thinspace y<0\\{F_X(y+d)}&\thinspace 0 \le y < m\\{1}&m \le y\end{matrix}\right.$

$\displaystyle f_Z(y)=\left\{\begin{matrix}F_X(d)&\thinspace y=0\\{f_X(y+d)}&\thinspace 0 < y < m\\{1-F_X(d+m)}&y=m\end{matrix}\right.$

$\displaystyle E[Z]=\int_0^m y \thinspace f_X(y+d) \thinspace dy + m \thinspace [1-F_X(d+m)]$

$\displaystyle E[Z^n]=\int_0^m y^n \thinspace f_X(y+d) \thinspace dy + m^n \thinspace [1-F_X(d+m)]$ for all integers $n>1$

$\displaystyle M_Z(t)=F_X(d) e^0 + \int_0^m e^{tx} f_X(x+d) dx + e^{tm} [1-F_X(d+m)]$

$\displaystyle =F_X(d) + \int_0^m e^{tx} f_X(x+d) dx + e^{tm} [1-F_X(d+m)]$

$\displaystyle \gamma_Z=F_X(d) \biggl(\frac{0-\mu_Z}{\sigma_Z}\biggr)^3+\int_0^{m} \biggl(\frac{z-\mu_Z}{\sigma_Z}\biggr)^3 f_X(z+d) dz + [1-F_X(d+m)] \biggl(\frac{m-\mu_Z}{\sigma_Z}\biggr)^3$
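A quick sanity check on the Case 4 “per loss” variable is that its two point masses and its continuous part account for all of the probability. The sketch below (plain Python, not part of the original post; the exponential loss and the parameter values are illustrative assumptions) verifies $F_X(d)+\int_0^m f_X(y+d) \thinspace dy + [1-F_X(d+m)]=1$:

```python
import math

# Check that the Case 4 per-loss distribution of Z has total mass 1:
# F_X(d) + integral_0^m f_X(y+d) dy + [1 - F_X(d+m)] = 1.
# Exponential X with illustrative lam, d, m.
lam, d, m = 1.5, 0.4, 1.2

def trapezoid(f, a, b, steps=100_000):
    h = (b - a) / steps
    return ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, steps))) * h

mass0 = 1 - math.exp(-lam * d)                    # F_X(d): point mass at 0
cont  = trapezoid(lambda y: lam * math.exp(-lam * (y + d)), 0.0, m)
massm = math.exp(-lam * (d + m))                  # 1 - F_X(d+m): point mass at m

total = mass0 + cont + massm
print(abs(total - 1) < 1e-8)
```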

Case 4 – Distribution Function
Since $Z$ is a mixture, the distribution of $Y$ is a mixture of a point mass at the origin (no loss) and the mixture $Z$. As in the general case discussed above, the distribution function $F_Y$ is a weighted average of $F_U$ and $F_Z$ where $F_U$ is the distribution function of the point mass at $y=0$. The following shows the distribution function and the density function of $Y$.

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_Y(y)=(1-p) \thinspace F_U(y) + p \thinspace F_Z(y)$.

$\displaystyle F_Y(y)=\left\{\begin{matrix}0&\thinspace y<0\\{1-p+pF_X(y+d)}&\thinspace 0 \le y<m\\{1}&\thinspace m \le y\end{matrix}\right.$

$\displaystyle f_Y(y)=\left\{\begin{matrix}1-p+pF_X(d)&\thinspace y=0\\{pf_X(y+d)}&\thinspace 0<y<m\\{p[1-F_X(d+m)]}&\thinspace y=m\end{matrix}\right.$

Note that the point mass of $Y$ at $y=0$ is made up of two point masses, one from having no loss and one from having losses less than the deductible. There is an additional point mass at $y=m$, arising from losses of at least $d+m$.

Case 4 – Basic Properties
The basic properties of $Y$ as a mixture are obtained by applying the general formulas with the specific information about the “per loss” $Z$ in this case. In other words, they are obtained by weighting the point mass (of no loss) with the “per loss” variable $Z$.

Case 4 – Exponential Example
If the unmodified loss $X$ has an exponential distribution, then we have the following results:

$\displaystyle E[Y]=pE[Z]=p e^{-\lambda d} \frac{1}{\lambda} (1-e^{-\lambda m})=p e^{-\lambda d} (1-e^{-\lambda m}) E[X]$

Another view of $E[Y]$:
$\displaystyle E[Y]=e^{-\lambda d} E[Y_2]$ where $Y_2$ is the $Y$ in Case 2.

Also, it can be shown that:
$\displaystyle E[Y^2]=e^{-\lambda d} E[Y_2^2]$ where $Y_2$ is the $Y$ in Case 2.
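The closed form for $E[Z]$ underlying the Case 4 result can be confirmed by evaluating the defining integral numerically. A sketch in plain Python (not part of the original post; parameter values are illustrative):

```python
import math

# Check the Case 4 closed form E[Z] = exp(-lam*d)*(1 - exp(-lam*m))/lam
# against the defining integral E[Z] = integral_0^m y f_X(y+d) dy
#                                      + m [1 - F_X(d+m)].
# lam, d, m are illustrative values, not from the text.
lam, d, m = 1.0, 0.4, 2.0

def trapezoid(f, a, b, steps=100_000):
    h = (b - a) / steps
    return ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, steps))) * h

ez = trapezoid(lambda y: y * lam * math.exp(-lam * (y + d)), 0.0, m) \
     + m * math.exp(-lam * (d + m))

closed = math.exp(-lam * d) * (1 - math.exp(-lam * m)) / lam
print(abs(ez - closed) < 1e-8)
```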

Here are the links to the previous discussions of mixed distributions:
An insurance example of a mixed distribution – I
An insurance example of a mixed distribution – II
An insurance example of a mixed distribution – III
Mixed distributions as mixtures

# Mixed distributions as mixtures

A random variable $X$ is a mixture if its distribution function $F_X$ is a weighted average of a family of conditional distribution functions. The random variable $X$ has a mixed distribution if its distribution has at least one point mass (i.e. there is at least one point $a$ in the support of $X$ such that $P[X=a]>0$) and there is some interval $(b,c)$ contained in the support such that $P[X=t]=0$ for every $t \in (b,c)$. It turns out that a mixed distribution can be expressed as a mixture. Three examples of mixed distributions from insurance applications have been presented in this blog. We demonstrate that these three mixed distributions are mixtures. The links to some previous posts on mixtures can be found at the end of this post.

Example 1
Link: An insurance example of a mixed distribution – I The mixed distribution in this example is the “per loss” payout for an insurance contract that has a policy maximum.

Example 2
Link: An insurance example of a mixed distribution – II The mixed distribution in this example is the “per loss” payout for an insurance policy that has a deductible.

Example 3
Link: An insurance example of a mixed distribution – III The mixed distribution in this example is the “per loss” payout of an insurance contract where there are both a deductible and a policy maximum.

Throughout this post, let $X$ be the unmodified random loss. We assume that $X$ is a continuous random variable with support the nonnegative real numbers.

Discussion of Example 1
Let $Y_1$ be the “per loss” insurance payout for a policy where the payout is capped at $m$. The following are the payout rule, the distribution function and the density function:

$\displaystyle Y_1=\left\{\begin{matrix}X&\thinspace X<m\\{m}&\thinspace X \ge m\end{matrix}\right.$

$\displaystyle F_{Y_1}(x)=\left\{\begin{matrix}0&\thinspace x<0\\{F_X(x)}&\thinspace 0 \le x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

$\displaystyle f_{Y_1}(x)=\left\{\begin{matrix}f_X(x)&\thinspace 0 < x<m\\{1-F_X(m)}&\thinspace x=m\end{matrix}\right.$

We show that $F_{Y_1}$ can be expressed as a weighted average of two distribution functions. One of the distributions is the random loss restricted to the interval $(0,m)$. This is a limited loss; call it $U$. The second distribution is the point mass at $m$; call it $V$. The following are the distribution functions:

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{\displaystyle \frac{F_X(x)}{F_X(m)}}&\thinspace 0 \le x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

$\displaystyle F_V(x)=\left\{\begin{matrix}0&\thinspace x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

It follows that $F_{Y_1}(x)= p \thinspace F_U(x) + (1-p) \thinspace F_V(x)$ where $p=F_X(m)$. Note that the distribution of $U$ only describes the loss within $(0,m)$. Thus the distribution function $F_U$ is obtained from $F_X$ by a scaling adjustment.
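The weighted-average identity can be verified pointwise. The sketch below (plain Python; the exponential loss and the parameter values are illustrative assumptions, not part of the discussion) checks $F_{Y_1}(x)=p \thinspace F_U(x)+(1-p) \thinspace F_V(x)$ at several points:

```python
import math

# Check F_{Y_1}(x) = p*F_U(x) + (1-p)*F_V(x) with p = F_X(m),
# using an exponential loss X purely for illustration.
lam, m = 1.0, 2.0
F_X = lambda x: 1 - math.exp(-lam * x)
p = F_X(m)

def F_Y1(x):  # capped loss: F_X below m, 1 at and above m
    return F_X(x) if x < m else 1.0

def F_U(x):   # limited loss on (0, m): F_X rescaled by F_X(m)
    return F_X(x) / F_X(m) if x < m else 1.0

def F_V(x):   # point mass at m
    return 1.0 if x >= m else 0.0

ok = all(abs(F_Y1(x) - (p * F_U(x) + (1 - p) * F_V(x))) < 1e-12
         for x in [0.0, 0.5, 1.0, 1.9, 2.0, 3.0])
print(ok)
```

Below $m$ the point-mass term vanishes and $p \thinspace F_U(x)$ collapses back to $F_X(x)$, which is why the identity holds exactly.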

Discussion of Example 2
Let $Y_2$ be the “per loss” insurance payout for a policy where there is a deductible $d$. For each loss, the insurer pays the insured in excess of the deductible $d$. The following are the payout rule, the distribution function and the density function:

$\displaystyle Y_2=\left\{\begin{matrix}0&\thinspace X<d\\{X-d}&\thinspace X \ge d\end{matrix}\right.$

$\displaystyle F_{Y_2}(y)=\left\{\begin{matrix}0&\thinspace y<0\\{F_X(y+d)}&\thinspace y \ge 0\end{matrix}\right.$

$\displaystyle f_{Y_2}(y)=\left\{\begin{matrix}F_X(d)&\thinspace y=0\\{f_X(y+d)}&\thinspace y > 0\end{matrix}\right.$

We show that $F_{Y_2}$ can be expressed as a weighted average of two distribution functions. One of the distributions is the random loss greater than $d$. Call this loss $U$. The second distribution is the point mass at $0$. Call this point mass $V$. The following are the distribution functions:

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{\displaystyle \frac{F_X(x+d)-F_X(d)}{1-F_X(d)}}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_V(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

It follows that $F_{Y_2}(x)= p \thinspace F_U(x) + (1-p) \thinspace F_V(x)$ where $p=1-F_X(d)$. The random variable $V$ is a point mass at the origin reflecting the case where no payment is made by the insurer. This point mass has weight $F_X(d)$. The random variable $U$ is the distribution describing the random losses that are greater than $d$.
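This identity can also be verified pointwise. The sketch below (plain Python; the exponential loss and the parameter values are illustrative assumptions) checks $F_{Y_2}(x)=p \thinspace F_U(x)+(1-p) \thinspace F_V(x)$ with $p=1-F_X(d)$:

```python
import math

# Check F_{Y_2}(x) = p*F_U(x) + (1-p)*F_V(x) with p = 1 - F_X(d),
# using an exponential loss X purely for illustration.
lam, d = 1.0, 0.7
F_X = lambda x: 1 - math.exp(-lam * x)
p = 1 - F_X(d)

F_Y2 = lambda x: F_X(x + d)                            # for x >= 0
F_U = lambda x: (F_X(x + d) - F_X(d)) / (1 - F_X(d))   # losses above d, shifted
F_V = lambda x: 1.0                                    # point mass at 0, x >= 0

ok = all(abs(F_Y2(x) - (p * F_U(x) + (1 - p) * F_V(x))) < 1e-12
         for x in [0.0, 0.3, 1.0, 5.0])
print(ok)
```

For $x \ge 0$ the sum $p \thinspace F_U(x)+(1-p) \thinspace F_V(x)$ simplifies to $F_X(x+d)-F_X(d)+F_X(d)=F_X(x+d)$, matching $F_{Y_2}$.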

Discussion of Example 3
Let $Y_3$ be the “per loss” insurance payout for a policy where there are both a deductible $d$ and a policy cap $m$ with $d<m$. For each loss, the insurer pays the insured the amount in excess of the deductible $d$ up to the policy cap $m$. The following are the payout rule, the distribution function and the density function:

$\displaystyle Y_3=\left\{\begin{matrix}0&\thinspace X<d\\{X-d}&\thinspace d \le X<d+m\\{m}&\thinspace X \ge d+m\end{matrix}\right.$

$\displaystyle F_{Y_3}(y)=\left\{\begin{matrix}0&\thinspace y<0\\{F_X(y+d)}&\thinspace 0 \le y < m\\{1}&m \le y\end{matrix}\right.$

$\displaystyle f_{Y_3}(y)=\left\{\begin{matrix}F_X(d)&\thinspace y=0\\{f_X(y+d)}&\thinspace 0 < y < m\\{1-F_X(d+m)}&y=m\end{matrix}\right.$

The distribution of $Y_3$ can be expressed as a mixture of three distributions – two point masses (one at the origin and one at $m$) and one continuous variable describing the random losses in between $0$ and $m$. Consider the following distribution functions:

$\displaystyle F_U(x)=\left\{\begin{matrix}0&\thinspace x<0\\{1}&\thinspace 0 \le x\end{matrix}\right.$

$\displaystyle F_V(x)=\left\{\begin{matrix}0&\thinspace x<0\\{\displaystyle \frac{F_X(x+d)-F_X(d)}{F_X(d+m)-F_X(d)}}&\thinspace 0 \le x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

$\displaystyle F_W(x)=\left\{\begin{matrix}0&\thinspace x<m\\{1}&\thinspace m \le x\end{matrix}\right.$

The random variables $U$ and $W$ represent the point masses at $0$ and $m$, respectively. The variable $V$ describes the random losses in between $0$ and $m$. It follows that $F_{Y_3}$ is the weighted average of these three distribution functions.

$\displaystyle F_{Y_3}(x)=p_1 \thinspace F_U(x)+p_2 \thinspace F_V(x)+p_3 \thinspace F_W(x)$

The weights are: $p_1=F_X(d)$, $p_2=F_X(d+m)-F_X(d)$, and $p_3=1-F_X(d+m)$
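The three-way mixture identity can be checked numerically. The sketch below is plain Python (not part of the original post) with an exponential loss and parameter values chosen purely for illustration:

```python
import math

# Check F_{Y_3} = p1*F_U + p2*F_V + p3*F_W with weights
# p1 = F_X(d), p2 = F_X(d+m) - F_X(d), p3 = 1 - F_X(d+m).
# Exponential X with illustrative lam, d, m.
lam, d, m = 1.0, 0.5, 1.5
F_X = lambda x: 1 - math.exp(-lam * x)
p1, p2, p3 = F_X(d), F_X(d + m) - F_X(d), 1 - F_X(d + m)
assert abs(p1 + p2 + p3 - 1) < 1e-12   # the weights sum to 1

def F_Y3(x):
    return F_X(x + d) if x < m else 1.0

def F_U(x):   # point mass at 0
    return 1.0

def F_V(x):   # losses between d and d+m, shifted and rescaled
    return (F_X(x + d) - F_X(d)) / (F_X(d + m) - F_X(d)) if x < m else 1.0

def F_W(x):   # point mass at m
    return 1.0 if x >= m else 0.0

ok = all(abs(F_Y3(x) - (p1 * F_U(x) + p2 * F_V(x) + p3 * F_W(x))) < 1e-12
         for x in [0.0, 0.4, 1.0, 1.5, 4.0])
print(ok)
```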

Here are the links to examples of mixed distributions:
Example 1 An insurance example of a mixed distribution – I
Example 2 An insurance example of a mixed distribution – II
Example 3 An insurance example of a mixed distribution – III

Here are the links to some previous posts on mixtures:
Examples of mixtures
Basic properties of mixtures

# An insurance example of a mixed distribution – III

In the previous two posts, we discussed mixed distributions that are derived from modifying coverage on insurance contracts. Let $X$ be the dollar amount of a random loss covered by an insurance contract. Without any coverage modification, the insurer would be obligated to pay the entire amount of the loss $X$. With some type of coverage modification, we are interested in the amount $Y$ paid out by the insurer. How do we model $Y$ based on the distribution of $X$? In one previous post, we discussed the model of the insurance payout $Y$ when the insurance contract has a policy maximum (An insurance example of a mixed distribution – I). In another post, the coverage modification is having a deductible (An insurance example of a mixed distribution – II). In this post, we consider an insurance contract that has a combination of a deductible and a policy maximum. We discuss the model for the insurance payout $Y$ and illustrate the calculation with the exponential distribution.

Note that the models for $Y$ in this post and in the previous two posts describe the insurance payout per loss or per claim. In other words, we model the payment made by the insurer for each insured loss. In future posts, we will discuss models that describe the insurance payments per insurance policy during a policy period. Such models will have to take into account that there may be no loss (or claim) during a period or that there may be multiple losses or claims in a policy period.

Modifying insurance coverage (e.g. having a policy maximum and/or a deductible) is akin to censoring and/or truncating the random loss amounts. Each type of censoring creates a probability mass in the distribution of the “per loss” insurance payout. So the presence of a deductible and a policy maximum in the same contract creates two probability masses. Let $d$ be the deductible and let $m$ be the policy maximum where $d<m$. Specifically, the following is the payout rule:

$\displaystyle Y=\left\{\begin{matrix}0&\thinspace X<d\\{X-d}&\thinspace d \le X<d+m\\{m}&\thinspace X \ge d+m\end{matrix}\right.$

The two probability masses are at $y=0$ and $y=m$. Thus the distribution function $F_Y$ of $Y$ has two jumps:

$\displaystyle F_Y(y)=\left\{\begin{matrix}0&\thinspace y<0\\{F_X(y+d)}&\thinspace 0 \le y < m\\{1}&m \le y\end{matrix}\right.$

Note that the distribution function $F_Y$ is obtained by shifting the graph of $F_X$ between the points $(d,F_X(d))$ and $(d+m,F_X(d+m))$ leftward to the point $(0,F_X(d))$.

The point mass at $y=0$ has probability $P[X<d]=F_X(d)$ and the point mass at $y=m$ has probability $P[X \ge d+m]=1-F_X(d+m)$. Thus the following is the density function of the “per loss” insurance payout $Y$:

$\displaystyle f_Y(y)=\left\{\begin{matrix}F_X(d)&\thinspace y=0\\{f_X(y+d)}&\thinspace 0 < y < m\\{1-F_X(d+m)}&y=m\end{matrix}\right.$

Here’s the mean payout and the higher moments of the payout:

$\displaystyle E[Y]=\int_0^m y \thinspace f_X(y+d) \thinspace dy + m \thinspace [1-F_X(d+m)]$

$\displaystyle E[Y^n]=\int_0^m y^n \thinspace f_X(y+d) \thinspace dy + m^n \thinspace [1-F_X(d+m)]$ for all integers $n>1$

Example
Suppose the random loss $X$ follows an exponential distribution with parameter $\lambda$. Let $Y$ be the “per loss” payout for an insurance contract that has a combination of a deductible $d$ and a policy maximum $m$. Let $Z$ be the payout for an insurance contract with the same policy maximum $m$ but with no deductible. Interestingly, in this exponential example, we can express $E[Y]$ and $Var[Y]$ in terms of $E[Z]$ and $E[Z^2]$ (see An insurance example of a mixed distribution – I).

$\displaystyle E[Y]=\int_0^m y \thinspace \lambda e^{-\lambda (y+d)} \thinspace dy + m \thinspace [e^{-\lambda (d+m)}]$

$\displaystyle =e^{-\lambda d} \thinspace \biggl(\int_0^m y \thinspace \lambda \thinspace e^{-\lambda y} \thinspace dy+m \thinspace e^{-\lambda m}\biggr)=e^{-\lambda d} E[Z]$

$\displaystyle E[Y^2]=\int_0^m y^2 \thinspace \lambda \thinspace e^{-\lambda (y+d)} \thinspace dy+m^2 \thinspace e^{-\lambda (d+m)}$

$\displaystyle =e^{-\lambda d} \thinspace \biggl(\int_0^m y^2 \thinspace \lambda \thinspace e^{-\lambda y} \thinspace dy+m^2 \thinspace e^{-\lambda m}\biggr)=e^{-\lambda d} E[Z^2]$

$\displaystyle Var[Y]=e^{-\lambda d} E[Z^2] - e^{-2 \lambda d} E[Z]^2$
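The two identities $E[Y]=e^{-\lambda d} E[Z]$ and $E[Y^2]=e^{-\lambda d} E[Z^2]$ can be verified by direct numeric integration. The sketch below (plain Python, not part of the original post; parameter values are illustrative) evaluates each moment integral with the trapezoid rule:

```python
import math

# Numeric check of E[Y] = exp(-lam*d)*E[Z] and E[Y^2] = exp(-lam*d)*E[Z^2],
# where Z is the payout with cap m only and Y has deductible d and cap m.
# lam, d, m are illustrative values, not from the text.
lam, d, m = 1.0, 0.5, 1.0

def trapezoid(f, a, b, steps=100_000):
    h = (b - a) / steps
    return ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, steps))) * h

ez  = trapezoid(lambda y: y * lam * math.exp(-lam * y), 0.0, m) \
      + m * math.exp(-lam * m)
ez2 = trapezoid(lambda y: y ** 2 * lam * math.exp(-lam * y), 0.0, m) \
      + m ** 2 * math.exp(-lam * m)
ey  = trapezoid(lambda y: y * lam * math.exp(-lam * (y + d)), 0.0, m) \
      + m * math.exp(-lam * (d + m))
ey2 = trapezoid(lambda y: y ** 2 * lam * math.exp(-lam * (y + d)), 0.0, m) \
      + m ** 2 * math.exp(-lam * (d + m))

print(abs(ey - math.exp(-lam * d) * ez) < 1e-8)
print(abs(ey2 - math.exp(-lam * d) * ez2) < 1e-8)
```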

In the insurance contract with a policy maximum but no deductible, the expected insurance payout (per loss) is reduced from $\displaystyle \frac{1}{\lambda}$ to $\displaystyle \frac{1-e^{-\lambda m}}{\lambda}$. With the addition of a deductible $d$, the expected payout is reduced from $\displaystyle \frac{1}{\lambda}$ to $\displaystyle e^{-\lambda d} \frac{1-e^{-\lambda m}}{\lambda}$. The expected insurance payout is reduced by the amount $\displaystyle \frac{1-e^{-\lambda d}(1-e^{-\lambda m})}{\lambda}$. Then the fraction of the loss eliminated by the deductible and the policy cap is $\displaystyle 1-e^{-\lambda d}(1-e^{-\lambda m})$. Note that the fraction of the loss eliminated by the policy cap alone is $\displaystyle 1-(1-e^{-\lambda m})=e^{-\lambda m}$.

Intuitively, it is clear that a higher fraction of the random loss is eliminated if there is a deductible on top of the policy cap. In this example, this is also borne out by the following inequality:

$\displaystyle e^{-\lambda m} < 1-e^{-\lambda d}(1-e^{-\lambda m})$

It might also seem intuitively clear that the presence of a deductible on top of the policy maximum reduces the variance of the insurance payout, that is, that $Var[Y]$ is less than $Var[Z]$. Perhaps surprisingly, for the exponential example this holds only when the deductible is sufficiently large. We have the following derivation and the following claim:

$\displaystyle Var[Y]=e^{-\lambda d} E[Z^2] - e^{-2 \lambda d} E[Z]^2$

$\displaystyle =e^{-\lambda d} (Var[Z]+E[Z]^2) - e^{-2 \lambda d} E[Z]^2$

$\displaystyle =e^{-\lambda d} Var[Z] + (e^{-\lambda d}-e^{-2 \lambda d}) E[Z]^2$

Claim.
$Var[Y]<Var[Z]$ if and only if $e^{-\lambda d} E[Z]^2 < Var[Z]$.

From the derivation above, $Var[Y]<Var[Z]$ is equivalent to:

$\displaystyle (e^{-\lambda d}-e^{-2 \lambda d}) E[Z]^2 < (1-e^{-\lambda d}) Var[Z]$

Dividing both sides by $1-e^{-\lambda d}>0$ gives the equivalent inequality $e^{-\lambda d} E[Z]^2 < Var[Z]$, proving the claim.

Note that the condition in the claim does not hold for every $d>0$. Writing $u=e^{-\lambda m}$ and using $\lambda m=-\ln u$, a direct computation gives $\lambda^2 \thinspace (E[Z^2]-2E[Z]^2)=2u(1-u+\ln u)$, which is negative since $\ln u<u-1$ for $0<u<1$. Thus $Var[Z]<E[Z]^2$ for the capped payout $Z$. Consequently, for small $d$ we have $e^{-\lambda d} E[Z]^2>Var[Z]$ and hence $Var[Y]>Var[Z]$: a small deductible adds a point mass at the origin that spreads out the payout distribution and increases the variance. Only when the deductible satisfies $\displaystyle d>\frac{1}{\lambda} \ln \frac{E[Z]^2}{Var[Z]}$ is the variance of the payout reduced.
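The variance comparison can be illustrated numerically from the moment formulas. In the sketch below (plain Python, not part of the original post; $\lambda$, $m$ and the fairly large deductible $d$ are illustrative choices), $Var[Y]$ comes out smaller than $Var[Z]$:

```python
import math

# Compute Var[Y] (deductible d and cap m) and Var[Z] (cap m only)
# from the exponential moment formulas; lam, m, d are illustrative.
lam, m, d = 1.0, 1.0, 2.0

# Closed forms for the capped payout Z = min(X, m):
ez  = (1 - math.exp(-lam * m)) / lam
ez2 = (2 / lam ** 2) * (1 - math.exp(-lam * m)) \
      - (2 * m / lam) * math.exp(-lam * m)
var_z = ez2 - ez ** 2

# Var[Y] = exp(-lam*d)*E[Z^2] - exp(-2*lam*d)*E[Z]^2
var_y = math.exp(-lam * d) * ez2 - math.exp(-2 * lam * d) * ez ** 2

print(var_y < var_z)
```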