Examples of Bayesian prediction in insurance

We present two examples to illustrate the notion of Bayesian predictive distributions. The general insurance problem we aim to illustrate is that of using past claim experience data from an individual insured or a group of insureds to predict the future claim experience. Suppose we have X_1,X_2, \cdots, X_n with each X_i being the number of claims or an aggregate amount of claims in a prior period of observation. Given such results, what will be the number of claims during the next period, or what will be the aggregate claim amount in the next period? These two examples will motivate the notion of credibility, both Bayesian credibility theory and Buhlmann credibility theory. We present Example 1 in this post. Example 2 is presented in the next post (Examples of Bayesian prediction in insurance-continued).

Example 1
In this random experiment, there are a big bowl (called B) and two boxes (Box 1 and Box 2). Bowl B contains a large quantity of balls, 80% of which are white and 20% of which are red. In Box 1, 60% of the balls are labeled 0, 30% are labeled 1 and 10% are labeled 2. In Box 2, 15% of the balls are labeled 0, 35% are labeled 1 and 50% are labeled 2. In the experiment, a ball is selected at random from bowl B. The color of the selected ball determines which box to use (if the ball is white, use Box 1; if red, use Box 2). Then balls are drawn at random from the selected box repeatedly with replacement, and the values of the selected balls are recorded. The value of the first selected ball is X_1, the value of the second selected ball is X_2 and so on.
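
The experiment is easy to simulate, which can serve as a sanity check on the calculations below (a Python sketch; the function name is illustrative):

```python
import random

def run_experiment(n_draws, seed=None):
    """Simulate Example 1: pick a box via bowl B, then draw n_draws balls
    with replacement from that box."""
    rng = random.Random(seed)
    # Bowl B: 80% white (use Box 1), 20% red (use Box 2)
    box = 1 if rng.random() < 0.8 else 2
    # Conditional distributions of the ball labels 0, 1, 2
    weights = {1: [0.60, 0.30, 0.10], 2: [0.15, 0.35, 0.50]}[box]
    draws = rng.choices([0, 1, 2], weights=weights, k=n_draws)
    return box, draws
```

Running the simulation many times and averaging the recorded values should come out close to the unconditional mean computed below.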

Suppose that your friend performs this random experiment (you do not know whether he uses Box 1 or Box 2) and that his first ball is a 1 (X_1=1) and his second ball is a 2 (X_2=2). What is the predicted value X_3 of the third selected ball?

Though it is straightforward to apply Bayes' theorem to this problem (the solution can be seen easily using a tree diagram), we use this example to draw out the principle of Bayesian prediction. So while we may appear to be making a simple problem overly complicated, we are merely using this example to motivate the method of Bayesian estimation.

For convenience, we denote “draw a white ball from bowl B” by \theta=1 and “draw a red ball from bowl B” by \theta=2. Box 1 and Box 2 represent conditional distributions of X given \theta. Bowl B provides the distribution of the parameter \theta, a probability distribution over the space of all parameter values (called a prior distribution). The prior distribution of \theta and the conditional distributions of X given \theta are restated as follows:

\pi_{\theta}(1)=0.8
\pi_{\theta}(2)=0.2

\displaystyle f_{X \lvert \Theta}(0 \lvert \theta=1)=0.60
\displaystyle f_{X \lvert \Theta}(1 \lvert \theta=1)=0.30
\displaystyle f_{X \lvert \Theta}(2 \lvert \theta=1)=0.10

\displaystyle f_{X \lvert \Theta}(0 \lvert \theta=2)=0.15
\displaystyle f_{X \lvert \Theta}(1 \lvert \theta=2)=0.35
\displaystyle f_{X \lvert \Theta}(2 \lvert \theta=2)=0.50

The following shows the conditional means E[X \lvert \theta] and the unconditional mean E[X].

\displaystyle E[X \lvert \theta=1]=0.6(0)+0.3(1)+0.1(2)=0.50
\displaystyle E[X \lvert \theta=2]=0.15(0)+0.35(1)+0.5(2)=1.35
\displaystyle E[X]=0.8(0.50)+0.2(1.35)=0.67
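
These three means can be checked with a few lines of Python (a sketch; variable names are illustrative):

```python
# Conditional distributions of X given theta (Box 1 and Box 2) and the prior
box1 = {0: 0.60, 1: 0.30, 2: 0.10}   # f(x | theta = 1)
box2 = {0: 0.15, 1: 0.35, 2: 0.50}   # f(x | theta = 2)
prior = {1: 0.8, 2: 0.2}             # pi(theta)

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

cond_means = {1: mean(box1), 2: mean(box2)}                 # E[X | theta]
uncond_mean = sum(prior[t] * cond_means[t] for t in prior)  # E[X]
```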

If you knew which particular box your friend is using (\theta=1 or \theta=2), then the estimate of the next ball would be E[X \lvert \theta]. But the value of \theta is unknown to you. An alternative predicted value is the unconditional mean E[X]=0.67. While the estimate E[X]=0.67 is easy to calculate, it does not take the observed data (X_1=1 and X_2=2) into account, and it certainly does not take the parameter \theta into account. A third alternative is to incorporate the observed data into the estimate of the next ball. We now continue with the Bayesian calculation.

Unconditional Distribution
\displaystyle f_X(0)=0.6(0.8)+0.15(0.2)=0.51
\displaystyle f_X(1)=0.3(0.8)+0.35(0.2)=0.31
\displaystyle f_X(2)=0.1(0.8)+0.50(0.2)=0.18

Marginal Probability
\displaystyle f_{X_1,X_2}(1,2)=0.1(0.3)(0.8)+0.5(0.35)(0.2)=0.059

Posterior Distribution of \theta
\displaystyle \pi_{\Theta \lvert X_1,X_2}(1 \lvert 1,2)=\frac{0.1(0.3)(0.8)}{0.059}=\frac{24}{59}

\displaystyle \pi_{\Theta \lvert X_1,X_2}(2 \lvert 1,2)=\frac{0.5(0.35)(0.2)}{0.059}=\frac{35}{59}
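
The marginal probability and the posterior weights can be reproduced numerically (a Python sketch):

```python
box = {1: {0: 0.60, 1: 0.30, 2: 0.10},   # f(x | theta = 1)
       2: {0: 0.15, 1: 0.35, 2: 0.50}}   # f(x | theta = 2)
prior = {1: 0.8, 2: 0.2}
obs = [1, 2]                             # X_1 = 1, X_2 = 2

# Joint probability of the observations and each theta
joint = {}
for t in prior:
    p = prior[t]
    for x in obs:
        p *= box[t][x]
    joint[t] = p

marginal = sum(joint.values())                       # f(1, 2) = 0.059
posterior = {t: joint[t] / marginal for t in joint}  # 24/59 and 35/59
```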

Predictive Distribution of X
\displaystyle f_{X_3 \lvert X_1,X_2}(0 \lvert 1,2)=0.6 \cdot \frac{24}{59} + 0.15 \cdot \frac{35}{59}=\frac{19.65}{59}

\displaystyle f_{X_3 \lvert X_1,X_2}(1 \lvert 1,2)=0.3 \cdot \frac{24}{59} + 0.35 \cdot \frac{35}{59}=\frac{19.45}{59}

\displaystyle f_{X_3 \lvert X_1,X_2}(2 \lvert 1,2)=0.1 \cdot \frac{24}{59} + 0.50 \cdot \frac{35}{59}=\frac{19.90}{59}

Here is another formulation of the predictive distribution of X_3. See the general methodology section below.
\displaystyle f_{X_3 \lvert X_1,X_2}(0 \lvert 1,2)=\frac{0.6(0.1)(0.3)(0.8)+0.15(0.5)(0.35)(0.2)}{0.059}=\frac{19.65}{59}

\displaystyle f_{X_3 \lvert X_1,X_2}(1 \lvert 1,2)=\frac{0.3(0.1)(0.3)(0.8)+0.35(0.5)(0.35)(0.2)}{0.059}=\frac{19.45}{59}

\displaystyle f_{X_3 \lvert X_1,X_2}(2 \lvert 1,2)=\frac{0.1(0.1)(0.3)(0.8)+0.5(0.5)(0.35)(0.2)}{0.059}=\frac{19.90}{59}
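
Both formulations amount to mixing the two box distributions with the posterior weights, which is quick to verify (a Python sketch):

```python
box = {1: {0: 0.60, 1: 0.30, 2: 0.10},
       2: {0: 0.15, 1: 0.35, 2: 0.50}}
posterior = {1: 24 / 59, 2: 35 / 59}   # from the Bayes' theorem step above

# Mix the conditional distributions by the posterior weights
predictive = {x: sum(posterior[t] * box[t][x] for t in posterior)
              for x in (0, 1, 2)}
```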

The posterior distribution \pi_{\Theta \lvert X_1,X_2}(\cdot \lvert 1,2) is the conditional probability distribution of the parameter \theta given the observed data X_1=1 and X_2=2. It is obtained by applying Bayes' theorem. The predictive distribution f_{X_3 \lvert X_1,X_2}(\cdot \lvert 1,2) is the conditional probability distribution of a new observation given the past observations X_1=1 and X_2=2. Since both of these distributions incorporate the past observations, the Bayesian estimate of the next observation is the mean of the predictive distribution.

\displaystyle E[X_3 \lvert X_1=1,X_2=2]

\displaystyle =0 \thinspace f_{X_3 \lvert X_1,X_2}(0 \lvert 1,2)+1 \thinspace f_{X_3 \lvert X_1,X_2}(1 \lvert 1,2)+2 \thinspace f_{X_3 \lvert X_1,X_2}(2 \lvert 1,2)

\displaystyle =0 \cdot \frac{19.65}{59}+1 \cdot \frac{19.45}{59}+ 2 \cdot \frac{19.90}{59}

\displaystyle =\frac{59.25}{59} \approx 1.0042

\displaystyle E[X_3 \lvert X_1=1,X_2=2]

\displaystyle =E[X \lvert \theta=1] \medspace \pi_{\Theta \lvert X_1,X_2}(1 \lvert 1,2)+E[X \lvert \theta=2] \medspace \pi_{\Theta \lvert X_1,X_2}(2 \lvert 1,2)

\displaystyle =0.5 \cdot \frac{24}{59}+1.35 \cdot \frac{35}{59}=\frac{59.25}{59}

Note that we compute the Bayesian estimate E[X_3 \lvert X_1,X_2] in two ways, one using the predictive distribution and the other using the posterior distribution of the parameter \theta. The Bayesian estimate is the mean of the hypothetical means E[X \lvert \theta], with expectation taken over the posterior distribution \pi_{\Theta \lvert X_1,X_2}(\cdot \lvert 1,2).
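
The two routes to the Bayesian estimate are easy to check against each other (a Python sketch):

```python
box = {1: {0: 0.60, 1: 0.30, 2: 0.10},
       2: {0: 0.15, 1: 0.35, 2: 0.50}}
posterior = {1: 24 / 59, 2: 35 / 59}

# Route 1: mean of the predictive distribution
predictive = {x: sum(posterior[t] * box[t][x] for t in posterior)
              for x in (0, 1, 2)}
est_predictive = sum(x * p for x, p in predictive.items())

# Route 2: posterior-weighted average of the hypothetical means E[X | theta]
cond_means = {t: sum(x * p for x, p in box[t].items()) for t in box}
est_posterior = sum(posterior[t] * cond_means[t] for t in posterior)
```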

Discussion of General Methodology
We now use Example 1 to draw out the general methodology. We first describe the discrete case and then state the continuous case as a generalization.

Suppose we have a family of conditional density functions f_{X \lvert \Theta}(x \lvert \theta). In Example 1, bowl B gives the distribution of the parameter \theta, and Box 1 and Box 2 are the conditional distributions with density f_{X \lvert \Theta}(x \lvert \theta). In an insurance application, \theta is a risk parameter and the conditional distribution f_{X \lvert \Theta}(x \lvert \theta) describes the claim experience in a given fixed period (conditional on \Theta=\theta).

Suppose that X_1,X_2, \cdots, X_n,X_{n+1} (conditional on \Theta=\theta) are independent and identically distributed with common density function f_{X \lvert \Theta}(x \lvert \theta). In Example 1, once a box is selected (e.g. Box 1), the repeated draws of balls are independent and identically distributed. In an insurance application, the X_k are the claim experience of an insured (or a group of insureds) belonging to the risk class with parameter \theta.

We are interested in predicting X_{n+1}. In our example, X_{n+1} is the value of the ball in the (n+1)^{st} draw. In an insurance application, X_{n+1} may be the claim experience of an insured (or a group of insureds) in the next policy period. One option is to use the unconditional mean E[X]=E[E(X \lvert \Theta)] (the mean of the hypothetical means), but this approach does not take the risk parameter of the insured into account. On the other hand, if we knew the value of \theta, we could use the conditional distribution f_{X \lvert \Theta}(x \lvert \theta); but the risk parameter is usually unknown. The natural alternative is to condition on the observed experience in the n prior periods X_1, \cdots, X_n rather than on the risk parameter \theta. Thus we derive the predictive distribution of X_{n+1} given the observations X_1, \cdots, X_n. Given the observed experience data X_1=x_1,X_2=x_2, \cdots, X_n=x_n, the following is the derivation of the Bayesian predictive distribution. Note that the prior distribution of the parameter \theta is \pi_{\Theta}(\theta).

The Unconditional Distribution
\displaystyle f_X(x)=\sum \limits_{\theta} f_{X \lvert \Theta}(x \lvert \theta) \ \pi_{\Theta}(\theta)

The Marginal Distribution
\displaystyle f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)=\sum \limits_{\theta} \biggl[\prod \limits_{i=1}^{n} f_{X_i \lvert \Theta}(x_i \lvert \theta)\biggr] \pi_{\Theta}(\theta)

The Posterior Distribution
\displaystyle \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n)

\displaystyle = \ \ \ \ \ \ \ \ \ \ \frac{1}{f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)} \biggl[\prod \limits_{i=1}^{n} f_{X_i \lvert \Theta}(x_i \lvert \theta)\biggr] \pi_{\Theta}(\theta)

The Predictive Distribution
\displaystyle f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n)

\displaystyle =\ \ \ \ \ \ \ \ \ \ \sum \limits_{\theta} f_{X \lvert \Theta}(x \lvert \theta) \thinspace \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n)

Another formulation is:
\displaystyle f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n)

\displaystyle =\ \ \ \ \ \ \ \ \ \ \frac{1}{f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)} \sum \limits_{\theta} f_{X_{n+1} \lvert \Theta}(x \lvert \theta) \biggl[ \prod \limits_{j=1}^{n}f_{X_j \lvert \Theta}(x_j \lvert \theta)\biggr] \thinspace \pi_{\Theta}(\theta)

The Bayesian Predictive Mean of the Next Period
\displaystyle E[X_{n+1} \lvert X_1=x_1, \cdots, X_n=x_n]

\displaystyle =\ \ \ \ \ \ \ \ \ \ \sum \limits_{x} x \thinspace f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n)

\displaystyle E[X_{n+1} \lvert X_1=x_1, \cdots, X_n=x_n]

\displaystyle =\ \ \ \ \ \ \ \ \ \ \sum \limits_{\theta} E[X \lvert \theta] \thinspace \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n)
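
The discrete-case formulas above can be collected into one generic routine (a Python sketch; the function name and argument layout are illustrative):

```python
def bayes_predict(prior, cond, data, support):
    """Posterior over theta and predictive distribution of the next observation.

    prior:   {theta: pi(theta)}, the prior distribution
    cond:    {theta: {x: f(x | theta)}}, the conditional distributions
    data:    observed values x_1, ..., x_n
    support: possible values of the next observation
    """
    # Joint: product of the likelihoods times the prior, for each theta
    joint = {}
    for t in prior:
        p = prior[t]
        for x in data:
            p *= cond[t][x]
        joint[t] = p
    marginal = sum(joint.values())
    posterior = {t: joint[t] / marginal for t in joint}
    predictive = {x: sum(posterior[t] * cond[t][x] for t in posterior)
                  for x in support}
    return posterior, predictive
```

Applied to Example 1 with data [1, 2], this reproduces the posterior weights 24/59 and 35/59 and the predictive mean 59.25/59.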

We now state the same results for the case where the claim experience X and the parameter \theta are continuous.

The Unconditional Distribution
\displaystyle f_{X}(x) = \int_{\theta} f_{X \lvert \Theta} (x \lvert \theta) \ \pi_{\Theta}(\theta) \ d \theta

The Marginal Distribution
\displaystyle f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)=\int \limits_{\theta} \biggl[\prod \limits_{i=1}^{n} f_{X \lvert \Theta}(x_i \lvert \theta)\biggr] \pi_{\Theta}(\theta) d \theta

The Posterior Distribution
\displaystyle \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n)

\displaystyle =\ \ \ \ \ \ \ \ \ \ \frac{1}{f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)} \biggl[\prod \limits_{i=1}^{n} f_{X \lvert \Theta}(x_i \lvert \theta)\biggr] \pi_{\Theta}(\theta)

The Predictive Distribution
\displaystyle f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n)

\displaystyle =\ \ \ \ \ \ \ \ \ \ \int \limits_{\theta} f_{X \lvert \Theta}(x \lvert \theta) \thinspace \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n) \ d \theta

Another formulation is:
\displaystyle f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n)

\displaystyle =\ \ \ \ \ \ \ \ \ \ \frac{1}{f_{X_1, \cdots, X_n}(x_1, \cdots, x_n)} \int \limits_{\theta} f_{X_{n+1} \lvert \Theta}(x \lvert \theta) \biggl[ \prod \limits_{j=1}^{n}f_{X_j \lvert \Theta}(x_j \lvert \theta)\biggr] \thinspace \pi_{\Theta}(\theta) \ d \theta

The Bayesian Predictive Mean of the Next Period
\displaystyle E[X_{n+1} \lvert X_1=x_1, \cdots, X_n=x_n]

\displaystyle =\ \ \ \ \ \ \ \ \ \ \int \limits_{x} x \thinspace f_{X_{n+1} \lvert X_1, \cdots, X_n}(x \vert x_1, \cdots, x_n) dx

\displaystyle E[X_{n+1} \lvert X_1=x_1, \cdots, X_n=x_n]

\displaystyle =\ \ \ \ \ \ \ \ \ \ \int \limits_{\theta} E[X \lvert \theta] \thinspace \pi_{\Theta \lvert X_1, \cdots, X_n}(\theta \lvert x_1, \cdots, x_n) d \theta
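
As a concrete continuous illustration (a hypothetical setup, separate from Example 1): take a uniform prior for \theta on (0,1) and Bernoulli observations X_i given \theta. The predictive probability of a success is then known in closed form as Laplace's rule of succession, (k+1)/(n+2) for k successes in n trials, and the integrals above can be approximated on a grid (a Python sketch):

```python
def predictive_success_prob(data, m=200001):
    """P(X_{n+1} = 1 | data) for Bernoulli X_i with a flat prior on theta,
    approximating the posterior integrals on an evenly spaced grid."""
    k, n = sum(data), len(data)
    grid = [i / (m - 1) for i in range(m)]
    # Unnormalized posterior density: likelihood times the flat prior
    w = [t ** k * (1 - t) ** (n - k) for t in grid]
    # For Bernoulli observations, the predictive success probability
    # equals the posterior mean of theta
    return sum(t * wt for t, wt in zip(grid, w)) / sum(w)
```

For example, with data [1, 1, 0] (k = 2, n = 3), the grid estimate is close to (2+1)/(3+2) = 0.6.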

See the next post (Examples of Bayesian prediction in insurance-continued) for Example 2.
