In this post, we continue our discussion of credibility theory. Suppose that for a particular insured (either an individual entity or a group of insureds), we have observed data (numbers of claims or loss amounts). We are interested in setting a rate to cover the claim experience in the next period. In two previous posts (Examples of Bayesian prediction in insurance, Examples of Bayesian prediction in insurance-continued), we discussed this estimation problem from a Bayesian perspective and presented two examples. In this post, we discuss the Buhlmann credibility model and work the same two examples using the Buhlmann method.
First, let’s further describe the setting of the problem. For a particular insured, the experience data corresponding to various exposure periods are assumed to be independent. Statistically speaking, conditional on a risk parameter $\Theta$, the claim numbers or loss amounts $X_1, \dots, X_n, X_{n+1}$ are independent and identically distributed. Furthermore, the distribution of the risk characteristics in the population of insureds and potential insureds is represented by $\pi_\Theta(\theta)$. The experience (either claim numbers or loss amounts) of a particular insured with risk parameter $\Theta=\theta$ is modeled by the conditional distribution $f_{X \mid \Theta}(x \mid \theta)$ given $\Theta=\theta$.
The Buhlmann Credibility Estimator
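Given the observations $X_1, \dots, X_n$ in the prior exposure periods, the Buhlmann credibility estimate $C$ of the claim experience $X_{n+1}$ is
$$C = Z \bar{X} + (1-Z) \mu$$
where $\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$ is the sample mean of the observations, $Z$ is the credibility factor assigned to the observed experience data and $\mu$ is the unconditional mean $E[X]$ (the mean taken over all values of the risk parameter $\Theta$). The credibility factor $Z$ is of the form
$$Z = \frac{n}{n+K}$$
where $n$ is a measure of the exposure size (it is the number of observation periods in our examples) and $K = \frac{EPV}{VHM}$. The parameter $K$ will be further explained below.
The Buhlmann credibility estimator is a linear function of the past data. Note that it is of the form:
$$C = \alpha_0 + \sum_{i=1}^n \alpha_i X_i$$
where $\alpha_0 = (1-Z) \mu$ and $\alpha_i = \frac{Z}{n}$ for $i = 1, \dots, n$.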
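The two ways of writing the estimator can be checked against each other numerically. The following is a minimal sketch, assuming the inputs $\mu$ and $K$ are already known; the function names and the sample numbers are made up for illustration and do not come from the examples in this post.

```python
def buhlmann_estimate(observations, mu, k):
    """Credibility-weighted form: C = Z * x_bar + (1 - Z) * mu."""
    n = len(observations)
    z = n / (n + k)                      # credibility factor Z = n / (n + K)
    x_bar = sum(observations) / n        # sample mean of the past data
    return z * x_bar + (1 - z) * mu


def buhlmann_linear_form(observations, mu, k):
    """Linear form: C = alpha_0 + sum(alpha_i * X_i)."""
    n = len(observations)
    z = n / (n + k)
    alpha_0 = (1 - z) * mu               # intercept
    alpha_i = z / n                      # same weight on every observation
    return alpha_0 + sum(alpha_i * x for x in observations)


# Hypothetical inputs, chosen only to exercise the two functions.
sample = [1, 0, 2, 1]
c_weighted = buhlmann_estimate(sample, mu=0.8, k=3.0)
c_linear = buhlmann_linear_form(sample, mu=0.8, k=3.0)
print(c_weighted, c_linear)
```

Both functions return the same number, which is just the algebraic identity $Z \bar{X} = \sum_{i=1}^n \frac{Z}{n} X_i$.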
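Not only is the Buhlmann credibility estimator a linear estimator, it is the best linear estimator of the Bayesian predictive mean $E[X_{n+1} \mid X_1, \dots, X_n]$ and of the hypothetical mean $E[X_{n+1} \mid \Theta]$ in terms of minimizing squared error loss. In other words, the coefficients $\alpha_0, \alpha_1, \dots, \alpha_n$ are obtained in such a way that the following expectations (loss functions) are minimized, where the expectations are taken over all observations and/or $\Theta$ (see [1]):
$$E\left[ \left( E[X_{n+1} \mid \Theta] - \alpha_0 - \sum_{i=1}^n \alpha_i X_i \right)^2 \right]$$
$$E\left[ \left( E[X_{n+1} \mid X_1, \dots, X_n] - \alpha_0 - \sum_{i=1}^n \alpha_i X_i \right)^2 \right]$$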
The Buhlmann Method
As discussed above, the Buhlmann credibility factor $Z = \frac{n}{n+K}$ is chosen such that $C = Z \bar{X} + (1-Z) \mu$ is the best linear approximation to the Bayesian estimate of the next period’s claim experience. Now we focus on the calculation of the parameter $K$.
Conditional on the risk parameter $\Theta$, $E[X \mid \Theta]$ is called the hypothetical mean and $Var[X \mid \Theta]$ is called the process variance. Then $\mu = E[X] = E[E[X \mid \Theta]]$ is the expected value of hypothetical means (the unconditional mean). The total variance of this random process is:
$$Var[X] = E[Var[X \mid \Theta]] + Var[E[X \mid \Theta]]$$
The first part of the total variance, $E[Var[X \mid \Theta]]$, is called the expected value of process variance (EPV) and the second part, $Var[E[X \mid \Theta]]$, is called the variance of the hypothetical means (VHM). The parameter $K$ in the Buhlmann method is simply the ratio $K = \frac{EPV}{VHM}$.
We can get an intuitive feel for this formula by considering the variability of the hypothetical means across many values of the risk parameter $\Theta$. If the entire population of insureds (and potential insureds) is fairly homogeneous with respect to the risk parameter $\Theta$, then the hypothetical mean $E[X \mid \Theta]$ does not vary a great deal and $VHM$ is relatively small in relation to $EPV$. As a result, $K$ is large and $Z$ is closer to 0. This agrees with the notion that in a homogeneous population, the unconditional mean (the overall mean) is of more value as a predictor of the next period’s claim experience. On the other hand, if the population of insureds is heterogeneous with respect to the risk parameter $\Theta$, then the overall mean is of less value as a predictor of future experience and we should rely more on the experience of the particular insured. Again, the Buhlmann formula agrees with this notion. If $VHM$ is large relative to $EPV$, then $K$ is small and $Z$ is closer to 1.
Another attractive feature of the Buhlmann formula is that as more experience data accumulate (as $n \rightarrow \infty$), the credibility factor $Z$ approaches 1 (the experience data become more and more credible).
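The behavior of the credibility factor described in this section can be tabulated directly. The sketch below assumes only the formulas $K = EPV/VHM$ and $Z = n/(n+K)$; the EPV and VHM values are hypothetical and serve only to contrast a homogeneous population with a heterogeneous one.

```python
def credibility_factor(n, epv, vhm):
    """Z = n / (n + K) with K = EPV / VHM."""
    k = epv / vhm
    return n / (n + k)


# Hypothetical populations sharing the same EPV.
homogeneous = {"epv": 1.0, "vhm": 0.01}    # hypothetical means barely vary, K = 100
heterogeneous = {"epv": 1.0, "vhm": 10.0}  # hypothetical means vary a lot, K = 0.1

for n in (1, 2, 10, 100, 1000):
    z_hom = credibility_factor(n, **homogeneous)
    z_het = credibility_factor(n, **heterogeneous)
    print(f"n={n:5d}  Z(homogeneous)={z_hom:.4f}  Z(heterogeneous)={z_het:.4f}")
```

For a fixed number of periods, the heterogeneous population always receives the larger credibility factor, and in both populations the factor increases toward 1 as the number of periods grows.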
Example 1
In this random experiment, there are a big bowl (called B) and two boxes (Box 1 and Box 2). Bowl B contains a large quantity of balls, 80% of which are white and 20% of which are red. In Box 1, 60% of the balls are labeled 0, 30% are labeled 1 and 10% are labeled 2. In Box 2, 15% of the balls are labeled 0, 35% are labeled 1 and 50% are labeled 2. In the experiment, a ball is selected at random from bowl B. The color of the selected ball determines which box to use (if the ball is white, use Box 1; if red, use Box 2). Then balls are drawn at random from the selected box (Box $i$) repeatedly with replacement and the values of the series of selected balls are recorded. The value of the first selected ball is $X_1$, the value of the second selected ball is $X_2$ and so on.
Suppose that your friend performs this random experiment (you do not know whether he uses Box 1 or Box 2) and that his first selected ball is a 1 ($X_1=1$) and his second selected ball is a 2 ($X_2=2$). What is the predicted value $X_3$ of the third selected ball?
This example was solved in Examples of Bayesian prediction in insurance using the Bayesian approach. We now work it using the Buhlmann approach.
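The following restates the prior distribution of $\Theta$ and the conditional distribution of $X \mid \Theta$. We denote “white ball from bowl B” by $\Theta=1$ and “red ball from bowl B” by $\Theta=2$.
$$\pi_\Theta(1) = 0.8 \qquad \pi_\Theta(2) = 0.2$$
$$f_{X \mid \Theta}(0 \mid 1) = 0.60 \qquad f_{X \mid \Theta}(1 \mid 1) = 0.30 \qquad f_{X \mid \Theta}(2 \mid 1) = 0.10$$
$$f_{X \mid \Theta}(0 \mid 2) = 0.15 \qquad f_{X \mid \Theta}(1 \mid 2) = 0.35 \qquad f_{X \mid \Theta}(2 \mid 2) = 0.50$$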
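The following computes the conditional means (hypothetical means), the conditional variances (process variances) and the other parameters of the Buhlmann method.
Hypothetical Means
$$E[X \mid \Theta=1] = 0.60(0) + 0.30(1) + 0.10(2) = 0.50$$
$$E[X \mid \Theta=2] = 0.15(0) + 0.35(1) + 0.50(2) = 1.35$$
$$E[X^2 \mid \Theta=1] = 0.60(0) + 0.30(1) + 0.10(4) = 0.70$$
$$E[X^2 \mid \Theta=2] = 0.15(0) + 0.35(1) + 0.50(4) = 2.35$$
Process Variances
$$Var[X \mid \Theta=1] = 0.70 - 0.50^2 = 0.45$$
$$Var[X \mid \Theta=2] = 2.35 - 1.35^2 = 0.5275$$
Expected Value of the Hypothetical Means
$$\mu = E[X] = E[E[X \mid \Theta]] = 0.8(0.50) + 0.2(1.35) = 0.67$$
Expected Value of the Process Variance
$$EPV = E[Var[X \mid \Theta]] = 0.8(0.45) + 0.2(0.5275) = 0.4655$$
Variance of the Hypothetical Means
$$VHM = Var[E[X \mid \Theta]] = 0.8(0.50)^2 + 0.2(1.35)^2 - 0.67^2 = 0.1156$$
Buhlmann Credibility Factor
$$K = \frac{EPV}{VHM} = \frac{0.4655}{0.1156} \approx 4.0268$$
$$Z = \frac{2}{2+K} \approx 0.3318$$
Buhlmann Credibility Estimate
$$C = Z \bar{X} + (1-Z) \mu \approx 0.3318 \left( \frac{1+2}{2} \right) + 0.6682 (0.67) \approx 0.9454$$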
Note that the Bayesian estimate obtained in Examples of Bayesian prediction in insurance is 1.004237288. Under the Buhlmann model, the past claim experience of the insured in this example is assigned 33% weight in projecting the claim frequency in the next period.
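The arithmetic of Example 1 can be reproduced in a few lines. The following sketch uses only the probabilities stated in the example; the variable names are illustrative. It also recomputes the Bayesian predictive mean so that the two estimates can be compared side by side.

```python
import math

# Prior over the box (1 = white ball, Box 1; 2 = red ball, Box 2).
prior = {1: 0.8, 2: 0.2}
# Conditional distribution of a ball's label given the box.
cond = {
    1: {0: 0.60, 1: 0.30, 2: 0.10},
    2: {0: 0.15, 1: 0.35, 2: 0.50},
}
observed = [1, 2]
n = len(observed)

# Hypothetical means and process variances for each box.
hyp_mean = {b: sum(x * p for x, p in cond[b].items()) for b in prior}
proc_var = {
    b: sum(x * x * p for x, p in cond[b].items()) - hyp_mean[b] ** 2
    for b in prior
}

mu = sum(prior[b] * hyp_mean[b] for b in prior)                  # E[X]
epv = sum(prior[b] * proc_var[b] for b in prior)                 # E[Var[X|Theta]]
vhm = sum(prior[b] * hyp_mean[b] ** 2 for b in prior) - mu ** 2  # Var[E[X|Theta]]

k = epv / vhm
z = n / (n + k)
buhlmann = z * sum(observed) / n + (1 - z) * mu

# Bayesian predictive mean: posterior-weighted average of hypothetical means.
joint = {b: prior[b] * math.prod(cond[b][x] for x in observed) for b in prior}
total = sum(joint.values())
bayes = sum(joint[b] / total * hyp_mean[b] for b in prior)

print(f"mu={mu:.4f} EPV={epv:.4f} VHM={vhm:.4f} K={k:.4f} Z={z:.4f}")
print(f"Buhlmann estimate={buhlmann:.4f}  Bayesian estimate={bayes:.9f}")
```

The Buhlmann estimate comes out somewhat below the Bayesian estimate here, which is expected: the Buhlmann estimate is only the best linear approximation of the Bayesian predictive mean.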
Example 2
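The number of claims generated by an insured in a portfolio of independent insurance policies has a Poisson distribution with parameter $\theta$. In the portfolio of policies, the parameter $\theta$ varies according to a gamma distribution with parameters $\alpha$ and $\beta$ (with $\beta$ acting as a rate parameter). We have the following conditional distribution of $X$ and distribution of the risk parameter $\Theta$.
$$f_{X \mid \Theta}(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!}$$
where $x = 0, 1, 2, \dots$
$$\pi_\Theta(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta \theta}$$
where $\Gamma(\cdot)$ is the gamma function.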
Suppose that a particular insured in this portfolio has generated 0 and 3 claims in the first 2 policy periods. What is the Buhlmann estimate of the number of claims for this insured in period 3?
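Since the conditional distribution of $X$ is Poisson, we have $E[X \mid \Theta] = \Theta$ and $Var[X \mid \Theta] = \Theta$. As a result, the $EPV$, $VHM$ and $K$ are:
$$EPV = E[\Theta] = \frac{\alpha}{\beta}$$
$$VHM = Var[\Theta] = \frac{\alpha}{\beta^2}$$
$$K = \frac{EPV}{VHM} = \beta$$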
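As a result, the credibility factor for a 2-period experience period is $Z = \frac{2}{2+\beta}$ and the Buhlmann estimate of the claim frequency in the next period is:
$$C = \frac{2}{2+\beta} \left( \frac{0+3}{2} \right) + \frac{\beta}{2+\beta} \left( \frac{\alpha}{\beta} \right) = \frac{\alpha+3}{\beta+2}$$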
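To generalize the above results, suppose that we have observed $X_1 = x_1, \dots, X_n = x_n$ for this insured in the prior periods. Then the Buhlmann estimate for the claim frequency in the next period is:
$$C = \frac{n}{n+\beta} \left( \frac{1}{n} \sum_{i=1}^n x_i \right) + \frac{\beta}{n+\beta} \left( \frac{\alpha}{\beta} \right) = \frac{\alpha + \sum_{i=1}^n x_i}{\beta + n}$$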
In this example, the Buhlmann estimate is exactly the same as the Bayesian estimate (Examples of Bayesian prediction in insurance-continued).
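The agreement between the two estimates can be checked numerically. The sketch below makes two assumptions that are not part of the example: the gamma prior is written with $\beta$ as a rate parameter, so that its density is proportional to $\theta^{\alpha-1} e^{-\beta \theta}$, and the values $\alpha = 2$ and $\beta = 4$ are picked purely for illustration. The Bayesian posterior mean is obtained by a crude numerical integration on a grid rather than from the known closed form, so that the comparison is not circular.

```python
import math

alpha, beta = 2.0, 4.0      # hypothetical gamma parameters (beta is a rate)
claims = [0, 3]             # observed claim counts from the example
n = len(claims)
total = sum(claims)

# Buhlmann estimate: EPV = alpha/beta, VHM = alpha/beta**2, so K = beta.
k = beta
z = n / (n + k)
mu = alpha / beta
buhlmann = z * total / n + (1 - z) * mu


def posterior_kernel(theta):
    """Unnormalized posterior: gamma prior times Poisson likelihood."""
    return theta ** (alpha - 1 + total) * math.exp(-(beta + n) * theta)


# Riemann sum on (0, 20]; the kernel is negligible beyond that range.
step = 0.001
grid = [step * i for i in range(1, 20001)]
weights = [posterior_kernel(t) for t in grid]
bayes = sum(t * w for t, w in zip(grid, weights)) / sum(weights)

closed_form = (alpha + total) / (beta + n)
print(buhlmann, bayes, closed_form)
```

With these assumed parameters, all three numbers agree with $\frac{\alpha + 3}{\beta + 2} = \frac{5}{6}$, up to the accuracy of the numerical integration.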
Reference
- Klugman S. A., Panjer H. H., Willmot G. E., Loss Models, From Data To Decisions, Second Edition, 2004, John Wiley & Sons, Inc.