The hazard rate function, an introduction

The goal of this post is to introduce the concept of the hazard rate function by modifying one of the postulates of the approximate Poisson process. The rate of change in the modified process is the hazard rate function. When a “change” in the modified Poisson process means a termination of a system (be it manufactured or biological), the notion of the hazard rate function leads to the concept of survival models. We then discuss several important examples of survival probability models that are defined by the hazard rate function. These examples include the Weibull distribution, the Gompertz distribution and the model based on Makeham’s law.

We consider an experiment in which the occurrences of a certain type of events are counted during a given time interval or on a given physical object. Suppose that we count the occurrences of events on the interval (0,t). We call the occurrence of the type of events in question a change. We assume the following three conditions:

  1. The numbers of changes occurring in nonoverlapping intervals are independent.
  2. The probability of two or more changes taking place in a sufficiently small interval is essentially zero.
  3. The probability of exactly one change in the short interval (t,t+\delta) is approximately \lambda(t) \delta where \delta is sufficiently small and \lambda(t) is a nonnegative function of t.

For lack of a better name, throughout this post, we call the above process the counting process (*). The approximate Poisson process is defined by conditions 1 and 2 and the condition that the \lambda(t) in condition 3 is a constant function. Thus the process we describe here is more general than the Poisson process.

Though the counting process indicated here can model the number of changes occurring in a physical object or a physical interval, we focus on the time aspect by considering the counting process as a model for the number of changes occurring in a time interval, where a change means “termination” or “failure” of a system under consideration. In many applications (e.g. in actuarial science and reliability engineering), the interest is in the time until termination or failure. The distribution of the time until failure is called a survival model. The rate of change function \lambda(t) indicated in condition 3 is called the hazard rate function. It is also called the failure rate function in reliability engineering. In actuarial science, the hazard rate function is known as the force of mortality.

Two random variables naturally arise from the counting process (*). One is the discrete variable N_t, defined as the number of changes in the time interval (0,t). The other is the continuous random variable T, defined as the time until the occurrence of the first (or next) change.

Claim 1. Let \displaystyle \Lambda(t)=\int_{0}^{t} \lambda(y) dy. Then e^{-\Lambda(t)} is the probability that there is no change in the interval (0,t). That is, \displaystyle P[N_t=0]=e^{-\Lambda(t)}.

We are interested in finding the probability of zero changes in the interval (0,y+\delta). By condition 1, the numbers of changes in the nonoverlapping intervals (0,y) and (y,y+\delta) are independent. Thus we have:

\displaystyle P[N_{y+\delta}=0] \approx P[N_y=0] \times [1-\lambda(y) \delta] \ \ \ \ \ \ \ \ (a)

Note that by condition 3, the probability of exactly one change in the small interval (y,y+\delta) is approximately \lambda(y) \delta, and by condition 2 the probability of two or more changes is essentially zero. Thus [1-\lambda(y) \delta] is approximately the probability of no change in the interval (y,y+\delta). Continuing with equation (a), we have the following derivation:

\displaystyle \frac{P[N_{y+\delta}=0] - P[N_y=0]}{\delta} \approx -\lambda(y) P[N_y=0]

Letting \delta \rightarrow 0, the left hand side becomes a derivative:

\displaystyle \frac{d}{dy} P[N_y=0]=-\lambda(y) P[N_y=0]

\displaystyle \frac{\frac{d}{dy} P[N_y=0]}{P[N_y=0]}=-\lambda(y)

\displaystyle \int_0^{t} \frac{\frac{d}{dy} P[N_y=0]}{P[N_y=0]} dy=-\int_0^{t} \lambda(y)dy

Integrating the left hand side and using the boundary condition of P[N_0=0]=1, we have:

\displaystyle \ln P[N_t=0]=-\int_0^{t} \lambda(y)dy

\displaystyle P[N_t=0]=e^{-\int_0^{t} \lambda(y)dy}
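
The derivation above can be checked numerically. The following is a minimal sketch, not part of the original argument: it multiplies the factors [1-\lambda(y) \delta] from equation (a) over many small subintervals and compares the product with e^{-\Lambda(t)}. The hazard rate \lambda(t)=0.5+0.2t and the function names are assumptions chosen only for illustration.

```python
import math


def no_change_probability_product(hazard, t, steps=100_000):
    """Approximate P[N_t = 0] by chaining equation (a) over small subintervals."""
    delta = t / steps
    prob = 1.0
    for i in range(steps):
        y = i * delta
        # Probability of no change in (y, y + delta) is about 1 - hazard(y) * delta.
        prob *= 1.0 - hazard(y) * delta
    return prob


def no_change_probability_closed_form(cumulative_hazard, t):
    """P[N_t = 0] = exp(-Lambda(t)) as stated in Claim 1."""
    return math.exp(-cumulative_hazard(t))


# Hypothetical hazard rate lambda(t) = 0.5 + 0.2 t, so Lambda(t) = 0.5 t + 0.1 t^2.
def example_hazard(y):
    return 0.5 + 0.2 * y


def example_cumulative_hazard(t):
    return 0.5 * t + 0.1 * t ** 2


approx = no_change_probability_product(example_hazard, 2.0)
exact = no_change_probability_closed_form(example_cumulative_hazard, 2.0)
print(approx, exact)
```

With a finer subdivision (a larger value of steps), the product should move closer to the closed form, which mirrors the limit taken in the derivation.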

Claim 2
As discussed above, let T be the length of the interval that is required to observe the first change in the counting process (*). Then the following are the distribution function, survival function and pdf of T:

  • \displaystyle F_T(t)=\displaystyle 1-e^{-\int_0^t \lambda(y) dy}
  • \displaystyle S_T(t)=\displaystyle e^{-\int_0^t \lambda(y) dy}
  • \displaystyle f_T(t)=\displaystyle \lambda(t) e^{-\int_0^t \lambda(y) dy}

In Claim 1, we derive the probability P[N_y=0] for the discrete variable N_y derived from the counting process (*). We now consider the continuous random variable T. Note that P[T > t] is the probability that the first change occurs after time t. This means there is no change within the interval (0,t). Thus S_T(t)=P[T > t]=P[N_t=0]=e^{-\int_0^t \lambda(y) dy}. The distribution function and density function can be derived accordingly.
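
As an illustration of Claim 2, the sketch below builds the survival function, distribution function and pdf of T from an arbitrary hazard rate function by integrating it numerically with the trapezoidal rule. The function names and the constant hazard rate 0.3 used in the check are assumptions for this example only.

```python
import math


def cumulative_hazard(hazard, t, steps=10_000):
    """Trapezoidal approximation of Lambda(t), the integral of hazard over (0, t)."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h


def survival(hazard, t):
    """S_T(t) = exp(-Lambda(t))."""
    return math.exp(-cumulative_hazard(hazard, t))


def cdf(hazard, t):
    """F_T(t) = 1 - S_T(t)."""
    return 1.0 - survival(hazard, t)


def pdf(hazard, t):
    """f_T(t) = lambda(t) * S_T(t)."""
    return hazard(t) * survival(hazard, t)


# Hypothetical constant hazard rate: T should then be exponential with rate 0.3.
def constant_hazard(y):
    return 0.3


print(survival(constant_hazard, 2.0), math.exp(-0.3 * 2.0))
print(pdf(constant_hazard, 2.0), 0.3 * math.exp(-0.3 * 2.0))
```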

Claim 3
The hazard rate function \lambda(t) is equivalent to each of the following:

  • \displaystyle \lambda(t)=\frac{f_T(t)}{1-F_T(t)}
  • \displaystyle \lambda(t)=\frac{-S_T^{'}(t)}{S_T(t)}

Both identities follow from Claim 2. Since f_T(t)=\lambda(t) S_T(t) and S_T(t)=1-F_T(t), dividing by the survival function gives the first identity. Since S_T^{'}(t)=-f_T(t), the second identity follows from the first. Based on condition 3 of the counting process (*), \lambda(t) is the rate of change in the counting process. Note that \lambda(t) \delta is approximately the probability of a change (e.g. a failure or a termination) in a small time interval of length \delta. Thus the hazard rate function can be interpreted as the failure rate at time t given that the life in question has survived to time t. Claim 3 shows that the hazard rate function is the ratio of the density function to the survival function of the time until failure variable T. Thus the hazard rate function \lambda(t) is the conditional density of failure at time t, given survival to time t. It is the rate of failure at the next instant given that the life or system being studied has survived up to time t.
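
Claim 3 also gives a practical way to recover the hazard rate when only the survival function is known. The following sketch estimates -S_T^{'}(t)/S_T(t) with a central finite difference. The survival function S(t)=e^{-t^2} (whose hazard rate is 2t) and the step size are assumptions made for illustration.

```python
import math


def hazard_from_survival(survival, t, h=1e-5):
    """Estimate lambda(t) = -S'(t) / S(t) using a central finite difference."""
    derivative = (survival(t + h) - survival(t - h)) / (2.0 * h)
    return -derivative / survival(t)


# Hypothetical survival function S(t) = exp(-t^2); its hazard rate is 2t.
def example_survival(t):
    return math.exp(-t ** 2)


estimated = hazard_from_survival(example_survival, 1.5)
print(estimated)  # should be close to 2 * 1.5
```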

It is interesting to note that the function \Lambda(t)=\int_0^t \lambda(y) dy defined in claim 1 is called the cumulative hazard rate function. Thus the cumulative hazard rate function is an alternative way of representing the hazard rate function (see the discussion on Weibull distribution below).

Examples of Survival Models

Exponential Distribution
In many applications, especially those for biological organisms and mechanical systems that wear out over time, the hazard rate \lambda(t) is an increasing function of t. In other words, the older the life in question (the larger the t), the higher the chance of failure at the next instant. For humans, the probability of an 85-year-old dying in the next year is clearly higher than that of a 20-year-old. In a Poisson process, the rate of change \lambda(t)=\lambda indicated in condition 3 is a constant. As a result, the time T until the first change derived in claim 2 has an exponential distribution with parameter \lambda. In terms of mortality study or reliability study of machines that wear out over time, this is not a realistic model. However, if the mortality or failure is caused by random external events, this could be an appropriate model.

Weibull Distribution
This distribution is a common model choice for describing the life of manufactured objects. It is defined by the following cumulative hazard rate function:

\displaystyle \Lambda(t)=\biggl(\frac{t}{\beta}\biggr)^{\alpha} where \alpha > 0 and \beta>0

As a result, the hazard rate function, the density function and the survival function for the lifetime distribution are:

\displaystyle \lambda(t)=\frac{\alpha}{\beta} \biggl(\frac{t}{\beta}\biggr)^{\alpha-1}

\displaystyle f_T(t)=\frac{\alpha}{\beta} \biggl(\frac{t}{\beta}\biggr)^{\alpha-1} \displaystyle e^{\displaystyle -\biggl[\frac{t}{\beta}\biggr]^{\alpha}}

\displaystyle S_T(t)=\displaystyle e^{\displaystyle -\biggl[\frac{t}{\beta}\biggr]^{\alpha}}

The parameter \alpha is the shape parameter and \beta is the scale parameter. When \alpha=1, the hazard rate becomes a constant and the Weibull distribution becomes an exponential distribution.

When the parameter \alpha<1, the failure rate decreases over time. One interpretation is that most of the defective items fail early on in the life cycle. Once they are removed from the population, the failure rate decreases over time.

When the parameter \alpha>1, the failure rate increases with time. This is a good candidate for a model to describe the lifetime of machines or systems that wear out over time.
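
The three regimes of the shape parameter can be seen by evaluating the Weibull hazard rate directly. The sketch below is illustrative only; the scale \beta=2 and the time points are hypothetical values, and the function names are not from any particular library.

```python
import math


def weibull_hazard(t, alpha, beta):
    """lambda(t) = (alpha / beta) * (t / beta)^(alpha - 1)."""
    return (alpha / beta) * (t / beta) ** (alpha - 1)


def weibull_survival(t, alpha, beta):
    """S_T(t) = exp(-(t / beta)^alpha)."""
    return math.exp(-((t / beta) ** alpha))


def weibull_pdf(t, alpha, beta):
    """f_T(t) = lambda(t) * S_T(t)."""
    return weibull_hazard(t, alpha, beta) * weibull_survival(t, alpha, beta)


# Hypothetical scale parameter and time points, chosen only for illustration.
beta = 2.0
for alpha in (0.5, 1.0, 2.0):
    rates = [round(weibull_hazard(t, alpha, beta), 4) for t in (1.0, 2.0, 3.0)]
    print("alpha =", alpha, "hazard rates at t = 1, 2, 3:", rates)
```

The printed rates should decrease for \alpha=0.5, stay constant at 1/\beta for \alpha=1 and increase for \alpha=2.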

The Gompertz Distribution
The Gompertz law states that the force of mortality or failure rate increases exponentially over time. It describes human mortality quite accurately. The following is the hazard rate function:

\displaystyle \lambda(t)=\alpha e^{\beta t} where \alpha>0 and \beta>0.

The following are the cumulative hazard rate function as well as the survival function, distribution function and the pdf of the lifetime distribution T.

\displaystyle \Lambda(t)=\int_0^t \alpha e^{\beta y} dy=\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}

\displaystyle S_T(t)=\displaystyle e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}\biggr]}

\displaystyle F_T(t)=\displaystyle 1-e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}\biggr]}

\displaystyle f_T(t)=\displaystyle \alpha e^{\beta t} \thinspace e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}\biggr]}
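
The Gompertz model can be sketched in a few lines by combining the cumulative hazard rate function with the relation S_T(t)=e^{-\Lambda(t)} from Claim 2. The parameter values \alpha=0.0001 and \beta=0.09 below are hypothetical and are not fitted to any mortality data.

```python
import math


def gompertz_hazard(t, alpha, beta):
    """lambda(t) = alpha * exp(beta * t)."""
    return alpha * math.exp(beta * t)


def gompertz_cumulative_hazard(t, alpha, beta):
    """Lambda(t) = (alpha / beta) * (exp(beta * t) - 1)."""
    return (alpha / beta) * (math.exp(beta * t) - 1.0)


def gompertz_survival(t, alpha, beta):
    """S_T(t) = exp(-Lambda(t)), using Claim 2."""
    return math.exp(-gompertz_cumulative_hazard(t, alpha, beta))


def gompertz_pdf(t, alpha, beta):
    """f_T(t) = lambda(t) * S_T(t)."""
    return gompertz_hazard(t, alpha, beta) * gompertz_survival(t, alpha, beta)


# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 0.0001, 0.09
for age in (20, 50, 85):
    print(age, gompertz_hazard(age, alpha, beta), gompertz_survival(age, alpha, beta))
```

Because the cumulative hazard is nonnegative, the survival probabilities printed here stay between 0 and 1 and decrease with age.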

Makeham’s Law
Makeham’s Law states that the force of mortality is the Gompertz failure rate plus an age-independent component that accounts for external causes of mortality. The following is the hazard rate function:

\displaystyle \lambda(t)=\alpha e^{\beta t}+\mu where \alpha>0, \beta>0 and \mu>0.

The following are the cumulative hazard rate function as well as the survival function, distribution function and the pdf of the lifetime distribution T.

\displaystyle \Lambda(t)=\int_0^t (\alpha e^{\beta y}+\mu) dy=\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}+\mu t

\displaystyle S_T(t)=\displaystyle e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}+\mu t\biggr]}

\displaystyle F_T(t)=\displaystyle 1-e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}+\mu t\biggr]}

\displaystyle f_T(t)=\biggl( \alpha e^{\beta t}+\mu \biggr) \thinspace e^{\displaystyle -\biggl[\frac{\alpha}{\beta} e^{\beta t}-\frac{\alpha}{\beta}+\mu t\biggr]}
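
As a sanity check on the Makeham model, the sketch below compares the closed-form cumulative hazard rate function with a numerical integral of the hazard rate, and checks that integrating the pdf f_T(t)=\lambda(t) S_T(t) over (0,t) gives 1-S_T(t). The parameter values and the trapezoidal helper are assumptions made for this illustration.

```python
import math


def makeham_hazard(t, alpha, beta, mu):
    """lambda(t) = alpha * exp(beta * t) + mu."""
    return alpha * math.exp(beta * t) + mu


def makeham_cumulative_hazard(t, alpha, beta, mu):
    """Lambda(t) = (alpha / beta) * (exp(beta * t) - 1) + mu * t."""
    return (alpha / beta) * (math.exp(beta * t) - 1.0) + mu * t


def makeham_survival(t, alpha, beta, mu):
    """S_T(t) = exp(-Lambda(t)), using Claim 2."""
    return math.exp(-makeham_cumulative_hazard(t, alpha, beta, mu))


def makeham_pdf(t, alpha, beta, mu):
    """f_T(t) = lambda(t) * S_T(t)."""
    return makeham_hazard(t, alpha, beta, mu) * makeham_survival(t, alpha, beta, mu)


def trapezoid(func, a, b, steps=20_000):
    """Trapezoidal rule for the integral of func over (a, b)."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


# Hypothetical parameter values, chosen only for illustration.
alpha, beta, mu = 0.0001, 0.09, 0.0005
t = 60.0

numeric_cumulative = trapezoid(lambda y: makeham_hazard(y, alpha, beta, mu), 0.0, t)
closed_cumulative = makeham_cumulative_hazard(t, alpha, beta, mu)
print(numeric_cumulative, closed_cumulative)

numeric_cdf = trapezoid(lambda y: makeham_pdf(y, alpha, beta, mu), 0.0, t)
print(numeric_cdf, 1.0 - makeham_survival(t, alpha, beta, mu))
```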


Introduction to Buhlmann credibility

In this post, we continue our discussion of credibility theory. Suppose that for a particular insured (either an individual entity or a group of insureds), we have observed data X_1,X_2, \cdots, X_n (the numbers of claims or loss amounts). We are interested in setting a rate to cover the claim experience X_{n+1} from the next period. In two previous posts (Examples of Bayesian prediction in insurance, Examples of Bayesian prediction in insurance-continued), we discussed this estimation problem from a Bayesian perspective and presented two examples. In this post, we discuss the Buhlmann credibility model and work the same two examples using the Buhlmann method.

First, let’s further describe the setting of the problem. For a particular insured, the experience data corresponding to various exposure periods are assumed to be independent. Statistically speaking, conditional on a risk parameter \Theta, the claim numbers or loss amounts X_1, \cdots, X_n,X_{n+1} are independent and identically distributed. Furthermore, the distribution of the risk characteristics in the population of insureds and potential insureds is represented by \pi_{\Theta}(\theta). The experience (either claim numbers or loss amounts) of a particular insured with risk parameter \Theta=\theta is modeled by the conditional distribution f_{X \lvert \Theta}(x \lvert \theta) given \Theta=\theta.

The Buhlmann Credibility Estimator
Given the observations X_1, \cdots, X_n in the prior exposure periods, the Buhlmann credibility estimate C of the claim experience X_{n+1} is

\displaystyle C=Z \overline{X}+(1-Z)\mu

where Z is the credibility factor assigned to the observed experience data and \mu is the unconditional mean E[X] (the mean taken over all values of the risk parameter \Theta). The credibility factor Z is of the form \displaystyle Z=\frac{n}{n+K} where n is a measure of the exposure size (it is the number of observation periods in our examples) and \displaystyle K=\frac{E[Var[X \lvert \Theta]]}{Var[E[X \lvert \Theta]]}. The parameter K will be further explained below.

The Buhlmann credibility estimator C is a linear function of the past data. Note that it is of the form:

\displaystyle C=Z \overline{X}+(1-Z)\mu=w_0+\sum \limits_{i=1}^{n} w_i X_i

where w_0=(1-Z)\mu and \displaystyle w_i=\frac{Z}{n} for i=1, \cdots, n.

Not only is the Buhlmann credibility estimator a linear estimator, it is the best linear estimator to the Bayesian predictive mean E[X_{n+1} \lvert X_1, \cdots, X_n] and the hypothetical mean E[X_{n+1} \lvert \Theta] in terms of minimizing squared error loss. In other words, the coefficients w_i are obtained in such a way that the following expectations (loss functions) are minimized where the expectations are taken over all observations and/or \Theta (see [1]):

\displaystyle L_1=E\biggl( \biggl[E[X_{n+1} \lvert \Theta]-w_0-\sum \limits_{i=1}^{n} w_i X_i \biggr]^2 \biggr)

\displaystyle L_2=E\biggl( \biggl[E[X_{n+1} \lvert X_1, \cdots, X_n]-w_0-\sum \limits_{i=1}^{n} w_i X_i \biggr]^2 \biggr)

The Buhlmann Method
As discussed above, the Buhlmann credibility factor Z=\frac{n}{n+K} is chosen such that C=Z \overline{X}+(1-Z) \mu is the best linear approximation to the Bayesian estimate of the next period’s claim experience. Now we focus on the calculation of the parameter K.

Conditional on the risk parameter \Theta, E[X \lvert \Theta] is called the hypothetical mean and Var[X \lvert \Theta] is called the process variance. Then \mu=E[X]=E[E[X \lvert \Theta]] is the expected value of hypothetical means (the unconditional mean). The total variance of this random process is:

\displaystyle Var[X]=E[Var[X \lvert \Theta]]+Var[E[X \lvert \Theta]]

The first part of the total variance E[Var[X \lvert \Theta]] is called the expected value of process variance (EPV) and the second part Var[E[X \lvert \Theta]] is called the variance of the hypothetical means (VHM). The parameter K in the Buhlmann method is simply the ratio K=\frac{EPV}{VHM}.

We can get an intuitive feel for this formula by considering the variability of the hypothetical means E[X \lvert \Theta] across many values of the risk parameter \Theta. If the entire population of insureds (and potential insureds) is fairly homogeneous with respect to the risk parameter \Theta, then the hypothetical means do not vary a great deal and VHM=Var[E[X \lvert \Theta]] is relatively small in relation to EPV=E[Var[X \lvert \Theta]]. As a result, K is large and Z is closer to 0. This agrees with the notion that in a homogeneous population, the unconditional mean (the overall mean) is of more value as a predictor of the next period’s claim experience. On the other hand, if the population of insureds is heterogeneous with respect to the risk parameter \Theta, then the overall mean is of less value as a predictor of future experience and we should rely more on the experience of the particular insured. Again, the Buhlmann formula agrees with this notion. If VHM=Var[E[X \lvert \Theta]] is large relative to EPV=E[Var[X \lvert \Theta]], then K is small and Z is closer to 1.

Another attractive feature of the Buhlmann formula is that as more experience data accumulate (as n \rightarrow \infty), the credibility factor Z approaches 1 (the experience data become more and more credible).
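
The behavior of the credibility factor described above can be sketched as follows. The EPV and VHM values labeled "homogeneous" and "heterogeneous" are hypothetical numbers picked only to show the direction of the effect, and the function names are illustrative.

```python
def buhlmann_k(epv, vhm):
    """K = EPV / VHM."""
    return epv / vhm


def credibility_factor(n, epv, vhm):
    """Z = n / (n + K)."""
    return n / (n + buhlmann_k(epv, vhm))


def buhlmann_estimate(observations, mu, epv, vhm):
    """C = Z * (sample mean) + (1 - Z) * mu."""
    n = len(observations)
    z = credibility_factor(n, epv, vhm)
    sample_mean = sum(observations) / n
    return z * sample_mean + (1.0 - z) * mu


# Hypothetical populations: a small VHM relative to EPV means a homogeneous population.
for label, epv, vhm in (("homogeneous", 1.0, 0.01), ("heterogeneous", 1.0, 1.0)):
    for n in (1, 10, 100):
        print(label, "n =", n, "Z =", round(credibility_factor(n, epv, vhm), 4))
```

For a fixed n, the heterogeneous population receives the larger Z, and in both populations Z grows toward 1 as n increases.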

Example 1
In this random experiment, there is a big bowl (called B) and two boxes (Box 1 and Box 2). Bowl B consists of a large quantity of balls, 80% of which are white and 20% of which are red. In Box 1, 60% of the balls are labeled 0, 30% are labeled 1 and 10% are labeled 2. In Box 2, 15% of the balls are labeled 0, 35% are labeled 1 and 50% are labeled 2. In the experiment, a ball is selected at random from bowl B. The color of the selected ball from bowl B determines which box to use (if the ball is white, then use Box 1; if red, use Box 2). Then balls are drawn at random from the selected box (Box i) repeatedly with replacement and the values of the series of selected balls are recorded. The value of the first selected ball is X_1, the value of the second selected ball is X_2 and so on.

Suppose that your friend performs this random experiment (you do not know whether he uses Box 1 or Box 2) and that his first selected ball is a 1 (X_1=1) and his second selected ball is a 2 (X_2=2). What is the predicted value X_3 of the third selected ball?

This example was solved in (Examples of Bayesian prediction in insurance) using the Bayesian approach. We now work this example in the Buhlmann approach.

The following restates the prior distribution of \Theta and the conditional distribution of X \lvert \Theta. We denote “white ball from bowl B” by \Theta=1 and “red ball from bowl B” by \Theta=2.

\displaystyle \pi_{\Theta}(1)=0.80
\displaystyle \pi_{\Theta}(2)=0.20


\displaystyle f_{X \lvert \Theta}(0 \lvert \Theta=1)=0.60
\displaystyle f_{X \lvert \Theta}(1 \lvert \Theta=1)=0.30
\displaystyle f_{X \lvert \Theta}(2 \lvert \Theta=1)=0.10

\displaystyle f_{X \lvert \Theta}(0 \lvert \Theta=2)=0.15
\displaystyle f_{X \lvert \Theta}(1 \lvert \Theta=2)=0.35
\displaystyle f_{X \lvert \Theta}(2 \lvert \Theta=2)=0.50

The following computes the conditional means (hypothetical means) and conditional variances (process variances) and the other parameters of the Buhlmann method.

Hypothetical Means
\displaystyle E[X \lvert \Theta=1]=0.60(0)+0.30(1)+0.10(2)=0.50
\displaystyle E[X \lvert \Theta=2]=0.15(0)+0.35(1)+0.50(2)=1.35

\displaystyle E[X^2 \lvert \Theta=1]=0.60(0)+0.30(1)+0.10(4)=0.70
\displaystyle E[X^2 \lvert \Theta=2]=0.15(0)+0.35(1)+0.50(4)=2.35

Process Variances
\displaystyle Var[X \lvert \Theta=1]=0.70-0.50^2=0.45
\displaystyle Var[X \lvert \Theta=2]=2.35-1.35^2=0.5275

Expected Value of the Hypothetical Means
\displaystyle \mu=E[X]=E[E[X \lvert \Theta]]=0.80(0.50)+0.20(1.35)=0.67

Expected Value of the Process Variance
\displaystyle EPV=E[Var[X \lvert \Theta]]=0.8(0.45)+0.20(0.5275)=0.4655

Variance of the Hypothetical Means
\displaystyle VHM=Var[E[X \lvert \Theta]]=0.80(0.50)^2+0.20(1.35)^2-0.67^2=0.1156

Buhlmann Credibility Factor
\displaystyle K=\frac{EPV}{VHM}=\frac{0.4655}{0.1156}=\frac{4655}{1156}

\displaystyle Z=\frac{2}{2+\frac{4655}{1156}}=\frac{2312}{6967}=0.33185

Buhlmann Credibility Estimate
\displaystyle C=\frac{2312}{6967} \frac{3}{2}+\frac{4655}{6967} (0.67)=\frac{6586.85}{6967}=0.9454356

Note that the Bayesian estimate obtained in Examples of Bayesian prediction in insurance is 1.004237288. Under the Buhlmann model, the past claim experience of the insured in this example is assigned 33% weight in projecting the claim frequency in the next period.
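
The calculations in Example 1 can be reproduced with a short script. This is a sketch using the probabilities stated in the example; the variable names are illustrative. The Bayesian predictive mean is computed alongside the Buhlmann estimate for comparison.

```python
# Prior distribution of Theta (1 = white ball, Box 1; 2 = red ball, Box 2).
prior = {1: 0.80, 2: 0.20}
# Conditional distribution of X given Theta.
conditional = {
    1: {0: 0.60, 1: 0.30, 2: 0.10},
    2: {0: 0.15, 1: 0.35, 2: 0.50},
}
observations = [1, 2]

# Hypothetical means and process variances for each value of Theta.
hyp_mean = {th: sum(x * p for x, p in dist.items()) for th, dist in conditional.items()}
proc_var = {
    th: sum(x * x * p for x, p in dist.items()) - hyp_mean[th] ** 2
    for th, dist in conditional.items()
}

mu = sum(prior[th] * hyp_mean[th] for th in prior)
epv = sum(prior[th] * proc_var[th] for th in prior)
vhm = sum(prior[th] * hyp_mean[th] ** 2 for th in prior) - mu ** 2

n = len(observations)
k = epv / vhm
z = n / (n + k)
buhlmann = z * (sum(observations) / n) + (1.0 - z) * mu

# Bayesian predictive mean: weight the hypothetical means by the posterior of Theta.
joint = {}
for th in prior:
    weight = prior[th]
    for x in observations:
        weight *= conditional[th][x]
    joint[th] = weight
total = sum(joint.values())
bayes = sum(joint[th] / total * hyp_mean[th] for th in prior)

print("mu =", mu, "EPV =", epv, "VHM =", vhm)
print("K =", k, "Z =", z)
print("Buhlmann estimate =", buhlmann)
print("Bayesian estimate =", bayes)
```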

Example 2
The number of claims X generated by an insured in a portfolio of independent insurance policies has a Poisson distribution with parameter \Theta. In the portfolio of policies, the parameter \Theta varies according to a gamma distribution with parameters \alpha and \beta. We have the following conditional distribution of X and distribution of the risk parameter \Theta.

\displaystyle f_{X \lvert \Theta}(x \lvert \theta)=\frac{\theta^x e^{-\theta}}{x!} where x=0,1,2, \cdots

\displaystyle \pi_{\Theta}(\theta)=\frac{\beta^{\alpha}}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta \theta} where \Gamma(\cdot) is the gamma function.

Suppose that a particular insured in this portfolio has generated 0 and 3 claims in the first 2 policy periods. What is the Buhlmann estimate of the number of claims for this insured in period 3?

Since the conditional distribution of X is Poisson, we have E[X \lvert \Theta]=\Theta and Var[X \lvert \Theta]=\Theta. As a result, the EPV, VHM and K are:

\displaystyle EPV=E[\Theta]=\frac{\alpha}{\beta}

\displaystyle VHM=Var[\Theta]=\frac{\alpha}{\beta^2}

\displaystyle K=\frac{EPV}{VHM}=\beta

As a result, the credibility factor for a 2-period experience period is Z=\frac{2}{2+\beta} and the Buhlmann estimate of the claim frequency in the next period is:

\displaystyle C=\frac{2}{2+\beta} \thinspace \biggl(\frac{3}{2}\biggr)+\frac{\beta}{2+\beta} \thinspace \biggl(\frac{\alpha}{\beta}\biggr)

To generalize the above results, suppose that we have observed X_1=x_1, \cdots, X_n=x_n for this insured in the prior periods. Then the Buhlmann estimate for the claim frequency in the next period is:

\displaystyle C=\frac{n}{n+\beta} \thinspace \biggl(\frac{\sum \limits_{i=1}^{n}x_i}{n}\biggr)+\frac{\beta}{n+\beta} \thinspace \biggl(\frac{\alpha}{\beta}\biggr)

In this example, the Buhlmann estimate is exactly the same as the Bayesian estimate (Examples of Bayesian prediction in insurance-continued).
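
The agreement between the two estimates can be checked numerically. The sketch below relies on the standard conjugate result that, for a Poisson likelihood and a gamma prior with parameters \alpha and \beta (rate form, as above), the posterior of \Theta is gamma with parameters \alpha+\sum x_i and \beta+n, so the Bayesian predictive mean is (\alpha+\sum x_i)/(\beta+n). The values \alpha=2 and \beta=4 are hypothetical.

```python
def buhlmann_poisson_gamma(claims, alpha, beta):
    """Buhlmann estimate with K = beta and mu = alpha / beta."""
    n = len(claims)
    z = n / (n + beta)
    return z * (sum(claims) / n) + (1.0 - z) * (alpha / beta)


def bayesian_poisson_gamma(claims, alpha, beta):
    """Posterior mean of Theta under the gamma(alpha + sum, beta + n) posterior."""
    return (alpha + sum(claims)) / (beta + len(claims))


# Hypothetical prior parameters; the claim counts 0 and 3 are those of Example 2.
alpha, beta = 2.0, 4.0
claims = [0, 3]
print(buhlmann_poisson_gamma(claims, alpha, beta))
print(bayesian_poisson_gamma(claims, alpha, beta))
```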


  1. Klugman S. A., Panjer H. H., Willmot G. E., Loss Models, From Data To Decisions, Second Edition, 2004, John Wiley & Sons, Inc.