Non-Life Insurance Pricing Models

Apr 12, 2021

Model Development Review

Background and Definitions About Insurance

Any insurance that is not life insurance is classified as non-life insurance, also called general insurance or (in the US) property and casualty insurance.

A non-life insurance policy is an agreement between an insurance company and a customer — the policyholder — in which the insurer undertakes to compensate the customer for certain unpredictable losses during a time period, usually one year, against a fee, the premium.

Through the insurance contract (the policy), economic risk is transferred from the policyholder to the insurer. This leads to the generally applied principle that the premium should be based on the expected loss transferred from the policyholder to the insurer. There must also be a loading for administration costs, cost of capital, etc., but that is not the focus of the pricing models discussed here.

The need for statistical methods comes from the fact that the expected losses vary between policies (policyholders): the accident rate (claim frequency) is not the same for all policyholders and once a claim has occurred, the expected damages (claim severity) vary between policyholders.

Development of Statistical Methods and Models

The minimum-bias method for non-life insurance pricing was introduced by Bailey and Simon in 1960 and is considered an important milestone. The method starts by choosing how the explanatory (rating) variables combine into a predicted value for each risk level, for example multiplicatively, and a distance measure between the predicted and the observed values. Once these elements are fixed, an iterative algorithm calculates the coefficient associated with each risk level by minimizing that distance.
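As a rough illustration of such an iterative scheme, the sketch below fits a multiplicative two-factor model by alternately updating row and column factors so that an exposure-weighted chi-square distance decreases. This is one possible variant chosen for illustration, not necessarily the exact procedure of the 1960 paper; the function names, the rating factors (age band, region) and all figures are hypothetical.

```python
import math

def chi2_distance(r, n, x, y):
    """Exposure-weighted chi-square distance between observed r[i][j]
    and the multiplicative fit x[i] * y[j]."""
    return sum(
        n[i][j] * (r[i][j] - x[i] * y[j]) ** 2 / (x[i] * y[j])
        for i in range(len(x))
        for j in range(len(y))
    )

def minimum_bias_chi2(r, n, n_iter=100):
    """Alternate closed-form updates of the row factors x and column factors y.
    Each update minimizes the chi-square distance over one set of factors
    while the other set is held fixed. Assumes r > 0 and n > 0 everywhere."""
    rows, cols = len(r), len(r[0])
    x = [1.0] * rows
    y = [1.0] * cols
    for _ in range(n_iter):
        for i in range(rows):
            num = sum(n[i][j] * r[i][j] ** 2 / y[j] for j in range(cols))
            den = sum(n[i][j] * y[j] for j in range(cols))
            x[i] = math.sqrt(num / den)
        for j in range(cols):
            num = sum(n[i][j] * r[i][j] ** 2 / x[i] for i in range(rows))
            den = sum(n[i][j] * x[i] for i in range(rows))
            y[j] = math.sqrt(num / den)
    return x, y

# Hypothetical observed claim frequencies: rows = age band, columns = region.
observed = [[0.10, 0.14], [0.06, 0.09]]
# Hypothetical exposure (policy-years) in each cell.
exposure = [[400.0, 250.0], [900.0, 600.0]]

x, y = minimum_bias_chi2(observed, exposure)
fitted = [[xi * yj for yj in y] for xi in x]
print(fitted)
```

Only the products x[i] * y[j] are meaningful here: scaling all row factors up and all column factors down by the same constant gives the same fit, so in practice one level is usually fixed as a base.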

GLMs (Generalized Linear Models) were formulated by Nelder and Wedderburn in 1972 and became standard industry practice for non-life insurance pricing in the 1990s. The GLM generalises the ordinary linear model: it extends the assumption that the response variable follows a normal distribution to a particular family of distributions, namely the exponential family, which includes the Normal, Poisson, Binomial and Gamma distributions.
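In symbols (standard textbook notation, not taken from this article), a GLM assumes that the response Y has a density of the exponential dispersion form with canonical parameter θ and dispersion parameter φ, and ties the mean μ to the covariates x through a link function g:

```latex
f(y;\theta,\phi) = \exp\!\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y,\phi) \right\},
\qquad \mu = \mathrm{E}[Y] = b'(\theta),
\qquad g(\mu) = x^{\top}\beta .
```

The ordinary linear model is the special case of a normal response with the identity link g(μ) = μ.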

AI (Artificial Intelligence) and ML (Machine Learning) methods entered insurance price modelling only recently, roughly from 2018 to 2020. The strongest performers so far are LightGBM (Light Gradient Boosting Machine) and neural networks, which can outperform GLMs in prediction accuracy and automation. However, because these algorithms are less interpretable than GLMs, they are not yet ubiquitous in insurance modelling. Explainable AI, or interpretable machine learning, a very active research area, offers a good chance of explaining the predictions of such black-box models.

Insurance Pricing by GLM

Next we'll focus on the GLM, which is still the mainstream non-life insurance pricing model. So how do insurance companies set prices (premiums) by statistical modelling with a GLM?

The price (premium) is set in relation to the risk of the customer, so that the customer's expected loss is covered. This does not account for the whole of the final price, since there are always administration costs, capital costs, etc., and a profit margin, as in any other business. Nonetheless, the foundation is a premium chosen according to the risk of the customer, which we call the pure premium.

Pure premium = Claim frequency × Claim severity
(assuming that claim severity is independent of claim frequency)

The claim frequency is the number of claims divided by the duration, for some group of policies in force during a specific time period, i.e., the average number of claims per time period (usually one year).

The claim severity is the total claim amount divided by the number of claims, i.e., the average cost per claim.
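As a small worked example of these definitions, the snippet below computes the three key ratios for one group of policies. All figures are invented for illustration, and duration is measured in policy-years.

```python
# Hypothetical figures for one group of policies (invented for illustration).
n_claims = 120                  # number of claims in the period
duration = 1500.0               # total time in force, in policy-years
total_claim_amount = 360_000.0  # total cost of those claims

claim_frequency = n_claims / duration            # claims per policy-year
claim_severity = total_claim_amount / n_claims   # average cost per claim
pure_premium = claim_frequency * claim_severity  # expected cost per policy-year

print(claim_frequency, claim_severity, pure_premium)
```

On observed data the product collapses to the total claim amount divided by the duration, since the number of claims cancels out.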

Model assumptions: claim frequency is usually assumed to follow a Poisson or Negative Binomial distribution, while claim severity is assumed to follow a Gamma distribution. When actuaries model the pure premium directly, they normally assume a Tweedie distribution. These distributions belong to the exponential dispersion family (the Negative Binomial only when its shape parameter is held fixed) and are reasonable descriptions of real claims data.
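These choices differ mainly in how the variance grows with the mean. In standard GLM notation, with dispersion parameter φ and variance function V (symbols introduced here for illustration, not used in the text above):

```latex
\mathrm{Var}(Y) = \phi\, V(\mu), \qquad
V(\mu) =
\begin{cases}
\mu                  & \text{Poisson (claim frequency)} \\
\mu^{2}              & \text{Gamma (claim severity)} \\
\mu^{p},\ 1 < p < 2  & \text{Tweedie (pure premium)}
\end{cases}
```

A Tweedie distribution with 1 < p < 2 is a compound Poisson-gamma distribution: it has a point mass at zero (policies with no claims) and a continuous positive part, which is why it suits pure premium data.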

The distribution of the response variable is therefore not restricted to the normal distribution but may be any member of the exponential family (which includes the normal distribution as a special case), which is why we use a GLM rather than an ordinary linear model.
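To make this concrete, here is a minimal sketch of a Poisson claim-frequency GLM with a log link and a log(duration) offset, fitted by Newton-Raphson in plain Python. The data, the `young_driver` rating factor and the helper names `solve` and `fit_poisson_glm` are invented for illustration; in practice one would use a statistics package rather than hand-written code.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_poisson_glm(X, y, exposure, n_iter=50, tol=1e-10):
    """Poisson GLM with log link: E[y_i] = exposure_i * exp(x_i . beta).
    The first column of X is assumed to be the intercept."""
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / sum(exposure))  # start at the overall frequency
    for _ in range(n_iter):
        mu = [e * math.exp(sum(b * v for b, v in zip(beta, row)))
              for row, e in zip(X, exposure)]
        # Score vector X^T (y - mu) and information matrix X^T diag(mu) X.
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical policy-level data: columns of X are [intercept, young_driver].
X = [[1, 0], [1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1]]
claims = [0, 1, 0, 0, 1, 2, 0, 0]
duration = [1.0, 1.0, 0.5, 1.0, 1.0, 0.5, 1.0, 0.5]  # policy-years

beta = fit_poisson_glm(X, claims, duration)
base_frequency = math.exp(beta[0])  # claims per policy-year, base group
relativity = math.exp(beta[1])      # multiplicative factor for young drivers
print(base_frequency, relativity)
```

With a log link the model is multiplicative: the base frequency is multiplied by a relativity for each rating factor. Because this toy model has one parameter per group, the fitted frequencies simply reproduce the observed ones (1 claim over 3.5 policy-years and 3 claims over 3.0 policy-years).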

Implementation in Practice (to be continued)

My next article will show a practical application of a Tweedie GLM to pricing, using an R dataset.

Summary

This article summarised what non-life insurance is, reviewed the methods that have been used to price it, and introduced the GLM, the most widely used of them, for insurance pricing. The next article will continue with a practical GLM pricing application in R on an R insurance dataset.

References

Bailey, R.A., Simon, L.R.J., 1960. Two Studies in Automobile Insurance Ratemaking. ASTIN Bulletin, 1 (4), pp. 192–217.

De Jong, P., Heller, G.Z., 2008. Generalized Linear Models for Insurance Data. Cambridge University Press, Cambridge.

Ohlsson, E., Johansson, B., 2010. Non-Life Insurance Pricing with Generalized Linear Models. Springer, Berlin.

David, M., 2015. Auto Insurance Premium Calculation Using Generalized Linear Models. Procedia Economics and Finance, Volume 20, pp. 147–156.