A comparative analysis of statistical models for the pricing of health insurance
Actuaries frequently classify policyholders by their risk potential in order to price insurance contracts, with the goal of balancing fair tariffs against protection from expected losses. Statistical models enable actuaries to determine pure premiums through the frequency-severity approach, in which claim frequency and claim severity are modeled separately. Accordingly, this thesis investigates the use of standard linear models (LM), generalized linear models (GLM), and generalized additive models (GAM) for the pricing of insurance contracts. Specifically, it explores the properties and purpose of these models and then studies their ability to handle continuous and spatial variables. Methodologically, we construct a framework that demonstrates how actuaries can incorporate these variables into the different models and how the statistical models and their associated premiums can be compared. We construct a case study with Belgian hospitalization insurance data, apply the models to predict the frequency and severity of claims, and obtain pure premiums. We conclude that a log-normal LM for claim severity and a negative binomial GLM for claim frequency lead to the preferred premium structure in terms of predictive performance and fair pricing. Furthermore, we find that using GAMs leads to a similar premium structure.
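
To make the frequency-severity approach concrete, the following minimal Python sketch shows how a pure premium could be assembled from a predicted claim frequency and a log-normal claim severity. It assumes that frequency and severity are independent, so that the pure premium is the product of their expectations. The function names and all numerical values are hypothetical and are not taken from the case study.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of a log-normal variable whose logarithm is N(mu, sigma^2)."""
    return math.exp(mu + 0.5 * sigma ** 2)


def pure_premium(expected_frequency, expected_severity):
    """Pure premium as expected claim count times expected claim size.

    Assumes claim frequency and claim severity are independent.
    """
    return expected_frequency * expected_severity


# Hypothetical predictions for one policyholder (illustrative values only).
expected_claims = 0.2   # expected number of claims per year, e.g. from a count model
mu, sigma = 6.0, 1.0    # log-scale mean and standard deviation of claim size

severity = lognormal_mean(mu, sigma)
premium = pure_premium(expected_claims, severity)
print(f"expected severity: {severity:.2f}, pure premium: {premium:.2f}")
```

Note that the log-normal mean includes the term sigma^2 / 2: exponentiating the predicted log-severity alone would give the median claim size and would understate the premium.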