A common myth surrounding usage-based insurance is that UBI modeling fundamentally differs from techniques that actuaries and other quantitative risk managers may already use. We sought to dispel that notion when we were invited to contribute to the new textbook, Predictive Modeling in Actuarial Science: Volume 2, Case Studies in Insurance, published in July by Cambridge and the Institute and Faculty of Actuaries.
This book is a follow-up to the well-received Volume 1, Predictive Modeling Techniques, which covers many of the established methods used in insurance. The new volume contains 11 examples of analytics in action, including applications to group health, workers' compensation, and UBI. The UBI chapter, “Predictive Modeling for UBI,” was co-authored by me and Udi Makov of Verisk Insurance Solutions.
While challenging some myths of UBI modeling, we also had to consider how commonly used techniques such as multivariate regression may falter in the face of the velocity, variety, and volume of data typically associated with UBI. Our discussion therefore alternates between best practices for preparing telematics data for modeling and ways of finding the right algorithms to apply to that data.
The data preparation process we present involves creating thousands of multi-dimensional variables to describe risk dynamics potentially associated with driving. For example, sudden braking may mean one thing in the context of traffic congestion or a “quick” traffic light, and another when changing lanes in fog or tailgating at high speed. Without examining the full context of each maneuver using geospatial overlays and other “non-telemetric” data, it would be easy to misclassify a risky event as safe or vice versa.
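To make the idea of context-dependent variables concrete, here is a minimal Python sketch. It is not the chapter's method: the field names, the overlay inputs, and every threshold (0.4 g, 1 second of headway, 90 km/h, 200 m of visibility) are hypothetical values chosen only for illustration. It shows how the same hard-braking reading can produce different variables depending on the surrounding conditions.

```python
from dataclasses import dataclass


@dataclass
class BrakingEvent:
    """One braking maneuver joined with contextual overlays (all fields hypothetical)."""
    decel_g: float       # peak deceleration, in g
    speed_kph: float     # speed when braking began
    headway_s: float     # time gap to the vehicle ahead, in seconds
    congestion: bool     # flag from an assumed traffic overlay
    visibility_m: float  # visibility from an assumed weather overlay


def contextual_features(event: BrakingEvent) -> dict:
    """Turn one raw event into several context-aware indicator variables."""
    hard = event.decel_g >= 0.4  # illustrative threshold, not a calibrated value
    return {
        "hard_brake": hard,
        "hard_brake_in_congestion": hard and event.congestion,
        "hard_brake_tailgating_fast": (
            hard and event.headway_s < 1.0 and event.speed_kph > 90
        ),
        "hard_brake_low_visibility": hard and event.visibility_m < 200,
    }


# Two events with identical deceleration but very different contexts.
stop_and_go = BrakingEvent(decel_g=0.45, speed_kph=20, headway_s=2.5,
                           congestion=True, visibility_m=5000)
tailgating = BrakingEvent(decel_g=0.45, speed_kph=110, headway_s=0.6,
                          congestion=False, visibility_m=5000)

stop_and_go_features = contextual_features(stop_and_go)
tailgating_features = contextual_features(tailgating)
```

Both events trip the plain `hard_brake` flag, but only the second sets `hard_brake_tailgating_fast`. A model that sees only the raw flag cannot tell the two apart.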
Once we create variables capable of discerning these differences, we show how machine learning can supplement multivariate regression to identify a streamlined set of variables that help predict the likelihood of a claim. One model presented in the chapter identifies the riskiest fifth of insured vehicles, which, all other things being equal (such as age, gender, and marital status), is up to 10 times more loss prone than the least risky fifth. We hope readers can pursue results as precise as this by building on techniques such as those presented in the chapter.
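A comparison between the riskiest and least risky fifths can be computed from any model's scores with a simple quintile lift calculation. The Python sketch below is a simplified illustration under stated assumptions. The function name `quintile_lift` is hypothetical, the scores and claim counts are toy numbers rather than data from the chapter, and it measures raw lift without holding age, gender, or marital status constant.

```python
def quintile_lift(scores, claims):
    """Rank vehicles by model score, split them into five equal-sized groups,
    and compare claim frequency in the top group with the bottom group."""
    if len(scores) != len(claims):
        raise ValueError("scores and claims must have the same length")
    n = len(scores)
    if n < 5:
        raise ValueError("need at least five vehicles to form quintiles")

    ranked = sorted(zip(scores, claims), key=lambda pair: pair[0])
    frequencies = []
    for q in range(5):
        group = ranked[q * n // 5:(q + 1) * n // 5]
        frequencies.append(sum(c for _, c in group) / len(group))

    if frequencies[0] == 0:
        raise ValueError("lowest quintile has no claims; lift is undefined")
    return frequencies, frequencies[-1] / frequencies[0]


# Toy portfolio of ten vehicles: predicted risk scores and observed claim counts.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_claims = [1, 0, 0, 1, 1, 0, 1, 2, 1, 3]

frequencies, lift = quintile_lift(toy_scores, toy_claims)
print(frequencies)  # claim frequency per quintile, lowest score first
print(lift)         # top-quintile frequency divided by bottom-quintile frequency
```

In practice this would be computed on held-out data, so that the lift reflects predictive power rather than fit to the training sample.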
UBI is constantly evolving, and new techniques for gaining deeper insights from the data emerge every day. Even so, we hope our work will help advance the conversation and provide a resource new practitioners can use to hit the ground running.