Make the most of your data using machine learning

April 29, 2021 / 3 min read

Businesses have leveraged the power of data for decades, traditionally using rules-based information systems for analysis. But as technology becomes more sophisticated, so does the potential of data. Machine learning is one of the more advanced ways to put that data to work.

Rules-based learning

Credit card fraud detection systems that flag suspicious transactions are one example of rules-based information systems. Using intuitive rules, the company can tell whether a purchase was made in an atypical geographic region for the account holder or whether it is an unusually large transaction. As fraudsters become more creative, companies employ experts with intimate knowledge of intercepting fraud to create additional, esoteric rules for the system's engineers to implement. But it's often not enough: the fraudulent actors remain innovative, leaving the systems a step behind. Quite simply, with rules-based systems, data is ingested and an algorithm written by people is applied to it. In the credit card fraud detection example, that algorithm distinguishes between legitimate and illegitimate transactions.
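
To make the idea concrete, here is a minimal sketch of a rules-based check built on the two rules mentioned above: an atypical location and an unusually large amount. The Transaction fields, the rule functions, and the 5,000 threshold are all hypothetical, chosen only for illustration; a production system would carry many more rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transaction:
    amount: float        # purchase amount in the account's currency
    country: str         # country where the purchase was made
    home_country: str    # account holder's usual country


# Hypothetical threshold, picked for illustration only.
LARGE_AMOUNT = 5_000.00


def is_unusual_location(tx: Transaction) -> bool:
    """Rule 1: the purchase happened outside the holder's usual country."""
    return tx.country != tx.home_country


def is_unusually_large(tx: Transaction) -> bool:
    """Rule 2: the purchase exceeds a fixed, hand-picked amount."""
    return tx.amount > LARGE_AMOUNT


# Every new fraud pattern means an expert adds another function here.
RULES = [is_unusual_location, is_unusually_large]


def flag_transaction(tx: Transaction) -> List[str]:
    """Return the names of every rule the transaction triggers."""
    return [rule.__name__ for rule in RULES if rule(tx)]


suspicious = Transaction(amount=7200.0, country="BR", home_country="US")
routine = Transaction(amount=20.0, country="US", home_country="US")
print(flag_transaction(suspicious))
print(flag_transaction(routine))
```

Note that the rules never change unless a person edits the list, which is why such a system tends to lag behind inventive fraudsters.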

Machine-based learning

That raises the question: how can one keep up with the evolving nature of fraud? Machine learning is one tool that can help. The term refers to a collection of algorithms that discern patterns from data. Rather than relying on hand-written rules, a model can be trained on historical data to infer the rules. When a dataset is refreshed, the model can be retrained, enabling a system to respond to novel threats. Machine learning brings automation to the process and supports deeper analysis than traditional rules-based processes. With additional, more profound insights, it's possible to develop solutions that serve customers better.
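
As a rough illustration of inferring a rule from historical data, the sketch below learns a single amount cutoff from labeled transactions and then relearns it when the data is refreshed. This is about the simplest possible learner (a one-feature decision stump), not the method any particular fraud system uses, and every record in it is invented.

```python
from typing import List, Tuple

# Each record is (transaction_amount, is_fraud). All values are made up.
Record = Tuple[float, bool]


def train_threshold(history: List[Record]) -> float:
    """Pick the amount cutoff that misclassifies the fewest historical records.

    The learned rule is: flag a transaction when amount >= cutoff.
    """
    if not history:
        raise ValueError("need at least one historical record")
    candidates = sorted({amount for amount, _ in history})
    best_cutoff, best_errors = candidates[0], len(history) + 1
    for cutoff in candidates:
        # Count records where the rule's prediction disagrees with the label.
        errors = sum((amount >= cutoff) != is_fraud for amount, is_fraud in history)
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


history = [(40.0, False), (120.0, False), (900.0, False),
           (2500.0, True), (4000.0, True)]
cutoff = train_threshold(history)

# Fraudsters adapt and start making smaller purchases.
# Refreshing the dataset and retraining moves the rule without new hand-written code.
refreshed = history + [(600.0, True), (650.0, True), (700.0, True)]
new_cutoff = train_threshold(refreshed)

print(cutoff, new_cutoff)
```

With this toy data the learned cutoff drops after retraining, because the refreshed records show fraud at lower amounts. The rule follows the data; nobody had to rewrite it by hand.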

Machine learning for cyber insurance

Verisk's forthcoming Cyber Risk Navigator platform, a risk aggregation and exposure management tool, uses machine learning to help estimate the probability of a future cyber incident for an individual risk. Typically, the data points associated with a single organization are firmographic: industry, revenue, and number of employees. But cyber insurance also benefits from outside-in data, such as observations of internet traffic that reveal evidence of possible compromise. Outside-in data can also show whether the software within an organization is outdated or has known vulnerabilities. All this information is condensed into security ratings that indicate the strength of an organization's cyber hygiene. By combining breach data, firmographic data, and security ratings, and by training a machine learning model to identify breaches, we can measure the relative risk of compromise to individual organizations.
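
The sketch below shows, under heavy simplification, what combining the three data sources and expressing relative risk could look like. It is not how Cyber Risk Navigator works: in place of a trained machine learning model it uses a plain breach-frequency table per security rating band, and the organization IDs, fields, and rating bands are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# All records below are invented for illustration.
firmographics = {
    "org_a": {"industry": "retail", "employees": 250},
    "org_b": {"industry": "health", "employees": 1200},
    "org_c": {"industry": "retail", "employees": 80},
    "org_d": {"industry": "finance", "employees": 3000},
    "org_e": {"industry": "health", "employees": 40},
    "org_f": {"industry": "finance", "employees": 500},
}
security_ratings = {  # hypothetical rating bands
    "org_a": "strong", "org_b": "weak", "org_c": "strong",
    "org_d": "weak", "org_e": "weak", "org_f": "strong",
}
breached_orgs = {"org_b", "org_d"}  # organizations with a recorded breach


def build_rows(firm: Dict[str, dict], ratings: Dict[str, str],
               breaches: Set[str]) -> List[dict]:
    """Join the three sources on organization ID into one row per organization."""
    return [
        {"org_id": org_id, **attrs, "rating": ratings[org_id],
         "breached": org_id in breaches}
        for org_id, attrs in firm.items()
        if org_id in ratings  # skip organizations without a rating
    ]


def relative_risk(rows: List[dict]) -> Dict[str, float]:
    """Breach rate of each organization's rating band divided by the overall rate."""
    overall = sum(row["breached"] for row in rows) / len(rows)
    if overall == 0:
        raise ValueError("no breaches in the data; relative risk is undefined")
    counts = defaultdict(lambda: [0, 0])  # rating -> [breaches, organizations]
    for row in rows:
        counts[row["rating"]][0] += row["breached"]
        counts[row["rating"]][1] += 1
    rates = {rating: b / n for rating, (b, n) in counts.items()}
    return {row["org_id"]: rates[row["rating"]] / overall for row in rows}


rows = build_rows(firmographics, security_ratings, breached_orgs)
risk = relative_risk(rows)
print(risk)
```

A value above 1 means the organization sits in a band that was breached more often than the portfolio average in this toy data. A real model would also use the firmographic fields rather than carrying them along unused.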

Granted, most machine learning models involve a trade-off between complexity and interpretability. With a decision tree, it is easy to follow the path to a prediction and see how much weight the tree places on each input variable. More complex models tend to be more accurate, but their choices become exceedingly difficult to explain, which leads users to brand them black boxes.

A dual approach

We took a dual approach with Cyber Risk Navigator: using a sophisticated model but training it with constraints that allow sensitivity testing. The machine learning model lends itself to answering complex, granular questions to help inform risk selection and transfer, pricing, and portfolio management. Cyber Risk Navigator helps users make better risk management decisions and stay ahead of the competition with the industry's most innovative and flexible cyber risk modeling platform.
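
Sensitivity testing can be sketched as varying one input while holding the others fixed and watching how the prediction responds. The example below assumes a hypothetical stand-in model (a hand-set logistic function with invented coefficients and feature names) and assumes the constraint of interest is monotonicity; the article does not specify which constraints are actually applied.

```python
import math
from typing import Callable, Dict, List


def toy_model(features: Dict[str, float]) -> float:
    """Hypothetical stand-in for a trained model; returns a breach probability.

    The coefficients are invented. The negative sign on security_rating
    encodes the assumed constraint: a better rating never raises the risk.
    """
    score = -2.0 + 0.8 * features["log_revenue"] - 1.5 * features["security_rating"]
    return 1.0 / (1.0 + math.exp(-score))


def sensitivity(model: Callable[[Dict[str, float]], float],
                baseline: Dict[str, float],
                feature: str,
                values: List[float]) -> List[float]:
    """Predict at each value of one feature while the others stay at baseline."""
    return [model({**baseline, feature: value}) for value in values]


baseline = {"log_revenue": 2.0, "security_rating": 0.5}
curve = sensitivity(toy_model, baseline, "security_rating",
                    [0.0, 0.25, 0.5, 0.75, 1.0])

# Under the assumed constraint, risk should not rise as the rating improves.
is_monotone = all(a >= b for a, b in zip(curve, curve[1:]))
print([round(p, 3) for p in curve], is_monotone)
```

When a model is constrained this way, a user can sweep an input and trust the direction of the response, even if the model's internals are too complex to read directly.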

We're living in an exciting time: machine learning has matured, and previously intractable business problems are now solvable. To be sure, it's still a challenge to build models that are both transparent and predictive. Reasonable trade-offs must be made depending on the application. We are proponents of newer methods in the field but deliberately select those with tunable parameters that address our stakeholders' concerns.