By Phil Hatfield
Let’s whip up an easy cake. We’ll need some flour, eggs, milk, and sugar. I already have all of those ingredients in my kitchen, so this should be a cinch. I’ll get a bowl, put in some flour, crack open some eggs, and add them to the bowl. How many eggs? I don’t know…whatever I have on hand. Slosh in some milk. Add a handful of sugar, and I’m done.
And the result looks like…a bowl full of ingredients — not very appetizing and not at all like a cake.
What went wrong? Well, we didn’t have much of a plan, it was sloppily executed, there were gaping holes in our process, and we left out several minor, but important, ingredients. Obviously, that’s a silly way to make a cake. But many companies take a similar approach to building a data resource (aka data warehouse) that they expect will provide new business insights and information. Many executives know they’re collecting a lot of data and have a vague sense that they aren’t getting as much benefit from their data as they should. They’ve heard about data warehouses and big data and about combining data to get business insights. So they direct someone on their team to put all their company data together into a single resource. They spend a lot of time, energy, and money buying new hardware and software, cataloging the data, extracting it from existing databases, and putting it into this new single data resource. Maybe they even throw in some of the much-hyped new technologies such as open-source database tools and mechanisms for storage and retrieval of data.
And the result looks like…a box full of data that’s been mixed with miscellaneous bits of technology — not very useful and not at all like business insights.
The technical side of building a single data resource is important but ultimately insufficient by itself if the goal is to create actionable information and business insights from the massive amounts of data collected nowadays. So how can a company produce a fully baked, multilayered “cake” of data rather than a bunch of half-baked ideas? Four steps in the process — often inadequately executed — are governance, integration, analysis, and reporting. Here’s a closer look at governance, a good first step toward ensuring that the outcome, when served, will be actionable. (We’ll examine the other steps in the process in part 2 of this article, “Beyond the Box of Data: Next Steps.”)
In the broadest sense of the word, governance means directing the development and management of the data assets of a company so that the enterprise derives the most benefit from the data in terms of actionable insights.
The end goal of the exercise — actionable insights — needs to stay constantly in mind. An actionable insight is useful information delivered to decision makers in a timely fashion in a form they can readily comprehend. Actionable insights come in many forms — from the very tactical to the very strategic. An example of a tactical decision might be, “Given that we’ve recently seen an unusual number of claims from small manufacturers, should we order a loss control visit for this workers' comp policy?” An example of a strategic decision might be, “Given that two-income couples with more cars than drivers in the household seem to be very profitable, should we shift our marketing focus?”
Who decides the specific requirements of what constitutes an actionable insight? Who designs and builds the system that will successfully deliver actionable insights? Technologists usually don’t have the deep understanding of the business processes or the data content to know what an actionable insight is. On the other hand, businesspeople often don’t understand the available data or technology well enough to know what’s possible, which makes it difficult to specify what they want to get from the data resource. Data owners are familiar with the content and know how to use the data but may not be able to imagine how others would use the data and may suspect that the whole exercise of giving others access to such data will adversely affect operations.
The best outcomes require close and continual collaboration between data owners, developers, and end users. Participants in such a task force must be willing to learn from each other and work toward the common goal in a true spirit of cooperation. Enterprise data management is so important and involves so many parts of an organization that it needs to start with commitment and leadership from the highest levels. The executive sponsor for this type of task force will need to be someone with the authority to marshal the required resources across the organization for the required length of time. Often that will have to be the CEO.
Data quality is another area that needs to be governed at the project level. Data quality will differ depending on the purpose for which the data was originally collected, and data quality standards for the enterprise data resource will differ depending on the ultimate use for which the data is intended. For example, correctly assigning a policy to a rating territory 99 percent of the time may be perfectly adequate for an actuarial study looking at historical data in the aggregate, especially if you can identify and ignore the 1 percent that is of questionable quality. But if you’re assigning rating territories to underwrite policies, misclassifying 1,000 of the 100,000 policies you quoted last month will probably cause your company problems on your next market conduct audit, not to mention business lost to competitors that are able to get it right. Increased data quality has costs. But poor-quality data has costs as well, and the cost of poor-quality data will be different for every application. Each use of the data must be assessed to determine the applicable quality requirements, and an analysis of the cost trade-off must be made.
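To make that cost trade-off concrete, here is a minimal sketch of the arithmetic. The 100,000 quoted policies and the 99 percent accuracy figure come from the example above; the cost per misclassified policy, the cost of improving accuracy, and the target accuracy level are illustrative assumptions, not figures from the article.

```python
# Hypothetical cost trade-off for rating-territory accuracy.
# POLICIES and the 99% baseline follow the article's example; the dollar
# figures and the 99.9% target are assumptions for illustration only.

def misclassification_cost(policies: int, error_rate: float, cost_per_error: float) -> float:
    """Expected monthly cost of misassigned rating territories."""
    return policies * error_rate * cost_per_error

POLICIES = 100_000          # policies quoted last month (from the example)
COST_PER_ERROR = 250.0      # assumed average cost of one misclassified policy
IMPROVEMENT_COST = 150_000  # assumed one-time cost to raise accuracy

cost_at_99 = misclassification_cost(POLICIES, 0.01, COST_PER_ERROR)    # 99% accurate
cost_at_999 = misclassification_cost(POLICIES, 0.001, COST_PER_ERROR)  # 99.9% accurate

monthly_savings = cost_at_99 - cost_at_999
months_to_break_even = IMPROVEMENT_COST / monthly_savings
```

Under these assumed numbers, the improvement pays for itself in under a month for an underwriting use, while the same spend might never pay off for an aggregate actuarial study that can simply exclude the questionable 1 percent. That asymmetry is exactly why the quality requirement must be assessed per use.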
Closely related to the quality and usability of data is the maintenance of metadata. Metadata is a broad term that means information about the content of the data as well as how particular data elements are related to each other in the data resource. If users are to easily evaluate whether the data is fit for a particular purpose, then they must have a metadata resource available that will allow them to understand everything from how the database is structured to information about individual data elements, such as quality measures and code books.
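One lightweight way to picture such a metadata resource is a data dictionary entry per element. The sketch below is a hypothetical structure, not a prescribed design; the field names, the `rating_territory` element, and all of its values are illustrative assumptions chosen to echo the rating-territory example above.

```python
# A minimal data-dictionary record for one data element -- a sketch only.
# Field names, values, and codes are hypothetical illustrations.

from dataclasses import dataclass, field

@dataclass
class DataElement:
    name: str
    description: str
    source_system: str
    quality_note: str                              # e.g., measured accuracy and how it was measured
    code_book: dict = field(default_factory=dict)  # code -> meaning, for coded fields

rating_territory = DataElement(
    name="rating_territory",
    description="Territory code used to rate the policy",
    source_system="policy_admin",
    quality_note="Approx. 99% of records verified against filed territory maps",
    code_book={"001": "Metro core", "002": "Suburban", "003": "Rural"},
)

# A user evaluating fitness for purpose can read the quality note and decode values:
label = rating_territory.code_book.get("002", "unknown")
```

Even this small record lets a prospective user answer the two questions governance cares about: is the element's quality adequate for my purpose, and what do its values actually mean?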
And finally, the issue of data security must be addressed and managed at a high level. Data is one of the most valuable assets a company has. Therefore, it’s in every company’s interest to ensure that its data is used — but not misused — to the maximum benefit of the company. In addition, there are regulatory, contractual, and moral restrictions on acceptable uses of the data. That scrutiny should be even higher in a data-warehousing environment, where data is often reused for purposes beyond those for which it was originally collected.
Once the data has fulfilled the expectations set in governance, it’s ready to be prepared and presented as actionable insights. For how to achieve that, see part 2 of this article — “Beyond the Box of Data: Next Steps.”
Phil Hatfield, J.D., CPCU, leads the Modeling Data Services group for ISO Insurance Programs and Analytic Services, a unit of Verisk Analytics.