Discovering Hidden Insights

In the third of our DOCOInsights series of blogs, we look at the latest thought leadership on frequent pattern mining. DOCOsoft is working with a number of academic partners, including a collaboration with a professor from Cornell University. We recently wrote the abstracts of two papers. The first is titled “Feature selection for reserve prediction in a very large dataset” and the second is called “Discovering hidden insights in insurance claims using frequent pattern mining”.

DOCOsoft was due to present these papers at two conferences in the autumn. However, due to COVID-19, we were unable to present.

Working With Top Tier Academics
The first conference at which we were due to present was the IEEE International Conference on Data Mining (ICDM) in Sorrento, Italy. ICDM is a top-tier conference that has established itself as one of the world’s premier research conferences in data mining. It covers all aspects of data mining, including algorithms, software, systems, and applications.

ICDM draws researchers, application developers, and practitioners from a wide range of data mining-related areas such as big data, deep learning, pattern recognition, statistical and machine learning, databases, data warehousing, data visualization, knowledge-based systems, and high-performance computing.

Employing Algorithms
As we explained in the summary of our work, one of the key problems for insurers is that the bad predictors will outweigh the good predictors. Which predictors are good indicators of how much a claim will cost? Does the broker matter? Does the location matter? A revised version of the full paper is being submitted to the prestigious journal Pattern Recognition, and our results show an improvement over the current state-of-the-art approach.
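
To make the question of which predictors matter more concrete, here is a minimal sketch in Python. It is not the method from the paper: it simply ranks categorical features by eta squared, the share of the variance in claim cost that is explained by grouping on that feature. The claims, field names and values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical claims: field names and values are made up for illustration.
claims = [
    {"broker": "A", "location": "North", "cost": 1000.0},
    {"broker": "B", "location": "North", "cost": 1200.0},
    {"broker": "A", "location": "South", "cost": 5000.0},
    {"broker": "B", "location": "South", "cost": 5200.0},
    {"broker": "A", "location": "North", "cost": 900.0},
    {"broker": "B", "location": "South", "cost": 4800.0},
]


def eta_squared(rows, feature, target="cost"):
    """Share of the target's variance explained by grouping on one feature (0 to 1)."""
    values = [r[target] for r in rows]
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # every claim costs the same, so nothing to explain
    groups = defaultdict(list)
    for r in rows:
        groups[r[feature]].append(r[target])
    # Between-group sum of squares: how far each group mean sits from the overall mean.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


scores = {f: eta_squared(claims, f) for f in ("broker", "location")}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy data, location explains almost all of the variation in cost while broker explains very little, so location ranks first. A real dataset with thousands of candidate predictors would need a more careful approach, which is the subject of the paper.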

Then in the second paper, which was submitted to the International Machine Learning Conference in Belgium, we looked at ways to find frequently occurring patterns. In the paper we describe how, using a workers’ compensation data set, we can employ algorithms to quickly find interesting patterns as well as discount obvious or boring (i.e. of no practical use) patterns.

We can think of a bad outcome for an insured (slow processing, delayed payment, inconsistent service, rejection of a claim, arguing over fees, etc.) and our algorithm can find the characteristics of claims that frequently lead to such outcomes. It can do the same for good outcomes, and it can also uncover inefficient claims processes and fraud. Today, though, we want to focus our attention on improving customer experience rather than on other, more theoretical aspects.
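
The idea of finding characteristics that frequently lead to a bad outcome can be sketched in a few lines of Python. This is an illustration only, not the algorithm from the paper: it counts every combination of up to two attributes among the claims that ended badly and keeps those that meet a minimum support. The claims, the attribute names and the helper functions frequent_patterns and bad_rate are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical claims: a set of attribute=value items plus a "bad outcome" flag.
claims = [
    ({"broker=A", "region=North", "injury=back"}, True),
    ({"broker=A", "region=North", "injury=hand"}, True),
    ({"broker=A", "region=South", "injury=back"}, False),
    ({"broker=B", "region=North", "injury=back"}, False),
    ({"broker=A", "region=North", "injury=back"}, True),
    ({"broker=B", "region=South", "injury=hand"}, False),
]


def frequent_patterns(transactions, min_support, max_size=2):
    """Count every item combination up to max_size; keep those meeting min_support."""
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[combo] += 1
    n = len(transactions)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def bad_rate(pattern):
    """Fraction of all claims containing the pattern that ended badly."""
    matching = [is_bad for items, is_bad in claims if set(pattern) <= items]
    return sum(matching) / len(matching)


bad_claims = [items for items, is_bad in claims if is_bad]
patterns = frequent_patterns(bad_claims, min_support=1.0)
for pattern in sorted(patterns):
    print(pattern, "bad rate across all claims:", bad_rate(pattern))
```

Here the combination of broker A and the North region appears in every bad claim, and every claim with that combination ended badly, whereas each attribute on its own is a weaker signal. Comparing a pattern's frequency among bad outcomes with its frequency overall is one simple way to discount patterns that are common everywhere and therefore of little practical use.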

Improving Processes
Imagine a claim that is taking too long to settle and has been open for two years. What kind of patterns can we observe that have caused those delays or resulted in a complaint? The algorithm will find patterns whose correlation with the outcome is not obvious, or at least does not seem obvious at first glance. It still takes a lot of computer processing power, because it needs to generate all possible combinations.
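
To give a feel for why generating all possible combinations is expensive, and how the search can be cut down, here is a minimal sketch of the classic Apriori approach in Python. We are not claiming this is the technique used in the paper. It relies on one observation: if a combination is rare, every larger combination containing it is rare too, so it never needs to be counted. The transactions and item names are hypothetical.

```python
from itertools import combinations

# Hypothetical claims, each described by a set of characteristics.
transactions = [
    {"late_report", "litigated", "broker=A"},
    {"late_report", "litigated", "broker=B"},
    {"late_report", "litigated", "region=North"},
    {"broker=A", "region=South"},
]


def apriori(transactions, min_count):
    """Level-wise search: only extend itemsets whose subsets are all frequent."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    frequent, candidates_checked = {}, 0
    k = 1
    while current:
        survivors = []
        for cand in current:
            candidates_checked += 1
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_count:
                frequent[cand] = count
                survivors.append(cand)
        k += 1
        # Join step: union pairs of survivors to form candidates of size k.
        joined = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune step: drop any candidate with an infrequent subset of size k - 1.
        current = [
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return frequent, candidates_checked


frequent, checked = apriori(transactions, min_count=3)
n_items = len({i for t in transactions for i in t})
brute_force = 2 ** n_items - 1  # every non-empty combination of the items
print(f"checked {checked} candidates instead of {brute_force}")
```

With six distinct characteristics there are 63 possible non-empty combinations, but the pruned search only has to count 7 of them. The number of combinations doubles with every extra characteristic, so on real claims data with hundreds of attributes this kind of pruning is what makes the search feasible at all.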

So DOCOsoft is trying to make that process quicker, to help our customers be more proactive and get on the front foot with their insureds. In the next blog, we will examine potential practical use cases for frequent pattern mining.