DOCOsoft’s 10-year journey in AI and machine learning

It’s hard to believe how far we’ve come – and how much the world has changed around us – since DOCOsoft’s data science team first began exploring the potential of machine learning, advanced analytics, and AI.

Back then, more than a decade ago, the concept of something like ChatGPT and the whole realm of large language models and Generative AI would have sounded fantastical to most people in the insurance sector. Now we’re on the verge of taking it for granted – although our industry is still in the very early stages of taking full advantage of Gen AI’s undoubted potential.

But now, with a new year just beginning, it seems a good time to look back briefly at some of the landmarks along the way for DOCOsoft and its clients.

Our point of departure

For us, it all began with looking at claims data and working with our clients to pin down exactly which patterns in that data would be most interesting, useful, and ultimately valuable for their claims teams. A key determinant of the practical utility of any claims management system is its ability to put the information decision makers need right at their fingertips when they need it. Visualising that data to maximise its intelligibility is what sets a great CMS apart from a merely adequate one. We recognised from the outset that you can’t deliver this kind of readily accessible intelligence without a deep understanding of how claims people work – and what information they need to achieve peak performance.

Data-driven insights

Over the years, DOCOsoft and its clients have learned together by doing. One of the first things we looked at back in the early 2010s was the average time taken to complete various kinds of claims task. We explored what we could learn by breaking down all the relevant data by, for example, class of business, broker or handler. That was how we first began generating actionable data-driven insights for claims managers and heads of claims. We did a lot of work on visualisations to help claims people detect and understand significant patterns in the data.
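The core of that early analysis was simply grouping completed tasks by a dimension of interest and averaging the durations. A minimal sketch of the idea, using entirely invented field names and numbers:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample of completed claims tasks:
# (class of business, handler, hours taken). Values are illustrative only.
tasks = [
    ("Marine", "Alice", 4.0),
    ("Marine", "Bob", 6.0),
    ("Property", "Alice", 2.0),
    ("Property", "Bob", 3.0),
    ("Marine", "Alice", 5.0),
]

def average_by(tasks, key_index):
    """Average task duration broken down by one dimension of the data."""
    groups = defaultdict(list)
    for row in tasks:
        groups[row[key_index]].append(row[2])
    return {k: mean(v) for k, v in groups.items()}

by_class = average_by(tasks, 0)    # breakdown by class of business
by_handler = average_by(tasks, 1)  # breakdown by handler
```

The same one-liner breakdown extends naturally to broker, claim type, or any other column in the data.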

Being curious about what patterns in data can tell you will often reveal unexpected things. For example, one early insight came about when we discovered that productivity was dipping badly for one client on Thursdays. The reason for this turned out to be a weekly internal meeting lasting several hours. Managers had previously been dismissive when adjusters complained about the time these meetings took up. But the evidence in the data prompted a rethink.

Predictive analytics

From teasing out interesting or significant patterns in the data to gain practical insights, we then branched out into applying machine learning to analyse not simply what has happened, but also to start predicting what will happen. As soon as you can apply algorithms to determine, for example, that processing a particular claim that has just come in is likely to take a certain amount of time, or require particular types of resources or expertise, you can then apply AI to take autonomous or semi-autonomous actions based on that prediction.
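At its simplest, that kind of prediction means fitting a model to historical claims and querying it for new ones. The sketch below fits a least-squares line relating a single hypothetical claim attribute to observed handling time; a real model would use many features and a proper ML library, so treat this purely as an illustration of the principle:

```python
# Fit a simple least-squares line: claim value -> days to process.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Historical claims: value (in thousands) vs. days to process (invented data).
values = [10, 20, 30, 40]
days = [2, 4, 6, 8]

slope, intercept = fit_line(values, days)

# Forecast the handling time for a new claim worth 25 (thousand).
predicted_days = slope * 25 + intercept
```

Once a prediction like `predicted_days` is available at the moment a claim arrives, downstream automation can route, prioritise, or escalate it accordingly.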

Topic modelling

Around 2018 we were doing a lot of work with text analytics and topic modelling, using a highly advanced algorithm based on a Bayesian model to identify particular ‘topics’ – hidden structures within a body of text data that would not normally be apparent to the human eye and which can help drive better claims management decisions. This approach can help with segmenting the data more coherently before applying further levels of data analysis or AI. To take a fairly basic example, if you are looking at a mass of data relating to personal injury claims, this approach can help you move beyond simply identifying claims involving head injury to distinguishing those with potentially life-changing head injuries – which can then be escalated appropriately – from those involving nothing more serious than a black eye or a broken nose. With this level of segmentation, it becomes easier and more instructive to work on teasing out patterns within directly comparable data that can help spot emerging trends sooner and inform better claims and underwriting decision making.
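The post does not name the exact model, but the classic Bayesian topic model is Latent Dirichlet Allocation (LDA), so here is an illustrative sketch using scikit-learn (assumed available) on a handful of invented claim notes:

```python
# Illustrative topic modelling on claim notes with Latent Dirichlet
# Allocation (LDA). The notes, and the choice of LDA itself, are
# assumptions for the sake of the example.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

claim_notes = [
    "severe head injury skull fracture hospital intensive care",
    "head injury skull fracture brain trauma surgery",
    "black eye bruising minor swelling recovered",
    "broken nose bruising minor treatment recovered",
]

counts = CountVectorizer().fit_transform(claim_notes)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # each row is a topic mix summing to 1

# Each note's dominant topic can then drive segmentation, e.g. routing one
# cluster for specialist review.
dominant = doc_topics.argmax(axis=1)
```

With real volumes of text, the discovered topics often correspond to meaningful segments (serious injury vs. minor injury, say) without anyone having defined those categories up front.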

Feature selection

In 2019 DOCOsoft was awarded a three-year Marie Curie Fellowship, which enabled us to work with a group of leading academics on developing an algorithm that outperforms any previously created equivalent at separating out features of interest from within categorical and mixed data sets – making them amenable to advanced data analytics and machine learning while avoiding the impracticably long processing times that come with the inclusion of large amounts of irrelevant or redundant information.

Insurance data sets are a classic example of mixed data – in other words, data that combines both numerical and categorical information. Developing an algorithm with state-of-the-art performance for clustering and classification accuracy, and better-than-state-of-the-art processing times, represented a significant step forward in the field of data analytics and machine learning as applied to the insurance sector. This opened up exciting possibilities in areas like reserve modelling.
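DOCOsoft’s published algorithm is far more sophisticated than anything that fits in a blog snippet, but the general principle of filter-style feature selection on categorical data can be sketched: score each feature by its mutual information with the outcome, and keep only informative ones. All data below is invented:

```python
# Filter-style feature selection: keep categorical features that share
# information with the target. Purely illustrative, not DOCOsoft's method.
from collections import Counter
from math import log2

def mutual_information(feature, target):
    """Mutual information (in bits) between two categorical sequences."""
    n = len(feature)
    pf, pt = Counter(feature), Counter(target)
    joint = Counter(zip(feature, target))
    return sum(
        (c / n) * log2((c / n) / ((pf[f] / n) * (pt[t] / n)))
        for (f, t), c in joint.items()
    )

# Hypothetical claims data: injury type predicts severity; region is noise.
injury_type = ["head", "head", "nose", "nose"]
region = ["north", "south", "north", "south"]
severe = [True, True, False, False]

scores = {
    "injury_type": mutual_information(injury_type, severe),
    "region": mutual_information(region, severe),
}
selected = [f for f, s in scores.items() if s > 0.1]  # informative features only
```

Dropping zero-information columns like `region` before clustering or classification is exactly what keeps processing times practical on large mixed data sets.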

Process mining

Another area in which our data scientists have been active in recent years is process mining. Registering every click made within a claims management system enables us and our clients to extract powerful insights, for example: identifying bottlenecks or uncovering areas where current procedures are generating unintended negative consequences.
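A toy version of the idea: reconstruct from timestamped click events how long claims wait before each process step, then flag the slowest step as the likely bottleneck. Step names and timings here are invented:

```python
# Minimal process-mining sketch: find the bottleneck step from click logs.
from collections import defaultdict

# (claim_id, step, minutes since the claim was opened) -- illustrative only.
clicks = [
    (1, "triage", 0), (1, "approval", 5), (1, "payment", 65),
    (2, "triage", 0), (2, "approval", 4), (2, "payment", 74),
]

by_claim = defaultdict(list)
for claim, step, ts in clicks:
    by_claim[claim].append((ts, step))

durations = defaultdict(list)
for events in by_claim.values():
    events.sort()  # order each claim's events by time
    for (t0, _), (t1, step) in zip(events, events[1:]):
        durations[step].append(t1 - t0)  # wait before reaching `step`

avg_wait = {s: sum(d) / len(d) for s, d in durations.items()}
bottleneck = max(avg_wait, key=avg_wait.get)  # slowest step on average
```

On real event logs the same aggregation, sliced by team or class of business, is what surfaces the unintended consequences mentioned above – such as a Thursday meeting showing up as a weekly productivity dip.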

Automated task-allocation

This kind of process mining has also enabled us to develop ever more powerful task-allocation algorithms using reinforcement learning, whereby parallel algorithms learn from one another. In simple terms, this enables the system to reallocate claims automatically from claims handlers or teams where bottlenecks could start to accumulate to individuals or teams with comparable expertise who are expected to have greater capacity. It can also help us identify claims tasks where relatively expensive inputs like manager approvals are unnecessary, or simply a box-ticking exercise, thus streamlining processes, boosting efficiency, and ultimately settling claims faster.
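The production approach uses reinforcement learning; the sketch below shows only the underlying reallocation idea – moving queued claims from overloaded handlers to colleagues with matching expertise and spare capacity. All names, capacities, and queues are hypothetical:

```python
# Capacity-based reallocation sketch (the RL layer is omitted).
handlers = {
    "Alice": {"expertise": "marine", "capacity": 5,
              "queue": ["c1", "c2", "c3", "c4", "c5", "c6"]},
    "Bob":   {"expertise": "marine", "capacity": 5, "queue": ["c7"]},
    "Cara":  {"expertise": "property", "capacity": 5, "queue": []},
}

def rebalance(handlers):
    """Shift overflow work to same-expertise handlers with free capacity."""
    for name, h in handlers.items():
        while len(h["queue"]) > h["capacity"]:
            # Pick the least-loaded handler with matching expertise and room.
            target = min(
                (t for t in handlers.values()
                 if t["expertise"] == h["expertise"]
                 and len(t["queue"]) < t["capacity"]),
                key=lambda t: len(t["queue"]),
                default=None,
            )
            if target is None:
                break  # nowhere suitable to move work
            target["queue"].append(h["queue"].pop())
    return handlers

rebalance(handlers)
```

A reinforcement-learning layer would replace the simple "least-loaded" rule with a policy learned from how past allocations actually played out.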

Deep learning

Another growing area of activity for us at DOCOsoft recently has been using deep learning to automate the extraction of key data from documents. Back in the bad old days of optical character recognition (OCR), something as trivial as a document scanned at a slight angle, or a minor change in the formatting of a particular form, could throw a major spanner in the works. Now we have developed AI-powered tools that can take any document, in whatever format it may happen to arrive, and still extract all the necessary data.

This is a multilayered approach that begins with layout analysis tools that segment a document and recognise where each element of relevant information can be found. The next layer is analysing individual sections and applying learning from other similar documents to classify them by a range of applicable criteria, prior to directing each claim to the most appropriate processing pathway.

Generative AI

We have also been working extensively in recent years with the emerging technology around large language models and generative AI to optimise and accelerate claims processing and decision-making. The ability Gen AI brings to digest and summarise huge volumes of text data has powerful applications in areas like learning the lessons of past claims-handling experience. There’s a lot more to say about Gen AI, but that’s a huge story in itself and will have to wait for another blog!

Conclusion

We hope this very brief summary of DOCOsoft’s journey into machine learning and AI will give you at least a flavour of how we’ve got to where we are now. If you’d like to know more about any of the topics touched on above – or about how DOCOsoft’s data scientists can help your claims team achieve peak performance – please contact us and we’d be happy to explain in greater detail.
