AI-driven data agility: a case for synthetic data in insurance

Key take-aways

Insurers have at their disposal a new ecosystem of data sources.
Regulations, privacy concerns, and legacy technology impact insurers' agility and competitiveness.
The data privacy and quality guarantees of privacy-preserving technologies make them an opportunity for insurers.
Privacy-preserving synthetic data can be used along the data value lifecycle to train fraud detection models or aggregate customer data for analysis.

⬝⬝⬝

Evolving practices, new demands from consumers, and increasing competitive pressure push insurance players to make the most out of the data they collect. However, data security and privacy constraints, coupled with legacy and siloed systems, limit their ability to do so.

It is crucial for insurers undergoing digital transformation to make privacy a strategic decision if they want to remain agile in using their data. This article makes the case for privacy-preserving synthetic insurance data and shows how it gives insurers a renewed ability to process customer data safely and seamlessly.

The data paradigm in the insurance industry

The insurance industry built itself on the collection of customer data. From KYC data to market surveys and claim histories, data collection is essential for modeling risks, identifying new markets, or reducing fraud incidence.

*Insurance players now have access to a new ecosystem of data sources that adds to the traditional insurance data channels.*

In recent years, new data collection channels brought insurers additional customer data. With IoT devices, health wearables, social web, and other online content, customers produce data usable for predictive scoring and analysis.

For instance, drawing on the data from telematics devices, auto insurers can propose new types of coverage based on customer usage. Metromile's app and mileage tracking device allowed the company to develop a pay-per-mile pricing model for car insurance. The company, which received almost $300 million in funding, has already collected more than 3 billion miles of driving data for predictive modeling.

The industry is also increasingly relying on third parties to provide new datasets or enrich existing ones. From satellite data to support claim processing to demographic data to improve marketing targeting, third-party data has become part of the insurance data paradigm.

For instance, Accenture reported that thanks to third-party data sources such as electronic health records, and advanced rules engines, insurers could underwrite policies with a lower face value ($250,000 and less) without any need for human intervention.

The industry has large volumes of data at its disposal and all the reasons to use it. Whether it is about identifying risks and developing new product lines or optimizing customer experience and acquisition, data is at the core of business development for modern insurers. The data journey is, however, growing in complexity.

A limited ability to generate insights from data

Evolving regulatory and compliance requirements, rising privacy concerns, and legacy technology are causing inertia.

Since 2018, the General Data Protection Regulation has strictly regulated the processing of customer data for insurers and their third parties in Europe. In 2021, the European Commission released a proposal to regulate AI systems that would concern AI-generated social credit scores. These regulations require insurance players to adapt quickly.

The strategic prioritization of compliance and governance frequently comes at the cost of data agility. With customer data protection moving up the risk agenda, processes and evolving norms constrain organizations’ ability to use data.

Legacy systems and fractured data are additionally impeding data access and quality. Data is fractured across the organization, and is sometimes siloed across multiple jurisdictions in the case of multinational corporations.

According to PwC’s 23rd Annual Global CEO Survey, transforming their core technology remains a necessity for CEOs. It is seen as a priority investment for more than half of the insurance companies surveyed.

*PwC 23rd Annual Global CEO Survey | Insurance trends 2020*

This limited ability to leverage data results in lower productivity, missed opportunities in terms of customer experience, and longer time to market for insurers. Ultimately, it reflects negatively on the organization's competitiveness.

Agile and secure data processing is a critical driver in digital transformation. The ability to run predictive and behavioral analysis on customer data opens the doors to a deeper understanding of customers, more accurate forecasting, and better risk management.

Building sustainable agility with privacy-preserving synthetic data

So how can insurers address regulatory, infrastructure, and risk challenges? Implementing privacy-by-design and making data available through technology changes is one way to future-proof data analytics initiatives.

The industry is increasingly adopting an approach where privacy informs strategic decisions within the organization. For example, since 2015, AXA's Data Privacy Advisory panel has been advising on strategic and governance decisions. The company is also funding privacy research that could ultimately lead to redesigning its framework for personal data use.

By securing data storage and processing while enabling insight generation, the different privacy-preserving technologies represent an opportunity for insurers. From homomorphic encryption to federated learning or synthetic data, these technologies provide security mechanisms against privacy breaches and allow insurers to leverage data safely.

In this context, Swiss Re, jointly with the EPFL, is investigating the scalability and flexibility of privacy-preserving computing techniques for mutual sharing of risk data derived from the global data of a network of insurers and reinsurers. The project could enable systematic access to global insurance data.

Another privacy-preserving technology, synthetic data, allows companies to work on an artificially-generated synthetic version of a sensitive data resource. The synthetic data preserves the statistical integrity of the original data but doesn't hold one-to-one relationships with its data points. Privacy-preserving synthetic data represents an accessible, safe, and useful data source for insurers. It specifically provides:

Reduced time-to-data: synthetic data shortens the time needed to access data and improves project agility. During synthetic data projects, it's not uncommon to see data access times drop from a few weeks, because of tedious governance processes, to a few hours.

Preserved data utility: synthetic data provides a high-quality alternative to real-life data and can yield analytical insights of equivalent quality. In research funded by the Society of Actuaries on the use of synthetic driver claims and telematics data, researchers were able to advance risk assessment modeling for usage-based insurance with the same accuracy as with the original dataset.

Minimized privacy risks: synthetic data limits the risk of re-identification and provides a compliant resource for any project without compromising your sensitive data. While many de-identification approaches promise data anonymity, in practice cross-referencing a few data points is often enough to re-identify an individual.
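To make the re-identification point concrete, here is a minimal Python sketch that measures how many records in a "de-identified" table are still unique on a handful of attributes. The table, the column names (`postcode`, `birth_year`, `gender`), and the helper functions are all hypothetical and invented for illustration; they do not come from any insurer's data or from a specific tool.

```python
from collections import Counter

# Hypothetical "de-identified" policyholder table: names were removed,
# but postcode, birth year, and gender were kept for analysis.
records = [
    {"postcode": "8001", "birth_year": 1984, "gender": "F", "claims": 2},
    {"postcode": "8001", "birth_year": 1984, "gender": "F", "claims": 0},
    {"postcode": "8001", "birth_year": 1990, "gender": "M", "claims": 1},
    {"postcode": "8004", "birth_year": 1975, "gender": "M", "claims": 0},
    {"postcode": "8004", "birth_year": 1962, "gender": "F", "claims": 3},
    {"postcode": "8005", "birth_year": 1975, "gender": "M", "claims": 1},
]

# Attributes an outsider could plausibly find elsewhere (an assumption).
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def group_sizes(rows, keys):
    """Count how many rows share each combination of the given attributes."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_fraction(rows, keys):
    """Share of rows whose attribute combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    unique = sum(1 for row in rows if sizes[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


def k_anonymity(rows, keys):
    """Size of the smallest group of rows sharing the same attribute values."""
    return min(group_sizes(rows, keys).values())


if __name__ == "__main__":
    share = unique_fraction(records, QUASI_IDENTIFIERS)
    print(f"Unique on {QUASI_IDENTIFIERS}: {share:.0%}")
    print(f"k-anonymity: {k_anonymity(records, QUASI_IDENTIFIERS)}")
```

Any row that is unique on these attributes could be matched to a named individual by someone holding a second dataset with the same attributes. A check of this kind is only a rough indicator, not a full privacy assessment.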

Privacy-preserving synthetic data for the insurance industry

Privacy-preserving synthetic data offers insurers an answer to some of the data processing challenges the industry faces in a context where regulations and customers' demands are raising the bar for data privacy and security.

In practice in the insurance field, it is possible to produce synthetic versions of most tabular customer data: claim data, sales and churn data or digital user data, as well as market and survey data. It’s an opportunity to:

Identify, develop and test new products that answer customers' needs from data that complies with the strictest privacy and legal frameworks.
Improve the customer journey to increase conversion rates with real-time and secure exchanges of information across departments and jurisdictions.
Increase risk assessment precision in underwriting with accurate statistical insights to model risk or anonymized metadata.
Scale AI and make the most out of cloud technologies with data assets that comply with governance and security requirements.
Strengthen fraud detection systems with large volumes of data to train detection models.
Improve accuracy of claim predictions with data patterns to identify risk group characteristics.

*Privacy-preserving synthetic data represents a new ability to remain agile and generate insights from data without compromising your customers' privacy.*
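As a rough illustration of what producing a synthetic version of tabular data means, the Python sketch below fits a simple statistical model to two numeric columns and samples brand-new rows from it. Everything here is an assumption made for the example: the "real" portfolio is itself simulated, the columns (driver age, annual claim cost) are hypothetical, and the bivariate normal model is a deliberately naive stand-in. It carries no formal privacy guarantee and is not how production privacy-preserving generators work.

```python
import math
import random


def fit_gaussian(rows):
    """Estimate means, standard deviations, and correlation of two columns."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    vx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    vy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    sx, sy = math.sqrt(vx), math.sqrt(vy)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(params, n, rng):
    """Draw n new rows from the fitted bivariate normal distribution."""
    rho = params["rho"]
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mx"] + params["sx"] * z1
        # Mix z1 and z2 so the sampled columns keep the fitted correlation.
        y = params["my"] + params["sy"] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        rows.append((x, y))
    return rows


rng = random.Random(42)

# Hypothetical "real" portfolio: (driver age, annual claim cost).
real = []
for _ in range(5000):
    age = rng.uniform(18, 80)
    claim_cost = 3000 - 25 * age + rng.gauss(0, 300)
    real.append((age, claim_cost))

real_params = fit_gaussian(real)
synthetic = sample_synthetic(real_params, 5000, rng)
synth_params = fit_gaussian(synthetic)

if __name__ == "__main__":
    for name in ("mx", "my", "rho"):
        print(name, round(real_params[name], 3), round(synth_params[name], 3))
    print("rows copied from real data:", len(set(real) & set(synthetic)))
```

The synthetic rows reproduce the averages and the age-to-cost correlation of the source table, while no row is a copy of a source row. The absence of exact copies is not by itself proof of privacy; dedicated privacy evaluation is still required.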

European insurers are already implementing synthetic insurance data. In Switzerland, insurance company Die Mobiliar validated the use of synthetic churn data in the context of data privacy protection, adding a new tool to their digital transformation toolbox. Public authorities in Europe, such as in the UK and Germany, are also investing in synthetic data. And global market research firms are ranking it as a forward-looking privacy technology for the coming years.

While the value of these technologies is demonstrable, rethinking data systems is a complex task for large companies. During the projects carried out so far, we have identified points that will be important for the success and implementation of synthetic data projects.

Implementing technological changes

Undeniably, new methods and technologies represent investments and risks for insurers. Leaders in insurance must approach such projects methodically if they wish to win internal support and be successful.

Insurers must identify objectives, data use-cases, and barriers if they wish to conduct successful changes. Understanding why the data is needed and which limitations hinder its use is essential to pick the right technological approach.

Successful projects are also part of broader digital transformation efforts. It's essential to think of technological changes within a more general strategic direction to connect the different systems (technology, processes, stakeholders). Implementing these technologies within a strategic roadmap, where privacy and agility are priorities for instance, will help project leaders to unlock budgets.

“Insurers are shifting budgets to develop use cases based on AI and to unlock the full potential in data analytics. Investing in the digitization of product management, underwriting, and claims management can help improve an organization’s profitability and efficiency.”
Accelerating the digital transformation | Strategy& by PwC

Teams must validate technological feasibility first before committing to larger-scale projects. The ability to engage in rapid prototyping is essential for insurers: the more agile they are, the faster the time-to-value they can expect. During this phase, it is critical to define success criteria ahead of time to determine which items on the checklist are a priority.

Then insurers must identify internal and external experts to maximize project impact and success. It is crucial to involve a team with the skill sets required to run the project and to communicate its results to all stakeholders across the organization. Project leaders will have to align expectations and objectives across teams and departments. When dealing with privacy-preserving technologies, it's necessary to bring the compliance and security, IT, and AI innovation departments to the table as soon as possible to make sure all organizational requirements are met.

Many elements will contribute to the success and adoption of new technologies. What insurers must keep in mind is the ultimate gain of a renewed ability to process customer data safely and seamlessly.

⬝⬝⬝

The insurance industry is constantly changing. Market evolution and competitive pressure, regulatory updates, and global health and financial crises are all forcing insurance players to adapt. Insurance players are shifting from a data-hoarding paradigm to a data-sharing one.

With the right tools, they will derive actionable insights from their data pool to prevent fraud, improve customer service, enhance predictive capabilities and accelerate knowledge management. With privacy-preserving synthetic data, insurance gets a new ability to remain agile and generate insights from data without compromising customers' privacy.

Read the case studies for synthetic data in insurance from Mobiliar and Provinzial.
