
2 Data to Insights to Decisions

3. In what ways could a predictive analytics model help to address the business problem? For any business problem, there are a number of different analytics

2.2 Assessing Feasibility

Once a set of candidate analytics solutions that address a business problem has been defined, the next task is to evaluate the feasibility of each solution. This involves considering the following questions:

Is the data required by the solution available, or could it be made available?

What is the capacity of the business to utilize the insights that the analytics solution will provide?

The first question addresses data availability. Every analytics solution will have its own set of data requirements, and it is useful, as early as possible, to determine if the business has sufficient data available to meet these requirements. In some cases a lack of appropriate data will simply rule out proposed analytics solutions to a business problem. More likely, the easy availability of data for some solutions might favor them over others.

In general, evaluating the feasibility of an analytics solution in terms of its data requirements involves aligning the following issues with the requirements of the analytics solution:

The key objects in the company’s data model and the data available regarding them. For example, in a bricks-and-mortar retail scenario, the key objects are likely to be customers, products, sales, suppliers, stores, and staff. In an insurance scenario, the key objects are likely to be policy holders, policies, claims, policy applications, investigations, brokers, members, investigators, and payments.

The connections that exist between key objects in the data model. For example, in a banking scenario, is it possible to connect the multiple accounts that a single customer might own? Similarly, in an insurance scenario, is it possible to connect the information from a policy application with the details (e.g., claims, payments, etc.) of the resulting policy itself?

The granularity of the data that the business has available. In a bricks-and-mortar retail scenario, data on sales might only be stored as a total number of sales per product type per day, rather than as individual items sold to individual customers.

The volume of data involved. The amount of data that is available to an analytics project is important because (a) some modern datasets are so large that they can stretch

The second issue affecting the feasibility of an analytics solution is the ability of the business to utilize the insight that the solution provides. If a business is required to drastically revise all their processes to take advantage of the insights that can be garnered from a predictive model, the business may not be ready to do this no matter how good the model is. In many cases the best predictive analytics solutions are those that fit easily into an existing business process.

Based on analysis of the associated data and capacity requirements, the analytics practitioner can assess the feasibility of each predictive analytics solution proposed to address a business problem. This analysis will eliminate some solutions altogether and, for those solutions that appear feasible, will generate a list of the data and capacity required for successful implementation. Those solutions that are deemed feasible should then be presented to the business, and one or more should be selected for implementation.
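The data side of this assessment can be made concrete with a few simple checks. The following is a minimal sketch, not a prescribed procedure: it assumes claims and policies are available as lists of records, and the field names (`policy_id`, `is_fraud`) and the sample values are hypothetical. It measures how many claims can be connected to a policy and how many carry a fraud label, two of the data issues discussed above.

```python
# Hypothetical records; field names and values are illustrative only.
policies = [
    {"policy_id": "P1", "member_id": "M1"},
    {"policy_id": "P2", "member_id": "M2"},
]
claims = [
    {"claim_id": "C1", "policy_id": "P1", "is_fraud": False},
    {"claim_id": "C2", "policy_id": "P2", "is_fraud": True},
    {"claim_id": "C3", "policy_id": "P9", "is_fraud": None},  # no matching policy, no label
    {"claim_id": "C4", "policy_id": "P1", "is_fraud": None},  # linked, but unlabeled
]


def linkage_rate(children, parents, key):
    """Fraction of child records whose key value matches some parent record."""
    if not children:
        return 0.0
    parent_keys = {p[key] for p in parents}
    linked = sum(1 for c in children if c.get(key) in parent_keys)
    return linked / len(children)


def labeled_rate(records, label_field):
    """Fraction of records that carry a non-missing value for label_field."""
    if not records:
        return 0.0
    labeled = sum(1 for r in records if r.get(label_field) is not None)
    return labeled / len(records)


claim_policy_link = linkage_rate(claims, policies, "policy_id")
claim_label_cover = labeled_rate(claims, "is_fraud")

print(f"claims linked to a policy: {claim_policy_link:.0%}")
print(f"claims with a fraud label: {claim_label_cover:.0%}")
```

Low values for either rate would suggest that a solution depending on that connection or label is harder to implement, although what counts as "low" is a judgment that depends on the business and the solution in question.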

As part of the process of agreeing on the solution to pursue, the analytics practitioner must agree with the business, as far as possible, on the goals that will define a successful model implementation. These goals could be specified in terms of the required accuracy of the model and/or the impact of the model on the business.
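A goal stated as a required accuracy can be written down as an explicit check. The sketch below is illustrative only: the function names, the sample labels and predictions, and the threshold of 0.75 are all assumptions made for the example. In practice the agreed goal might instead be a different performance measure or a measure of business impact.

```python
def accuracy(actual, predicted):
    """Proportion of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one labeled example is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def meets_goal(actual, predicted, required_accuracy):
    """True if the model's accuracy reaches the threshold agreed with the business."""
    return accuracy(actual, predicted) >= required_accuracy


# Hypothetical held-out claims: True means fraudulent.
actual = [True, False, False, True, False, False, False, True, False, False]
predicted = [True, False, False, False, False, False, True, True, False, False]

model_accuracy = accuracy(actual, predicted)
print(model_accuracy)
print(meets_goal(actual, predicted, required_accuracy=0.75))
```

Writing the goal as a function of held-out data makes it unambiguous whether a delivered model satisfies what was agreed.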

2.2.1 Case Study: Motor Insurance Fraud

Returning to the motor insurance fraud detection case study, below we evaluate the feasibility of each proposed analytics solution in terms of data and business capacity requirements.

[Claim prediction] Data Requirements: This solution would require that a large collection of historical claims marked as fraudulent and non-fraudulent exist. Similarly, the details of each claim, the related policy, and the related claimant would need to be available. Capacity Requirements: Given that the insurance company already has a claims investigation team, the main requirements would be that a mechanism could be put in place to inform claims investigators that some claims were prioritized above others. This would also require that information about claims become available in a suitably timely manner so that the claims investigation process would not be delayed by the model.

[Member prediction] Data Requirements: This solution would not only require that a large collection of claims labeled as either fraudulent or non-fraudulent exist with all relevant details, but also that all claims and policies can be connected to an identifiable member. It would also require that any changes to a policy are recorded and available historically. Capacity Requirements: This solution first assumes that it is possible to run a process every quarter that performs an analysis of the behavior of each customer. More challenging, there is the assumption that the company has the capacity to contact members based on this analysis and can design a way to discuss this issue with customers highlighted as likely to commit fraud without damaging the customer relationship so badly as to lose the customer. Finally, there are possibly legal restrictions associated with making this kind of contact.

[Application prediction] Data Requirements: Again, a historical collection of claims marked as fraudulent or non-fraudulent along with all relevant details would be required. It would also be necessary to be able to connect these claims back to the policies to which they belong and to the application details provided when the member first applied. It is likely that the data required for this solution would stretch back over many years as the time between making a policy application and making a claim could cover decades. Capacity Requirements: The challenge in this case would be to integrate the automated application assessment process into whatever application approval process currently exists within the company.

[Payment prediction] Data Requirements: This solution would require the full details of policies and claims as well as data on the original amount specified in a claim and the amount ultimately paid out. Capacity Requirements: Again, this solution assumes that the company has the potential to run this model in a timely fashion whenever new claims arise and also has the capacity to make offers to claimants. This assumes the existence of a customer contact center or something similar.

For the purposes of the case study, we assume that after the feasibility review, it was decided to proceed with the claim prediction solution, in which a model will be built that can predict the likelihood that an insurance claim is fraudulent.
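The capacity requirement for the claim prediction solution, a mechanism for telling investigators which claims to look at first, can be sketched as a simple ranking. This is only an illustration under stated assumptions: the `ScoredClaim` type, the `prioritize` function, the scores, and the investigator capacity of two claims are hypothetical, and the likelihoods stand in for the output of a model that has not yet been built.

```python
from typing import NamedTuple


class ScoredClaim(NamedTuple):
    claim_id: str
    fraud_likelihood: float  # assumed model output in the range [0, 1]


def prioritize(claims, capacity):
    """Return up to `capacity` claims, ordered from most to least likely fraudulent."""
    ranked = sorted(claims, key=lambda c: c.fraud_likelihood, reverse=True)
    return ranked[:capacity]


# Hypothetical scores for four newly arrived claims.
scored = [
    ScoredClaim("C1", 0.12),
    ScoredClaim("C2", 0.87),
    ScoredClaim("C3", 0.45),
    ScoredClaim("C4", 0.91),
]

# Assume the investigation team can review two claims in this batch.
queue = prioritize(scored, capacity=2)
print([c.claim_id for c in queue])  # ['C4', 'C2']
```

Ranking by likelihood rather than applying a fixed cutoff lets the number of flagged claims follow the capacity of the investigation team, which is one way such a model could fit into the existing process.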