10 Checking and Correcting

55 Recovery/cleansing project

Example: A large organization that uses office space it does not own might find that it is charged on the basis of floor space and that the floor space charged for is not the same as the floor space actually available. Measuring the space might give a basis for getting a lower rent and a refund of past over-charged amounts.

RECOVERY/CLEANSING PROJECTS are relevant to most large-scale processes, where even small percentage errors can have high value, and to processes at any scale that are known to be unreliable.

* * *

Even tiny flaws in the control system of a large-scale information process can be costly. Something powerful needs to be done and often pays off handsomely.

Recovery/cleansing projects involve checking large quantities of data, typically using software ‘power tools’, to find what may be faults, then checking through them, leading to identification and correction of specific errors and other problems. This in turn leads to higher data quality and greater efficiency in future, ideas for improvements to controls, systems, and processes to prevent recurrence of the errors, and recovery of money from other parties, e.g. customers and suppliers.

Recovery projects are often justified purely in terms of the financial recovery benefits, but these are not the only or even the main benefits to be gained. People who find themselves affected by recovery projects usually feel that improvements to prevent further problems are important and they want reassurance and evidence that these are part of the project.

One way to prevent a recovery project from starting is to demand proof in advance that it will be worthwhile. Usually nobody knows exactly what will be found or how much work will be required until the project is done.

It is better to consider all processes guilty until proven innocent and go ahead with a project that gives good information early on and keeps risks low. For example, if you involve external experts in the work they will often be used to working for a percentage of the money saved or recovered. If they find nothing then their client pays nothing, apart from the minor inconvenience of supplying data downloads and answering questions.

This problem of not knowing if, or where, the problems lie is fundamental in recovery projects and solving it is at the heart of this control pattern.

A  rst step is to ask people in a position to know to help make a list of error types either known to exist or suspected. Usually there will be quite a list of things that are already being worked on in some way or are currently concerns. Each of these should be subjectively rated to capture their importance and the uncertainty around them at the outset. This should be done by, in effect, estimating a probability distribution over extent of loss. For example, you could ask for the chance that the

annual loss is greater than £10,000, that it is greater than £100,000, and greater than £1,000,000.

Do not ask for an estimate of the annual losses from each issue because people do not know the amounts, usually, and this question will kill the project. It pushes people to pretend they know more than they do and it removes from consideration loss levels other than the best guesses provided.

It is quite possible to rate the losses in every individual possible location as being, at a best guess, too low to bother with even though there are some worthwhile losses to go for. Some loss levels will actually be lower than the best guess while others will be higher. Work in the areas of higher losses may be very worthwhile.
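As a minimal sketch of how such answers might be recorded, the Python below converts elicited exceedance chances into probabilities for loss bands and then computes a deliberately conservative floor on the expected annual loss. The thresholds come from the example above, but the probabilities, the function names `loss_bands` and `expected_loss_floor`, and the idea of placing each loss at the bottom of its band are assumptions made purely for illustration.

```python
# Hypothetical elicited answers for one suspected issue: the chance that the
# annual loss exceeds each threshold (in pounds). The probabilities are invented.
exceedance = {10_000: 0.40, 100_000: 0.15, 1_000_000: 0.02}


def loss_bands(exceedance):
    """Turn P(loss > threshold) answers into probabilities for loss bands."""
    thresholds = sorted(exceedance)
    probs = [exceedance[t] for t in thresholds]
    if any(not 0 <= p <= 1 for p in probs):
        raise ValueError("each probability must lie between 0 and 1")
    if any(later > earlier for earlier, later in zip(probs, probs[1:])):
        raise ValueError("chance of exceeding a higher threshold cannot be larger")
    bands = []
    lower, p_above_lower = 0, 1.0
    for threshold, p_above in zip(thresholds, probs):
        # Probability that the loss falls between `lower` and `threshold`.
        bands.append((lower, threshold, p_above_lower - p_above))
        lower, p_above_lower = threshold, p_above
    bands.append((lower, None, p_above_lower))  # open-ended top band
    return bands


def expected_loss_floor(bands):
    """Conservative floor: treat every loss as sitting at the bottom of its band."""
    return sum(low * p for low, _high, p in bands)


bands = loss_bands(exceedance)
for low, high, p in bands:
    top = "or more" if high is None else f"to £{high:,}"
    print(f"£{low:,} {top}: {p:.0%}")
print(f"Expected annual loss is at least £{expected_loss_floor(bands):,.0f}")
```

With these invented figures the most likely band is below £10,000, which might look too low to bother with, yet the floor on the expected annual loss is about £35,500 because of the small chances of much larger losses.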

The trick is to find those special areas as quickly and economically as possible.

Asking about known issues is an important step but should not be the only basis of initial plans. There will probably be other issues not yet known about. Look at risk factors in order to guide the hunt for those other issues.

Tabulate possible loss areas and consider each one. Rate each area on relevant risk factors. The choice of risk factors is judgemental because I’m not aware of published research on what is typically most useful. I suggest considering the following:

Overall value: It helps to think in terms of the overall value affected by the process/data and the percentage rate of errors. Then multiply them.

Extent of automation: Steps with a lot of human input tend to give high levels of errors, varied errors, and concessions to other parties made to preserve good relationships. Steps that are fully automated tend to be correct except when they are systematically and invisibly wrong. Automated steps are often easier to sort out but there is less certainty of finding at least some errors.

Extent of previous recovery projects and other investigations and reviews: The prospects may be lower with processes that have already been studied in detail and improved.

Evidence from previous projects and reviews: On the other hand, previous reviews may give evidence of remaining problems.

History of management and ownership: Business units and systems that have been low priority for a long period, or have changed ownership once, twice, or more times in the last decade or so tend to have more problems.

Complexity of task: If the process/system has to do something that is inherently complicated, under time pressure, and perhaps also despite poor cooperation by other parties, then there are likely to be more issues.

Age: Very young and very old processes/systems are more likely to have issues. Very young processes/systems that have been created by a project that was out of control are particularly hot prospects for recovery work.

Extent of known issues: Usually if more issues are known about that suggests there are yet more to discover, but not always.

These ratings need to be summarized and combined in some suitable way. Awarding points and adding them together is often as good as any other way and tends to be more consistent than unaided judgement as well as taking into account all the factors every time.
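A points-and-adding scheme of this kind could be sketched as follows. The loss areas, the factor names, the 0 to 3 scale, and the scores are all hypothetical. The only design point taken from the text is that every factor is considered for every area, which the code enforces by refusing to total an incomplete rating.

```python
# Hypothetical risk factors, each rated 0 (low risk) to 3 (high risk).
FACTORS = [
    "overall_value",
    "human_input",
    "not_previously_reviewed",
    "ownership_changes",
    "complexity",
    "age_extreme",
    "known_issues",
]

# Invented ratings for three invented loss areas.
ratings = {
    "customer billing": {
        "overall_value": 3, "human_input": 2, "not_previously_reviewed": 2,
        "ownership_changes": 1, "complexity": 3, "age_extreme": 1, "known_issues": 2,
    },
    "supplier payments": {
        "overall_value": 2, "human_input": 3, "not_previously_reviewed": 1,
        "ownership_changes": 0, "complexity": 1, "age_extreme": 0, "known_issues": 1,
    },
    "property charges": {
        "overall_value": 1, "human_input": 1, "not_previously_reviewed": 3,
        "ownership_changes": 2, "complexity": 1, "age_extreme": 2, "known_issues": 0,
    },
}


def total_points(area_ratings, factors=FACTORS):
    """Add up the points for one area, insisting that every factor was rated."""
    missing = [f for f in factors if f not in area_ratings]
    if missing:
        raise KeyError(f"unrated factors: {missing}")
    return sum(area_ratings[f] for f in factors)


# Rank areas from most to least promising for recovery work.
ranked = sorted(ratings, key=lambda area: total_points(ratings[area]), reverse=True)
for area in ranked:
    print(f"{area}: {total_points(ratings[area])} points")
```

Equal weights are used here only for simplicity. Weighting some factors more heavily is an obvious variation.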

A more sophisticated alternative if you are confident with mathematics is to represent the probability density of different levels of percentage error using a beta distribution² and update its parameters in light of evidence as it is obtained.

An advantage of this is that it is easy to draw graphs that show the relative chances of different levels of loss. This keeps people aware of the possibility (even when slim) of high levels of loss being present.
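The sketch below illustrates the beta distribution approach using only the Python standard library. It relies on the standard result that a Beta(a, b) belief about an error rate, updated with a sample containing some true errors and some correct items, becomes Beta(a + errors, b + correct). The starting parameters, the sample figures, and the function names are assumptions chosen for illustration, and the tail probability is found by simple numerical integration rather than by a statistics library.

```python
import math


def beta_update(a, b, errors, correct):
    """Update Beta(a, b) with a sample of `errors` wrong and `correct` right items."""
    return a + errors, b + correct


def beta_mean(a, b):
    """Mean error rate implied by Beta(a, b)."""
    return a / (a + b)


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def prob_rate_exceeds(threshold, a, b, steps=20_000):
    """Chance that the error rate is above `threshold`, by midpoint-rule integration."""
    width = (1 - threshold) / steps
    total = sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(steps))
    return total * width


# Hypothetical starting belief: Beta(1, 19), i.e. a mean error rate of 5%.
prior = (1, 19)
# Hypothetical evidence: 200 items fully followed up, 3 found to be true errors.
posterior = beta_update(*prior, errors=3, correct=197)

for label, (a, b) in (("before sample", prior), ("after sample", posterior)):
    print(f"{label}: mean rate {beta_mean(a, b):.2%}, "
          f"chance rate exceeds 5%: {prob_rate_exceeds(0.05, a, b):.1%}")
```

Multiplying the error rate by the overall value affected converts these figures into levels of loss, and the chance of a high rate remains visible however small it becomes.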

Having analysed the total area to be covered by the recovery project, and made an initial assessment of the extent of issues that might be found in each sub-area, the next phase is to begin checking data in such a way that more is learned about the true level of problems in each sub-area, as well as perhaps identifying some errors.

The three main techniques for this are as follows:

Searches for anomalies and discrepancies: Identifying what appear to be incorrect data is not the same as finding all the errors. There will be errors that are not anomalous and do not create discrepancies. There will also be many items that appear to be incorrect but have legitimate but unexpected explanations and in fact are correct. Typically fewer than half of the items that at first appear wrong actually are, and often it is much less than half. If searches are hard to do they can be run on a sample only. This work allows the probability distributions of loss to be revised judgementally but does not allow a point estimate because of the uncertainty over how much that is suspicious will turn out to be wrong.

Testing samples: A more reliable guide to the actual error rate is to select a sample of items, run queries to find apparent errors, and follow them up fully to find the explanation and identify any true errors and their financial and other consequences. The extra follow-up work is time-consuming but helpful, and can even be extended to trying to recover money from other parties (if appropriate). This technique also allows the probability distributions of loss (and possible recovery) to be revised but also does not allow a point estimate because of the sampling uncertainty. If beta distributions are used they can be updated very easily from sample information. Estimates of the work needed to process suspect items can be made but of course, over time, efficiency will improve so do not be put off by work that seems very slow initially.

Attempting reconciliations in total: Sometimes it is possible to work out the total loss by comparing the value of data at one point with the value at some earlier, more reliable point. For example, you might know the goods sold and how much they should have been billed for and compare that with the total of bills actually raised. Sometimes this is the easiest and hardest-hitting estimate of all. However, it is easy to misunderstand such reconciliations and to make mistakes in them, so again the probability distributions of loss should be revised but not reduced to a point estimate.
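The billing example under reconciliations in total could be sketched like this. The goods, quantities, prices, and bills are invented, and the function names are hypothetical. `Decimal` is used so that money totals are exact.

```python
from decimal import Decimal

# Hypothetical data: goods sold (from a source assumed to be the more reliable
# point), the price list, and the bills actually raised.
goods_sold = [("widget", 120), ("gadget", 45), ("service plan", 30)]
price_list = {
    "widget": Decimal("9.50"),
    "gadget": Decimal("50.00"),
    "service plan": Decimal("25.00"),
}
bills_raised = [Decimal("1140.00"), Decimal("2150.00"), Decimal("600.00")]


def expected_billing(goods_sold, price_list):
    """What should have been billed, given quantities sold and list prices."""
    return sum((price_list[item] * qty for item, qty in goods_sold), Decimal("0"))


def reconcile(goods_sold, price_list, bills_raised):
    """Return (expected total, actual total, unexplained gap)."""
    expected = expected_billing(goods_sold, price_list)
    actual = sum(bills_raised, Decimal("0"))
    return expected, actual, expected - actual


expected, actual, gap = reconcile(goods_sold, price_list, bills_raised)
print(f"Should have been billed: £{expected}")
print(f"Actually billed:         £{actual}")
print(f"Unexplained gap:         £{gap}")
```

The gap is evidence for revising the loss distribution rather than a final figure, since discounts, timing differences, or a misunderstanding of the data could explain some or all of it.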

Exactly which errors are searched for and where may be guided by known issues and by knowledge of where the process is complex, manual, and poorly controlled.

The remainder of the project consists of increments of work after which estimates of losses, potential recoveries, and work to achieve them are revised and decisions taken.

Gradually the initially wide scope will narrow to a few areas of highest net reward.

Throughout it is important to estimate open-mindedly and to report progress without suppressing uncertainty. Always remember that apparent errors usually are not errors at all, and that a genuine error that should give rise to recovering money often will not. For example, some ‘customers’ who have been receiving services without being charged may defect when asked to pay and may then dodge outstanding bills.

In summary, a good project can claw back money, prevent future losses, and lead to better systems and controls. Many previous projects have paid for themselves many times over and in some areas specialist consultants offer services that are priced as a percentage of recoveries. Therefore:

Unless a process is known to be close to perfect, consider a project to search hard with computer tools for errors whose correction might lead to benefits. Develop an improving capability for this kind of project.

* * *

These projects rely heavily on DISCREPANCY SEARCHING, ANOMALY SEARCHING, MASS CORRECTION TOOL use, and ERROR FILE REDUCTION BY CLUSTER. They may also use information from END-TO-END RECONCILIATION.