
White Paper: FSA Data Audit


Background

In most insurers the internal model will consume information from a wide range of technology platforms. The prohibitive cost of formally integrating these platforms means that, inevitably, a significant proportion of the data feeding internal models is held in, passes through, or is manipulated by End User Computing (EUC) applications (such as Microsoft Excel and Access) or similar files such as .CSVs. The integrity of these processes is a specific focus for FSA Data Audits.

The FSA has developed a review tool to help assess whether a firm’s data management complies with the standards set out in the Solvency II Directive for the purposes of internal model approval. This tool will be used as part of the FSA’s Internal Model Approval Process (IMAP). It informs firms of what they might do in order to satisfy the standards set out in the Directive and is also based on what is expected to be required by the delegated acts (formerly referred to as the Level 2 implementing measures).

The scope of the review is all data (internal and external) that could materially impact the Internal Model. After conducting the review the firm is expected to provide summary findings to the FSA and be ready to provide the evidence that formed the basis of the conclusions.

The review schedule has five sections as follows:

1. The approach (i.e. matters of policy) to managing data

2. The level of oversight of the implementation of the data policy

3. The level of understanding of the data used in the internal model

4. Data issues that may undermine the integrity of the internal model

5. Unreliable processes that may undermine the integrity of the model

The areas of potential risk and expected controls in each of these sections are documented in detail below, alongside the capabilities of the ClusterSeven solution to meet these control requirements – where they are applicable to End User Computing applications.

Risk 1: The approach to managing data for use in the internal model does not ensure consistency in quality and application of the internal model

Control objective: To ensure that data quality is maintained throughout the process of the internal model as required by Solvency II.

Expected control requirement | ClusterSeven support for control

A data policy has been established and implemented. The policy, its associated procedures, and standards include:

Central monitoring and associated reporting provide robust evidence of implementation. Flexible control models enable data policies to be efficiently ‘tuned’ for different data streams.

Data management is the process of collecting data from disparate sources and combining it in a way that the [internal] model can use. Data operations: any point in the system where you ‘do something’ with data, e.g.:

- interpretation (e.g. when a user takes a freeform field and interprets it as a structured one)
- formatting (e.g. changing a date format from 3.1.2010 to 1/3/2010)
- alteration (e.g. data cleaning)
- joining (e.g. contractual data such as annuity policies with observational data such as mortality tables)
- restructuring (e.g. Excel Pivot function)
- aggregation (e.g. SUMIF function in Excel)
- extraction (e.g. a SQL query downloading a CSV file)
- derivation (e.g. matrix multiplication)
- merging (e.g. cut and paste from different sources)
- translation (a term which has a meaning for one system is translated into a different term that has the same meaning for another system).

FSA Spreadsheets and Solvency II July 2010
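To make these categories concrete, the short sketch below illustrates a few of the operations listed above (formatting, joining and aggregation) in pandas. It is purely illustrative and not part of the FSA material or the ClusterSeven product; the data, column names and mortality rates are invented.

```python
# Illustrative only: a few of the 'data operations' above, expressed in pandas.
# The data, column names and rates are hypothetical examples, not real inputs.
import pandas as pd

# Contractual data (e.g. annuity policies) with a free-text date field.
policies = pd.DataFrame({
    "policy_id": ["P001", "P002", "P003"],
    "birth_date": ["3.1.2010", "15.6.1955", "28.2.1948"],  # day.month.year text
    "annual_payment": [1200.0, 8400.0, 5100.0],
    "age_band": ["60-64", "65-69", "70-74"],
})

# Formatting: reinterpret the free-text date as a proper date type.
policies["birth_date"] = pd.to_datetime(policies["birth_date"], format="%d.%m.%Y")

# Observational data (e.g. a simplified mortality table keyed by age band).
mortality = pd.DataFrame({
    "age_band": ["60-64", "65-69", "70-74"],
    "qx": [0.008, 0.013, 0.021],  # illustrative mortality rates
})

# Joining: contractual data combined with observational data.
joined = policies.merge(mortality, on="age_band", how="left")

# Aggregation: total payments per age band (the pandas analogue of Excel SUMIF).
summary = joined.groupby("age_band", as_index=False)["annual_payment"].sum()
print(summary)
```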


− a definition of the different data sets that are to be covered by the policy;

Full inventory reports available on demand

− a definition of materiality (which is aligned to the firm’s risk appetite where appropriate e.g. when an expert judgement is made to adjust for insufficient observational data);

Materiality of an EUC may be defined by users as a tag within the EUC or by independent technology-based assessment. This metadata can drive the application of different control processes. Expert judgement amendments are automatically caught as manual changes and reported.

− the respective ownership and responsibility for the data sets, including the system of governance and assurance over data quality;

Client may establish multiple ‘attributes’ for an EUC to allow for such roles as ownership, testing, auditing, alerting, approval. All such attributes can be reported across the EUC inventory, including the exposure of gaps such as ‘non-existent’ owners.

− a definition of the standards for maintaining and assessing quality of data, including specific qualitative and quantitative standards for the data sets, based on the criteria of accuracy, completeness and appropriateness;

Flexible controls allow a wide range of qualitative and quantitative standards to be automatically applied to any chosen data set.

− the use of assumptions made in the collection, processing and application of data;

Data and metadata assumptions held in a spreadsheet can be routinely checked for timeliness and consistency. Changes to assumptions can be automatically notified.

− the process for carrying out data updates to the internal model, including the frequency of regular updates and the circumstances that trigger additional updates and recalculations of the probability distribution forecast;

Where spreadsheet-based updates to the internal model require a specific set of actions to be completed in a specific order, these processes can be automatically logged and checked. Anomalies to the expected process can be automatically notified.

− a high level description of the risk and impact assessment process including the frequency with which the assessment process is conducted, and;

The Risk of a spreadsheet can be automatically assessed against client-defined rules. Users can extend the documentation of the Risk status by tags inserted into the spreadsheet. These metadata items can automatically drive the application of different control processes.

− the frequency of the review of the data policy, associated procedures, and standards.

The frequency and completeness of all spreadsheet review processes are automatically monitored and may be reported as required.
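To illustrate how quantitative standards of the kind described in this table (accuracy, completeness and appropriateness checks applied automatically to a chosen data set) might look in practice, here is a minimal sketch over a CSV extract. It is an assumption-laden example rather than the ClusterSeven implementation; the file layout, column names and bounds are hypothetical.

```python
# Minimal sketch of automated data-quality checks against a policy-style standard.
# File name, column names and thresholds are hypothetical assumptions.
import csv

REQUIRED_COLUMNS = {"policy_id", "sum_assured", "inception_date"}
SUM_ASSURED_RANGE = (0.0, 50_000_000.0)  # assumed plausibility bounds

def check_extract(path: str) -> list[str]:
    """Return a list of data-quality findings for the given CSV extract."""
    findings = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            findings.append(f"completeness: missing columns {sorted(missing)}")
        for line_no, row in enumerate(reader, start=2):
            # Completeness: no blank mandatory fields.
            for column in REQUIRED_COLUMNS & set(row):
                if not row[column].strip():
                    findings.append(f"row {line_no}: blank value in '{column}'")
            # Accuracy/appropriateness: numeric field within plausible bounds.
            try:
                value = float(row.get("sum_assured", ""))
            except ValueError:
                findings.append(f"row {line_no}: sum_assured is not numeric")
                continue
            low, high = SUM_ASSURED_RANGE
            if not (low <= value <= high):
                findings.append(f"row {line_no}: sum_assured {value} outside bounds")
    return findings

# Example use: findings would be reported to the data owner defined in the policy.
# print(check_extract("policy_extract.csv"))
```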

I still see audit reports or project plans that recommend replacing spreadsheets and manual processes with an ‘IT solution’. This will never happen:

- It is impractical to replace 2 or more fragmented systems with a single system.
- Replacing the spreadsheet operations with ‘IT designed’ ones only compounds the problem and removes any ability of users to address problems.

The only ‘solution’ is to eliminate the worst processes, and to apply appropriate controls to the ones that remain.

FSA Spreadsheets and Solvency II July 2010


Risk 2: Inadequate oversight of the development and implementation of the data policy increases the risk of poorly informed decision-making and non-compliance with the required quality and standards

Control Objective 2.1: To set the tone and provide appropriate oversight of the implementation of the data policy necessary for sound decision making

Control Objective 2.2: To ensure appropriate and timely reporting to support required governance and management decision making process and timely detection of issues

Expected control requirement | ClusterSeven support for control

The data governance structures and processes are operating as defined in the data policy and associated procedures and are effective in:

Central monitoring and associated reporting provide robust evidence of processes operating as per policy. Flexible control models enable data policies to be efficiently ‘tuned’ for different data streams.

− providing appropriate oversight in the application of the data policy

Summary reports are available to meet all stakeholder needs including Executive, Management, Risk, Compliance and IT.

− ensuring that the data policy, associated procedures, and standards including the responsibilities and accountabilities of the various stakeholders across the firm, the quantity and quality of data metrics reported to management, the data directory, and the risk and impact assessment are kept under regular review;

Summary reports are available to highlight the frequency of operational issues that will drive escalation for the improvement of underlying business processes.

− ensuring appropriate assurance is carried out and received for validating the quality of data used in the internal model.

Where required the client can require that anomalies to data or processes are ‘signed off’/approved before running the internal model

Data quality metrics (qualitative and quantitative) defined in the data policy are reported (individually, aggregated or categorised) to appropriate levels of management on a regular basis to enable them to assess the quality of data and take remedial action when there are material issues

A wide range of data checks are available and may be reported in detail or summary form to designated management. These checks include the presence (or absence) of change, tolerance levels (max, min, percentage) and trends over time.

The system of reporting should include a deficiency management process whereby exceptions identified as a result of data quality checks and controls, which could have a material impact on the internal model, are escalated to appropriate levels of management and actions taken to address them on a timely basis.

Where required the client can require that anomalies to data or processes are ‘signed off’/approved before running the internal model.
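The tolerance-level reporting and deficiency-escalation controls described in the rows above can be pictured with a short sketch that compares the current period’s metrics with the prior period’s against maximum, minimum and percentage-change limits, and collects exceptions that would require sign-off. The metric names and limits are invented for illustration and are not taken from the FSA schedule or the ClusterSeven product.

```python
# Sketch of tolerance checks (max, min, percentage change) feeding an escalation list.
# Metric names and limits are hypothetical.
PREVIOUS = {"gross_written_premium": 125.0, "claims_incurred": 80.0}   # prior period
CURRENT = {"gross_written_premium": 162.0, "claims_incurred": 79.5}    # this period

LIMITS = {
    # metric: (minimum, maximum, max allowed percentage change vs prior period)
    "gross_written_premium": (50.0, 200.0, 20.0),
    "claims_incurred": (0.0, 150.0, 25.0),
}

def tolerance_exceptions(previous, current, limits):
    """Return exceptions that breach absolute bounds or period-on-period movement."""
    exceptions = []
    for metric, (minimum, maximum, max_pct_change) in limits.items():
        value = current[metric]
        if not (minimum <= value <= maximum):
            exceptions.append(f"{metric}: {value} outside [{minimum}, {maximum}]")
        prior = previous[metric]
        if prior:
            pct_change = abs(value - prior) / abs(prior) * 100.0
            if pct_change > max_pct_change:
                exceptions.append(
                    f"{metric}: moved {pct_change:.1f}% vs prior period "
                    f"(limit {max_pct_change}%), requires sign-off before model run"
                )
    return exceptions

print(tolerance_exceptions(PREVIOUS, CURRENT, LIMITS))
```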

“In many firms, spreadsheets provide a key area of risk, because they are typically not owned by IT, but by other business or control areas, such as the actuarial function. They may not be subject to the same general IT controls as the firm’s formal IT systems (e.g. change controls, disaster recovery planning, security etc) and firms need to develop a control system around this.”

FSA Solvency II: IMAP, Thematic Review findings, February 2011

Controls we expect to see:

- Audit trail (tamper-proof record of changes)
- Version control + backup
- Segregation of duties
- Code checking / code reading
- Testing
- Maintainability

FSA 2007



Risk 3: Lack of a clear understanding of the data used in the internal model, and of its impact and vulnerabilities, can create gaps in ownership and control

Control Objective: To ensure that data used in the internal model, its impact and vulnerabilities have been clearly identified and maintained.

Expected control requirement | ClusterSeven support for control

A directory of all data used in the internal model has been compiled specifying source, usage and characteristics including:

ClusterSeven automatically compiles an inventory of files linked to the feeds of the internal model (presented in ‘spider’ diagrams and associated reports) and captures and reports associated metadata. This metadata may be enhanced by defining additional spreadsheet attributes that are automatically compiled for reporting as required.

− storage (e.g. location, multiple copies) across the data flow to internal model

Spreadsheet locations, historical versions and version copies can be automatically identified.

− how data is used in internal model including any transformation (e.g. aggregation, enrichment, derivation) processes

ClusterSeven can monitor the changes (or lack of changes) within spreadsheets to confirm that the observed activity matches the expected transformation. Reports/alerts can be delivered based on anomalies or business-related changes in the process (e.g. the arrival of new transactions).

For each data set, a risk and impact (sensitivity) assessment has been performed to identify:

Risk assessments can be tailored to client specific needs and run on an ad hoc (but monitored) basis or on a pre-scheduled basis.

− whether the impact of poor quality data (individually or in aggregation) on the internal model is material;

Cell and range-level tolerances can be placed on all critical values imported into the internal model

− the points in the data flow from source to internal model where likelihood of data errors is the greatest, and therefore, what specific data quality controls are required;

‘Spidering’ and cell-precedent analysis allow key data nodes to be identified for the application of appropriate controls.

Possible controls:

- Peer (non-independent) review
- Independent expert review
- Segregation of ‘production’ from ‘test’ version
- Version control over production version
- The list is long – think in terms of what controls would be applicable to a ‘corporate IT’ application. A user-developed application, if business critical, should be no different.

FSA 2010

Version control

Where controls can be difficult:

- Any ‘firm’s books and records’, e.g. those involving legal or contractual records. These typically require access control and an audit trail of changes.
- Any application where changes would have significant economic impact. These require monitoring, access control and frequent independent checking.

FSA 2011


− the tolerance threshold beyond which a data error could become material (individually or in aggregation).

Cell and range-level tolerances can be placed on all critical values imported into the internal model.
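A very reduced approximation of the inventory and ‘spider’ style dependency analysis described in this table is to scan workbook formulas for references to other files. The sketch below does this with openpyxl and a simple regular expression; the folder name is hypothetical and the pattern only catches references written in the [OtherBook.xlsx]Sheet!A1 style, so it is illustrative rather than complete.

```python
# Reduced sketch of a data-flow inventory: find which workbooks reference which
# other files, by scanning formula text for external references like
# '[SourceBook.xlsx]Sheet1'!A1. The folder path is a hypothetical example.
import re
from pathlib import Path
from openpyxl import load_workbook

EXTERNAL_REF = re.compile(r"\[([^\]]+\.xls[xmb]?)\]", re.IGNORECASE)

def external_links(workbook_path: Path) -> set[str]:
    """Return the set of file names referenced by formulas in the workbook."""
    links = set()
    wb = load_workbook(workbook_path, data_only=False)  # keep formulas, not values
    for ws in wb.worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if isinstance(cell.value, str) and cell.value.startswith("="):
                    links.update(EXTERNAL_REF.findall(cell.value))
    return links

def build_inventory(folder: str) -> dict[str, set[str]]:
    """Map each workbook in the folder to the external files its formulas reference."""
    return {
        path.name: external_links(path)
        for path in Path(folder).glob("*.xlsx")
    }

# Example use (hypothetical path):
# for name, sources in build_inventory("model_feeds").items():
#     print(name, "->", sorted(sources))
```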

Risk 4: Errors, omissions and inaccuracies in the data can undermine the integrity of the internal model and management decision making.

Control Objective: To ensure that data quality (complete, accurate, appropriate, and timely/current) is maintained in the internal model.

Expected control requirement | ClusterSeven support for control

The management and data quality controls (preventative, detective, and corrective) proportional to the probability and materiality of potential data errors have been identified and implemented effectively. The controls should include (at a minimum):

Risk assessment enables the detection of control vulnerabilities such as inadequate use of passwords on spreadsheet files, VBA modules and ‘lock down’ of critical cells. Additional reports provide analysis of ACL vulnerabilities (e.g. single name user groups attached to file shares). Once vulnerabilities have been closed and a new baseline established, alerts provide immediate notifications of regression of these controls.

− having individuals with sufficient competence to conduct the manual data checks on accuracy, completeness and appropriateness

It is usually the case that manual checks require much higher user competency than the automated processes delivered by ClusterSeven.

− A well-defined and consistent process for refreshing or updating all data items in line with the data policy (timeliness and currency of data). The process must include appropriate change controls (automated or manual) that take into account any material impact (individually or in aggregation) on the internal model.

The automation provided by ClusterSeven enables well-defined processes to be consistently applied, including a full audit log of changes. In addition, the materiality of changes – in terms of individual/aggregation, action/inaction, and data/structure – may be separately highlighted and reported for specific attention.

− Data input validations (auto/manual) that prevent data having incorrect or inconsistent format or invalid values.

ClusterSeven enables an automated data validation process to be conducted across millions of data items from spreadsheets and flat files (e.g. .CSVs) prior to being uplifted into internal models in order to capture inconsistent or invalid data. It is unrealistic to assume that manual checks will be reliable on anything but a small number of data items.
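As a minimal sketch of the kind of input validation meant here (checking field formats and permitted values before data is uplifted into the model), the example below validates a single record; the field names, date format and permitted currency codes are assumptions made purely for illustration.

```python
# Sketch of format/value validation applied to each record before upload.
# Field names, date format and permitted codes are hypothetical.
from datetime import datetime

PERMITTED_CURRENCIES = {"GBP", "EUR", "USD"}

def validate_record(record: dict) -> list[str]:
    """Return format/value problems found in a single input record."""
    problems = []
    # Date must follow the agreed ISO format.
    try:
        datetime.strptime(record.get("valuation_date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("valuation_date is not in YYYY-MM-DD format")
    # Amount must parse as a non-negative number.
    try:
        if float(record.get("reserve_amount", "")) < 0:
            problems.append("reserve_amount is negative")
    except ValueError:
        problems.append("reserve_amount is not numeric")
    # Code fields must come from the permitted list.
    if record.get("currency") not in PERMITTED_CURRENCIES:
        problems.append("currency is not a permitted code")
    return problems

print(validate_record({"valuation_date": "2010-12-31",
                       "reserve_amount": "125000.50",
                       "currency": "GBX"}))   # -> ['currency is not a permitted code']
```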

Controls over accuracy

A recognised check over accuracy is to compare the data received with the original source:

- Reasonable checks, or random checks against the primary (i.e. objectively verifiable) data, may often be sufficient.
- Internal coherency/consistency checks based on known properties of the data (e.g. its expected distribution) can also be effective.
- Think about possible worst/bad cases and place appropriate controls.

FSA 2011

Tolerance alerts

Controls over completeness

Reconciliation is a recognised check on completeness:

- Reconciliation means a check on records that were received against the records that were expected to be received.
- This can be difficult to achieve, as it requires transparency about what records were requested, e.g. by a complex SQL query.
- Reconciliation can also be more difficult with end-user applications.

FSA 2011
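Reconciliation in the sense used in this box (records received checked against records expected to be received) can be sketched in a few lines. The identifiers below are invented; in practice the ‘expected’ set would come from the upstream system or the query specification.

```python
# Sketch of a completeness reconciliation: records received vs records expected.
# Policy identifiers are hypothetical.
expected_ids = {"P001", "P002", "P003", "P004"}   # e.g. from the source system or query spec
received_ids = {"P001", "P002", "P004", "P005"}   # e.g. keys present in the extract

missing = expected_ids - received_ids     # expected but not received -> incomplete feed
unexpected = received_ids - expected_ids  # received but not expected -> investigate

if missing or unexpected:
    print(f"Reconciliation break: missing={sorted(missing)}, unexpected={sorted(unexpected)}")
else:
    print("Reconciliation clean: all expected records received")
```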


Completeness checks such as:
− Reconciliation of data received against data expected

ClusterSeven enables automated reconciliations of data against both expected control values and expected processes (such as transaction maturity dates).

− A process to assess if data is available for all relevant model variables and risk modules

ClusterSeven enables automated checks on the presence of up-to-date data extracts from third party systems.

Accuracy checks such as:

− Comparison directly against the source (if available).

ClusterSeven enables automated checks of values held in spreadsheets and flat files against values held in other locations – such as source data.

− Internal consistency and coherence checks of the received/output data against expected properties of the data such as age-range, standard deviation, number of outliers, and mean.

ClusterSeven reports can apply a range of consistency and coherence checks on output data. However, for more complex analysis (e.g. statistical) these calculations may be completed in a spreadsheet and automatically captured and reported using standard ClusterSeven functionality.
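As a rough illustration of consistency and coherence checks of this kind (expected range, mean, standard deviation and outlier count), the sketch below tests a column of values against assumed expected properties. The figures and thresholds are invented; in practice they would come from the data policy.

```python
# Sketch of consistency checks against expected statistical properties of a data set.
# The sample values and expected properties are hypothetical.
import statistics

ages = [61, 63, 64, 66, 67, 69, 70, 72, 74, 112]   # one implausible outlier

EXPECTED = {
    "age_range": (18, 105),        # appropriateness: plausible policyholder ages
    "mean_bounds": (55.0, 80.0),   # expected mean for this portfolio
    "max_std_dev": 12.0,           # expected spread
}

findings = []

outliers = [a for a in ages if not EXPECTED["age_range"][0] <= a <= EXPECTED["age_range"][1]]
if outliers:
    findings.append(f"{len(outliers)} value(s) outside plausible age range: {outliers}")

mean_age = statistics.mean(ages)
if not EXPECTED["mean_bounds"][0] <= mean_age <= EXPECTED["mean_bounds"][1]:
    findings.append(f"mean age {mean_age:.1f} outside expected bounds")

if statistics.stdev(ages) > EXPECTED["max_std_dev"]:
    findings.append(f"standard deviation {statistics.stdev(ages):.1f} above expected maximum")

print(findings or "consistency checks passed")
```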

− Comparison with other data derived from the same source, or sources which are correlated.

In addition to formal value reconciliations, ClusterSeven enables visual comparisons (e.g. trending) via graphs to confirm expected correlations.

Appropriateness checks such as:
− Consistency and reasonableness checks to identify outliers and gaps through comparison against known trends, historic data and external independent sources.

Unlike a spreadsheet file or .CSV which typically only contains a ‘snapshot’ of data in time, ClusterSeven retains the full time series of each cell history, enabling full analysis of historic data and trends.

− A definition and consistent application of the rules that govern the amount and nature of data used in the internal model.

Policy definitions applicable to spreadsheets and flat file data sources may be established within ClusterSeven, enabling the consistent application of these rules during the peaks and troughs of other business activity.

− A process to assess the data used in the internal model for any inconsistencies with the assumptions underlying the actuarial and statistical techniques or made during the collection, processing and application of data.

Inconsistencies in process or data highlighted by ClusterSeven may be surfaced by alerts or reports to trigger workflow for amendment or approval.

Risk 5: Unreliable IT environment, technology or tools can compromise the quality and integrity of the data and its processing within the internal model

“Data management appeared to be one area where firms still have comparatively more to do to achieve the likely Solvency II requirements. Also firms did not have a documented validation policy that clearly explained all the processes used to validate their internal model. We will be looking at these areas in more detail at a firm-by-firm level during the pre-application phase of IMAP.”

FSA Solvency II: IMAP, Thematic Review findings



Control Objective: To ensure that the quality of data and its processing for use in the internal model is maintained

Expected control requirement | ClusterSeven support for control

IT general computer (ITGC) controls over the data environment (e.g. mainframes, End User Computing applications such as spreadsheets, etc.) that may have material impact on the internal model are established, such as:

ClusterSeven provides a complete solution for End User Computing (EUC) applications such as spreadsheets and MS Access databases as well as flat file extracts such as .CSV and .DAT.

− logical access management

ClusterSeven provides analysis of access vulnerabilities (e.g. inappropriate ACLs or inadequate application of Excel passwords) and facilitates the application and maintenance of these controls together with the adoption of Information Rights Management.

− development and change management (infrastructure, applications, and database);

ClusterSeven provides complete lifecycle support for EUC applications including test and audit cycles.

− security (network and physical);

N/A

− business continuity;

ClusterSeven provides the opportunity for disaster recovery and business continuity with respect to monitored files.

− incident management and reporting, and;

ClusterSeven provides a full alerting and reporting environment for EUC activity (or inactivity).

− other operational controls that support the collection (including data feeds), storage, analysis and processing.

ClusterSeven provides operational controls appropriate to these processes when conducted within EUCs e.g. confirming that data feeds have been updated.
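One simple operational control of the kind mentioned in the last row (confirming that data feeds have been updated) is a freshness check on the extract files themselves. The sketch below flags files whose last modification time is older than an assumed maximum age; the folder name and threshold are hypothetical.

```python
# Sketch of a feed-freshness control: flag extract files that have not been
# updated within the expected window. Path and threshold are hypothetical.
from datetime import datetime, timedelta
from pathlib import Path

MAX_AGE = timedelta(hours=24)   # assumed refresh frequency from the data policy

def stale_feeds(folder: str, pattern: str = "*.csv") -> list[str]:
    """Return the names of extract files older than MAX_AGE."""
    now = datetime.now()
    stale = []
    for path in Path(folder).glob(pattern):
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if now - modified > MAX_AGE:
            stale.append(f"{path.name} last updated {modified:%Y-%m-%d %H:%M}")
    return stale

# Example use (hypothetical folder); findings would feed the alerting described above.
# print(stale_feeds("model_feeds"))
```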

Conclusions

The presence of spreadsheets and flat files (e.g. CSVs) within the data processing chain, between source systems and the internal model, presents significant challenges in meeting the wide-ranging requirements of the FSA Data Audit.

The FSA Data Audit specifically allows manual controls to be established provided that they are applied consistently. However, the practicality and cost constraints of applying experienced staff to these processes mean that automation is likely to be a far more effective and robust solution for anything beyond the simplest and smallest of files.

ClusterSeven success stories in insurance to date:

“Part of our financial close process used to take hours, now it takes minutes”

Financial controller, Global insurer

“This is the first time I can show the rest of the business how it all works”

Head of Actuarial, Lloyd’s managing agent

“We will use data dictionary terms within our spreadsheets and then automatically upload validated information into our corporate BI solution”
