• No results found

Testing data warehouses with key data indicators Results with Highspeed

N/A
N/A
Protected

Academic year: 2021

Share "Testing data warehouses with key data indicators Results with Highspeed"

Copied!
44
0
0

Loading.... (view fulltext now)

Full text

(1)

Testing data warehouses with key

data indicators

Results with Highspeed

(2)

2

Agenda

The Problem

General Problem

Problem within the Project

The Idea

The Solution

General Method of Solution

(3)
(4)

4

General Problem

Test in the project / regression test

Non-recurring assurance of the data quality in a project

within a specified project plan

• Test and retest multiple deliveries of mass-data

• Quality assurance of historical data for Basel II, IFRS etc

.

Testbegin

Test

Scheduled end of project

Time

Data-delivery Re-Test

Plan

Corrected Data-delivery approval Test preparation
(5)

General Problem

Data verification

Recurring assurance of data quality within production

• Continuous check of the delivery of mass data

Additional sources of errors within recurring data deliveries

DWH X Code X Datamart Datamart Reporting Reporting X X Code X X

(6)

Problem

(7)

Concrete Problem within the Project

Project

• Build and test of a DWH for historical Basel II data

Basel II

DWH

(min. 5 years history) ET L Processes Root-Systems Subsequent processing eg. calculation of parameters or regulatory reporting
(8)

8

Concrete Problem within the Project

The original plan

Non-recurring historical data delivery and test of this data set

(inclusive Re-Test)

• Handover of the daily data delivery within production

(9)

Concrete Problem within the Project

Scope of testing

• Around 50 tables –500 fields –

several millions of data records

• Around 500 test cases within 3 levels

(Possible value range – Data integrity – End-to-End-Test)

Test execution

• Manual execution and documentation of the tests

• Individual execution of every test case

• Documentation of the test execution within a MS Access

(10)

10

Concrete Problem within the Project

Actual condition

Recurring historical data delivery because of changes and

incidents

Time- and resources consuming

(Duration of a complete test cycle around 20 person days) Partial abort of the test because of a new data delivery Concentration on one defined test data

(One historical month)

Additional requirement

(11)

Concrete Problem within the Project

Testbegin

Test

Scheduled end of project

Re-Test ?!

Time

Test Test

Current situation

Data-delivery New Data-delivery Corrected Data-delivery Corrected Data-delivery Re-Test Test preparation Testbegin Test

Scheduled end of project

Time

Data-delivery Re-Test

Plan

Corrected Data-delivery approval Test preparation
(12)
(13)

The Idea

Design of a slim test tool

using predefined data quality indicators

Automatization! Fast test

execution!

No time-consuming repeat of all test cases!

(14)

14

+ Merge

=

Test: Balance Account 0815 Table A = Balance Account 0815 Table B??

Table A Table B

The Idea

Table A Table B

(15)

?

B A

Balance

Balance

Balance

Balance

A

Balance

B

Balance

The Idea

Table A Table B

(16)

16

The Idea

Balance Creditcards Defaults ... ...

(17)
(18)

18

The Solution

Administration und configuration of indicators and rules Reporting Calculation engine
(19)

Process

System Landscape Indicators & Rules Execution Export Testing-Database Results
(20)

20

System Landscape

Prerequisite

• SAS and Excel

• Data sources directly in SAS or through links to Oracle or DB2

• Authority to read the data

Indicators and Rules Calculation Engine

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(21)

System Landscape

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(22)

22

Indicators Functional

Indicators in the context are sums and other aggregate functions like:

Indicator Table Usage

Number of accounts Accounts, Scoring

Reconciliation between data sources

Number of customers Accounts, Customers

Reconciliation between data sources

Sum of balance Accounts,

Balance-Sheet

Reconciliation with the Balance-Sheet

Sum of credit cards balance Accounts,

Balance-Sheet

Reconciliation with the Balance-Sheet

Number of defaulted accounts without being past due

Accounts Direct Validation, Integrity

Number of accounts without scoring

Accounts Direct Validation, Integrity

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(23)

Indicators Technique

From a technical point of view the indicators are summary functions according to the SQL standard (SAS):

Function Description Example

SUM Sum SUM(Balance)

AVG|MEAN Average AVG(Scorevalue) COUNT|FREQ|N Counting values COUNT(*)

NMISS Counting missing values

NMISS(Customer)

MIN Smallest value MIN(Scorevalue) MAX Maximum value MAX(Scorevalue) MIN, PRT, RANGE, STD, STDERR, T, USS, VAR, CSS, CV Statistics e.g.: STD (standard deviation) STD(Scorevalue) System Landscape Indicators

& Rules Execution

Export Testing-Database Results

(24)

24

Indicators Technique

• Conditional sum of balance of accounts being in a special product

group with CASE WHEN...

SUM(CASE WHEN ARREAR <= 0 AND Default = 1 THEN 1

ELSE 0 END)

• Complex functions with the SAS Macro engine possible (but

knowledge in programming necessary):

SUM(CASE WHEN NPV <> ((YEAR1 / (1.1)**1) %DO i = 2 %TO 12; + (YEAR&i./1.1**&i.) %end; )

THEN 1

ELSE 0 END)

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(25)

Indicators

The summary functions are placed without a complete SQL function in the MS Excel sheet:

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(26)

26

Rules

Rules define criteria for combinations of indicators and may be directly assigned to test cases.

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(27)

Execution

Results Rules Rules checking 4 Indicators and Rules SAS Program Start 1 Testing-Database Export of results 5 Results in PDF and Excel Read indicators and rules 2 Results Indicators Indicators calculation 3 Data Sources System Landscape Indicators

& Rules Execution

Export Testing-Database Results

(28)

28

Results - Example

Indicator Name Kennzahl_4

Description Number of defaulted accounts without arrear

Indicator SUM(CASE WHEN Arrear <= 0 AND Default = 1 THEN 1 ELSE 0 END)

Expected Result VALUE EQ 0

Result 1446

Compare Incident

Testcase TF_KONTEN_RUECKSTAND

Description Plausibility of Arrear

Rule Kennzahl_3 EQ 0 AND Kennzahl_4 EQ 0

Result Incident

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(29)

Results

System Landscape

Indicators

& Rules Execution

Export Testing-Database Results

(30)

30

Results

E-MAIL System Landscape Indicators

& Rules Execution

Export Testing-Database Results

(31)

Testing Database

• Documentation of the results within self-developed testing

database

• Documentation of test cases and test executions

• Incident-Reporting

• Status-Tracking

• Import of Rules results

Rules-Results

Testing concept Definition of test cases

per testing object

Testing Database Test cases Test executions Status-Tracking Reporting System Landscape Indicators

& Rules Execution

Export Testing-Database Results

(32)

32

Constraints of the Solution

Indicators are partially „just“ indicators for an incident

Not all test cases possible, e.g. End-to-End-Test

Explanatory power of indicators compared to test of

individual records

(33)

Benefits of the Solution

Test cases

Test cases

(34)

34

Benefits of the Solution

Execution 50% Conception 15% Documentation 10% Analyse 25% Before After Analysis 50% Execution 15% Conception 25% Documentation 10%

(35)

Benefits of the Solution

• Light weigh realization of the Idea with MS Excel and SAS

• Standardized checking logic through summary functions

• Performant realization of the indicators with SAS

• Editing of indicators through business department possible

• Fast reports with results in MS Excel and PDF

(36)

36

Capabilities

Project

• Reduce testing effort

• Regression tests and Ad Hoc Retests

Continuous data verification

• Daily usage to assure the quality of input data

(37)
(38)

38

Implementation in the Project

Before

• Around 500 test cases in 3

levels

(Possible values – Integrity – End-to-End-Test)

• Focusing on one selected

data set (one historical month)

• Duration of one complete

cycle around 20 person days

After

• Around 400 test cases of

level 1 and 2

(Possible values – Integrity)

• Test of every historical month

possible

• Duration working with the

tool: around 5 hours

Duration of one complete cycle around 8 person days

(39)

Implementation in the Project

Project Success

Automated and fast execution of the test cases Complete test of the data possible

Assured data quality within the scheduled project deadline

Production

(40)

40

Implementation in the Project

Time

Testbegin

Scheduled end of project

With the tool

Test preparation

Production

Testbegin

Test

Scheduled end of project

Re-Test ?!

Time

Test Test

Current situation

Data-delivery New Data-delivery Corrected Data-delivery Corrected Data-delivery Re-Test Test preparation
(41)
(42)

42

Contact

Adalbert Thomalla Lead Consultant +49 (0)211 5355 0 adalbert.thomalla@cgi.com Stefan Platz Senior Consultant +49 (0)211 5355 0 stefan.platz@cgi.com
(43)

Our commitment to you

We're trusted advisors committed to

delivering solutions that control

(44)

_experience the commitment TM

PROPRIETARY AND CONFIDENTIALITY NOTICE Confidentiality

The material contained within this document is proprietary and confidential. It is the sole property of CGI Inc. (CGI), and may not be disclosed to anyone except for the purpose of bidding for work for or on behalf of CGI.

Based upon the extent of the information provided, individuals with access to this information may be required to sign a nondisclosure confidentiality agreement.

Intellectual Property Rights

The contents of this document remain the intellectual property of CGI at all times.

Unauthorised reproduction, photocopying or disclosure to any other party, in whole or in part, without the expressed written consent of CGI is prohibited.

All authorised reproductions must be returned to CGI immediately upon request.

Trademarks

All trademarks mentioned herein, marked and not marked, are the property of their respective owners.

References

Related documents

Relaxing this assump- tion, Tornell and Velasco (1992) make clear some interesting implications of a feedback Nash equilibrium of an AK model of endogenous growth, finding a

Key Performance Indicators: 1: Improved student outcomes 2: Common Learning objective results 3: Common Teaching objective results 4: Default rates..

Analyze the results and directly apply the masking rules in the policy to the columns in the data

earthquake? Insurers have represented that they are unable to buy unlimited reinsurance cover so may need to reinsure an element of the catastrophic risk with the Crown. viii Is

In voluntary purchase of health insurance, it is near impossible to derive reasonable premium loadings for coverage of Pre-Existing Conditions policies alone... • No underwriting

To estimate the proportion of attached crossbridges, we compared the stiffness (change of force in response to a quick length change) in the active muscle fiber to that of the

Given the difficulties of manually annotating such difficult sequences, we present a FCN model and a training strategy that rely on incomplete 2-D annotations, where only some of

So if the distance between any two cars is less than 100m and the speed is more than 60 km/hr the Bluetooth device gets enabled and if the car gets closer than 10 m the