IM02: How to manage your Test Data on zEnterprise, September 2012, IBM Forum Brussels


(1)

IM02

How to manage your Test Data on

zEnterprise

18 - 20 September, 2012

IBM Forum Brussels

(2)

Notices

This information was developed for products and services offered in the U.S.A.

Note to U.S. Government Users Restricted Rights — Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.

(3)

Trademarks

 This presentation contains trademarked IBM products and technologies. Refer to the following Web site:

(4)

Agenda

 Data Governance - Focus on Test Data creation and masking

Obtaining test data. Some common practices.

Obtaining test data. Best practices.

InfoSphere Optim Test Data Management / Data Privacy

(5)

A Smarter Planet harnesses today's information explosion for business benefit…

Instrumented. Interconnected. Intelligent.

…creating a need for better Information Governance:

 Streamlining processes to manage business growth with consistency

 Ensuring compliance with policies, laws and regulations

 Controlling costs and
(6)

Success requires governance across the "Information Supply Chain"

[Diagram: Information Governance spans Analyze, Integrate, Manage and Govern (Quality, Security & Privacy, Lifecycle, Standards) across transactional & collaborative applications, business analytics applications, external information sources, cubes, big data, master data, content, streaming information, data warehouses and content analytics.]
(7)

Requirements for managing data across its lifecycle

 Discover where data resides

 Classify & define data and relationships

 Define policies

 Develop & test database structures/code

 Create & refresh test data

 Capture & replay production workloads

 Enhance performance

 Manage data growth

 Report & retrieve archived data

 Enable compliance with retention & e-discovery

 Move only the needed information

 Integrate into a single data source

Lifecycle phases: Discover & Define → Develop & Test → Optimize & Archive → Consolidate & Retire

Information Governance Core Disciplines: Lifecycle Management

(8)

Organizations continue to be challenged with building quality applications

 Increasing Risk: mandatory to protect data and comply with regulations

 Increasing Costs: defects are caught late in the cycle

 Time to Market: lack of realistic test data and inadequate

(9)

Organizations continue to be challenged with building quality applications

Increasing Risk

 45,000+ sensitive records exposed to a 3rd party during testing (c)

 70% of companies use actual customer data to test applications (a)

Time to Market

 37% are satisfied with the speed of software development (f)

 30-50% of testing teams' time is spent setting up test environments instead of testing (b)

Increasing Costs

 $300 billion annual cost of software-related downtime (d)

 32% success rate for software projects (e)

Sources:

a. The Ponemon Institute, The Insecurity of Test Data: The Unseen Crisis

b. NIST, Planning Report, The Economic Impacts of Inadequate Infrastructure for Software Testing

c. Federal Aviation Administration: exposes unprotected test data to a third party, http://fcw.com/articles/2009/02/10/faa-data-breach.aspx

d. The Standish Group, Comparative Economic Normalization Technology Study, CHAOS Chronicles v12.3.9, June 30, 2008

e. The Standish Group, Chaos Report, April 2009

(10)

Vulnerable non-production environments at risk

Most ignore security in non-production environments

 70% of organizations surveyed use live customer data in non-production environments (testing, Q/A, development). (Database Trends and Applications, Ensuring Protection for Sensitive Test Data)

 50% of organizations surveyed have no way of knowing if data used in test was compromised. (The Ponemon Institute, The Insecurity of Test Data: The Unseen Crisis)

 52% of surveyed organizations outsource development. (The Ponemon Institute, The Insecurity of Test Data: The Unseen Crisis)

 $194 per record: cost of a data breach

(11)

Impact Across the Enterprise – The benefits of a TDM strategy, for the CIO, the business, and IT

 Align application performance to business processes

 Ensure business continuity

 Respond quickly and accurately to audit and discovery requests

 Leverage existing investments in applications, databases and storage

 Reduce resource requirements for key IT operations

 Profit from superior application performance and availability

 Provision resources to meet priority business needs

 Automate data retention to support compliance initiatives

 Eliminate budget variances

 Streamline application and database upgrades

 Speed disaster recovery

 Simplify database administration

 Reclaim underutilized capacity

 Protect data privacy, integrity, and security

(12)

Agenda

Data Governance - Focus on Test Data creation and masking

 Obtaining test data. Some common practices.

Obtaining test data. Best practices.

InfoSphere Optim Test Data Management / Data Privacy

(13)

Test data creation is often accomplished through cloning

Positives:

• Simple to do

• Requires little knowledge of the data model or infrastructure

• Creates an exact duplicate of production

Negatives:

• Uses significant storage: much more than the team needs, and often done once rather than for each team member

• Data is production data and therefore a privacy risk

• Takes significant amounts of time to create

• No way to compare to the original after the test is complete

• Cannot span multiple data sources/applications

• Developer/tester downtime when sharing data access

(14)

The Data Multiplier Effect

Actual data burden = size of the production database + all replicated clones

Production: 500 GB
Development: 500 GB
Test: 500 GB
User Acceptance: 500 GB
Backup: 500 GB
Disaster Recovery: 500 GB

Total: 3000 GB

(A small arithmetic sketch of the multiplier follows.)
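To make the multiplier concrete, here is a small arithmetic sketch in Python. The environment names and the 500 GB production size come from the slide; the subset percentages are hypothetical.

```python
# Illustrative arithmetic only: environment list and 500 GB figure follow the slide;
# the subset percentages below are hypothetical assumptions.
PROD_GB = 500
environments = ["Development", "Test", "User Acceptance", "Backup", "Disaster Recovery"]

# Every environment holds a full clone of production.
full_clones_gb = PROD_GB + len(environments) * PROD_GB

# Right-sized copies: e.g. 10% subsets for dev/test/UAT, full copies for backup and DR.
subset_fraction = {"Development": 0.10, "Test": 0.10, "User Acceptance": 0.10,
                   "Backup": 1.00, "Disaster Recovery": 1.00}
right_sized_gb = PROD_GB + sum(PROD_GB * f for f in subset_fraction.values())

print(f"Full clones: {full_clones_gb} GB")        # 3000 GB, matching the slide
print(f"Right-sized: {right_sized_gb:.0f} GB")    # 1650 GB under these assumptions
```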

(15)

Generating synthetic test data

Positives:

• Safe

Negatives:

• Resource-intensive: huge commitment from the DBA; deep knowledge of the database schema required

• Tedious: DBAs must intentionally include errors to ensure a robust testing process

• Created data does not always reflect the integrity of the original data

• Time-consuming: the process is slower and can be error-prone

(16)

Test data creation by writing SQL

Positives:

• ?

Negatives:

• SQL must be written and maintained

• Complex and subject to change

• Referential integrity? The right data?

• Expensive, dedicated staff

• Cannot span multiple data sources/applications

• Developer/tester downtime when sharing data access

(17)

Agenda

Data Governance - Focus on Test Data creation and masking

Obtaining test data. Some common practices.

 Obtaining test data. Best practices.

InfoSphere Optim Test Data Management / Data Privacy

(18)

Test Data Management – Concepts

• Test Data Management (TDM) refers to the need to manage data used in various pre-production environments and is a vital part of application quality & delivery.

• Extract production data into referentially intact data subsets to support application data in other environments.

• De-identify (mask) extracted production data to protect privacy.

• Compare "before" and "after" images of test data (a minimal comparison sketch follows).
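As a minimal illustration of the compare step (a Python sketch, not Optim's Compare facility; table and column names are hypothetical), the function below diffs a "before" and an "after" image of the same rows by key and reports which columns changed, one way to confirm that only the intended columns were masked.

```python
# Hypothetical before/after images of the same rows, keyed by CUST_ID.
before = {101: {"NAME": "Erica Schafer", "STATE": "TX", "BALANCE": 1200}}
after  = {101: {"NAME": "Amanda Winters", "STATE": "TX", "BALANCE": 1200}}

def compare_images(before, after):
    """Yield (key, column, old_value, new_value) for every changed column."""
    for key, old_row in before.items():
        new_row = after.get(key, {})
        for col, old_val in old_row.items():
            if new_row.get(col) != old_val:
                yield key, col, old_val, new_row.get(col)

for key, col, old, new in compare_images(before, after):
    print(f"row {key}: {col} changed {old!r} -> {new!r}")   # only NAME should differ
```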

(19)

Test Data Management Best Practices

[Diagram: From production sources (2 TB, 3 TB and 4 TB, across Source 1 and Source 2), subset and privatize data into right-sized test databases (e.g., 50 GB) for unit, integration, performance, function and UAT testing. Inspect/browse and seed test cases, run tests, compare before and after results, correct production errors, and refresh the test data as needed.]

(20)

TDM Processes

Running in a secured lock-down environment against source and target databases (DB list & authorizations), the TDM processes are:

 Extract & Subset, driven by subset criteria

 Convert & Mask, driven by masking rules

 Compare & Audit

 Load & Distribute

The result is a gold copy of subset and privatized test data.

(21)

Refresh test data

[Diagram: Developers and testers work against refreshed test environments: a Training Database (75 GB), a Dev Database (25 GB) and a Test Database (50 GB), each refreshed on demand.]

(22)

Example process

Speed delivery, reduce costs and improve quality while reducing risk and increasing compliance with Test Data Management.

Without Test Data Management:

 The tester submits a request for test data

 The request sits in a queue for days

 The DBA creates the test data, which takes several days to create

 The tester uses the test data, requests a data refresh, and the cycle repeats

With Test Data Management:

 The DBA creates or refreshes the test data, which takes hours to create

 The tester uses the test data in testing and refreshes it as needed

(23)

Agile development relies on agile testing; agile testing relies on continuous access to test data

 Organization: an agile test organization and its testers need continuous access to test data

 Process: with agile development you have continuous integration and delivery, and have to test often

 Technology: test data management software needs to support the agile method by streamlining access to test data and providing insight into test data

(24)

Agenda

Data Governance - Focus on Test Data creation and masking

Obtaining test data. Some common practices.

Obtaining test data. Best practices.

 InfoSphere Optim Test Data Management / Data Privacy

Q & A

(25)

Enterprise Data Governance for System z

Data governance spans four disciplines:

 Manage the data lifecycle: data retention, data retirement

 Secure: prevent access, restrict access, monitor access

 Protect & privacy: mask data, encrypt data

 Audit: audit privileges, audit users, audit access

Supporting products include DB2/RACF security, Guardium for z, Tivoli zSecure Audit, Data Encryption for IMS and DB2, Optim TDM and Data Privacy, and the Optim Data Growth Solution.

 Reduce risk from security breaches

 Comply with regulatory compliance requirements

 Protect sensitive customer data and employee data

 Archive inactive data to reduce the amount of data exposed and requiring protection

IBM is the only solution provider with an end-to-end, comprehensive solution.

(26)

IBM InfoSphere Optim supports the heterogeneous enterprise

A single, scalable, heterogeneous information lifecycle management solution provides a central point to deploy policies to extract, archive, subset, and protect application data records from creation to deletion.

Capabilities: Discover, Manage Test Data, Archive, Capture & Replay, plus partner integrations.

(27)

[Architecture diagram: ISPF and workstation clients connect to the Optim Server, which provides Repository Services, Data Access Services, Archiving Services, Subsetting Services, Data Privacy Services, Open Data Management and Security, backed by the Optim Directory in DB2 and storage-independent archive and extract files (index, data, metadata, artifacts) accessible via ODBC/JDBC. Data sources: IMS (Orders, Products) via IMS native access, DB2 (Employee, Payroll) via DB2 access, and VSAM/sequential files (Customers, Payments) via native access.]

An ISPF workbench running under z/OS is used to design, test and deploy projects to the Optim Server. The ISPF workbench enables both online and batch (JCL) execution.

(28)

Optim Server / Repository (z/OS)

[Architecture diagram: the same components as the previous chart, annotated as follows.]

 Repository Services: store and retrieve metadata, project information and the archive catalog in the Optim Directory (DB2).

 Data Access Services: access source or destination databases via data access with specific drivers per file type. DB2 access uses SQL and provides a process to capture metadata from the DB2 catalog. IMS native access uses IMS-provided drivers; metadata is captured via copybook imports (COBOL or PL/1). VSAM/sequential file native access captures metadata via copybook imports (COBOL or PL/1).

 Subsetting Services: extract and restore relationally intact business objects across multiple DB2 databases, IMS and VSAM (e.g., orders from DB2, customers from VSAM and credit cards from IMS).

 Archiving Services: store, retrieve, restore, delete and compress data, metadata and artifacts (e.g., external documents as BLOBs) related to business objects.

 Data Privacy Services: consistently and predictably mask and propagate data for the purpose of test data management with data compliance.

 Open Data Management: enable relational access to archived data via ODBC/JDBC and SQL-92; can be used in conjunction with the remote database services of ODM to integrate with enterprise data access for business intelligence.

 Security: provide functional and object security to separate product and data access by role and responsibilities using RACF.

(29)
(30)

Optim Captures the Complete Business Object

 DBA view: a referentially intact subset of data

 Application view: application-level business rules for data relationships

 Federated access to data and metadata across IMS, DB2, VSAM and related LUW files or documents

Business Object: represents an application data record (payment, invoice, customer) as a referentially intact subset of data across related tables and applications; includes metadata, DDL, reference and transaction data.

Benefit: referential integrity ensures data is captured and masked consistently.

Complete data. RI preserved. OS independent. DB independent. ODBC accessible.

(31)

IBM InfoSphere Optim Test Data Management Solution

Requirements: create "right-sized", production-like environments for application testing.

Benefits:

• Deploy new functionality more quickly and with improved quality

• Easily refresh & maintain test environments

• Protect sensitive information from misuse & fraud with data masking

• Accelerate delivery of test data through refresh

• Create referentially intact, "right-sized" test databases

• Automate test result comparisons to identify hidden errors

• Protect confidential data used in test, training & development

• Shorten iterative testing cycles and accelerate time to market

[Diagram: a 2 TB production database or production clone is subset and masked, then compared and refreshed, into right-sized Development, Unit Test, Training and Integration Test environments of 25-100 GB.]

InfoSphere Optim TDM supports data on distributed platforms (LUW) and z/OS.

Out-of-the-box subset support is provided for packaged ERP/CRM applications as well as others.

(32)

InfoSphere Optim Test Data Management

Standard methodology: extract/privatize production data → load the test database → test the application → compare results → if unsuccessful, edit the data, refresh and retest → if successful, deploy the application.

Optim fits with your testing methodology:

– Extract

– Subset

– Privatize

– Load

– Edit

– Compare

Automate all or part of the process.

(33)

Enterprise Testing Solution with Rational and InfoSphere Optim: building better quality applications

[Diagram: design & manage the test campaign → initiate data extract scripts → subset & mask production data for testing → browse & edit test data → execute automated test routines → compare before & after data → on failure, refresh masked test data; on success, go production. The data steps are performed with InfoSphere Optim.]

A comprehensive software quality process to minimize cost and shorten development cycles:

– Manage test labs

– Create realistic test environments from production data

– Ensure protection of sensitive data

– Manage unit, functional and performance testing and quality test cases

Streamline your test data management processes and deliver your project sooner and with fewer defects.

(34)

The process: Access Definition

(35)

DB2  DB2 defined relationship

OPT  Relationship defined with Optim

(36)

Only records where State = “GA”
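The two slides above define the relationships (in DB2 or in Optim) and the start-table selection criterion. A minimal sketch of the same subsetting idea, using Python's sqlite3 and a hypothetical CUSTOMERS/ORDERS/DETAILS schema: select the qualifying customers first, then follow the relationships so the subset stays referentially intact.

```python
import sqlite3

# Hypothetical three-table schema standing in for PRODDB.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, cust_id INTEGER REFERENCES customers);
CREATE TABLE details   (order_id INTEGER REFERENCES orders, item TEXT);
INSERT INTO customers VALUES (1, 'GA'), (2, 'TX');
INSERT INTO orders    VALUES (10, 1), (20, 2);
INSERT INTO details   VALUES (10, 'widget'), (20, 'gadget');
""")

# Start table plus selection criterion, then traverse the relationships downward.
customers = con.execute("SELECT * FROM customers WHERE state = 'GA'").fetchall()
cust_ids  = [c[0] for c in customers]
orders    = con.execute(
    f"SELECT * FROM orders WHERE cust_id IN ({','.join('?' * len(cust_ids))})", cust_ids).fetchall()
order_ids = [o[0] for o in orders]
details   = con.execute(
    f"SELECT * FROM details WHERE order_id IN ({','.join('?' * len(order_ids))})", order_ids).fetchall()

print(customers, orders, details)   # only the GA customer's rows appear in the subset
```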

(37)

Extract Process

EXTRACT reads the source tables (e.g., CUSTOMERS, ORDERS and DETAILS in PRODDB), extracting data and/or object definitions into an extract file and producing a process report.

Use BROWSE to verify the extracted data.

(38)
(39)
(40)

Insert Process : Populate Destination Tables

Table Map

Table names need not match

Change qualifier and/or table name

(41)

Insert Process : Populate Destination Tables

Column Map

Map unlike column names

Transform/mask sensitive data

Datatype conversions

Column-level date aging

Literals, special registers, expressions, default values, user exits (see the sketch below)
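A column map pairs source and destination columns and attaches a transformation to each. The sketch below is not Optim column-map syntax, just a Python illustration of the idea with hypothetical column names: a literal, a partial mask, and column-level date aging.

```python
from datetime import date, timedelta

# Hypothetical column map: destination column -> (source column, transformation).
column_map = {
    "CUSTNAME":   ("CUST_NAME", lambda v: "TEST CUSTOMER"),          # literal value
    "SSN":        ("SSN",       lambda v: "999-99-" + v[-4:]),       # partial mask
    "ORDER_DATE": ("ORDERDATE", lambda v: v + timedelta(days=365)),  # date aging
}

source_row = {"CUST_NAME": "Erica Schafer", "SSN": "333-22-4444", "ORDERDATE": date(2012, 9, 18)}
dest_row = {dest: fn(source_row[src]) for dest, (src, fn) in column_map.items()}
print(dest_row)   # destination columns populated from unlike source columns, transformed
```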

(42)

Edit / Browse: Traditional vs. Relational Tools

Single-table editors:

 One table/view at a time

 No edit of related data from multiple tables (separate FIND/NOTE/INFO/EXIT sessions for CUSTOMERS, ORDERS and DETAILS)

The Relational Editor:

 Simultaneous browse/edit of related data from multiple tables

(43)

Editing Data

Edit data to:

• Insert Rows

• Delete Rows

• Update Rows

(44)

Relationally Joined Data

Browse or edit related rows. Scrolling the higher-level table automatically synchronizes the related rows.

(45)

Commit/Restore

 Commits are automatically made to the database when you move your pointer to a different row; each instance of a commit counts as an undo level

 Restore changes to a row, table or fetch set

(46)

Backing Out Changes - Row Level

 Undo removes the last change made to the current row

 Undo… brings up a row list and lets you select how far back you want to restore the current row

 Undo All removes all changes made to the

(47)

Challenges of Enterprise Data Privacy

 Multi-platforms

 Relational database applications in the enterprise

– Complex data model

– Multiple databases

– Legacy data components

– Interconnected applications

 Distributed work teams

– Employees and contractors

– Global 24 x 7 operations

(48)

What is data masking?

Definition

Method for creating a structurally similar but inauthentic version of an organization's data. The purpose is to protect the actual data while having a functional substitute for occasions when the real data is not required.

Requirement

Effective data masking requires data to be altered in a way that the actual values cannot be determined or reverse-engineered, while a functional appearance is maintained.

Other Terms Used

Obfuscation, scrambling, data de-identification

Commonly masked data types

Name, address, telephone, SSN/national identity number, credit card number

Methods

Static masking: extracts rows from production databases, obfuscating the data values that ultimately get stored in the columns of the test databases.

Dynamic masking: masks specific data elements on the fly without touching applications or the physical production data store.

A minimal static-masking sketch follows.
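A minimal static-masking sketch in Python (illustrative only, not an Optim routine; the tiny name lists and the SSN format rule are assumptions): rows are extracted, sensitive columns are replaced with fictitious but format-preserving values, and only the masked rows are loaded into the test database.

```python
import random

FIRST_NAMES = ["Amanda", "Carl", "Jeanne"]      # stand-in lookup values
LAST_NAMES  = ["Winters", "Davis", "Renoir"]

def mask_row(row):
    """Return a copy of the row with NAME and SSN replaced by fictitious values."""
    masked = dict(row)
    masked["NAME"] = f"{random.choice(FIRST_NAMES)} {random.choice(LAST_NAMES)}"
    masked["SSN"]  = f"{random.randint(100, 899):03d}-{random.randint(10, 99)}-{random.randint(1000, 9999)}"
    return masked

production_rows = [{"PATIENT_NO": 123456, "SSN": "333-22-4444", "NAME": "Erica Schafer"}]
test_rows = [mask_row(r) for r in production_rows]   # only these rows reach the test database
print(test_rows)
```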

(49)

Statically mask data in non-production databases

Example: a production patient record (Patient No 123456, SSN 333-22-4444, Erica Schafer, 12 Murray Court, Austin, TX 78704) is statically masked to a fictitious but realistic record (Patient No 112233, SSN 123-45-6789, Amanda Winters, 40 Bayberry Drive, Elgin, IL 60123).

 Mask data in non-production databases such as test and development

 Improve security of non-production environments

 Facilitate faster testing processes with accurate test data

 Support referential integrity

(50)

Optim Data Privacy Solution

[Diagram: contextual, application-aware, persistent data masking as data moves from production (VSAM, IMS, DB2) to test (VSAM, IMS, DB2).]

 Substitute confidential information with fictionalized data

 Deploy multiple masking algorithms

 Provide consistency across environments and iterations

 Enable off-shore testing

(51)

Data is masked, and the masked fields are consistent: the same source SSNs (e.g., 132009824 and 157342266) are always masked to the same values (e.g., 323457245 and 134235489) in DB2 and in the client billing application, giving a consistent mapping across the enterprise (see the sketch below).
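Consistency like this is usually achieved by making the mask deterministic, so the same input always produces the same output in every environment and every refresh. Below is a hedged sketch of one common approach, keyed hashing; this is not the algorithm Optim uses.

```python
import hashlib
import hmac

MASKING_KEY = b"keep-this-secret"   # hypothetical key shared by all masking runs

def mask_ssn(ssn: str) -> str:
    """Deterministically map a source SSN to a 9-digit masked value."""
    digest = hmac.new(MASKING_KEY, ssn.encode(), hashlib.sha256).hexdigest()
    digits = f"{int(digest, 16) % 10**9:09d}"      # reduce the hash to nine digits
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"

# The same source value masks to the same result in every database and every refresh.
print(mask_ssn("132009824"))
print(mask_ssn("132009824"))   # identical output
```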

(52)

De-identify test data

Transform or replace sensitive data during the extract process, in a standalone convert process, or during the insert/load process, using:

 Standard mapping rules: literals, special registers, expressions, default values, look-up tables

 Complex mapping rules: user exits

Production data → extract and convert → masked test data.

(53)

Optim Data Privacy in Application Testing

[Diagram: extract CUSTOMERS, ORDERS and DETAILS from the production database into an extract file; transform/mask sensitive data while loading (LOAD or INSERT/UPDATE) the CUST, ORD and DETL tables in TESTDB, QADB and a newly created NewDB.]

Extract a relationally intact subset from the production database(s):

• Extract data and/or object definitions

• Define a new set of test tables

• Apply masking during the population process

• The extract file may be reused, but it contains unmasked data

• Good practice for testing masks

(54)

[Diagram: extract CUSTOMERS, ORDERS and DETAILS from the production database into an extract file; a standalone convert step transforms/masks sensitive data into a second, masked extract file, which is then loaded (LOAD or INSERT/UPDATE) into the CUST, ORD and DETL tables in TESTDB, QADB and a newly created NewDB.]

Extract a relationally intact subset from the production database(s):

• Extract data and/or object definitions into a pre-masked (not yet masked) extract file

• Use the pre-masked extract file to create the new set of tables

• Convert the pre-masked extract file data into a second, masked extract file

• Share the masked extract file for reuse in the population step

• Good practice for testing masks using COMPARE

(55)

Optim Data Privacy in Application Testing

[Diagram: extract CUSTOMERS, ORDERS and DETAILS from the production database, masking during the extract itself so the extract file already contains masked data; the file is then loaded (LOAD or INSERT/UPDATE) into the CUST, ORD and DETL tables in TESTDB and QADB.]

Extract a relationally intact subset from the production database(s):

• Most secure approach

• Extract data only

• Convert during the extract

• The extract file already contains masked data

• It can be shared with testers for reuse

(56)

Transformation Techniques (several of these are sketched after this list)

 String literal values

 Character substrings

 Random or sequential numbers

 Arithmetic expressions

 Concatenated expressions

 Date aging

 Lookup values

 Intelligence
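A few of the techniques listed above, sketched in Python (illustrative only, not Optim functions): a string literal, a character substring, sequential numbers, date aging, and a lookup value.

```python
from datetime import date, timedelta
import itertools

CITY_LOOKUP = ["Las Vegas", "Seattle", "Milwaukee"]   # lookup values
next_cust_id = itertools.count(10000)                 # sequential numbers

def transform(row, idx):
    return {
        "NAME":    "TEST USER",                          # string literal
        "PHONE":   "555-010" + row["PHONE"][-1],         # character substring
        "CUST_ID": next(next_cust_id),                   # sequential number
        "DOB":     row["DOB"] - timedelta(days=400),     # date aging
        "CITY":    CITY_LOOKUP[idx % len(CITY_LOOKUP)],  # lookup value
    }

row = {"PHONE": "512-555-0147", "DOB": date(1980, 5, 1)}
print(transform(row, 0))
```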

(57)

IBM InfoSphere Optim Data Masking

De-identify sensitive information with realistic but fictional data: personally identifiable information is masked with realistic but fictional values.

Requirements and benefits:

• Protect sensitive information from misuse and fraud

• Prevent data breaches and associated fines

• Achieve better information governance

• Protect confidential data used in test, training & development systems

• Mask data on screen in applications

• Implement proven data masking techniques

• Support compliance with privacy regulations

• The solution supports custom & packaged ERP applications

(58)

Contextually accurate masked data facilitates business processes

Personal Info Table (original → masked): PersNbr 08054 Alice Bennett, 19101 Carl Davis, 27645 Elliot Flynn → PersNbr 10000 Jeanne Renoir, 10001 Claude Monet, 10002 Pablo Picasso

Event Table (original → masked): PersNbr 27645 Elliot Flynn → PersNbr 10002 Pablo Picasso

Referential integrity is maintained with key propagation.

Patient Information (original → masked): Patient No 123456, SSN 333-22-4444, Erica Schafer, 12 Murray Court, Austin, TX 78704 → Patient No 112233, SSN 123-45-6789, Amanda Winters, 40 Bayberry Drive, Elgin, IL 60123

Data is masked with contextually correct data to preserve its integrity: maintain the value of test data, reduce the risk of data breaches, satisfy privacy regulations.

Masking options: generic mask, dates, user defined, arithmetic expressions, lookup values, business data types (CCN, NID), string literal values, character substrings & concatenation, random or sequential numbers.

(59)

Street Address/City/State/Zip Code Data Sets

1) The client is a bank that wishes to mask its assets by location. Original rows (Street, Zip Code, State, City, Customers, Total Assets) include streets such as 6767 Rte 10 S and 12 Buttercup Ln in Princeton, NJ 08594 (105,333 customers, $8,777,733,811) and Cleveland, OH 44101 (54,999 customers, $534,674,233).

2) Optim provides corresponding street address/city/state/zip codes for masking from an address lookup table (streets such as 2 Applegarth Ln, 12 Street Road, 3526 Diamond Rd, 12 Roden Dr and 288 Helm St; locations such as Brunswick ME 04011, Las Vegas NV 89101, Seattle WA 98101, Los Angeles CA 90001 and Milwaukee WI 53201). The new table with masked data keeps the customer counts and assets but shows, for example, 21 Street Rd, Las Vegas, NV 89101 and 3526 Diamond Rd, Seattle, WA 98101.

3) Leverage multiple-column replacement: an entire address row can be masked with a valid Coding Accuracy Support System (CASS) address using the enhanced random lookup function.

(60)


First Names and Last Names Data Sets

1) The client is a university that wishes to mask the first and last name fields in its admissions database.

2) Optim provides a first name lookup table with over 5,000 male/female names and a last name lookup table with over 80,000 names (sample first names: Stacey, Dave, Danielle, Bob, John; sample last names: Reese, Howell, Kline, Nelson, Newton).

3) Use the lookup tables to randomly replace first and last names.

Production database (First Name, Last Name, GPA, High School, Advisor, State): Paul Smith, 3.2, Princeton, Johnson, NJ; Kate Jones, 2.7, Albany, Kline, NY.

Test database after masking: Dave Nelson, 3.2, Princeton, Johnson, NJ; Stacey Reese, 2.7, Albany, Kline, NY (a minimal lookup-replacement sketch follows).
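A minimal sketch of the lookup-table replacement described above (Python; the five-name lists stand in for Optim's 5,000-name and 80,000-name lookup tables):

```python
import random

FIRST_NAME_LOOKUP = ["Stacey", "Dave", "Danielle", "Bob", "John"]
LAST_NAME_LOOKUP  = ["Reese", "Howell", "Kline", "Nelson", "Newton"]

def mask_names(row):
    """Replace only the name columns; GPA, high school, advisor and state are untouched."""
    return {**row,
            "FIRST_NAME": random.choice(FIRST_NAME_LOOKUP),
            "LAST_NAME":  random.choice(LAST_NAME_LOOKUP)}

admissions = [
    {"FIRST_NAME": "Paul", "LAST_NAME": "Smith", "GPA": 3.2, "HIGH_SCHOOL": "Princeton", "ADVISOR": "Johnson", "STATE": "NJ"},
    {"FIRST_NAME": "Kate", "LAST_NAME": "Jones", "GPA": 2.7, "HIGH_SCHOOL": "Albany",    "ADVISOR": "Kline",   "STATE": "NY"},
]
print([mask_names(r) for r in admissions])
```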

(61)

Data Privacy Transformation Library Functions

TRANS SSN

 Generates valid and unique U.S. Social Security Number (SSN).

 By default, algorithmically generates consistently altered destination SSN based on source SSN.

 Can also generate a random SSN when the source data does not have an SSN value or when there is no need for transforming the source SSN in a consistent manner.

TRANS CCN

 Use the TRANS CCN function to generate a valid and unique credit card number (CCN).

 By default, it randomizes the entire string; it can also randomize only parts of the credit card number (for example, preserving the card type). A Luhn-check sketch follows after this list.

TRANS EML

 Generates a random e-mail address.

 An e-mail address consists of two parts, a user name followed by a domain name, separated by ‘@’. For example, user@domain.com.
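To illustrate why "valid" matters for credit card numbers: a generated CCN should still pass the Luhn check so that applications validating card numbers do not reject the test data. The sketch below is not the TRANS CCN function, just a Python example that keeps the issuer prefix, randomizes the body, and recomputes the Luhn check digit.

```python
import random

def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for the given digits (check digit excluded)."""
    total = 0
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:       # rightmost digit of `partial` is doubled
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

def mask_ccn(ccn: str, keep_prefix: int = 6) -> str:
    """Keep the issuer prefix, randomize the body, recompute the check digit."""
    body_len = len(ccn) - keep_prefix - 1
    body = "".join(str(random.randint(0, 9)) for _ in range(body_len))
    partial = ccn[:keep_prefix] + body
    return partial + luhn_check_digit(partial)

print(mask_ccn("4111111111111111"))   # a different, Luhn-valid number with the same prefix
```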


(62)

Without Key Propagation…

Original data:

Customers Table (Cust ID, Name, Street): 08054 Alice Bennett, 2 Park Blvd; 19101 Carl Davis, 258 Main; 27645 Elliot Flynn, 96 Avenue

Orders Table (Cust ID, Item #, Order Date): 27645, 80-2382, 20 June 2004; 27645, 86-4538, 10 October 2005

Without key propagation:

Customers Table: 10000 Auguste Smith, Mars23; 10001 Claude Jones, Venus24; 10002 Pablo Adams, Saturn25

Orders Table: 27645, 80-2382, 20 June 2004; 27645, 86-4538, 10 October 2005

Now these order rows are orphans: no customer 27645 exists anymore!

(63)

Masking with Key Propagation

Original data:

Customers Table (Cust ID, Name, Street): 08054 Alice Bennett, 2 Park Blvd; 19101 Carl Davis, 258 Main; 27645 Elliot Flynn, 96 Avenue

Orders Table (Cust ID, Item #, Order Date): 27645, 80-2382, 20 June 2004; 27645, 86-4538, 10 October 2005

De-identified data:

Customers Table: 10000 Auguste Smith, Mars23; 10001 Claude Jones, Venus24; 10002 Pablo Adams, Saturn25

Orders Table: 10002, 80-2382, 20 June 2004; 10002, 86-4538, 10 October 2005

Referential integrity is maintained: the masked key 10002 is propagated to the related order rows (a minimal sketch follows).
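A hedged sketch of key propagation in Python (illustrative, not Optim's implementation): the primary key is masked with a deterministic function, and the same function is applied to every foreign-key column that references it, so parent and child rows keep matching.

```python
import hashlib

def mask_cust_id(cust_id: int) -> int:
    """Deterministically map a customer ID into an illustrative 10000-19999 range."""
    digest = hashlib.sha256(str(cust_id).encode()).hexdigest()
    return 10000 + int(digest, 16) % 10000

customers = [{"CUST_ID": 27645, "NAME": "Elliot Flynn"}]
orders    = [{"CUST_ID": 27645, "ITEM": "80-2382"}, {"CUST_ID": 27645, "ITEM": "86-4538"}]

# Propagate the same masked key to both the parent and the child table.
masked_customers = [{**c, "CUST_ID": mask_cust_id(c["CUST_ID"])} for c in customers]
masked_orders    = [{**o, "CUST_ID": mask_cust_id(o["CUST_ID"])} for o in orders]

# Every masked order still points at an existing masked customer: no orphans.
assert {o["CUST_ID"] for o in masked_orders} <= {c["CUST_ID"] for c in masked_customers}
print(masked_customers, masked_orders)
```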

(64)

Using Custom Masking Exits

 Apply complex data transformation algorithms and populate the resulting value to the destination column

 Selectively include or exclude rows and apply logic to the masking process

 Valuable where the desired transformation is beyond the scope of the supplied Column Map functions

 Example (sketched below): generate a value for CUST_ID based on customer location, average account balance, and volume of transaction activity
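A Python sketch of the kind of logic such an exit might carry, using the slide's example and entirely hypothetical field names and encoding rules:

```python
def cust_id_exit(row):
    """Hypothetical custom exit: derive a masked CUST_ID from location, balance band and activity."""
    region_code   = {"TX": 1, "IL": 2, "NJ": 3}.get(row["STATE"], 9)
    balance_band  = min(int(row["AVG_BALANCE"] // 10_000), 9)   # 0-9 band by average balance
    activity_band = min(int(row["TXN_PER_MONTH"] // 10), 9)     # 0-9 band by transaction volume
    serial        = row["ROW_SEQ"] % 100_000                    # keeps the value unique within a band
    return int(f"{region_code}{balance_band}{activity_band}{serial:05d}")

row = {"STATE": "TX", "AVG_BALANCE": 42_500.0, "TXN_PER_MONTH": 23, "ROW_SEQ": 17}
print(cust_id_exit(row))   # 14200017 for this row
```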

(65)

Agenda

Data Governance - Focus on Test Data creation and masking

Obtaining test data. Some common practices.

Obtaining test data. Best practices.

InfoSphere Optim Test Data Management / Data Privacy

(66)

Which IBM software is key for delivering a completely masked test data subset?

A. Combination of answer B & D

B. InfoSphere Optim Test Data Management

C. InfoSphere Guardium

D. InfoSphere Optim Data Privacy

Participate in the System z Expert and Superhero contest!

Fill in your answer to the question below on the scorecard and

deposit your card in the box!

(67)

More information on zEnterprise

 IBM zEnterprise / System z Redbooks Portal:

http://www.redbooks.ibm.com/portals/systemz

 IBM zEnterprise Announcement Landing Page:

ibm.com/systems/zenterprise196

 IBM zEnterprise HW Landing Page:

ibm.com/systems/zenterprise196

 IBM zEnterprise Events Landing Page:

ibm.com/systems/breakthrough

 IBM Software:

ibm.com/software/os/systemz/announcements

 IBM System Storage:

ibm.com/systems/storage/product/z.html

 IBM Global Financing:

ibm.com/financing/us/lifecycle/acquire/zenterprise/

 Global Technology Services:

(68)

Thank You

Merci

Bedankt

Gracias!

Obrigado

Danke

Japanese English French Russian German Nederlands Spanish Brazilian Portuguese Arabic Traditional Chinese Simplified Chinese Thai
