Best Practice in SAS programs validation. A Case Study


Academic year: 2021


(1)

CROS NT srl

Contract Research Organisation

Clinical Data Management

Statistics

Dr. Paolo Morelli, CEO

Dr. Luca Girardello, SAS programmer

Best Practice in SAS programs validation. A Case Study

(2)

AGENDA

Introduction

Program Verification: a Business Approach

(3)

FACTS about CROS NT

• Headquarters in Verona (Italy)

• Founded in 1993

• Offices in Milan and Munich

• 40 employees

• Data Management, Statistical, PhV and hosting services

• Services to Pharma, Biotech and CROs

(4)

Introduction

Topic of the presentation: how to maximize the quality of programming while minimizing the time needed to verify programs.

In the first part of the presentation we will discuss the business side:

What is program verification?

Why is program verification necessary?

When is program verification done?

Who performs program verification?

How does the verification process work?

In the second part of the presentation we will discuss a

(5)

What is program verification

Making certain that the program does what it is supposed to do, and producing documented evidence of this.

(6)

Why program verification is necessary

The aim of SAS validation in pharmaceutical research is for end users to produce high-quality programs that fit the purpose for which they are designed and provide accurate results, written in a style that promotes:

Reliability

Efficiency

Portability

Flexibility

(7)

When is program verification done

Program verification should be performed as soon as possible after the development of the SAS code, before putting the “product” into production.

Development and production environments should be clearly defined.

An audit trail of program changes should be present as soon

(8)

Who performs program verification

The SAS programmer who creates the code should perform basic testing and follow coding rules, such as:

Error log search

Warning evaluation

Comments on critical steps

Comments on macro usage

Details of the SAS program (datetime of creation, SAS programmer name, datasets used, datetime of verification, name of second SAS programmer, etc.)

Program verification should then be performed by a second SAS programmer.
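The "error log search" and "warning evaluation" steps listed above can be partly automated. The following is a minimal sketch, not the procedure used by the authors: the log path in `logfile` is a hypothetical placeholder, and the list of NOTE texts to flag is an assumption that each team would adapt to its own coding rules.

```sas
/* Hypothetical path: point this at the saved log of the program under review */
%let logfile = /path/to/program.log;

data log_issues;
    length line $1000;
    infile "&logfile" truncover lrecl=1000;
    input line $char1000.;
    lineno = _n_;
    /* Keep ERROR and WARNING lines, plus NOTEs that often hide problems */
    if line =: 'ERROR'
       or line =: 'WARNING'
       or index(line, 'uninitialized') > 0
       or index(line, 'Missing values were generated') > 0
       or index(line, 'MERGE statement has more than one data set with repeats') > 0
       then output;
run;

/* An empty report is the evidence that the log is clean */
proc print data=log_issues noobs;
    var lineno line;
run;
```

Saving the printed report with the program gives the second programmer a documented starting point for the review.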

(9)

How does the verification process work

Biostatistician creates specs, then submits request

SAS developer produces TLGs, then submits verification request

Quality Control programmer verifies results

Interactive Process

(10)

Different Verification Procedures

The SOP should define different verification procedures.

✓ Independent programming

✓ Reviewing results

✓ Random review of results

✓ Visually verify code

Some of them should be mandatory, others optional.

The document containing the programming specs (for example the SAP) should define which approach to follow, illustrating program verification techniques (for example using alternative SAS programming procedures).

The determination of the level of validation should follow a risk-based model. The key is to determine the effect on the process if the program does not produce the desired result.

(11)

Error Types

Business strategy should identify common ‘error types’ found in:

✓ Statistical tables

✓ Listings

✓ Graphs

✓ Data analysis files

✓ Header section of SAS programs

✓ Bad programming specifications

Metric report related to error type should be analyzed in order to

(12)

Specific CDISC SDTM Validation specs - Metadata Level

Verifies that all required variables are present in the dataset

Reports as an error any variables in the dataset that are not defined in the domain

Reports a warning for any expected domain variables which are not in the dataset

(13)

Specific CDISC SDTM Validation specs - Metadata Level

Notes any permitted domain variables which are not in the dataset

Verifies that all domain variables are of the expected data type and proper length

Detects any domain variables which are assigned a controlled terminology specification by the domain and do not have a format assigned to them
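The metadata-level checks described on these two slides could be sketched as a comparison between a specification table and the dataset's actual attributes. Everything named here is an assumption for illustration: the `dm_spec` table, its columns `core`, `spec_type` and `spec_len`, the toy `WORK.DM` dataset, and the wording of the findings. The controlled terminology (format) check is not covered.

```sas
/* Hypothetical specification: one row per variable defined for the domain.
   CORE takes the SDTM values Req, Exp or Perm. */
data dm_spec;
    length name $32 core $4 spec_type $4;
    input name $ core $ spec_type $ spec_len;
    datalines;
STUDYID Req  char 12
USUBJID Req  char 20
AGE     Exp  num  8
SEX     Req  char 1
RACE    Exp  char 40
;
run;

/* Toy dataset standing in for the real DM domain */
data work.dm;
    length STUDYID $12 USUBJID $20 SEX $1 ARM $20;
    STUDYID = 'ABC'; USUBJID = 'ABC-001'; SEX = 'M'; ARM = 'Tmt A';
run;

proc sql;
    /* Actual attributes, read from the SAS dictionary tables */
    create table dm_actual as
    select upcase(name) as name length=32, type, length
    from dictionary.columns
    where libname = 'WORK' and memname = 'DM';

    /* One finding per variable found in either the spec or the dataset */
    create table dm_meta_check as
    select coalescec(s.name, a.name) as name length=32,
           s.core,
           case
               when s.name is null then 'ERROR: variable not defined in domain'
               when a.name is null and s.core = 'Req' then 'ERROR: required variable missing'
               when a.name is null and s.core = 'Exp' then 'WARNING: expected variable missing'
               when a.name is null then 'NOTE: permissible variable missing'
               when s.spec_type ne a.type then 'ERROR: unexpected data type'
               when s.spec_len ne a.length then 'ERROR: unexpected length'
               else 'OK'
           end as finding length=40
    from dm_spec as s
         full join dm_actual as a
         on s.name = a.name
    order by 1;
quit;

proc print data=dm_meta_check noobs;
run;
```

With this toy data, ARM should be reported as not defined in the domain, and AGE and RACE as missing expected variables.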

(14)

SAS Programming Rules when validating

➢ Emphasizing well-commented programs.

➢ Using macros so that validation programs can be reused to verify different programs (re-usability).

➢ Using alternative SAS programming procedures when validating.
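As an illustration of the re-usability rule, a validation macro might wrap PROC COMPARE so that the same code verifies any pair of datasets. This is a sketch under assumptions: the macro name `qc_compare`, its parameters, and the PASSED/FAILED messages are hypothetical. The datasets `listing` and `validation` and the ID variable `pt` are taken from the case study that follows.

```sas
%macro qc_compare(base=, comp=, id=);
    %local qc_rc;

    /* The ID statement expects both inputs in ID order */
    proc sort data=&base out=_qc_base;
        by &id;
    run;
    proc sort data=&comp out=_qc_comp;
        by &id;
    run;

    proc compare base=_qc_base compare=_qc_comp listall;
        id &id;
    run;

    /* SYSINFO holds the PROC COMPARE return code; 0 means no differences.
       It is captured immediately, because the next step resets it. */
    %let qc_rc = &sysinfo;

    %if &qc_rc = 0 %then %put NOTE: QC PASSED for &base versus &comp..;
    %else %put WARNING: QC FAILED for &base versus &comp (SYSINFO=&qc_rc).;
%mend qc_compare;

/* Example call */
%qc_compare(base=listing, comp=validation, id=pt);
```

Writing the outcome to the log with a NOTE or WARNING prefix means a failed comparison also surfaces in a routine log search.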

(15)

How to optimize the process

Good specs & Good standards & Good training =

(16)

A Case

(17)

Example of Derived Datasets Validation (1/4)

“First Programmer” programs all derived datasets

“Second Programmer” programs all derived datasets

PROC COMPARE: compare original derived datasets versus validation derived datasets

(18)

Example of Derived Datasets Validation (2/4)

The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)

Observation Summary

Observation      Base  Compare  ID

First Obs           1        1  pt=121
First Unequal      79       79  pt=201
Last Unequal       79       79  pt=201
Last Obs           89       89  pt=212

Number of Observations in Common: 89.
Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.

Number of Observations with Some Compared Variables Unequal: 1.
Number of Observations with All Compared Variables Equal: 88.

proc compare base=listing compare=validation listbase listcomp;
    id pt;
run;

(19)

Example of Derived Datasets Validation

Values Comparison Summary

Number of Variables Compared with All Observations Equal: 3.
Number of Variables Compared with Some Observations Unequal: 1.
Total Number of Values which Compare Unequal: 1.
Maximum Difference: 1.

Variables with Unequal Values

Variable  Type  Len  Label        Ndif   MaxDif

age       NUM     8  AGE (years)     1    1.000

Value Comparison Results for Variables

_________________________________________________________
           ||  AGE (years)
           ||       Base    Compare
        pt ||        age        age      Diff.     % Diff
   _______ ||  _________  _________  _________  _________
           ||
       201 ||         41         40    -1.0000    -2.4390
_________________________________________________________

(20)

Example of Derived Datasets Validation

The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)

Observation Summary

Observation      Base  Compare  ID

First Obs           1        1  pt=121
Last Obs           89       89  pt=212

Number of Observations in Common: 89.
Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.

Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 89.

NOTE: No unequal values were found. All values compared are exactly equal.

(21)

Example of Tables Validation (1/3)

“First Programmer” programs all tables applying the set of layout specifications and saves outputs in Word

“Second Programmer” programs all tables without adding SAS code to control the output

(22)

Example of Tables Validation (2/3)

First Programmer - Output in Word

________________________________________________________________
                      Tmt A              Tmt B
________________________________________________________________
Age (years)
  n                   41                 48
  Mean (SD)           51.44 (10.39)      52.10 (11.00)
  Median              55.00              55.00
  Min - Max           30.00 - 66.00      27.00 - 71.00
Gender
  Female              14 (34.15%)        21 (43.75%)
  Male                27 (65.85%)        27 (56.25%)
________________________________________________________________

Second Programmer - Output in SAS

proc means data=demog n mean stddev median min max;
    var age;
    by tmt;  /* demog must be sorted by tmt */
run;

(23)

Example

First Programmer - Output in Word

________________________________________________________________
                      Tmt A              Tmt B
________________________________________________________________
Age (years)
  n                   41                 48
  Mean (SD)           51.44 (10.39)      52.10 (11.00)
  Median              55.00              55.00
  Min - Max           30.00 - 66.00      27.00 - 71.00
Gender
  Female              14 (34.15%)        21 (43.75%)
  Male                27 (65.85%)        27 (56.25%)
________________________________________________________________

Second Programmer - Output in SAS

proc freq data=demog;
    tables gender*tmt;
run;

(24)

Example of Listings Validation (1/2)

“First Programmer” programs all listings applying the set of layout specifications and saves outputs in Word

“Second Programmer” prints derived datasets in SAS

Compare listing output in Word versus output in SAS of derived dataset

(25)

Example of Listings Validation (2/2)

Listing 1
Demographic Characteristics

Subject ID       Gender   Age   Race
_______________  _______  ____  _____
121              M        50    3
122              M        34    3
123              F        58    3
124              M        64    3
125              M        57    3
126              F        64    3
127              M        39    3
128              M        55    2
129              M        41    3
130              M        44    3
131              M        32    3
132              M        37    3
133              M        61    3
134              F        56    3
135              M        34    3
136              M        34    3
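The second programmer's side of this check, printing the derived dataset without layout code, might look like the sketch below. The dataset name `demog` and the variable names `pt`, `gender`, `age` and `race` are assumptions based on the code fragments elsewhere in the slides. The title text is purely illustrative.

```sas
/* Assumed: derived dataset WORK.DEMOG with variables pt, gender, age, race */
proc sort data=demog out=demog_sorted;
    by pt;
run;

/* Plain print in subject order, to be read side by side with the Word listing */
proc print data=demog_sorted noobs label;
    var pt gender age race;
    label pt     = 'Subject ID'
          gender = 'Gender'
          age    = 'Age'
          race   = 'Race';
    title 'QC print of derived dataset DEMOG for Listing 1';
run;
title;
```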

(26)

Example

(27)

Metrics on Programming Errors

Error types: Programming 41%, Specification 14%, Layout 45%

Programming: Selection of variables 14%, Calculation of variables 20%, SAS programming 66%

Specification: Specification not detailed 40%, Wrong interpretation of specification 60%

Layout: Output writing 56%, Output structure 30%, Display variables 14%

(28)

Examples of Errors

Layout

Writing of a note in table

Incorrect: “Percentages are calculated number of patients”

Correct: “Percentages are calculated on number of patients”

(29)

Examples of Errors

Programming

Incorrect:

data age;
    set demog;
    if age<20 then age_c=1;
    else if 20<age<40 then age_c=2;  /* age=20 is left uncategorized */
    else if age>=40 then age_c=3;
run;

Correct:

data age;
    set demog;
    if age<20 then age_c=1;
    else if 20<=age<40 then age_c=2;
    else if age>=40 then age_c=3;
run;
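A boundary error of this kind is easy to miss by reading the code, but shows up quickly when the derivation is run on values chosen around each cut-off. The sketch below is an illustration, not part of the original case study: the test dataset `demog_test` and its values are hypothetical, and the explicit branch for missing ages is an added assumption. In SAS a missing numeric value compares as smaller than any number, so `age<20` alone would place missing ages in category 1.

```sas
/* Hypothetical boundary values around each cut-off, plus a missing age */
data demog_test;
    input age @@;
    datalines;
19 20 21 39 40 41 .
;
run;

data age_check;
    set demog_test;
    if missing(age)        then age_c = .;  /* assumption: keep missing ages out of category 1 */
    else if age < 20       then age_c = 1;
    else if 20 <= age < 40 then age_c = 2;
    else                        age_c = 3;
run;

/* The minimum and maximum age per category show where each boundary falls */
proc means data=age_check n min max nway missing;
    class age_c;
    var age;
run;
```

An independent second programmer comparing this output with the specification would see at once whether age 20 and age 40 land in the intended categories.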

(30)

Examples of Errors

Wrong interpretation of specification

Note of a table (in SAP):

“Note 1: Only patients with all value for primary analysis are included in the table.”

In SAS Program:

(31)

Thank you for your attention

Questions
