USING CLSI GUIDELINES TO
PERFORM METHOD EVALUATION STUDIES IN YOUR LABORATORY
James Blackwood, MS, CLSI
David D. Koch, PhD, FACB, DABCC, Pathology & Laboratory Medicine, Emory University School of Medicine
Breakout Session 3B Tuesday, May 1
8:30 – 10 am
Outline
Learning Objectives
• Identify the seven performance characteristics that should be evaluated prior to reporting results from a new test
method.
• Verify the claims of manufacturers regarding analytical performance by following CLSI guidelines.
• Demonstrate ongoing compliance with the method
evaluation criteria contained in accreditation guidelines.
Who is CLSI and what are these guidelines?
Method evaluation basic definitions and experiments
Use of the CLSI Evaluation Protocols
Use of StatisPro software in method evaluation
Who is CLSI?
• Clinical and Laboratory Standards Institute
• ANSI-accredited, global, nonprofit standards development organization
• CLSI has over 2,000 members – organizations such as IVD
manufacturers, hospital laboratories, reference laboratories, universities, professional associations, and government agencies
We promote the development and use of voluntary consensus standards and guidelines within the health care community.
Our documents help health care organizations meet their responsibilities with efficiency, effectiveness, and global acceptance.
We Make the “Blue Books”
Standards in the Clinical Laboratory
Goal of standardization in the laboratory:
The right laboratory test at the right time with the right result leads to quality diagnostics, improved patient care, and improved public health around the world.
Standardized test
Standardized procedure
Standardized reporting
Improved outcomes
• CLSI documents are developed by volunteer experts from three distinct constituencies: professions, government, and industry.
• Under the supervision of a consensus committee, these volunteers work on:
Document Development Committees or
Standing subcommittees and Working groups
How are CLSI Documents developed?
CLSI Consensus Process
Balance among three constituencies:
• Government
• Industry
• Professions
Why are CLSI Guidelines Important?
The US Food and Drug Administration (FDA) recognizes over 100 CLSI documents.
The College of American Pathologists (CAP) recognizes 80 CLSI documents.
The Joint Commission recognizes over 144 CLSI documents.
All Evaluation Protocol guidelines in this presentation are recognized by all three groups.
Why are Evaluation Protocols Important?
• Provide clear and thorough guidance.
Evaluation protocols are guidelines for clinical laboratories and manufacturers to characterize the performance of their analytical systems.
• Ensure consistency with good laboratory practice.
Good laboratory practice requires clinical laboratories to verify
performance claims before reporting results used for decisions about patient care.
• Help you to comply with the law!
Evaluation of performance characteristics is required by regulatory and accreditation bodies in the United States.
See http://www.cms.hhs.gov/clia (§493.1255).
CLSI and Evaluation Protocols
CLSI has over 25 Evaluation Protocol Guidelines.
These include:
EP05 – Evaluation of Precision
EP06 – Evaluation of Linearity
EP09 – Evaluation of Bias and Comparability Using Patient Samples
EP10 – Preliminary Evaluation (Bias, Carryover, Drift, Linearity)
EP15 – Verification of Precision and Trueness
EP17 – Limits of Detection and Limits of Quantitation
C28 – Defining, Establishing, and Verifying Reference Intervals
Performance Characteristics
The seven performance characteristics that should be evaluated before reporting results of a new test method include:
1. Precision
2. Accuracy (measured bias) or comparability (measured differences)
3. Linearity over the measuring interval or analytical measurement range (AMR)
4. Limit of detection (LoD) and limit of quantitation (LoQ or analytical sensitivity)
5. Specificity or interference
6. Reagent or sample (analyte) carryover
7. Reference interval or decision value (interpretive information)
Precision & Accuracy
• Apply a clinical perspective; set a target, an analytical goal, before you begin
• Perform experiments that gather representative data about a method’s analytical performance
• Convert data into statistical estimates of errors
• Compare error estimates to specifications for medically allowable error for an objective assessment of the
errors
Introducing a New Method
Establish a Need
Method Selection
Method Evaluation
Method Development
Define the Quality Goal
Implementation
Routine Analysis
Submit Specimen
Quality Control Practices
Preventive Maintenance
Report Result
APPROACH IN METHOD EVALUATION:
Evaluate imprecision and inaccuracy
IMPRECISION: refers to random analytic error (lack of repeatability, reproducibility)
INACCURACY: refers to systematic analytic error (lack of trueness)
1. Constant
2. Proportional
TOTAL ERROR: combined error for a single result
RELIABLE DECISIONS ABOUT PERFORMANCE REQUIRE:
1. Standards for acceptable performance
2. Experimental protocols to estimate performance reliably
3. Criteria for comparing estimated
performance with performance standards
PERFORMANCE STANDARD (PS)
Specify:
E_A . . . Allowable error
X_C . . . Decision level
Format:
PS = E_A at X_C
ALLOWABLE ERROR (E_A)
The amount of error that can be tolerated without
• invalidating the medical usefulness of the result or
• causing the test to fail a proficiency testing event
DECISION LEVEL (X_C)
Any concentration of the analyte that is critical for medical interpretation — whether for
• diagnosis,
• monitoring, or
• therapeutic decisions.
Laboratory data are most carefully interpreted at
these decision level concentrations.
DECISION LEVELS FOR GLUCOSE
[Figure: plasma glucose scale, mg/dL, with ticks at 0, 20, 50, 80, 126, 160, 200, 260, 300, 340; decision levels 1, 2, and 3 marked]
Performance standards for glucose      Medical decision
PS_1 = 6.0 mg/dL @ 50 mg/dL            Hypoglycemia
PS_2 = 10% = 12.6 mg/dL @ 126 mg/dL    Impaired glucose control
PS_3 = 30 mg/dL @ 300 mg/dL            Poorly controlled diabetes
Sources of Allowable Errors
1. Proficiency testing requirements for acceptable performance
2. Literature guidelines
a. based on physician surveys
• e.g., Karon, Boyd & Klee, Glucose Meter Performance Criteria for Tight Glycemic Control Estimated by Simulation Modeling, Clin Chem, 2010; 56: 1091-97
b. based on intra-individual biological variation of analyte
• Ricos C et al., Scand J Clin Lab Invest, 1999; 59: 491-500
• Fraser C, “Biological Variation: From Principles to Practice”, AACC, 2001
• Internet at http://www.westgard.com/biodatabase1.htm
3. Input from clinicians and/or professional judgment
Formulation of Criteria to Judge Analytic Errors
General form:
compare observed analytic error to the specification for allowable analytic error
Performance is acceptable when:
observed error < allowable error
Performance is not acceptable when:
observed error > allowable error
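The acceptance rule can be sketched in a few lines of Python. This is an illustration only, not part of any CLSI guideline. The allowable errors are the glucose performance standards from the slides; the function names, the observed bias and SD values, and the total error model (|bias| + 2 × SD, one common convention) are assumptions.

```python
# Glucose performance standards from the slides:
# decision level X_C (mg/dL) -> allowable error E_A (mg/dL)
GLUCOSE_PS = {50: 6.0, 126: 12.6, 300: 30.0}


def total_error(bias, sd, z=2.0):
    """Assumed total error model: |bias| + z * SD (a convention, not from the slides)."""
    return abs(bias) + z * sd


def is_acceptable(observed_error, allowable_error):
    """Acceptance rule from the slides: observed error < allowable error."""
    return observed_error < allowable_error


# Hypothetical observed performance at each decision level: (bias, SD) in mg/dL
observed = {50: (1.0, 1.5), 126: (2.0, 3.0), 300: (8.0, 12.0)}

results = {}
for x_c, (bias, sd) in observed.items():
    te = total_error(bias, sd)
    results[x_c] = is_acceptable(te, GLUCOSE_PS[x_c])
    print(f"X_C = {x_c} mg/dL: TE = {te:.1f}, E_A = {GLUCOSE_PS[x_c]:.1f}, "
          f"acceptable = {results[x_c]}")
```

With these made-up numbers the method passes at 50 and 126 mg/dL but fails at 300 mg/dL, which shows why performance is judged at each decision level separately.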
Performance Characteristics: Precision
CLSI Guidelines for Precision
• EP15: a five-day procedure to verify that imprecision meets the claims of a measurement procedure
(EP15 is most frequently used by clinical laboratories for method evaluation.)
• EP05: a 20-day procedure to establish the imprecision for
a measurement procedure
Replication Experiment
1. Time period: within-run, within-day, day-to-day
2. Number of samples: minimum of 20
3. Sample matrix: simulate patient sample
4. Analyte concentration: medical decision limit
5. Calculations: mean, standard deviation (SD), coefficient of variation (CV)
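The calculations in step 5 can be sketched as follows. The 20 replicate values are hypothetical glucose results near the 126 mg/dL decision level, and the function name is an assumption. This is only the basic arithmetic, not the full EP05 or EP15 analysis.

```python
import statistics


def replication_stats(results):
    """Return mean, sample SD (n - 1 denominator), and CV in percent."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)
    cv = 100.0 * sd / mean
    return mean, sd, cv


# Hypothetical replicate results (mg/dL), minimum of 20 as in the slide
replicates = [124, 126, 127, 125, 128, 126, 123, 127, 129, 125,
              126, 124, 128, 127, 125, 126, 130, 122, 126, 126]

mean, sd, cv = replication_stats(replicates)
print(f"n = {len(replicates)}, mean = {mean:.1f} mg/dL, SD = {sd:.2f} mg/dL, CV = {cv:.2f}%")
```

The SD (or CV) from such an experiment is the observed random error that is then compared with the allowable error at that decision level.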
Performance Characteristics: Accuracy
Accuracy [Trueness] (Measured as Bias)
• Bias: Estimate of a systematic measurement error; a quantitative measure of the average difference between results from a
measurement procedure and results from an accepted reference measurement procedure.
• When a reference measurement procedure is not available for an analyte, a best-available comparative method may be used to measure bias.
• Frequently, clinical laboratories perform a comparison of patient sample results between a new and an existing measurement procedure.
(“correlation studies”)
CLSI Guidelines for Trueness (Measured as Bias)
• EP15: a method comparison to verify that a new method conforms to a manufacturer’s claim for comparability to another procedure.
(minimum of 20 patient samples)
• EP09: a method comparison to establish a claim for method comparability.
(minimum of 40 patient samples)
Performance Characteristics: Accuracy
Comparison of Methods Experiment
CLSI EP9-A:
“User Comparison of Quantitative Clinical Laboratory Methods Using Patient Samples”
1. Choice of comparative method:
• critical for the conclusions which can be made
2. Number of test samples:
• minimum N = 40
• uniform distribution (EP9-A includes a table)
• a “bin-box” approach
Bin-box approach:
[Figure: number of samples in each concentration bin; y-axis "Number of samples", ticks at 5 and 10]
3. Replicates:
• required for EP9-A: desirable, but not always practical
4. Time period:
• minimum of 5 days
5. Data analysis:
• review daily
• check for maximum allowable differences between methods
• EP9-A includes a test for outliers within and between methods
6. EP9-A has a section on establishing manufacturer’s claims
Three Approaches to Analyzing Comparison of Methods Data
1. correlation coefficient
2. t-test statistics
3. regression statistics
Sensitivity of Statistical Parameters to Errors
Parameter                  Random   Constant   Proportional
LEAST SQUARES
  SLOPE                    no       no         yes
  Y-INTERCEPT              no       yes        no
  STD. ERROR               yes      no         no
T-TEST
  BIAS                     no       yes        yes
  sd                       yes      no         yes
CORRELATION COEFFICIENT
  r                        yes      no         no
Effect of range on the correlation coefficient
Range             0 to 300    70 to 110
Random Error      10 units    10 units
Corr. Coef., r    0.986       0.764
Correlation coefficient, r
• Responds to random error.
• Value depends on the range of data.
• Does not estimate analytical bias or random error between methods.
• Merely presents the relationship of the range of the data to the scatter of the data between methods.
Therefore, the correlation coefficient should NOT be used to
judge acceptability of analytical methods in method comparison
studies.
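The dependence of r on the data range can be illustrated with a small simulation. This is a sketch under stated assumptions: x values drawn uniformly over the range, Gaussian random error with an SD of 10 units added to y only, no bias, and a fixed seed. These choices are not from the slides, so the output will not exactly reproduce the r values quoted there.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_r(low, high, random_error_sd=10.0, n=1000, seed=1):
    """Simulate two unbiased methods over [low, high] and return r."""
    rng = random.Random(seed)
    x = [rng.uniform(low, high) for _ in range(n)]
    y = [xi + rng.gauss(0.0, random_error_sd) for xi in x]
    return pearson_r(x, y)


r_wide = simulate_r(0, 300)      # wide range of data
r_narrow = simulate_r(70, 110)   # narrow range, same random error
print(f"range 0 to 300:  r = {r_wide:.3f}")
print(f"range 70 to 110: r = {r_narrow:.3f}")
```

The analytical error is identical in both runs; only the range changes, yet r drops markedly for the narrow range.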
Linear regression statistics…
Subject to certain limitations:
• Data must be linear
• Outliers must be carefully examined
• Range of data must be wide:
a. r > 0.99 (Wakkers et al.)
b. r > 0.975 (CLSI EP9-A)
Recommendations for Method Comparison
Summary
• Present graph of data
• Present slope, y-intercept, and S_y/x
• Present mean and standard deviation of “X” data
• Present correlation coefficient ONLY to show that least squares regression is applicable; if not, use Deming or Passing-Bablok regression statistics
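The statistics recommended in this summary can be computed as in the sketch below. It covers ordinary least squares only (not Deming or Passing-Bablok). The function name and the paired results are hypothetical, and far fewer than the 40 samples called for in EP09 are shown to keep the example short.

```python
def least_squares(x, y):
    """Ordinary least-squares fit of y (new method) on x (comparative method)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {
        "slope": slope,
        "intercept": intercept,
        "s_yx": (resid_ss / (n - 2)) ** 0.5,   # standard error of the estimate
        "r": sxy / (sxx * syy) ** 0.5,
        "mean_x": mean_x,
        "sd_x": (sxx / (n - 1)) ** 0.5,
    }


# Hypothetical paired patient results (mg/dL): x = existing method, y = new method
x = [48, 75, 98, 126, 155, 198, 247, 301]
y = [50, 77, 101, 128, 160, 203, 255, 309]

stats = least_squares(x, y)
# r is used only to judge whether the range is wide enough for least squares
# (the r > 0.975 criterion attributed to EP9-A in the slides)
ols_applicable = stats["r"] > 0.975
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
print("least squares applicable:", ols_applicable)
```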
Performance Characteristics:
Linearity
Linearity – Measuring Interval or Analytical Measurement Range (AMR)
• A linearity study is used to establish or verify the measuring interval for a measurement method.
• Measuring Interval: the interval between lower and upper numerical values for which a method can produce quantitative results suitable for the intended clinical use.
• The measuring interval is verified by demonstrating a linear relationship between the measured and expected concentrations.
CLSI Guideline for Linearity – Measuring Interval
EP06: procedures to verify or establish the linear measuring interval of a measurement procedure.
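A simple first look at measured versus expected concentrations is the percent deviation at each level. The sketch below is not the EP06 procedure. The function name, the five-level data set, and the 5% limit are hypothetical.

```python
def levels_out_of_limit(expected, measured, allowable_pct):
    """Return the expected concentrations whose measured value deviates
    from expected by more than allowable_pct percent."""
    flagged = []
    for exp, meas in zip(expected, measured):
        deviation_pct = 100.0 * (meas - exp) / exp
        if abs(deviation_pct) > allowable_pct:
            flagged.append(exp)
    return flagged


# Hypothetical five-level dilution series (mg/dL); expected values must be nonzero
expected = [50, 100, 200, 300, 400]
measured = [51, 99, 202, 298, 360]

out_of_limit = levels_out_of_limit(expected, measured, allowable_pct=5.0)
print("levels outside the limit:", out_of_limit)
```

In this made-up series only the highest level falls outside the limit, which would suggest that the verified measuring interval ends below it.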
Performance Characteristics:
LoD/LoQ
Limit of Detection (LoD) & Limit of Quantitation (LoQ) (sometimes referred to as “Analytical Sensitivity”)
• LoD: the lowest amount of analyte (measurand) in a sample that can be detected with a stated probability.
• LoQ: the lowest amount of analyte (measurand) in a sample that can be quantified with acceptable precision and bias under stated experimental conditions.
• Usually, laboratories review and accept the manufacturer’s claims for LoD and LoQ.
But these characteristics can be tested by laboratories using:
CLSI Guideline for LoD and LoQ
EP17: procedures for verifying or establishing the LoD and the LoQ
Performance Characteristics:
Interference
• Interference: an artifactual increase or decrease in the apparent quantity of an analyte due to the presence of a substance that reacts
nonspecifically with the measuring system.
• Most manufacturers evaluate a large number of substances known or suspected to be potential interferents. They report this information in the Instructions For Use (IFU).
• It is not practical for most clinical laboratories to repeat such an investigation, and inspection of the manufacturer’s information is frequently sufficient.
But this characteristic can be tested by laboratories using:
CLSI Guideline for Interference
EP7: procedures for testing constant error due to interference
Interference Experiment: Factors
1. See CLSI EP7-A2
2. What to test:
• Literature review
• Always test hemolysis, lipemia, bilirubin
• Tube additives
3. Concentrations to test:
• Interferent: highest compatible with life
• Analyte: at medical decision levels
4. Volume of interferent <10% of sample
5. Replicates: Based on “Effect / S_tm” (see EP7)
6. Validate technique with current method
Interference Experiment: N=?
Number of Measurements / Replicates:
• at least several samples per interferent
• at least duplicates per sample
• EP7 lists a table of N as a function of bias/S_tm (E_A,I/S_tm), with which one can determine how many replicates are necessary to reach 95% probability of observing a certain magnitude of error:
E_A,I/S_tm   No. Replicates   E_A,I/S_tm   No. Replicates
0.8          41               1.5          12
1.0          26               1.6          10
1.1          22               1.8          8
1.2          18               2.0          7
1.3          16               2.5          6
1.4          14               3.0          3
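The table lookup can be sketched as below. The values are transcribed from the slide as shown, not from EP7 itself. The function name and the rule of rounding the ratio down to the nearest tabulated value (which never under-counts replicates) are assumptions.

```python
import bisect

# (E_A,I / S_tm ratio, number of replicates), transcribed from the slide's table
REPLICATE_TABLE = [
    (0.8, 41), (1.0, 26), (1.1, 22), (1.2, 18), (1.3, 16), (1.4, 14),
    (1.5, 12), (1.6, 10), (1.8, 8), (2.0, 7), (2.5, 6), (3.0, 3),
]


def replicates_needed(allowable_interference, s_tm):
    """Look up replicates for the largest tabulated ratio <= E_A,I / S_tm."""
    ratio = allowable_interference / s_tm
    ratios = [r for r, _ in REPLICATE_TABLE]
    idx = bisect.bisect_right(ratios, ratio) - 1
    if idx < 0:
        raise ValueError("ratio is below the tabulated range")
    return REPLICATE_TABLE[idx][1]


# Hypothetical case: allowable interference 3.0 mg/dL, S_tm 2.0 mg/dL
n = replicates_needed(3.0, 2.0)
print("replicates needed:", n)
```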
Performance Characteristics:
Carryover
• Carryover: the discrete amount of reagent or analyte carried by the measuring system from one test into subsequent test(s), thereby erroneously affecting test results.
• Reagent carryover among different measurement procedures on multichannel automated analyzers is an evaluation that is usually conducted by measuring system manufacturers.
But this characteristic can be tested by laboratories using:
CLSI Guideline for Carryover
EP10: includes an assessment of sample carryover along with other parameters.
NOTE: EP10 is intended to determine if a device has unacceptable performance. It is recognized in the CAP Chemistry Checklist as an acceptable way to measure carryover.
Performance Characteristics:
Reference Intervals
Reference Interval: interpretive information for laboratory test results that is frequently provided as the central 95% interval of results for a group of well-defined reference individuals.
Laboratories can produce reference intervals in a variety of ways, including testing procedures found in…
CLSI Guideline for Reference Intervals or Decision Value
C28: procedures for establishing a reference interval or verifying the suitability of a manufacturer-proposed reference interval
“Transference” of established reference intervals to an individual laboratory or a new method may be accomplished in a variety of ways:
1. Subjective assessment by a responsible individual;
• the Medical Director (sometimes called “by divine judgment”)
2. Donor testing
a. Verify with ~ 20 donor samples
b. Validate/Estimate using ~ 60 donor samples
c. Establish using ~ 120 donor samples
3. Calculation
• use regression statistics from a comparison of methods study to calculate reference limits for the new method (Y) that correspond to the reference interval limits of the former method (X).
Reference Interval Determination
Y = a + b × X
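The calculation approach applies this equation to each reference limit of the former method. A minimal sketch follows. The function name, the regression constants, and the former method's interval are hypothetical; in practice a and b would come from the laboratory's own comparison of methods study.

```python
def transfer_reference_interval(lower_x, upper_x, intercept, slope):
    """Map former-method limits (X) to new-method limits (Y) using Y = a + b * X."""
    return intercept + slope * lower_x, intercept + slope * upper_x


# Hypothetical values: former interval 70 to 100 mg/dL, a = 2.0, b = 1.05
lower_y, upper_y = transfer_reference_interval(70.0, 100.0, intercept=2.0, slope=1.05)
print(f"new-method reference interval: {lower_y:.1f} to {upper_y:.1f} mg/dL")
```

A transferred interval of this kind is only as good as the regression behind it, so the method comparison recommendations above still apply.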
CLSI Makes Life Easier with StatisPro
In October 2010, CLSI released StatisPro software:
Direct, faithful implementation of CLSI Evaluation Protocol Guidelines
Study Advisor step-by-step help for each study
Four steps to complete a study:
Definition, Data Input, Analysis, and Signoff
StatisPro – Pick a Study Type
StatisPro – Study Design
Performance Claim to be Verified
Study Goal
Identifying Information
Details of the Study
Description of Materials Used
StatisPro – Data Entry
Copying and Pasting from any spreadsheet application or Windows application with clipboard support is easy.
StatisPro – Analysis
1 - Inspect group: Evaluate the data visually using various plots and tables.
You can choose to show or hide excluded observations.
2 - Outliers group: Select an observation to exclude from the calculations.
3 - Study-specific group: Select commands that continue to evaluate the data and reach a study conclusion.
4 - Sign Off group: Add any comments, your name, and a signature line to the study report so it is ready for a handwritten signature when printed.
StatisPro – Study Advisor
StatisPro Demonstration
• Demonstrate EP15 (method comparison) and EP06 (linearity).
User Experience with StatisPro
• StatisPro is useful when introducing new methods into your laboratory.
• StatisPro is useful when performing six-month linearity or “calibration verification”
studies.
• By using StatisPro:
You are demonstrating compliance with regulatory and accreditation bodies.
You are ensuring that your laboratory delivers accurate results.