PCERC 2013 B&W Sharon-Lise
Normand
Outline Context
Approaches for Big Data Closing Remarks
COMMON METHODOLOGICAL ISSUES FOR CER IN BIG DATA
Sharon-Lise Normand
Harvard Medical School and Harvard School of Public Health
sharon@hcp.med.harvard.edu
December 2013
OUTLINE
UNCERTAINTY AND SELECTIVE INFERENCE
1. Context
2. Methodological Approaches
3. Concluding Remarks
TRANSRADIAL VS TRANSFEMORAL PCI
CONTEXT
Radial artery access permits easier access and easier closure
Large number of patients undergoing both procedures
Not particularly well studied and of growing importance in the US
Marked heterogeneity in predisposition to bleeding
Significant treatment selection (healthier patients undergo transradial procedures)
[Figure: Massachusetts. Radial artery access and complications (%) by quarter, quarters 1 to 12, 10/2008 to 4/2011]
Treatment = radial artery access (vs femoral)
Outcome = bleeding/vascular complication
130,000 PCIs in MA adults
TRANSRADIAL VS TRANSFEMORAL PCI
Does radial artery access cause fewer complications compared to femoral artery access?
If so, length of stay (LOS) is shorter, money is saved, and patients are ambulatory sooner
Large data registry containing detailed clinical information on patients undergoing PCI
More than 300 variables measured on each person
Gets larger when considering treatment-specific information (multiple lesions)
Introduces selective inference issues
Drawing inference on a selected subset of the parameters, a subset that is selected because the parameters within it seem interesting after viewing the data
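The selection problem can be made concrete with a small simulation. The sketch below is illustrative only: the 300 covariates echo the registry size mentioned above, but the sample size per arm, the seed, and the 1.96 cutoff are assumptions, and every simulated covariate has no true difference between groups.

```python
import random
import statistics

def null_z_scores(n_vars=300, n_per_arm=200, seed=2013):
    """Simulate two-sample z statistics for covariates with NO true group difference."""
    rng = random.Random(seed)
    z = []
    for _ in range(n_vars):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        se = (statistics.variance(a) / n_per_arm + statistics.variance(b) / n_per_arm) ** 0.5
        z.append((statistics.mean(a) - statistics.mean(b)) / se)
    return z

z = null_z_scores()
# Keep only the covariates that "look interesting" after viewing the data.
selected = [v for v in z if abs(v) > 1.96]
print(f"{len(selected)} of {len(z)} null covariates look 'significant'")
print(f"largest |z| overall: {max(abs(v) for v in z):.2f}")
```

Roughly 5% of the null covariates are expected to pass the cutoff by chance, so reporting only the selected ones as if they had been chosen in advance overstates the evidence.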
SELECTIVE INFERENCE
An old issue becoming a big problem because:
Better data acquisition technologies
More interconnectivity
Increasing focus on use of observational databases for comparing the effectiveness of treatments
More perspectives:
Payer: Coverage with Evidence Development
Patient: Services that are high value for some may be low value for others (e.g., STEMI versus NSTEMI patients)
Health care provider: Adoption of value-enhancing technologies
Two issues:
Uncertainty - which is the correct model?
Bias - causal parameters
SELECTIVE INFERENCE
Many decisions required:
Select outcome(s)
Define treatments
Identify confounders
Inclusion/exclusion criteria for subjects
Causal framework
I will focus on confounders and causal framework
MOST COMMONLY EMPLOYED APPROACH
1: Methods that limit number of confounders based on perceived clinical relevance and estimate a single model
Identify confounders based on statistical testing and conduct inferences using the identified confounders
More than one model may fit the data well
All Subjects, by Intervention (values are % unless noted)

                                          Radial      Femoral
No. of Procedures                         5192        35022
Mean Age [SD]                             63 [12]     65 [12]
Female                                    25.3        29.8
Race
  White                                   89.6        89.4
  Black                                   3.3         3.2
  Hispanic                                4.3         3.5
  Asian                                   1.8         1.7
  Native American                         0.02        0.07
  Other                                   1.0         2.2
Health Insurance
  Government                              46.0        50.3
  Commercial                              4.8         13.4
  Other                                   49.2        36.3
Comorbidities
  Diabetes                                33.1        32.7
  Prior CHF                               9.4         12.7
  Prior PCI                               32.0        34.3
  Prior myocardial infarction (MI)        28.7        30.1
  Prior bypass surgery                    8.4         15.7
  Hypertension                            79.6        80.7
  Peripheral vascular disease             12.1        12.8
  Smoker                                  24.8        23.1
  Lung disease                            13.7        14.4
Cardiac Presentation
  Multi-vessel Disease                    10.3        10.9
  Number of Vessels > 70% stenosis (mean) 1.49        1.58
  Left main Disease                       3.7         7.2
  ST-elevated MI                          38.9        42.6
  Shock                                   0.44        1.8
Drugs Prior to Procedure
  Heparin (unfractionated)                87.3        61.7
  Heparin (low molecular weight)          3.83        4.27
  Thrombin                                25.5        54.9
  G2B3A inhibitors                        26.7        26.8
  Platelet Aggregate inhibitors           85.8        86.6
Intra-Aortic Balloon Pump                 0.10        0.55
In-Hospital Complication, %               0.69        2.73
Mean Difference, % (95% CI)               -2.04 (-2.30, -1.80)
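The unadjusted difference in the last row can be approximated from the complication rates and procedure counts alone. This is a minimal sketch assuming a Wald (normal approximation) interval with independent procedures; the talk does not say how its interval was computed, and the function name is illustrative.

```python
import math

def risk_difference_ci(p1, n1, p0, n0, z=1.96):
    """Wald interval for p1 - p0, treating the two arms as independent binomials."""
    diff = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return diff, diff - z * se, diff + z * se

# Radial: 0.69% of 5192 procedures; femoral: 2.73% of 35022 procedures
diff, lo, hi = risk_difference_ci(0.0069, 5192, 0.0273, 35022)
print(f"{100 * diff:.2f} ({100 * lo:.2f}, {100 * hi:.2f})")
```

The result lands close to the reported interval. It is a crude comparison: none of the baseline differences shown in the table have been adjusted for.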
BIG DATA SETTING
Methods that limit number of confounders based on perceived clinical relevance and estimate a single model
Main problems:
Exact confounders required to satisfy no unmeasured confounding assumptions are rarely known
Subgroups exhibiting heterogeneous treatment effects are rarely known
Increasing uncertainty amid the availability of high-dimensional covariate information
How to reduce the dimension of the problem?
DIMENSION REDUCTION TECHNIQUES
2a: Methods relying upon sparseness – only a small number of variables are required to parsimoniously represent the underlying data structure
$Y_i = X_i'\beta + \epsilon_i$, where $\beta$ is of low dimension
Main idea: assume many model parameters are 0 by imposing a penalty on including too many variables
Tools: penalized least squares; least absolute shrinkage and selection operator (LASSO) methods; and sparse additive models
No special attention to causality
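As an illustration of how a sparsity penalty sets coefficients exactly to zero, here is a minimal LASSO sketch using coordinate descent with soft thresholding. The simulated data, the penalty value `lam=0.3`, and the function names are assumptions for the example, not part of the talk. The treatment variable gets no special handling, which is the sense in which the method pays no attention to causality.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0 when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z_j = sum(c * c for c in col) / n
            if z_j == 0.0:
                continue
            # Covariance of column j with the partial residual (beta_j added back)
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / z_j
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_bj
    return beta

rng = random.Random(0)
n, p = 200, 10
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
# Hypothetical truth: only the first two covariates matter.
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 1) for row in X]
beta_hat = lasso(X, y, lam=0.3)
print([round(b, 2) for b in beta_hat])
```

The two real signals survive, shrunk toward zero by roughly the penalty, while most or all of the eight noise coefficients are set exactly to zero.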
DIMENSION REDUCTION TECHNIQUES
2b: Methods relying upon denseness – shrink estimates to a common mean and permit a small number of variables to have distinct coefficients
Tool: Kernel Regularized least squares
2c: Methods relying upon both denseness and sparseness – shrink estimates to a common mean and to zero so that there are two penalty terms
Tool: Elastic Net
No special attention to causality
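For reference, the penalties behind approaches 2a and 2c can be written in one common parameterization. The symbols $\lambda$, $\lambda_1$, and $\lambda_2$ are notation assumed here, not taken from the slides, and in this standard form the quadratic term shrinks coefficients toward zero rather than toward a common mean.

```latex
\begin{align*}
\text{LASSO (sparseness):}\quad
  & \hat\beta = \arg\min_{\beta}\ \frac{1}{2n}\sum_{i=1}^{n}\bigl(Y_i - X_i'\beta\bigr)^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{Elastic net (both):}\quad
  & \hat\beta = \arg\min_{\beta}\ \frac{1}{2n}\sum_{i=1}^{n}\bigl(Y_i - X_i'\beta\bigr)^2
    + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
    + \lambda_2 \sum_{j=1}^{p} \beta_j^2
\end{align*}
```

Setting $\lambda_2 = 0$ recovers the LASSO, and setting $\lambda_1 = 0$ leaves a purely quadratic (ridge-type) penalty, which is why the elastic net carries two penalty terms.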
DIMENSION REDUCTION TECHNIQUES
3: Model averaging approaches
$p(\Delta) = \sum_{k=1}^{K} p(\Delta \mid M_k)\, p(M_k)$
$M_k$ indexes the model ($k = 1, \dots, K$) and $\Delta$ is a parameter of interest
$\Delta$ = bleeding risk in radial artery access patients − bleeding risk in femoral artery access patients
$M_1$ may be a polynomial regression model; $M_2$ a logistic model with many interactions, etc.
Difficult to define the space of models over which to average
No real link to causality in development
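A small numerical sketch of the averaging step may help. It assumes each model-specific distribution of ∆ is approximately normal and that model probabilities are already available; the three estimates, standard errors, and probabilities below are hypothetical, not results from the talk.

```python
import math

def model_average(estimates, std_errors, weights):
    """Mean and SD of the mixture sum_k w_k * N(estimate_k, se_k^2)."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalise so the weights sum to 1
    mean = sum(wk * mk for wk, mk in zip(w, estimates))
    # Total variance = within-model variance + between-model spread
    var = sum(
        wk * (sk ** 2 + (mk - mean) ** 2)
        for wk, mk, sk in zip(w, estimates, std_errors)
    )
    return mean, math.sqrt(var)

# Hypothetical inputs for three candidate models (risk difference, in % points)
delta_k = [-1.5, -2.0, -2.5]
se_k = [0.3, 0.4, 0.5]
prob_k = [0.5, 0.3, 0.2]
mean, sd = model_average(delta_k, se_k, prob_k)
print(f"averaged estimate {mean:.2f}, SD {sd:.2f}")
```

The averaged standard deviation exceeds the weighted within-model variability because disagreement between models is counted as uncertainty, which a single selected model would ignore.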
DIMENSION REDUCTION TECHNIQUES
Estimate the treatment assignment model (propensity score) and the outcome model simultaneously, then average
More in line with causal thinking
Y = observed outcome; X = observed confounders; T = binary treatment (1 = new; 0 = standard)
Assume you have "all the measured confounders"
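$\text{logit}\, P(T_i = 1) = \gamma_0 + \sum_{j=1}^{p} \alpha_j^X \gamma_j X_{ij}$
$Y_i = \beta_0^{\alpha^Y} + \beta_T^{\alpha^Y} T_i + \sum_{j=1}^{p} \alpha_j^Y \beta_j^{\alpha^Y} X_{ij} + \epsilon_i^Y$
$\alpha_j^Y$ and $\alpha_j^X$ = "inclusion" probabilities
Confounders: those with large values of both $\alpha_j^Y$ and $\alpha_j^X$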
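The final rule, keeping covariates whose inclusion probabilities are large in both the treatment model and the outcome model, can be sketched as a simple filter. The 0.5 threshold, the covariate names, and the probabilities are hypothetical; the talk does not specify what counts as "large".

```python
def flag_confounders(incl_treatment, incl_outcome, threshold=0.5):
    """Return covariates whose inclusion probability is >= threshold in BOTH models."""
    return sorted(
        name
        for name in incl_treatment
        if name in incl_outcome
        and incl_treatment[name] >= threshold
        and incl_outcome[name] >= threshold
    )

# Hypothetical posterior inclusion probabilities (not from the talk)
alpha_x = {"age": 0.97, "shock": 0.91, "insurance": 0.88, "smoker": 0.12}
alpha_y = {"age": 0.99, "shock": 0.95, "insurance": 0.08, "smoker": 0.15}
print(flag_confounders(alpha_x, alpha_y))  # ['age', 'shock']
```

In this made-up example, insurance predicts treatment but not the outcome, so it is not flagged as a confounder.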
GENERAL IDEA
Model Averaging, BUT
Little evidence of use in the clinical and policy literature since its introduction in the late 1990s
Major paradigm shift if adopted for causal inference
However
Meta-analysis is acknowledged as providing valid evidence of treatment effectiveness
Approach is transparent
A solution in presence of high dimensional data
OBSERVATIONS
1. Plenty of methodology being developed for BIG DATA
   Need a focus on causal rather than predictive inference
2. Causal inference for CER has constraints different from predictive inference
   No unmeasured confounder assumption
   Subjects have a chance of getting the treatment
   Treatment groups are balanced in terms of observables
   Constant or non-constant treatment effect
3. Non-parametric approach for outcome equation may be more robust
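The balance constraint in item 2 is often checked with standardized mean differences. The sketch below applies the usual pooled-variance formulas to a few rows of the baseline table from the talk (radial vs femoral); the choice of rows and the function names are illustrative, and the 0.1 cutoff sometimes used to flag imbalance is a convention, not something stated in the slides.

```python
import math

def smd_continuous(mean1, sd1, mean0, sd0):
    """Standardized mean difference for a continuous covariate."""
    return (mean1 - mean0) / math.sqrt((sd1 ** 2 + sd0 ** 2) / 2)

def smd_binary(p1, p0):
    """Standardized mean difference for a binary covariate given as proportions."""
    return (p1 - p0) / math.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)

# Radial vs femoral values taken from the baseline table
rows = {
    "Mean age": smd_continuous(63, 12, 65, 12),
    "Female": smd_binary(0.253, 0.298),
    "Prior bypass surgery": smd_binary(0.084, 0.157),
    "Shock": smd_binary(0.0044, 0.018),
}
for name, value in rows.items():
    print(f"{name:22s} {value:+.3f}")
```

All four values are negative, consistent with the earlier remark that healthier patients tend to undergo transradial procedures.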
OBSERVATIONS
Compared to transfemoral artery access, transradial access causes:
1.58% (1.12, 2.05) absolute reduction in complications (regression adjusted using perceived clinical importance)
1.40% (0.90, 1.80) (propensity score matched)
2.56% (0.35, 4.75) (2SLS approach)