Predicting Lung Cancer Prior to Surgical Resection in Patients with Lung Nodules

Stephen A. Deppen, PhD,*†† Jeffrey D. Blume, PhD,† Melinda C. Aldrich, PhD, MPH,* Sarah A. Fletcher, BA,† Pierre P. Massion, MD,‡§ Ronald C. Walker, MD, Heidi C. Chen, PhD,¶ Theodore Speroff, PhD,# Catherine A. Degesys, MD,** Rhonda Pinkerman, NP,†† Eric S. Lambright, MD,* Jonathan C. Nesbitt, MD,* Joe B. Putnam, Jr., MD,* and Eric L. Grogan, MD, MPH*††

*Department of Surgery, Tennessee Valley Healthcare System, Veterans Affairs, Nashville, Tennessee; ††Department of Thoracic Surgery, §Department of Medicine, Division of Pulmonary and Critical Care Medicine, ¶Vanderbilt-Ingram Cancer Center, and **School of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; †Department of Biostatistics, Vanderbilt University, Nashville, Tennessee; and ‡Department of Critical Care Medicine, ║Department of Radiology, and #Geriatric Research Education Clinical Center.

Disclosure: The authors declare no conflict of interest.

Address for correspondence: Eric L. Grogan, MD, MPH, Department of Thoracic Surgery, Vanderbilt University Medical Center, 609 Oxford House, 1313 21st Avenue South, Nashville, TN 37232. E-mail: eric.grogan@vanderbilt.edu

DOI: 10.1097/JTO.0000000000000287
Copyright © 2014 by the International Association for the Study of Lung Cancer
ISSN: 1556-0864/14/0910-1477

Background: Existing predictive models for lung cancer focus on improving screening or referral for biopsy in general medical populations. A predictive model calibrated for use during preoperative evaluation of suspicious lung lesions is needed to reduce unnecessary operations for benign disease. A clinical prediction model (Thoracic Research Evaluation And Treatment [TREAT]) is proposed for this purpose.

Methods: We developed and internally validated a clinical prediction model for lung cancer in a prospective cohort evaluated at our institution. Best statistical practices were used to construct, evaluate, and validate the logistic regression model in the presence of missing covariate data using bootstrap and optimism-corrected techniques. The TREAT model was externally validated in a retrospectively collected Veteran Affairs population. The discrimination and calibration of the model were estimated and compared with the Mayo Clinic model in both populations.

Results: The TREAT model was developed in 492 patients from Vanderbilt whose lung cancer prevalence was 72% and validated among 226 Veteran Affairs patients with a lung cancer prevalence of 93%. In the development cohort, the area under the receiver operating curve (AUC) and Brier score were 0.87 (95% confidence interval [CI], 0.83–0.92) and 0.12, respectively, compared with an AUC of 0.89 (95% CI, 0.79–0.98) and a Brier score of 0.13 in the validation dataset. The TREAT model had significantly higher accuracy (p < 0.001) and better calibration than the Mayo Clinic model (AUC = 0.80; 95% CI, 0.75–0.85; Brier score = 0.17).

Conclusion: The validated TREAT model had better diagnostic accuracy than the Mayo Clinic model in preoperative assessment of suspicious lung lesions in a population being evaluated for lung resection.

Key Words: Lung cancer, Diagnosis, Prediction models. (J Thorac Oncol. 2014;9: 1477–1484)

Lung cancer is the leading cause of cancer-related mortality in the United States,1,2 but early detection and treatment prolong life. The National Lung Screening Trial found a 20% reduction in lung cancer mortality in high-risk patients screened with low-dose computed tomography (CT). Implementation of a screening regimen for lung cancer among the estimated 7.4 million eligible Americans3 will greatly increase the number of lung nodules requiring evaluation and diagnosis. In addition, 39% of the patients screened with low-dose CT had at least one positive scan requiring additional diagnostic evaluations.4 A diagnostic operation after nodule discovery and radiographic surveillance resulted in a benign diagnosis in 24% of the surgical procedures. Other studies describing resection for known or suspected lung cancer report benign disease rates as high as 40%.5–9

Existing lung cancer prediction models are designed either to determine which high-risk populations would most benefit from screening10–13 or to estimate the likelihood of cancer once a lesion is discovered.14–16 Current guidelines from the American College of Chest Physicians recommend that clinicians use a validated prediction model, such as the model developed at the Mayo Clinic, or their clinical expertise to estimate the probability of cancer in a suspicious lung lesion.17 The Mayo Clinic model contains six variables (age, smoking history, previous cancer, lesion size, spiculated edge, and location) and was designed to evaluate nodules in patients selected from a general population who had a lesion found on imaging. Our previous work demonstrated that the Mayo Clinic model has poor calibration in patients referred for surgical evaluation.18 Currently, no models exist to estimate a lesion's probability of malignancy at the point of surgical evaluation.

Patients evaluated by surgeons usually have a significant body of diagnostic information compiled by previous medical specialists, such as multiple radiographic scans, biopsy results, and pulmonary function tests. Surgeons need an accurate and well calibrated predictive model to help diagnose a suspected lung cancer without missing early stage disease; no models exist which integrate this additional information.19 We developed and validated the Thoracic Research Evaluation And Treatment (TREAT) lung cancer prediction model and compared the performance of the TREAT model to the Mayo Clinic model in two populations being evaluated for lung resection.


METHODS

Study Population

The TREAT model was developed in the Vanderbilt University Medical Center (VUMC) Lung Cancer Cohort and, to examine its generalizability, was validated in the Tennessee Valley Veterans Affairs (VA) cohort. The Vanderbilt cohort was composed of patients identified from two separate sources. Using VUMC's Thoracic Surgery Quality Improvement database and clinic records, 606 patients were identified who received an evaluation of a lung nodule or mass by a thoracic surgeon for known or suspected non–small-cell lung cancer from January, 2005 to October, 2010 (Fig. 1). Demographic and clinical data for each procedure were abstracted using the Society of Thoracic Surgeons National Database for General Thoracic Surgery specifications and guidelines.20 Imaging data were abstracted from radiologist reports or from original scans of the most recent preoperative CT scans for lesion growth, edge characteristics, and F18-fluorodeoxyglucose (FDG)-positron emission tomography (PET) avidity by experienced medical reviewers.5,18,21,22

Lesion edge characteristics described by the terms smooth, lobulated, lobular, lobed, irregular, ground glass opacity, ground glass nodule, spiky, or spiculated in the radiologists' reports were designated by medical reviewers as smooth, lobulated, ground glass opacity, spiculated, or indeterminate. Growth on serial radiographs obtained at least 60 days apart was defined as an increase in mean diameter of 2 mm for nodules initially less than 15 mm in size and an increase of at least 15% compared with the baseline scan for lesions more than 15 mm in size at baseline.23 Cases with only one preoperative radiograph, or whose subsequent radiograph was obtained fewer than 60 days later and was therefore deemed too short a time span to record lesion growth, were designated as "insufficient data." FDG-PET avidity was determined by either physician report or by maximum standard uptake value (SUV). Not avid was coded if the radiologist report used the terminology not avid, not cancerous, low avidity, or not likely cancerous, or reported a SUV less than 2.5. Avid was coded if the radiologist used the terminology avid, likely cancerous, highly avid, or cancerous, or reported a SUV of 2.5 or greater. Any radiological reports of insufficient quality to determine diagnosis, shape characteristics, or FDG-PET avidity by chart review were reviewed for determination by a thoracic surgeon. If no designation could be made, then the original scans were reviewed by a thoracic radiologist blinded to clinical pretest data and pathological outcome. Diagnosis was confirmed by pathologic examination after thoracotomy, thoracoscopy, mediastinoscopy, or bronchoscopy with biopsy (N = 523) or by radiographic surveillance among patients not undergoing a procedural biopsy (N = 83). Preoperative symptoms were defined as any documented evidence in the medical record of the following: hemoptysis, shortness of breath, unplanned weight loss, fatigue, pain, or pneumonia. Preoperative predicted forced expiratory volume in 1 second (FEV1) was a continuous variable based on the most recent pulmonary function test prior to the thoracic operation.
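These coding rules are essentially deterministic, so they can be expressed directly in code. The following R sketch illustrates the growth and FDG-PET avidity definitions above; the function and argument names (classify_growth, classify_pet_avidity, baseline_mm, suv, and so on) are illustrative assumptions rather than the authors' abstraction code.

# Sketch of the nodule-growth and FDG-PET avidity coding rules described above.
# Names are illustrative assumptions, not the authors' code.

classify_growth <- function(baseline_mm, followup_mm, interval_days) {
  # A single scan or an interval under 60 days cannot establish growth
  if (is.na(followup_mm) || is.na(interval_days) || interval_days < 60) {
    return("insufficient data")
  }
  if (baseline_mm < 15) {
    # Nodules < 15 mm: growth = increase in mean diameter of at least 2 mm
    if ((followup_mm - baseline_mm) >= 2) "growth observed" else "no growth"
  } else {
    # Lesions >= 15 mm at baseline: growth = at least 15% increase over baseline
    if (followup_mm >= 1.15 * baseline_mm) "growth observed" else "no growth"
  }
}

classify_pet_avidity <- function(suv, report_text = "") {
  # SUVmax of 2.5 is the threshold given in the text; report wording is used otherwise
  if (!is.na(suv)) {
    if (suv >= 2.5) return("avid") else return("not avid")
  }
  if (grepl("not avid|not cancerous|low avidity|not likely cancerous",
            tolower(report_text))) return("not avid")
  if (grepl("highly avid|avid|likely cancerous|cancerous",
            tolower(report_text))) return("avid")
  "indeterminate"
}

classify_growth(baseline_mm = 12, followup_mm = 15, interval_days = 90)  # "growth observed"
classify_pet_avidity(suv = NA, report_text = "FDG avid lesion")          # "avid"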

Individuals with multiple nodules or with evidence of benign disease (e.g., benign calcification, infiltrates, bronchiolitis obliterans organizing pneumonia, or empyema) were excluded (N = 39). Also excluded were individuals receiving an operation for a known malignancy after initial chemoradiation therapy (N = 13), those with no preoperative radiographic imaging documentation to determine the rationale for evaluation (N = 13), those with known metastatic disease (N = 5), individuals without a definitive clinical diagnosis after surgery (N = 1), and those undergoing reoperation (N = 1). Nonsurgical patients with less than 2 years of radiographic surveillance or clinical follow-up from the date of their surgical evaluation (N = 38) were excluded, as were nonsurgical patients with multiple or calcified nodules (N = 4). The remaining 492 patients were used in our analysis. The Vanderbilt University Institutional Review Board approved this study with a waiver of individual patient consent.

FIGURE 1. Consort diagram of the VUMC and Tennessee Valley Healthcare System VA cohorts. VUMC, Vanderbilt University Medical Center cohort; VA, Veteran Affairs; NSCLC, non–small-cell lung cancer.

The Tennessee Valley Veterans Affairs cohort was composed of 245 individuals receiving a thoracic operation for known or suspected lung cancer between January 1, 2005 and December 31, 2013. Individuals with no preoperative radiographic imaging documentation to determine the rationale for evaluation (N = 10), those with known metastatic disease (N = 4), individuals without a definitive clinical diagnosis after surgery (N = 2), known benign disease (N = 1), mesothelioma (N = 1), or reoperation (N = 1) were excluded. The remaining 226 patients were used to validate the TREAT model. The Tennessee Valley Healthcare System—Veteran's Affairs Institutional Review Board approved this study.

Selection of Variables for the Model

The TREAT model was developed using a prespecified set of candidate variables derived from previously published and validated models for determining appropriate screening populations for low-dose CT scans13 or for estimating the likelihood that a lung nodule is malignant after its discovery14–16 (Table 1). Lesion growth and FDG-PET scan avidity were chosen because of their inclusion in recent diagnostic guidelines, which recommend referral for surgical evaluation when they occur.17 Additional variables for the TREAT model were chosen after consulting with thoracic surgeons about the factors commonly encountered in their practice that influence their risk estimates, given the radiographic discovery of a lung nodule or mass.

Statistical Analysis

Analyses were performed in R v3.0.1 and Stata v12 (College Station, TX). Variables were summarized and examined as appropriate in numerical (e.g., mean, median, or proportion) and graphical fashion (scatter plots, boxplots, or histograms). Descriptive statistics and the extent of missing data are reported in Table 1. Analyses of demographic variables and prespecified predictors of lung cancer according to lung cancer status were conducted utilizing only the observed data. Multiple imputation techniques and predictive mean matching were used for imputation of the missing values.24

Multiple imputation assumptions were examined following the methods of Potthoff et al.25
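The article reports multiple imputation with predictive mean matching but does not name the software routine. A minimal sketch of one common approach, the mice package with method "pmm", is shown below as an assumption; the data frame cohort and its column names are hypothetical.

# Sketch only: predictive mean matching (PMM) imputation as described above.
# The mice package and these variable names are assumptions, not the authors' code.
library(mice)

# cohort: data.frame with the candidate predictors and missing values, e.g.
# age, bmi, sex, pack_years, lesion_size, spiculated, upper_lobe, growth,
# previous_cancer, fev1_pred, symptoms, pet_avid, cancer (outcome)
# (for categorical variables mice's defaults, such as logreg/polyreg, may be more appropriate)
imp <- mice(cohort, m = 10, method = "pmm", seed = 2014, printFlag = FALSE)

# Fit the logistic model on each imputed dataset and pool by Rubin's rules
fits <- with(imp, glm(cancer ~ age + bmi + sex + pack_years + lesion_size +
                        spiculated + upper_lobe + growth + previous_cancer +
                        fev1_pred + symptoms + pet_avid,
                      family = binomial()))
summary(pool(fits))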

Logistic regression models for predicting lung cancer were fit to the data and evaluated for accuracy, calibration, and overfitting based on the methodology of Harrell and Steyerberg.24,26 Nonlinear associations between continuous variables and lung cancer were evaluated using restricted cubic splines with three and five knots, and linearity was tested using the likelihood ratio test. The model's ability to discriminate between cancer and benign disease was evaluated by the area under the receiver operating characteristic curve (AUC). Model calibration, in the sense of directly comparing a model's predicted probabilities to observed probabilities, was assessed with the Brier score of the model's predictions. Bootstrap methods were used to (1) obtain standard errors for the model parameters and model predictions, and (2) assess the degree of optimism of the model's accuracy to predict cancer. A stable, internally valid TREAT model was corrected for optimism, which occurs due to overfitting, using the bootstrap-with-replacement approach. AUC and Brier score performance used 500 bootstrap iterations with replacement for the TREAT model.27 AUC and Brier score for the Mayo Clinic model were estimated using the published equation of variable coefficients from the original article.14 The Mayo Clinic model's probability of a malignant pulmonary nodule equals e^x/(1 + e^x), where x = −6.8272 + (0.0391 × age) + (0.7917 × smoking history) + (1.3388 × previous nonthoracic cancer) + (0.1274 × lesion size) + (1.0407 × spiculated lesion edge) + (0.7838 × upper lobe location). The AUC and Brier score for the Mayo Clinic model were compared with those of the bootstrapped TREAT model in both the VUMC development and the VA validation datasets.
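A compact sketch of this workflow in R, following the rms conventions of Harrell (lrm with a restricted cubic spline, bootstrap optimism correction via validate), is given below. The variable names are assumptions, the call is illustrative rather than the authors' code, and the Mayo Clinic linear predictor is transcribed from the published coefficients quoted above.

# Sketch of the modeling and validation steps described above (rms package).
# Variable names (age, bmi, ..., cancer) are illustrative assumptions.
library(rms)

dd <- datadist(cohort); options(datadist = "dd")

# TREAT-style logistic model; pack-years enters as a 3-knot restricted cubic spline
treat_fit <- lrm(cancer ~ age + bmi + sex + rcs(pack_years, 3) + lesion_size +
                   spiculated + upper_lobe + growth + previous_cancer +
                   fev1_pred + symptoms + pet_avid,
                 data = cohort, x = TRUE, y = TRUE)

# Bootstrap optimism correction (500 resamples); Dxy converts to AUC, B is the Brier score
val <- validate(treat_fit, method = "boot", B = 500)
auc_corrected   <- (val["Dxy", "index.corrected"] + 1) / 2
brier_corrected <- val["B", "index.corrected"]

# Mayo Clinic model probability from the published coefficients quoted above
mayo_prob <- function(age, smoker, prev_cancer, size_mm, spiculated, upper_lobe) {
  x <- -6.8272 + 0.0391 * age + 0.7917 * smoker + 1.3388 * prev_cancer +
    0.1274 * size_mm + 1.0407 * spiculated + 0.7838 * upper_lobe
  exp(x) / (1 + exp(x))
}

# Example: 65-year-old ever-smoker, no prior cancer, 20-mm spiculated upper-lobe nodule
mayo_prob(age = 65, smoker = 1, prev_cancer = 0, size_mm = 20,
          spiculated = 1, upper_lobe = 1)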

TABLE 1. Variables Used in Published, Validated Clinical Lung Cancer Prediction Models

Models: PLCOM2012, Mayo Clinic, VA, SPN, TREAT

Variables (√ indicates inclusion in a model):

Age √ √ √ √ √
Sex √ √
Race √
Education √
Body mass index √ √
COPD √ √b
Family history of cancer √
Smoking (Y/N) √ √ √
Smoking—pack-years √a
Years quit smoking √ √
Hemoptysis √ √c
Previous cancer √ √ √
Lesion size √ √ √ √
Lesion growth √ √
Spiculation √ √ √
Lesion location √ √ √
FDG-PET avidity √ √

The PLCOM2012 model was developed to estimate risk of lung cancer in a healthy population for purposes of determining who should be screened. The Mayo Clinic, VA, SPN, and TREAT models were developed to estimate lung cancer risk given radiographic detection of a SPN.
a Pack-years were separated into smoking duration and smoking intensity.
b COPD modeled using predicted forced expiratory volume in 1 sec as a continuous variable.
c Hemoptysis included in preoperative symptoms, which also included any of: shortness of breath, unplanned weight loss, fatigue, pain, or pneumonia.
VA, Veteran Affairs; SPN, solitary pulmonary nodule; TREAT, Thoracic Research Evaluation And Treatment; COPD, chronic obstructive pulmonary disease; FDG-PET, F18-fluorodeoxyglucose positron emission tomography.


RESULTS

Lung cancer prevalence was 72% in the VUMC cohort (N = 492) and 93% in the VA cohort (N = 226). Pathological diagnosis after resection occurred in 451 (92%) individuals and by active surveillance in 41 (8%) individuals in the VUMC cohort; all diagnoses in the VA cohort were determined pathologically. Complete covariate data were available for 264 and 136 individuals in the VUMC and VA cohorts, respectively. Those with a cancer diagnosis were more likely to have complete data (58%) than those with benign disease (42%) in the development cohort. Missing data for FDG-PET and FEV1 were more common among patients in the VUMC cohort diagnosed by active surveillance (85% and 61%, respectively). Missing data occurred most often for FDG-PET scan (22% in VUMC and 21% in VA), growth on serial CT scans (13% in VUMC and 1% in VA), predicted FEV1 (10% in VUMC and 0% in VA), and preoperative disease symptoms (7% in VUMC and 16% in VA). The remaining variables of interest had less than 5% missing data in the VUMC development dataset. Assumptions regarding the missing data did not preclude the use of multiple imputation according to Potthoff's methodology.25 Compared with the VUMC cohort, the VA cohort predominantly consisted of men (97%), had a higher prevalence of preoperative symptoms (62%), was more likely to smoke (95%) and smoked more (50 pack-years), had slightly larger lesions on average (29 mm), and those lesions were more likely to be FDG-PET avid (95%) (Table 2).

In the TREAT model, lung cancer risk increased with age, preoperative lesion size, lesion growth, previous cancer, and FDG-PET avidity (Table 3). Smoking intensity measured by pack-years had a nonlinear relationship with lung cancer and thus was modeled as a restricted cubic spline (Table 3). The initial model development, prior to bootstrapping, produced an AUC of 0.89 (95% confidence interval [CI], 0.86–0.92) and a Brier score of 0.11 for the TREAT model. Internal validation with bootstrap adjustment resulted in an AUC of 0.87 (95% CI, 0.83–0.92) and a Brier score of 0.12 (95% CI, 0.10–0.14). In the VA validation cohort, the AUC for the bootstrapped TREAT model was 0.89 (95% CI, 0.79–0.98) and the Brier score was 0.08 (95% CI, 0.06–0.10).

The Mayo Clinic model applied to the VUMC cohort, using published coefficients to estimate lung cancer risk (Table 3), had an AUC of 0.80 (95% CI, 0.75–0.85), which was significantly lower (p < 0.001) than the AUC observed for the TREAT model (Fig. 2). The Mayo Clinic model generally overestimated risk, and its Brier score was 0.17 (95% CI, 0.15–0.19), showing poorer calibration than the TREAT model (Fig. 3) on the VUMC cohort. The AUC for the Mayo Clinic model on the VA cohort was 0.73 (95% CI, 0.60–0.85) and its Brier score was 0.18 (95% CI, 0.15–0.21). AUC and Brier score in the validation dataset were improvements over those found in the development dataset. On average, the Mayo Clinic model predicted a slightly higher probability of lung cancer than the TREAT model in individuals with no growth, and a higher risk for lung cancer than the TREAT model among individuals with non-avid FDG-PET scans.
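The article does not state which statistical test produced the p value for the AUC comparison. One common choice for paired ROC curves, DeLong's test as implemented in the pROC package, is sketched here under that assumption; y denotes observed cancer status (0/1), and p_treat and p_mayo are the two models' predicted probabilities.

# Sketch: comparing discrimination of two prediction models on the same cohort.
# DeLong's test via pROC is an assumption; the article does not name the test used.
library(pROC)

roc_treat <- roc(response = y, predictor = p_treat, quiet = TRUE)
roc_mayo  <- roc(response = y, predictor = p_mayo,  quiet = TRUE)

auc(roc_treat); auc(roc_mayo)                  # AUCs; ci.auc() gives confidence intervals
roc.test(roc_treat, roc_mayo, method = "delong", paired = TRUE)

# Brier scores: mean squared difference between predicted probability and outcome
mean((p_treat - y)^2)
mean((p_mayo  - y)^2)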

DISCUSSION

Clinicians evaluating pulmonary nodules are faced with a basic question of equipoise. Unlike biopsy for breast, prostate, or colon cancer, the lung nodule is difficult to access, and lung biopsy carries significant procedural risks. Reviews of outcomes after lung surgery have found 1 to 3% mortality rates within 30 days and rates as high as 7% at 90 days.4,28–30 If the patient under evaluation has marginal lung function and other preoperative comorbidities, then the likelihood of a poor outcome increases. This procedural risk is juxtaposed against the danger of missing a curable lung cancer. One suggested solution is for the clinician to delay biopsy and treatment until a more definitive noninvasive diagnosis is possible. However, the effect of diagnostic and treatment delay from the time of lesion discovery on stage progression and metastasis is not well known.31 Thus, clinical practice typically focuses on timeliness of care even for a small and localized cancer. We propose a new, validated clinical prediction model for lung cancer in a patient population with lung nodules being evaluated for a surgical lung biopsy. The TREAT lung cancer model provided high and consistent predictive discrimination for lung cancer based on common clinical characteristics and performed better than the Mayo Clinic model.

TABLE 2. Demographic and Radiological Data in VUMC Development and VA Validation Datasets

                                         VUMC (N = 492)   VA (N = 226)
Male (%)                                 192 (51)         220 (97)
Caucasian (%)                            453 (92)         196 (87)
  Missing                                2                2
Mean age (SD)                            63 (13)          65 (8)
Smoking status—ever (%)                  378 (77)         215 (95)
Median pack-years among smokers (IQR)    42 (30, 60)      50 (44, 85)
  Missing                                8                3
Mean BMI, kg/m2 (SD)                     27.7 (6)         26 (5)
  Missing                                2                2
Mean predicted FEV1 (SD)                 77.7 (20)        72.0 (17)
  Missing                                50               0
Preoperative symptomsa (%)               118 (26)         117 (62)
  Missing                                33               37
Previous cancer (%)                      181 (37)         57 (25)
Upper lobe location (%)                  283 (59)         77 (34)
  Missing                                13               0
Mean lesion size, mm (SD)                28 (19)          29 (17)
Growth (%)                               200 (47)         90 (40)
  Missing                                65               2
Spiculation (%)                          214 (45)         141 (65)
  Missing                                19               10
FDG-PET avidity (%)                      328 (86)         170 (95)
  Missing                                109              47
Patients with complete data (%)          264 (58)         136 (60)

a Preoperative symptoms included any of: hemoptysis, shortness of breath, unplanned weight loss, fatigue, pain, or pneumonia.
FEV1, forced expiratory volume in 1 sec; VUMC, Vanderbilt University Medical Center; VA, Veteran Affairs; IQR, interquartile range; BMI, body mass index; FDG-PET, F18-fluorodeoxyglucose positron emission tomography.


The TREAT model's high level of discrimination may be valuable in providing clinical guidance in estimating an individual's likelihood of lung cancer.

We identified superior performance of the TREAT lung cancer model (AUC = 0.87; 95% CI, 0.83–0.92) compared with the Mayo Clinic model (AUC = 0.80; 95% CI, 0.75–0.85) and validated the TREAT model (AUC = 0.89) in a separate cohort, with a higher prevalence of disease and higher acuity, from a nearby Veterans Affairs institution. The Mayo Clinic model performed well in the VUMC population although the prevalence of disease in our cohort was higher (72%) than in the populations in which the Mayo Clinic model was developed (23%) and validated (44%).14–16 As the prevalence of disease increased to 95%, as was found in the VA validation cohort, the accuracy of the Mayo Clinic model to discriminate malignancy decreased (AUC = 0.73). The calibration of the Mayo Clinic model, as measured by the Brier score, also worsened as the prevalence of disease increased across the two cohorts evaluated for surgery. The Mayo Clinic model appears to underestimate the risk for cancer in the lower quintiles of lung cancer risk in these populations, which limits its use in clinical practice for patients being evaluated for surgery.18 Two factors are clear. First, the population being evaluated by surgeons has a higher prevalence of lung cancer than other lung nodule–evaluated populations.9,22,32 Second, the guideline-suggested model for estimating the likelihood of malignancy is less accurate in terms of discrimination and calibration in these two surgical populations.

The addition of FDG-PET avidity and lesion growth as predictive variables contributed to the TREAT model's superior discrimination. Exploratory analyses of the model with Bland–Altman plots showed that PET avidity and growth accounted for the majority of differences in prediction between the two models in the VUMC cohort (Supplemental Digital Content 1, http://links.lww.com/JTO/A655).

TABLE 3. Estimated Model Coefficients for Mayo Clinic and TREAT Models

Variable                        Mayo Coefficient   TREAT Coefficienta   TREAT Odds Ratio (95% CI)   p Value
Constant                        −6.827             −4.715
Age (per year)                  0.0391             0.0533               1.05 (1.03–1.08)            <0.001
BMI                             —                  −0.0262              0.97 (0.93–1.02)            0.24
Sex—male                        —                  −0.0547              0.95 (0.55–1.64)            0.84
Pack-years                      —                  a                    b                           0.02
Smoking history (yes/no)        0.792              —                    —                           —
Lesion size (per mm)            0.127              0.0577               1.06 (1.04–1.08)            <0.001
Spiculated lesion edge          1.041              0.277                1.32 (0.73–2.40)            0.26
Lesion location—upper lobe      0.784              −0.015               1.02 (0.58–1.78)            0.99
Lesion growth
  No lesion growth              —                  Reference            Reference                   —
  Insufficient data             —                  0.259                1.29 (0.56–2.79)            0.59
  Growth observed               —                  1.160                3.18 (1.58–6.39)            0.003
Previous cancer                 1.339              0.639                1.89 (1.05–3.42)            0.03
Predicted FEV1                  —                  −0.013               0.99 (0.97–1.00)            0.07
Any preoperative symptoms       —                  −0.461               0.63 (0.33–1.20)            0.16
FDG-PET avid                    —                  1.834                6.26 (2.78–14.1)            <0.001

a Probability(lung cancer = 1) = e^x/(1 + e^x), where x = −4.715 + 0.0533 × (age) − 0.0262 × (BMI) − 0.0547 × (sex: male) + 0.02338 × (pack-years) + [three-knot restricted cubic spline terms of the form ±0.000003 to 0.000006 × (pack-years − knot)+^3; the knot locations are not fully legible in the printed equation, the last knot being 80 pack-years] + 0.0577 × (lesion size) + 0.277 × (spiculated lesion) + 0.015 × (lesion location) + 0.259 × {lesion growth: insufficient data} + 1.160 × {lesion growth: growth observed} + 0.639 × (previous cancer) − 0.013 × (predicted FEV1) − 0.461 × (any symptoms) + 1.834 × (FDG-PET avid), where {c} = 1 if the subject is in category c and 0 otherwise. A worked sketch of this linear predictor follows the table. Pack-years is modeled as a restricted cubic spline, and therefore odds ratios are not directly interpretable.
b (pack-years − x)+^3 = (pack-years − x)^3 if (pack-years − x) > 0, and 0 otherwise.
TREAT, Thoracic Research Evaluation And Treatment; CI, confidence interval; FEV1, forced expiratory volume in 1 sec; BMI, body mass index; FDG-PET, F18-fluorodeoxyglucose positron emission tomography.
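For readers who want to compute a TREAT-style probability from Table 3, a sketch in R follows. It is illustrative only: packyears_spline is a hypothetical placeholder because the restricted-cubic-spline knots for pack-years are not fully legible in the printed footnote, so only the linear pack-years term is applied here, and the remaining coefficients are taken as printed in the table.

# Sketch of the TREAT linear predictor using the coefficients printed in Table 3.
# packyears_spline() is a placeholder: the published footnote models pack-years as a
# 3-knot restricted cubic spline whose knot locations are not fully legible here,
# so only the linear term (0.02338 per pack-year) is included in this illustration.
packyears_spline <- function(pack_years) 0.02338 * pack_years

treat_prob <- function(age, bmi, male, pack_years, lesion_size_mm, spiculated,
                       upper_lobe, growth, prev_cancer, fev1_pred, symptoms, pet_avid) {
  # growth: one of "none", "insufficient", "observed"
  x <- -4.715 + 0.0533 * age - 0.0262 * bmi - 0.0547 * male +
    packyears_spline(pack_years) +
    0.0577 * lesion_size_mm + 0.277 * spiculated - 0.015 * upper_lobe +
    0.259 * (growth == "insufficient") + 1.160 * (growth == "observed") +
    0.639 * prev_cancer - 0.013 * fev1_pred - 0.461 * symptoms + 1.834 * pet_avid
  exp(x) / (1 + exp(x))
}

# Example: 68-year-old man, BMI 26, 45 pack-years, 25-mm spiculated upper-lobe lesion,
# observed growth, no prior cancer, predicted FEV1 75%, no symptoms, FDG-PET avid
treat_prob(age = 68, bmi = 26, male = 1, pack_years = 45, lesion_size_mm = 25,
           spiculated = 1, upper_lobe = 1, growth = "observed", prev_cancer = 0,
           fev1_pred = 75, symptoms = 0, pet_avid = 1)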

FIGURE 2. Comparison of AUC for the Mayo Clinic model, using originally published estimates, and the TREAT model in the VUMC and VA cohorts. CIs for the AUC are reported for each model in each cohort. The TREAT model AUC is significantly higher than that of the Mayo Clinic model in each cohort (p < 0.001). AUC, area under the receiver operating curve; TREAT, Thoracic Research Evaluation And Treatment; VA, Veteran Affairs; VUMC, Vanderbilt University Medical Center cohort; CI, confidence interval.


The differences in predicted cancer risk between the two models were consistent across bootstrap-generated datasets (Supplemental Digital Content 2, http://links.lww.com/JTO/A655). The TREAT model takes advantage of this additional information available to surgeons at the time of the evaluation. The addition of FDG-PET, lesion growth, predicted FEV1, and presentation with any symptoms improved the discrimination between benign disease and lung cancer.

At this time, our findings have some limitations. The cohort used for model development was a retrospective review of a single tertiary academic medical center's database containing prospectively collected information supplemented by medical chart review for specific variables. The external validation cohort had a high prevalence of disease, and the observed improvement in discrimination is likely due, in part, to the high prevalence of cancer in this cohort. This cohort was primarily a Veteran population, so it contained more male smokers with preoperative symptoms. These differences did not result in a drop in the AUC or Brier score and may improve the generalizability of the model. Other surgical populations being evaluated for lung resection may have differing prevalence of disease, referral patterns, radiologist expertise, or other underlying factors, such as endemic granulomatous disease, which are important considerations for clinical prediction models intended to apply to more than a single institution.22 The impact of these possible predictors of lung cancer is relevant to future work as the TREAT model is validated in other populations.

As in most clinical datasets used for association studies or predictive models, missing data for the predictors of interest is a constraint in our study. In the development of the TREAT model, statistical methods were used to impute missing variables, and analyses determined the impact of this missing data on the model. We chose to include 41 patients who did not have a surgical resection and underwent radiographic surveillance. Including these patients increased the missing data in the cohort but, more importantly, avoided a bias in the spectrum of risk encountered by clinicians and minimized the additional bias arising from patients not undergoing resection. For example, predicted FEV1 is generally not measured unless resection is likely, as clinical quality guidelines suggest performing pulmonary function tests prior to lung resection. This pattern of work-up bias between individuals indicates that the data may not be missing at random. The multiple imputation algorithm used depends on the missing-at-random assumption given known covariate patterns, and our investigation of this assumption did not yield any concerning indications. In a sensitivity analysis, the TREAT model estimated using only complete data was similar to that estimated using imputed data.

Although little difference was observed between the bootstrap-adjusted AUC and Brier scores and the initial estimates, the model must be validated in independent, external populations prior to use in the clinical setting. Other cohorts for external validation of the TREAT model should have differing prevalence of disease and come from other regions of the country to assess the generalizability of the TREAT model in aiding surgeons evaluating a lung nodule.

The prediction model developed by Bach et al.,10 the Liverpool model,33 and that developed by Tammemägi et al.13 are intended to determine who would most benefit from lung cancer screening. The Mayo Clinic, the VA lung cancer, and the solitary pulmonary nodule models for characterizing lung nodules after their discovery for surgical biopsy were developed and validated in populations with lower prevalences of lung cancer than observed here.15–17 A recently published model to characterize the likelihood that a screening-discovered lung nodule was cancerous had an extremely high AUC of 0.94, but it was developed and calibrated in screening populations with a 5.5% prevalence of lung cancer.34 When examining the landscape of clinical prediction models for lung cancer, clinicians evaluating a patient immediately prior to surgery have no validated models for this high-risk population.35 The TREAT model addresses that need. Future work will validate the TREAT model in external datasets with varying prevalence of malignancy to measure the changes in the negative predictive value of the model. Application of the TREAT model in the clinical setting requires prospective evaluation of the model to determine the cut points of predicted risk that best balance the risk of missing a lung cancer against the harms of a futile thoracotomy.

FIGURE 3. Box plots comparing the Brier score from 500 bootstrap samples of the data for each model and from each cohort. The Brier score for the TREAT model exhibited better calibration of predicted probability of lung cancer when compared with the Mayo Clinic model in the VUMC cohort (TREAT = 0.12 versus Mayo Clinic = 0.17) and the VA cohort (TREAT = 0.07 versus Mayo Clinic = 0.18). A Brier score of 0.25 is equivalent to chance, and lower values represent better calibration; as the Brier score decreases from 0.25 toward 0, the predicted probability of cancer increasingly matches the observed probability and calibration improves. TREAT, Thoracic Research Evaluation And Treatment; VUMC, Vanderbilt University Medical Center cohort; VA, Veteran Affairs.

CONCLUSION

In a population with a radiographically confirmed lung lesion being evaluated for possible resection, the TREAT model predicted the risk for lung cancer with high accuracy and an AUC of 0.87. The model was validated and showed little overfitting in its accuracy to discriminate between lung cancer and benign disease. The TREAT model incorporates the full spectrum of epidemiologic and radiographic evidence available to surgeons in the United States today, and it better predicts lung cancer, with a higher AUC, than existing published models. This model will be validated in additional external datasets and, if valid, applied in a prospective study to reduce unnecessary pulmonary resections.

ACKNOWLEDGMENTS

Supported by the Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service Career Development Award 10-024 (to Eric L. Grogan), and NIH/NCI K07CA172294 (to Melinda C. Aldrich). This work was also supported by Vanderbilt Institute for Clinical and Translational Research grant UL1TR000011 from NCATS/NIH (REDCap database), a Vanderbilt Physician Scientist Development Award (to Eric L. Grogan), the lung SPORE CA90949 (to Pierre P. Massion), the EDRN CA152662, and a Merit award from the Department of Veterans Affairs (to Pierre P. Massion).

REFERENCES

1. Siegel R, Ward E, Brawley O, Jemal A. Cancer statistics, 2011: the impact of eliminating socioeconomic and racial disparities on premature cancer deaths. CA Cancer J Clin 2011;61:212–236.

2. Jemal A, Siegel R, Xu J, et al. Cancer statistics, 2010. CA Cancer J Clin 2010;60:277–300.

3. CDC. Cigarette smoking among adults and trends in smoking cessation—US, 2008. Health and Human Services Centers for Disease Control and Prevention. MMWR CDC Surveill Summ., 2009. Pp. 1227–1232. Available at http://www.cdc.gov/mmwr/PDF/wk/mm5844.pdf.

4. National Lung Screening Trial Research Team, Aberle DR, Adams AM, Berg CD, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011;365:395–409.

5. Deppen S, Putnam JB Jr, Andrade G, et al. Accuracy of FDG-PET to diagnose lung cancer in a region of endemic granulomatous disease. Ann Thorac Surg 2011;92:428–433.

6. Kozower BD, Meyers BF, Reed CE, et al. Does positron emission tomography prevent nontherapeutic pulmonary resections for clinical stage IA lung cancer? Ann Thorac Surg 2008;85:1166–1170.

7. Veronesi G, Bellomi M, Scanagatta P, et al. Difficulties encountered managing nodules detected during a computed tomography lung cancer screening program. J Thorac Cardiovasc Surg 2008;136:611–617. 8. Starnes SL, Reed MF, Meyer CA, et al. Can lung cancer screening by

computed tomography be effective in areas with endemic histoplasmosis? J Thorac Cardiovasc Surg 2011;141:688–693.

9. Kuo E, Bharat A, Bontumasi N, et al. Impact of video-assisted thoracoscopic surgery on benign resections for solitary pulmonary nodules. Ann Thorac Surg 2012;93:266–273.

10. Bach PB, Kattan MW, Thornquist MD, et al. Variations in lung cancer risk among smokers. J Natl Cancer Inst 2003;95:470–478.

11. Spitz MR, Etzel CJ, Dong Q, et al. An expanded risk prediction model for lung cancer. Cancer Prev Res 2008;1:250–254.

12. Cassidy A, Duffy SW, Myles JP, et al. Lung cancer risk prediction: a tool for early detection. Int J Cancer 2007;120:1–6.

13. Tammemägi MC, Katki HA, Hocking WG, et al. Selection criteria for lung-cancer screening. N Engl J Med 2013;368:728–736.

14. Swensen SJ, Silverstein MD, Ilstrup DM, et al. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med 1997;157:849–855.

15. Schultz EM, Sanders GD, Trotter PR, et al. Validation of two models to estimate the probability of malignancy in patients with solitary pulmonary nodules. Thorax 2008;63:335–341.

16. Gurney JW, Lyddon DM, McKay JA. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part II. Application. Radiology 1993;186:415–422.

17. Gould MK, Donington J, Lynch WR, et al. Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians

evidence-based clinical practice guidelines. Chest 2013;143(5 Suppl):e93S–e120S. 18. Isbell JM, Deppen S, Putnam JB Jr, et al. Existing general population

models inaccurately predict lung cancer risk in patients referred for surgical evaluation. Ann Thorac Surg 2011;91:227–233.

19. Grogan EL, Jones DR. VATS lobectomy is better than open thoracotomy: what is the evidence for short-term outcomes? Thorac Surg Clin 2008;18:249–258.

20. Society of Thoracic Surgeons. General Thoracic Surgery Database. Available at: http://www.sts.org/sts-national-database/database-managers/general-thoracic-surgery-database. Accessed July 21, 2013. 21. Grogan EL, Weinstein JJ, Deppen SA, et al. Thoracic operations for

pulmonary nodules are frequently not futile in patients with benign disease. J Thorac Oncol 2011;6:1720–1725.

22. Grogan EL, Deppen SA, Ballman KV, et al. Accuracy of fluorodeoxyglucose-positron emission tomography within the clinical practice of the American College of Surgeons Oncology Group Z4031 trial to diagnose clinical stage I non-small cell lung cancer. Ann Thorac Surg 2014;97:1142–1148. 23. National Comprehensive Cancer Network. Lung Cancer Screening

v1.2014. NCCN Clinical Practice Guidelines in Oncology (NCCN Guidelines) [serial online]. Available from National Comprehensive Cancer Network. Accessed October 27, 2011.

24. Harrell FEJ. Regression Modeling Strategies. Charlottesville, VA: Springer, 2001.

25. Potthoff RF, Tudor GE, Pieper KS, et al. Can one assess whether missing data are missing at random in medical studies? Stat Methods Med Res 2006;15:213–234.

26. Steyerberg EW. Clinical Prediction Models. New York: Springer, 2009. 27. Steyerberg EW, Harrell FE, Borsboom GJJM, et al. Internal validation of

predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001;54:774–781.

28. von Meyenfeldt EM, Gooiker GA, van Gijn W, et al. The relationship between volume or surgeon specialty and outcome in the surgical treatment of lung cancer: a systematic review and meta-analysis. J Thorac Oncol 2012;7:1170–1178.

29. Cheung MC, Hamilton K, Sherman R, et al. Impact of teaching facility status and high-volume centers on outcomes for lung cancer resection: an examination of 13,469 surgical patients. Ann Surg Oncol 2009;16:3–13.


30. Powell HA, Tata LJ, Baldwin DR, et al. Early mortality after surgical resection for lung cancer: an analysis of the English National Lung cancer audit. Thorax 2013;68:826–834.

31. Quarterman RL, McMillan A, Ratcliffe MB, et al. Effect of preoperative delay on prognosis for patients with early stage non-small cell lung cancer. J Thorac Cardiovasc Surg 2003;125:108–114.

32. Smith MA, Battafarano RJ, Meyers BF, et al. Prevalence of benign disease in patients undergoing resection for suspected lung cancer. Ann Thorac Surg 2006;81:1824–1829.

33. Cassidy A, Myles JP, van Tongeren M, et al. The LLP risk model: an individual risk prediction model for lung cancer. Br J Cancer 2008;98:270–276.

34. McWilliams A, Tammemagi MC, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med 2013;369:910–919.

35. Division of Cancer Control and Population Sciences. Lung Cancer Prediction Models. May 2, 2013. Available at: http://epi.grants.cancer. gov/cancer_risk_prediction/lung.html. Accessed May 6, 2014.
