CHAPTER VIII: STUDY FINDINGS AND LIMITATIONS
8.2 Limitations
Data Issues
Despite the merits of a large claims database in observational studies, there were some drawbacks to using the MarketScan Commercial Claims and Encounters (CCAE)
database as the primary data source in this dissertation. The limitations of these real-world data can threaten the internal and external validity of our study results. Therefore, caution should be exercised when interpreting our findings.
In this dissertation, we selected patients diagnosed with Crohn's disease (CD) who filed claims between 2005 and 2009. These patients were between 18 and 64 years old with commercial insurance coverage, and they comprised the majority of enrollees in the MarketScan CCAE database. Patients aged 65 and above were mostly retirees and eligible for Medicare. A large portion of patients under age 18 were insured by public insurance programs (e.g., Medicaid and SCHIP). Therefore, our findings may not be generalizable to patients aged 65 and above or under 18, even though CD can affect both age groups. [16]
The MarketScan database provides rich information about patients' utilization of pharmacies and other medical facilities, recorded whenever a claim is filed with an insurer. As source data for an observational study, however, the MarketScan database lacks personal information about each patient's demographic and socioeconomic status as well as
detailed medical and clinical data. When developing the conceptual framework for this study, we recognized that education level, annual income, and race were important factors that could affect a patient's choice of treatment strategy. Omitting these variables from the analytical models could have introduced selection bias into comparisons of healthcare utilization and costs between the two patient cohorts.
We also realized that more accurate information about patients' initial diagnosis of CD, general health condition, and progression of disease severity was needed to ensure the credibility of our study findings. Unfortunately, the MarketScan database does not contain detailed clinical and medical histories, so other measurements were developed to
approximate those variables. For example, we defined the CD diagnosis date as the date of the first claim for a procedure with a CD diagnosis or a prescription for a CD drug. We assumed that patients were newly diagnosed with CD if no CD-related claims were filed in the six months preceding the diagnosis. We used the Charlson Comorbidity Score and the number of prescriptions filled in the six months prior to diagnosis to approximate the general health condition at diagnosis. Further, we used a claims-based algorithm developed by Malone et al. to define disease severity. This method was considered a better proxy for disease severity classification than other algorithms because it reflects current treatment practice in the US. However, this claims-based approach has not been validated in an actual patient population. It is unknown how accurate this definition of disease severity is when compared with the standard approach in clinical practice. Some misclassification is likely, particularly for patients with milder disease symptoms and patients who under-utilized healthcare services. [68]
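The newly diagnosed rule described above can be sketched in code. The following Python fragment is a minimal illustration only and is not the implementation used in this dissertation: the function name `index_dates`, the tuple layout of the claims, the `study_start` parameter, and the approximation of six months as 183 days are all assumptions made for the example.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 183  # assumption: "six months" approximated as 183 days


def index_dates(claims, study_start):
    """Return {patient_id: index_date} for patients flagged as newly diagnosed.

    claims: iterable of (patient_id, service_date) pairs, restricted to
            CD-related claims (a procedure claim with a CD diagnosis or a
            fill of a CD drug). This layout is hypothetical.
    study_start: first day of the study window; earlier claims are used
                 only to check the washout period.
    """
    by_patient = {}
    for patient_id, service_date in claims:
        by_patient.setdefault(patient_id, []).append(service_date)

    result = {}
    for patient_id, dates in by_patient.items():
        in_window = [d for d in dates if d >= study_start]
        if not in_window:
            continue
        # Index date: first CD-related claim inside the study window.
        index = min(in_window)
        washout_start = index - timedelta(days=WASHOUT_DAYS)
        # Newly diagnosed only if no CD-related claim falls in the washout.
        if not any(washout_start <= d < index for d in dates):
            result[patient_id] = index
    return result


# Hypothetical claims for three patients.
example_claims = [
    ("A", date(2005, 3, 1)), ("A", date(2005, 6, 1)),
    ("B", date(2004, 11, 15)), ("B", date(2005, 2, 1)),
    ("C", date(2004, 1, 1)), ("C", date(2005, 9, 1)),
]
newly_diagnosed = index_dates(example_claims, date(2005, 1, 1))
```

In this hypothetical data, patient B is excluded because of a CD-related claim inside the washout window. Patient C is still flagged as newly diagnosed even though an older CD-related claim exists, which illustrates why a claims-based diagnosis date is only an approximation of the true date of initial diagnosis.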
Modeling Issues
A decision tree is a simple and effective form of a decision model. In this dissertation, the budget impact analysis for Aim 2 and the cost analysis for Aim 3 were based on the same decision tree model. Since we took a real-world data approach to predict the incremental costs of the new treatment strategy (early adoption), the time frame for the decision tree model was set to three years, based on the availability of claims data for CD patients in the MarketScan database. To obtain consistent estimates of healthcare costs (payoff values) and probabilities at each chance node, we required at least 20 patients in the same path for each year of either treatment scenario. From our study cohorts (3,082 early biological users and 2,986 late biological users), we could not identify an adequate number of patients to empirically estimate model parameters for the fourth year of disease. For example, among 26 early biological users with mild to moderate disease in the third year of CD, 6 patients had mild disease and only 1 patient had severe disease in the fourth year. The reduction in sample size over time due to attrition prevented us from extending the time frame of the decision model. Based on the three-year decision tree model, the budget impact
analysis for Aim 2 and the cost analysis for Aim 3 provided positive information about the value of biological therapies for CD patients and indicated a tendency toward financial benefits under the new treatment strategy. However, a time frame of 5-10 years would be more desirable to demonstrate the long-term value of the new treatment strategy because CD is a chronic disease. Another major methodological issue should be mentioned. Budget impact and cost analyses are not the best economic evaluation methods for demonstrating the value of a new treatment or technology, since both methods focus on costs rather than effectiveness (namely, health-related quality of life). In the decision tree model, disease
severity was designated as an outcome of medical treatment. Change in disease severity (either improving or worsening) may be correlated with patients' quality of life. However, disease severity, often recorded in discrete categories, cannot quantitatively represent quality of life on a continuous numeric scale. To account for both costs and effectiveness, cost-effectiveness analysis (CEA) is a more appropriate method for economic evaluation. However, the lack of effectiveness data was the obvious barrier to using CEA. The MarketScan database does not contain information about health-related quality of life. In the literature, quality of life data were only reported in randomized clinical trials (RCTs) according to treatment arms. Without patient-level information, the published RCT data cannot be incorporated into the decision tree model, where patients were stratified by disease severity. Furthermore, RCT data were obtained from studies with relatively small sample sizes, so the results may not be
generalizable to a large patient population.
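As a brief illustration of the minimum path size rule applied to the decision tree in this section (at least 20 patients in the same path), the following Python sketch separates paths that have enough patients for parameter estimation from those that do not. It is an illustration under stated assumptions, not the code used for the analyses: the function names, the representation of a path as a tuple of yearly severity states, and the state labels are hypothetical. The two counts in the example are the fourth-year figures quoted in the text.

```python
from collections import Counter

MIN_PATH_SIZE = 20  # threshold stated in the text


def path_counts(histories):
    """Count patients on each path (a tuple of yearly severity states)."""
    return Counter(tuple(history) for history in histories)


def estimable_paths(counts, min_size=MIN_PATH_SIZE):
    """Split paths into those with at least min_size patients and the rest."""
    kept = {path: n for path, n in counts.items() if n >= min_size}
    dropped = {path: n for path, n in counts.items() if n < min_size}
    return kept, dropped


# Year 3 -> year 4 counts quoted in the text for early biological users
# who had mild to moderate disease in the third year.
year4_counts = Counter({
    ("mild to moderate", "mild"): 6,
    ("mild to moderate", "severe"): 1,
})
kept, dropped = estimable_paths(year4_counts)
```

Both fourth-year paths fall below the threshold and end up in `dropped`, which is consistent with the decision to limit the model to a three-year time frame.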