Emergency Physicians Maintain Performance on the American Board of Emergency Medicine Continuous Certification (ConCert) Examination

Catherine A. Marco, MD, Francis L. Counselman, MD, Robert C. Korte, PhD, Chad M. Russ, MS, Cameron T. Whitley, MA, and Earl J. Reisdorff, MD

Abstract

Objectives: The American Board of Emergency Medicine (ABEM) Maintenance of Certification (MOC) program is a four-step process that includes the Continuous Certification (ConCert) examination. The ConCert examination is a validated, summative examination that assesses medical knowledge and clinical reasoning. ABEM began administering the ConCert examination in 1989. The ConCert examination must be passed at least every 10 years to maintain certification. This study was undertaken to determine longitudinal physician performance on the ConCert examination.

Methods: In this longitudinal review, ConCert examination performance was compared among residency-trained emergency physicians (EPs) over multiple examination cycles. Longitudinal analysis was performed using a growth curve model for unbalanced data to determine the trajectories of EP performance over time and whether medical knowledge changed. Using initial certification qualifying examination scores, the longitudinal analysis corrected for intrinsic variances in physician ability.

Results: There were 15,085 first-time testing episodes from 1989 to 2012 involving three examination cycles. The mean adjusted examination score for all physicians taking the ConCert examination for a first cycle was 85.9 (95% confidence interval [CI] = 85.8 to 85.9), the second cycle mean score was 86.2 (95% CI = 86.0 to 86.3), and the third cycle mean score was 85.4 (95% CI = 85.0 to 85.8). Using the first examination cycle as a reference score, the growth curve model analysis resulted in a coefficient of +0.3 for the second cycle (p < 0.001) and –0.5 for the third cycle (p = 0.02). Initial qualifying (written) examination scores were significant predictors of ConCert examination scores.

Conclusions: Over time, EP performance on the ConCert examination was maintained. These results suggest that EPs maintain medical knowledge over the course of their careers as measured by a validated, summative medical knowledge assessment.

From the Department of Emergency Medicine, University of Toledo College of Medicine (CAM), Toledo, OH; the Department of Emergency Medicine, Eastern Virginia Medical School and Emergency Physicians of Tidewater (FLC), Norfolk, VA; the American Board of Emergency Medicine (RCK, CMR, EJR), East Lansing, MI; and the Department of Sociology, Michigan State University (CTW), East Lansing, MI.

Received September 19, 2013; revision received November 11, 2013; accepted November 24, 2013. The authors have no relevant financial information to disclose.

Drs. Marco and Counselman serve on the ABEM Board of Directors. Dr. Korte, Mr. Russ, and Dr. Reisdorff are ABEM employees. Mr. Whitley is a consultant for ABEM.

Supervising Editor: John Burton, MD.

Address for correspondence and reprints: Catherine Marco, MD; e-mail: [email protected].

PII ISSN 1069-6563583 doi: 10.1111/acem.12378

Since the establishment of emergency medicine (EM) as the 23rd medical specialty by the American Board of Medical Specialties (ABMS) in 1979, the American Board of Emergency Medicine (ABEM) has offered a time-limited certificate. ABEM-certified physicians must successfully pass a cognitive expertise examination at least every 10 years to maintain certification, and this is now true for every ABMS medical specialty.

The ABEM Maintenance of Certification (MOC) program is a four-step process; the cognitive expertise examination is one component (Part III) of the program. Originally termed the “recertification examination,” the cognitive expertise examination became known as the Continuous Certification (ConCert) examination in 2004, with the adoption of the ABEM MOC program. Prior to entering the ABEM MOC program, the physician must be initially certified. This occurs through the successful completion of accredited EM residency training, passing the written multiple-choice item qualifying examination, and passing an oral (certifying) examination.

The ConCert examination is a high-stakes examination that measures the competency of medical knowledge using single best answer multiple-choice items. Physicians may elect to take the examination earlier than 10 years from the previous examination. Originally a paper-and-pencil test, the ConCert examination became a computer-delivered test in 2004 at secure testing centers throughout the United States (Pearson VUE, Bloomington, MN). Scoring the examination is based on the percentage of correct answers with adjustment of the score using an equating process. Equating is a statistical method that modifies the scores reported to candidates to reflect the scores that would have been achieved on a reference examination. The purpose of this is to assure that year-to-year variation in the difficulty of an examination does not result in an advantage or disadvantage to an individual physician. ABEM began equating the ConCert examination in 2004, using the 2004 examination as the reference examination. Prior to that time, scoring was not equated.
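To make the idea of equating concrete, the following is a minimal sketch of linear (mean and standard deviation) equating in Python. It is an illustration only and is not ABEM's procedure: the text states only that scores are equated to the 2004 reference examination, and later notes that the item bank was calibrated with item response theory. The function name `linear_equate` and all scores below are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(score, new_form_scores, reference_scores):
    """Map a score on a new form onto the reference form's scale.

    Linear equating matches the mean and standard deviation of the two
    score distributions. It assumes the two examinee groups are of
    comparable ability, which is an assumption of this sketch.
    """
    mu_new, sd_new = mean(new_form_scores), pstdev(new_form_scores)
    mu_ref, sd_ref = mean(reference_scores), pstdev(reference_scores)
    if sd_new == 0:
        raise ValueError("new-form scores have zero variance")
    return mu_ref + sd_ref * (score - mu_new) / sd_new


# Hypothetical percent-correct scores; the new form is harder (lower mean).
reference = [80.0, 85.0, 90.0]
new_form = [76.0, 81.0, 86.0]

# A raw 81 on the harder form is reported on the reference scale.
equated = linear_equate(81.0, new_form, reference)
```

In this toy case the new form runs 4 points harder, so a raw score is shifted upward by about 4 points when reported on the reference scale.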

Previous reports have demonstrated that physician skills, including medical knowledge, tend to decline over time.1–5 This described tendency for proficiency to wane is evidence of the need for an effective continuous professional development process to maintain medical knowledge. In addition, the reported decline in skill and knowledge highlights the need for clinically oriented, valid measures of a physician’s knowledge over the course of one’s clinical career.

ABEM seeks to adopt physician assessment processes that are valid, reliable, fair, and psychometrically sound. In regard to a medical knowledge examination, fairness and validity are determined, in part, by the degree to which an examination is clinically relevant. Analyzing the trajectories of examination performance relative to years of practice is relevant to determining the validity of the ConCert examination. This study was undertaken to determine the longitudinal trends in the medical knowledge of EPs as measured by performance on the ConCert examination over years of practice.

METHODS

Study Design

This was a retrospective, longitudinal review of ConCert examination scores. This study was reviewed and determined to be exempt research by the University of Toledo Institutional Review Board.

Study Setting and Population

Twenty-four years of ConCert examination scores from 1989 to 2012 were compared among residency-trained EPs. The ConCert examination typically contains about 205 single best-answer multiple-choice items. Each item includes a question and multiple foils. Field test items (unidentified) are also included on the examination. Field test items are not used for determining the final performance score.

All ABEM-certified, EM residency–trained physicians who took a ConCert examination for the first time in a given certification cycle were included in the analysis. Physicians who had lapsed board certification and were taking the examination to regain certification were also included, although this number was low. Repeat attempts by physicians who had failed an earlier attempt in the same recertification cycle were excluded, although their initial examination performance was included in the analysis. Physicians who became initially certified through the practice pathway (“grandfathering”) were excluded from analysis because of the group's heterogeneity, varied residency training, and varied clinical backgrounds, and because this pathway to ABEM certification is no longer available. All results were deidentified for the study investigators; data were reported in aggregate only.

Study Protocol

Scores from physicians taking the examination for any or all of three recertification cycles were compared. A certification cycle is approximately 10 years in length. For example, a second-cycle ConCert examinee would have been initially certified about 20 years earlier. However, exceptions to this schedule may occur. For example, a physician can take the ConCert examination earlier than 10 years, or if trying to regain lapsed certification, could have taken it beyond a 10-year span. This study reviewed physician performance on 23 ConCert examinations from 1989 to 2011. Analysis included only those physicians taking a ConCert examination for the first time during a recertification cycle. The key measurement was the examination score for first-time test takers on the first, second, and third ConCert examination cycles. All data were retrieved from the secure ABEM database.

Primary outcome measures included examination scores of physicians taking the ConCert examination during successive certification periods. Of note, these scores were adjusted based on the physicians’ qualifying (written) examination results. In this way, the longitudinal analysis could correct for intrinsic variances in physician ability and thus yield more accurate performance comparisons of medical knowledge over time.

Data Analysis

By using growth curve modeling, a longitudinal analysis approach applied to physician initial ConCert examination scores over multiple examination cycles, we are able to determine the overall trend in the maintenance of medical knowledge. This model analyzes individual longitudinal data in aggregate to determine if there is a statistically significant trend. By identifying this trend, we can make individual-level predictions about the maintenance of medical knowledge (Data Supplement S1, available as supporting information in the online version of this paper).
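The exact model specification is given in Data Supplement S1 and is not reproduced here. As a hedged illustration only, one common form of a growth curve model for unbalanced data that is consistent with the parameters reported in Table 3 is a random-intercept linear mixed model. All symbols below are assumptions introduced for this sketch: y_ij is the ConCert score of physician i in cycle j, C2 and C3 are indicator variables for the second and third cycles, QE_i is the physician's qualifying examination score (possibly centered), and u_i is a physician-level random intercept.

```latex
\[
y_{ij} = \beta_0 + \beta_1\,\mathrm{C2}_{ij} + \beta_2\,\mathrm{C3}_{ij}
       + \beta_3\,\mathrm{QE}_i + u_i + \varepsilon_{ij},
\qquad
u_i \sim N(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
\]
```

Under this reading, β0 would be the cycle 1 reference score, β1 and β2 the cycle 2 and cycle 3 coefficients, and β3 the qualifying examination coefficient. Because physicians contribute one, two, or three scores, the data are unbalanced, which a mixed model of this kind accommodates.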

RESULTS

There were 15,085 first-time testing episodes from 1989 to 2012. Of these, 11,334 were in cycle 1, 3,403 were in cycle 2, and 348 were in cycle 3 (Table 1). As expected, the mean age of candidates increased with the number of recertification cycles (Table 1). Confidence intervals (CIs) were not calculated for mean age because the statistics are based on the entire available sample, so the true mean age was ascertained. Pearson’s chi-square predictably indicates a significant relationship between age and ConCert examination cycle (p < 0.001). Sex information was not required on the application for early ConCert examinations. Sex information prior to 2000 is too sparse for meaningful analysis and interpretation.

The unadjusted ConCert examination scores are the ConCert examination scores for the various cycles prior to controlling for qualifying examination scores (Table 2). After controlling for qualifying examination scores, the mean ConCert examination score for the first cycle was 85.9 (95% CI = 85.8 to 85.9; Figure 1). This served as the reference score against which cycle 2 and cycle 3 performances were compared. On average, the cycle 2 ConCert examination scores were slightly higher (0.29) than the reference scores (p < 0.01; Table 3). The cycle 3 ConCert examination scores were slightly lower (–0.50) than the reference scores (p = 0.02). In both cases these differences amount to less than 1 point and may not be important differences. Based on an a priori significance level of α < 0.01, the cycle 2 scores are significantly higher than the reference score, and the cycle 3 scores, although lower, are not significantly different from it.

Qualifying examination scores are significant predictors of ConCert examination scores. Every point increase in an individual’s qualifying examination passing score above a score of 84 is associated with a 0.70 point increase in future ConCert examination scores.
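The size of these effects is easier to judge with a small worked example. The sketch below uses only the coefficients reported in Table 3 and computes the expected shift in ConCert score relative to a cycle 1 examinee. It assumes the effects are linear and additive (no interaction between cycle and qualifying examination score), which is an assumption of this illustration; the function name `score_shift` is hypothetical.

```python
# Coefficients as reported in Table 3 (cycle 1 is the reference).
CYCLE_EFFECT = {1: 0.0, 2: 0.29, 3: -0.50}
QE_SLOPE = 0.70  # ConCert points per qualifying examination (QE) point


def score_shift(cycle, qe_delta=0.0):
    """Expected change in ConCert score relative to a cycle 1 examinee.

    cycle    -- ConCert cycle (1, 2, or 3)
    qe_delta -- difference in QE score, in points, between the physician
                of interest and the comparison physician
    Assumes additive, linear effects (an assumption of this sketch).
    """
    return CYCLE_EFFECT[cycle] + QE_SLOPE * qe_delta


# Two cycle 1 examinees whose QE scores differ by 5 points.
shift_qe = score_shift(1, qe_delta=5.0)

# The same physician at cycle 3 versus cycle 1.
shift_cycle3 = score_shift(3)
```

A 5-point difference in qualifying examination score corresponds to roughly 3.5 ConCert points, several times larger than either cycle effect, both of which are below 1 point.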

Table 1
Age Comparison of ConCert Examination Candidates

                        ConCert 1        ConCert 2       ConCert 3     Pearson’s chi-square*
Number of candidates    11,334           3,403           348
Age (yr),† mean (n)     43.4 (11,323)    52.6 (3,401)    60.3 (348)    p < 0.001

ConCert = Continuous Certification.
*The significance of difference in value across cycles is reported with Pearson’s chi-square calculation.
†Age is missing for 11 candidates in cycle 1 and two candidates in cycle 2, accounting for an overall reduction in sample size. Sample sizes are presented in parentheses. Because the mean age statistics are based on the entire available sample, CIs were not calculated because the true mean age was ascertained.

Table 2
ConCert Examination Scores Across Three Examination Cycles

Score Types               Observations    Mean Score    SD
Qualifying examination    15,085          86.6          5.5
ConCert cycle 1           11,334          86.4          5.5
ConCert cycle 2           3,403           87.1          5.7
ConCert cycle 3           348             87.0          5.0

Table 3
ConCert Examination Scores Controlled for Qualifying Examination Scores Using Growth Curve Modeling

Score Parameters                           Coefficient    95% CIs           Significance
ConCert cycle 1 performance (reference)    85.86          85.78 to 85.94    <0.01
ConCert cycle 2 performance                0.29           0.16 to 0.43      <0.01
ConCert cycle 3 performance                –0.50          –0.90 to –0.10    0.02
Qualifying examination score               0.70           0.68 to 0.72      <0.01

Figure 1. Continuous certification (ConCert) examination scores corrected for the qualifying examination (QE) scores.

DISCUSSION

The ConCert examination is the single summative evaluation activity within the ABEM MOC program. For that experience to be fair and valid, the medical knowledge that is assessed must be relevant to the clinical practice of EM. The medical knowledge within the core content of EM is rapidly changing. It is essential for the EP to have a substantial amount of medical knowledge available for immediate recall. The current environment of EM practice prohibits information retrieval for each individual patient encounter. Cassel and Holmboe6 assert that medical knowledge is an essential element of the clinical reasoning process and buttresses diagnostic acumen. Diagnostic acumen, the ability to translate undifferentiated symptoms into diagnoses, is of paramount importance to the EP. The assessment of cognitive expertise has been critically reviewed elsewhere by Brennan et al.7 Further, competence in medical knowledge is associated with competence in other areas, such as patient safety.7

The clinical relevance of the ConCert examination is further emphasized by using psychometrically validated items that measure diagnostic reasoning.8 All ABEM multiple-choice items on the examination are developed by a panel of extensively trained item writers. ABEM item writers must be clinically practicing EPs who are EM residency trained and have at least 5 years of clinical experience after initial certification. Item writers receive intensive training in item writing, psychometric principles, and statistics. Following authorship by an appointed item writer, all items are edited by ABEM examination editors, formatted by ABEM staff and editors, field tested to ensure fairness and validity, and finally included as scored items on the examination. When an item appears as a field test or scored item on the ConCert examination, candidates have the opportunity to comment on every item. Every comment is reviewed by both ABEM staff and the examination editors. An item can be scored only after a rigorous process of field testing, review of examinee comments, editorial review, and psychometric analysis. Once a pool of fair, reliable, and valid items is constructed, an examination can be assembled using a “blueprint” that includes weighted frequencies of items from the content areas of the Model of the Clinical Practice of Emergency Medicine, a distribution of condition acuity, physician tasks, and psychometric performance (e.g., difficulty, discrimination). The ConCert examination is composed of both Type I items that test the recall of facts and Type II items, clinical case presentations that require diagnostic reasoning. In general, about two-thirds of ConCert examination items are Type II items; one-third are Type I. For the 2012 ConCert examination, 75.5% of ConCert examination candidates commented that all or most items were relevant to the practice of EM, 20.2% felt that only some or few items were relevant, and 4.3% did not answer (unpublished data).

The ConCert examination is a criterion-referenced examination; it does not use a performance curve. As such, any candidate who achieves a predetermined equated score of 75 or higher passes the examination. There is no curve nor any quota for passing or failing scores. The passing score was determined by ABEM directors using a modified Angoff method, which is a standard psychometric determination.
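The text does not describe the details of ABEM's standard-setting exercise. As a general illustration, in Angoff-type methods each judge estimates, for every item, the probability that a minimally competent (borderline) candidate would answer correctly, and the cut score is the average across judges of the resulting expected scores. Modified variants typically add steps such as discussion rounds or feedback from performance data, which are not shown here. The following minimal Python sketch covers only that core computation; the function name `angoff_cut_score` and all ratings are hypothetical.

```python
def angoff_cut_score(ratings):
    """Compute an Angoff-style cut score as a percentage of items correct.

    ratings[j][i] is judge j's estimated probability (0 to 1) that a
    borderline candidate answers item i correctly.
    """
    if not ratings or not ratings[0]:
        raise ValueError("ratings must contain at least one judge and one item")
    n_items = len(ratings[0])
    if any(len(judge) != n_items for judge in ratings):
        raise ValueError("every judge must rate the same number of items")
    # Expected percent-correct score for a borderline candidate, per judge.
    judge_scores = [100.0 * sum(judge) / n_items for judge in ratings]
    # The panel's cut score is the mean of the judges' expected scores.
    return sum(judge_scores) / len(judge_scores)


# Hypothetical ratings from three judges on four items.
ratings = [
    [0.8, 0.7, 0.9, 0.6],
    [0.7, 0.8, 0.8, 0.7],
    [0.9, 0.6, 0.7, 0.8],
]
cut = angoff_cut_score(ratings)
```

The hypothetical ratings were chosen so that the result lands at about 75% purely for illustration. Because the standard depends only on judged item difficulty, it is independent of how any particular cohort performs, which is what makes the examination criterion referenced.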

The ConCert examination has been remarkably consistent and stable over time. As part of the equating of ConCert administration forms, ABEM calibrated the ConCert item bank using item response theory. This put all items on the same scale and ensured that the examination scores were equivalent. Subsequent analyses indicated that the computer and paper-and-pencil forms of the examinations evidenced similar difficulty, standard deviations (SDs), and reliability.

An examination blueprint is the set of design characteristics for an examination. For the ConCert, the blueprint includes the number of questions from the various domains within the EM Model, the levels of condition acuity, item difficulty, and the number of questions on pediatric and geriatric conditions. The content of the examination in terms of the examination blueprint remained the same over the duration of the study. There has not been a formal criterion study correlating the ConCert examination with other examinations, aside from the qualifying examination as discussed. In 2011, however, ABEM conducted a practice analysis, the results of which validated the content of the ConCert examination as being representative of EM practice.

Reliability is commonly reported as coefficient alpha (α), a measure of internal consistency. This alpha measures the homogeneity of the test items as samples of the content domain. It serves as an estimate of what the correlation would be between scores on equivalent forms of the test. The reliability of the examination has remained in the range from 0.84 to 0.89.
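For readers unfamiliar with the statistic, coefficient alpha for a test of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores. The short Python sketch below computes it from a matrix of item responses. The function name `cronbach_alpha` and the response data are hypothetical and unrelated to ConCert data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a matrix of item responses.

    item_scores[p][i] is the score of person p on item i (for example,
    1 for a correct answer and 0 for an incorrect one).
    """
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item across persons.
    item_vars = [
        pvariance([person[i] for person in item_scores])
        for i in range(n_items)
    ]
    # Variance of each person's total score.
    total_var = pvariance([sum(person) for person in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five examinees, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
```

For these toy responses alpha is 0.8. Population variances are used for both the items and the totals; using sample variances throughout would give the same ratio and therefore the same alpha.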

Each form of the ConCert examination is built to a standard difficulty and ABEM has been successful in closely meeting that standard across forms. Equating then rectifies any remaining difference in difficulty across forms. This results in little variation in mean scores across time. The Angoff standard-setting exercise was last conducted in 1996. ABEM will have a new standard-setting activity when the new examination formats are implemented in 2015.

Physicians who take the ConCert for the third cycle have been practicing for about 30 years since initial certification. The attrition of EPs obviously increases with age and duration in practice. The smaller group of physicians that took the ConCert for a third cycle is potentially influenced by retirement, death, physicians who opt to not maintain certification, and other factors. ABEM is unable to make a determination of the possible performance that a physician who opted to not maintain certification would have demonstrated. Nonetheless, when comparing EPs to other specialties that have reported MOC examination performance, EPs have maintained medical knowledge and diagnostic reasoning across a 30-year span, whereas physicians in other specialties appear to have declined.2,3,5

Successful performance on a recertification cognitive expertise examination in other specialties is associated with numerous factors, including continuing medical education participation, initial certification score, U.S. medical school graduation, group versus solo practice, younger age at the time of initial certification, and male sex.9–11 Studies have demonstrated an association between performance on MOC examinations and clinical performance. One study of 3,602 internists demonstrated that performance on the American Board of Internal Medicine cognitive expertise examination was associated with improved clinical practice.12

Although previous studies have demonstrated a decline in physician skills over time, EPs maintain their medical knowledge throughout their careers. Our study showed that even into the third ConCert examination cycle, about 30 years after the physicians were initially certified, medical knowledge was maintained, as measured by performance on the ConCert examination. There are several potential explanations for this finding. Since the beginning of EM as a specialty, a cognitive expertise examination has been a professional expectation to maintain ABEM certification. EPs might adjust their continuous professional development activities to address this requirement. Unlike many other specialists, EPs continue to practice general EM throughout their careers, thus maintaining constant exposure to general content in EM. Subspecialization in EM is somewhat limited. Among the 30,211 ABEM diplomates, only 972 subspecialty certificates have been issued (3.2%; unpublished data). Another reason for this sustained performance might be the clinical focus and relevance of the ConCert examination, written and edited by clinically active physicians. The longitudinal trends observed in this study would be highly unlikely if the ConCert examination contained esoteric information that one might only have recently acquired through residency training. Throughout a career in the general practice of EM, the EP is immersed in an environment of continuous knowledge acquisition. EPs often work with other EPs, consultants, and staff; attend CME events; and participate in educational lectures, conferences, and seminars for colleagues, prehospital personnel, and other emergency team providers. These interactions potentially promote the continued exchange of medical knowledge. This affirms the likelihood that continued learning occurs by practicing EPs over a 30-year career. It also suggests that the ConCert examination is clinically relevant and valid.

LIMITATIONS

Taking the ConCert examination is a voluntary activity and there may be a self-selection bias for physicians taking the ConCert examination. Any conjecture as to the potential performance of physicians who did not opt to take the ConCert examination is speculative. Had there been a significant decay in ConCert examination performance by physicians who took the examination over multiple cycles, one could reasonably assert either that medical knowledge had not been maintained or that the ConCert examination did not validly assess medical knowledge.

A further limitation is that ABEM did not begin to equate ConCert examination scores until 2004. Therefore, direct comparisons between examinations before 2004 and examinations from 2004 forward might be imperfect. Nonetheless, the size of this study, the long-term adherence to sound psychometric principles for examination development, and general trends in performance over time offer some assurance that using nonequated examinations for part of the study period would not have greatly affected results. Likewise, although only EM residency–trained physician scores were used, when the 7,931 practice pathway (grandfathered) physician scores were added for an internal review, the performance trends were essentially identical.

This analysis did not include candidates who had previously failed the examination. If all candidates had been included in the samples, the results may have been different. However, the inclusion of repeated performances by individual physicians (especially those who failed repeatedly) would introduce a distracting heterogeneity that would make the interpretation of results difficult.

Other limitations include small baseline differences and trend differences between the various groups. For example, geographic distribution and qualifying examination scores showed variation within certain groups. Sex differences between groups could not be determined. Finally, although these ConCert examination performance patterns have been internally noted in the past by ABEM, they may not allow generalization to future ConCert examinations or to other medical specialties.

CONCLUSIONS

Emergency physicians maintain their performance on the ConCert exam over time. The decay in medical knowledge seen in other medical specialties does not seem to be apparent among emergency physicians, as measured by a validated, secure, standardized cognitive expertise examination.

The authors thank Jenny L. Ritter and Robert G. Purosky for their assistance with data management in the project.

References

1. Choudhry NK, Fletcher RH, Soumerai SB. Systematic review: the relationship between clinical experience and quality health care. Ann Intern Med 2005;142:260–73.

2. Buyske J. For the protection of the public and the good of the specialty: maintenance of certification. Arch Surg 2009;144:101–3.

3. Rhodes RS, Biester TW. Certification and maintenance of certification in surgery. Surg Clin North Am 2007;87:825–36.

4. McKenna M. Aging gracefully? Patient safety advocates call for ongoing skills assessment for older physicians [abstract]. Ann Emerg Med 2011;58:15A–17A.

5. Xierali IM, Rinaldo JC, Green LA, et al. Family physician participation in maintenance of certification. Ann Fam Med 2011;9:203–10.

6. Cassel C, Holmboe ES. Professional standards in the USA: an overview and new developments. Clin Med 2006;6:363–7.

7. Brennan TA, Horwitz RI, Duffy FD, Cassel CK, Goode LD, Lipner RS. The role of physician specialty board certification status in the quality movement. JAMA 2004;292:1038–43.

8. Holmboe ES, Lipner RS, Greiner A. Assessing quality of care: knowledge matters. JAMA 2008;299:338–40.

9. Rhodes RS, Biester TW, Ritchie WP, Malagoni MA. Continuing medical education activity and the American Board of Surgery examination performance. J Am Coll Surg 2003;196:604–10.

10. Curry L. Use of CME programs: solo versus group practitioners. J Med Educ 1982;57:870–1.

11. Lipner R, Song H, Biester T, Rhodes R. Factors that influence general internists’ and surgeons’ performance on maintenance of certification exams. Acad Med 2011;86:53–8.

12. Holmboe ES, Wang Y, Meehan TP, et al. Association between maintenance of certification examination scores and quality of care for Medicare beneficiaries. Arch Intern Med 2008;168:1396–403.

Supporting Information

The following supporting information is available in the online version of this paper:

Data Supplement S1. Statistical methods: growth curve modeling.