Delphi Studies Lili Luo and Barbara M Wildemuth
THINGS TO AVOID IN CONDUCTING A DELPHI STUDY
At first glance, the Delphi method does not seem to be a very complicated research tool. However, it is not simple, and a lack of careful consideration of the problems involved in implementing such a study could lead to failure in achieving the research objective. Linstone and Turoff (1975) summarized a few things of which researchers need to be aware and which they need to avoid when conducting a Delphi study:
• Imposing your own preconceptions of a problem on the respondent group by overspecifying the Delphi questionnaires and not allowing for the contribution of other perspectives related to the problem
• Inadequately summarizing and presenting the group response
• Not ensuring common interpretations of the evaluation scales used in the questionnaires
• Ignoring, rather than exploring, disagreements so that discouraged dissenters drop out and an artificial consensus is generated
• Underestimating the demanding nature of a multiround Delphi study and not properly compensating the participants for their time and effort
• Ignoring misunderstandings that may “arise from differences in language and logic if participants come from diverse cultural backgrounds” (Linstone & Turoff, 1975, p. 7)
Although the Delphi method can be tailored to specific research designs in different studies, the preceding problems are the most common pitfalls of which you should be aware when planning a Delphi study.
EXAMPLES
Two Delphi studies will be examined here. The first (Neuman, 1995), which focused on problems and issues associated with using electronic resources in high school libraries, was conducted in four rounds with a panel of 25 library media specialists. The second (Ludwig & Starr, 2005), which focused on the future role of health sciences libraries, was conducted in three rounds with a panel of 30 librarians, building designers, and other stakeholders. Each study exemplifies a slightly different approach to the Delphi technique.
Example 1: High School Students’ Use of Databases
Neuman (1995) conducted a nationwide Delphi study to identify high school students’ most difficult problems in using electronic databases, potential solutions to these problems, and the most important policy issues related to the use of electronic resources in schools. The rationale for employing the Delphi method was that schools’ use of electronic information resources was just beginning, and “insights of early adopters” (p. 285) would provide crucial guidance for further development and application of these tools.
The first step was to identify a panel representative of early adopters of electronic information resources in high schools. Seven national experts in the use of databases with students suggested several lists of potential participants for this study. Neuman (1995) did not describe how these experts were identified or what criteria they were given for nominating potential panelists; however, this general approach seems appropriate. All the high school library media specialists (LMSs) on these lists were invited to participate through telephone calls. Eventually, 25 LMSs from 22 schools in 13 states formed the panel, including seven winners of Dialog’s Excellence in Online Education Award. The school was the unit of analysis for this study, so the schools’ demographic characteristics, such as size, location, ethnicity, and years of experience using online and CD-ROM systems, were described in this article, addressing Sackman’s (1975) concern that the demographic characteristics of Delphi panels are rarely provided. Although the purposive sampling of Delphi studies restricts the generalizability of the results, Neuman used these demographic data to make the case that the panel in her study was representative of the population of LMSs who use electronic information resources with high school students nationwide.
An important question in the design of a Delphi study is the source of the items on the first-round questionnaire. For this study, Neuman (1995) developed 226 statements in nine categories by combining the findings of two foundational case studies about students’, teachers’, and LMSs’ behaviors and perceptions regarding database use, and Malcolm Fleming’s (as cited in Neuman, 1995) five categories of effective instructional presentations. The marriage of empirical data and a well-established conceptual framework resulted in a comprehensive list of items covering the issues related to use and development of electronic information resources in schools. To further validate the initial questionnaire, Neuman asked several colleagues and LMSs to review it for the clarity of its directions and content.
The Delphi study was conducted in four rounds over 18 months. In the first round, the panelists were asked to rate 226 statements under 18 categories and subcategories on a 4-point Likert-type scale ranging from 1 (not important) to 4 (critically important) and to select those that were the most important in each category or subcategory. The second round was a replication of the first round, intended for the panel “to proceed toward consensus” (Neuman, 1995, p. 287). For each rated item, the range within one standard deviation above and below the mean was reported, and each panelist needed to decide whether to move his or her rating within this range. A rationale was to be provided for any rating left outside the range of consensus. A similar procedure was followed to reach consensus about the “most important” statements. Each panelist’s first-round responses were highlighted so that he or she could easily compare his or her own choices with the consensus of the group’s responses. This was a thoughtful action, allowing the panelists to keep track of their own responses and to make any revisions in light of the group’s responses. Since the primary goal of the Delphi method is to reach a consensus of opinions, defining the consensus range becomes an inherent part of any Delphi study. Neuman explicitly stated her definition of the consensus range, allowing her readers to understand how much consensus was achieved.
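The consensus range described above (one standard deviation on either side of the mean rating) can be sketched in a few lines of Python. This is only an illustration, not Neuman's actual procedure: the ratings are hypothetical, the function names are invented for this example, and the use of the sample (rather than population) standard deviation is an assumption, since the study report as summarized here does not say which was used.

```python
from statistics import mean, stdev


def consensus_range(ratings):
    """Return (low, high): one standard deviation below and above the mean."""
    m = mean(ratings)
    sd = stdev(ratings)  # sample standard deviation; an assumption
    return m - sd, m + sd


def outside_consensus(ratings_by_panelist):
    """Return the panelists whose rating falls outside the consensus range."""
    low, high = consensus_range(list(ratings_by_panelist.values()))
    return sorted(
        panelist
        for panelist, rating in ratings_by_panelist.items()
        if not low <= rating <= high
    )


# Hypothetical first-round ratings of one statement by six panelists.
round_one = {"P1": 4, "P2": 3, "P3": 4, "P4": 1, "P5": 3, "P6": 4}

low, high = consensus_range(list(round_one.values()))
print(f"consensus range: {low:.2f} to {high:.2f}")
print("asked to move their rating or give a rationale:", outside_consensus(round_one))
```

In a second-round questionnaire, the panelists returned by `outside_consensus` would be the ones asked either to move their rating into the range or to explain why they left it outside.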
The design of the third and fourth rounds was considerably different from the first two rounds. In the third round, the identified “most important” statements were grouped into five categories, and under each category these statements were ordered according to the frequency with which they had garnered votes in the second round. Panelists were asked to rank the statements within each category. In the fourth round, the statements were reordered based on the results of the third round, and panelists were asked to provide the final rankings for these statements.
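The ordering steps in the third and fourth rounds can be sketched as follows. The first function orders statements by the number of "most important" votes they received, as described above. The second aggregates panelists' rankings by mean rank; that aggregation rule is an assumption made for illustration, because the description here says only that the statements were reordered based on the third-round results. All statement labels and data are hypothetical.

```python
from collections import Counter


def order_by_votes(votes):
    """Order statements by how many panelists marked them 'most important'."""
    counts = Counter(votes)
    # Descending vote count; ties are broken alphabetically to stay deterministic.
    return sorted(counts, key=lambda s: (-counts[s], s))


def order_by_mean_rank(rankings):
    """Order statements by mean rank across panelists (1 = most important).

    Mean rank is an assumed aggregation rule, used here only as an example.
    Every ranking must contain the same set of statements.
    """
    statements = rankings[0]
    mean_rank = {
        s: sum(r.index(s) + 1 for r in rankings) / len(rankings)
        for s in statements
    }
    return sorted(statements, key=lambda s: (mean_rank[s], s))


# Hypothetical second-round votes within one category.
votes = ["S2", "S1", "S2", "S3", "S2", "S1"]
third_round_order = order_by_votes(votes)

# Hypothetical third-round rankings, one list per panelist, best first.
rankings = [["S1", "S2", "S3"], ["S2", "S1", "S3"], ["S1", "S3", "S2"]]
fourth_round_order = order_by_mean_rank(rankings)

print(third_round_order)
print(fourth_round_order)
```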
As with most Delphi studies, the sample size dwindled with the progression through the four rounds of questionnaires. The original panel consisted of 28 panelists from 26 schools in 16 states, but according to Neuman (1995), “panelists from two schools and one state dropped out in the first round, and panelists from two more schools in two states dropped out in the second; one panelist joined a colleague in completing the instruments for the study” (p. 285). Because the questionnaires used in Delphi studies are often fairly complex and/or long, it is rare that a study can be completed with all the panelists that begin the first round. Because the data from each round are used primarily to inform the design of the next round, input from all the panelists in each round is included in the analysis (i.e., data are not removed from previous rounds when a panelist drops out in a later round).
Because the third and fourth rounds took a different approach than the first two rounds, two sets of results are presented. The results from the first two rounds are the mean ratings for the problems and solutions listed in the questionnaire. From these mean
ratings, a rank ordering of importance can be established indirectly. The results from the last two rounds were based on the panelists’ votes for which statements were most important. Thus they are presented as a list of statements, ordered by importance. These two approaches to data collection and analysis provide slightly differing findings—two perspectives on the same issues. The differences illustrate the effects of your methods of data collection and analysis.
For the most part, Neuman’s (1995) study was rigorously designed and could be considered a good model for readers who are interested in using the Delphi method in identifying and prioritizing research issues. She used an unusual but reasonable method for selecting a panel of experts. The initial questionnaire had a firm foundation in the past research literature. The four rounds of the study led to significant consensus among the panelists. In particular, her clear explanation of her study procedures can help guide others in designing a Delphi study.
Example 2: The Future of Health Sciences Libraries
Ludwig and Starr (2005) conducted a Delphi study to investigate experts’ opinions about the future of health sciences libraries, with particular emphasis on the “library as place.” On the basis of experts’ statements about change in libraries, a three-round Delphi study was conducted with a panel of 30 experts.
The study began by assembling a small team of experts: 14 people (librarians, architects, and space planners) who had recently been involved in designing health sciences libraries. In response to open-ended questions, this group generated 80 pages of commentary, from which Ludwig and Starr (2005) extracted 200 “change statements.” From this set of statements, they removed any “that the experts generally agreed on, that described fairly common-place activities, or that were vague and/or difficult to interpret” (p. 317), leaving the 78 statements that formed the first-round questionnaire. As noted previously, one of the challenges of a Delphi study is to develop the items to be evaluated/ranked during the study. These are often generated during the first round, by the panel itself, in response to open-ended questions; a variation of this approach was used in this study. This open-ended expression of opinion was incorporated as a preliminary step in the iterative cycle of the Delphi study.
The panel recruited for the Delphi study included the 14 people who had participated in generating the first-round questionnaire. This group was augmented by recruiting additional members through postings to several e-mail lists “for key opinion leaders in health care, librarianship, information technologies, and building and design” (Ludwig & Starr, 2005, p. 317). The process used in this study for ensuring that the panel was composed of experts is clearly not foolproof. While the final panel does represent a diverse set of stakeholders in library planning (librarians, library building designers, information technology managers, library building consultants, and health care administrators), their expertise in these areas was not vetted prior to including them on the panel. The panel consisted of 30 participants; no mention is made of any attrition in this sample through the three rounds.
The first round of the Delphi study asked the panel to rate each of the 78 change statements on five different scales: “‘Likelihood of Change,’ ‘Desirability of Change,’ ‘Certainty of Answer,’ ‘Date Change Will Occur,’ and ‘Impact on Design’” (Ludwig & Starr, 2005, p. 317). The panel was also asked to explain the reasons for their responses. The second-round questionnaire was somewhat shorter; statements on which consensus was reached in the first round were eliminated, as were statements “for which there was substantial confusion” (p. 317). The panelists were provided with the mean ratings on each scale for each of the remaining items. The open-ended comments from the first-round questionnaire were also summarized on the second-round questionnaire, as “why or why not” commentary on the relevant change statement. In the second round, then, panelists were asked to rate each statement and provide their reasons, just as they had done in the first round. The same procedures were followed in the second round for analyzing the questionnaire responses. For the third round, the questionnaire was again shortened by eliminating those statements on which consensus had been reached or for which Ludwig and Starr concluded that there would be no consensus. The third round was then conducted.

Table 10.1. Excerpt from Ludwig and Starr’s study, showing the ways in which results were aggregated and presented

| Consensus ranking | Agree (%) | Disagree (%) | Statement | Desirability and impact |
| 4 | 92 | 8 | By 2015, many academic health sciences libraries will be integrated into multifunctional buildings. | Somewhat to highly desirable; substantial design impact |
| 5 | 8 | 92 | By 2015, few institutions will have a health sciences library due to the easy, yet secured, ability of desktop access to information. | Undesirable; substantially different design |
| 13 | 82 | 14 | By 2007, the majority of faculty will regard health sciences librarians as partners in curriculum development teams. | Some design changes if classrooms and labs are part of the equation |

Source: From Ludwig and Starr (2005, Table 1). Reprinted with permission of the authors.
Ludwig and Starr (2005) do not explicitly state a particular consensus level that needed to be reached for each statement to be included in the results. However, from the results included in the paper, we can conclude that they required at least 65 percent of the panelists to concur about a particular statement. It appears that these results are based primarily on the ratings on the ‘Likelihood of Change’ scale. Responses to the ‘Desirability of Change’ and the ‘Impact on Design’ scales are incorporated into the summary of results as commentary on the statement. Results from the ‘Date Change Will Occur’ ratings were added to the original statements in the report of results. To illustrate the way in which these results were aggregated and presented, an excerpt from table 1 of the paper (presenting the results for all 78 statements) is shown here as Table 10.1. Given the quantity and qualitative nature of much of the data, this is an excellent overview of the findings from this study.
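To make the idea of a consensus level concrete, the following Python sketch computes the percentage of a panel agreeing and disagreeing with a statement and checks it against a threshold. Several things here are assumptions rather than details from Ludwig and Starr's paper: the 65 percent default is only the level inferred above, the reduction of each panelist's rating to a simple agree/disagree response is a simplification, and the function name and sample data are hypothetical.

```python
def consensus_status(responses, threshold=0.65):
    """Summarize agree/disagree responses for one change statement.

    responses: list of booleans, True meaning the panelist agrees.
    threshold: share of the panel that must concur; 0.65 is the inferred
               level discussed in the text, not a rule stated by the authors.
    Returns (agree %, disagree %, status).
    """
    n = len(responses)
    if n == 0:
        raise ValueError("no responses to summarize")
    agree = sum(responses) / n  # True counts as 1
    disagree = 1 - agree
    if agree >= threshold:
        status = "consensus: agree"
    elif disagree >= threshold:
        status = "consensus: disagree"
    else:
        status = "no consensus"
    return round(agree * 100), round(disagree * 100), status


# Hypothetical panel of 25 respondents for three statements.
likely = [True] * 23 + [False] * 2
unlikely = [True] * 2 + [False] * 23
split = [True] * 13 + [False] * 12

for label, responses in [("likely", likely), ("unlikely", unlikely), ("split", split)]:
    print(label, consensus_status(responses))
```

Note that a panel can reach consensus in either direction: a statement that nearly everyone rejects has reached consensus just as surely as one that nearly everyone accepts. A statement in the "no consensus" group is a candidate for another round or for being dropped.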
This study illustrates the way in which the Delphi method can be used to predict the future for libraries. While the future is always uncertain, we still need to plan for it. Using the consensus of a trusted panel of experts, as in this case, can provide a basis for such planning. Here Ludwig and Starr (2005) were particularly concerned with the physical
aspects of library space and how that might be affected by other developments in library services. They recruited an appropriate panel of stakeholders in library buildings—those groups that are most likely to have thought about the questions of interest. If such a group reaches consensus about a trend, it is reasonable to conclude that you should take account of that trend in your own planning.
CONCLUSION
As can be seen from the two example studies discussed here, the Delphi technique is most appropriate in situations where the truth of the matter cannot be known through direct observation. Instead, we want to leverage the expertise of people who have thought about a particular issue or problem, seeking to find the consensus of their views on the matter. Through the careful selection of panelists and the iterative surveying of their opinions (with feedback), a Delphi study can provide results that will be useful in planning for the future.
WORKS CITED
Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458–467.
Fischer, R. G. (1978). The Delphi method: A description, review, and criticism. Journal of Academic Librarianship, 4(2), 64–70.
Gordon, T. J., & Helmer, O. (1964). Report on a Long-Range Forecasting Study. Retrieved January 6, 2009, from https://www.rand.org/pubs/papers/2005/P2982.pdf.
Helmer, O., & Rescher, N. (1959). On the epistemology of the inexact sciences. Management Science, 6, 25–52.
Linstone, H. A., & Turoff, M. (1975). The Delphi Method: Techniques and Applications. Reading, MA: Addison-Wesley.
Ludwig, L., & Starr, S. (2005). Library as place: Results of a Delphi study. Journal of the Medical Library Association, 93(3), 315–326.
Neuman, D. (1995). High school students’ use of databases: Results of a national Delphi study. Journal of the American Society for Information Science, 46(4), 284–298.
Okoli, C., & Pawlowski, S. D. (2004). The Delphi method as a research tool: An example, design considerations and applications. Information and Management, 42, 15–29.
Passig, D. (1997). Imen-Delphi: A Delphi variant procedure for emergence. Human Organization, 56, 53–63.
Sackman, H. (1975). Delphi Critique: Expert Opinion, Forecasting, and Group Process. Lexington, MA: Lexington Books.