Evaluation Criteria

A number of organizations and individuals have published, and in some cases implemented, criteria for evaluating the appropriateness or quality of health-related and other Web sites (Jadad and Gagliardi, 1998; Pealer and Dorman, 1997). Some of these criteria form the basis of tools that produce a summary rating or grade to help potential users assess a site. There are literally dozens of criteria proposed in the literature (Kim et al., 1999), many of which are closely related.

In selecting and prioritizing criteria for evaluating IHC applications, developers and other evaluators often will consider many factors, including the objectives of the application and the preferences and values of the evaluator and potential users. Once relevant criteria are identified, the relative weight assigned to each may vary depending on the application. For example, for an application that provides information about clinical trials to the general public, accuracy and appropriateness of content may receive relatively heavy weighting. In contrast, evaluators of an application that focuses on enhancing peer support for a chronic health condition among a disabled population may choose to emphasize the usability of the program.
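To make criterion weighting concrete, the minimal sketch below (in Python) shows how the same per-criterion ratings can yield different summary scores under different weighting schemes, in the spirit of the rating tools noted above. The criterion names, weights, and ratings are hypothetical illustrations, not values drawn from this report.

    def weighted_score(ratings, weights):
        # Combine per-criterion ratings (0-5 scale) into one weighted summary.
        total_weight = sum(weights.values())
        return sum(ratings[c] * w for c, w in weights.items()) / total_weight

    # An application offering clinical-trial information to the general
    # public might weight accuracy of content most heavily ...
    clinical_trials_weights = {"accuracy": 5, "appropriateness": 3, "usability": 2}
    # ... while a peer-support application for a disabled population
    # might emphasize usability instead.
    peer_support_weights = {"accuracy": 2, "appropriateness": 3, "usability": 5}

    ratings = {"accuracy": 4, "appropriateness": 3, "usability": 2}
    print(weighted_score(ratings, clinical_trials_weights))  # 3.3
    print(weighted_score(ratings, peer_support_weights))     # 2.7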

For general purposes, key criteria for evaluation that can be applied to most programs include (Henderson et al., 1999):

1. Accuracy of content. This includes a number of components, including currency and validity. Sometimes new and seemingly accurate information may not be validated under the scrutiny of time and broader experience. Ensuring the accuracy of the content is not always clear-cut because there is a close relationship between accuracy and other attributes of information. For example, information may be accurate but still misleading in certain contexts. Wide variations in the use of medical interventions have been linked to varying interpretations of the same evidence; accuracy can be in the eye of the beholder. In addition, in some applications the boundary between actual content and advertising may be blurred, and identification of the source of the content may be difficult.

2. Appropriateness of content. This includes applicability and intelligibility to the user. Many applications are intended for use by only certain groups of people or only in specific situations. Developers need to be explicit in identifying appropriate audiences, to ensure both that the content is applicable to such users and that likely users can understand it.

3. Usability. This measures how easily a user can get the program to perform as intended, and it is where quality interface design is critical. A flashy interface may appeal to the senses but actually make an application harder or more intimidating to use. Usability of any computer program, including IHC applications, is a combination of the following user-oriented characteristics (Shneiderman, 1997): 1) ease of learning, 2) high speed of user task performance, 3) low user error rate, 4) subjective user satisfaction, and 5) user retention over time. Hix and Hartson (1993) and Nielsen (1993) provide expert guidance on evaluating user interface issues, a process known as usability testing. (A sketch of how such measures might be computed from test sessions appears after this list.)

The three major usability classifications are efficiency, user satisfaction, and effectiveness. Characteristics such as cost savings or minimized training time fall under efficiency and are strong concerns for any organization investing in interactive learning. Ease of use, perceived benefit versus time invested, intuitiveness, and visual appeal are generally classified as user satisfaction. Immediate retention, retention over time, and transfer to actual job performance are categorized as effectiveness. Unfortunately, effectiveness is the classification least likely to be measured, even though it is the primary intent of education and training in most contexts. Another component of usability is acceptability: developers must take care that the interface, or its elements, does not intimidate or antagonize users.

4. Maintainability. This is important because an application’s content, design, and likely users may shift over time, requiring modifications to the program. Before the application is implemented, a plan is needed for who will implement changes, how the changes will be accomplished, and what resources will be required.

5. Bias. There are many sources of bias, including the origin of funding and the personal biases that arise from the background and training of developers and evaluators. It is imperative that developers and evaluators incorporate strategies to prevent or minimize bias. Sources of bias should be disclosed explicitly, but they cannot be eliminated entirely because the perception of bias depends on the individual user. Nevertheless, it is important to be sensitive to, and aware of, both potential and actual biases. For example, if a program incorporates an assumption that alternative medicine is good (or is bad), that assumption can be both limiting (e.g., narrowing to whom the program appeals) and dangerous (e.g., in terms of liability to both developer and provider). Although conflicts of interest do not necessarily lead to bias, it is often nearly as important to avoid the appearance of bias as to avoid bias itself. Thus, it is incumbent upon developers and evaluators to avoid any potential conflicts of interest. When this is not possible, it is essential to use the most objective and bias-resistant criteria.

6. Efficacy and effectiveness. These measure the extent to which a program actually has its intended impact. For example, for applications that promote behavior change, does the program actually help people adopt the new behavior? For decision support applications, does the program provide adequate, reliable information that enables the user to make an informed decision? Does it result in decisions demonstrably more consistent with the patients’ stated preferences? Technically, efficacy refers to a program’s impact under controlled (experimental) conditions, and effectiveness to its impact under real-world conditions. It is possible for a program to be efficacious in controlled trials but not very effective when implemented under field conditions, as the worked example after this list illustrates.
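As referenced in the usability discussion above, the following minimal sketch suggests how Shneiderman-style usability measures might be computed from logged test sessions. The session fields, values, and rating scale are illustrative assumptions only, not measures prescribed by this report.

    from statistics import mean

    # Each logged session: time on task, errors made, satisfaction (1-7),
    # and whether the user later passed a retention test.
    sessions = [
        {"task_seconds": 95,  "errors": 1, "satisfaction": 6, "retained": True},
        {"task_seconds": 140, "errors": 3, "satisfaction": 4, "retained": False},
        {"task_seconds": 80,  "errors": 0, "satisfaction": 7, "retained": True},
    ]

    speed        = mean(s["task_seconds"] for s in sessions)  # task performance speed
    error_rate   = mean(s["errors"] for s in sessions)        # user error rate
    satisfaction = mean(s["satisfaction"] for s in sessions)  # subjective satisfaction
    retention    = mean(s["retained"] for s in sessions)      # retention over time

    print(f"mean task time: {speed:.0f}s, errors per task: {error_rate:.1f}")
    print(f"satisfaction: {satisfaction:.1f}/7, retention: {retention:.0%}")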
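Finally, the efficacy/effectiveness distinction can be illustrated numerically. The counts below are hypothetical; they simply show how a program can look strong in a controlled trial yet weaker in the field.

    # Hypothetical counts: adoption of a promoted behavior in a controlled
    # trial versus a real-world deployment of the same program.
    trial_adopted, trial_n = 42, 60
    field_adopted, field_n = 95, 240

    efficacy = trial_adopted / trial_n          # impact under controlled conditions
    effectiveness = field_adopted / field_n     # impact under field conditions
    print(f"efficacy: {efficacy:.0%}, effectiveness: {effectiveness:.0%}")
    # 70% vs 40%: efficacious in the trial, much less effective in the field.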
