
Chapter 6 Discussions, Conclusions, Contributions, and Future Work

6.3 Limitations and Future Work

This study has several limitations. First, automated quality rating relies on pre-existing evidence-based quality criteria, and the approach cannot be applied when such criteria are not available. As evidence-based medical practice becomes more prevalent, evidence-based clinical guidelines are becoming widespread and are being developed for a wider range of

important properties: correctness, comprehensiveness, freedom from bias, and currency. In this study, the quality score generated by the semantics-based rating approach directly reflects the first two properties, since 1) the presence of an identified rating criterion (i.e., an evidence-based health practice guideline) is an indicator of the correctness of the information in a web document; and 2) the quality score is determined by how many different rating criteria are covered by the web page content, i.e., the comprehensiveness of the text content. However, it must be acknowledged that the current design of the semantics-based quality rating approach does not yet offset the quality score for information that directly contradicts the rating criteria. Thus, an anti-criteria sentence of this type, if contained in a web page, does not lower the quality score of that page. In addition, the semantics-based approach proposed in this study does not directly examine content currency or content bias. These two properties could be partially addressed by the underlying clinical guidelines, in that evidence-based clinical guidelines are likely to be bias-free and up to date. If a rating criterion has been derived from the latest evidence-based practice, the identification of that criterion on a web page attests to the currency of that information on the web page. At the same time, this information could be mixed with other health care information that is outdated; because no relevant criteria exist, the system will not recognize the outdated information as such, and it will have no effect on the quality score. The same limitation applies to the examination of content bias. It should be noted, though, that such issues could also influence human rating if that rating is performed using the same rating criteria.

Third, as acknowledged in Chapter 3, the approach in this study is not designed to rate web pages containing complex non-text formats. For example, the proposed approach cannot assess the quality of information contained in images, and if a web page contains tables, the content inside the tables may not be properly processed. In addition, multimedia web sources, such as audio and video, cannot be evaluated by this approach unless the information content is converted into text by speech recognition or other technologies.

Fourth, the semantics-based approach proposed in this study focuses on the “factual” aspect of content analysis, while the analysis of attitude and affect in the text is not covered. However, in recent years an increasing amount of web content, particularly on social media such as Facebook and Twitter, is rich in subjective opinions rather than facts (e.g., Kelly, 2009). For example, a patient may tweet about his or her personal feelings regarding an experience with a health condition, and a health care professional may share stories about a treatment trial, along with his or her subjective comments, through micro-blogging. Quality assessment of such “opinion-based” information will have to rely more on an analysis of sentiment in the content, including expressions of attitude, the writer’s certainty, and stylistic features of the writing such as forms of reference, tenses, and types of evidential language. In the last few years, investigation of such topics has gradually emerged as a new research subject, i.e., sentiment analysis, and has achieved some success (e.g., Shanahan et al., 2006; Rubin & Vashchilko, 2012). Exploring the use of state-of-the-art sentiment analysis technology for quality assessment of web health care information will be a necessary complement to the semantics-based approach proposed in this study.
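The scoring limitation discussed in this section, where the score counts the distinct rating criteria covered and sentences that contradict a criterion have no effect, can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the `LabeledSentence` structure, the criterion identifiers, and the `penalty` parameter are hypothetical, and the offset variant is only one possible way that future work might handle anti-criteria sentences.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LabeledSentence:
    """A sentence plus hypothetical classifier output (not the study's data model)."""
    text: str
    criterion_id: Optional[str]  # criterion the sentence relates to, or None
    contradicts: bool = False    # True if the sentence goes against that criterion


def coverage_score(sentences: Iterable[LabeledSentence]) -> int:
    """Count distinct criteria supported by at least one sentence.

    Contradicting sentences are simply ignored, which mirrors the
    limitation described above: they do not lower the score.
    """
    return len({s.criterion_id for s in sentences
                if s.criterion_id is not None and not s.contradicts})


def offset_score(sentences: Iterable[LabeledSentence], penalty: float = 1.0) -> float:
    """Hypothetical variant: subtract a penalty per distinct contradicted criterion."""
    sentences = list(sentences)
    contradicted = {s.criterion_id for s in sentences
                    if s.criterion_id is not None and s.contradicts}
    return coverage_score(sentences) - penalty * len(contradicted)


# Placeholder page: two criteria supported, one contradicted, one unrelated sentence.
page = [
    LabeledSentence("sentence agreeing with criterion C1", "C1"),
    LabeledSentence("another sentence agreeing with criterion C1", "C1"),
    LabeledSentence("sentence agreeing with criterion C2", "C2"),
    LabeledSentence("sentence contradicting criterion C3", "C3", contradicts=True),
    LabeledSentence("sentence unrelated to any criterion", None),
]

print(coverage_score(page))  # 2: the contradicting sentence has no effect
print(offset_score(page))    # 1.0: one contradicted criterion is subtracted
```

Under this sketch, a page that repeats a criterion gains nothing from the repetition, and the size of `penalty` would need to be calibrated against human ratings before any such offset could be relied on.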

This study is, to our knowledge, the first attempt to use a semantics-based approach (i.e., comparing text content with rating criteria) to automatically rate health care

scores strongly correlated with human rating results. In principle, the techniques and tools employed in this study, including the transformation from text to semantic tags, the classification methods for identifying sentences in concordance with rating criteria, and the UMLS resource, can be applied to process text in other biomedical sub-domains, as there is no domain-specific design in this approach. This study was built on a solid foundation of previous research (CEMBH, 1998; Griffiths & Christensen, 2002), in which researchers summarized and refined the evidence-based depression treatment guidelines into a set of one-sentence statements, so that each guideline statement could be easily converted into a rating criterion in the current study. In addition, it has been demonstrated previously (Griffiths & Christensen, 2002; Griffiths & Christensen, 2005; Griffiths et al., 2005) that the human rating quality scores generated from these specific evidence-based guidelines are highly correlated with subjective ratings performed by health care professionals; moreover, the generation of quality scores based on these guidelines has been demonstrated to have reasonably high inter-rater reliability. These factors were important to the successful quality rating achieved in this study on depression treatment web pages. Future studies can be conducted to verify whether this approach can succeed in rating the quality of information on other health conditions.
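One way such a verification study could compare automated scores with human ratings is to compute a correlation coefficient over the same set of web pages. The sketch below assumes the Pearson coefficient, since this section does not state which statistic was used. The function name `pearson_r` is illustrative, and the two score lists are made-up placeholders, not results from this study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores for five hypothetical pages (NOT data from the study).
automated_scores = [3, 5, 8, 10, 12]
human_scores = [4, 5, 9, 9, 13]

print(f"r = {pearson_r(automated_scores, human_scores):.3f}")
```

A rank-based coefficient such as Spearman's could be substituted if the scores are better treated as ordinal; that choice would depend on the rating scale used for the new health condition.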

In document Semantics-based Automated Quality Assessment of Depression Treatment Web Documents (Page 129-132)