2.3 Natural Language Processing in Radiology
2.3.2 Machine Learning Approaches
This section discusses the use of machine learning algorithms in radiology. It begins with a brief overview of machine learning and its applicability to radiology. It then introduces supervised machine learning, where algorithms are trained on labeled data. The supervised learning subsection has three paragraphs, each covering different algorithms: decision trees first, then the Maximum Entropy model, and finally Support Vector Machines (SVM) and Conditional Random Fields (CRF). The next subsection covers unsupervised learning, where algorithms are trained on unlabeled data. The section ends with a discussion of research on deep learning.
Overview
The machine learning approach overcomes most of the limitations of the rule-based approach. Built on statistical models, machine learning gives a system the ability to learn from complex raw data and to predict patterns in unseen data. In radiology, machine learning has many applications, such as early diagnosis of disease, medical image analysis, image reconstruction, and language processing of reports.
In their 2012 survey, Wang and Summers discussed the use of machine learning in radiology and examined the applications mentioned above [38]. The authors summarized several advantages of using machine learning in radiology. One is labour saving: as the number of reports and images has grown over the years, the workload of radiologists has increased beyond what they can handle. Machine learning systems can be trained to identify complex patterns and to help with labour-intensive work, so that radiologists can focus on higher-level tasks. Another advantage is performance: many machine learning systems were observed to perform comparably to humans, and some performed as well as expert radiologists. Machine learning can also be used to gain new insights from the data, for example which diseases become more prominent during a certain period of the year and under what conditions. It is hard for humans to examine large amounts of data and answer questions of this kind.
Supervised Learning
One of the earliest applications of machine learning in radiology was carried out in 1993 by Zigmond [44], who developed a software system called RadTRAC to monitor patient follow-up from free-text chest X-ray reports. He used a dictionary-based approach to identify findings related to malignancy in the reports and a decision tree algorithm (CART) to classify the reports into two categories: medical follow-up required versus no medical follow-up required. The RadTRAC system achieved a sensitivity of 90% and a specificity of 82% when tested on 470 radiology reports.
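For reference, sensitivity and specificity are computed from the confusion counts of a two-class decision such as follow-up required versus not required. The sketch below is a minimal illustration; the function name and the counts are hypothetical and are not taken from the RadTRAC evaluation.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion counts."""
    # Sensitivity: share of reports that truly need follow-up and were flagged.
    sensitivity = tp / (tp + fn)
    # Specificity: share of reports that need no follow-up and were left unflagged.
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=40, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```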
A more recent work, from 2013, extracted clinically important recommendations from radiology reports so that clinicians and other concerned persons do not miss any important recommendations or advice given by the radiologists for their patients [40]. The authors developed a recommendation extraction pipeline consisting of section segmentation, sentence segmentation, and recommendation extraction. In the feature extraction stage, they used UMLS to match free text from the reports to UMLS concepts, and they used a Maximum Entropy model, a supervised machine learning algorithm, for feature processing. The model was tested on 800 radiology reports and achieved an f-score of 0.758. This work continues an earlier work from 2011 [41] by the same authors, which pursued the same aim but identified section boundaries with a rule-based method, whereas the 2013 work used the Maximum Entropy algorithm for this step. The authors' motivation for using machine learning was to generalize the section identification rules, which in the 2011 work were specific to the reporting style of their institution. Although the 2013 work improved automation, the earlier rule-based work achieved a better f-score (87%).
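The f-scores reported throughout this section combine precision and recall into a single number. The sketch below assumes the balanced F1 variant (beta = 1); the cited works may use a different weighting, and the function name and counts are hypothetical.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # no correct predictions: precision and recall are both zero
    precision = tp / (tp + fp)  # share of predicted items that are correct
    recall = tp / (tp + fn)     # share of true items that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: precision = 0.75, recall = 0.60.
print(round(f_score(tp=6, fp=2, fn=4), 3))
```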
For named entity recognition, Conditional Random Fields (CRF) [18] are commonly used, with some variation, by many researchers. Li and his colleagues [20] compared SVM and CRF for disease named entity recognition and concluded that CRFs (f-score: 0.86) outperformed SVMs (f-score: 0.64). Torii [37] investigated the performance of CRF taggers for extracting clinical concepts and also tested the portability of such taggers across different kinds of datasets. Along with CRF, that study also used dictionary lookup from UMLS for matching concepts. In a master's thesis on structuring free-text radiology reports, Joost Timmerman [36] applied a linear-chain CRF (LC-CRF) for named entity recognition and achieved an f-score of 89.3%. CRFs have also been used in cascaded multi-stage systems, where a CRF is applied at several levels for multi-level named entity recognition. Esuli and his colleagues [7] developed a cascaded two-stage LC-CRF, with one stage identifying entities at the clause level and the other at the word level. They compared it with a second approach, a confidence-weighted ensemble method that combines two classifiers (a standard token-level LC-CRF and the cascaded two-stage classifier) and sums their results with equal weight. Tested on 500 mammography reports, the cascaded system performed slightly better (f-score: 0.873) than the ensemble (f-score: 0.858). Both systems outperformed the baseline, a standard one-level LC-CRF (f-score: 0.846).
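The equal-weight combination of two classifiers can be sketched as follows. This is only an illustration of the general idea, not the implementation of Esuli and his colleagues: it assumes that each classifier returns, for every token, a dictionary of label confidences, and the label names and numbers are hypothetical.

```python
def combine_equal_weight(conf_a, conf_b):
    """Pick one label per token from two classifiers' confidences.

    conf_a, conf_b: lists (one entry per token) of dicts label -> confidence.
    """
    chosen = []
    for scores_a, scores_b in zip(conf_a, conf_b):
        labels = set(scores_a) | set(scores_b)
        # Sum the two confidences with equal weight (0.5 each).
        combined = {
            label: 0.5 * scores_a.get(label, 0.0) + 0.5 * scores_b.get(label, 0.0)
            for label in labels
        }
        # Sorting first makes tie-breaking deterministic (alphabetical).
        chosen.append(max(sorted(combined), key=combined.get))
    return chosen


# Hypothetical confidences for a two-token sentence.
token_level = [{"O": 0.6, "FINDING": 0.4}, {"O": 0.9, "FINDING": 0.1}]
cascaded = [{"O": 0.2, "FINDING": 0.8}, {"O": 0.6, "FINDING": 0.4}]
labels = combine_equal_weight(token_level, cascaded)
print(labels)
```

With these numbers, the first token is labeled FINDING because the cascaded classifier's confidence outweighs the token-level classifier's preference for O.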
Unsupervised Learning
One disadvantage of supervised machine learning is that it requires a labeled training corpus. In unsupervised machine learning, no manual annotation is required and the machine infers the hidden structure in the data on its own. In 2013, Zhang and Elhadad studied biomedical named entity recognition using an unsupervised approach that requires no annotated data, rules, or heuristics [43]. Their system performs entity detection using a noun phrase chunker followed by a filter based on inverse document frequency, and entity classification using distributional semantics. They tested their system on the i2b2 and GENIA corpora and found that it outperformed a dictionary matching approach.
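A filter based on inverse document frequency (IDF) can be sketched as below. This is a minimal illustration under several assumptions that are not taken from Zhang and Elhadad: candidate noun phrases are already available, a phrase counts as occurring in a document if it appears as a substring, phrases with high IDF are the ones retained, and the threshold value is arbitrary. All names and example texts are hypothetical.

```python
import math


def inverse_document_frequency(phrase, documents):
    """IDF = log(N / number of documents containing the phrase)."""
    n_containing = sum(1 for doc in documents if phrase.lower() in doc.lower())
    if n_containing == 0:
        return 0.0  # unseen phrase: treated as uninformative here (assumption)
    return math.log(len(documents) / n_containing)


def filter_candidates(candidates, documents, min_idf):
    """Keep candidate phrases whose IDF reaches the threshold."""
    return [
        phrase for phrase in candidates
        if inverse_document_frequency(phrase, documents) >= min_idf
    ]


# Hypothetical corpus and candidate noun phrases.
documents = [
    "the patient has a small nodule",
    "the patient reports no pain",
    "the patient was discharged",
    "the scan shows a pleural effusion",
]
candidates = ["the patient", "small nodule", "pleural effusion"]
kept = filter_candidates(candidates, documents, min_idf=1.0)
print(kept)
```

Here "the patient" occurs in three of the four documents, so its IDF is low and the phrase is filtered out, while the two rarer phrases are kept.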
Deep Learning
Deep learning has recently attracted a great deal of research. It was first applied mainly to image analysis and has since been extended to text. Huang and his colleagues [14] were the first to apply a bidirectional LSTM-CRF (Bi-LSTM-CRF) to text data for sequence tagging. The bidirectional LSTM component makes use of both past and future features, and the CRF component makes use of sentence-level tag information. Their system achieved an f-score of 84.26% on a named entity recognition task on the CoNLL-2003 dataset. Another very recent work, from 2017, was carried out by a group from Stanford University, who used a deep learning convolutional neural network (CNN) to classify free-text radiology reports [4]. They applied their method to extract pulmonary embolism findings from thoracic computed tomography (CT) reports and compared it with a traditional NLP model, peFinder [3]. The CNN model (f-score: 0.938) outperformed the peFinder model (f-score: 0.867).
2.4 Summary
To conclude the review of related work, the following points should be noted:
1. Radiology reports need to be concise, clear and understandable for proper communication of knowledge to the outside world and for proper diagnosis of patients.
2. Many radiologists prefer a structured reporting style over free text. However, structured reporting should not be imposed on radiologists, and it should be designed so that it does not lower the accuracy of the reports.
3. Two natural language processing approaches are used for the analysis of radiology reports: rule-based and machine learning. We use the machine learning approach because its algorithms are trained automatically to recognize patterns, unlike rule-based systems, where the rules are handcrafted by experts. We did not want to overburden the radiologists with the task of rule creation.
4. As seen from the literature, Conditional Random Fields are the best-performing algorithm for sequence labeling; we therefore use this algorithm in our project.
5. In one very recent work, deep learning performed very well on radiology reports, but deep learning models require a lot of training data. Because only limited labeled data is available, we will not be able to use deep learning for our task.
Chapter 3
Theoretical Background
In this chapter, we give an overview of the machine learning models used in this project. We also give an overview of the radiology reporting standard for breast cancer.
3.1 Machine Learning Overview
Machine learning is a technique for making systems learn from data using statistical techniques, without explicitly creating rules. Systems trained with machine learning algorithms are then used to make predictions on new data. There are two main machine learning approaches, supervised and unsupervised. The main difference between them is that in supervised learning, systems are trained on labeled data, whereas in unsupervised learning, no labeled data are provided. These approaches are explained in detail in the next subsections.