Determining whether RBCs are infected or healthy requires at least a three-way classification of the pixels in a thin-film microscope slide image – whether they correspond to the background (plasma or possibly an artefact or WBC); to a healthy part of a RBC; or to a stained parasite.
The only simplifications are that, for determination of the degree of infection or parasitemia, only parasites within RBCs are of interest (i.e. before the cell is disrupted, section 1.2) and that there is almost never more than one parasite within an infected cell. Classification of pixels as belonging to a healthy part of a RBC or to a parasite infection may thus be made as a binary decision on each of the pixels that have already been segmented from the background in a hierarchical architecture (section 4.2.6). This is an alternative to attempting the multi-class discrimination directly, which is not likely to be a viable approach.
Typically, although some 40–50% of the pixels in an image may belong to RBCs, only a few percent of the RBCs may be infected and, unless almost mature, a parasite may occupy only a small fraction of the area of a RBC. The number of pixels in an image belonging to stained parasite infections may thus be very small compared to the 1.3M pixels in an image (say 1/1000th, given the remarks above), which would make the infected pixels a very small minority class indeed. Detecting such pixels, subsequently determining whether they lie inside a RBC and that the cell is infected, and reliably counting the small number of such cells are thus all quite difficult tasks. Only preliminary work has been carried out, and satisfactory solutions that would, for example, enable an accurate estimate of the parasitemia to be made have yet to be found. Nevertheless, in the remainder of this chapter we describe this preliminary work as it may provide a useful guide to further research.
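The order-of-magnitude estimate above can be checked with a few lines of arithmetic. In this minimal sketch only the 40–50% RBC coverage and the 1.3M image size come from the text; the infected fraction (3%) and the parasite-to-cell area ratio (10%) are purely illustrative assumptions standing in for "a few percent" and "a small fraction".

```python
# Rough estimate of the parasite pixel fraction in one image.
rbc_pixel_fraction = 0.45      # midpoint of the 40-50% quoted in the text
infected_rbc_fraction = 0.03   # ASSUMED value for "a few percent"
parasite_area_fraction = 0.10  # ASSUMED value for "a small fraction" of a cell
image_pixels = 1_300_000       # 1.3M pixels, as quoted in the text

parasite_pixel_fraction = (
    rbc_pixel_fraction * infected_rbc_fraction * parasite_area_fraction
)
parasite_pixels = parasite_pixel_fraction * image_pixels

print(f"parasite pixel fraction ~ {parasite_pixel_fraction:.5f}")
print(f"parasite pixels per image ~ {parasite_pixels:.0f}")
```

With these assumed values the fraction is of the order of 1/1000, consistent with the figure suggested in the text; other plausible choices shift it by a small factor but not by an order of magnitude.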
6.2.1 A structural approach?
In addition to their propensity to favour balanced classes, unsupervised multi-class algorithms are very general. They therefore make little or no use of any regularities or structure in the problem. In image processing and computer vision, the aim is often to utilise spatial structure but we have seen that, in particular for the parasite infections, it can be quite difficult to encapsulate such structure. However, when the feature space is one-dimensional and the input is a histogram, some structure of the histogram may be characterised and utilised in a quite general manner. For example, inspection of the intensity histograms indicates that pixels corresponding to stained parasite infections are deep in the lower tail of the histogram.
Consider image number 1 shown in figure 5.1 (a). The Otsu algorithm applied to the intensity histogram of this image, shown in figure 6.11 (a), may be used to segment pixels belonging to RBCs (b) with a threshold T = 155. Background pixels above this threshold may be removed and the peak in the remainder of the histogram easily found automatically, in this case at I_P = 134. If we then reflect the threshold T about I_P and retain only the tail of the histogram below I_P − (T − I_P) = 113, we are left (figure 6.11 (c)) with a cluster which appears (see figure 6.12 (a)) to include pixels that may belong to parasite infections but also some others. A further application of the Otsu algorithm to this histogram produces a threshold of T = 73 and leaves only pixels very deep in the tail of the histogram that seem, in addition to some artefacts in the background plasma, to be predominantly stained parasite pixels, as shown in figure 6.12 (b) and (c).
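The sequence of steps just described (Otsu threshold, removal of the background, peak location, reflection of the threshold about the peak, and a second Otsu threshold on the remaining tail) might be sketched as follows. This is only an illustrative sketch, not the code used for the figures: the function names are hypothetical, an 8-bit greyscale image is assumed, and the convention that a pixel at exactly the threshold belongs to the lower class is an assumption.

```python
import numpy as np


def otsu_threshold(hist):
    """Return the level t that maximises the between-class variance
    of the split {levels <= t} versus {levels > t}."""
    hist = np.asarray(hist, dtype=float)
    levels = np.arange(hist.size)
    w0 = np.cumsum(hist)             # pixel count at or below each level
    w1 = w0[-1] - w0                 # pixel count above each level
    s0 = np.cumsum(hist * levels)    # intensity sum at or below each level
    valid = (w0 > 0) & (w1 > 0)      # both classes must be non-empty
    between = np.zeros(hist.size)
    mu0 = s0[valid] / w0[valid]
    mu1 = (s0[-1] - s0[valid]) / w1[valid]
    between[valid] = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return int(np.argmax(between))


def structural_tail_segmentation(image):
    """image: uint8 greyscale array (assumed). Returns the three
    thresholds, the peak position and a mask of the deepest tail."""
    hist = np.bincount(image.ravel(), minlength=256)
    t_rbc = otsu_threshold(hist)     # first Otsu split: RBCs vs. background
    fg = hist.copy()
    fg[t_rbc + 1:] = 0               # discard background above the threshold
    i_peak = int(np.argmax(fg))      # peak of the remaining histogram
    t_tail = 2 * i_peak - t_rbc      # reflect the threshold about the peak
    tail = fg.copy()
    tail[max(t_tail, 0):] = 0        # keep only the tail with I < t_tail
    t_par = otsu_threshold(tail)     # second Otsu split, within the tail
    mask = image <= t_par            # candidate stained parasite pixels
    return t_rbc, i_peak, t_tail, t_par, mask
```

For the values quoted above, the reflection step evaluates to 2 × 134 − 155 = 113, as in the text. The sketch does not guard against an empty tail (for example when the reflected threshold falls below the darkest pixel), which a practical implementation would need to handle.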
This example shows that there are approximately 905000 pixels in the background plasma and ∼ 433800 pixels in the healthy parts of the RBCs but only ∼ 7200 putative stained parasite pixels. Some of these appear to be artefacts but the numbers³ nevertheless confirm and emphasize the smallness of the parasite pixel class and the difficulty of segmenting it.
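The class proportions implied by these counts can be reproduced directly. In the short sketch below the three counts are those quoted above; taking the healthy RBC pixel count as the denominator for the RBC-relative percentage is an assumption.

```python
# Approximate pixel counts quoted in the text for image number 1.
counts = {"background": 905_000, "healthy_rbc": 433_800, "parasite": 7_200}

total = sum(counts.values())

# Parasite pixels as a share of the whole image.
parasite_share_of_image = counts["parasite"] / total

# Parasite pixels relative to healthy RBC pixels (ASSUMED denominator).
parasite_share_of_rbc = counts["parasite"] / counts["healthy_rbc"]

# Number of non-parasite pixels for every parasite pixel.
imbalance = (total - counts["parasite"]) / counts["parasite"]

print(f"total pixels: {total}")
print(f"parasite share of image: {100 * parasite_share_of_image:.2f}%")
print(f"parasite share relative to RBC pixels: {100 * parasite_share_of_rbc:.2f}%")
print(f"non-parasite pixels per parasite pixel: {imbalance:.0f}")
```

The resulting imbalance, well over a hundred non-parasite pixels for each putative parasite pixel, illustrates why algorithms that favour balanced classes are poorly suited to this segmentation.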
It would also seem from the above that such an approach might work more generally since, to determine the parasitemia, it is only necessary to detect malaria infections within RBCs and thence to decide whether a cell is infected or not. A crude segmentation of parasite pixels within a RBC might thus suffice, rather than the detailed segmentation that would be required for determining the size, shape and other characteristics of a parasite in order to classify the type of infection and stage of its life cycle [175, 176]. Furthermore, since the Giemsa stain of the parasite is dark blue, one could also envisage processing individual colour channels in a similar manner and possibly combining information from two or more colour channels. However, the approach is rather ad hoc (there is no reason to expect the histogram to have the symmetry proposed) and thus potentially fickle. It will only be used for comparison with the other methods discussed below, which are based on recursive application of the Otsu algorithm.
³ The number of stained pixels detected is ∼ 0.5% of the image pixels and ∼ 1.7% of putative RBC pixels.
Figure 6.11: (a) Intensity histogram of image number 1 shown in figure 5.1 (a), for which the intensity-based RBC pixel segmentation was shown in figure 6.3; (b) the peak of the lower part of the histogram at I = 134; (c) the tail at values of I < 113, selected from structural considerations as described in the text.
Figure 6.12: Results of the ‘structural segmentation’: (a) pixels segmented from the tail of the histogram selected in figure 6.11 (c); (b) pixels remaining with intensity below I = 73 after a further application of the Otsu algorithm; (c) as in (b) but superimposed on the original image shown in figure 5.1 (a).