
Image Processing Algorithms for Retinal Montage Synthesis, Mapping, and Real-Time Location Determination

Douglas E. Becker, Ali Can, James N. Turner, Howard L. Tanenbaum, and Badrinath Roysam,* Member, IEEE

Abstract—Although laser retinal surgery is the best available treatment for choroidal neovascularization, the current procedure has a low success rate (50%). Challenges such as motion-compensated beam steering, ensuring complete coverage, and minimizing incidental photodamage can be overcome with improved instrumentation. This paper presents core image processing algorithms for 1) rapid identification of branching and crossover points of the retinal vasculature; 2) automatic montaging of video retinal angiograms; and 3) real-time location determination and tracking using a combination of feature-tagged point-matching and dynamic-pixel templates. These algorithms trade off conflicting needs for accuracy, robustness to image variations (due to movements and the difficulty of providing steady illumination) and noise, and operational speed in the context of available hardware. The algorithm for locating vasculature landmarks performed robustly at a speed of 16–30 video image frames/s, depending upon the field, on a Silicon Graphics workstation. The montaging algorithm performed at a speed of 1.6–4 s for merging 5–12 frames. The tracking algorithm was validated by manually locating six landmark points on an image sequence with 180 frames, demonstrating a mean-squared error of 1.35 pixels. It successfully detected and rejected instances when the image dimmed, faded, lost contrast, or lost focus.

Index Terms— Montage synthesis, real-time image processing, retinal images, tracking.

I. INTRODUCTION

SOME of the most common blinding conditions are caused by choroidal neovascularization (CNV). The relevant conditions include age-related macular degeneration [1], histoplasmic choroiditis, idiopathic CNV, post-traumatic CNV, post-inflammatory CNV, degenerative myopia, angioid streaks, post-laser treatment, and any condition that causes a rupture of Bruch’s membrane. At present, the only proven modality of effective treatment is the application of laser energy to the CNV to cauterize the vessels [2]–[4]. The key to effective and lasting treatment is identifying the full extent of the CNV and cauterizing it completely by accurately aiming an appropriate amount of optical energy, while ensuring that healthy tissue is not cauterized. Despite the superiority of laser treatment over other available methods, serious problems remain. The current rate of success of this procedure is less than 50% for eradication of the CNV following one treatment session, with a recurrence and/or persistence rate of about 50% [5]–[7]. The latter condition requires re-treatment. Each re-treatment, in turn, has a 50% failure rate. The visual recovery declines with each successive treatment. Indeed, several studies indicate that incomplete treatment was associated with poorer prognosis than no treatment [8]–[10].

Manuscript received August 27, 1996; revised May 14, 1997. This work was supported by the National Science Foundation under Grant MIP-9412500.

Asterisk indicates corresponding author.

D. E. Becker was with the Electrical, Computer, and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180-3590 USA. He is now with Siemens Medical Systems, Hoffman Estates, IL 60195-5203 USA.

A. Can is with the Electrical, Computer, and Systems Engineering Department, Rensselaer Polytechnic Institute, Troy, NY 12180-3590 USA.

J. N. Turner is with the Wadsworth Center for Laboratories and Research, New York State Department of Health, Albany, NY 12201-0509 USA.

H. L. Tanenbaum is with The Center for Sight, Albany, NY 12204 USA.

*B. Roysam is with the Electrical, Computer, and Systems Engineering Department, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590 USA (e-mail: [email protected]).

Publisher Item Identifier S 0018-9294(98)00250-X.

A new computer-controlled instrument is being developed in an attempt to minimize the above-mentioned failure rate [12]. Among other functions, this instrument is intended to perform montaging, mapping, real-time tracking of the retina, and spatial dosimetry of the applied laser energy. This investigation builds upon considerable prior work in this area [13]–[24]. The core automated image processing algorithms are the subject of this paper. The two principal algorithms described here are as follows.

1) Automated synthesis of wide-area retinal montage and map: This algorithm is used to combine a number of fundus camera images of a patient’s retina into a montage with a consistent coordinate system. In comparison with existing methods [11], [19]–[21], [25], the proposed algorithm is automated and is optimized for speed. As such, it does not incorporate visual refinements such as image warping. It is designed to support the second algorithm noted below, although it may have applications of its own. Fig. 3 shows an example of a retinal montage generated using this method.

2) Location determination and tracking: This algorithm is designed to be used in a computer-assisted laser delivery system to determine the location of a live retinal fundus video image relative to the wide-area retinal montage map, in real time. In other words, this algorithm is used to track the patient’s retina relative to the retinal map, and provide control signals to a computer-controlled laser delivery system. This algorithm builds upon the work of Barrett et al. [16] and Markow et al. [17].

The above-mentioned algorithms are in turn constructed from the following three component algorithms.

0018–9294/98$10.00 © 1998 IEEE


1) Algorithm for rapid detection and characterization of vasculature landmarks: To identify a particular area of a patient’s retina, landmarks that usually correspond to branching and crossover points in the vasculature are detected in each retinal image. The algorithm presented here differs from the work of Goldbaum [22] in that it is designed to operate very rapidly and with repetitive consistency, at the expense of absolute accuracy.

2) Fast algorithm for matching sets of vascular landmark points: This algorithm quickly computes the original spatial transformation linking two sets of retinal landmarks (extracted from two retinal images). It is used not only to construct the retinal image montage, but also to assist with real-time image tracking.

3) Algorithm for validation and improvement of transformations: This algorithm determines whether to accept or reject a transformation produced by the previous algorithm. This is needed since the transformation produced by the landmark matching algorithm may not be accurate (due to poor image quality). In addition, it is used to refine an acceptable transformation.

II. METHODS

A. Image Acquisition

The experimental data for evaluating the image processing algorithms were acquired using a Hitachi KPM-1 low-light monochrome charge-coupled device (CCD) video camera attached to the eyepiece of the TOPCON TRC-501A fundus camera using red-free illumination. A healthy subject was used, with eyes dilated. The video camera was interfaced directly to the video input on a Silicon Graphics Indy Workstation with a 150-MHz R4400 CPU, 96 MB of memory, and 1 GB of system disk. This system captured uncompressed video images to main memory at a resolution of 640 × 480 pixels at a rate of 30 frames/s. It was used to capture both individual frames and real-time video sequences lasting 6 s each.

The 6-s sampling duration was dictated by the available main memory on the Indy computer, and the desire to avoid any form of lossy image compression that may degrade image quality. The subject was asked to move his eye in order to deliberately make it difficult to track. As an artifact of the collection procedure, the eyepiece crosshairs are visible in all the frames. The programs that processed these images were designed to ignore them, knowing their fixed locations relative to the image. Each frame in the video sequence represents a partial view of the retina, and the zoom factor (magnification, scale) can vary from frame to frame, due either to selection of a different magnification setting on the fundus camera or movement of the camera nearer or farther from the patient’s eye (which may be necessary for focusing).

Several aspects of retinal images in general, and live video retinal images in particular, make automated processing difficult. First, the images are highly variable. The naturally high variability of fundus images between patients is widely acknowledged. The variability of live images is especially high, due to unavoidable movements and the difficulty of providing steady illumination. For instance, the need arises to process image frames that are dim, out of focus, motion blurred, or corrupted with optical effects such as glare or nonuniform illumination. Examples of such low-quality frames that must be processed in a consistent manner are available from the World Wide Web (http://www.rpi.edu/~roysab). It is important that such frames do not result in an erroneous action by the instrument. During standard retinal still photography, skilled technicians quickly re-focus the camera for optimum illumination for each picture. Images are only taken when the illumination is optimal. Any images with insufficient illumination or excessive glare are simply discarded. The image processing subsystem of the proposed automated retinal surgical system does not have this luxury. It must be able to work with suboptimal illumination, detect when image quality is too poor for processing, and reject such frames. Image processing in the face of high variability ordinarily requires the application of adaptive image processing algorithms that require a large number of operations/pixel. However, particularly for real-time tracking, the time needed for such elaborate processing is simply not available.

The algorithms described below were designed in the context of such conflicting needs. They are designed to be simple and quick, yet capable of automatically skipping over invalid image frames.

The extent of scene motion between retinal image frames can be extremely high due to the speed at which the eye moves [26]. The saccades involve sudden jumps of up to 15°. These movements occur at speeds ranging from 90 to 180°/s. The mean peak acceleration can range from 15 000 to 22 000°/s². Indeed, the interframe motion at the imaging speed of 30 frames/s is high enough to preclude useful pixel-level interframe correlations, so only feature-based image processing approaches were considered.
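As a rough check on the size of this interframe motion, the angular displacement per frame can be estimated from the quoted saccade speeds. The sketch below assumes, as a simplification introduced here, that the eye moves at a constant angular speed $\omega$ over one frame interval at frame rate $f$; the symbols $\Delta\theta$, $\omega$, and $f$ are illustrative and do not appear in the original text.

```latex
\Delta\theta = \frac{\omega}{f}, \qquad
\frac{90^\circ/\mathrm{s}}{30\ \mathrm{frames/s}} = 3^\circ \text{ per frame}, \qquad
\frac{180^\circ/\mathrm{s}}{30\ \mathrm{frames/s}} = 6^\circ \text{ per frame}.
```

Under this assumption, a saccade can shift the scene by several degrees between consecutive frames, which is consistent with the decision to rely on features rather than pixel-level correlation.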

B. Algorithms for Rapid Detection and Characterization of Vasculature Landmarks

A natural and widely used choice of features is the set of retinal vasculature landmark points. These points must have the following properties to be useful. They: 1) must be at fixed locations on the patient’s retina; 2) must be present in sufficient numbers in all areas of the retina for effective location determination; 3) must be detectable in different images of the same area of the retina even when the images differ in magnification, focus, and lighting; 4) must be quickly detectable. Points identifying bifurcations and crossing points of the retinal vasculature generally meet these requirements (with exceptions such as those arising in retinal detachment).

One possible method of detecting blood-vessel branching and crossover points, for example the method of Goldbaum et al. [22], [24], [27], is based on identifying and determining the locations of the blood vessels by boundary detection or segmentation, thinning the vessels to a single pixel width, and determining the points where blood vessels meet. This method is not appropriate for this application because its computation time is too high for real-time operation.

Fig. 1. Illustrating the image processing steps involved in the rapid detection of vasculature landmarks. (a) One red-free video image frame. (b) Result of minimum filtering showing slight widening of the vasculature. (c) Result of Sobel edge detection. (d) Result of thinning the Sobel edges. (e) Edge direction dispersion measure shown inverted (the dark regions indicate high-dispersion regions). (f) Result of thresholding the edge direction dispersion image. The dots indicate detected vasculature landmarks.

The method presented here is much more direct in the sense that it bypasses the steps related to segmentation, thinning, and skeleton analysis. This method proceeds in two steps (please refer to Fig. 1 for an illustration). First, the boundaries of the retinal vasculature are detected using a standard Sobel edge detection algorithm [28], after the image has been smoothed and the vasculature thickened using a minimum filter [29]. The Sobel operator also computes edge directions (perpendicular to the image intensity gradient) in the image. The detected edge directions are then normalized so that opposite sides of a blood vessel have edges pointing in the same direction. Finally, the edges detected by the Sobel detector are thinned using a single-pass algorithm described by Anarim et al. [30]. The second step is to identify points in the image around which the edge direction varies significantly. The justification behind this approach is that for most of the images, the edge direction does not vary greatly over small areas of the image because the edges correspond to the boundaries of blood vessels with a small curvature. Therefore, where the edge directions do vary significantly, it is likely that this is where a blood vessel is splitting off into two different directions, or crossing another vessel. This motivates the following approach. A 9 × 9 window is considered, centered at each edge pixel in the image. If the number of edge pixels in the window exceeds a threshold, an edge direction dispersion measure is computed over the window. This edge direction dispersion measure is computed as follows:

$$ D = N - \left\| \sum_{i \in W} \mathbf{d}_i \right\| \tag{1} $$

where $W$ represents the set of all edge pixels in the window, $N$ is the number of edge pixels in the window (the number of elements in $W$), $\mathbf{d}_i$ is the (unit) direction vector of each edge pixel, and $\|\cdot\|$ is an operator computing the Euclidean magnitude.

Watson [31] has provided a detailed explanation as to why this is a reasonable dispersion measure for directional data.
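The behavior of such a dispersion measure can be illustrated with a short sketch. It assumes the measure takes the form $N$ minus the magnitude of the summed unit direction vectors, and that the edge directions passed in have already been normalized as described above; the function name `edge_direction_dispersion` and the angle-based input are illustrative choices, not part of the original system.

```python
import numpy as np

def edge_direction_dispersion(angles):
    """Dispersion of the edge directions found in one window.

    `angles` holds the direction (in radians) of each of the N edge pixels in
    the window, assumed to be already normalized so that both sides of a
    vessel share one direction.
    ASSUMPTION: dispersion = N - || sum of unit direction vectors ||.
    """
    angles = np.asarray(angles, dtype=float)
    n = angles.size
    if n == 0:
        return 0.0
    # Magnitude of the resultant of the N unit vectors.
    resultant = np.hypot(np.cos(angles).sum(), np.sin(angles).sum())
    return float(n - resultant)

# Straight vessel: every edge pixel points the same way.
straight = edge_direction_dispersion([0.3] * 10)
# Branch point: half the edge pixels are perpendicular to the other half.
branch = edge_direction_dispersion([0.0] * 5 + [np.pi / 2] * 5)
```

With these inputs the straight segment yields a dispersion of essentially zero, while the branch-like window yields a clearly positive value, matching the qualitative argument in the text.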

Pixels with a locally maximum dispersion value, provided that the dispersion exceeds a threshold, are taken to be possible landmark points. It has been shown that the expected value of this dispersion measure is greater for vessel branches and crossovers than for points along a straight vessel [12]. This can be understood by considering a perfectly straight blood-vessel segment [see Fig. 1(d) and (e)]. For this segment, all the edges point in the same direction, so the dispersion in a window containing the vessel is zero. If the vessel were slightly curved, the dispersion would be nonzero, but still small. At a branch point or a crossover, by contrast, the edge directions vary greatly, so the dispersion is large. While the above approach to landmark detection is much faster than traditional methods, it is not nearly as accurate, in the sense that the detected points do not always coincide with branching and crossover points. However, for this work, this is not important. It is more important that the landmarks be consistent and robustly reproducible across image frames. The above procedure has this property since it depends only upon local changes in intensity greater than a threshold (i.e., edges), which are largely robust to illumination changes, and upon the relative orientations of the edges in this local neighborhood, which do not change unless the retina is detached.

Fig. 2. Examples of edge direction histograms. (a) The local 22 × 22 pixel region around a landmark point (adjusted for contrast) and the corresponding edge direction histogram (after smoothing). (b)–(d) The same information for landmarks from a different retinal image of the same eye. (b) The point identified as having the most similar edge direction histogram based on (4). (c) A point also identified as having a similar histogram. (d) An example of a landmark with a significantly different edge direction histogram.

This window size (9 × 9) for the dispersion calculation was determined empirically based on a tradeoff between performance and computation. Since only relative values of the dispersion measure between straight sections and branch points are of interest, rather than actual values, the above strategy remains fairly robust to the choice of window size. Large window sizes provide a better sampling of the local regions, and work better when large blood vessels are involved, especially at high scale values. On the other hand, an excessively large window size will encounter the confusing problem of multiple landmarks within a single window. In addition, larger window sizes entail higher computational cost. In any case, the robust point-matching algorithm described in Section II-C is able to survive a small number of missing and/or incorrectly located landmark points. The chosen window size was empirically determined as the smallest size that yielded satisfactory detection performance for the video-resolution images, over the scale changes of interest [12].

To enable rapid image matching, a local edge direction histogram (Fig. 2) is computed at each landmark. Since each vasculature crossing or branch point is unique, edge direction histograms represent a unique “signature” associated with each landmark point that can be used to distinguish different landmark points in a way that is reasonably independent of scale differences (because the pattern of edge directions around a landmark does not change with scale) and slight translation differences (because positional information is ignored in the calculation of the edge direction histogram). Fig. 2 shows examples of edge direction histograms.

C. Fast Algorithm for Matching Sets of Vascular Landmark Points

This section describes an efficient procedure for matching pairs of vasculature landmark point sets that are computed from sets of retinal images. Mathematically, the core computation of interest is the matching of feature-tagged point sets with unknown correspondences, in the presence of a small number of noncorresponding points, to produce an optimal transformation between the two point sets. The nature of retinal video images restricts the possible transformations to two-dimensional (2-D) translations and scale changes. This model holds for the area of the retina that is of most interest, the region around the fovea, where the retina is well approximated by a plane at standard video resolutions. The small errors resulting from this modeling assumption do not affect the subsequent point-matching steps, as discussed later in this section. The rotational movements are known to be very small (rarely approaching and never exceeding 5°). As long as there is no detachment, the retina is known to move “rigidly”; that is, all the points on the retina move together and maintain the same relative location with respect to one another.

On the other hand, changes in the apparent scale of the images cannot be neglected. The scale can change not only when the magnification is adjusted, but also when the patient moves closer to or farther from the camera.

The matching algorithm operates by corresponding pairs of points. First, a number of potential transformations are calculated by hypothesizing correspondences between points in one set and points in the other. Second, the transformations are evaluated by computing a “score” for each transformation that measures how well the transformation corresponds to the image data. Every “plausible” correspondence between a pair of points in the first set and a pair of points in the second set defines a transformation. A “plausible” correspondence is a match that induces a transformation with no significant rotation and has a scale within an acceptable range (a typical extreme range is 70–145%). Given a plausible correspondence between a pair of points $(\mathbf{p}_1, \mathbf{p}_2)$ in the first set and a pair of points $(\mathbf{q}_1, \mathbf{q}_2)$ in the second set, the scale factor $s$ can be computed by the following equation:

$$ s = \frac{\|\mathbf{q}_1 - \mathbf{q}_2\|}{\|\mathbf{p}_1 - \mathbf{p}_2\|} \tag{2} $$

The translation $\mathbf{t}$ can then be computed by

$$ \mathbf{t} = \mathbf{q}_1 - s\,\mathbf{p}_1 \tag{3} $$

If there are $N_1$ points in the first set and $N_2$ points in the second set, then there are on the order of $N_1^2 N_2^2$ pair-to-pair matches. Although relatively few of these matches are plausible, there are still a large number of transformations to evaluate. One way to avoid evaluating so many transformations is to quickly reject bad correspondences. The method adopted in this system is to compare the local edge direction histograms of each landmark. Only those with sufficiently similar (in the sense defined as follows) edge direction histograms are considered for correspondence.
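A minimal sketch of generating one candidate transformation from a hypothesized pair-to-pair correspondence is given below. It assumes a transformation of the form q = s·p + t from the first set to the second, takes the scale as the ratio of the two segment lengths, and anchors the translation on the first point of each pair. The 70–145% scale range comes from the text; the 5° rotation cutoff and the name `transform_from_pairs` are assumptions made for illustration.

```python
import math

def transform_from_pairs(p1, p2, q1, q2,
                         max_rotation_deg=5.0, scale_range=(0.70, 1.45)):
    """Return (s, tx, ty) for the map p -> s*p + t, or None if implausible.

    (p1, p2) is a pair of landmarks from the first set and (q1, q2) the
    hypothesized corresponding pair from the second set.
    """
    dpx, dpy = p2[0] - p1[0], p2[1] - p1[1]
    dqx, dqy = q2[0] - q1[0], q2[1] - q1[1]
    len_p = math.hypot(dpx, dpy)
    len_q = math.hypot(dqx, dqy)
    if len_p == 0.0 or len_q == 0.0:
        return None  # degenerate pair, no transformation defined
    s = len_q / len_p
    if not (scale_range[0] <= s <= scale_range[1]):
        return None  # scale outside the acceptable range
    # Signed angle between the two segments (cross and dot products).
    rotation = math.degrees(math.atan2(dpx * dqy - dpy * dqx,
                                       dpx * dqx + dpy * dqy))
    if abs(rotation) > max_rotation_deg:
        return None  # the model allows no significant rotation
    return s, q1[0] - s * p1[0], q1[1] - s * p1[1]

# A pair 10 pixels apart matched to a pair 12 pixels apart, same orientation.
candidate = transform_from_pairs((0, 0), (10, 0), (5, 5), (17, 5))
```

Rejecting implausible pairs before any scoring keeps the number of transformations that reach the more expensive evaluation step small.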

Comparisons between edge direction histograms are made as follows. First, in order that binning artifacts (i.e., artifacts due to the assigning of a direction to a discrete histogram bin) do not prevent good matches, a Gaussian smoothing is applied to each histogram. The similarity between two histograms is then calculated by a sum-of-squared-difference measure according to the following equation:

$$ S(h_1, h_2) = \sum_{k=1}^{B} \left[ h_1(k) - h_2(k) \right]^2 \tag{4} $$

where $h_1$ and $h_2$ are the two (smoothed) histograms, and $B$ is the number of bins in the histograms. Pairs of landmarks that produce a small measure (near zero) are considered more similar than landmarks giving a larger measure. Empirical investigation has revealed that it is generally sufficient to use only the five most-similar (in the above sense) landmarks for correspondence. Limiting the number of these evaluations dramatically decreases the computation time. By restricting potential correspondences to the five most-similar (as determined by histogram comparison), it is possible (though unlikely) that a correct correspondence is missed. In other words, all five potential correspondences may be wrong. This does not cause a serious problem, since transformations generated from these (incorrect) correspondences will evaluate poorly in a subsequent evaluation step (described below). As long as some correct correspondences exist from which a correct transformation can be generated, it will be selected in the latter step. It must be further noted that the five pairs of points are used to generate several (not one) transformations, each of which is evaluated.
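The histogram comparison can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the histograms are treated as circular (directions wrap around), the Gaussian width `sigma` is an arbitrary illustrative value, and the function names are hypothetical. The similarity is the sum of squared bin differences, and the five candidates with the smallest measure are kept.

```python
import numpy as np

def smooth_histogram(hist, sigma=1.0):
    """Circular Gaussian smoothing of a direction histogram (sum preserved)."""
    hist = np.asarray(hist, dtype=float)
    n = hist.size
    offsets = np.arange(n)
    offsets = np.minimum(offsets, n - offsets)   # circular distance to bin 0
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Circular convolution computed through the FFT.
    return np.real(np.fft.ifft(np.fft.fft(hist) * np.fft.fft(kernel)))

def histogram_distance(h1, h2):
    """Sum of squared differences between two histograms of equal length."""
    return float(np.sum((np.asarray(h1, float) - np.asarray(h2, float)) ** 2))

def most_similar(query_hist, candidate_hists, k=5, sigma=1.0):
    """Indices of the k candidates whose smoothed histograms best match."""
    q = smooth_histogram(query_hist, sigma)
    dists = [histogram_distance(q, smooth_histogram(c, sigma))
             for c in candidate_hists]
    return [int(i) for i in np.argsort(dists)[:k]]

def one_hot(index, bins=16):
    h = np.zeros(bins)
    h[index] = 1.0
    return h

# The query peaks in bin 5; candidates peak in bins 0, 5, and 6.
ranking = most_similar(one_hot(5), [one_hot(0), one_hot(5), one_hot(6)], k=2)
```

In this toy case the candidate peaking in the same bin ranks first and the candidate one bin away ranks second, which is the tolerance to binning artifacts that the smoothing is meant to provide.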

Further, since many of the computed transformations will be similar or, ideally, the same, it would be wasteful to evaluate every transformation. This duplication is avoided by maintaining a list of transformation “clusters,” following the work of Stockman and Esteva [32], [33]. Each transformation cluster represents a set of plausible matches that induce similar transformations. The algorithm for generating clusters operates as follows. For each plausible match, the transformation vector $\mathbf{T}$ is computed. The vector $\mathbf{T}$ is then compared to each cluster. If the distance (in transformation space) between any cluster center and $\mathbf{T}$ is less than a certain threshold value $\tau$, the transformation is added to the cluster, the count of transformation points in the cluster is incremented by one, and the cluster centroid is re-computed. Otherwise, a new cluster is defined consisting of one transformation at $\mathbf{T}$. By assigning transformations to clusters, the computationally expensive evaluation of the transformation need not be performed for each match. Because a correct transformation should map a large number of landmark points, and hence pairs of points, correctly, it is intuitively reasonable that the optimal transformation cluster will have a large number of transformation points associated with it. Therefore, not all clusters need to be evaluated. It is only necessary to evaluate a certain number of the clusters with the most transformation points.
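The clustering step described above can be sketched as a simple incremental procedure. The sketch makes two assumptions that the text does not specify: distance in transformation space is taken as plain Euclidean distance over (s, tx, ty), and when several clusters lie within the threshold the nearest one is chosen. The function name and the dictionary layout are illustrative.

```python
import math

def cluster_transformations(transforms, threshold):
    """Greedy clustering of transformation vectors such as (s, tx, ty).

    Returns a list of clusters sorted by descending count; each cluster is a
    dict with a running centroid ('center') and a member count ('count').
    """
    clusters = []
    for t in transforms:
        best, best_dist = None, threshold
        for c in clusters:
            d = math.dist(t, c["center"])
            if d < best_dist:            # strictly inside the threshold
                best, best_dist = c, d
        if best is None:
            # No cluster is close enough: start a new one at this vector.
            clusters.append({"center": tuple(float(v) for v in t), "count": 1})
        else:
            n = best["count"]
            # Update the centroid as a running mean, then bump the count.
            best["center"] = tuple((n * ci + ti) / (n + 1)
                                   for ci, ti in zip(best["center"], t))
            best["count"] = n + 1
    clusters.sort(key=lambda c: c["count"], reverse=True)
    return clusters

clusters = cluster_transformations(
    [(1.0, 10.0, 10.0), (1.0, 10.5, 9.5), (1.2, 50.0, 50.0), (1.0, 9.5, 10.5)],
    threshold=2.0,
)
```

Only the few clusters at the front of the returned list would then be passed to the evaluation step, since the correct transformation is expected to collect the most votes.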

To evaluate a transformation (for determining the optimal transformation), the landmark points of the first set are transformed and compared to the landmark points in the second set. Define a mapping function $m$ as follows:

$$ m(\mathbf{p}, T, Q) = \arg\min_{\mathbf{q} \in Q} \left\| T(\mathbf{p}) - \mathbf{q} \right\| \tag{5} $$

where $\mathbf{p}$ is a landmark point in the first set, $T$ is the spatial transformation, and $Q$ is the second set of landmark points. This function computes the closest landmark in the second set to the transformed point $T(\mathbf{p})$. Next, define the following set of points:

$$ A = \left\{ \mathbf{p} \in P : \left\| T(\mathbf{p}) - m(\mathbf{p}, T, Q) \right\| < d_{\max} \right\} \tag{6} $$

where $P$ is the first set of the landmark points and $d_{\max}$ is the maximum acceptable distance between a transformed point in the first set and the nearest point in the second set. The set $A$ contains all points in $P$ that, when transformed by $T$, are less than distance $d_{\max}$ from a point in the second set. Then we can define an evaluation function as follows:

(7)

This evaluation function returns high values for points in the first set that are mapped close to points in the second set.

Because the function does not require exact matches between points, but degrades gracefully, allowing small differences in the locations of points, the system will allow some error in the landmark point identification and will also allow slight distortions of the type incurred by using 2-D rather than three-dimensional (3-D) perspective transformations. Practical operation of this algorithm requires this type of robustness.

One major time-consuming operation in the evaluation of transformations is the determination of the closest point in one set to a given transformed point in the other set (within a maximum acceptable distance). A naive way of doing this would be to compare the transformed point to every point in the other set to find the minimum distance between them. A method to dramatically cut down the amount of computation is to use a “hash table.” For this, define a grid over the location space. The spacing between the grid lines is defined as $2d_{\max}$, or twice the maximum acceptable distance between a transformed point in one set and the closest point in the other set. The grid divides the entire space into “boxes,” so that every point in the location space has a box that covers that point. The hash table consists of a 2-D set of such boxes, each of which contains a list of the points that lie within that box. Now, instead of examining every point in the set to determine the closest point to a transformed point, it is only necessary to examine the points in the lists associated with the four neighboring boxes to the transformed point. For a typical red-free retinal video frame, a computation reduction factor of at least ten was obtained experimentally.
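The grid lookup described above can be sketched as follows. Because the box size is twice the maximum acceptable distance, any point within that distance of a query must lie in the query's own box or in the neighbor on the nearer side in each axis, which gives four boxes in total. The class name `PointGrid` and the dictionary-based storage are illustrative assumptions, not the original data structure.

```python
import math
from collections import defaultdict

class PointGrid:
    """Uniform grid ("hash table") over 2-D points with box size 2 * d_max."""

    def __init__(self, points, d_max):
        self.d_max = d_max
        self.cell = 2.0 * d_max
        self.boxes = defaultdict(list)
        for p in points:
            self.boxes[self._key(p)].append(p)

    def _key(self, p):
        return (math.floor(p[0] / self.cell), math.floor(p[1] / self.cell))

    def nearest_within(self, p):
        """Closest stored point strictly within d_max of p, or None."""
        ix, iy = self._key(p)
        # Fractional position inside the box decides which neighbor to check.
        fx = p[0] / self.cell - ix
        fy = p[1] / self.cell - iy
        nx = ix + (1 if fx >= 0.5 else -1)
        ny = iy + (1 if fy >= 0.5 else -1)
        best, best_d = None, self.d_max
        for key in ((ix, iy), (nx, iy), (ix, ny), (nx, ny)):
            for q in self.boxes.get(key, ()):
                d = math.dist(p, q)
                if d < best_d:
                    best, best_d = q, d
        return best

grid = PointGrid([(10.0, 10.0), (13.0, 10.0), (40.0, 40.0)], d_max=3.0)
hit = grid.nearest_within((11.0, 10.5))    # (10.0, 10.0) is the closest point
miss = grid.nearest_within((25.0, 25.0))   # nothing within 3 pixels
```

Each query touches only the points stored in four boxes instead of the whole set, which is the source of the reduction in computation that the text reports.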

The above improvements to the point-matching algorithm result in overall speed improvement of approximately 180–200 times. The resulting computation times, of the order of 1 s or less on a Silicon Graphics computer, are acceptable for montage synthesis. However, they are still not sufficient for real-time location determination and tracking.

D. Algorithms for Validation and Improvement of Transformations

Two issues of concern with the transformations provided by the point-matching algorithms are that 1) the transformation may not be close to correct at all (i.e., the point matching has failed), or 2) the transformation, while close to the correct transformation, may be slightly inaccurate. In the first case, it is necessary to detect this failure, while in the second case it is desirable to correct the inaccuracy by a refinement operation.

To determine the success or failure of the above matching algorithm, a sequential similarity detector (SSD) was used [34], [35]. Given a transformed image $I_1$ that is to be compared with an image $I_2$, the SSD algorithm computes the measure

$$ \epsilon = \sum_{(x,y) \in W} \left| \frac{I_1(x,y) - \mu_1}{\sigma_1} - \frac{I_2(x,y) - \mu_2}{\sigma_2} \right| \tag{8} $$

where $W$ is a window over which the measure is to be computed, $\mu_1$ and $\mu_2$ are the average intensities of image $I_1$ and image $I_2$, respectively, in this window, and $\sigma_1$ and $\sigma_2$ are the estimated standard deviations within the window. The lower the value of $\epsilon$, the better the match. Subtracting the mean and normalizing by the standard deviation of the windows means that even images with significantly different illumination can be matched successfully, an important consideration for this application.

This measure works best when the window contains a region of interest, such as vasculature branch points [36]. Therefore, for this algorithm, the measure in (8) is computed for a small window around each of several landmark points. The lowest value returned by the similarity measure for all the locations tested is considered to be the correct match of the landmark from the retinal map to a point in the new image. This process is repeated for a number of landmark points. Experiments indicate that five is a sufficient number of points. If the measure exceeds a certain (experimentally determined) threshold for a given window, then the match is determined to be a failure. Because effects such as glare can cause a failed match, a single failed match does not automatically cause the transformation to be rejected. The entire transformation is rejected if fewer than four of the five points match successfully.
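The validation logic can be sketched in a few lines. The sketch assumes the window measure is a sum of absolute differences between mean-subtracted, standard-deviation-normalized intensities; the exact form used in the original system may differ. The function names, the handling of flat windows, and the threshold value in the example are hypothetical. The four-of-five acceptance rule follows the text.

```python
import numpy as np

def normalized_window_measure(win_a, win_b):
    """Dissimilarity of two equally sized windows (lower is better).

    ASSUMPTION: sum of absolute differences after each window has its mean
    subtracted and is divided by its standard deviation.
    """
    a = np.asarray(win_a, dtype=float)
    b = np.asarray(win_b, dtype=float)
    sa, sb = a.std(), b.std()
    if sa == 0.0 or sb == 0.0:
        return float("inf")  # a flat window carries no structure to match
    return float(np.abs((a - a.mean()) / sa - (b - b.mean()) / sb).sum())

def accept_transformation(measures, threshold, min_matches=4):
    """Accept when at least `min_matches` landmark windows match."""
    matched = sum(1 for m in measures if m <= threshold)
    return matched >= min_matches

window = np.arange(25, dtype=float).reshape(5, 5)
# Same structure under a gain and offset change in illumination.
brighter = 3.0 * window + 10.0
same_scene = normalized_window_measure(window, brighter)

ok = accept_transformation([0.1, 0.2, 5.0, 0.1, 0.3], threshold=1.0)
rejected = accept_transformation([0.1, 5.0, 5.0, 0.1, 0.3], threshold=1.0)
```

The first comparison illustrates why the normalization matters: a pure gain and offset change leaves the measure at essentially zero, so a dimmer or brighter frame of the same region still matches.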

If the transformation is not rejected, the results of the SSD matches can be used to refine the transformation. Each of the successful matches maps a point in the retinal map to the new image. In general, no one transformation will allow all the points to be mapped exactly. However, the optimal transformation, in the least-squares sense, can be computed.

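Suppose that a set of points $(x_{2,i}, y_{2,i})$ in image 2 (the new image) map to a set of points $(x_{1,i}, y_{1,i})$ in image 1 (the retinal map). The least-squares estimate of the optimal scale parameter $s$ in the new transformation is given by

$$ s = \frac{\left( \overline{x_1 x_2} - \bar{x}_1 \bar{x}_2 \right) + \left( \overline{y_1 y_2} - \bar{y}_1 \bar{y}_2 \right)}{\left( \overline{x_2^2} - \bar{x}_2^{\,2} \right) + \left( \overline{y_2^2} - \bar{y}_2^{\,2} \right)} \tag{9} $$

where $\bar{x}_1$ denotes the average of the $x$ coordinates in image 1 (and similarly for $\bar{x}_2$, $\bar{y}_1$, and $\bar{y}_2$), $\overline{x_1 x_2}$ denotes the average of multiplying corresponding coordinates from the two images (i.e., the average of $x_{1,i} x_{2,i}$), and $\overline{x_2^2}$ represents the average of the square of the coordinates of image 2. The corresponding $x$ and $y$ translation values are given by

$$ t_x = \bar{x}_1 - s\,\bar{x}_2, \qquad t_y = \bar{y}_1 - s\,\bar{y}_2 \tag{10} $$

The net result of applying these procedures is an improved and validated transformation. The detailed derivation of the above equations is provided in the Appendix.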
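The least-squares refinement can be sketched with NumPy. The sketch assumes the transformation maps image 2 coordinates to image 1 coordinates as (x, y) → (s·x + tx, s·y + ty) and minimizes the summed squared residual over the matched points; under that assumption the scale is a ratio of summed covariances to summed variances and the translation follows from the means. The function name is illustrative.

```python
import numpy as np

def refine_transformation(points_2, points_1):
    """Least-squares scale and translation mapping image 2 onto image 1.

    points_2, points_1: arrays of shape (n, 2) with matched (x, y) rows.
    Returns (s, tx, ty). Assumes the points in image 2 are not all identical.
    """
    p2 = np.asarray(points_2, dtype=float)
    p1 = np.asarray(points_1, dtype=float)
    m2, m1 = p2.mean(axis=0), p1.mean(axis=0)
    # mean(x1*x2) - mean(x1)*mean(x2), summed over the x and y coordinates.
    cov = ((p1 - m1) * (p2 - m2)).mean(axis=0).sum()
    # mean(x2^2) - mean(x2)^2, summed over the x and y coordinates.
    var = ((p2 - m2) ** 2).mean(axis=0).sum()
    s = cov / var
    t = m1 - s * m2
    return float(s), float(t[0]), float(t[1])

rng = np.random.default_rng(0)
new_image_points = rng.uniform(0.0, 100.0, size=(5, 2))
map_points = 1.1 * new_image_points + np.array([3.0, -2.0])
s, tx, ty = refine_transformation(new_image_points, map_points)
```

For noise-free matches generated with a scale of 1.1 and a translation of (3, −2), the estimate recovers those parameters up to floating-point error; with noisy matches it returns the best fit in the least-squares sense.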

E. Algorithm for Wide-Area Montage and Map Synthesis

The wide-area map is first initialized to the first acceptable image frame. Subsequent image frames serve to either reinforce or augment the wide-area map. Specifically, the algorithm computes landmark points for each image frame, and uses the matching algorithm to compute a transformation to the current wide-area map. If the transformation is sufficiently reliable, as defined in the previous section, then the extent of the overlap between the stored wide-area map and the new image frame is determined. The portion of the new frame that is not represented in the current wide-area map is now inserted into the wide-area map. Also, the portion of the wide-area map that overlaps with the new image frame is updated to reflect the new level of confidence in the landmark points.

In particular, associated with every landmark point in the wide-area map is a count of the number of times it coincided with a landmark in a new frame (coincidence counts), and a count of the number of times a new image reliably overlapped with the relevant spatial region (observation counts). A “confidence value” is computed by dividing the number of coincidence counts by the number of observation counts. Points with higher confidence values are considered more reliable. A threshold can be set for rejecting landmarks with insufficient confidence values. As more overlapping images are added to the map, it is expected that a set of very reliable landmark points will be obtained. Once the wide-area map has been computed, it is straightforward to transform and merge the actual gray level image frames to construct a montage corresponding to the wide-area map.
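The confidence bookkeeping can be sketched with a small record per map landmark. The class name `MapLandmark`, its fields, and the choice to report zero confidence before any observation are assumptions made for illustration; the ratio of coincidence counts to observation counts follows the text.

```python
from dataclasses import dataclass

@dataclass
class MapLandmark:
    """One landmark in the wide-area map with its reliability counters."""
    x: float
    y: float
    coincidences: int = 0   # times a new-frame landmark coincided with it
    observations: int = 0   # times a reliable frame covered its location

    def update(self, coincided):
        """Record one reliable overlapping frame."""
        self.observations += 1
        if coincided:
            self.coincidences += 1

    @property
    def confidence(self):
        # ASSUMPTION: a landmark never observed gets confidence 0.
        if self.observations == 0:
            return 0.0
        return self.coincidences / self.observations

def reliable_landmarks(landmarks, min_confidence):
    """Keep only the landmarks whose confidence reaches the threshold."""
    return [lm for lm in landmarks if lm.confidence >= min_confidence]

stable = MapLandmark(120.0, 85.0)
for seen in (True, True, False, True):
    stable.update(seen)
spurious = MapLandmark(40.0, 200.0)
for seen in (False, False, True, False):
    spurious.update(seen)

kept = reliable_landmarks([stable, spurious], min_confidence=0.5)
```

Here the first landmark ends with a confidence of 3/4 and is kept, while the second ends with 1/4 and is dropped by the threshold.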

F. Real-Time Algorithm for Location Determination and Tracking

During laser retinal surgery, it is important that the location of each incoming video frame be determined relative to the wide-area retinal map in real time (in 33 ms/frame or less). Unfortunately, the point-matching algorithm presented above, while fast, could not be computed in the above time frame.

Barrett et al. [16] and Markow et al. [17] have described a novel method for tracking the positions of retinal images in real time for small movements and with a fixed magnification. Their method is based on defining a small set (typically 5–10) of local correlation templates that are sensitive to the position of blood vessels in either the horizontal or vertical direction. Usually, these templates are defined on locally horizontal or vertical segments of a few prominent vessels in the retinal images. Each template consists of four pixels: two adjacent pixels straddling one boundary of a retinal blood vessel and two pixels straddling the other boundary. At each boundary, one pixel lies outside the hypothesized blood vessel and the other lies inside it, and the intensity difference across the boundary is taken between these two pixels. The response of the template is defined as the sum of these differences divided by the average intensity of these four points, as given by the following equation:

(11)

For each incoming video frame, a 28 × 28 pixel region centered about the location of the template in the last frame is defined. The template is hypothesized to be at every position in this 28 × 28 region, and the response computed. The location with the maximum response is considered to be the location of the template in the new frame. In practice, several horizontal and vertical templates are defined to cover blood vessels at various positions on a retinal map. They are maintained at fixed locations relative to one another as possible positions for the combined template are hypothesized. At each hypothesized position for the combined template, the responses of the one-dimensional (1-D) templates are summed to form a total response for the 2-D template. The response of the 2-D template is maximum when all 1-D templates are aligned over blood vessels. Because these calculations can be performed well within the time between frames (33 ms), the retina can be tracked in real time.
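The following sketch illustrates the four-pixel template response and the exhaustive search over a 28 × 28 neighborhood. Several details are assumptions made for the example rather than taken from [16] or [17]: the pixel layout (outside, inside, inside, outside along one row or column, with a hypothetical `width` parameter), the sign convention (outside minus inside, which suits dark vessels on a brighter background), and the names `template_response` and `track`.

```python
def template_response(img, x, y, width, vertical_vessel=True):
    """Response of one four-pixel template anchored at (x, y).

    img is a row-major list of rows. For a vertical vessel the four
    pixels lie on one row; for a horizontal vessel, on one column.
    """
    if vertical_vessel:
        o1, i1 = img[y][x], img[y][x + 1]
        i2, o2 = img[y][x + width], img[y][x + width + 1]
    else:
        o1, i1 = img[y][x], img[y + 1][x]
        i2, o2 = img[y + width][x], img[y + width + 1][x]
    mean = (o1 + i1 + i2 + o2) / 4.0
    if mean == 0:
        return 0.0
    # Sum of the two boundary differences over the average intensity.
    return ((o1 - i1) + (o2 - i2)) / mean


def track(img, templates, last_dx, last_dy, half=14):
    """Search a 28 x 28 window of offsets around the last known offset.

    templates: list of (x, y, width, vertical_vessel) in reference
    coordinates; they keep fixed positions relative to one another.
    Returns (best_total_response, dx, dy).
    """
    rows, cols = len(img), len(img[0])
    best = (float("-inf"), last_dx, last_dy)
    for dy in range(last_dy - half, last_dy + half):
        for dx in range(last_dx - half, last_dx + half):
            total = 0.0
            for x, y, width, vertical in templates:
                px, py = x + dx, y + dy
                far_x = px + (width + 1 if vertical else 0)
                far_y = py + (0 if vertical else width + 1)
                if px < 0 or py < 0 or far_x >= cols or far_y >= rows:
                    break  # combined template leaves the image here
                total += template_response(img, px, py, width, vertical)
            else:
                if total > best[0]:
                    best = (total, dx, dy)
    return best
```

On a synthetic image with one dark vertical band and one dark horizontal band, the summed response peaks at the single offset where both templates straddle their vessels.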

To define the templates for a particular retinal map, each possible template location within a region is examined and the location with the highest response is selected. This process of defining templates can take longer than the time between frames. However, this does not prevent the algorithm from tracking in real time. Barrett et al. [16] describe obtaining a set of templates once before tracking begins.

While very fast, the template-based method has some disadvantages. There are a number of conditions that can cause this method to fail, and there is no way of detecting or correcting these failures within the template approach. It is desirable to design a method for location determination that combines the speed advantages of the template-based method with the advantages of a point-matching-based method, such as flexibility, robustness, and verifiability. The main difficulty with this is that the advantages from the point-matching method come at a cost of significant computation time. It is not obvious how the advantages of point matching can be utilized while still maintaining real-time operation.

The key to solving this dilemma comes from noting that large, sudden saccadic or fixation movements happen much less frequently than the small and constant microsaccades. The other insight needed for the design of this algorithm is that real-time location determination performance is only necessary while the surgical laser is active. In other words, if sudden large movements of the eye, eye blinks, glare, or scale changes can be detected within the time between image frames, then the laser can automatically be disabled (shuttered or deflected to a beam dump). A slower point-matching-based algorithm can then obtain an accurate, verifiable fix on the current retinal position, allowing the tracking to be re-initiated.

Fig. 3. A montage of nine red-free retinal images from the Live Video data set. These images have been transformed onto a single coordinate system, aligned, and combined into a single montage. This montage does not have the image warping refinements as in the work of Mahurkar et al. [19]. Its purpose is to provide a coordinate reference frame, as defined by the landmarks for location determination (see Fig. 5 for an example).

What becomes necessary, therefore, is a way to quickly detect whether the template-based method has failed. A two-part method has been found to be useful for this purpose. First, the template response is compared to a minimum “confidence threshold.” If the response of the template is below this threshold, this is an indication that the algorithm may be giving an incorrect estimate of the retinal location. Second, if the template-based method indicates a sudden movement between frames larger than normally expected, this may be an indication of an incorrect location, or at least that a verifiable estimate of the retinal location should be obtained. In either case, the laser is disabled while the point-matching algorithm obtains a new location estimate.

Based on the considerations listed above, the combined location determination algorithm operates as follows. It is assumed that after diagnosis and analysis, the physician has obtained a wide-area retinal map and has outlined an area on the map that is designated for treatment. During laser application, the instrument obtains a video image frame of the patient’s retina. The point-matching algorithm matches the frame with the retinal map. The transformation that results from this matching operation provides an estimate of the spatial location of the current image frame relative to the wide-area map. Then the algorithm automatically obtains a set of 1-D templates for the template-based tracking method. The template-based method tracks the retina based on the position fix determined by the point-matching algorithm. If the template response is above the confidence threshold, and the change in the retinal location between frames is within a preset limit, then the laser is enabled. As long as the confidence tests are passed, the algorithm proceeds by grabbing new retinal images at video rates and using the template method to obtain new estimates of the retinal location. Otherwise, the laser is disabled and a new determination of the retinal location is obtained by matching the current image frame with the retinal map. A visible tracking (marker) laser spot is detected in each incoming image frame, and the location of this spot can be determined on the retinal map. The tracking laser spot marks the precise location where the surgical laser will hit when fired. As a result, the computer can determine the area of the retina on the retinal map that will be cauterized when the surgical laser is fired. The resulting algorithms provide a unified method for tracking the retina and controlling the surgical laser. From a physician’s standpoint, the system remains simple; whenever tracking is lost, the system automatically shuts off the laser, overriding the physician’s request to turn on the laser. A flowchart summarizing the above procedure is shown in Fig. 4.
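The control logic described above can be summarized in a short sketch. The generator `laser_control` and its callbacks `point_match`, `make_templates`, and `track` are hypothetical stand-ins for the point-matching, template-definition, and template-tracking steps; the threshold values are placeholders rather than values from the paper.

```python
def laser_control(frames, point_match, make_templates, track,
                  min_response, max_jump):
    """Yield (location, laser_enabled) for every incoming frame.

    point_match(frame) -> (x, y) or None if the frame is unusable.
    make_templates(frame, location) -> template set.
    track(frame, templates, last_location) -> (response, (x, y)).
    All three callbacks are assumed interfaces for this sketch.
    """
    templates = None
    location = None
    for frame in frames:
        if templates is not None:
            # Fast path: template tracking plus the two-part failure test.
            response, new_loc = track(frame, templates, location)
            jump = max(abs(new_loc[0] - location[0]),
                       abs(new_loc[1] - location[1]))
            if response >= min_response and jump <= max_jump:
                location = new_loc
                yield location, True
                continue
            templates = None  # tracking lost
        # Slow path: laser stays disabled while a position fix is obtained.
        location = point_match(frame)
        if location is not None:
            templates = make_templates(frame, location)
        yield location, False
```

In this arrangement the laser is only ever enabled on a frame that has passed both the response test and the movement test, which mirrors the requirement that tracking loss overrides the physician's request to fire.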

III. EXPERIMENTAL RESULTS

A. Detection of Retinal Vasculature Landmarks

Fig. 1 illustrates the intermediate steps in the detection of retinal vasculature landmarks. Fig. 1(a) shows a red-free video image of a healthy retina. Fig. 1(b) shows the result of minimum filtering. This operation smooths out impulsive image noise, and has the effect of widening the vasculature. The latter effect is needed to prevent merging of edges during the thinning operation that follows. Fig. 1(c) shows the effect of Sobel edge detection. An important practical consequence of this step is that subsequent computations are hastened by restricting them to the detected edge regions. Fig. 1(d) illustrates the effect of single-pass thinning of the edge detection output. This operation greatly reduces the amount of data that must be processed in the subsequent step, typically by a factor of eight. Fig. 1(e) shows the effect of computing the direction variance at each thinned-edge pixel. In this image, the darker pixels are close to one, whereas the lighter pixels are close to zero. Thresholding this image, followed by local maximum suppression, yields a set of landmark points that are displayed in Fig. 1(f) as white dots superimposed on the original image in Fig. 1(a).

Fig. 4. Flowchart outlining the algorithm for real-time retinal location determination and laser control. Bold arrows represent actions that must be performed in real time.

B. Construction of Wide-Area Retinal Map and Montage

A number of wide-area montages consisting of several retinal images were constructed. The result of montaging nine red-free retinal images is shown in Fig. 3. It took 7.2 s on a 150-MHz Silicon Graphics Indy computer to compute the transformations for the images in this montage. The algorithms for generating the wide-area map and montage successfully handled differences in position and scale, as well as differences in illumination and general image quality. The wide-area maps form the basis of retinal location determination and tracking.

C. Location Determination and Tracking

During laser surgery, each incoming image frame must be located with respect to the wide-area retinal map in order to determine the current location of the laser. Fig. 5 shows an example of an image that was located on the wide-area retinal map. The white circular outline on Fig. 5(b) represents the location of the outline of the frame shown in Fig. 5(a), as automatically identified by the algorithm.

The image in Fig. 6 is the first image of a 6-s (180-frame) live red-free video sequence. In capturing this sequence, the subject was asked to move his eye in order to create a deliberately difficult sequence to track. In actual retinal surgery, attempts would be made to minimize eye movements.

The location determination and tracking algorithms were used to track this entire sequence off-line. Each image in the sequence was processed as if it were captured live from the proposed instrument during surgery. When re-establishing a position fix, the sequence was paused; the next image processed was the next image in the sequence. Some sample results of this tracking are shown in Fig. 6. In this figure, an arbitrary point was selected on the first image of the sequence. The tracking algorithm was used to record the position of this point on every other image of the sequence based on the calculated location of that frame on the retinal map. A white crosshair is shown in Fig. 6 at this calculated point on six frames from this sequence. The algorithm required 19 position fixes in processing this 180-frame 6-s sequence, including the initial fix. Ten frames were determined by the algorithm to be unusable for location determination based on the criteria described earlier. In the proposed instrument, the operating laser would be disabled while obtaining a position fix or after detecting an unusable frame.

Fig. 5. An example of location determination. (a) The first frame of a red-free video sequence from the Live Video data set. (b) A montage constructed from a sequence of 24 successive image frames. The white outline in (b) represents the outline of (a) as automatically determined by the location determination algorithm. The superimposed white dots indicate the detected landmark points.

The accuracy of the above algorithms was verified manually. Each of the 180 images in the Live Video data set was analyzed manually: a specific landmark point was located in each image of the sequence using a computer mouse. Then, the tracking algorithm was used to predict the location of this selected landmark point for each frame. The spatial distance between the algorithm-predicted point and the manually identified point was used as an error measure. Fig. 7 shows the average error for all manually identified points. The average error over all points was computed to be 1.9 pixels.
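A minimal sketch of such an error measure is given below, assuming the error is the Euclidean distance in pixels and that frames rejected by the tracker are simply left out of the average. The function name `mean_tracking_error` and the use of `None` to mark rejected frames are assumptions made for this example.

```python
import math


def mean_tracking_error(predicted, manual):
    """Mean Euclidean distance (pixels) between predicted and manual points.

    predicted: list of (x, y) or None (None marks a frame the tracker
               rejected as unusable; an assumed convention).
    manual:    list of (x, y) manually identified positions.
    """
    distances = [
        math.hypot(p[0] - m[0], p[1] - m[1])
        for p, m in zip(predicted, manual)
        if p is not None
    ]
    if not distances:
        raise ValueError("no usable frames to compare")
    return sum(distances) / len(distances)
```

With this convention, a frame that was flagged as unusable contributes nothing to the average, which matches the way rejected frames are excluded from the manual versus automatic comparison.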

By way of comparison, cross validation was performed on the manually selected points to give a measure of consistency. This cross validation was performed by calculating the least-squares transformation as in (9)–(10) for each frame based on the manually selected points. The average error between each manually selected point and the position predicted by the least-squares transformation was computed. Fig. 8 shows this cross-validation measure. Note the very large error of over 12 pixels at frame 158. This is a frame with considerable motion blur. There was no way of obtaining reliable landmarks for this image. The large error for this frame in the manually selected points is not reflected in the manual versus automatic comparison because the automatic location determination method identified this frame as being unsuitable for processing. Fig. 8(b) shows the same data without frame 158. The average error (inconsistency) was computed to be 1.35 pixels.

Fig. 6. Six representative red-free image frames from a 6-s video sequence demonstrating the results of automatic tracking. An arbitrary point was selected on the first image of the sequence. The tracking algorithm was used to record the position of this point on every other image of the sequence as a white crosshair.

Fig. 7. Graph showing the average discrepancy between manual and automatic location determination computed over all manually mapped points. The dotted vertical lines indicate times when the template method failed, requiring the point-matching-based algorithm to be invoked to derive a position fix.

A higher accuracy may be achieved at a correspondingly higher computational cost. For example, increasing the window size of the SSD computations for transformation validation and improvement can give more accurate results. Fig. 9 shows the effect of increasing the window size on average error. Error bars for each point show the standard deviation of the error. A horizontal line indicates the inconsistency of the human data as discussed in the previous paragraph. Other tradeoffs of speed for accuracy can be made, including increasing the window size during landmark detection, increasing the range over which the SSD is performed, and upgrading the transformation models.

Fig. 8. (a) Cross validation data showing the internal consistency of the manually selected points. The average error over all points was computed based on a least-squares computation of the transformation for each frame. A large error exists for frame 158 due to extreme motion blur in that frame. (b) The same data, excluding frame 158, plotted on an expanded scale.

IV. DISCUSSION AND CONCLUSIONS

The algorithms presented here are being used to construct a computer-assisted instrument for laser retinal surgery. Recently, Welch [15]–[17] has sketched out a concept for a future computer-controlled retinal surgery system that includes retinal tracking and automatic laser beam movement, similar to our work. Also known is the largely unpublished and proprietary work of Dr. S. Charles (Charles Retina Institute, Memphis, TN). It is expected that our instrument will reduce the failure rate of laser retinal surgery by enabling accurate and quantitatively monitored delivery of the laser energy to the region of interest with the minimum-possible incidental retinal damage or untreated area. For example, it will be possible to shut off the operating laser when the laser is aimed at an unplanned region, or when the optical dosage to a particular region exceeds a threshold. The problem of incidental damage is especially important when operating near critical regions such as the fovea. The progress of the laser treatment will be displayed on a heads-up display as it occurs.

Fig. 9. Graph showing decreasing error with increasing amounts of computation. The window size of the SSD calculations for transformation validation and improvement is graphed against the average error. Error bars show the standard deviation for each window size. A horizontal line indicates the inconsistency of the manually selected points as computed by cross-validation. The average computation times corresponding to the vertical bars were 777, 851, 875, 1626, and 2322 ms, respectively.

Currently, the physician is required to alternately view and memorize a portion of a monochrome image on which the treatment region is mapped, then visually identify the corresponding region on the patient’s retina (in color) through a slit-lamp microscope using vasculature landmarks, and manually direct the laser using a foot pedal to switch the laser on and off. This task can only be performed approximately, leading to the high treatment failure rate. Ideally, the physician should be able to perform a comprehensive imaging of the retina, view it on a large computer screen, use a pointing device to outline the areas requiring treatment, specify the optical dosages to each such area, make annotations, and perform a simulated treatment before proceeding with the full treatment. The retinal map that is generated above, as well as the marked treatment areas, should be displayed to the physician in real time to help guide the treatment. Ideally too, previously acquired aligned retinal maps should be accessible during the post-operative phase to monitor the progression of a treatment. In this context, note that the same algorithms can also be used to perform post-laser follow-up measurement since the mapping data can be easily stored on the computer and retrieved later.

A considerable factor causing misdirection of the laser is related to eye movements (normal saccadic scanning, those resulting from distractions, as well as the avoidance movements that are reflex responses to irritating light or the discomfort induced by the laser, involuntary movement to fixate the fovea on the laser spot, etc.). The closer the laser aiming beam is brought to the fovea, the harder it is for the patient to resist the desire to direct the fovea to the beam. In macular degeneration the patient’s vision is often so poor that this is not a problem, but the physician must be prepared for it. The physician attempts to override this drive by giving a point-source target to the other fovea. This works fairly well if the uninvolved eye has better central vision than the involved eye, which is often not the case.

The algorithms presented here demonstrated the ability to track image frames or to determine that the laser must be disabled within the time required for real-time operation. All timings shown here are on a 150-MHz Silicon Graphics Indy computer. An average of 3.7 ms was required to track or make the determination required to disable the laser (due to a new position fix being required) using the template method. This is nearly an order of magnitude faster than required for real-time operation. With a window size of 16 pixels for the SSD computations, it took an average of 875 ms to obtain a position fix, of which 51 ms was required to define a new set of templates, 400 ms for detecting landmarks, 353 ms for validation, and 105 ms for the point matching. The time for point matching is remarkably low considering the combinatorial nature of the problem [37], [38], indicating the success of the computation reduction techniques adopted here. Collectively, these techniques brought this timing down from about an hour to 150 ms. These times are to be interpreted as being representative, and they depend upon the image and on parameter selection (see Fig. 9). It is clear from these performance numbers that the bulk of the time needed to obtain a position fix is used for pixel-intensive tasks. This suggests the use of a parallel pixel processor, such as the Texas Instruments TMS320C80 chip, for accelerating these operations. This device consists of four long instruction word signal processors and a standard RISC CPU with floating-point arithmetic, all integrated on a single die.

A particular capability of the algorithms relates to the handling of the high variability in the image data, in particular bad frames, in a consistent manner, without the use of computationally expensive adaptive image-analysis algorithms. By combining the template method with the fast point-matching-based method, a consistent real-time control of the laser delivery system becomes possible. Of particular importance is the ability to quickly (in 3.7 ms after frame capture) shut off the operating laser in the event of a tracking loss, or when a poor-quality image is acquired, and the ability to automatically re-initiate the tracking when acceptable image frames become available. This greatly improves upon the work of Barrett et al. [16] and Markow et al. [17]. The problem of locating retinal landmarks is another example that required a novel solution. The direct approach to this problem [22] involves adaptive image segmentation and matched filtering to detect the retinal vasculature, followed by skeletonization and branch point detection. Such an approach, again, would be computationally infeasible for the proposed instrument. The approach described in this paper is “less accurate” in the specific sense that the detected landmarks may not correspond to vasculature branching and crossover points, yet they are sufficient for the specific purpose of location determination and montaging, while being orders of magnitude faster to compute.

An interesting issue that arises is the method to ensure that an applied laser does indeed impact a point indicated by the algorithms relative to the montage. In this context, it must be noted that the operating laser has an additional collinear visible-light low-power beam known as the aiming beam. The imaging system observes the spot created by this beam (the aiming spot) through the same optical system, with the same distortions. This fact eliminates the need for a reverse mapping mechanism from the retinal map to the 3-D point on the retina.

The accuracy of the location determination algorithm was demonstrated to be within 1.9 pixels of the manually acquired location data. It was also shown that the manual data, despite extreme care and effort, had an inherent error of 1.35 pixels on average. In this context, it is useful to note that the need for accuracy has less to do with the absolute achievable precision implied by the optics, and much to do with the need to address the principal causes of failure in laser retinal surgery: failure to cover the treatment area, and incidental damage resulting from applying the laser to nontreatment areas. An important feature of the proposed algorithms is the fact that they are amenable to further improvements given increased computing capability. For instance, the accuracy of the tracking can be improved by increasing the size of the window used for the SSD, and increasing the number of landmark points used for transformation improvement. This should become possible with inevitable improvements in microprocessor technology.

We are currently investigating full 3-D modeling of the retinal surface in order to improve the accuracy of the montage, and the location determination implied by it.

APPENDIX

DERIVATION OF (9) AND (10)

Given a set of points in image 2 (the new image) that map to a set of points in image 1 (the retinal map), we can define a squared difference error value as follows:

(A.1)

The minimum of this error measure can be computed by taking partial derivatives and setting the result equal to zero, giving the following results:

(A.2)

(A.3)

and

(A.4)

By solving these equations, the parameters of the best transformation in the least-squares sense can be computed in closed form as (9) and (10).
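As a sketch of how such a derivation proceeds, assume (in our own notation, which is not necessarily that of the paper) that points $(p_i, q_i)$ in image 2 map to points $(x_i, y_i)$ in image 1 under a uniform scale $s$ and a translation $(t_x, t_y)$, and let an overbar denote an average over the $N$ matched pairs. Under these assumptions the error, its three partial derivatives, and the resulting closed-form solution can be written as:

```latex
\begin{align}
E(s, t_x, t_y) &= \sum_{i=1}^{N} \Big[ (s\,p_i + t_x - x_i)^2 + (s\,q_i + t_y - y_i)^2 \Big] \\
\frac{\partial E}{\partial t_x} &= 2 \sum_{i=1}^{N} (s\,p_i + t_x - x_i) = 0
  \;\Rightarrow\; t_x = \bar{x} - s\,\bar{p} \\
\frac{\partial E}{\partial t_y} &= 2 \sum_{i=1}^{N} (s\,q_i + t_y - y_i) = 0
  \;\Rightarrow\; t_y = \bar{y} - s\,\bar{q} \\
\frac{\partial E}{\partial s} &= 2 \sum_{i=1}^{N} \Big[ p_i (s\,p_i + t_x - x_i) + q_i (s\,q_i + t_y - y_i) \Big] = 0 \\
s &= \frac{\overline{p x} + \overline{q y} - \bar{p}\,\bar{x} - \bar{q}\,\bar{y}}
          {\overline{p^2} + \overline{q^2} - \bar{p}^2 - \bar{q}^2}
\end{align}
```

The last line follows from substituting the two translation expressions into the condition on $s$. It involves exactly the kinds of averages named in the text after (9): averages of the coordinates, of products of corresponding coordinates, and of the squared coordinates of image 2.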

ACKNOWLEDGMENT

The authors would like to thank the staff at the Center for Sight, Albany, NY, especially photographers G. Howe and M. Fish, for assisting with image acquisition. They would also like to thank T. Turner and H. Yau for assisting with manual validation of the tracking algorithm.

REFERENCES

[1] R. Murphy, “Age-related macular degeneration,” Ophthalmol., vol. 93, pp. 969–971, 1986.

[2] Macular Degeneration Study Group, “Recurrent choroidal neovascularization after argon laser photocoagulation for neovascular maculopathy,” Arch. Ophthalmol., vol. 104, pp. 503–512, 1986.

[3] S. L. Trokel, “Lasers in Ophthalmology,” Optics, Photonics News, Oct. 1992, pp. 11–13.

[4] M. W. Balles, C. A. Puliafito, D. J. D’Amico, J. J. Jacobson, and R. Birngruber, “Semiconductor diode laser photocoagulation in retinal vascular disease,” Ophthalmol., vol. 97, no. 11, pp. 1553–1561, Nov. 1990.

[5] N. M. Bressler, S. B. Bressler, and E. S. Gragoudas, “Clinical characteristics of choroidal neovascular membranes,” Arch. Ophthalmol., vol. 105, pp. 209–213, 1987.

[6] P. N. Monahan, K. A. Gitter, J. D. Eichler, G. Cohen, and K. Schomaker, “Use of digitized fluorescein angiogram system to evaluate laser treatment for subretinal neovascularization: Technique,” Retina: J. Retinal, Vitreous Diseases, vol. 13, no. 3, pp. 187–195, 1993.

[7] P. N. Monahan, K. A. Gitter, J. D. Eichler, and G. Cohen, “Evaluation of persistence of subretinal neovascular membranes using digitized angiographic analysis,” Retina: J. Retinal, Vitreous Diseases, vol. 13, no. 3, pp. 196–201, 1993.

[8] S. Fine, “Observations following laser treatment for choroidal neovascularization,” Arch. Ophthalmol., vol. 106, pp. 1524–1525, 1988.

[9] G. Soubrane, G. Coscas, C. Francais, and F. Koenig, “Occult subretinal new vessels in age-related macular degeneration,” Ophthalmol., vol. 97, pp. 649–657, 1990.

[10] Macular Photocoagulation Study Group, “Persistent and recurrent neovascularization after Krypton laser photocoagulation for neovascular lesions of ocular histoplasmosis,” Arch. Ophthalmol., vol. 107, pp. 344–352, 1989.

[11] Q. Zheng and R. Chellappa, “A computational vision approach to image registration,” IEEE Trans. Image Processing, vol. 2, no. 3, July 1993.

[12] D. E. Becker, “Algorithms for automatic retinal mapping and real-time location determination for an improved retinal laser surgery system,” Ph.D. dissertation, Rensselaer Polytechnic Inst., Troy, NY 12180, Aug. 1995.

[13] R. W. Flower and B. F. Hochheimer, “A clinical technique and apparatus for simultaneous angiography of the separate retinal and choroidal circulation,” Investigat. Ophthalmol., vol. 12, no. 4, pp. 248–261, Apr. 1973.

[14] T. M. Clark, W. R. Freeman, and M. H. Goldbaum, “Digital overlay of fluorescein angiograms and fundus images for treatment of subretinal neovascularization,” Retina: J. Retinal, Vitreous Diseases, vol. 2, no. 12, pp. 118–126, 1992.

[15] A. J. Welch, “University of Texas lab studies tissue optics, ablation, automation,” Biomed. Optics: Newslett. Biomed. Optics Soc., vol. 2, no. 2, May 1993.

[16] S. F. Barrett, M. R. Jerath, H. G. Rylander, and A. J. Welch, “Digital tracking and control of retinal images,” Opt. Eng., vol. 33, no. 1, pp. 150–159, Jan. 1994.

[17] M. S. Markow, H. G. Rylander, and A. J. Welch, “Real-time algorithm for retinal tracking,” IEEE Trans. Biomed. Eng., vol. 40, no. 12, pp. 1269–1281, Dec. 1993.

[18] M. J. Borodkin and J. T. Thompson, “Retinal cartography: An analysis of two-dimensional and three-dimensional mapping of the retina,” Retina: J. Retinal, Vitreous Diseases, vol. 12, no. 3, pp. 273–280, 1992.

[19] A. A. Mahurkar, B. L. Trus, M. A. Vivino, E. M. Kuehl, M. B. Datiles, and M. I. Kaiser-Kupfer, “Retinal fundus photo montages: A new computer based method,” Investigat. Ophthalmol., Visual Sci., vol. 36, no. 4, Mar. 1995.

[20] P. Dani and S. Chaudhuri, “Automated assembling of images: Image montage preparation,” Pattern Recogn., vol. 28, no. 1, pp. 431–445, Mar. 1995.

[21] D. Milgram, “Adaptive techniques for photomosaicking,” IEEE Trans. Comput., vol. C-26, no. 11, pp. 1175–1180, Nov. 1977.

[22] M. Goldbaum, N. Katz, S. Chaudhuri, M. Nelson, and P. Kube, “Digital image processing for ocular fundus images,” Ophthalmol. Clin. N. Amer., vol. 3, no. 3, pp. 447–466, Sept. 1990.

[23] R. Jagoe, J. Arnold, C. Blauth, P. L. C. Smith, P. M. Taylor, and R. Wootton, “Measurement of capillary dropout in retinal angiograms by computerized image analysis,” Pattern Recogn. Lett., vol. 13, pp. 143–151, Feb. 1992.

[24] S. Chaudhuri, S. Chatterjee, N. Katz, M. Nelson, and M. Goldbaum, “Detection of blood vessels in retinal images using two-dimensional matched filters,” IEEE Trans. Med. Imag., vol. 8, no. 3, pp. 263–269, Sept. 1989.

[25] A. V. Cideciyan, “Registration of ocular fundus images,” IEEE Eng. Med., Biol., Mag., vol. 14, no. 1, pp. 52–58, Jan./Feb. 1995.

[26] K. E. Rayner, Ed., Eye Movements and Visual Cognition: Scene Perception and Reading, Springer Series in Neuropsychology. New York: Springer-Verlag, 1992.

[27] M. H. Goldbaum, V. Kouznetsova, B. L. Coté, W. E. Hart, and M. Nelson, “Automated registration of digital ocular fundus images for comparison of lesions,” in SPIE: Ophthalmic Technologies III, 1993, vol. 1877, pp. 94–99.

[28] L. S. Davis, “A survey of edge detection techniques,” Comput. Graphics, Image Processing, vol. 4, pp. 248–270, 1975.

[29] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, vol. 1. Reading, MA: Addison-Wesley, 1992.

[30] E. Anarim, H. Aydinoglu, and I. C. Goknar, “Decision based edge detector,” Signal Processing, vol. 35, pp. 149–156, Jan. 1994.

[31] G. S. Watson, Statistics on Spheres. New York: Wiley, 1983.

[32] G. Stockman, S. Kopstein, and S. Benett, “Matching images to models for registration and object detection via clustering,” IEEE Trans. Pattern Anal. Machine Intell., vol. 3, no. 3, pp. 229–241, 1982.

[33] G. Stockman and J. C. Esteva, “3-D object pose from clustering with multiple views,” Pattern Recogn. Lett., vol. 3, pp. 279–286, 1985.

[34] L. G. Brown, “A survey of image registration techniques,” ACM Computing Surveys, vol. 24, no. 4, pp. 325–376, Dec. 1992.

[35] D. I. Barnea and H. F. Silverman, “A class of algorithms for fast digital image registration,” IEEE Trans. Comput., vol. C-21, no. 2, pp. 179–186, Feb. 1972.

[36] E. Peli, R. A. Augliere, and G. T. Timberlake, “Feature-based registration of retinal images,” IEEE Trans. Med. Imag., vol. 6, no. 3, Sept. 1987.

[37] B. Ravichandran and A. C. Sanderson, “Model-based matching using a hybrid genetic algorithm,” in Proc. 1994 IEEE Int. Conf. Robotics and Automat., pp. 2064–2069, 1994.

[38] S. Umeyama, “Least-squares estimation of transformation parameters between two point patterns,” IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 4, pp. 376–380, Apr. 1991.

Douglas E. Becker was born on December 4, 1970. He received the B.S. degree in 1990, the M.S. degree in 1991, and the Ph.D. degree in 1995, all in computer and systems engineering from Rensselaer Polytechnic Institute, Troy, NY.

He is currently a Principal Software Engineer at Siemens Medical Systems, Nuclear Medicine Group, Hoffman Estates, IL. His current research interests are in medical image analysis, image registration, and high-speed computer architectures.

Ali Can received the B.S. degree in electrical engineering from University of Gaziantep, Turkey, in 1993, and the M.S. degree in computer and systems engineering from Rensselaer Polytechnic Institute (RPI), Troy, NY, in 1997. Currently, he is a Ph.D. degree student at RPI.

His research interests include biomedical image processing, real-time applications, and motion and structure estimation (2-D and 3-D) from image sequences.

Mr. Can is a member of the Microscopy Society of America.

James N. Turner received the B.S. degree in engineering science in 1968 and the Ph.D. degree in biophysics in 1973 from the State University of New York at Buffalo.

He did National Institutes of Health (NIH) and National Science Foundation (NSF) postdoctoral fellowships at the Roswell Park Memorial Institute, Buffalo. Currently, he is Director of the Three-Dimensional Light Microscopy Facility at the Wadsworth Center of the New York State Department of Health, Albany. He is also Professor of Biomedical Engineering at Rensselaer Polytechnic Institute and Biomedical Sciences in the School of Public Health of the University at Albany. His interests focus on applications of light imaging methods and quantitative image analysis in biology and medicine, with special emphasis on the nervous system.

Dr. Turner is on the Editorial Boards of Microscopy and Microanalysis and Microscopy Research Techniques, and he has chaired numerous symposia in the area of 3-D microscopy, both light and electron, at meetings of the Microscopy Society of America. He is a member of the Microscopy Society of America, International Society for Analytical Cytology, AAAS, and the Society for Neuroscience. He frequently serves on NIH advisory panels.

Howard L. Tanenbaum received the B.Sc. degree and the M.D., C.M. from McGill University, Montreal, P.Q., Canada, in 1961.

He is a Fellow of the Royal College of Physicians and Surgeons of Canada. He has taught ophthalmology at various levels at the University of Colorado, Boulder (1962–1963), Montreal General Hospital, Montreal, P.Q., Canada (1968–1969), Jewish General Hospital, Montreal, P.Q., Canada (1968–1984), McGill University (1968–1984), and Albany Medical College, Albany, NY (1984–1987).

He is currently Director of The Center for Sight in Albany, NY. His research interests are in proliferative vitreoretinal diseases, diabetic retinopathy, neovascularization, and a variety of laser-related issues.

Dr. Tanenbaum is a member of the Association for Research in Vision and Ophthalmology (ARVO), Canadian Medical Association, The Retina Society, American Academy of Ophthalmology, New York Academy of Science, Quebec Retina Club, Macula Society, Northeast Eye, Ear, and Throat Society of New York, New York State Medical Society, New York State Ophthalmological Society, and The American Medical Association. He is on the Editorial Committee of the National Eye Trauma Registry, and is Contributing Editor to the journal Ophthalmic Practice.

Badrinath Roysam (M’89) received the B.Tech. degree in electronics engineering from the Indian Institute of Technology, Madras, India, in 1984 and the M.S. and D.Sc. degrees in electrical engineering from Washington University, St. Louis, MO, in 1987 and 1989, respectively.

He has been at Rensselaer Polytechnic Institute, Troy, NY, since 1989. He is currently an Associate Professor in the Electrical, Computer, and Systems Engineering Department. He co-founded AutoQuant Imaging Systems Inc., Troy, NY. He has also consulted for various major and small corporations on imaging systems and image processing, and has assisted venture capital companies with detailed analysis of startup companies. His current research interests are in the areas of biomedical image analysis, optical instrumentation, high-speed and real-time computing architectures, parallel algorithms, and other compelling medical applications.

Dr. Roysam is a member of the Microscopy Society of America (MSA) and the Association for Research in Vision and Ophthalmology (ARVO).