2014-04-28
On Simultaneous Localization and Mapping inside
the Human Body (Body-SLAM)
Guanqun Bao
Worcester Polytechnic Institute
On Simultaneous Localization and Mapping inside the Human Body (Body-SLAM)
by
Guanqun Bao
A Dissertation
Submitted to the Faculty
of the
WORCESTER POLYTECHNIC INSTITUTE
in partial fulfillment of the requirements for the
Degree of Doctor of Philosophy
in
Electrical and Computer Engineering
April 2014
APPROVED:
Professor Kaveh Pahlavan, Major Thesis Advisor
Professor Yehia Massoud, Head of Department
Professor Lifeng Lai, ECE Dept., WPI
Professor Emmanuel Agu, CS Dept., WPI
Abstract
Doctor of Philosophy
Body-SLAM: Simultaneous Localization and Mapping inside Human Body
by Guanqun Bao
Wireless capsule endoscopy (WCE) offers a patient-friendly, non-invasive and painless investigation of the entire small intestine, which conventional wired endoscopic instruments can barely reach. As a critical component of the capsule endoscopic examination, physicians need to know the precise position of the endoscopic capsule in order to identify the position of intestinal disease after it is detected by the video source. To define the position of the endoscopic capsule, we need a map of the inside of the human body. However, since the shape of the small intestine is extremely complex and the RF signal propagates differently in the non-homogeneous body tissues, accurate mapping and localization inside the small intestine is very challenging. In this dissertation, we present an in-body simultaneous localization and mapping technique (Body-SLAM) to enhance the positioning accuracy of the WCE inside the small intestine and reconstruct the trajectory the capsule has traveled. In this way, the positions of intestinal diseases can be accurately located on the map of the inside of the human body, which facilitates follow-up therapeutic operations. The proposed approach takes advantage of data fusion from two sources that come with the WCE: image sequences captured by the WCE’s embedded camera and the RF signal emitted by the capsule. The approach estimates the speed and orientation of the endoscopic capsule by analyzing displacements of feature points between consecutive images. It then integrates this motion information with the RF measurements by employing a Kalman filter to smooth the localization results and generate the route that the WCE has traveled. The performance of the proposed motion tracking algorithm is validated using empirical data from patients, and the resulting motion model is then imported into a virtual testbed to test the performance of alternative Body-SLAM algorithms. Experimental results show that the proposed Body-SLAM technique is able to track the WCE with an average error of less than 2.3 cm.
Acknowledgements
First and foremost, I would like to express my deepest appreciation to my advisor, Professor Kaveh Pahlavan, not only for his guidance and support in academics and research, but also for enlightening me by sharing his insights and life experience. I cannot remember how many times Professor Pahlavan offered creative discussions and advice when I was stuck with my research. He inspired me with his short didactic stories, which contain the true philosophy of life. My work and accomplishments were only possible because of his help and encouragement.
I would also like to thank the members of my PhD committee: Prof. Allen H. Levesque for always being kind and supportive and for bringing me into this prestigious research lab; Professor Massoud for sharing his insights when I was hesitating between life choices; Professor Agu for his excellent lectures on Computer Graphics, which laid the foundations of my thesis; Professor Lai for his valuable suggestions and comments on my thesis; and Professor Kamran Sayrafian for being my external, off-campus committee member. I would also like to extend my thanks to Dr. David Cave from UMass Medical Center for introducing us to the field of localization for wireless capsule endoscopy and educating us with his expertise.
I would also like to show my appreciation to the former and current members of the CWINS lab: Dr. Yunxing Ye, Dr. Jie He, Ruijun Fu, Shen Li, Jin Chen, Xin Zheng, Zhuoran Liu, Yishuang Geng, Bader Alkandari and Fardad Askarzadeh, for their direct or indirect help in preparing this thesis. Special thanks to Liang Mi for working with me on the emulation testbed and being so hardworking and productive.
Finally, and most importantly, I would like to thank my wife, Zhijiao Wang, for her longstanding support, encouragement and unwavering love during the past six years. I thank my parents, Yuling Zhao and Lichun Bao, for their faith in me. Without their support I wouldn’t have had the chance to achieve today’s accomplishments.
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations
Symbols
1 Introduction
1.1 Evolution of Wireless Capsule Endoscopy (WCE)
1.2 Motivation
1.3 Contributions
1.4 Outline of the Dissertation
2 Challenges in WCE Localization
2.1 Overview of Wireless Capsule Endoscopy (WCE)
2.2 Literature Review
2.3 RF Localization Techniques
2.3.1 RSS based techniques
2.3.2 ToA based techniques
2.3.3 Localization Algorithms
2.3.3.1 least square algorithm
2.3.3.2 maximum likelihood algorithm
2.3.4 Cramer-Rao Lower Bound (CRLB)
2.4 Challenges of Localization inside Small Intestine
3 Design of Algorithms for Body-SLAM
3.1 Formulation of Body-SLAM
3.2 Motion Tracking using Endoscopic Images
3.2.1 Analyzing the Content of Endoscopic Images
3.2.1.1 Image segmentation using SRM
3.2.1.2 Image classification using SVM classifier
3.2.2 Feature Points Matching
3.2.3 Image Unrolling
3.2.4 Speed Estimation
3.2.5 Direction of Moving Estimation
3.3 Data Fusion of Visual and RF Information
3.3.1 Kalman Filter
3.3.2 Relative Position Predictions using Images
3.3.3 Absolute Position Measurements by RF Localization
4 Performance Evaluation of Body-SLAM
4.1 Empirical Results of Motion Tracking
4.1.1 Speed Estimation using PillCam COLON 2
4.1.2 Statistical Speed Modeling
4.2 Design of Testbed for Performance Evaluation
4.2.1 Visual Component
4.2.1.1 Physical testbed
4.2.1.2 Virtual testbed
4.2.2 RF Component
4.2.2.1 RF propagation emulation using FDTD
4.2.2.2 RSS vs ToA
4.3 Performance Evaluation for Body-SLAM
5 Conclusion and Future Direction
5.1 Conclusion
5.2 Future Direction
A Appendix: Full Publication List
A.1 Related to the Thesis
A.2 Not Related to the Thesis
B Appendix Tutorial
List of Figures
2.1 The architecture of WCE
2.2 Wireless Capsule Endoscopy
2.3 A typical RF localization system
2.4 A typical 3D pattern of body mounted sensors used as reference points of the performance evaluation scenario for localization of the WCE
3.1 Overall flow chart of Body-SLAM
3.2 Two basic categories of endoscopic images
3.3 Two sequences of segmented images with Q = 16. Top 2 rows: FT, Bottom 2 rows: FL
3.4 Illustration of feature mapping using Kernel function
3.5 A WCE moving inside the small intestine
3.6 Feature matching between two consecutive images using A-SIFT
3.7 Image acquisition system of WCE
3.8 The process of “unrolling” the cylindrical image
3.9 Speed estimation
3.10 Direction of moving of the capsule
3.11 Direction of moving estimation
3.12 A typical RF localization system
3.13 A complete flowchart of data fusion of images and RF measurements using a Kalman filter
4.1 Some typical landmarks for the WCE
4.2 PillCam COLON 2 with double cameras
4.3 Speed estimation results from PillCam COLON 2 double cameras
4.4 Statistics of speed estimation using PillCam COLON 2 double cameras
4.5 PDF of the speed estimation from different individuals
4.6 CDF of the speed estimation from different individuals
4.7 Speed estimation results of a sequence of real endoscopic images
4.8 A typical speed pattern of moving fast
4.9 A typical speed pattern of moving slow
4.10 Design of emulation testbed for quantitative performance evaluation
4.11 A physical visual model for the small intestine (a) wired endoscopic camera (b) appearance of the physical model (c) pictures taken from inside the physical model
4.12 Mapping the physical testbed into virtual 3D space
4.13 3D testbed
4.14 Emulated endoscopic images from virtual visual testbed
4.15 3D path generation from a 3D GI tract model
4.16 Emulation testbed set up
4.17 RF propagation setup using SEMCAD X
4.18 RSS versus distance (left) and Time-of-Arrival (TOA) versus distance (right) inside human body
4.20 Result of the motion tracking compared with ground truth
4.21 Mean square error (MSE) in the motion tracking process
4.22 Localization results of different algorithms and performance evaluation
4.23 Error distributions of different algorithms
4.24 Performance evaluation by CDF plot of different algorithms
B.1 Generating the path inside human body
List of Tables
2.1 FDA-approved wireless capsule systems and specifications
2.2 Parameters for the statistical implant to body surface pathloss model
4.1 Motion tracking performance for each step
Abbreviations
ASIFT Affine Scale Invariant Feature Transform
BAN Body Area Network
CDF Cumulative Distribution Function
CT Computed Tomography
CRLB Cramer Rao Lower Bound
FDA Food and Drug Administration
FCC Federal Communications Commission
FP Feature Points
GI Gastrointestinal
GPS Global Positioning System
HOG Histogram of Oriented Gradients
IPS Indoor Positioning System
ISM Industrial, Scientific and Medical
IMU Inertial Measurement Unit
LBP Local Binary Pattern
LS Least Square
KF Kalman Filter
MLE Maximum Likelihood Estimation
MSE Mean Square Error
MRI Magnetic Resonance Imaging
NIST National Institute of Standards and Technology
PDF Probability Density Function
RSS Received Signal Strength
SNR Signal to Noise Ratio
SLAM Simultaneous Localization And Mapping
SRM Statistical Region Merging
SIFT Scale Invariant Feature Transform
ToA Time of Arrival
Symbols
I_n n×n identity matrix
(.)^T matrix transpose
(.)^H matrix conjugate transpose
|S|_i ith column of matrix S
|S|_{i,j} (i, j)th element of matrix S
sign(.) signum function
diag(A) vector resulting from extraction of diagonal elements of A
tr(.) matrix trace
∇(f) gradient with respect to f
E(.) expectation
δ(.) discrete Kronecker delta function
Bel^−(.) prior belief
Bel^+(.) posterior belief
Introduction
Many profound innovations in science and engineering start with metaphors
presented in science fiction. The wireless information networking industry was
motivated by Captain Kirk’s communicator in the 1960s science fiction series “Star
Trek”. The idea was formed in the early 1980s; the Federal Communications
Commission (FCC) released the Industrial, Scientific and Medical (ISM) bands; and the
IEEE 802.11 standardization committee created the WLAN standard in 1997 [1]. After
almost half a century, modern smart phones are what the evolution of the “Star Trek”
communicator fantasy has brought us. Recently, another 1960s science fiction work,
“Fantastic Voyage”, in which a spacecraft and its crew are shrunk to become a
micro-device capable of traveling inside the human body to remove a brain clot, has
stimulated a new wave of innovative science and engineering for the Body Area Network
(BAN) [2–5]. That spacecraft lost its navigation capabilities and went through an
unguided, dramatic journey within the human body before exiting through tears from the
eye of the human subject. Today, wireless endoscopic capsules travel inside the digestive
system in the same way as the spacecraft in “Fantastic Voyage”, and one can
envision the emergence of a number of similar applications for micro-robots inside the
human body.
1.1 Evolution of Wireless Capsule Endoscopy (WCE)
Endoscopy [6] is a medical procedure used to examine the interior wall of the digestive
system. According to a study conducted in 2002 [7], approximately 19 million people in
the United States were estimated to be affected by disorders of the small intestine. This
statistic indicates that effective advancements in endoscopy technology are extremely
worthy of investigation. When using the conventional endoscopic instrument, a long
flexible tube with a miniature camera needs to be inserted through the mouth or the
anus in order to get into the gastrointestinal (GI) tract. Owing to its rigidity and large
size, it causes much discomfort to whoever undergoes this procedure. This generally
limits the willingness of patients to have their GI tract examined regularly. Furthermore,
the lack of capability to reach the entire small intestine is also a significant shortcoming
of the current wired endoscope.
Wireless Capsule Endoscopy (WCE) [8–11], a significant step toward a more effective
endoscopy technique, was invented to overcome the above limitations.
The first WCE prototype for the small intestine was approved by the Food and Drug
Administration (FDA) in 2001. Over subsequent years, this technology has evolved
into one of the most popular non-invasive imaging tools for diagnosing intestinal disease.
A WCE is a pill-shaped device which consists of a short focal length CMOS camera, a light
source, a battery and a radio transmitter [12,13]. After the endoscopic capsule is swallowed
by a patient, this miniature device, propelled by peristalsis of the GI tract, begins to work
000 images) while moving along the GI tract. At the same time, images are sent out
wirelessly in the Ultra High Frequency (UHF) band at 432 MHz to a small portable recorder
attached to the waist [14]. The images are subsequently downloaded from the portable
recorder to a workstation for off-line analysis. The whole examination takes about
8 h. During this period, the patient does not need to be confined to a hospital or clinic
environment and is free to continue their daily routine. Up
to now, WCE has been used to detect the following diseases [15–17]: small intestinal
bleeding, Crohn’s disease, ulcers, tumors, vascular lesions and colon cancers.
1.2 Motivation
Although WCE provides a non-invasive wireless imaging technology for observing the
entire GI tract, one significant drawback of this technology is that it cannot localize
itself during its journey of several hours. When an abnormality is detected by
the video source, physicians therefore have only a limited idea of where it is located,
which prevents follow-up therapeutic operations from being carried out immediately.
A precise localization system for the endoscopic capsule would thus greatly enhance
the benefits of WCE.
However, localization of the WCE inside the human body is not trivial. Several
fundamental technical challenges make accurate localization inside the human body
a difficult task.
• First, we do not have a clear map of the inside of the human body. A map of the digestive
system plays a very important role in refining the localization results [18–20], but
computed tomography (CT) and magnetic resonance imaging (MRI)
cannot provide enough resolution to extract the path of the small intestine.
• Second, conventional single-source localization techniques, for example RF
localization techniques, cannot provide satisfactory localization results due to the
non-homogeneity and severe attenuation of body tissues [21]. We need to design
more sophisticated hybrid localization algorithms that integrate all available data
sources to enhance the localization accuracy. Doing so requires researchers with a
multidisciplinary background that includes wireless localization, robotics and image
processing.
• Third, validation of existing localization algorithms is challenging [22]. After
the capsule is swallowed by the patient, we have limited control over the endoscopic
capsule. Exploratory clinical procedures such as planar X-ray imaging and
ultrasound cannot easily be used to verify the position and motion status of the
capsule due to their high cost and potential risk to the patient’s health.
• Last but most importantly, conducting experiments inside the human body is extremely
difficult. As mentioned previously, there are practical challenges in verifying the
performance of any localization algorithm. Moreover, human subjects differ
from one another, so we need a uniform platform for comparative performance
evaluation of different algorithms.
These challenges have left the design of an accurate localization system for the WCE inside
the human body an unsolved engineering problem for more than 13 years. This became
the motivation of my research: to design a localization system that is able to precisely
localize the endoscopic capsule as it travels along the digestive system and meanwhile
1.3 Contributions
To meet the challenges introduced above, in this dissertation we present an in-body
simultaneous localization and mapping technique (Body-SLAM) to enhance the
positioning accuracy of the WCE inside the small intestine and meanwhile reconstruct the
trajectory the capsule has traveled. The contributions of this multi-disciplinary and
inter-disciplinary dissertation are:
• Design and performance evaluation of a Body-SLAM algorithm to accurately
localize the WCE and reconstruct the 3D map of the path the capsule has traveled.
The proposed Body-SLAM technique estimates the speed and orientation of the
endoscopic capsule by analyzing displacements of feature points between
consecutive images, and this motion information is integrated with the RF measurements
by employing a Kalman filter to smooth the localization results and the generated
3D map.
• To achieve this objective, we modeled the motion of the endoscopic capsule using
empirical data obtained from actual patients. This motion model is further
imported into an emulation testbed for performance evaluation.
• We designed a testbed for performance evaluation of hybrid localization algorithms
that benefit from the content of the endoscopic images as well as the features of the
RF signal emitted from the video capsule. We used this testbed to demonstrate
the effectiveness of hybrid localization algorithms for Body-SLAM inside the small
intestine.
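As a rough illustration of the fusion step named in the first contribution, the sketch below runs a scalar Kalman filter along a single coordinate: the image-based motion estimate drives the prediction, and the RF position estimate drives the correction. It is a minimal sketch under stated assumptions, not the formulation developed in Chapter 3. The function name `kalman_step`, the single-coordinate state, the noise variances `q` and `r`, and all numbers are hypothetical choices made for illustration.

```python
def kalman_step(x, p, displacement, q, z, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p          previous position estimate and its variance
    displacement  movement since the last step, e.g. image-based speed * dt
    q             variance of the displacement estimate (assumed value)
    z, r          RF position measurement and its variance (assumed value)
    """
    # Predict: dead-reckon with the image-based displacement.
    x_pred = x + displacement
    p_pred = p + q
    # Update: correct the prediction with the absolute RF measurement.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Hypothetical numbers (cm): the capsule advances about 1 cm per step,
# while the RF position readings scatter around the true path.
displacements = [1.0, 1.0, 1.0, 1.0]
rf_positions = [2.0, 1.5, 3.5, 3.0]

x, p = 0.0, 1.0  # initial position estimate and variance
track = []
for u, z in zip(displacements, rf_positions):
    x, p = kalman_step(x, p, u, q=0.5, z=z, r=1.5)
    track.append(x)
```

When `q` is small relative to `r`, the gain stays low, so the filter leans on the image-based motion and uses the RF readings mainly to remove accumulated drift.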
1. K. Pahlavan, G. Bao, Y. Ye, S. Makarov, U. Khan ... K. Sayrafian, “RF
localization for wireless video capsule endoscopy”, International Journal of Wireless
Information Networks, Vol. 19 (4), pp. 326-340, 2012.
2. G. Bao, Y. Ye, U. Khan, X. Zheng and K. Pahlavan, “Modeling of the Movement
of the Endoscopy Capsule inside GI Tract based on the Captured Endoscopic
Images”, The 2012 International Conference on Modeling, Simulation and
Visualization Methods (MSV), Las Vegas, USA, July, 2012.
3. G. Bao and K. Pahlavan, “Motion Estimation of the Endoscopy Capsule using
Region-based Kernel SVM Classifier”, 2013 IEEE International Conference on
Electro/Information Technology (EIT), Rapid City, SD, May 9-11, 2013.
4. G. Bao, L. Mi and K. Pahlavan, “Emulation on Motion Tracking of Endoscopic
Capsule inside Small Intestine”, 2013 World Congress in Computer Science, Computer
Engineering, and Applied Computing (WORLDCOMP’13), Las Vegas, USA, 2013.
5. G. Bao, L. Mi and K. Pahlavan, “A Video Aided RF Localization Technique
for the Wireless Capsule Endoscope (WCE) inside Small Intestine”, 8th
International Conference on Body Area Networks, Boston, Massachusetts, United States,
September 30 - October 2, 2013.
6. L. Mi, G. Bao and K. Pahlavan, “Design and Validation of a Virtual Environment
for Experimentation inside the Small Intestine”, 8th International Conference on
Body Area Networks, Boston, Massachusetts, United States, September 30 -
October 2, 2013.
7. R. Fu, G. Bao and K. Pahlavan, “Activity Classification with Empirical RF
Propagation Modeling”, 8th International Conference on Body Area Networks, Boston,
Massachusetts, United States, September 30 - October 2, 2013.
8. L. Mi, G. Bao and K. Pahlavan, “Geometric Estimation of Intestinal Contraction
for Motion Tracking of Video Capsule Endoscope”, SPIE Medical Imaging:
Image-Guided Procedures, Robotic Interventions, and Modeling, San Diego, California,
February 15-20, 2014.
9. G. Bao, L. Mi, Y. Geng and K. Pahlavan, “A Computer Vision based Speed
Estimation Technique for Localizing the Wireless Capsule Endoscope inside Small
Intestine”, submitted to Signal Processing Letters, IEEE, April, 2014.
10. G. Bao, L. Mi, Y. Geng, M. Zhou and K. Pahlavan, “A Video-based Speed
Estimation Technique for Localizing the Wireless Capsule Endoscope inside
Gastrointestinal Tract”, submitted to IEEE Engineering in Medicine and Biology Society
(EMBC 14), March, 2014.
11. M. Zhou, G. Bao and K. Pahlavan, “Mutual Information based Motion Tracking
Technique for the WCE inside Large Intestine”, submitted to IEEE Engineering
in Medicine and Biology Society (EMBC 14), March, 2014.
12. G. Bao, L. Mi and K. Pahlavan, “Hybrid Localization of Micro-robotic
Endoscopic Capsule inside Small Intestine by Data Fusion of Vision and RF Sensors”,
submitted to Sensor Journal, IEEE, March, 2014.
A full publication list can be found in Appendix A.
1.4 Outline of the Dissertation
This dissertation focuses on hybrid localization, which we call “Body-SLAM”, for
wireless capsule endoscopy and on testbed design for comparative performance
evaluation. The dissertation is organized as follows: in Chapter 2, we give an overview
of the existing localization technologies for WCE and address the technical challenges
in this field. In Chapter 3, we present the Body-SLAM hybrid localization technique, which uses
endoscopic images for motion tracking and combines the motion information with the
RF signal radiated from the capsule to enhance the localization accuracy and
meanwhile reconstruct the trajectory the capsule has traveled. In Chapter 4, the performance
of the proposed localization algorithm is evaluated using empirical data and
an emulation testbed. Finally, we conclude the dissertation in Chapter 5 and give
Challenges in WCE Localization
While physicians can receive clear images of the interior of the entire digestive system
using WCE, they have little idea of the exact location of the capsule when an
abnormality is found by the video source. To localize intestinal abnormalities, physicians
have to administer successive radiological, endoscopic or surgical operations, which
are invasive and potentially harmful to the patient’s health. If we could develop a wireless
localization system to localize these devices, physicians could not only diagnose
diseases but also learn where they are located. However, designing
such a localization system is a very challenging task. In this chapter, we review the
existing localization techniques, especially the RF localization techniques, discuss their
limitations and address the challenges in designing a localization system for use inside
the human body.
2.1 Overview of Wireless Capsule Endoscopy (WCE)
A WCE is a pill-shaped device which consists of a short focal
length CMOS camera, a light source, a battery and a radio transmitter [12,13], as shown in
Figure 2.1. After the endoscopic capsule is swallowed by a patient, this miniature device
begins to work and records images at a rate of at least 2 frames per second while moving
along the GI tract. At the same time, the images are sent wirelessly to a data recorder
attached to the patient’s waist. The whole process takes about 8 h, after which all the image
data are downloaded to a workstation so that physicians can inspect the whole video and
diagnose diseases of the GI tract. Because it needs no cable
connection, WCE offers a patient-friendly, non-invasive and painless investigation of
the entire GI tract, especially the small intestine, which conventional endoscopic
instruments can barely reach. Up to now, WCE has been used to detect the following
diseases: small intestinal bleeding, Crohn’s disease, ulcers, tumors, vascular lesions and
colon cancers [15–17].
A typical capsule endoscopy system consists of three components, shown in Figure 2.2
[12,13]:
1. A wireless capsule endoscope
All capsule endoscopes have similar components: a disposable plastic capsule, a complementary metal oxide semiconductor or high-resolution charge-coupled device image capture system, a compact lens, white-light emitting diode illumination sources, and an internal battery source.
2. A sensing system with sensing pads or a sensing belt to attach to the patient, a
The mode of data transmission is either via ultra-high frequency band radio telemetry (PillCam, EndoCapsule) or human body communications (MiroCam). The latter technology uses the capsule itself to generate an electrical field that uses human tissue as the conductor for data transmission. Currently PillCam SB2 and MiroCam are available with extended battery life, which may be beneficial in patients with delayed small-bowel transit.
3. A personal computer workstation with proprietary software for image review and
interpretation.
Major visualization systems are RAPID Reader from Given Imaging, WS-1 EndoCapsule from Olympus America and MiroView from IntroMedic.
Figure 2.1: The architecture of WCE (1: optical dome, 2: lens holder, 3: lens, 4: illuminating LEDs, 5: CMOS imager, 6: battery, 7: RF transmission model, 8: antenna)
Figure 2.2: Wireless Capsule Endoscopy, panels (a), (b) and (c)
Table 2.1: FDA-approved wireless capsule systems and specifications
WCE            Company          Size (mm)  Weight  View angle  Frame rate  Battery life  Resolution
EndoCapsule    Olympus America  11×26      3.5 g   145°        2/sec       8 hours       512×512
PillCam SB2    Given Imaging    11×26      2.8 g   156°        2/sec       8 hours       256×256
PillCam SB3    Given Imaging    11×26      2.8 g   156°        2-6/sec     12 hours      320×320
PillCam SB2EX  Given Imaging    11×26      3.3 g   156°        2/sec       12 hours      256×256
MiroCam
2.2 Literature Review
WCE provides a noninvasive way to inspect the entire small intestine. As a critical
component of capsule endoscopic examination, physicians need to know the precise
position of the endoscopic capsule in order to identify the position of intestinal disease
after it is found by the video source [23–25]. Follow-up therapeutic operations and
the effect of drug administration depend heavily on the accuracy of the capsule’s
position information [26]. Therefore, a precise and reliable localization system plays
an important role in enhancing the benefits of WCE. During the past few years, many
attempts have been made to develop accurate and reliable localization systems for the
WCE. A good review of existing localization techniques is given in [27]. These
technologies can be divided into those using magnetic fields [28–32] or inertial systems [33],
those using image processing techniques [34] and those using RF signals [4,35–37].
In magnetic sensing based techniques, a magnet is inserted into the WCE and the WCE
is located by measuring the magnetic field [38,39]. This technique increases the weight
and size of the WCE, and the magnetic field used for localization is subject to
interference from external magnetic fields used for other applications such as Magnetic
Resonance Imaging (MRI) systems. One can also insert radiopaque material into
the WCE and trace its location using X-ray or Computed Tomography
(CT) scans. However, continuous imaging using X-ray or CT is very expensive and poses
health risks for the patient [27,40].
In [33], Ciuti and his colleagues proposed a localization system based on magnetic and
inertial sensing. They inserted a three-axis accelerometer (LIS331DL) into the capsule. This inertial
sensing not only provides the approximate location and orientation of the capsule in
magnetic link between the external permanent magnet and the capsule. However, it
would be difficult to make a capsule compact enough to be swallowable with the
integration of such an inertial sensing subsystem and four cylindrical magnets. Also, this
localization technique only offers rough spatial information (an average error of 3 cm)
without data in the vertical direction.
Besides the magnetic field based and inertial sensing based techniques, computer
vision based techniques for localizing the WCE are also being investigated [34,41–44].
Because the capsule endoscope changes position and direction very slowly, two
successive endoscopic images share some identical areas, so we can find corresponding
point pairs in the two images. Using these image correspondences, we can determine the
motion (rotation and translation) parameters of the capsule endoscope with an appropriate
algorithm. This approach can complement the magnetic
localization and orientation method.
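To make the idea of recovering rotation and translation from point correspondences concrete, the sketch below fits a 2D rigid motion to matched point pairs by least squares. This is a simplified, in-plane illustration only and is not the method of the cited works, which estimate the capsule's 3D motion; the function name `estimate_rigid_motion_2d` and the sample coordinates are hypothetical.

```python
import math


def estimate_rigid_motion_2d(points_prev, points_next):
    """Least-squares rotation angle and translation mapping points_prev onto points_next."""
    n = len(points_prev)
    # Centroids of the two point sets.
    cx_p = sum(p[0] for p in points_prev) / n
    cy_p = sum(p[1] for p in points_prev) / n
    cx_n = sum(p[0] for p in points_next) / n
    cy_n = sum(p[1] for p in points_next) / n
    # Accumulate dot- and cross-product terms of the centred point pairs.
    s_dot = 0.0
    s_cross = 0.0
    for (xp, yp), (xn, yn) in zip(points_prev, points_next):
        ax, ay = xp - cx_p, yp - cy_p
        bx, by = xn - cx_n, yn - cy_n
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    # The angle that maximises the alignment of the centred sets.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation carries the rotated first centroid onto the second centroid.
    tx = cx_n - (c * cx_p - s * cy_p)
    ty = cy_n - (s * cx_p + c * cy_p)
    return theta, (tx, ty)


# Hypothetical matched feature points (pixels) in two consecutive frames.
frame_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
true_theta, true_t = 0.1, (2.0, -1.0)
frame_b = [
    (math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
     math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
    for x, y in frame_a
]

theta, (tx, ty) = estimate_rigid_motion_2d(frame_a, frame_b)
```

With noise-free correspondences the fit recovers the applied motion; with real feature matches, outliers would normally have to be rejected first.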
Using the RF signal that already carries the image transmissions of the WCE to also locate
the capsule offers a natural and low-cost solution that adds no extra complexity
or payload to the capsule [4,45–49]. Therefore, it has been chosen for use with the SmartPill
capsule in the USA and the M2A capsule in Israel. RF signals have been widely used for locating
objects in both outdoor and indoor environments, with achieved accuracy on the order of
hundreds of millimeters [19,50]. Nevertheless, applying radio frequency to the task of
tracking an object as it moves inside a special environment, such as the GI tract,
is a challenge. This is because high-frequency signals suffer significant attenuation, at
different levels, when they pass through different living tissues, whereas low-frequency
signals, due to their long wavelengths, are not able to deliver the desired precision of
several millimeters. The most commonly used RF techniques are Received Signal Strength
of using RF signal for localization and address their limitations and challenges when
applied inside human body.
2.3 RF Localization Techniques
The wireless localization industry was initiated by the Global Positioning System (GPS)
for outdoor navigation in the early 1970s and later evolved into Indoor Positioning Systems
(IPS) in the 1990s [51–57]. Subsequently, with the release of the IEEE 802.15.6 Body Area
Network (BAN) standard and the rise of implantable micro-robots, localization technology
has been moving inside the human body [4,58–60]. The first major
application of this localization technology is wireless capsule endoscopy [7,29,61–63].
A commonly used RF localization infrastructure attaches many calibrated external
RF sensors to the anterior abdominal wall of the human body to detect the RF signal
emitted by the wireless capsule, as shown in Figure 2.3. By converting a
characteristic of the received signal (RSS or ToA) into the distance between the capsule
and each sensor in the body mounted sensor array, the position of the capsule can be
estimated by pattern matching
algorithms such as the least square algorithm and the maximum likelihood algorithm [28,36].
However, RF localization of micro-robots inside humans is not trivial. Compared to
outdoor and indoor environments, the inside of the human body is a complex
environment, making engineering design and visualization a formidable task [64]. The inside of
the human body is an extremely complex medium for RF propagation because it is a
non-homogeneous liquid-like environment with irregularly shaped boundaries and severe
path loss. Things become more complex when it comes inside human body since the
used as references for localization are also in motion. More importantly, reliable designs
require testing the hardware implementation, but we cannot easily test devices inside the
human body. Therefore, existing RF localization systems sometimes end up providing
discontinuous and scattered estimates with large errors.
Figure 2.3: A typical RF localization system
2.3.1 RSS based techniques
The name of “wireless” capsule endoscope indicates its capability to transmit the images
by RF signal. The transmitter embedded inside the capsule sends endoscopic images,
which are captured during its travel along the GI tract, to several receivers placed
uniformly on the exterior of the patient abdomen as shown in Figure 3.12. Taking
advantage of this integrated function, people can measure the strength of the received
RF signals at each sensor and use each sensor as a reference node to localize the capsule
(mobile node). The tracking algorithm is based on the observation that the closer the
Table 2.2: Parameters for the statistical implant to body surface pathloss model
Implant to body surface    PL(d0) (dB)    α       σ (dB)
Deep tissue                47.14          4.26    7.85
Near surface               49.81          4.22    6.81
the RSS reading and the distance from the transmitter to the receiver can be expressed
by a pathloss model as given below [45,46,65,66]:
\[
RSS(d) = P_t - PL(d_0) - 10\alpha \log_{10}\left(\frac{d}{d_0}\right) + S, \qquad d > d_0 \tag{2.1}
\]
where $d$ is the distance between transmitter and receiver, $P_t$ is the transmit power,
$PL(d_0)$ is the path loss at a reference distance $d_0$ (i.e. 50 mm), and $\alpha$ is the path loss
gradient, which is determined by the propagation environment. For example, in free
space, $\alpha$ equals 2. Since human body tissue strongly absorbs the RF signal, a much
higher value of the path loss gradient is expected inside the human body. $S$ is a Gaussian
random variable that models shadow fading. From Eq. 2.1, the distance between the
capsule and each of the sensors can be roughly estimated from the RSS readings. Then,
the capsule’s location can be calculated using the trilateration method.
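As a rough illustration of how Eq. 2.1 turns an RSS reading into a range estimate, the following minimal sketch inverts the path loss model using the deep tissue parameters of Table 2.2. The transmit power value and the function names are assumptions made for this example only, and the shadow fading term S is set to zero when inverting.

```python
import math

# Deep tissue parameters from Table 2.2 (implant to body surface).
PL_D0_DB = 47.14   # path loss at the reference distance d0, in dB
ALPHA = 4.26       # path loss gradient
D0_MM = 50.0       # reference distance d0, in mm


def rss_dbm(d_mm, pt_dbm, shadow_db=0.0):
    """Eq. 2.1: RSS at distance d (the model is stated for d > d0)."""
    return pt_dbm - PL_D0_DB - 10.0 * ALPHA * math.log10(d_mm / D0_MM) + shadow_db


def distance_mm(rss, pt_dbm):
    """Invert Eq. 2.1 with the shadow fading term S assumed to be zero."""
    exponent = (pt_dbm - PL_D0_DB - rss) / (10.0 * ALPHA)
    return D0_MM * 10.0 ** exponent


PT_DBM = -16.0  # hypothetical transmit power, not taken from the source

clean = distance_mm(rss_dbm(120.0, PT_DBM), PT_DBM)
# One standard deviation of shadow fading (7.85 dB) biases the range estimate.
faded = distance_mm(rss_dbm(120.0, PT_DBM, shadow_db=7.85), PT_DBM)
print(clean, faded)
```

The second call suggests why RSS ranging is only a rough estimate: a shadow fading excursion of a few dB is mapped directly into a multiplicative distance error.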
A propagation attenuation model plays a vital role in the RSS technique. In order to
reduce the positioning error, it is necessary to develop an appropriate implant to body
surface path loss model. The parameters of one of the most cited signal attenuation
model developed by National Institute of Standards and Technology (NIST) at MICS
band are summarized in Table 2.2.
The empirical model mentioned previously is not accurate enough for the complex
environment of the GI tract. The model was developed by National Institute of Standards
Instead of using a signal propagation model, another RSS based localization scheme is
the so-called “fingerprinting” technique [47]. The fingerprinting technique first
creates a lookup table for position estimation. An offline measurement survey needs to
be done in advance, in which, at each position of the capsule, both the
signal strength measured by each of the sensors and the position data are recorded in
the table. During the experiment, online data are compared with the data stored in the
lookup table to find the closest match and thus select the most appropriate position.
However, since we do not have a map of the inside of the body to do the survey and people
differ in terms of body shape, this method has little practical value.
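The lookup step of the fingerprinting technique can be sketched as a nearest-neighbour search over the surveyed table. The radio map entries, the three-sensor layout and the Euclidean matching rule below are hypothetical choices for illustration; the source does not specify the matching metric.

```python
import math

# Hypothetical offline survey: capsule position (x, y, z) in mm mapped to
# the RSS (dBm) recorded at each of three body-mounted sensors.
RADIO_MAP = {
    (0.0, 0.0, 0.0): [-60.0, -72.0, -75.0],
    (50.0, 0.0, 0.0): [-66.0, -65.0, -74.0],
    (50.0, 50.0, 0.0): [-71.0, -66.0, -68.0],
}


def locate(online_rss):
    """Return the surveyed position whose stored RSS vector is closest
    (in Euclidean distance) to the online measurement."""
    return min(RADIO_MAP, key=lambda pos: math.dist(RADIO_MAP[pos], online_rss))


print(locate([-65.0, -66.0, -73.0]))
```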
2.3.2 ToA based techniques
For RF based localization, a widely known benefit of ToA based techniques is their high
accuracy compared to RSS based techniques [60,67]. The ToA based technique relies on
measurements of travel time of signals between the known reference nodes and unknown
mobile node. Ranging distance is calculated by multiplying the propagation velocity of
RF signal and the measured ToA value [60,68].
\[
d_i = c \times \tau_i \tag{2.2}
\]
However, since the human body is formed of tissues with different characteristics of
conductivity and relative permittivity, the RF signal propagates with various speed
through different organs [69]. These variations in the speed are the dominant source
of error for the ToA-based RF localization inside the human body. Also, in near-field
application, time-based methods are difficult because radio waves travel with a very high
required in order to obtain the position resolution of 0.3 m. Another geometric location
method, time difference of arrival (TDoA), does not have these disadvantages. All it
needs is a transmission that has a recognizable unambiguous starting point. The data
used in the location calculations is the time difference in the reception of that starting
point at the several reference nodes, and not the actual time of flight of the signal from
the target to the fixed sensors. But in order to have sufficient data to find the mobile’s
coordinates, TDoA requires one more reference node than ToA.
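To give a feeling for the numbers involved in ToA ranging, the sketch below evaluates Eq. 2.2 and the timing resolution needed for a given range resolution. The relative permittivity value and the low-loss approximation v = c / sqrt(eps_r) are assumptions used only to illustrate how a wrong speed assumption biases the range; they are not measurements from this work.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def toa_range_m(tau_s, velocity=C):
    """Eq. 2.2: ranging distance from a measured time of arrival."""
    return velocity * tau_s


def timing_resolution_s(range_resolution_m, velocity=C):
    """Timing resolution needed to resolve a given range difference."""
    return range_resolution_m / velocity


def tissue_velocity(rel_permittivity):
    """Approximate speed in a low-loss dielectric (assumed model)."""
    return C / rel_permittivity ** 0.5


# A range resolution of 0.3 m corresponds to roughly one nanosecond.
print(timing_resolution_s(0.3))

# Hypothetical tissue with eps_r = 50: the signal crosses 0.1 m of tissue,
# but the receiver converts the delay to range using the free-space speed.
tau = 0.1 / tissue_velocity(50.0)
print(toa_range_m(tau))
```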
2.3.3 Localization Algorithms
In every ranging based localization, the position of the mobile node is determined as
the intersection of spheres [70] whose centers are the coordinates of the reference
nodes and whose radii are the ranging distances $m_i$ between the reference nodes
$[x_i \; y_i \; z_i]^T$ and the target node $[x \; y \; z]^T$, where
\[
m_i^2 = (x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 \tag{2.3}
\]
Since the inside of the human body is a non-homogeneous environment, there is a difference
between the true distance and the ranging distance using ToA. Therefore, the spheres do
not always intersect at one single point. The goal of the localization algorithm is to find
out the best estimation of the target’s actual position based on the noisy measurements.
Two most commonly used optimal estimation algorithms are least square algorithm and
maximum likelihood algorithm. In the following subsections, we explained how these
2.3.3.1 least square algorithm
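In the least square (LS) algorithm [71,72], at least three reference nodes are needed to solve
the least square problem. Substituting
\[
x' = x - x_1, \quad y' = y - y_1, \quad z' = z - z_1 \tag{2.4}
\]
and
\[
x'_i = x_i - x_1 \quad (i = 2, 3) \tag{2.5}
\]
into Eq. 2.3 and subtracting the first one ($i = 1$) successively from it for $i = 2, 3$ results
in an equation set in the matrix form as
\[
\begin{bmatrix}
x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\
x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \\
\vdots & \vdots & \vdots \\
x_n - x_1 & y_n - y_1 & z_n - z_1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix}
m_1^2 - m_2^2 + k_2 - k_1 \\
m_1^2 - m_3^2 + k_3 - k_1 \\
\vdots \\
m_1^2 - m_n^2 + k_n - k_1
\end{bmatrix} \tag{2.6}
\]
where
\[
k_i = x_i^2 + y_i^2 + z_i^2 \tag{2.7}
\]
It can be denoted as
\[
2At = b \tag{2.8}
\]
where
\[
t = \begin{bmatrix} x & y & z \end{bmatrix}^T \tag{2.9}
\]
\[
A = \begin{bmatrix}
x_2 - x_1 & y_2 - y_1 & z_2 - z_1 \\
x_3 - x_1 & y_3 - y_1 & z_3 - z_1 \\
\vdots & \vdots & \vdots \\
x_n - x_1 & y_n - y_1 & z_n - z_1
\end{bmatrix} \tag{2.10}
\]
\[
b = \begin{bmatrix}
m_1^2 - m_2^2 + k_2 - k_1 \\
m_1^2 - m_3^2 + k_3 - k_1 \\
\vdots \\
m_1^2 - m_n^2 + k_n - k_1
\end{bmatrix} \tag{2.11}
\]
The solution can be obtained by using the least square method [72,73]:
\[
t = \frac{1}{2}(A^T A)^{-1} A^T b \tag{2.12}
\]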
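The closed-form solution of Eq. 2.12 can be sketched in a few lines. The sensor coordinates and target below are hypothetical, and the small Gaussian elimination helper is an implementation choice of this example rather than part of the method. Five reference nodes are used here so that A has three independent rows and the normal matrix is invertible for the three unknowns.

```python
import math


def solve3(M, v):
    """Solve the 3x3 system M t = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    t = [0.0] * 3
    for r in (2, 1, 0):
        s = sum(a[r][c] * t[c] for c in range(r + 1, 3))
        t[r] = (a[r][3] - s) / a[r][r]
    return t


def ls_position(sensors, ranges):
    """Eq. 2.12: t = 1/2 (A^T A)^-1 A^T b, with A and b as in Eqs. 2.10-2.11."""
    x1, y1, z1 = sensors[0]
    k1 = x1 * x1 + y1 * y1 + z1 * z1
    A, b = [], []
    for (xi, yi, zi), mi in zip(sensors[1:], ranges[1:]):
        ki = xi * xi + yi * yi + zi * zi
        A.append([xi - x1, yi - y1, zi - z1])
        b.append(ranges[0] ** 2 - mi ** 2 + ki - k1)
    AtA = [[sum(r[i] * r[j] for r in A) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * bi for r, bi in zip(A, b)) for i in range(3)]
    return [0.5 * v for v in solve3(AtA, Atb)]


# Hypothetical reference nodes (mm) and noise-free ranges to a known target.
sensors = [(0.0, 0.0, 0.0), (200.0, 0.0, 0.0), (0.0, 200.0, 0.0),
           (0.0, 0.0, 200.0), (200.0, 200.0, 200.0)]
target = (60.0, 80.0, 40.0)
ranges = [math.dist(target, s) for s in sensors]
estimate = ls_position(sensors, ranges)
print(estimate)
```

With noisy ranges the same call returns the least square compromise between the inconsistent spheres instead of an exact intersection.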
2.3.3.2 maximum likelihood algorithm
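This section describes how to use the maximum likelihood (ML) algorithm [74,75] for
localization. Assume the RSS measurement intensity for each sensor is
\[
R_i = \gamma_i \sum_{k=1}^{K} \frac{C_k}{|\rho_k - r_i|^{\alpha}} + \omega_i \tag{2.13}
\]
where $R_i$ is the $i$-th sample, $\gamma_i$ is the gain factor, $C_k$ is the intensity of the $k$-th contaminant
source, $\rho_k$ is the position of the $k$-th source, $r_i$ is the position of the mobile node, and $\omega_i$ is
the background noise.
Eq. 2.13 can also be expressed as
\[
R_i = \gamma_i \frac{C}{m_i^2} + \omega_i \tag{2.14}
\]
where $m_i$ is given by Eq. 2.3, which is the Euclidean distance between the mobile node
and the sensor nodes.
Let $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of $\omega_i$. Setting
$\xi_i = (\omega_i - \mu_i)/\sigma_i \sim N(0, 1)$, so that
$(R_i - \mu_i)/\sigma_i \sim N\left(\frac{\gamma_i}{\sigma_i}\frac{C}{m_i^2}, 1\right)$, we can define the following
matrix notation:
\[
Z = \left[ \frac{R_1 - \mu_1}{\sigma_1}, \frac{R_2 - \mu_2}{\sigma_2}, \ldots, \frac{R_N - \mu_N}{\sigma_N} \right]^T \tag{2.15}
\]
\[
G = \mathrm{diag}\left[ \frac{\gamma_1}{\sigma_1}, \frac{\gamma_2}{\sigma_2}, \ldots, \frac{\gamma_N}{\sigma_N} \right] \tag{2.16}
\]
\[
D = \left[ \frac{1}{m_1^2}, \frac{1}{m_2^2}, \ldots, \frac{1}{m_N^2} \right]^T \tag{2.17}
\]
\[
\xi = \left[ \xi_1, \xi_2, \ldots, \xi_N \right]^T \tag{2.18}
\]
We use the MLE method to estimate the location. The joint probability density function is
\[
f(Z|\theta) = (2\pi)^{-N/2} \exp\left\{ -\frac{1}{2}(Z - GDC)^T (Z - GDC) \right\} \tag{2.19}
\]
and its log-likelihood function is
\[
L(\theta) \sim -\frac{1}{2}\sum_{i=1}^{N}\left( Z_i - \frac{\gamma_i}{\sigma_i}\frac{C}{m_i^2} \right)^2
= -\frac{1}{2}\sum_{i=1}^{N}\left( \frac{R_i - \mu_i}{\sigma_i} - \frac{\gamma_i}{\sigma_i}\frac{C}{m_i^2} \right)^2 \tag{2.20}
\]
where $\theta$ is the estimated mobile position. Thus, we can obtain the maximum likelihood
mobile position by maximizing this function, which is equivalent to minimizing the sum of
squared residuals [76].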
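One simple way to evaluate the ML estimate is a brute-force search over candidate positions. The sketch below assumes known gains, a known source intensity C, hypothetical sensor positions and a coarse 10 mm grid; the grid search is a stand-in chosen for clarity, not the optimizer used in [76].

```python
def ml_cost(pos, sensors, readings, gains, source_c, mu, sigma):
    """Sum of squared normalized residuals; minimizing it maximizes Eq. 2.20."""
    cost = 0.0
    for s, r, g, mean_i, sd in zip(sensors, readings, gains, mu, sigma):
        m_sq = sum((p - q) ** 2 for p, q in zip(pos, s))  # m_i^2 from Eq. 2.3
        cost += ((r - mean_i) / sd - g * source_c / (sd * m_sq)) ** 2
    return cost


def ml_locate(sensors, readings, gains, source_c, mu, sigma, grid):
    """Return the candidate position with the smallest cost."""
    return min(grid, key=lambda pos: ml_cost(pos, sensors, readings, gains,
                                             source_c, mu, sigma))


# Hypothetical setup: four sensors (mm), unit gains, zero-mean unit-variance noise.
sensors = [(-50.0, -50.0, 0.0), (150.0, -50.0, 0.0),
           (-50.0, 150.0, 0.0), (50.0, 50.0, 150.0)]
gains, mu, sigma = [1.0] * 4, [0.0] * 4, [1.0] * 4
source_c = 1.0e4
true_pos = (30.0, 60.0, 20.0)
# Noise-free readings generated from Eq. 2.14.
readings = [g * source_c / sum((p - q) ** 2 for p, q in zip(true_pos, s))
            for g, s in zip(gains, sensors)]
grid = [(float(x), float(y), float(z))
        for x in range(0, 101, 10)
        for y in range(0, 101, 10)
        for z in range(0, 101, 10)]
estimate = ml_locate(sensors, readings, gains, source_c, mu, sigma, grid)
print(estimate)
```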
2.3.4 Cramer-Rao Lower Bound (CRLB)
Cramer-Rao lower bound (CRLB), named in honor of Harald Cramer [77] and
Calyampudi Radhakrishna Rao [78] who were among the first to derive it, expresses a lower
bound on the variance of estimators of a deterministic parameter. In the localization
literature [79,80], CRLB defines the lower bound on the precision of a localization that
one algorithm can reach. To calculate the CRLB for localization inside human body, we
define a performance evaluation scenario and models for the behavior of the localization
metrics mentioned above, the RSS and ToA, for RF signaling in between the GI tract
and the body-mounted sensors used for localization. In this section, we introduce a
general scenario for comparative performance evaluation of RSS and ToA based localization
for capsule endoscopy application. The scenario is designed to reflect the performance
in different organs, the path of movement of the WCE inside the small intestine, and
the number and pattern of installation of body mounted sensors on the torso. Since the
caused by the refraction at the boundary of organs and tissues inside the human body,
models for behavior of the RSS and ToA are fairly complicated.
Figure 2.4: A typical 3D pattern of body mounted sensors used as reference points
of the performance evaluation scenario for localization of the WCE
Consider the WCE whose location is indexed as 1 and $m$ body mounted receiver
sensors denoted with indexes $2, \ldots, m+1$, as shown in Figure 2.4. Each receiver
sensor $i$ is capable of measuring the ToA $\tau_i$ or RSS $r_i$ from the WCE. The observation
vector is $X = [\tau_2, \ldots, \tau_{m+1}]$ for the ToA case or $X = [r_2, \ldots, r_{m+1}]$ for the RSS case.
Assume the localization coordinate of the WCE is $\theta_1 = [x_1, y_1, z_1]$; our objective here is
to estimate the location of the WCE, $\hat{\theta}_1$. The $\tau_i$ observations are modeled as normal
random variables $f_{\tau_i|\theta_1,\theta_i} \sim N(d_{i,1}/\bar{v}, \sigma_T^2)$, where $d_{i,1}$ is the distance between the
WCE and receiver sensor $i$, $\bar{v}$ is the average propagation speed of the RF signal inside the
human GI tract, and $\sigma_T$ is the parameter describing the ToA ranging error caused
by human tissue non-homogeneity. The $r_i$ measurements are log-normally distributed,
$f_{r_i(dB)|\theta_1,\theta_i} \sim N(P_r(dB), \sigma_{sh}^2)$
at the reference distance from the WCE. $\alpha$ is the pathloss gradient and $\sigma_{sh}^2$ is the
variance of the log-normal shadowing.
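The CRLB of $\hat{\theta}_1$ is $\mathrm{cov}(\hat{\theta}_1) \geq I(\theta_1)^{-1}$, where $I(\theta_1)$ is the Fisher information matrix (FIM)
\[
I(\theta_1) = -E\,\nabla_{\theta_1}\left(\nabla_{\theta_1}\,\iota(X|\theta_1,\theta)\right) =
\begin{bmatrix}
I_{xx} & I_{xy} & I_{xz} \\
I_{xy} & I_{yy} & I_{yz} \\
I_{xz} & I_{yz} & I_{zz}
\end{bmatrix} \tag{2.21}
\]
where $\iota(X|\theta_1,\theta)$ is the logarithm of the joint conditional probability density function:
\[
\iota(X|\theta_1,\theta) = \sum_{i=2}^{m+1} \log f_{\tau_i|\theta_1,\theta_i} \quad (\text{for ToA}) \tag{2.22}
\]
\[
\iota(X|\theta_1,\theta) = \sum_{i=2}^{m+1} \log f_{r_i|\theta_1,\theta_i} \quad (\text{for RSS}) \tag{2.23}
\]
and
\[
I_{xx} = -\sum_{i=2}^{m+1} E\left[\frac{\partial^2 \log f_{\tau_i|\theta_1,\theta_i}}{\partial x_1^2}\right] \quad (\text{for ToA}) \tag{2.24}
\]
\[
I_{xx} = -\sum_{i=2}^{m+1} E\left[\frac{\partial^2 \log f_{r_i|\theta_1,\theta_i}}{\partial x_1^2}\right] \quad (\text{for RSS}) \tag{2.25}
\]
Similar expressions can be extended to $I_{yy}$, $I_{zz}$, $I_{xy}$, $I_{xz}$ and $I_{yz}$. The CRLB on the variance is
\[
\sigma_1^2 = \mathrm{tr}\{\mathrm{cov}_\theta(\hat{x}_1, \hat{y}_1, \hat{z}_1)\} = \min \mathrm{tr}(\mathrm{cov}(\hat{\theta}_1)) = \mathrm{tr}(I(\theta_1)^{-1})
= \frac{-I_{xx}(I_{yy} + I_{zz}) + I_{xy}I_{xy} + I_{xz}I_{xz} \ldots}{-I_{xx}I_{yy}I_{zz} + I_{xx}I_{yz}I_{yz} \ldots + I_{xz}I_{yy}I_{zz}} \tag{2.26}
\]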
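As a worked instance of Eq. 2.24, the entries of the FIM can be written out for the Gaussian ToA model stated above (mean $d_{i,1}/\bar{v}$, variance $\sigma_T^2$). The sketch below additionally assumes that the sensor observations are independent and that $\sigma_T$ does not depend on position; it is given for illustration and is not quoted from [87].

```latex
% Log-density of one ToA observation and the distance it depends on
\log f_{\tau_i|\theta_1,\theta_i}
  = -\frac{\left(\tau_i - d_{i,1}/\bar{v}\right)^2}{2\sigma_T^2} + \text{const},
\qquad
d_{i,1} = \sqrt{(x_1-x_i)^2 + (y_1-y_i)^2 + (z_1-z_i)^2}

% Gradient of the distance with respect to the capsule coordinate x_1
\frac{\partial d_{i,1}}{\partial x_1} = \frac{x_1 - x_i}{d_{i,1}}

% Taking the expectation in Eq. 2.24 removes the term proportional to
% (\tau_i - d_{i,1}/\bar{v}), which has zero mean, leaving
I_{xx} = \frac{1}{\bar{v}^2\sigma_T^2}\sum_{i=2}^{m+1}\frac{(x_1-x_i)^2}{d_{i,1}^2},
\qquad
I_{xy} = \frac{1}{\bar{v}^2\sigma_T^2}\sum_{i=2}^{m+1}\frac{(x_1-x_i)(y_1-y_i)}{d_{i,1}^2}
```

Under these assumptions the FIM depends only on the directions from the sensors to the capsule, which is why the installation pattern of the body mounted sensors affects the bound.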
2.4
Challenges of Localization inside Small Intestine
There are a number of fundamental multi-disciplinary scientific and technological
challenges facing the RF localization of the WCEs inside the human body. To design an
accurate localization system for inside human body we need to consider the following
[4]:
• Modeling of the Movements of WCE inside the GI Tract
The first challenge for meaningful analysis of RF localization inside the human
body is to use clinical databases and clinical procedures performed by GI
specialist, to model the movements of the endoscopy capsule inside the GI tract [81].
Previously acquired and stored databases of patients with approximately 55,000
images per patient could be examined for detection of landmarks or fixed points
such as the pylorus and the ileocecal valve [29, 82]. Using the location of these
landmarks, the number of images that observes the landmark, and the fact that
the images are taken at a rate of two frames /sec (recently released WCE can take
up to six frames / sec), we should design a model for the movements of the capsule
in the GI tract to be mapped into the hardware and visualization platform. In the
future, inertial sensing units that are small enough to be embedded in a pill size
the endoscopic capsule. This information could be used to enhance the movement
model provided by examining the images reported by the capsule. The improved
model for the movements of the WCE using inertial sensors would enhance the
RF localization result. The feedback controlled inertial sensors have been already
used to monitor the robotic end luminal system using magnetic field to efficiently
perform diagnostic and surgical medical procedures [33].
• In every localization technique, a map plays a very important role in
refining the localization results [18,19]. Existing literature [20] reported that a
clear street map is able to reduce the GPS localization error from tens of meters
to several meters in urban areas. In the case of localization inside the human body,
the “map” is even more important since everything that goes through the GI tract follows
the same route. Knowing a clear pattern of the intestinal tract will greatly enhance
the localization accuracy. Therefore, tracing the path of the intestinal tract is essential
to accurate capsule localization.
• Modeling of the Wideband RF Propagation from Inside the Human Body
The second challenge is to model the wideband characterization of the RF
propagation channel between an endoscopy capsule and body-mounted sensors [83,84].
We could use measurements inside phantoms and on the human subject’s body
surface to calibrate existing software simulation tools for direct solution of Maxwell’s
the waveforms observed by a body-mounted sensor used as a reference point for
localization or another endoscopy capsule inside the tract that could be used for
cooperative localization purposes. Finally, it should be possible to design models
localization techniques) as capsules travel along the GI tract, to be used by the
CPS for performance evaluation and visualization.
• Design of Complicated Algorithms for Localization inside the GI Tract
The third challenge would be the design and comparative performance evaluation
of alternative localization algorithms and discovery of methods for visualization
of the results. For this part one needs to consider the use of channel models for
spatial and temporal variation of the signal, the model for the track of physical
movement of the capsule inside digestive system, and landmarks detected from
video frames of the endoscopy capsule camera [41,63,85, 86]. In addition to the
RF localization features, we may expect that these algorithms could exploit the
knowledge of pattern of movements and the visual data observed by the camera
inside the tract. The Cramer-Rao lower bound (CRLB) for the performance of
basic RSS and ToA based localization algorithms for capsule endoscopy are already
available in the literature [87]. We can use these bounds as a guideline for the
expected performance of the designed algorithms [88].
• Security and Reliability Issues
One last challenge in RF localization for WCEs would be to examine and where
possible quantify the security, reliability, and privacy of implantable WCEs in
human bodies. Here, there is an impending need to understand and analyze radio
propagation of signals from WCEs outside the human body at larger distances
where they may (a) cause interference (accidental or malicious) to the localization
of WCEs and/or other devices inside a human body or (b) be recovered by more powerful
devices towards identifying existence of such WCEs in specific patients. The former
the latter impacts the privacy of patients and the medical procedures that may be
conducted on the patients.
Design of Algorithms for
Body-SLAM
Since the RF signal suffers from the noisy characteristics of the wireless channel and
multipath distortions, it is natural to resort to other techniques to improve the overall
performance of the localization system. One way to enhance the performance of RF
localization is to incorporate the motion information of the capsule by employing a data
fusion algorithm such as a Kalman filter [89] or a particle filter [43,90,91]. In our previous
work [53], we used both filters to integrate RSS-based Wi-Fi localization with
movement models from inertial sensors, including accelerometers and gyroscopes, for
cooperative robotic localization in indoor areas. The results were promising since this
method showed the potential to enhance the localization accuracy by combining data
from various sensor sources. However, as we mentioned previously, inertial sensors that
meet the accuracy requirement for WCE applications are too large to be embedded
inside a video capsule, and even if they can be embedded as assembly technology
improves, the cost of the capsule will increase dramatically. In the localization
literature, there has been a trend to extract motion parameters from consecutive image
sequences to improve the accuracy of RF localization, which is known as the visual based
Simultaneous Localization and Mapping (V-SLAM) algorithm [92–94]. In the WCE
application, since the endoscopic capsule continuously takes pictures at very short time
intervals (up to 6 frames/sec), it is possible to extract the motion information of the
capsule by processing the video stream captured by the embedded vision sensor. This
motion information can be used as an alternative to inertial sensors to smooth the RF
localization results and, at the same time, to reconstruct the trajectory that the capsule has
traveled, in the same manner as V-SLAM for indoor geolocation.
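The idea of smoothing RF position estimates with image-derived motion can be illustrated with a scalar Kalman filter. In this minimal sketch the state is the distance travelled along the tract, the prediction uses a speed assumed to come from the image sequence, and the update uses an RF position estimate projected onto the same path. The one-dimensional state, the noise variances q and r, and all numerical values are assumptions of this example, not the filter design evaluated in this work.

```python
def kalman_step(pos, var, speed, dt, rf_pos, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    pos, var : current position estimate and its variance
    speed    : speed estimated from consecutive images
    rf_pos   : position reported by RF localization
    q, r     : process and measurement noise variances (assumed known)
    """
    # Predict: advance along the path using the image-derived speed.
    pos_pred = pos + speed * dt
    var_pred = var + q
    # Update: blend in the RF measurement according to the Kalman gain.
    gain = var_pred / (var_pred + r)
    pos_new = pos_pred + gain * (rf_pos - pos_pred)
    var_new = (1.0 - gain) * var_pred
    return pos_new, var_new


# Hypothetical sequence: speeds (mm/s) from images, RF positions (mm).
pos, var = 0.0, 100.0
speeds = [2.0, 2.0, 1.0]
rf_positions = [3.5, 2.0, 6.0]
track = []
for v, z in zip(speeds, rf_positions):
    pos, var = kalman_step(pos, var, v, dt=0.5, rf_pos=z, q=1.0, r=25.0)
    track.append(pos)
print(track, var)
```

A large r relative to q makes the filter trust the motion model more, which is the regime of interest when the RF estimates are scattered.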
3.1
Formulation of Body-SLAM
For every location aware application, higher positioning accuracy can be achieved by
employing hybrid techniques which take advantage of data fusion of different sensors [89].
Since the only two data sources that come with the endoscopic capsule are the video stream
captured by the embedded vision sensor and the wireless signal received by the body mounted
RF sensors, an intuitive idea to enhance the localization accuracy of the capsule is
to combine the two [95]. As we mentioned before, the endoscopic capsule
continuously takes pictures at short time intervals (2 - 6 frames/sec) as it travels. Thus,
it is possible to obtain information such as how quickly the capsule moves and its
direction of movement in order to track the position of the capsule. In this chapter, we present a novel
motion tracking algorithm for the endoscopic capsule that analyzes the displacements of
unique portions of the scene, which are referred to as feature points (FPs), between
points matching, image unrolling and quantitative calculation of motion parameters.
Detailed procedures of each step are explained in the upcoming subsections.
[Figure: flowchart with the blocks: RF propagation simulation using Semcad X; Algorithm design for RF localization; Modeling the speed of WCE using endoscopic images; Algorithm design of Body-SLAM & performance evaluation; Map generation and visual testbed design]
3.2
Motion Tracking using Endoscopic Images
The movement of the endoscopic capsule is highly unpredictable. It may move fast or
slowly, rotate, or exhibit any combination of these movements. This complicated
movement of the wireless capsule introduces large errors into the localization since
the Received Signal Strength (RSS) varies a lot due to fast fading and sudden changes
of antenna gain caused by flipping and rotating. Thus, knowing how the capsule moves
will help us to better understand the radio propagation channel inside the human body and
therefore enhance the accuracy of the existing localization methods.
3.2.1 Analyzing the Content of Endoscopic Images
3.2.1.1 Image segmentation using SRM
To model the pattern of movements of the endoscopic capsule, we first need to categorize the
endoscopic images to get a conceptual idea of how the capsule moves [85]. Based on
our observation, the received endoscopic images can be divided into two basic
categories: “facing the tunnel” (FT) and “facing the lumen” (FL). Two sets of typical
FT and FL images are shown in Figure 3.2. FT is the case when the focal axis of the
camera is parallel to the central axis of the intestinal tube. The major feature of this set of
images is that there is always a black hole (we call it the “tunnel” here) somewhere in
the picture, representing the vanishing line of the intestinal tube. Through a sequence
of consecutive FT images, we can clearly see the capsule moving, propelled by the
intestinal motility. In contrast, FL is the case where the capsule tends to stop or
moves more slowly than in FT. The reason for this classification is that we are trying
to develop a geometric model for the FT images to quantitatively calculate the speed of
Figure 3.2: Two basic categories of endoscopic images (top: FT, bottom: FL)
Distinguishing the FT images from the FL images seems to be a fairly easy task for the
human eye; however, it has proved to be extremely difficult for machines.
Major sources of difficulty include the highly complicated shape of the scene, varying lighting
conditions and uncontrolled noise, including liquid and bubbles inside the GI tract [85].
Given a set of labeled images, finding what is common within each set and what
differs between sets can provide inductive clues for classifier design. Some
commonly used feature descriptors, such as Histogram of Oriented Gradients (HOG) and
Local Binary Pattern (LBP) [96], do not work well for our application since no
distinguishable difference can be found between the two image sets. Thus, in terms of image
representation, our approach is a region-based method. We used the Statistical Region
Merging (SRM) technique described in [97] to segment the original image into several
sub-regions, with each region representing an object. The basic idea of this technique is to
grow the major regions iteratively by combining smaller regions or pixels with
homogeneous properties. A typical example of a segmented FT image is shown in Figure 3.3, with
segmented regions shown in their representative colors. From Figure 3.3 we can clearly
see that after the segmentation, the image preserved the tunnel shape (the darkest
will effectively reduce the variance in the feature space.
Figure 3.3: Two sequences of segmented image with Q = 16 Top 2 rows: FT, Bottom
2 rows: FL
The region merging rule is the following:
\[
P(R, R') =
\begin{cases}
\text{Merge} & \text{if } |\bar{R} - \bar{R}'| \leq \sqrt{s^2(R, Q) + s^2(R', Q)} \\
\text{Not merge} & \text{otherwise}
\end{cases} \tag{3.1}
\]
where $\bar{R}$ is the average value of a certain color channel inside region $R$ and $s(R, Q)$ is a
threshold function whose value is controlled by $Q$. The detailed expression of $s(\cdot)$ can be
found in [97]. A good threshold strikes a balance between preserving the major
components of the scene and the risk of over-merging. The choice of $Q$ controls the coarseness
of the segmentation: a large $Q$ will keep more detailed regions while a small $Q$ tends to
merge the small regions. From the experimental point of view, we set the value of $Q$ to
After merging, pixels inside each isolated region share a common color expectation while
the expectations of adjacent regions differ in at least one color channel.
Then, we can extract features from the segmented images. To reach a good classification
performance, the feature selection must obey the following rule: choose features that
are more likely to appear in one set than in the other. This can be measured
by calculating the co-occurrence of similar instances from different sets with the same
label. Features that are more distinguishable may increase the precision of classification.
Nine features are selected to classify the images. They are the size of the darkest region
of the segmented image, length of the darkest region, length of the darkest region, RGB
value for the darkest region and RGB value for the remaining regions.
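The merging rule of Eq. 3.1 can be sketched as a predicate on two regions. The threshold function below is a simplified stand-in for s(R, Q) that only reproduces its qualitative dependence on Q and on the region size; the exact expression is the one given in [97]. The constants g and delta and the example regions are assumptions of this sketch.

```python
import math


def threshold(region_size, q, g=256.0, delta=1e-3):
    """Simplified stand-in for s(R, Q): shrinks as Q or the region grows."""
    return g * math.sqrt(math.log(2.0 / delta) / (2.0 * q * region_size))


def should_merge(mean_a, size_a, mean_b, size_b, q):
    """Eq. 3.1: merge when the channel means differ by no more than the bound."""
    bound = math.sqrt(threshold(size_a, q) ** 2 + threshold(size_b, q) ** 2)
    return abs(mean_a - mean_b) <= bound


# Two hypothetical 50-pixel regions whose channel means differ by 30 levels.
print(should_merge(100.0, 50, 130.0, 50, q=1))    # small Q: coarse segmentation
print(should_merge(100.0, 50, 130.0, 50, q=256))  # large Q: finer segmentation
```

The two calls illustrate the role of Q described above: the same pair of regions is merged under a small Q and kept separate under a large Q.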
3.2.1.2 Image classification using SVM classifier
After feature extraction, the segmented images are classified using a Kernel Support
Vector Machine (K-SVM). The training data set is labeled in the format
$(x_i, y_i)$, where $x_i$ is an $n \times 1$ feature vector whose elements are
the features extracted in the previous section (since we extracted 9
features to represent an image, here $n = 9$), and $y_i \in \{+1, -1\}$ is the label of the image.
If the endoscopic image is FT, $y_i = +1$; otherwise, $y_i = -1$. Suppose we have a
hyperplane which separates the positive from the negative examples; the points $x$ that
lie on the hyperplane must satisfy
\[
w^T x + b = 0 \tag{3.2}
\]
where $w$ is a weight vector with the same dimension as $x$ and $b$ is a bias term. The distance
from a sample $x_i$ to the hyperplane, which is called the “geometric margin”, can be expressed as follows:
\[
\frac{|w^T x_i + b|}{\|w\|} \tag{3.3}
\]
Since the hyperplane expressed by Eq. 3.2 is unchanged when $w$ and $b$ are scaled by a
common constant, we can add a normalization constraint to this expression:
\[
\min_i |w^T x_i + b| = 1 \tag{3.4}
\]
Then, the optimal solution is the boundary that maximizes the minimum distance
expressed by Eq. 3.3. Under the constraint of Eq. 3.4, this reduces to maximization of
$1/\|w\|$.
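The relation between Eqs. 3.3 and 3.4 can be checked numerically: after rescaling w and b so that the smallest value of |w^T x_i + b| is 1, the smallest geometric margin equals 1/||w||. The two-dimensional samples and the initial hyperplane below are hypothetical, and the sketch assumes that no sample lies exactly on the hyperplane.

```python
import math


def geometric_margin(w, b, x):
    """Eq. 3.3: distance from sample x to the hyperplane w^T x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))


def normalize(w, b, samples):
    """Rescale (w, b) so that min_i |w^T x_i + b| = 1, as in Eq. 3.4."""
    scale = min(abs(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in samples)
    return [wi / scale for wi in w], b / scale


samples = [(3.0, 4.0), (0.0, 0.0), (5.0, 5.0)]
w, b = normalize([3.0, 4.0], -5.0, samples)
smallest = min(geometric_margin(w, b, x) for x in samples)
inv_norm = 1.0 / math.sqrt(sum(wi * wi for wi in w))
print(smallest, inv_norm)
```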
The above equations are only applicable to the linearly separable case. However, for our
application, since the contents of endoscopic images from different sets sometimes share
similar features, the two sets of images are not always linearly separable. In other words,
a hyperplane that can perfectly classify the two image sets does not exist. Thus, we
need an approach that is able to produce nonlinear boundaries. Kernel mapping [98] is a
technique used to separate data sets that are not linearly separable. The basic concept of the
kernel method is to map the vector $x_i$ to a higher dimensional space (possibly infinite
dimensional) and train the SVM in this higher dimensional space. Figure 3.4 shows a one
dimensional example of kernel mapping. The transformed space should satisfy that the
distance is defined in the transformed space and the distance has a relationship to the
Figure 3.4: Illustration of feature mapping using Kernel function
\[
x \in \mathbb{R}^n \xrightarrow{\text{mapping}} \phi(x) \in \mathbb{R}^m \tag{3.5}
\]
where $\phi$ is the mapping function. Since $\phi(x)$ is very high dimensional, it is not
easy to work with $\phi(x)$ explicitly. If we measure the margin by the kernel function
and perform the optimization, a nonlinear boundary can be obtained.
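A one-dimensional toy case shows why the mapping helps. In the sketch below the positive samples sit between the negative ones, so no single threshold on x separates them, while the hypothetical map phi(x) = (x, x^2) makes them separable by a straight line in the mapped space. The samples, the map and the hand-picked weights are all assumptions of this example; they are not the kernel or the trained classifier used in this work.

```python
def phi(x):
    """Hypothetical feature map from R^1 to R^2."""
    return (x, x * x)


# Toy labeled samples: +1 in the middle, -1 on both sides.
samples = [(-3.0, -1), (-2.0, -1), (-0.5, +1), (0.0, +1), (0.7, +1), (2.5, -1)]


def classify(x, w=(0.0, -1.0), b=1.5):
    """Linear decision rule w^T phi(x) + b evaluated in the mapped space."""
    fx = phi(x)
    score = w[0] * fx[0] + w[1] * fx[1] + b
    return +1 if score >= 0 else -1


def separable_by_threshold(data):
    """Check whether any single cut on the x axis separates the two labels."""
    ordered = [label for _, label in sorted(data)]
    changes = sum(1 for a, c in zip(ordered, ordered[1:]) if a != c)
    return changes <= 1


print(separable_by_threshold(samples))
print(all(classify(x) == y for x, y in samples))
```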
3.2.2 Feature Points Matching
For the FT images, the translation of the endoscopic capsule inside the small intestine
can be modeled as a tiny camera passing through an elastic cylindrical tube, as shown
in Fig. 3.5. Since the WCE continuously takes pictures at a rate of up to 6 frames/sec,
common portions of the scene may be present in consecutive images [34]. These
portions of the images are called “feature points” (FP). The pattern and magnitudes of
the displacements of these feature points can be used as a hint to reveal the speed of
the endoscopic capsule.
To make an accurate estimation of the capsule’s speed, it is very important that the
FPs extracted from the reference (first) frame can be accurately located in the following
The Affine Scale-invariant Feature Transform (ASIFT) [99–101], defined by the affine
camera model in Eq. 3.6, is well suited as a matching tool for the WCE images due to its
robustness to viewpoint changes, blur, noise and spatial deformations.
\[
A = H_\lambda R_1(\Psi) T_t R_2(\Phi) = \lambda
\begin{bmatrix} \cos\Psi & -\sin\Psi \\ \sin\Psi & \cos\Psi \end{bmatrix}
\begin{bmatrix} t & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \cos\Phi & -\sin\Phi \\ \sin\Phi & \cos\Phi \end{bmatrix} \tag{3.6}
\]
where $R$ represents rotation and $T$ represents tilt. $\Psi$ is the rotation angle of the camera around
the optical axis. $\Phi$ is the longitude angle between the optical axis and a fixed vertical plane. $\lambda$ is
the zoom parameter. The detailed procedure of FP matching using ASIFT can be found in
[100]. An example of feature point matching is given in Fig. 3.6(a), in which blue “o”
markers represent the coordinates of detected FPs in the reference frame and red “o” markers represent the
coordinates of matched FPs in the second frame. If we connect the corresponding FP
pairs on the same frame (as shown in Fig. 3.6(b)), a set of motion vectors is
generated, representing the displacements of FPs between frames.
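Once FP pairs have been matched, forming the motion vectors is a simple subtraction of pixel coordinates. The sketch below takes already matched coordinates as input; the coordinate values are hypothetical and the matching step itself (ASIFT) is not reproduced here.

```python
import math


def motion_vectors(ref_pts, matched_pts):
    """Displacement (dx, dy) of each FP between the reference and next frame."""
    return [(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(ref_pts, matched_pts)]


def mean_displacement(vectors):
    """Average magnitude of the motion vectors, in pixels."""
    return sum(math.hypot(dx, dy) for dx, dy in vectors) / len(vectors)


# Hypothetical matched FP coordinates (pixels) in two consecutive frames.
ref_pts = [(10.0, 10.0), (20.0, 5.0)]
matched_pts = [(13.0, 14.0), (26.0, 13.0)]
vectors = motion_vectors(ref_pts, matched_pts)
print(vectors, mean_displacement(vectors))
```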
[Figure 3.5 labels: intestinal wall; Wireless Capsule Endoscope (WCE); LED light; lens; field of view]
(a) Corresponding feature points between two consecutive frames
(b) Formation of motion vectors
Figure 3.6: Feature matching between two consecutive images using A-SIFT
3.2.3 Image Unrolling
To standardize the displacement of each FP pair and facilitate the quantitative calculation
of motion parameters that are useful for localization, we need to perform an inverse
cylindrical projection [102] (also referred to as “image unrolling” in [63, 103]) to project
the original cylindrical image onto a flattened coordinate system, which we call the
“unrolled” image domain. As shown in Fig. 3.7, given a point $P$ at distance $d$ away from
the camera, the angular depth of $P$ is defined as:
Figure 3.7: Image acquisition system of WCE.
\[
\theta = \tan^{-1}\left(\frac{R}{d}\right) \tag{3.7}
\]
where $R$ represents the radius of the intestinal tube. It can be seen from Eq. 3.7 that a
smaller angular depth indicates a larger distance away from the camera. To facilitate the
derivation of angular depth, we map the coordinate $(x, y)$ of any point on the cylindrical
image plane to the unrolled image plane $(x', y')$ by:
\[
x' = \frac{L\phi}{2\pi}, \qquad y' = r \tag{3.8}
\]
where $\phi$ is the angle between point $P$ and the horizontal axis in the cylindrical image
plane (shown in Fig. 3.8(a)):
\[
\phi = \tan^{-1}\left(\frac{y - y_0}{x - x_0}\right) \tag{3.9}
\]
and $r$ is the radius of the circular ring associated with point $P$, which can be calculated by:
\[
r = \sqrt{(x - x_0)^2 + (y - y_0)^2}. \tag{3.10}
\]
$L$ and $H$ are the length and height of the unrolled image plane, respectively. Fig. 3.8
illustrates the procedure of image unrolling.
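The mapping of Eqs. 3.8 to 3.10 can be sketched for a single pixel as follows. The sketch assumes that (x0, y0) is the center of the circular rings in the cylindrical image and uses the two-argument arctangent, wrapped to [0, 2π), so that the angle is resolved over the full circle; both are assumptions of this example, as are the numerical values.

```python
import math


def unroll(x, y, x0, y0, length):
    """Map a pixel (x, y) of the cylindrical image to (x', y') (Eqs. 3.8-3.10).

    (x0, y0) is assumed to be the center of the circular rings and
    `length` is L, the length of the unrolled image plane.
    """
    # Eq. 3.9, with atan2 used to cover all four quadrants.
    phi = math.atan2(y - y0, x - x0) % (2.0 * math.pi)
    # Eq. 3.10: radius of the ring through the point.
    r = math.hypot(x - x0, y - y0)
    # Eq. 3.8: angle along x', ring radius along y'.
    return length * phi / (2.0 * math.pi), r


# Hypothetical 360-pixel-long unrolled plane with ring center at (100, 100).
print(unroll(100.0, 150.0, 100.0, 100.0, 360.0))
print(unroll(100.0, 50.0, 100.0, 100.0, 360.0))
```

Points on the same circular ring share the same y' and differ only in x', which is what stacks the rings into horizontal rows in the unrolled plane.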
In this unrolled image plane, the $x'$ axis represents the angle $\phi$ in radians, whose value ranges
from 0 (when $x' = 0$) to $2\pi$ (when $x' = L$). The $y'$ axis represents the angular depth, which reflects
the distance away from the camera: $y' = 0$ represents an angular depth of 0 and $y' = H$
gives the maximum field of view $\eta$ of the camera. As can be seen in Fig. 3.8(a), after
the mapping, the circular rings in the cylindrical image plane are stacked up vertically
in the unrolled image plane. Under this new coordinate system, the angular depth of
\[
\theta \cong \frac{y'}{H}\eta \tag{3.11}
\]
The angular depth obtained from Eq. 3.11 would facilitate the