Original citation:
Ng, Chee Un and Martin, Graham R. (2001) Content-description interfaces for medical
imaging. University of Warwick. Department of Computer Science. (Department of
Computer Science Research Report). CS-RR-384
Permanent WRAP url:
http://wrap.warwick.ac.uk/61192
Copyright and reuse:
The Warwick Research Archive Portal (WRAP) makes this work by researchers of the
University of Warwick available open access under the following conditions. Copyright ©
and all moral rights to the version of the paper presented here belong to the individual
author(s) and/or other copyright owners. To the extent reasonable and practicable the
material made available in WRAP has been checked for eligibility before being made
available.
Copies of full items can be used for personal research or study, educational, or
not-for-profit purposes without prior permission or charge. Provided that the authors, title and
full bibliographic details are credited, a hyperlink and/or URL is given for the original
metadata page and the content is not changed in any way.
A note on versions:
The version presented in WRAP is the published version or, version of record, and may
be cited as it appears here. For more information, please contact the WRAP Team.
Content-Description Interfaces for Medical Imaging
Chee Un Ng and Graham R. Martin
Department of Computer Science,
University of Warwick
Coventry CV4 7AL
August 20, 2001
Abstract
This technical report presents an introduction to content-based
information retrieval (CBIR) in the domain of medical imaging. CBIR has
been a very actively researched area in recent years; however, utilising
it in the healthcare community is still relatively new and unexplored.
This report provides a survey of current CBIR research, with special
emphasis on medical imaging. Research has also been done in the [...]
1 Introduction
2 Content Description
2.1 Multimedia Content Description Interface
2.1.1 Descriptor
2.1.2 Description Scheme
3 Comparing Medical Images
3.1 Curvature Scale Space Description for Shape Representation
3.2 Matching of Scale Space Images
3.3 Creation of Contour Images
3.4 Examples
3.4.1 Creation of a closed contour
3.4.2 Creation of a CSS Descriptor
3.4.3 Matching Result
List of Figures
1 Overview of the proposed system
2 Contour capture steps using the window image
3 Descriptor creation steps using the brain image
List of Tables
1 Descriptor Binary Representation Syntax
2 Images of the fishes Part 1
3 Images of the fishes Part 2
4 Comparison to the original images
1 Introduction
The recent information explosion in multimedia content, particularly visual
information, has led to massive demand for multimedia data storage. The
same situation has arisen in the medical imaging field, creating a need
for efficient visual information management. Probably millions of medical
images are captured and created daily, and finding a particular image with
some degree of similarity proves to be very difficult.
To address the above issues, content-based retrieval has been proposed.
It is an important alternative and complement to traditional keyword-based
searching for multimedia data and can greatly enhance the accuracy of the
information being returned. Currently, most search engines are totally or
mostly based on keyword search, and content-based retrieval is actively
being investigated in various image processing and multimedia laboratories.
A key development in content-based systems is a new international
standardisation work item called "Multimedia Content Description
Interface", referred to as MPEG-7 [1]. MPEG-7 specifies a standard set of
'descriptors' to describe various features within multimedia information.
In addition, a Description Definition Language (DDL) is being developed to
specify Description Schemes (DS), hierarchical sets of descriptions
defining multimedia objects. In our project, we aim to adopt the MPEG-7
standard methodology for our content description interface system, as it
would greatly increase the chances of universal information sharing and
exchange.
There have been some preliminary successes with the use of content-based
information retrieval (CBIR) in the multimedia industry, particularly in
the fields of broadcasting and entertainment. In medical imaging, pure
content-based retrieval without sensible human interaction or feedback will
probably return thousands of (or no) results, and hence, further
elaboration by the user is encouraged.
This leads to another actively researched subject, which is to include a
human as an integral part of the feedback loop. This theory is opposed to
the fully automatic theory of computer vision pattern recognition.
However, a human should only take part in the process when it is necessary,
and minimising the interaction of the human is highly desired. This fits
with the theory that a human is always an indispensable part of an image
retrieval system. In fact, this research trend has already been reflected
in a number of content-based image retrieval systems. For example, a team
of MIT researchers moved from the "automated" Photobook to "interactive"
[...] proposed a "Relevance Feedback" architecture [4] for image retrieval,
where human and computer could interact with each other to improve the
retrieval performance. In [4], Relevance Feedback "is the process of
automatically adjusting an existing query using information fed back by the
user about the relevance of previously retrieved documents." Experiments in
MARS showed that retrieval performance can be improved considerably by
using Relevance Feedback.
It would benefit both image retrieval research and the medical industry if
a web-based system were available. This approach has been taken by
WebMIRS [5], which is a web-based medical information retrieval system.
Below is a brief description of several selected content-based image
retrieval systems.
Query By Image Content (QBIC) [6] is the earliest commercial content-based
system. QBIC supports queries based on example images, user sketches,
drawings, colour, etc.
Another system, "Virage", was developed by Virage Inc. [7]. It is slightly
more powerful than QBIC as it supports combination queries. For example,
users can request that query-by-example carries half of the weighting,
while sketching and colour determination each carry a quarter of the
weighting.
MARS (Multimedia Analysis and Retrieval System) [8] is another
content-based image retrieval system. The features of MARS are the
integration of Database Management System (DBMS) and Information Retrieval
(IR), and the integration of computer (automatic) and human (manual)
feedback.
Now, we will identify the main aims on which the research effort is focused.
1. To enable clinicians to search for medical image data effortlessly and
efficiently by developing an intelligent medical image search and
retrieval system. To achieve this, an efficient image feature analysis
engine should be developed, and a hierarchical content description and
indexing technique should be investigated. Such a technique will
enable the retrieval of images which share certain characteristics. Lastly,
an intelligent query interface which incorporates user feedback should
be developed.
2. To demonstrate and evaluate the proposed content description interface
using a variety of medical images, in a networked environment.
[...]ment of MPEG-7 and utilisation of the standardised format should be
a necessity. The possibility of heterogeneous compatibility should be
exploited too, which may include developing a web-based system to
encourage information exchange.
Figure 1 shows the overall concept of the proposed content-based medical
image information system. The system is partitioned into two sections. The
first section is concerned with the storage of medical information as well
as medical images. The other is about the effective search and retrieval of
information. There is also a central repository which stores all relevant
data required for efficient storage, search and retrieval. It contains
information such as the Description Definition Language, the Description
Schemes' structure, the Descriptors' syntax, as well as the stored medical
information and images themselves.
A graphical user interface will be the medium by which the clinician
interacts with the system. Such a system will ease the user's understanding
of the system by returning appropriate instructions (feedback) to gain
increased relevance and specific results. Such results can again trigger
another round of feedback from the user, if the user suspects that closer
results may be obtained. This process is slightly similar to the "training"
of a perceptron in a neural network system. Hence, training the system to
be more intelligent by automatically returning more favourable results can
be investigated in future.
During the storage of medical images and information, the clinician will
interact with the system to specify what kind of information should be
recorded and extracted from the image. The image will then undergo feature
extraction and image pre-processing. The extracted features will be used to
generate certain descriptions of the image content, which can be used by
the search and retrieval mechanism. The description will then be encoded
into binary form for storage.
When a clinician requires images which contain certain features or
artifacts, he/she can request the system to search the database for images
which fit his/her descriptions. The search results will then be displayed.
The clinician can refine the search, or give comment (feedback) to the
system, which causes the system to perform further searches as appropriate.
This will be repeated until the clinician is satisfied with the result, or
until the system is unable to respond further.
[Figure 1: Overview of the proposed system. Storage path: medical image and
clinician text input, descriptor generation interface, feature extraction
and image processing, MPEG-7 and non-MPEG-7 description generation, MPEG-7
and non-MPEG-7 encoders. Retrieval path: query by text/sketch/example, user
feedback after given results, search engine (filtering, indexing,
searching), MPEG-7 and non-MPEG-7 decoders, result image with possible
result description/thumbnail. Central database: Description Definition
Language (DDL), Description Scheme (DS), Descriptor (D), coded descriptions
and original image. Extension labels: new medical content description, new
D/DS, new feature extraction and image processing.]
[...] to the central repository, which in turn may necessitate new feature
extraction methods for effective description generation.
This report details work on shape-based image content description and
retrieval. We have used the methodology adopted by MPEG-7, using a
Contour-Based Shape descriptor based on the Curvature Scale Space
representation [9] of the contour. Shape-based image description is chosen
because shape representation is important in some medical images. This will
act as a starting point to develop an effective and intelligent medical
imaging search and retrieval system.
Also, as it is recommended by the MPEG-7 committee, this will enable
data to be read by systems which are MPEG-7 compatible, and easily
converted to other standards.
The organisation of this report is as follows. A brief introduction to
current research in medical imaging search and retrieval systems has been
given in this section. The next section presents a literature review of
various content description methods, with further elaboration on MPEG-7.
Section 3 shows the research work that has been done on shape-based image
retrieval. In the last section of this report, conclusions on the work that
has been done are drawn and future research is addressed.
2 Content Description
Traditionally, text-based methods have been used for the description and
searching of predominantly alphanumeric information. However, it is known
that for multimedia information, text is not enough to describe the rich
content of the data. Hence, content-based image retrieval systems have been
proposed [10] [11] [12]. Content-based image retrieval differs from
traditional text-based image retrieval in that information is indexed by
visual content. For example, an image is indexed by colour, texture, etc.
Also, content-based image retrieval should offer an intelligent way of
invoking the right features (e.g. colour) associated with the images to
assist retrieval. The development of MPEG-7 aims to standardise the content
description approach, which includes grouping and defining sets of standard
features (known as descriptors) which can be used to describe a wide
variety of multimedia content, including images.
Coded content description can be considered as a form of Metadata, a
[...] of Metadata standards in use, each defined for specific purposes. The
more popular standards include the following [13]:
The Dublin Core Metadata Initiative (http://purl.org/dc/) has defined
a metadata element set to facilitate the discovery of electronic resources.
Dublin Core's metadata, or descriptor, is used for fast information
retrieval and search operations. It also links to the Resource Description
Framework (RDF) (http://www.w3.org/RDF). This standard supports a number of
description communities, and is especially successful in digital libraries.
This has led to content description (i.e. metadata) being widely accepted
in the library community.
Another standard has been developed by the TV Anytime Forum
(http://www.tv-anytime.org/). The standard is specifically developed to
enable audio-visual and related services which are based on mass-market,
high-volume digital storage.
The Society of Motion Picture and Television Engineers (SMPTE) has
also developed a standard known as The Dynamic Metadata Dictionary -
Unique Material Identifiers (UMIDs) [14].
A standard developed specifically for use in the medical community is
known as Digital Imaging and Communications in Medicine (DICOM). It
defines the protocols and mechanisms to manage and transfer medical data,
primarily in the context of radiology. It merges the patient information
together with the medical image data into a format known as the DICOM
image.
The DICOM standard is very specific to the healthcare community and
retains an iconic (simple picture) data representation. An example of a
wider interchange standard is the PAPYRUS format, which is used with the
OSIRIS display and manipulation platform at the University Hospital of
Geneva. The system made a move to open systems by using widely available
computer industry standards, such as SQL-based distributed databases and
TCP/IP networking. Another similar approach is being used in the Web-based
Medical Information Retrieval System (WebMIRS) [8].
This shows that the current research direction is towards a more open
medical imaging architecture. So naturally, the current effort of MPEG to
establish a new content description standard known formally as "Multimedia
Content Description Interface" (MPEG-7) is in alignment with the research
objective.
[...]visual objects and artifacts, perhaps even over heterogeneous systems,
which support the MPEG-7 standard.
2.1 Multimedia Content Description Interface
MPEG-7 [15] is an ISO/IEC standard developed by the Moving Picture Experts
Group (ISO/IEC JTC1/SC29 WG11). The Overview of the MPEG-7 Standard states
that MPEG-7 "... aims to create a standard for describing the multimedia
content data that will support some degree of interpretation of the
information's meaning, which can be accessed by or passed onto a device or
a computer code." Also, MPEG-7 is not aimed at any one application, but
aims to support as broad a range of applications as possible.
MPEG-7 needs to provide a flexible and extensible framework for describing
audio-visual data. Therefore, it defines a set of methods and tools for
the different steps of multimedia description. Standardisation will apply
to the following components:
1. Descriptors
2. Description Schemes (DS)
3. Description Definition Language (DDL)
4. Methods to encode descriptions
MPEG-7 systems will include tools that are needed to prepare MPEG-7
Descriptions for efficient transport and storage, and to allow
synchronisation between content and descriptions. However, at this stage,
such tools are still being developed, and the approach we have taken is to
develop our own tools.
The Description Definition Language (DDL) is defined as "a language that
allows the creation of new Description Schemes and, possibly, Descriptors.
It also allows the extension and modification of existing Description
Schemes."
The DDL will be based on the XML Schema Language, but will not be
limited to it, as the DDL is required to cater for a huge range of
applications and current standards.
For a more comprehensive definition and introduction to MPEG-7 [...]
2.1.1 Descriptor
A Descriptor is defined as a representation of a Feature. "A Descriptor
defines the syntax and the semantics of the Feature representation. A
Feature is a distinctive characteristic of the data which signifies
something to somebody.", according to the MPEG committee.
A Descriptor allows an evaluation of the corresponding feature via
the descriptor value. A Descriptor may contain more than one value, and
all or part of them can be used to evaluate a corresponding feature. Also,
a single feature can have several Descriptors to describe it, for different
requirements. An example is texture, where the Luminance Edge Histogram
and Homogeneous Texture Descriptors can be used.
For medical imaging, a few Descriptors have been chosen to be studied
for their suitability. The Contour-Based Shape Descriptor has been studied
and found suitable. More information about the Contour-Based Shape
Descriptor is given in Section 3.1. The texture browsing, edge histogram
and region locator Descriptors will be investigated in the near future.
2.1.2 Description Scheme
A Description Scheme is defined as: "A Description Scheme (DS) specifies
the structure and semantics of the relationships between its components,
which may be both Descriptors and Description Schemes."
Simply speaking, a Description Scheme is used to group individual
Descriptors, or even other DSs, to form a systematic, semantic tree
structure of information about a piece of information, such as an image.
The distinction between a Description Scheme and a Descriptor is that
a Descriptor can only contain basic data types, or Descriptor Values, and
it does not refer to another Descriptor or DS.
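
The tree relationship between Description Schemes and Descriptors can be sketched as a small data structure. This is only an illustrative sketch, not MPEG-7 DDL: the class names, the `descriptors` helper and the example values are all hypothetical, chosen to mirror the wording above (a Descriptor holds only basic values; a DS groups Descriptors and other DSs).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Descriptor:
    """Leaf node: a named feature holding only basic values."""
    name: str
    values: List[float]


@dataclass
class DescriptionScheme:
    """Inner node: groups Descriptors and/or other Description Schemes."""
    name: str
    children: List[Union["DescriptionScheme", Descriptor]] = field(default_factory=list)

    def descriptors(self) -> List[Descriptor]:
        """Collect every Descriptor in the tree, depth first."""
        found: List[Descriptor] = []
        for child in self.children:
            if isinstance(child, Descriptor):
                found.append(child)
            else:
                found.extend(child.descriptors())
        return found


# Hypothetical tree for one image; names and values are illustrative only.
region = DescriptionScheme("StillRegion", [Descriptor("ContourShape", [20, 5])])
image = DescriptionScheme("MedicalImage", [region, Descriptor("EdgeHistogram", [0.1, 0.3])])
names = [d.name for d in image.descriptors()]
```

A search engine could walk such a tree once per stored image and compare only the Descriptors that the query actually uses.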
An example of a Description Scheme is the StillRegion DS, which is part
of the standard DSs in MPEG-7 [16]. This is a Segment DS, derived from the
generic Segment DS, which describes specific types of audio-visual segments.
Other segment DSs are the Video Segment DS, Mosaic DS, Moving Region
DS, Video Text DS, Audio Segment DS and Audio Visual Segment DS.
The StillRegion DS then has various Descriptors to describe a still region
segment of an image. Some Descriptors are EdgeHistogram, T[...]
[...] existing DS in the MPEG-7 standard is not sufficient for medical image
retrieval. A new DS should be formulated for the purpose of medical images,
as medical images are semantically much more distinct than generic images.
This will be investigated after all relevant Descriptors have been explored.
3 Comparing Medical Images
To compare medical images, we have decided to implement a shape
description technique. Shape description is an important issue in object
recognition as it is used to measure geometric attributes of an object,
which is an important feature for various medical images.
An overview of various shape description techniques is provided in [17].
Shape description techniques can be classified as Boundary Based Methods
or Region Based Methods. They are then further categorised into Transform
Domain and Spatial Domain, while Spatial Domain is again categorised into
Partial (Occlusion) and Complete (No Occlusion). The most successful shape
descriptors are Fourier Descriptors [18] and Moment Invariants [19].
However, the authors of [17] did not mention Curvature Scale Space; hence,
we suggest that Curvature Scale Space be considered a Boundary Based
Method in the Transform Domain. This is because the boundary is
transformed into a Scale Space representation, although it uses the
curvature as the basic measurement.
3.1 Curvature Scale Space Description for Shape Representation
The Curvature Scale Space technique is a very fast and reliable method for
shape similarity retrieval in large image databases. It is also very
robust with respect to noise, scale and orientation changes of the object.
The Curvature Scale Space (CSS) representation of a contour is the
technique recommended by MPEG-7 for contour shape similarity matching.
Filtering based on shape contour also targets query-by-example [20]. The
representation of contour shape is very compact, less than 14 bytes in size
on average.
To create a CSS description for a contour shape following the MPEG-7
[...] contour and we have:
$$\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \qquad (3.1)$$
which is the set of individual points giving the x and y coordinates
respectively. The points need to be resampled as equidistant points. After
that, the x and y coordinates must be parameterised by the curve arc-length
parameter u, where u is normalised to take values from 0 to 1. We then have
to construct the functions X(u) and Y(u) from the x and y coordinates.
The next step is to select a good value of N, the number of equidistant
points to which the functions X(u) and Y(u) will be resampled. From [20],
it is understood that N = 256 is usually sufficient for typical image/video
applications. Given N, we can find the values of the two functions by using
linear interpolation. We name the resampled functions x(j) and y(j), where
j runs from 0 to 255 (if N = 256). x(j) and y(j) then repeatedly undergo a
low-pass filter operation which performs a convolution with the normative
(0.25, 0.5, 0.25) kernel.
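
The resampling and smoothing steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the normative MPEG-7 procedure: the function names are hypothetical, the contour is assumed to be a closed polygon given as a list of (x, y) tuples, and the kernel is applied with wrap-around because the contour is closed.

```python
import math


def resample_closed(points, n):
    """Resample a closed polygonal contour to n points equally spaced in arc length."""
    closed = points + [points[0]]
    seg = [math.dist(closed[i], closed[i + 1]) for i in range(len(points))]
    total = sum(seg)
    out, i, start = [], 0, 0.0           # start = arc length at the start of segment i
    for j in range(n):
        target = total * j / n           # u = j / n, normalised to [0, 1)
        while i < len(seg) - 1 and start + seg[i] <= target:
            start += seg[i]
            i += 1
        t = (target - start) / seg[i] if seg[i] > 0 else 0.0
        (x0, y0), (x1, y1) = closed[i], closed[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))   # linear interpolation
    return out


def smooth_once(values):
    """One pass of the (0.25, 0.5, 0.25) kernel with wrap-around (closed contour)."""
    n = len(values)
    return [0.25 * values[j - 1] + 0.5 * values[j] + 0.25 * values[(j + 1) % n]
            for j in range(n)]
```

In use, `smooth_once` would be applied separately to the x and y coordinate lists, once per filter pass k, with the curvature examined after each pass.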
The curvature of the filtered curve is
$$K(j,k) = \frac{X_u(j,k)\,Y_{uu}(j,k) - X_{uu}(j,k)\,Y_u(j,k)}{\left(X_u(j,k)^2 + Y_u(j,k)^2\right)^{3/2}} \qquad (3.2)$$
where $X_u(j,k) = X(j,k) - X(j-1,k)$, $X_{uu}(j,k) = X_u(j,k) - X_u(j-1,k)$,
$Y_u(j,k) = Y(j,k) - Y(j-1,k)$ and $Y_{uu}(j,k) = Y_u(j,k) - Y_u(j-1,k)$.
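
Equation (3.2) translates directly into backward differences over the resampled contour. The sketch below assumes a closed contour (so index j-1 wraps around to the last sample) and a hypothetical function name; it evaluates the curvature for a single filter pass k.

```python
def curvature(xs, ys):
    """Discrete curvature K(j) of a closed contour using the differences of (3.2)."""
    n = len(xs)
    xu = [xs[j] - xs[j - 1] for j in range(n)]      # X_u(j)  = X(j)   - X(j-1)
    yu = [ys[j] - ys[j - 1] for j in range(n)]      # Y_u(j)  = Y(j)   - Y(j-1)
    xuu = [xu[j] - xu[j - 1] for j in range(n)]     # X_uu(j) = X_u(j) - X_u(j-1)
    yuu = [yu[j] - yu[j - 1] for j in range(n)]     # Y_uu(j) = Y_u(j) - Y_u(j-1)
    ks = []
    for j in range(n):
        denom = (xu[j] ** 2 + yu[j] ** 2) ** 1.5
        # Repeated points give a zero denominator; report zero curvature there.
        ks.append((xu[j] * yuu[j] - xuu[j] * yu[j]) / denom if denom > 0 else 0.0)
    return ks
```

For a circle of radius r traversed counter-clockwise, this returns values close to 1/r at every sample, which is a convenient sanity check.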
We then find the zero crossings of the curvature K(j,k), i.e. the
inflection points of the filtered contour. The search for zero crossings
should not be limited to finding all pairs of consecutive indices (j, j+1)
for which $K(j,k)\,K(j+1,k) < 0$, as indicated in [20]. Otherwise, zero
crossings which span a few points, such as those with intermediate zero(s)
(e.g. $K_i = -0.1$, $K_{i+1} = 0$, $K_{i+2} = 0.1$), will be missed. The
index j of each zero crossing and the corresponding number of passes of the
filter k are recorded. The CSS image of a contour is the binary image where
the x-axis represents j and the y-axis represents k. All the "active"
points correspond to zero crossings of the curvature.
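
One way to catch sign changes that span intermediate zeros is to skip the zero samples before comparing signs. The sketch below is an assumption about how this could be done, not the report's implementation; the function name, the `eps` tolerance and the choice of which index to report are hypothetical.

```python
def zero_crossings(ks, eps=0.0):
    """Return indices j where the cyclic sequence ks changes sign.

    Samples with |K| <= eps are treated as zero and skipped, so a sign change
    that spans intermediate zeros (e.g. -0.1, 0, 0.1) is still detected.
    The reported index is the last non-zero sample before the change.
    """
    nonzero = [j for j, k in enumerate(ks) if abs(k) > eps]
    crossings = []
    # Compare each non-zero sample with the next non-zero one, wrapping around.
    for a, b in zip(nonzero, nonzero[1:] + nonzero[:1]):
        if ks[a] * ks[b] < 0:
            crossings.append(a)
    return crossings
```

With `ks = [-0.1, 0.0, 0.1, 0.2, -0.3]` the consecutive-pair test alone finds only the change between indices 3 and 4, whereas this version also reports the change that starts at index 0.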
The MPEG-7 specific descriptor binary representation syntax of Contour
Shape using Curvature Scale Space is listed in Table 1.
NumberOfPeaks refers to the total number of peaks in the CSS image. If [...]

ContourShape {                              Number of bits
    numberOfPeaks                           6
    GlobalCurvatureVector                   2*6
    if (numberOfPeaks != 0) {
        PrototypeCurvatureVector            2*6
    }
    HighestPeakY                            7
    for (k = 1; k < numberOfPeaks; k++) {
        PeakX[k]                            6
        PeakY[k]                            3
    }
}
Table 1: Descriptor Binary Representation Syntax

There are two values needed for the GlobalCurvatureVector, namely
Eccentricity and Circularity. Circularity is defined as:
$$\text{circularity} = \frac{\text{perimeter}^2}{\text{area}} \qquad (3.3)$$
This is uniformly quantised to 6 bits in the range of 12 to 110. If the
value is larger than 110, it is clipped to 110. Circularity is the first
value of the GlobalCurvatureVector.
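
Circularity and its quantisation can be sketched as below. The contour is assumed to be a simple closed polygon, the area is computed with the shoelace formula, and the exact rounding rule of the uniform quantiser is an assumption (round to nearest level); the function names are hypothetical.

```python
import math


def circularity(polygon):
    """perimeter^2 / area of a simple closed polygon, as in (3.3)."""
    n = len(polygon)
    perimeter = sum(math.dist(polygon[i], polygon[(i + 1) % n]) for i in range(n))
    twice_area = sum(polygon[i][0] * polygon[(i + 1) % n][1]
                     - polygon[(i + 1) % n][0] * polygon[i][1] for i in range(n))
    return perimeter ** 2 / (abs(twice_area) / 2.0)   # shoelace area; must be non-zero


def quantise(value, lo, hi, bits):
    """Clip value to [lo, hi] and map it uniformly onto 0 .. 2**bits - 1 (rounding assumed)."""
    clipped = min(max(value, lo), hi)
    levels = (1 << bits) - 1
    return round((clipped - lo) / (hi - lo) * levels)
```

A circle has the smallest possible circularity, 4π ≈ 12.57, which is consistent with the lower end of the quantisation range being 12. A unit square gives 16, which `quantise(16.0, 12, 110, 6)` maps to level 3.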
The second value of the GlobalCurvatureVector is Eccentricity, obtained
as follows:
$$i_{02} = \sum_{k=1}^{N} (y_k - y_c)^2 \qquad (3.4)$$
$$i_{11} = \sum_{k=1}^{N} (x_k - x_c)(y_k - y_c) \qquad (3.5)$$
$$i_{20} = \sum_{k=1}^{N} (x_k - x_c)^2 \qquad (3.6)$$
where $(x_c, y_c)$ is the centre of mass of the shape and N is the number of
points inside the shape.
Eccentricity is thus defined as:
[...] range from 1 to 10, and clipped to 10 if necessary.
The next set of values recorded in the binary representation is the
PrototypeCurvatureVector. These are actually the values of Eccentricity and
Circularity of the prototype contour. The prototype contour is the contour
which is totally convex, with no more zero crossings in K(j,k). It is
the final contour after multiple passes of the normative filter.
HighestPeakY is the parameter of the filter corresponding to the highest
peak in the CSS image. It can be calculated as:
$$\text{HighestPeakY} = 3.8 \left( \frac{ycss[0]}{Nsamples^2} \right)^{0.6} \qquad (3.8)$$
where ycss[0] is the number of passes of the normative kernel filter
corresponding to the highest peak, and Nsamples is the number of equidistant
points on the contour initially used as the input to the process.
As above, it is uniformly quantised to 7 bits in the range from 0 to 1.7.
Finally, the last two sets of values in the binary representation syntax
denote the parameters of the remaining (up to 63) prominent peaks.
Prominent peaks are those peaks whose height is greater than
HighestPeakY × 0.05 after transformation. The peaks are represented in
decreasing order with respect to their peak height value.
First, we calculate xpeak[k]. We normalise the distance along the contour
by the length of the contour, using the point where the highest peak is
found as the starting point P[0]. The value of xpeak[k] will be the
normalised distance from P[0] to the k-th peak on the contour in a
clockwise direction. Finally, PeakX[k] is obtained by uniformly quantising
xpeak[k] to 6 bits in the range from 0 to 1.
The transformed height of the k-th peak is known as ypeak[k], and is
calculated as:
$$ypeak[k] = 3.8 \left( \frac{ycss[k]}{Nsamples^2} \right)^{0.6} \qquad (3.9)$$
After uniformly quantising ypeak[k] in the range from 0 to ypeak[k-1] to
3 bits, we obtain PeakY[k]. This implies that the value of PeakY[k] depends
[...]
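
The height transform of (3.8) and (3.9) and the two quantisation steps can be sketched as follows. The function names are hypothetical, the quantiser rounds to the nearest level (an assumption), and the range for PeakY[k] uses the unquantised ypeak[k-1], which is one reading of the text rather than a confirmed detail of the standard.

```python
def transform_height(passes, n_samples):
    """Equations (3.8)/(3.9): map a filter-pass count to a transformed peak height."""
    return 3.8 * (passes / n_samples ** 2) ** 0.6


def quantise(value, lo, hi, bits):
    """Clip to [lo, hi] and map uniformly onto 0 .. 2**bits - 1 (rounding assumed)."""
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * ((1 << bits) - 1))


def encode_peak_heights(pass_counts, n_samples=256):
    """pass_counts: positive filter-pass counts in decreasing order.

    Returns (HighestPeakY, [PeakY[1], PeakY[2], ...]).
    """
    heights = [transform_height(p, n_samples) for p in pass_counts]
    highest = quantise(heights[0], 0.0, 1.7, 7)
    # Each later peak is quantised relative to the previous, taller peak.
    peak_y = [quantise(heights[k], 0.0, heights[k - 1], 3)
              for k in range(1, len(heights))]
    return highest, peak_y
```

Because each PeakY[k] is expressed relative to the preceding peak, a decoder has to reconstruct the heights in order, starting from HighestPeakY.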
3.2 Matching of Scale Space Images
All shape boundary contours in the database can be well represented by
their unique CSS image representation. More specifically, they can be
defined by the maxima of the curvature zero-crossing contours of the CSS
image [9]. The reason for using the maxima of the curvature zero-crossing
contours is that they are the most significant points of the zero-crossing
contours. They also convey information on the location and the scale of the
corresponding contour [21]. Hence, matching is achieved by finding two CSS
images which share similar sets of maxima.
Also, the CSS image is created by detecting the change of curvature, so
that it is invariant under rotation, scale and orientation. Hence, matching
using the CSS image will ensure that such physical variations of the same
image will not be treated as differences.
The representation is also robust with respect to noise. Modified versions
of the CSS image technique, known as the Renormalised Curvature Scale
Space Image and the Resampled Curvature Scale Space Image, are more
robust to certain kinds of noise, such as non-uniform noise or severe
uniform noise [22]. By using the above-mentioned CSS image techniques, one
could increase the accuracy of the matching process.
Another advantage of using this technique is that a match can be found
quickly. This is due to the relatively few features, i.e. maxima of
curvature zero crossings, required to be matched. This is especially true
at the high scales of the CSS image, where the maxima are sparse.
The initial task for matching is to horizontally shift one of the two sets
of maxima by some amount. The best choice for the amount of shifting is to
shift one of the CSS images so that its major maximum overlaps with the
major maximum of the other CSS image. If the two CSS images are similar,
such shifting would enable the similarity of the CSS images to be easily
discovered, either by calculation or by human visualisation.
After shifting, the similarity value can be calculated. A threshold has
to be set as to what difference can be allowed for two maxima to be
considered matched. After determining which maxima are considered matched,
the similarity value is calculated as the summation of the Euclidean
distances between the matched pairs, plus the summation of the vertical
coordinates of the unmatched maxima.
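
The shift-then-match procedure described above can be sketched as follows. Several details are assumptions made for illustration: maxima are (x, height) pairs with x normalised to [0, 1), pairs are chosen greedily by nearest distance, the horizontal distance wraps around the closed contour, and the threshold value of 0.1 is arbitrary. The function name is hypothetical.

```python
import math


def css_distance(query, model, threshold=0.1):
    """Greedy matching cost between two lists of CSS maxima (x in [0, 1), height y)."""
    if not query or not model:
        return sum(y for _, y in query + model)

    def shifted(peaks):
        # Circular shift so that the highest maximum sits at x = 0.
        x0 = max(peaks, key=lambda p: p[1])[0]
        return [((x - x0) % 1.0, y) for x, y in peaks]

    q, m = shifted(query), shifted(model)
    cost, unused = 0.0, list(m)
    for qx, qy in q:
        def dist(p):
            dx = abs(qx - p[0])
            return math.hypot(min(dx, 1.0 - dx), qy - p[1])   # x wraps around

        best = min(unused, key=dist, default=None)
        if best is not None and dist(best) <= threshold:
            cost += dist(best)          # matched pair: Euclidean distance
            unused.remove(best)
        else:
            cost += qy                  # unmatched query maximum: its height
    return cost + sum(y for _, y in unused)   # unmatched model maxima
```

Two descriptors of the same shape that differ only in starting point give a cost close to zero, since the shift aligns their major maxima before any distances are taken.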
In order to minimise the comparison time for large databases, the aspect
[...]
In the MPEG-7 contour shape Descriptor, information required for an
efficient matching operation, such as eccentricity and circularity, is
included. Also, the Descriptor is already sorted according to the height of
the peaks, which confirms that the major maximum, i.e. the Highest Peak, is
at the starting position.
The matching algorithm specified in [23] is as follows:
1. Compare the global parameters of both CSS images. If they differ
significantly then no more comparison is required. The following conditions
have to be fulfilled for further matching to be performed:
$$\frac{|c_q[0] - c_r[0]|}{\max(c_q[0], c_r[0])} \le Th_e \qquad (3.10)$$
$$\frac{|c_q[1] - c_r[1]|}{\max(c_q[1], c_r[1])} \le Th_c \qquad (3.11)$$
where $c_q[0]$ and $c_r[0]$ are the eccentricity values for the query and
model images respectively, and $c_q[1]$ and $c_r[1]$ are the circularity
values for the query and model images. $Th_e$ and $Th_c$ are the
thresholds, where $Th_e = 0.6$ and $Th_c = 1.0$.
2. For CSS images which fulfil the above conditions, the similarity measure
M is computed as below:
$$M = 0.4\,\frac{|c_q[0] - c_r[0]|}{\max(c_q[0], c_r[0])} + 0.3\,\frac{|c_q[1] - c_r[1]|}{\max(c_q[1], c_r[1])} + M_{css} \qquad (3.12)$$
where
$$M_{css} = S_m + S_u \qquad (3.13)$$
and $S_m$ is the summation over all matched peaks while $S_u$ is the
summation over all unmatched query and model peaks. Hence, we have
$$S_m = \sum \left( (xpeak[i] - xpeak[j])^2 + (ypeak[i] - ypeak[j])^2 \right) \qquad (3.14)$$
where i and j are indices of the query and model peaks that match, and
$$S_u = \sum (ypeak[i])^2 \qquad (3.15)$$
The similarity measure M will be zero if the two shapes are exactly the
same, and will take a greater value the more they differ.
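
The two-stage comparison of (3.10) to (3.13) can be sketched as below. The function names are hypothetical, the peak term M_css is assumed to be computed elsewhere, and the rejection test assumes the comparisons in (3.10) and (3.11) are "less than or equal".

```python
def relative_difference(a, b):
    """|a - b| / max(a, b), the normalised difference used in (3.10)-(3.12)."""
    return abs(a - b) / max(a, b)


def similarity(query_global, model_global, m_css, th_e=0.6, th_c=1.0):
    """Return M from (3.12), or None when the global check rejects the pair.

    query_global / model_global are (eccentricity, circularity) pairs of
    positive values; m_css is the peak term S_m + S_u of (3.13).
    """
    d_ecc = relative_difference(query_global[0], model_global[0])
    d_circ = relative_difference(query_global[1], model_global[1])
    if d_ecc > th_e or d_circ > th_c:
        return None                     # too different: skip the peak matching
    return 0.4 * d_ecc + 0.3 * d_circ + m_css
```

Note that for positive values the relative difference never exceeds 1, so with the stated threshold of 1.0 the circularity test cannot reject a pair; under these assumptions only the eccentricity test acts as a filter.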
Other similar algorithms to match CSS images can be found in [24] and [...]
3.3 Creation of Contour Images
Before the creation of a CSS image, one is required to obtain a closed
boundary contour for the object of interest. Such a contour is not readily
available and there might be multiple choices for such contours in a single
image. Therefore, a method is required to extract the correct closed
boundary contour from the original image. Often, contours are obtained from
a light-box setup or other simple single-contour images. However, as
medical images are much more complex, a reliable contour extraction method
is required.
A method based on image intensity thresholding and user feedback is
currently being investigated with some initial success. Such a method relies
more on the user's visual skill to perceive the object contour that he/she
requires. This could ensure that the right contour is always captured. The
drawback of such a method is the requirement for significant user
interaction. The possibility of reducing the dependency on the user will be
explored in the near future.
The algorithm for obtaining the correct contour is:
1. First, display the original image. An input from the user is required
to threshold the intensity of the image, so that any value below the
threshold is black (0), and the remainder is white (1). This will create
a binary image.
The selected threshold should enable the contour (outline) of the
required object in the original image to be visible to the user after the
original image has been converted to a binary image. Otherwise, the user
should reselect the value until the above condition is met.
2. After a binary image is produced, an edge detector, such as the Canny
edge detector, is applied to the binary image. The result will be
an image consisting only of lines. Again, check whether the required
object's boundary contour is visible enough. If not, perform the last
step again.
3. The morphological operations of erosion and dilation are then
performed on the lines. Such an operation can regenerate the lines, but
only one pixel wide [25]. Other morphological operations may also
be applied to the binary image to increase the chance of obtaining a
quality closed contour. Such operations can include bridging previously
[...]
[...] object. The user is required to input the starting point to let the
system know which object is of interest. During border tracing, we also
check how many directions a point can link to. If there is more than one
link, i.e., there are branches, user input will be sought to clarify which
branch is the correct one. This is preferred over letting the algorithm
"jump" into the first link (branch) it discovers, which might not be
the right one.
However, if a closed contour is not captured, the above algorithm will
fail. A modified border tracing method is used so that it can capture
the whole contour, even if it has an open end. When an open end is
discovered, the user is again required to select another contour, which
we assume here is part of the required final contour.
After the user is satisfied with the contours captured, the system will
then link all contours together, hence creating a closed contour.
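
The first two steps of the algorithm can be sketched in a few lines. This is a simplified stand-in, not the method used in the report: the image is assumed to be a list of rows of intensities in [0, 1], and a plain 4-neighbour boundary test replaces the Canny edge detector. The function names are hypothetical.

```python
def threshold_image(image, level):
    """Binary image: 0 (black) below the threshold, 1 (white) otherwise."""
    return [[0 if pixel < level else 1 for pixel in row] for row in image]


def boundary_pixels(binary):
    """White pixels with at least one black or out-of-image 4-neighbour.

    A crude substitute for an edge detector on a binary image.
    """
    h, w = len(binary), len(binary[0])
    edge = set()
    for r in range(h):
        for c in range(w):
            if binary[r][c] != 1:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < h and 0 <= cc < w) or binary[rr][cc] == 0:
                    edge.add((r, c))
                    break
    return edge
```

In an interactive tool the threshold level would come from the user and be adjusted until the outline of the required object is visible, after which border tracing would start from a user-selected boundary pixel.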
To successfully obtain a contour, instead of using the intensity
thresholding method, various segmentation techniques could be used too. A
morphological operation based approach has been used for image segmentation
for some time [27]. However, such an approach is not sophisticated enough
to segment complex images. Another technique, proposed in [28], is to make
use of the individual strengths of watershed analysis and relaxation
labelling. Some other popular segmentation techniques are based on Edge
Flow [29], Delaunay triangulation [30] and fuzzy entropy [31]. All the
above-mentioned segmentation algorithms are automatic and do not depend on
user input. Hence, to obtain a contour of a particular object, a further
step will be required to specify the object.
There are also various other object segmentation algorithms which require
some human interaction. A computer-assisted boundary extraction method has
been suggested in [32]. Other approaches include an algorithm which is
based on clustering and grouping in spatial-colour-texture space [33],
and an active contours based algorithm [34].
3.4 Examples
3.4.1 Creation of a losed ontour
A window image will be used as an example. Figure 2(a) is the original
user interaction is required. The first time, the algorithm will grab the first
part of the contour that the user specifies, as shown in figure 2(d). Then,
the algorithm will grab the next part of the contour, as shown in figure 2(e).
Finally, when the user is satisfied, he/she will request the algorithm to close
the contour, and the complete closed contour is captured as in figure 2(f).
An originally closed contour will only require the user to interact with
the system once, that is, to specify which contour is of interest.
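The final linking step of this example can be sketched as below. The report does not spell out how the captured parts are joined, so this is only one plausible reading: segments are joined in the order the user captured them, each one is attached by whichever of its ends is nearer, and the contour is closed by returning to the starting point. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def squared_distance(a: Point, b: Point) -> int:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def link_contours(segments: List[List[Point]]) -> List[Point]:
    """Join the captured segments, in capture order, into one closed contour."""
    contour = list(segments[0])
    for segment in segments[1:]:
        end = contour[-1]
        # Attach whichever end of the next segment is nearer to the current end.
        if squared_distance(end, segment[-1]) < squared_distance(end, segment[0]):
            segment = segment[::-1]
        contour.extend(segment)
    if contour[0] != contour[-1]:
        contour.append(contour[0])  # close the contour
    return contour


first = [(0, 0), (0, 1), (0, 2)]   # first part selected by the user
second = [(2, 0), (2, 1), (2, 2)]  # second part, captured in the opposite direction
closed = link_contours([first, second])
```

Here the second part is reversed before it is attached, so the joined contour runs around the shape in a single direction.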
3.4.2 Creation of a CSS Descriptor
Here, an example on an image of the human elbow is presented. Figure 3(a)
is the original image. We are interested in capturing a part of the image near
the centre, slightly to the top, just above the joint (see figure 3(e)). After
converting the image to lines by the method mentioned above, we get the
image as in figure 3(c). The user is required to click on the object of interest
in the image, which will grab a closed contour as in figure 3(d). After that,
the maxima of the zero crossings can be found by using the CSS algorithm.
A binary CSS image of the contour is shown in figure 3(f).
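The zero-crossing step can be illustrated with a small sketch. It is not the report's code: it assumes the closed contour is given as equally spaced samples, smooths the coordinates with a truncated Gaussian of width sigma (in samples), and uses the sign of the curvature numerator X'Y'' - X''Y' to locate inflection points. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalised Gaussian weights, truncated at three standard deviations."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_closed(values: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Circular convolution, so that the smoothed contour stays closed."""
    n, radius = len(values), len(kernel) // 2
    return [sum(kernel[j + radius] * values[(i + j) % n]
                for j in range(-radius, radius + 1))
            for i in range(n)]


def curvature_zero_crossings(xs: Sequence[float], ys: Sequence[float],
                             sigma: float) -> List[int]:
    """Sample indices where the curvature of the smoothed contour changes sign."""
    kernel = gaussian_kernel(sigma)
    sx, sy = smooth_closed(xs, kernel), smooth_closed(ys, kernel)
    n = len(sx)
    numerators = []
    for i in range(n):
        prev, nxt = i - 1, (i + 1) % n  # index -1 wraps to the last sample
        xu = (sx[nxt] - sx[prev]) / 2.0
        yu = (sy[nxt] - sy[prev]) / 2.0
        xuu = sx[nxt] - 2.0 * sx[i] + sx[prev]
        yuu = sy[nxt] - 2.0 * sy[i] + sy[prev]
        # The curvature denominator is positive, so the numerator fixes the sign.
        numerators.append(xu * yuu - xuu * yu)
    return [i for i in range(n) if numerators[i - 1] * numerators[i] < 0]


# Test contour: a five-lobed shape with five concave sections.
n = 240
angles = [2.0 * math.pi * k / n for k in range(n)]
radii = [1.0 + 0.3 * math.cos(5.0 * a) for a in angles]
xs = [r * math.cos(a) for r, a in zip(radii, angles)]
ys = [r * math.sin(a) for r, a in zip(radii, angles)]

fine = curvature_zero_crossings(xs, ys, 1.0)     # little smoothing
coarse = curvature_zero_crossings(xs, ys, 24.0)  # heavy smoothing
```

With little smoothing each concave section contributes a pair of zero crossings; with heavy smoothing the contour becomes convex and the crossings vanish. Recording the crossing positions over increasing sigma gives the binary CSS image, and the maxima of its arches are the peaks that go into the descriptor.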
The values obtained from the algorithm are converted into a Descriptor,
and the stored information is as below:
NumberOfPeaks : 11 (unsigned 6 bit)
GlobalCurvatureVector : 64, 5 (unsigned 6 bit)
PrototypeCurvatureVector : 19, 4 (unsigned 6 bit)
HighestPeakY : 35 (unsigned 7 bit)
PeakX : 18, 44, 53, 47, 26, 3, 59, 14, 58, 22 (unsigned 6 bit)
PeakY : 5, 4, 2, 3, 7, 6, 4, 5, 6, 6 (unsigned 3 bit)
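As a rough illustration of how compact this descriptor is, the listing can be held in a small record and its storage size added up from the quoted field widths. The class and method names are hypothetical. The sketch also assumes that the highest peak is stored separately from the ten PeakX/PeakY pairs, which would account for NumberOfPeaks being 11. No range checking is done, since the value 64 as printed would not fit an unsigned 6-bit field.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CSSDescriptor:
    number_of_peaks: int
    global_curvature_vector: Tuple[int, int]
    prototype_curvature_vector: Tuple[int, int]
    highest_peak_y: int
    peak_x: List[int]
    peak_y: List[int]

    def bit_size(self) -> int:
        """Total storage in bits, using the field widths quoted in the listing."""
        fixed = 6 + 2 * 6 + 2 * 6 + 7  # peak count, two vectors, highest peak
        return fixed + 6 * len(self.peak_x) + 3 * len(self.peak_y)


elbow = CSSDescriptor(
    number_of_peaks=11,
    global_curvature_vector=(64, 5),
    prototype_curvature_vector=(19, 4),
    highest_peak_y=35,
    peak_x=[18, 44, 53, 47, 26, 3, 59, 14, 58, 22],
    peak_y=[5, 4, 2, 3, 7, 6, 4, 5, 6, 6],
)
total_bits = elbow.bit_size()  # 37 fixed bits + 10 * (6 + 3) peak bits
```

Under these assumptions the whole contour is summarised in 127 bits, i.e. 16 bytes.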
We can then compare this CSS descriptor with others to see how much
they differ.
3.4.3 Matching Result
A matching example will make use of the fish contours which are provided in
[35]. All the CSS Descriptors for the fish contours have been generated in
advance before the comparison. Then their Descriptors are compared and
[Figure 2: (c) After edge detection; (d) Capture of the first contour; (e) Capture of the second contour; (f) Complete the contour capture]
[Figure 3, elbow_sagittal_58.jpg: (a) Original image; (b) Intensity threshold = 0.53; (c) After edge detection; (d) Capture of the closed contour; (e) Closed contour on the original image; (f) CSS image of contour]
Tables 2 and 3 show the outlines of the fish images which are used for
comparison. The mirror images of the fish are shown on the left hand side.
Table 4 shows the comparison results with each fish, where NC means
there is no comparison as they are too different to warrant comparison. This
is because they are filtered by the eccentricity and circularity measurement.
Table 5 shows the comparison results with the mirror images of the fish.
In table 4, it can be observed that the similarity measure of image no. 1
compared to image no. 2 is slightly different when no. 2 is compared to no.
1. The reason for this is that the comparison criterion (query) is different.
Images no. 9 and no. 11 have numerous "NC" entries, as their features are
quite different from the rest. In the multiple comparisons of images
to no. 1, it can be found that image no. 5 is the most similar, while no. 9 is
the most distinct.
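These observations can be checked directly against the values in Table 4. The short sketch below copies the first column (images no. 2 to 10 compared to no. 1) and row 9 from the table as extracted, so rows 11 and 12 are not included. NC is represented by None, which is a choice made here for illustration only.

```python
NC = None  # "no comparison": pair rejected by the eccentricity/circularity filter

# Table 4, column 1: images no. 2 to 10 compared to image no. 1.
compared_to_1 = {2: 0.4488, 3: 0.3451, 4: 0.3388, 5: 0.2068, 6: 0.3245,
                 7: 0.2644, 8: 0.3584, 9: 0.5372, 10: 0.3422}
most_similar = min(compared_to_1, key=compared_to_1.get)
most_distinct = max(compared_to_1, key=compared_to_1.get)

# The measure depends on which image is the query:
# entry (1, 2) is 0.4419 while entry (2, 1) is 0.4488.
asymmetry = round(abs(0.4419 - 0.4488), 4)

# Table 4, row 9: NC marks the pairs that were filtered out.
row_9 = [0.5372, NC, NC, 0.6382, NC, NC, 0.4393, 0.4521, 0, NC, 0.1884, 0.5390]
filtered_out = [number for number, value in enumerate(row_9, start=1) if value is NC]
```

A smaller value means a closer match, so the minimum of the column picks image no. 5 and the maximum picks image no. 9, in line with the text.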
In table 5, it can be observed that when an image is compared to its
own mirror, the similarity measure is not necessarily small. This means
the mirror of the image is not very similar to its original image. However,
the more symmetrical the shapes are, the closer their mirror images are to
the original images.
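The self-mirror comparisons are the entries of Table 5 where row kM meets column k. Sorting them gives a rough ordering of the shapes by how close each one is to its own mirror image. The values below are copied from the table as extracted (no row 12M is available), and reading a small value as "more symmetrical" is an interpretation of the paragraph above.

```python
# Table 5 entries where an image is compared with its own mirror (row kM, column k).
self_mirror = {1: 0.2655, 2: 0.0354, 3: 0.1176, 4: 0.0690, 5: 0.2013, 6: 0.1892,
               7: 0.1960, 8: 0.1514, 9: 0.2698, 10: 0.2139, 11: 0.0500}

# Image numbers ordered from closest to its mirror to furthest from it.
ranking = sorted(self_mirror, key=self_mirror.get)
closest_to_mirror, furthest_from_mirror = ranking[0], ranking[-1]
```

On these figures image no. 2 is closest to its mirror and image no. 9 is furthest from it.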
4 Conclusions and Future Work
Content-based image retrieval (CBIR) could potentially play an important
role in modern medical imaging systems. It has been extensively studied in
the field of multimedia and generic image processing. However, much more
work has to be done for the medical imaging field. The proposed research
has been concerned with the development of a content-based medical imaging
interface, which makes use of CBIR and a user feedback interface, but based
on the evolving content description standard, MPEG-7.
This report describes the work that has been done so far, which includes
the implementation of the CSS algorithm, conversion from image object to
CSS Descriptor, and the CSS matching process. This completes a study for
a single descriptor. Some background reading on CBIR is also presented in
the first part of this report.
By using the maxima of curvature zero crossing contours of the CSS
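[Table 2: outlines of fish images no. 1 to 7 and their mirror images (columns: No., Image, Mirror Image); images not recoverable]
[Table 3: outlines of fish images no. 9, 10 and 11 and their mirror images (columns: No., Image, Mirror Image); images not recoverable]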
Table 4: Comparison results between the fish images (NC = no comparison)
-    1       2       3       4       5       6       7       8       9       10      11      12
1    0       0.4419  0.3536  0.3263  0.1967  0.3631  0.1699  0.3451  0.5512  0.3495  0.4662  0.3076
2    0.4488  0       0.2846  0.4455  0.3820  0.3334  0.4135  0.5173  NC      0.3824  NC      0.4872
3    0.3451  0.2754  0       0.5271  0.2717  0.3365  0.4795  0.4588  NC      0.3119  NC      0.4290
4    0.3388  0.4566  0.5285  0       0.4396  0.4156  0.2369  0.3740  0.6040  0.4594  0.5424  0.3689
5    0.2068  0.3868  0.2869  0.4313  0       0.2589  0.4264  0.4480  NC      0.2710  NC      0.4245
6    0.3245  0.3223  0.3230  0.4265  0.1738  0       0.4348  0.4485  NC      0.2198  NC      0.4207
7    0.2644  0.4205  0.4607  0.2341  0.2722  0.4130  0       0.3635  0.4463  0.4131  0.5397  0.3027
8    0.3584  0.5027  0.4470  0.4093  0.4293  0.5156  0.4067  0       0.4501  0.4904  0.2398  0.3034
9    0.5372  NC      NC      0.6382  NC      NC      0.4393  0.4521  0       NC      0.1884  0.5390
10   0.3422  0.3752  0.3543  0.3520  0.2257  0.2348  0.4448  0.4741  NC      0       NC      0.3888
Table 5: Comparison results with the mirror images of the fish (NC = no comparison)
-    1       2       3       4       5       6       7       8       9       10      11      12
1M   0.2655  0.4335  0.4213  0.3215  0.2905  0.3244  0.3217  0.2914  0.5705  0.3994  0.4680  0.3076
2M   0.4305  0.0354  0.2832  0.4393  0.3769  0.3162  0.3966  0.5595  NC      0.3746  NC      0.5206
3M   0.3986  0.2771  0.1176  0.5257  0.1770  0.3069  0.5010  0.4365  NC      0.3554  NC      0.4322
4M   0.3410  0.4485  0.5431  0.0690  0.4575  0.3638  0.2574  0.3816  0.4998  0.4374  0.5423  0.3659
5M   0.3736  0.3604  0.2960  0.4660  0.2013  0.2028  0.4667  0.4100  NC      0.2745  NC      0.3867
6M   0.3610  0.3179  0.3313  0.3570  0.1049  0.1892  0.3624  0.4151  NC      0.0940  NC      0.4385
7M   0.3446  0.3892  0.4882  0.2235  0.4124  0.3477  0.1960  0.3404  0.6052  0.4579  0.5332  0.2790
8M   0.3429  0.5568  0.4375  0.4012  0.4083  0.4105  0.3798  0.1514  0.3734  0.5151  0.3129  0.3087
9M   0.6052  NC      NC      0.6267  NC      NC      0.6276  0.3781  0.2698  NC      0.1673  0.4829
10M  0.4041  0.3430  0.3526  0.4453  0.2374  0.1450  0.4605  0.4187  NC      0.2139  NC      0.3348
11M  0.4885  NC      NC      0.5416  NC      NC      0.5247  0.3128  0.1863  NC      0.0500  0.4236
CSS image of the object, one can decide how similar two objects' boundary
contours are. The CSS image is created from the curvature of the contour,
which is then encoded into a scale space binary image.
The information from the scale space binary image is then coded as an
MPEG-7 descriptor. It could then be retrieved by any system which is
MPEG-7 compatible. Matching of CSS images becomes the matching of
two CSS descriptors. By using eccentricity and circularity, the relatively
different images can be filtered out. The remaining CSS descriptors will then be
compared and a similarity measure M can be obtained. The value will depict
how closely the two descriptors match, which implies how closely the CSS
images match, and hence how closely the original objects' contours match.
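The two-stage comparison summarised above can be sketched as follows. This is a stand-in, not the matching procedure of section 3.2: the tolerance of 0.6, the greedy nearest-peak pairing and the cost function are all assumptions made for illustration, and the names `Shape`, `passes_filter`, `peak_cost` and `match` are hypothetical. Peak positions are assumed to be normalised to the range 0..1 along the contour.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Peak = Tuple[float, float]  # (position along the contour in [0, 1), peak height)


@dataclass
class Shape:
    circularity: float
    eccentricity: float
    peaks: List[Peak]


def relative_difference(p: float, q: float) -> float:
    largest = max(abs(p), abs(q))
    return 0.0 if largest == 0 else abs(p - q) / largest


def passes_filter(a: Shape, b: Shape, tolerance: float = 0.6) -> bool:
    """Stage 1: reject pairs whose global shape parameters differ too much."""
    return (relative_difference(a.circularity, b.circularity) <= tolerance
            and relative_difference(a.eccentricity, b.eccentricity) <= tolerance)


def peak_cost(query: List[Peak], target: List[Peak], max_shift: float = 0.1) -> float:
    """Stage 2: pair each query peak with the nearest unused target peak."""
    unused = list(target)
    cost = 0.0
    for q_pos, q_height in sorted(query, key=lambda peak: -peak[1]):
        best = None
        for candidate in unused:
            shift = abs(q_pos - candidate[0])
            shift = min(shift, 1.0 - shift)  # positions wrap around a closed contour
            if shift <= max_shift and (best is None or shift < best[0]):
                best = (shift, candidate)
        if best is None:
            cost += q_height  # unmatched query peak costs its height
        else:
            cost += math.hypot(best[0], q_height - best[1][1])
            unused.remove(best[1])
    return cost + sum(height for _, height in unused)  # unmatched target peaks


def match(query: Shape, target: Shape) -> Optional[float]:
    """Return a similarity measure, or None ("NC") if the filter rejects the pair."""
    if not passes_filter(query, target):
        return None
    return peak_cost(query.peaks, target.peaks)


a = Shape(20.0, 2.0, [(0.1, 0.5), (0.6, 0.3)])
b = Shape(20.0, 10.0, [(0.1, 0.5), (0.6, 0.3)])  # very different eccentricity
c = Shape(20.0, 2.0, [(0.1, 0.5)])               # one peak missing
```

Identical descriptors give a measure of zero, a pair rejected by the filter gives None (the NC entries of Tables 4 and 5), and a missing peak raises the measure by that peak's height.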
Future research plans include:
1. Investigation of other relevant Descriptors to build up a description
   scheme (DS) which is suitable for use with one or two sub-classes of
   medical images. Identify the shortcomings of the current description
   scheme and propose new descriptors or other methods, such as the
   integration of related metadata, to tackle the shortcomings.
2. Develop a hierarchical content description and indexing method which
   will permit the retrieval of images with certain characteristics. Also,
   the storage of the content description will be considered. Should the
   description be in an individual visual frame, or just part of the
   description in an individual image, and the remainder be in a central
   database? Such a problem must be solved to enable effective retrieval
   of information.
3. A powerful and user-friendly intelligent query interface will be required
   for the retrieval of complex medical information. Synergy of human
   feedback and computer automatic extraction should be explored, as it
   is recognised that human feedback will be an indispensable part of
   an intelligent content-description interface for medical purposes. Also,
   fusion of textual and visual clues for content-based retrieval should be
References
[1] Jose M. Martinez, Introduction to MPEG-7 (version 1.0), ISO/IEC
JTC1/SC29/WG11-N3545, Beijing, July 2000.
[2] R. W. Picard and T. P. Minka, "Vision texture for annotation," Tech.
Rep. 302, MIT Media Laboratory, 1994.
[3] T. P. Minka and R. W. Picard, "Interactive learning using a 'society of
models'," in Proc. IEEE CVPR, 1996, pp. 447-452.
[4] Yong Rui, Thomas S. Huang, and Sharad Mehrotra, "Content-based
image retrieval with relevance feedback in MARS," in Proc. IEEE Int.
Conf. on Image Proc., Santa Barbara, California, USA, October 1997,
pp. 815-818.
[5] L. Rodney Long, Stanley R. Pillemer, Reva C. Lawrence, Gin-Hua Goh,
Leif Neve, and George R. Thoma, "Webmirs: Web-based medical
information retrieval system," in Proc. SPIE Storage and Retrieval for Image
and Video Databases VI, San Jose, California, USA, January 1998, vol.
3312, pp. 392-403.
[6] M. Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley,
Q. Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee,
Dragutin Petkovic, David Steele, and Peter Yanker, "Query by
image and video content: The QBIC system," in IEEE Computer, vol. 28,
pp. 23-32, 1995.
[7] Jeffrey R. Bach, Charles Fuller, Amarnath Gupta, Arun Hampapur,
Bradley Horowitz, Rich Humphrey, Ramesh Jain, and Chiao Fe Shu,
"The virage image search engine: An open framework for image
management," in Proc. SPIE Storage and Retrieval for Image and Video
Databases, San Jose, California, USA, February 1996, pp. 76-87.
[8] Thomas S. Huang, Sharad Mehrotra, and Kannan Ramchandran,
"Multimedia analysis and retrieval system (MARS) project," Tech. Rep.
TR-DB-96-06, Information and Computer Science, University of California
at Irvine, 1996.
[9] Farzin Mokhtarian, Sadegh Abbasi, and Josef Kittler, "Efficient and
robust retrieval by shape content through curvature scale space," in
International Workshop on Image DataBases and MultiMedia Search,
Communications of ACM, vol. 40, pp. 30-32, December 1997.
[11] Venkat N. Gudivada and Vijay V. Raghavan, "Special issue on
content-based image retrieval systems," in IEEE Computer Magazine, vol. 28,
pp. 18-22, September 1995.
[12] A. D. Narasimhalu, "Special section on content-based retrieval," in
Communications of ACM, vol. 3, February 1995.
[13] Frank Nack, "All content counts: The future in digital media computing
is meta," IEEE Multimedia, vol. 7, no. 3, pp. 10-13, 2000.
[14] Society of Motion Picture and Television Engineers (SMPTE), White
Plains, N.Y., Television - Unique Material Identifier (UMID), 2000.
[15] MPEG-7 Overview (version 3.0), ISO/IEC JTC1/SC29/WG11-N3445,
Geneva, May/June 2000.
[16] Text of ISO/IEC 15938-5/CD Information Technology - Multimedia
Content Description Interface - Part 5 Multimedia Description Schemes,
ISO/IEC JTC1/SC29/WG11-N3705, La Baule, October 2000.
[17] Maytham Safar, Cyrus Shahabi, and Xiaoming Sun, "Image retrieval
by shape: A comparative study," Tech. Rep., Integrated Media Systems
Center and Department of Computer Science, University of Southern
California, November 1999.
[18] C. T. Zahn and R. Z. Roskies, "Fourier descriptors for plane closed
curves," IEEE Transactions on Computers, vol. 21, no. 3, pp. 269-281,
March 1972.
[19] M. K. Hu, "Visual pattern recognition by moment invariants," IEEE
Transactions on Information Theory, vol. 8, no. 2, pp. 179-187,
February 1962.
[20] Leszek Cieplinski, Munchurl Kim, Jens-Rainer Ohm, Mark Pickering,
and Akio Yamada, CD 15938-3 MPEG-7 Multimedia Content
Description Interface - Part 3 Visual, ISO/IEC JTC1/SC29/WG11-W3703, La
Baule, October 2000.
[21] Farzin Mokhtarian, "Silhouette-based isolated object recognition
through curvature scale space," IEEE Transactions on Pattern Analysis
curvature-based shape representation for planar curves," IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 8, pp.
789-805, August 1992.
[23] MPEG-7 Visual part of XM Version 8, ISO/IEC
JTC1/SC29/WG11-N3673, La Baule, October 2000.
[24] Farzin Mokhtarian and Alan Mackworth, "Scale-based description and
recognition of planar curves and two-dimensional shapes," IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 1, pp.
34-43, January 1986.
[25] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing,
chapter 8, pp. 518-560, Addison Wesley, 1992.
[26] Milan Sonka, Vaclav Hlavac, and Roger Boyle, Image Processing,
Analysis, and Machine Vision, chapter 5, p. 142, PWS Publishing, second
edition, 1999.
[27] M. Lybanon, S. Lea, and S. Himes, "Segmentation of diverse image
types using opening and closing," in Proc. IEEE Int. Conf. on Pattern
Recognition, 1994, vol. I, pp. 347-351.
[28] Michael Hansen and William Higgins, "Watershed driven relaxation
labeling for image segmentation," in Proc. IEEE Int. Conf. on Image
Proc., 1994, vol. III, pp. 460-464.
[29] W. Y. Ma and B. S. Manjunath, "Edge flow: a framework of boundary
detection and image segmentation," in Proc. IEEE Conf. on Computer
Vision and Pattern Recognition, 1997, pp. 744-749.
[30] T. Gevers and V. K. Kajcovski, "Image segmentation by directed region
subdivision," in Proc. IEEE Int. Conf. on Pattern Recognition, 1994,
pp. 342-346.
[31] X. Q. Li, Z. W. Zhao, H. D. Cheng, C. M. Huang, and R. W. Harris, "A
fuzzy logic approach to image segmentation," in Proc. IEEE Int. Conf.
on Pattern Recognition, 1994, pp. A337-341.
[32] Ramin Samadani and Cecilia Han, "Computer-assisted extraction of
boundaries from images," in Proc. SPIE Storage and Retrieval for Image
and Video Databases, San Jose, California, USA, 1993, vol. 1908, pp.
segmentation using attraction-based grouping in spatial-color-texture
space," in Proc. IEEE Int. Conf. on Image Proc., September 1996,
vol. 1, pp. 53-56.
[34] Dirk Daneels, D. Campenhout, Wayne Niblack, Will Equitz, Ron
Barber, Erwin Bellon, and Freddy Fierens, "Interactive outlining: An
improved approach using active contours," in Proc. SPIE Storage and
Retrieval for Image and Video Databases, San Jose, California, USA,
1993, vol. 1908, pp. 226-233.
[35] F. Mokhtarian, S. Abbasi, and J. Kittler, "Efficient and robust retrieval
by shape content through curvature scale space," in Image DataBases
and Multi-Media Search, pp. 51-58. World Scientific Publishing,