Content description interfaces for medical imaging

(1)

Original citation:

Ng, Chee Un and Martin, Graham R. (2001) Content-description interfaces for medical

imaging. University of Warwick. Department of Computer Science. (Department of

Computer Science Research Report). CS-RR-384

Permanent WRAP url:

http://wrap.warwick.ac.uk/61192

Copyright and reuse:

The Warwick Research Archive Portal (WRAP) makes this work by researchers of the

University of Warwick available open access under the following conditions. Copyright ©

and all moral rights to the version of the paper presented here belong to the individual

author(s) and/or other copyright owners. To the extent reasonable and practicable the

material made available in WRAP has been checked for eligibility before being made

available.

Copies of full items can be used for personal research or study, educational, or

not-for-profit purposes without prior permission or charge. Provided that the authors, title and

full bibliographic details are credited, a hyperlink and/or URL is given for the original

metadata page and the content is not changed in any way.

A note on versions:

The version presented in WRAP is the published version or, version of record, and may

be cited as it appears here. For more information, please contact the WRAP Team at:

Content Description Interfaces for Medical Imaging

Chee Un Ng and Graham R. Martin

Department of Computer Science,

University of Warwick

Coventry CV4 7AL

August 20, 2001

Abstract

This technical report presents an introduction to content-based information retrieval (CBIR) in the domain of medical imaging. CBIR has been a very actively researched area in recent years; however, its use in the healthcare community is still relatively new and unexplored. This report provides a survey of current CBIR research, with special emphasis on medical imaging. Research has also been done in the

Contents

1 Introduction
2 Content Description
  2.1 Multimedia Content Description Interface
    2.1.1 Descriptor
    2.1.2 Description Scheme
3 Comparing Medical Images
  3.1 Curvature Scale Space Description for Shape Representation
  3.2 Matching of Scale Space Images
  3.3 Creation of Contour Images
  3.4 Examples
    3.4.1 Creation of a closed contour
    3.4.2 Creation of a CSS Descriptor
    3.4.3 Matching Result

List of Figures

1 Overview of the proposed system
2 Contour capture steps using the window image
3 Descriptor creation steps using the brain image

List of Tables

1 Descriptor Binary Representation Syntax
2 Images of the fishes Part 1
3 Images of the fishes Part 2
4 Comparison to the original images

1 Introduction

The recent information explosion in multimedia content, particularly visual information, has led to massive demand for multimedia data storage. The same situation has arisen in the medical imaging field, creating a need for efficient visual information management. Probably millions of medical images are captured and created daily, and finding a particular image with some degree of similarity proves to be very difficult.

To address the above issues, content-based retrieval has been proposed. It is an important alternative and complement to traditional keyword-based searching for multimedia data and can greatly enhance the accuracy of the information being returned. Currently, most search engines are totally or mostly based on keyword search, and content-based retrieval is actively being investigated in various image processing and multimedia laboratories.

A key development in content-based systems is a new international standardisation work item called "Multimedia Content Description Interface", referred to as MPEG-7 [1]. MPEG-7 specifies a standard set of 'descriptors' to describe various features within multimedia information. In addition, a Description Definition Language (DDL) is being developed to specify Description Schemes (DS), hierarchical sets of descriptions defining multimedia objects. In our project, we aim to adopt the MPEG-7 standard methodology for our content description interface system, as it would greatly increase the chances of universal information sharing and exchange.

There have been some preliminary successes with the use of content-based information retrieval (CBIR) in the multimedia industry, particularly in the fields of broadcasting and entertainment. In medical imaging, pure content-based retrieval without sensible human interaction or feedback will probably return thousands of (or no) results, and hence further elaboration by the user is encouraged.

This leads to another actively researched subject, which is to include a human as an integral part of the feedback loop. This theory is opposed to the fully automatic theory of computer vision pattern recognition. However, a human should only take part in the process when it is necessary, and minimising the interaction of the human is highly desired. This fits with the theory that a human is always an indispensable part of an image retrieval system. In fact, this research trend has already been reflected in a number of content-based image retrieval systems. For example, a team of MIT researchers moved from the "automated" Photobook to "interactive"

proposed a "Relevance Feedback" architecture [4] for image retrieval, where human and computer could interact with each other to improve the retrieval performance. In [4], Relevance Feedback "is the process of automatically adjusting an existing query using information fed back by the user about the relevance of previously retrieved documents." Experiments in MARS showed that retrieval performance can be improved considerably by using Relevance Feedback.
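To make the idea of adjusting a query from user feedback concrete, the sketch below applies a Rocchio-style update to a feature-vector query. This is a classic relevance feedback formulation used here as a stand-in; it is not claimed to be the method of [4] or of MARS. The function name, the weights `alpha`, `beta` and `gamma`, and the representation of images as plain feature lists are all assumptions made for illustration.

```python
def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query vector towards images marked relevant by the user
    and away from images marked non-relevant (hypothetical helper)."""
    dims = len(query)

    def centroid(vectors):
        # Mean of the given feature vectors; zero vector if none were given.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel = centroid(relevant)
    non = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel, non)]


# One round of feedback on a two-dimensional feature vector.
new_query = rocchio_update([1.0, 0.0],
                           relevant=[[0.0, 1.0], [0.0, 1.0]],
                           non_relevant=[[1.0, 0.0]])
```

The adjusted query can then be resubmitted, and the loop repeated until the user is satisfied with the results.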

It would benefit both image retrieval research and the medical industry if a web-based system were available. This approach has been taken by WebMIRS [5], which is a web-based medical information retrieval system.

Below is a brief description of several selected content-based image retrieval systems.

Query By Image Content (QBIC) [6] is the earliest commercial content-based system. QBIC supports queries based on example images, sketches by the user, drawings, colour etc.

Another system, "Virage", was developed by Virage Inc. [7]. It is slightly more powerful than QBIC as it supports combination queries. For example, users can request that the query by example carries half of the weighting, while sketching and colour determination each carry a quarter of the weighting.
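A combination query of this kind can be sketched as a weighted sum of per-feature similarity scores. The snippet below is only an illustration of the weighting idea; the function name, the score dictionary and the assumption that each score lies between 0 and 1 are hypothetical and do not describe the Virage API.

```python
def combined_score(scores, weights):
    """Weighted average of per-feature similarity scores.

    scores  -- similarity of one candidate image per query type (assumed 0..1)
    weights -- relative importance of each query type
    """
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Example from the text: half the weight on query by example,
# a quarter each on sketch and colour.
weights = {"example": 0.5, "sketch": 0.25, "colour": 0.25}
scores = {"example": 0.8, "sketch": 0.4, "colour": 0.6}
overall = combined_score(scores, weights)
```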

MARS (Multimedia Analysis and Retrieval System) [8] is another content-based image retrieval system. The features of MARS are the integration of Database Management System (DBMS) and Information Retrieval (IR), and the integration of computer (automatic) and human (manual) feedback.

We now identify the main aims on which the research effort is focused.

1. To enable clinicians to search for medical image data effortlessly and efficiently by developing an intelligent medical image search and retrieval system. To achieve this, an efficient image feature analysis engine should be developed, and a hierarchical content description and indexing technique should be investigated. Such a technique will enable the retrieval of images which share certain characteristics. Lastly, an intelligent query interface which incorporates user feedback should be developed.

2. To demonstrate and evaluate the proposed content description interface using a variety of medical images, in a networked environment.

ment of MPEG-7 and utilisation of the standardised format should be a necessity. The possibility of heterogeneous compatibility should be exploited too, which may include developing a web-based system to encourage information exchange.

Figure 1 shows the overall concept of the proposed content-based medical image information system. The system is partitioned into two sections. The first section is concerned with the storage of medical information as well as medical images. The other is about the effective search and retrieval of information. There is also a central repository which stores all relevant data that is required for efficient storage, search and retrieval. It contains information such as the Description Definition Language, the Description Schemes' structure, the Descriptors' syntax, as well as the stored medical information and images themselves.

A graphical user interface will be the medium by which the clinician interacts with the system. Such an interface will ease the user's understanding of the system by returning appropriate instructions (feedback) to gain more relevant and specific results. Such results can again trigger another round of feedback from the user, if the user suspects that closer results may be obtained. This process is slightly similar to the "training" of a perceptron in a neural network system. Hence, training the system to be more intelligent by automatically returning more favourable results can be investigated in future.

During the storage of medical images and information, the clinician will interact with the system to specify what kind of information should be recorded and extracted from the image. The image will then undergo feature extraction and image pre-processing. The extracted features will be used to generate certain descriptions of the image content, which can be used by the search and retrieval mechanism. The description will then be encoded into binary form for storage.

When a clinician requires certain images which contain certain features or artifacts, he/she can request the system to search the database for images which fit his/her descriptions. The search results will then be displayed. The clinician can refine the search, or give comment (feedback) to the system, which causes the system to perform further searches as appropriate. This will be repeated until the clinician is satisfied with the result, or until the system is unable to respond further.

[Figure 1: Overview of the proposed system. Blocks shown in the diagram include the Descriptor Generation Interface (medical image and clinician text input), Features Extraction & Image Processing, MPEG-7 and Non-MPEG-7 Description Generation, Encoders and Decoders, the Search Engine (Filtering, Indexing, Searching) with query by text/sketch/example and user feedback, and the Central Database holding the Description Definition Language (DDL), Description Schemes (DS) and Descriptors (D).]

to the central repository, which in turn may necessitate new feature extraction methods for effective description generation.

This report details work on shape-based image content description and retrieval. We have used the methodology adopted by MPEG-7, using a Contour-Based Shape descriptor based on the Curvature Scale Space representation [9] of the contour. Shape-based image description is chosen because shape representation is important in some medical images. This will act as a starting point to develop an effective and intelligent medical imaging search and retrieval system.

Also, as it is recommended by the MPEG-7 committee, this will enable data to be read by systems which are MPEG-7 compatible, and easily converted to other standards.

The organisation of this report is as follows. A brief introduction to current research in medical imaging search and retrieval systems has been given in this section. The next section presents a literature review of various content description methods, with further elaboration on MPEG-7. Section 3 shows the research work that has been done on shape-based image retrieval. In the last section of this report, conclusions on the work that has been done are drawn and future research is addressed.

2 Content Description

Traditionally, text-based methods have been used for the description and searching of predominantly alphanumeric information. However, it is known that for multimedia information, text is not enough to describe the rich content of the data. Hence, content-based image retrieval systems have been proposed [10] [11] [12]. Content-based image retrieval differs from traditional text-based image retrieval as information is indexed by visual content. For example, an image is indexed by colour, texture etc. Also, content-based image retrieval should offer an intelligent way of invoking the right features (e.g. colour) associated with the images to assist retrieval. The development of MPEG-7 aims to standardise the content description approach, which includes grouping and defining sets of standard features (known as descriptors) which can be used to describe a wide variety of multimedia content, including images.

Coded content description can be considered as a form of Metadata, a

of Metadata standards in use, each defined for specific purposes. The more popular standards include the following [13]:

The Dublin Core Metadata Initiative (http://purl.org/dc/) has defined a metadata element set to facilitate the discovery of electronic resources. Dublin Core's metadata, or descriptor, is used for fast information retrieval and search operations. It also links to the Resource Description Framework (RDF) (http://www.w3.org/RDF). This standard supports a number of description communities, and is especially successful in digital libraries. This has led to content description (i.e. metadata) being widely accepted in the library community.

Another standard is developed by the TV Anytime Forum (http://www.tv-anytime.org/). The standard is specifically developed to enable audio-visual and related services which are based on mass-market, high-volume digital storage.

The Society of Motion Picture and Television Engineers (SMPTE) has also developed a standard known as The Dynamic Metadata Dictionary - Unique Material Identifiers (UMIDs) [14].

A standard developed specifically for use in the medical community is known as Digital Imaging and Communications in Medicine (DICOM). It defines the protocols and mechanisms to manage and transfer medical data, primarily in the context of radiology. It merges the patient information together with the medical image data into a format known as the DICOM image.

The DICOM standard is very specific to the healthcare community and retains an iconic (simple picture) data representation. An example of a wider interchange standard is the PAPYRUS format, which is used with the OSIRIS display and manipulation platform at the University Hospital of Geneva. The system made a move to open systems by using widely available computer industry standards, such as SQL-based distributed databases and TCP/IP networking. Another similar approach is being used in the Web-based Medical Information Retrieval System (WebMIRS) [8].

This shows that the current research direction is towards a more open medical imaging architecture. So naturally, the current effort of MPEG to establish a new content description standard, known formally as "Multimedia Content Description Interface" (MPEG-7), is in alignment with the research objective.

sual objects and artifacts, perhaps even over heterogeneous systems, which support the MPEG-7 standard.

2.1 Multimedia Content Description Interface

MPEG-7 [15] is an ISO/IEC standard developed by the Moving Picture Experts Group (ISO/IEC JTC1/SC29 WG11). The Overview of the MPEG-7 Standard states that MPEG-7 "... aims to create a standard for describing the multimedia content data that will support some degree of interpretation of the information's meaning, which can be accessed by or passed onto a device or a computer code." Also, MPEG-7 is not aimed at any one application, but aims to support as broad a range of applications as possible.

MPEG-7 needs to provide a flexible and extensible framework for describing audio-visual data. Therefore, it defines a set of methods and tools for the different steps of multimedia description. Standardisation will apply to the following components:

1. Descriptors

2. Description Schemes (DS)

3. Description Definition Language (DDL)

4. Methods to encode descriptions

MPEG-7 systems will include tools that are needed to prepare MPEG-7 Descriptions for efficient transport and storage, and to allow synchronisation between content and descriptions. However, at this stage, such tools are still being developed, and the approach we have taken is to develop our own tools.

The Description Definition Language (DDL) is defined as "a language that allows the creation of new Description Schemes and, possibly, Descriptors. It also allows the extension and modification of existing Description Schemes." The DDL will be based on the XML Schema Language, but will not be limited to it, as the DDL is required to cater for a huge range of applications and current standards.

For a more comprehensive definition and introduction to MPEG-7

2.1.1 Descriptor

A Descriptor is defined as a representation of a Feature. "A Descriptor defines the syntax and the semantics of the Feature representation. A Feature is a distinctive characteristic of the data which signifies something to somebody.", according to the MPEG committee.

A Descriptor will allow an evaluation of the corresponding feature via the descriptor value. A Descriptor may contain more than one value, and all or part of them can be used to evaluate a corresponding feature. Also, a single feature can have several Descriptors to describe it, for different requirements. An example is texture, where the Luminance Edge Histogram and Homogeneous Texture Descriptors can be used.

For medical imaging, a few Descriptors have been chosen to be studied for their suitability. The Contour-Based Shape Descriptor has been studied and found suitable. More information about the Contour-Based Shape Descriptor is given in Section 3.1. Texture browsing, edge histogram and region locator Descriptors will be investigated in the near future.

2.1.2 Description Scheme

A Description Scheme is defined as: "A Description Scheme (DS) specifies the structure and semantics of the relationships between its components, which may be both Descriptors and Description Schemes."

Simply speaking, a Description Scheme is used to group individual Descriptors, or even other DSs, to form a systematic, semantic, tree-structured description of a piece of information, such as an image.

The distinction between a Description Scheme and a Descriptor is that a Descriptor can only contain basic data types, or Descriptor Values, and it does not refer to another Descriptor or DS.
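The tree relationship between Descriptors and Description Schemes can be sketched with two small data types. This is an illustrative model only, not the MPEG-7 DDL: the class names, fields and the `count_descriptors` helper are assumptions chosen to mirror the definitions above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Descriptor:
    """Leaf node: holds only basic values, never other nodes."""
    name: str
    values: List[float]


@dataclass
class DescriptionScheme:
    """Inner node: groups Descriptors and/or other Description Schemes."""
    name: str
    components: List[Union["Descriptor", "DescriptionScheme"]] = field(
        default_factory=list)


def count_descriptors(node):
    """Count the Descriptor leaves below a node (hypothetical helper)."""
    if isinstance(node, Descriptor):
        return 1
    return sum(count_descriptors(child) for child in node.components)


# A still-region scheme holding two descriptors, nested in an image scheme.
shape = Descriptor("ContourShape", [11, 64, 5])
edges = Descriptor("EdgeHistogram", [0.1, 0.2, 0.3])
region = DescriptionScheme("StillRegion", [shape, edges])
image = DescriptionScheme("MedicalImage", [region])
```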

An example of a Description Scheme is the StillRegion DS, which is part of the standard DSs in MPEG-7 [16]. This is a Segment DS, derived from the generic Segment DS, which describes specific types of audio-visual segments. Other segment DSs are the Video Segment DS, Mosaic DS, Moving Region DS, Video Text DS, Audio Segment DS and Audio Visual Segment DS.

The StillRegion DS then has various Descriptors to describe a still region segment of an image. Some Descriptors are Edge Histogram, T

existing DS in the MPEG-7 standard is not sufficient for medical image retrieval. A new DS should be formulated for the purpose of medical images, as medical images are semantically much more distinct than generic images. This will be investigated after all relevant Descriptors have been explored.

3 Comparing Medical Images

To compare medical images, we have decided to implement a shape description technique. Shape description is an important issue in object recognition as it is used to measure geometric attributes of an object, which is an important feature for various medical images.

An overview of various shape description techniques is provided in [17]. Shape description techniques can be classified as Boundary Based Methods or Region Based Methods. They are then further categorised into Transform Domain and Spatial Domain, while Spatial Domain is again categorised into Partial (Occlusion) and Complete (No Occlusion). The most successful shape descriptors are Fourier Descriptors [18] and Moment Invariants [19]. However, the authors of [17] did not mention Curvature Scale Space; hence, we suggest that Curvature Scale Space be considered a Boundary Based Method in the Transform Domain. This is because the boundary is transformed into a Scale Space representation, although it uses the curvature as the basic measurement.

3.1 Curvature Scale Space Description for Shape Representation

A very fast and reliable method for shape similarity retrieval in large image databases is the Curvature Scale Space technique. It is also very robust with respect to noise, scale and orientation changes of the object.

The Curvature Scale Space (CSS) representation of a contour is the technique recommended by MPEG-7 for contour shape similarity matching. Filtering based on shape contour also targets query-by-example [20]. The representation of contour shape is very compact, less than 14 bytes in size on average.

To create a CSS description for a contour shape following the MPEG-7

contour and we have:

$$\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \tag{3.1}$$

This is the set of individual points representing the x and y coordinates respectively. The points need to be resampled as equidistant points. After that, the x and y coordinates must be parameterised by the curve arc-length parameter u, where u is normalised to take values from 0 to 1. We then have to construct the functions X(u) and Y(u) from the x and y coordinates.

The next step is to select a good value of N, the number of equidistant points the functions X(u) and Y(u) will be resampled to. From [20], it is understood that the value of N = 256 is usually sufficient for typical image/video applications. Given N, we can find the values of the two functions by using linear interpolation. We name the resampled functions x(j) and y(j), where j is from 0 to 255 (if N = 256). x(j) and y(j) then repeatedly undergo a low-pass filter operation which performs a convolution with the normative (0.25, 0.5, 0.25) kernel.
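The resampling and filtering steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the MPEG-7 reference implementation: the contour is assumed to be a closed polygon given as a list of (x, y) tuples, the filter is assumed to wrap around at the ends because the contour is closed, and the function names are hypothetical.

```python
import math


def resample_closed_contour(points, n):
    """Resample a closed polygonal contour to n points equally spaced
    in arc length, using linear interpolation between vertices."""
    closed = points + [points[0]]          # repeat first vertex to close
    cumulative = [0.0]                     # arc length at each vertex
    for (x0, y0), (x1, y1) in zip(closed, closed[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cumulative[-1]

    resampled = []
    seg = 0
    for j in range(n):
        target = total * j / n             # arc length for u = j / n
        while cumulative[seg + 1] < target:
            seg += 1
        span = cumulative[seg + 1] - cumulative[seg]
        t = 0.0 if span == 0 else (target - cumulative[seg]) / span
        (x0, y0), (x1, y1) = closed[seg], closed[seg + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled


def smooth_once(values):
    """One pass of the (0.25, 0.5, 0.25) kernel with circular wrap-around."""
    n = len(values)
    return [0.25 * values[j - 1] + 0.5 * values[j] + 0.25 * values[(j + 1) % n]
            for j in range(n)]


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
samples = resample_closed_contour(square, 8)
xs = smooth_once([x for x, _ in samples])
ys = smooth_once([y for _, y in samples])
```

Calling `smooth_once` repeatedly on the x and y sequences gives the contour at successively coarser scales, with the number of passes playing the role of k in the text.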

The curvature of the filtered curve is

$$K(j,k) = \frac{X_u(j,k)\,Y_{uu}(j,k) - X_{uu}(j,k)\,Y_u(j,k)}{\left(X_u(j,k)^2 + Y_u(j,k)^2\right)^{3/2}} \tag{3.2}$$

where $X_u(j,k) = X(j,k) - X(j-1,k)$, $X_{uu}(j,k) = X_u(j,k) - X_u(j-1,k)$, $Y_u(j,k) = Y(j,k) - Y(j-1,k)$ and $Y_{uu}(j,k) = Y_u(j,k) - Y_u(j-1,k)$.
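Equation (3.2) translates directly into backward differences over the sampled contour. The sketch below assumes the contour is closed, so that index j - 1 wraps around to the last sample when j = 0; the function name and the choice to return 0 where the denominator vanishes are assumptions.

```python
def curvature(xs, ys):
    """Discrete curvature of a closed contour using the backward
    differences of equation (3.2). xs and ys are equal-length lists."""
    n = len(xs)
    # First differences; index -1 wraps to the last sample (closed contour).
    xu = [xs[j] - xs[j - 1] for j in range(n)]
    yu = [ys[j] - ys[j - 1] for j in range(n)]
    # Second differences of the first differences.
    xuu = [xu[j] - xu[j - 1] for j in range(n)]
    yuu = [yu[j] - yu[j - 1] for j in range(n)]

    result = []
    for j in range(n):
        denominator = (xu[j] ** 2 + yu[j] ** 2) ** 1.5
        if denominator == 0:
            result.append(0.0)             # repeated point: curvature undefined
        else:
            result.append((xu[j] * yuu[j] - xuu[j] * yu[j]) / denominator)
    return result
```

For a densely sampled circle of radius R traversed counter-clockwise, this returns values close to 1/R; traversing the same circle clockwise flips the sign.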

We find the minima and maxima of the curvature by finding the zero crossings of K(j,k). The search for zero crossings should not be limited to finding all pairs of consecutive indices (j, j+1) for which $K(j,k)\,K(j+1,k) < 0$, as indicated in [20]. Otherwise, those zero crossings which span a few points, such as those with intermediate zero(s) (e.g. $K_i = -0.1$, $K_{i+1} = 0$, $K_{i+2} = 0.1$), will be missed. The index j of each zero crossing and the corresponding number of passes of the filter k are recorded. The CSS image of a contour is the binary image where the x-axis represents j and the y-axis represents k. All the "active" points correspond to zero crossings of the curvature.
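One way to avoid missing crossings that pass through exact zeros is to compare each non-zero curvature sample with the next non-zero sample, skipping any zeros in between. The sketch below illustrates this; the function name, the circular treatment of the sequence and the convention of reporting the index of the last non-zero sample before the sign change are assumptions.

```python
def zero_crossings(ks):
    """Indices at which the curvature of a closed contour changes sign.

    Exact zeros between two samples of opposite sign are skipped, so a run
    such as -0.1, 0, 0.1 is reported as a single crossing.
    """
    nonzero = [j for j in range(len(ks)) if ks[j] != 0]
    crossings = []
    # Pair each non-zero sample with the next one, wrapping around the end.
    for a, b in zip(nonzero, nonzero[1:] + nonzero[:1]):
        if ks[a] * ks[b] < 0:
            crossings.append(a)
    return crossings


# The naive consecutive-pair test finds only index 0 here and misses
# the crossing that passes through the exact zero at index 2.
found = zero_crossings([0.2, -0.1, 0.0, 0.1, 0.3])
```

Running this for every filter pass k and marking the returned indices in row k produces the binary CSS image described above.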

The MPEG-7 specified descriptor binary representation syntax of Contour shape using Curvature Scale Space is listed in Table 1.

NumberOfPeaks refers to the total number of peaks in the CSS image. If

Table 1: Descriptor Binary Representation Syntax

ContourShape {                                 Number of bits
  numberOfPeaks                                6
  GlobalCurvatureVector                        2*6
  if (numberOfPeaks != 0) {
    PrototypeCurvatureVector                   2*6
  }
  HighestPeakY                                 7
  for (k = 1; k < numberOfPeaks; k++) {
    PeakX[k]                                   6
    PeakY[k]                                   3
  }
}
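As a quick check on the size of the representation, the bit counts in Table 1 can be summed for a given number of peaks. The helper below simply follows the syntax as listed in the table; the function name is hypothetical.

```python
def descriptor_bits(number_of_peaks):
    """Total bits used by the ContourShape syntax of Table 1."""
    bits = 6                  # numberOfPeaks
    bits += 2 * 6             # GlobalCurvatureVector
    if number_of_peaks != 0:
        bits += 2 * 6         # PrototypeCurvatureVector
    bits += 7                 # HighestPeakY
    # The loop in Table 1 runs for k = 1 .. numberOfPeaks - 1.
    bits += max(number_of_peaks - 1, 0) * (6 + 3)   # PeakX[k] and PeakY[k]
    return bits


bits_for_eleven_peaks = descriptor_bits(11)   # 127 bits, just under 16 bytes
```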

There are two values needed for the GlobalCurvatureVector, namely Eccentricity and Circularity. Circularity is defined as:

$$\text{circularity} = \frac{\text{perimeter}^2}{\text{area}} \tag{3.3}$$

This is uniformly quantised to 6 bits in the range of 12 to 110. If the value is larger than 110, it is clipped to 110. Circularity is the first value of the GlobalCurvatureVector.
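Equation (3.3) and the quantisation step can be sketched as below. The perimeter and area are computed from a closed polygon using the shoelace formula, which is an assumption about how the contour is represented. The exact rounding rule of the uniform quantiser is not given in the text, so the round-to-nearest mapping onto the levels 0 to 63 used here is also an assumption, as are the function names.

```python
import math


def circularity(points):
    """perimeter^2 / area for a closed polygon given as (x, y) tuples."""
    n = len(points)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0        # shoelace term
    return perimeter ** 2 / (abs(twice_area) / 2.0)


def quantise_uniform(value, low, high, bits):
    """Clip value to [low, high] and map it to an integer level
    in 0 .. 2^bits - 1 (round to nearest; assumed rule)."""
    levels = (1 << bits) - 1
    clipped = min(max(value, low), high)
    return int((clipped - low) / (high - low) * levels + 0.5)


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
c = circularity(square)                         # 8^2 / 4 = 16
level = quantise_uniform(c, 12.0, 110.0, 6)
```

A circle gives the smallest possible circularity, 4π ≈ 12.57, which is consistent with the lower end of the quantisation range.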

The second value of the GlobalCurvatureVector is Eccentricity, obtained as follows:

$$i_{02} = \sum_{k=1}^{N} (y_k - y_c)^2 \tag{3.4}$$

$$i_{11} = \sum_{k=1}^{N} (x_k - x_c)(y_k - y_c) \tag{3.5}$$

$$i_{20} = \sum_{k=1}^{N} (x_k - x_c)^2 \tag{3.6}$$

where $(x_c, y_c)$ is the centre of mass of the shape and N is the number of points inside the shape.

Eccentricity is thus defined as:

range from 1 to 10, and clipped to 10 if necessary.

The next set of values recorded in the binary representation is the PrototypeCurvatureVector. These are actually the values of Eccentricity and Circularity of the prototype contour. The prototype contour is the contour which is totally convex, with no more zero crossings in K(j,k). It is the final contour after multiple passes of the normative filter.

HighestPeakY is the parameter of the filter corresponding to the highest peak in the CSS image. It can be calculated as:

$$\text{HighestPeakY} = 3.8 \left( \frac{ycss[0]}{Nsamples^2} \right)^{0.6} \tag{3.8}$$

where ycss[0] is the number of passes of the normative kernel filter corresponding to the highest peak, and Nsamples is the number of equidistant points on the contour initially used as the input to the process.

As above, it is uniformly quantised to 7 bits in the range from 0 to 1.7.
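The peak height transform of equation (3.8) and its 7-bit quantisation can be sketched as follows. The function names are hypothetical, and the round-to-nearest quantiser is an assumption since the text only states that the quantisation is uniform.

```python
def transform_peak_height(ycss, n_samples):
    """Equation (3.8): map a number of filter passes to a peak height."""
    return 3.8 * (ycss / float(n_samples ** 2)) ** 0.6


def quantise_uniform(value, low, high, bits):
    """Clip to [low, high], then map to an integer level 0 .. 2^bits - 1
    (round to nearest; assumed rule)."""
    levels = (1 << bits) - 1
    clipped = min(max(value, low), high)
    return int((clipped - low) / (high - low) * levels + 0.5)


# A highest peak reached after 1000 passes on a 256-point contour.
height = transform_peak_height(1000, 256)
highest_peak_y = quantise_uniform(height, 0.0, 1.7, 7)
```

The exponent 0.6 compresses large pass counts, so doubling the number of passes raises the transformed height by a factor of about 1.5 rather than 2.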

Finally, the last two sets of values in the binary representation syntax denote the parameters of the remaining (up to 63) prominent peaks. Prominent peaks are those peaks whose height is greater than HighestPeakY × 0.05 after transformation. The peaks are represented in decreasing order with respect to their peak height value.

First, we calculate xpeak[k]. We normalise the distance along the contour by the length of the contour, using the point where the highest peak is found as the starting point P[0]. The value of xpeak[k] will be the normalised distance from P[0] to the k-th peak on the contour in a clockwise direction. Finally, PeakX[k] is obtained by uniformly quantising xpeak[k] to 6 bits in the range from 0 to 1.

The transformed height of the k-th peak is known as ypeak[k], and is calculated as:

$$ypeak[k] = 3.8 \left( \frac{ycss[k]}{Nsamples^2} \right)^{0.6} \tag{3.9}$$

After uniformly quantising ypeak[k] in the range from 0 to ypeak[k-1] to 3 bits, we obtain PeakY[k]. This implies that the value of PeakY[k] depends

3.2 Matching of Scale Space Images

All shape boundary contours in the database can be well represented by their unique CSS image representation. More specifically, they can be defined by the maxima of the curvature zero-crossing contours of the CSS image [9]. The reason for using the maxima of the curvature zero-crossing contours is that they are the most significant points of those contours. They also convey information on the location and the scale of the corresponding contour [21]. Hence, matching is achieved by finding two CSS images which share similar sets of maxima.

Also, the CSS image is created by detecting the change of curvature, so it is invariant under rotation, scale and orientation. Hence, matching using the CSS image will ensure that such physical variations of the same image will not be treated as differences.

The representation is also robust with respect to noise. Modified versions of the CSS image technique, known as the Renormalised Curvature Scale Space Image and the Resampled Curvature Scale Space Image, are more robust to certain kinds of noise, such as non-uniform noise or severe uniform noise [22]. By using the above mentioned CSS image techniques, one could increase the accuracy of the matching process.

Another advantage of using this technique is that a match can be found quickly. This is due to the relatively few features, i.e. maxima of curvature zero crossings, required to be matched. This is especially true at the high scales of the CSS image where the maxima are sparse.

The initial task for matching is to horizontally shift one of the two sets of maxima by some amount. The best way to determine the amount of shifting required is to shift one of the CSS images so that its major maximum overlaps with the major maximum of the other CSS image. If the two CSS images are similar, such shifting would enable the similarity of the CSS images to be easily discovered, either by calculation or human visualisation.
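The shifting step can be sketched as a circular shift that moves the tallest maximum of each CSS image to position 0. The sketch assumes each maximum is stored as an (x, height) tuple with x a normalised arc-length position in [0, 1); that representation and the function name are assumptions.

```python
def align_to_major_maximum(maxima):
    """Circularly shift a set of CSS maxima so that the tallest one
    lies at x = 0. Each maximum is an (x, height) tuple, x in [0, 1)."""
    major_x = max(maxima, key=lambda peak: peak[1])[0]
    # The modulo keeps shifted positions inside [0, 1) on the closed contour.
    return [((x - major_x) % 1.0, height) for x, height in maxima]


# Two descriptions of the same shape with different starting points
# line up once both are aligned to their major maximum.
query = align_to_major_maximum([(0.2, 1.0), (0.5, 0.4)])
model = align_to_major_maximum([(0.9, 0.4), (0.6, 1.0)])
```

Applying the same alignment to both sets means their major maxima coincide, so the remaining maxima can be compared position by position.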

After shifting, the similarity value can be calculated. A threshold has to be set on the difference allowed for two maxima to be considered matched. After determining which maxima are considered matched, the similarity value is calculated as the summation of the Euclidean distances between the matched pairs, plus the summation of the vertical coordinates of the unmatched maxima.

In order to minimise the comparison time for large databases, the aspect

In the MPEG-7 contour shape Descriptor, information required for an efficient matching operation, such as eccentricity and circularity, is included. Also, the Descriptor is already sorted according to the height of the peaks, which confirms that the major maximum, i.e. the Highest Peak, is at the starting position.

The matching algorithm specified in [23] is as follows:

1. Compare the global parameters of both CSS images. If they differ significantly then no more comparison is required. The following conditions have to be fulfilled for further matching to be performed:

$$\frac{|q[0] - r[0]|}{\mathrm{MAX}(q[0], r[0])} \le Th_e \tag{3.10}$$

$$\frac{|q[1] - r[1]|}{\mathrm{MAX}(q[1], r[1])} \le Th_c \tag{3.11}$$

where q[0] and r[0] are the eccentricity values for the query and model images respectively, and q[1] and r[1] are the circularity values for the query and model images. $Th_e$ and $Th_c$ are the thresholds, where $Th_e = 0.6$ and $Th_c = 1.0$.

2. For CSS images which fulfil the above conditions, the similarity measure M is computed as below:

$$M = 0.4\,\frac{|q[0] - r[0]|}{\mathrm{MAX}(q[0], r[0])} + 0.3\,\frac{|q[1] - r[1]|}{\mathrm{MAX}(q[1], r[1])} + M_{css} \tag{3.12}$$

where

$$M_{css} = S_m + S_u \tag{3.13}$$

and $S_m$ is the summation over all matched peaks while $S_u$ is the summation over all unmatched query and model peaks. Hence, we have

$$S_m = \sum \left( (xpeak[i] - xpeak[j])^2 + (ypeak[i] - ypeak[j])^2 \right) \tag{3.14}$$

where i and j are indices of the query and model peaks that match, and

$$S_u = \sum (ypeak[i])^2 \tag{3.15}$$

The similarity measure M will be zero if the two descriptions match exactly, and will produce a greater value if they have a greater level of difference.
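Equations (3.10) to (3.15) can be sketched as a single function. The sketch assumes the peak matching has already been done, so the caller passes in the matched pairs and the heights of the unmatched peaks; it also assumes the threshold tests are "less than or equal". The function names and argument layout are hypothetical.

```python
def relative_difference(a, b):
    """|a - b| / MAX(a, b), as used in equations (3.10) to (3.12)."""
    return abs(a - b) / max(a, b)


def similarity(q_global, r_global, matched_pairs, unmatched_heights,
               th_e=0.6, th_c=1.0):
    """Similarity measure M between a query and a model description.

    q_global, r_global -- (eccentricity, circularity) of query and model
    matched_pairs      -- list of ((xq, yq), (xr, yr)) matched peaks
    unmatched_heights  -- ypeak values of all unmatched query/model peaks
    Returns None when the global parameters differ too much to compare.
    """
    d_ecc = relative_difference(q_global[0], r_global[0])
    d_circ = relative_difference(q_global[1], r_global[1])
    if d_ecc > th_e or d_circ > th_c:
        return None                                    # fails (3.10)/(3.11)

    s_m = sum((xq - xr) ** 2 + (yq - yr) ** 2
              for (xq, yq), (xr, yr) in matched_pairs)  # equation (3.14)
    s_u = sum(h ** 2 for h in unmatched_heights)        # equation (3.15)
    return 0.4 * d_ecc + 0.3 * d_circ + s_m + s_u       # equation (3.12)


m = similarity((2.0, 20.0), (4.0, 20.0),
               [((0.0, 1.0), (0.0, 0.5))], [0.2])
```

Because eccentricity is at least 1 and circularity at least about 12.57, the division by the larger of the two values is safe for valid descriptors.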

Other similar algorithms to match CSS images can be found in [24] and

3.3 Creation of Contour Images

Before the creation of a CSS image, one is required to obtain a closed boundary contour for the object of interest. Such a contour is not readily available and there might be multiple choices for such contours in a single image. Therefore, a method is required to extract the correct closed boundary contour from the original image. Often contours are obtained from a light-box setup or other simple single-contour images. However, as medical images are much more complex, a reliable contour extraction is required.

A method based on image intensity thresholding and user feedback is currently being investigated with some initial success. Such a method relies more on the user's visual skill to perceive the object contour that he/she requires. This could ensure that the right contour is always captured. The drawback of such a method is the requirement of significant user interaction. The possibility of reducing the dependency on the user will be explored in the near future.

The algorithm for obtaining the correct contour is:

1. First, display the original image. An input from the user is required to threshold the intensity of the image, so that any value below the threshold is black (0), and the remainder is white (1). This will create a binary image.

The selected threshold should enable the contour (outline) of the required object in the original image to be visible to the user after the original image has been converted to a binary image. Otherwise, the user should reselect the value until the above condition is met.

2. After a binary image is produced, an edge detector, such as the Canny edge detector, will be applied to the binary image. The result will be an image consisting only of lines. Again, check whether the required object's boundary contour is visible enough. If not, perform the last step again.

3. The morphological operations of erosion and dilation are then performed on the lines. Such an operation can regenerate the lines, but only one pixel wide [25]. Other morphological operations may also be applied to the binary image to increase the chance of obtaining a quality closed contour. Such operations can include bridging previously

object. The user is required to input the starting point to let the system know which object is of interest. During border tracing, we also check how many directions a point can link to. If there is more than one link, i.e. there are branches, user input will be sought to clarify which branch is the correct one. This is preferred over letting the algorithm "jump" into the first link (branch) it discovers, which might not be the right one.

However, if a closed contour is not captured, the above algorithm will fail. A modified border tracing method is used so that it can capture the whole contour, even if it has an open end. When an open end is discovered, the user is again required to select another contour, which we assume here is part of the required final contour.

After the user is satisfied with the contours captured, the system will then link all contours together, hence creating a closed contour.
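The first two steps of the procedure above can be sketched in plain Python. The thresholding follows step 1 directly. For step 2, a simple boundary test (a white pixel with at least one black or out-of-image 4-neighbour) is used as a stand-in for the Canny edge detector named in the text, purely to keep the example self-contained. The image representation as a list of rows and both function names are assumptions.

```python
def threshold_image(image, threshold):
    """Step 1: pixels below the threshold become black (0), the rest white (1)."""
    return [[0 if pixel < threshold else 1 for pixel in row] for row in image]


def boundary_pixels(binary):
    """Stand-in for step 2: white pixels that touch black or the image border.

    This is a simple 4-neighbour boundary test, not the Canny detector.
    Returns a set of (row, column) positions.
    """
    rows, cols = len(binary), len(binary[0])
    edge = set()
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or binary[rr][cc] == 0:
                    edge.add((r, c))
                    break
    return edge


# A bright 3x3 object on a dark 5x5 background.
image = [[10] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 200
binary = threshold_image(image, 100)
outline = boundary_pixels(binary)
```

In the interactive procedure, the user would adjust the threshold and repeat these steps until the outline of the required object is clearly visible.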

To successfully obtain a contour, various segmentation techniques could be used instead of the intensity thresholding method. A morphological operation based approach has been used for image segmentation for some time [27]. However, such an approach is not sophisticated enough to segment complex images. Another technique, proposed in [28], is to make use of the individual strengths of watershed analysis and relaxation labelling. Some other popular segmentation techniques are based on Edge Flow [29], Delaunay triangulation [30] and fuzzy entropy [31]. All the above mentioned segmentation algorithms are automatic and do not depend on user input. Hence, to obtain a contour of a particular object, a further step will be required to specify the object.

There are also various other object segmentation algorithms which require some human interaction. A computer-assisted boundary extraction method has been suggested in [32]. Other approaches include an algorithm which is based on clustering and grouping in spatial-colour-texture space [33], and an active contours based algorithm [34].

3.4 Examples

3.4.1 Creation of a closed contour

A window image will be used as an example. Figure 2(a) is the original


user interaction is required. The first time, the algorithm will grab the first part of the contour that the user specifies, as shown in figure 2(d). Then, the algorithm will grab the next part of the contour, as shown in figure 2(e). Finally, when the user is satisfied, he/she will request the algorithm to close the contour, and the complete closed contour is captured as in figure 2(f).

An originally closed contour will only require the user to interact with the system once, that is, to specify which contour is of interest.
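The final "close the contour" request could be sketched as below. This is only an illustration under assumptions that are not stated in the report: each captured part is taken to be an ordered list of points, the parts are supplied in the order the user selected them, and gaps between parts are bridged by straight lines between consecutive points. The name `link_segments` is hypothetical.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[int, int]


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def link_segments(segments: List[List[Point]]) -> List[Point]:
    """Join open contour parts, in selection order, into one closed contour."""
    if not segments or any(len(seg) == 0 for seg in segments):
        raise ValueError("need at least one non-empty segment")
    contour = list(segments[0])
    for seg in segments[1:]:
        tail = contour[-1]
        # Walk the next part from whichever of its ends is nearer the tail.
        if distance(tail, seg[-1]) < distance(tail, seg[0]):
            seg = seg[::-1]
        contour.extend(seg)
    # Close the contour by returning to the first point.
    if contour[-1] != contour[0]:
        contour.append(contour[0])
    return contour
```

For a contour that is already closed, a single part is passed in and the function only makes sure that the last point repeats the first.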

3.4.2 Creation of a CSS Descriptor

Here, an example on an image of the human elbow is presented. Figure 3(a) is the original image. We are interested in capturing a part of the image near the centre, slightly to the top, just above the joint (see figure 3(e)). After converting the image to lines by the method mentioned above, we get the image as in figure 3(c). The user is required to click on the object of interest in the image, which will grab a closed contour as in figure 3(d). After that, the maxima of the zero crossings can be found by using the CSS algorithm. A binary CSS image of the contour is shown in figure 3(f).

The values obtained from the algorithm are converted into a Descriptor, and the stored information is as below:

NumberOfPeaks : 11 (unsigned 6 bit)
GlobalCurvatureVector : 64, 5 (unsigned 6 bit)
PrototypeCurvatureVector : 19, 4 (unsigned 6 bit)
HighestPeakY : 35 (unsigned 7 bit)
PeakX : 18, 44, 53, 47, 26, 3, 59, 14, 58, 22 (unsigned 6 bit)
PeakY : 5, 4, 2, 3, 7, 6, 4, 5, 6, 6 (unsigned 3 bit)

We can then compare this CSS descriptor with others to see how much they differ.
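To make the stored layout concrete, the listing above can be held in a small record and its storage size added up. The sketch below is illustrative only: the class and method names are hypothetical, the values are copied as printed, and it assumes that the highest peak is kept separately in HighestPeakY, so that NumberOfPeaks - 1 PeakX/PeakY pairs follow (which matches the ten pairs listed for eleven peaks).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Bit width of each stored value, as printed in the listing.
BITS = {
    "NumberOfPeaks": 6,
    "GlobalCurvatureVector": 6,
    "PrototypeCurvatureVector": 6,
    "HighestPeakY": 7,
    "PeakX": 6,
    "PeakY": 3,
}


@dataclass
class CSSDescriptor:
    number_of_peaks: int
    global_curvature_vector: Tuple[int, int]
    prototype_curvature_vector: Tuple[int, int]
    highest_peak_y: int
    peak_x: List[int]
    peak_y: List[int]

    def is_consistent(self) -> bool:
        """Assumed rule: one (x, y) pair per peak other than the highest."""
        return len(self.peak_x) == len(self.peak_y) == self.number_of_peaks - 1

    def size_in_bits(self) -> int:
        """Total number of bits needed for all stored values."""
        return (
            BITS["NumberOfPeaks"]
            + 2 * BITS["GlobalCurvatureVector"]
            + 2 * BITS["PrototypeCurvatureVector"]
            + BITS["HighestPeakY"]
            + len(self.peak_x) * BITS["PeakX"]
            + len(self.peak_y) * BITS["PeakY"]
        )


# The elbow example, with values copied from the listing.
elbow = CSSDescriptor(
    number_of_peaks=11,
    global_curvature_vector=(64, 5),
    prototype_curvature_vector=(19, 4),
    highest_peak_y=35,
    peak_x=[18, 44, 53, 47, 26, 3, 59, 14, 58, 22],
    peak_y=[5, 4, 2, 3, 7, 6, 4, 5, 6, 6],
)
print(elbow.is_consistent(), elbow.size_in_bits())
```

Under these assumptions the elbow contour is described in 127 bits (6 + 12 + 12 + 7 + 60 + 30), which illustrates how compact the descriptor is compared with the contour itself.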

3.4.3 Matching Result

A matching example will make use of the fish contours which are provided from [35]. All the CSS Descriptors for the fish contours have been generated in advance before the comparison. Then their Descriptors are compared and


Figure 2: (c) After edge detection; (d) Capture of the first contour; (e) Capture of the second contour; (f) Complete the contour capture


Figure 3 (elbow_sagittal_58.jpg): (a) Original image; (b) Intensity Threshold = 0.53; (c) After edge detection; (d) Capture of the closed contour; (e) Closed contour on the original image; (f) CSS image of contour


Tables 2 and 3 show the outlines of the fish images which are used for comparison. The mirror images of the fish are shown on the left hand side.

Table 4 shows the comparison results with each fish, where NC means there is no comparison as they are too different to warrant comparison. This is because they are filtered out by the eccentricity and circularity measurement.

Table 5 shows the comparison results with the mirror images of the fish.

In table 4, it can be observed that the similarity measure of image no. 1 compared to image no. 2 (0.4419) is slightly different from that obtained when no. 2 is compared to no. 1 (0.4488). The reason for this is that the comparison criterion (query) is different. Images no. 9 and no. 11 have numerous "NC" entries, as their features are quite different from the rest. Among the multiple comparisons of images to no. 1 (the first column of table 4), it can be found that image no. 5 is the most similar, while no. 9 is the most distinct.
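These observations can be checked directly against the values in the first column of table 4. The short sketch below assumes, as the zeros on the diagonal suggest, that a smaller measure means a closer match; only the rows that appear in this copy of the table (2 to 10) are included, and the variable names are illustrative.

```python
# Similarity measures of images no. 2 to 10 when compared to image no. 1
# (first column of table 4).
to_image_1 = {
    2: 0.4488, 3: 0.3451, 4: 0.3388, 5: 0.2068, 6: 0.3245,
    7: 0.2644, 8: 0.3584, 9: 0.5372, 10: 0.3422,
}

# Assumption: a smaller measure means a more similar contour.
most_similar = min(to_image_1, key=to_image_1.get)
most_distinct = max(to_image_1, key=to_image_1.get)

# The measure is not symmetric: no. 1 vs. no. 2 and no. 2 vs. no. 1 differ.
asymmetry = abs(0.4419 - 0.4488)

print(most_similar, most_distinct, round(asymmetry, 4))
```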

In table 5, it can be observed that when an image is compared to its own mirror, the similarity measure might not be very small. This means the mirror of the image is not very similar to its original image. However, the more symmetrical the images are, the closer their mirror images are to the original images.

4 Conclusions and Future Work

Content-based image retrieval (CBIR) could potentially play an important role in modern medical imaging systems. It has been extensively studied in the field of multimedia and generic image processing. However, much more work has to be done for the medical imaging field. The proposed research has been concerned with the development of a content-based medical imaging interface, which makes use of CBIR and a user feedback interface, but based on the evolving content description standard, MPEG-7.

This report describes the work that has been done so far, which includes the implementation of the CSS algorithm, conversion from image object to CSS Descriptor, and the CSS matching process. This completes a study for a single descriptor. Some background reading on CBIR is also presented in the first part of this report.

By using the maxima of curvature zero crossing contours of the CSS


Table 2: Outlines of the fish images (columns: No., Image, Mirror Image; rows 1 to 7)


Table 3: Outlines of the fish images, continued (columns: No., Image, Mirror Image; rows 9 to 11)


Table 4: Comparison results with each fish (NC = no comparison)

-    1       2       3       4       5       6       7       8       9       10      11      12
1    0       0.4419  0.3536  0.3263  0.1967  0.3631  0.1699  0.3451  0.5512  0.3495  0.4662  0.3076
2    0.4488  0       0.2846  0.4455  0.3820  0.3334  0.4135  0.5173  NC      0.3824  NC      0.4872
3    0.3451  0.2754  0       0.5271  0.2717  0.3365  0.4795  0.4588  NC      0.3119  NC      0.4290
4    0.3388  0.4566  0.5285  0       0.4396  0.4156  0.2369  0.3740  0.6040  0.4594  0.5424  0.3689
5    0.2068  0.3868  0.2869  0.4313  0       0.2589  0.4264  0.4480  NC      0.2710  NC      0.4245
6    0.3245  0.3223  0.3230  0.4265  0.1738  0       0.4348  0.4485  NC      0.2198  NC      0.4207
7    0.2644  0.4205  0.4607  0.2341  0.2722  0.4130  0       0.3635  0.4463  0.4131  0.5397  0.3027
8    0.3584  0.5027  0.4470  0.4093  0.4293  0.5156  0.4067  0       0.4501  0.4904  0.2398  0.3034
9    0.5372  NC      NC      0.6382  NC      NC      0.4393  0.4521  0       NC      0.1884  0.5390
10   0.3422  0.3752  0.3543  0.3520  0.2257  0.2348  0.4448  0.4741  NC      0       NC      0.3888


Table 5: Comparison results with the mirror images of the fish (NC = no comparison)

-    1       2       3       4       5       6       7       8       9       10      11      12
1M   0.2655  0.4335  0.4213  0.3215  0.2905  0.3244  0.3217  0.2914  0.5705  0.3994  0.4680  0.3076
2M   0.4305  0.0354  0.2832  0.4393  0.3769  0.3162  0.3966  0.5595  NC      0.3746  NC      0.5206
3M   0.3986  0.2771  0.1176  0.5257  0.1770  0.3069  0.5010  0.4365  NC      0.3554  NC      0.4322
4M   0.3410  0.4485  0.5431  0.0690  0.4575  0.3638  0.2574  0.3816  0.4998  0.4374  0.5423  0.3659
5M   0.3736  0.3604  0.2960  0.4660  0.2013  0.2028  0.4667  0.4100  NC      0.2745  NC      0.3867
6M   0.3610  0.3179  0.3313  0.3570  0.1049  0.1892  0.3624  0.4151  NC      0.0940  NC      0.4385
7M   0.3446  0.3892  0.4882  0.2235  0.4124  0.3477  0.1960  0.3404  0.6052  0.4579  0.5332  0.2790
8M   0.3429  0.5568  0.4375  0.4012  0.4083  0.4105  0.3798  0.1514  0.3734  0.5151  0.3129  0.3087
9M   0.6052  NC      NC      0.6267  NC      NC      0.6276  0.3781  0.2698  NC      0.1673  0.4829
10M  0.4041  0.3430  0.3526  0.4453  0.2374  0.1450  0.4605  0.4187  NC      0.2139  NC      0.3348
11M  0.4885  NC      NC      0.5416  NC      NC      0.5247  0.3128  0.1863  NC      0.0500  0.4236


CSS image of the object, one can decide how similar two objects' boundary contours are. The CSS image is created from the curvature of the contour, which is then encoded into a scale space binary image.
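As a rough illustration of how curvature gives rise to a scale space image, the sketch below smooths a closed contour with Gaussians of increasing width and records where the curvature changes sign. It is a simplified stand-in rather than the CSS implementation of this report: the contour is assumed to be sampled evenly along its parameter (not re-sampled by arc length), sigma is measured in samples, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalised Gaussian weights, truncated at three standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_closed(values: Sequence[float], sigma: float) -> List[float]:
    """Circular convolution, since a closed contour wraps around."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [
        sum(kernel[j + radius] * values[(i + j) % n] for j in range(-radius, radius + 1))
        for i in range(n)
    ]


def curvature_zero_crossings(contour: Sequence[Point], sigma: float) -> List[int]:
    """Sample indices at which the curvature of the smoothed contour changes sign."""
    xs = smooth_closed([p[0] for p in contour], sigma)
    ys = smooth_closed([p[1] for p in contour], sigma)
    n = len(contour)
    numer = []
    for i in range(n):
        dx = (xs[(i + 1) % n] - xs[i - 1]) / 2
        dy = (ys[(i + 1) % n] - ys[i - 1]) / 2
        ddx = xs[(i + 1) % n] - 2 * xs[i] + xs[i - 1]
        ddy = ys[(i + 1) % n] - 2 * ys[i] + ys[i - 1]
        # Numerator of the curvature; its sign is the sign of the curvature.
        numer.append(dx * ddy - dy * ddx)
    return [i for i in range(n) if (numer[i] > 0) != (numer[i - 1] > 0)]


def css_rows(contour: Sequence[Point], sigmas: Sequence[float]) -> List[Tuple[float, List[int]]]:
    """One row of the binary scale space image per smoothing width."""
    return [(s, curvature_zero_crossings(contour, s)) for s in sigmas]
```

Plotting the returned positions against sigma gives arches that shrink and vanish as the smoothing removes each concavity; the maxima of those arches are the peaks that a CSS descriptor stores. A convex contour such as a circle produces no zero crossings at any width.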

The information from the scale space binary image is then coded as an MPEG-7 descriptor. It could then be retrieved by any system which is MPEG-7 compatible. Matching of CSS images becomes the matching of two CSS descriptors. By using eccentricity and circularity, the relatively different images can be filtered out. The remaining CSS descriptors will then be compared and a similarity measure M can be obtained. The value will depict how closely the two descriptors match, which implies how closely the CSS images match, and hence how closely the original objects' contours match.
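The two-stage comparison summarised above (filter on eccentricity and circularity, then compare peaks) might be sketched as follows. This is a deliberately simplified, hypothetical version and not the matching procedure of this report: the tolerances `shape_tol` and `x_tol` are invented for illustration, peaks are matched greedily in query order, and the alignment of the two peak sets before matching is omitted.

```python
from math import hypot
from typing import List, Optional, Tuple

Peak = Tuple[float, float]    # (position along the contour in [0, 1), height)
Shape = Tuple[float, float]   # (eccentricity, circularity)


def circular_gap(a: float, b: float) -> float:
    """Distance between two positions on a closed contour of length 1."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def similarity(
    query_peaks: List[Peak],
    cand_peaks: List[Peak],
    query_shape: Shape,
    cand_shape: Shape,
    shape_tol: float = 0.4,   # hypothetical relative tolerance
    x_tol: float = 0.1,       # hypothetical position tolerance
) -> Optional[float]:
    """Return a measure M (smaller is closer), or None for 'NC'."""
    # Stage 1: skip candidates whose global shape differs too much.
    for q, c in zip(query_shape, cand_shape):
        if abs(q - c) > shape_tol * max(q, c):
            return None
    # Stage 2: greedy peak matching, driven by the query peaks.
    unused = list(cand_peaks)
    cost = 0.0
    for qx, qy in query_peaks:
        near = [p for p in unused if circular_gap(qx, p[0]) <= x_tol]
        if near:
            best = min(near, key=lambda p: hypot(circular_gap(qx, p[0]), qy - p[1]))
            cost += hypot(circular_gap(qx, best[0]), qy - best[1])
            unused.remove(best)
        else:
            cost += qy  # an unmatched query peak costs its height
    # Unmatched candidate peaks also cost their height.
    cost += sum(height for _, height in unused)
    return cost
```

Because the matching is driven by the query's peaks, swapping the query and the candidate need not give the same value.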

Future research plans include:

1. Investigation of other relevant Descriptors to build up a description scheme (DS) which is suitable for use with one or two sub-classes of medical images. Identify the shortcomings of the current description scheme and propose new descriptors or other methods, such as the integration of related metadata, to tackle the shortcomings.

2. Develop a hierarchical content description and indexing method which will permit the retrieval of images with certain characteristics. Also, the storage of the content description will be considered. Should the description be in an individual visual frame, or just part of the description in an individual image, and the remainder be in a central database? Such a problem must be solved to enable effective retrieval of information.

3. A powerful and user-friendly intelligent query interface will be required for the retrieval of complex medical information. Synergy of human feedback and computer automatic extraction should be explored, as it is recognised that human feedback will be an indispensable part of an intelligent content-description interface for medical purposes. Also, fusion of textual and visual clues for content-based retrieval should be


[1] Jose M. Martinez, Introduction to MPEG-7 (version 1.0), ISO/IEC JTC1/SC29/WG11-N3545, Beijing, July 2000.

[2] R. W. Picard and T. P. Minka, "Vision texture for annotation," Tech. Rep. 302, MIT Media Laboratory, 1994.

[3] T. P. Minka and R. W. Picard, "Interactive learning using a 'society of models'," in Proc. IEEE CVPR, 1996, pp. 447-452.

[4] Yong Rui, Thomas S. Huang, and Sharad Mehrotra, "Content-based image retrieval with relevance feedback in MARS," in Proc. IEEE Int. Conf. on Image Proc., Santa Barbara, California, USA, October 1997, pp. 815-818.

[5] L. Rodney Long, Stanley R. Pillemer, Reva C. Lawrence, Gin-Hua Goh, Leif Neve, and George R. Thoma, "WebMIRS: Web-based medical information retrieval system," in Proc. SPIE Storage and Retrieval for Image and Video Databases VI, San Jose, California, USA, January 1998, vol. 3312, pp. 392-403.

[6] M. Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Q. Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, and Peter Yanker, "Query by image and video content: The QBIC system," in IEEE Computer, vol. 28, pp. 23-32, 1995.

[7] Jeffrey R. Bach, Charles Fuller, Amarnath Gupta, Arun Hampapur, Bradley Horowitz, Rich Humphrey, Ramesh Jain, and Chiao Fe Shu, "The Virage image search engine: An open framework for image management," in Proc. SPIE Storage and Retrieval for Image and Video Databases, San Jose, California, USA, February 1996, pp. 76-87.

[8] Thomas S. Huang, Sharad Mehrotra, and Kannan Ramchandran, "Multimedia analysis and retrieval system (MARS) project," Tech. Rep. TR-DB-96-06, Information and Computer Science, University of California at Irvine, 1996.

[9] Farzin Mokhtarian, Sadegh Abbasi, and Josef Kittler, "Efficient and robust retrieval by shape content through curvature scale space," in International Workshop on Image DataBases and MultiMedia Search,


Communications of ACM, vol. 40, pp. 30-32, December 1997.

[11] Venkat N. Gudivada and Vijay V. Raghavan, "Special issue on content-based image retrieval systems," in IEEE Computer Magazine, vol. 28, pp. 18-22, September 1995.

[12] A. D. Narasimhalu, "Special section on content-based retrieval," in Communications of ACM, vol. 3, February 1995.

[13] Frank Nack, "All content counts: The future in digital media computing is meta," IEEE Multimedia, vol. 7, no. 3, pp. 10-13, 2000.

[14] Society of Motion Picture and Television Engineers (SMPTE), White Plains, N.Y., Television - Unique Material Identifier (UMID), 2000.

[15] MPEG-7 Overview (version 3.0), ISO/IEC JTC1/SC29/WG11-N3445, Geneva, May/June 2000.

[16] Text of ISO/IEC 15938-5/CD Information Technology - Multimedia Content Description Interface - Part 5 Multimedia Description Schemes, ISO/IEC JTC1/SC29/WG11-N3705, La Baule, October 2000.

[17] Maytham Safar, Cyrus Shahabi, and Xiaoming Sun, "Image retrieval by shape: A comparative study," Tech. Rep., Integrated Media Systems Center and Department of Computer Science, University of Southern California, November 1999.

[18] C. T. Zahn and R. Z. Roskies, "Fourier descriptors for plane closed curves," IEEE Transactions on Computers, vol. 21, no. 3, pp. 269-281, March 1972.

[19] M. K. Hu, "Visual pattern recognition by moment invariants," IEEE Transactions on Information Theory, vol. 8, no. 2, pp. 179-187, February 1962.

[20] Leszek Cieplinski, Munchurl Kim, Jens-Rainer Ohm, Mark Pickering, and Akio Yamada, CD 15938-3 MPEG-7 Multimedia Content Description Interface - Part 3 Visual, ISO/IEC JTC1/SC29/WG11-W3703, La Baule, October 2000.

[21] Farzin Mokhtarian, "Silhouette-based isolated object recognition through curvature scale space," IEEE Transactions on Pattern Analysis


curvature-based shape representation for planar curves," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 8, pp. 789-805, August 1992.

[23] MPEG-7 Visual part of XM Version 8, ISO/IEC JTC1/SC29/WG11-N3673, La Baule, October 2000.

[24] Farzin Mokhtarian and Alan Mackworth, "Scale-based description and recognition of planar curves and two-dimensional shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 1, pp. 34-43, January 1986.

[25] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, chapter 8, pp. 518-560, Addison Wesley, 1992.

[26] Milan Sonka, Vaclav Hlavac, and Roger Boyle, Image Processing, Analysis, and Machine Vision, chapter 5, p. 142, PWS Publishing, second edition, 1999.

[27] M. Lybanon, S. Lea, and S. Himes, "Segmentation of diverse image types using opening and closing," in Proc. IEEE Int. Conf. on Pattern Recognition, 1994, vol. I, pp. 347-351.

[28] Michael Hansen and William Higgins, "Watershed driven relaxation labeling for image segmentation," in Proc. IEEE Int. Conf. on Image Proc., 1994, vol. III, pp. 460-464.

[29] W. Y. Ma and B. S. Manjunath, "Edge flow: a framework of boundary detection and image segmentation," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1997, pp. 744-749.

[30] T. Gevers and V. K. Kajcovski, "Image segmentation by directed region subdivision," in Proc. IEEE Int. Conf. on Pattern Recognition, 1994, pp. 342-346.

[31] X. Q. Li, Z. W. Zhao, H. D. Cheng, C. M. Huang, and R. W. Harris, "A fuzzy logic approach to image segmentation," in Proc. IEEE Int. Conf. on Pattern Recognition, 1994, pp. A337-341.

[32] Ramin Samadani and Cecilia Han, "Computer-assisted extraction of boundaries from images," in Proc. SPIE Storage and Retrieval for Image and Video Databases, San Jose, California, USA, 1993, vol. 1908, pp.


segmentation using attraction-based grouping in spatial-color-texture space," in Proc. IEEE Int. Conf. on Image Proc., September 1996, vol. 1, pp. 53-56.

[34] Dirk Daneels, D. Campenhout, Wayne Niblack, Will Equitz, Ron Barber, Erwin Bellon, and Freddy Fierens, "Interactive outlining: An improved approach using active contours," in Proc. SPIE Storage and Retrieval for Image and Video Databases, San Jose, California, USA, 1993, vol. 1908, pp. 226-233.

[35] F. Mokhtarian, S. Abbasi, and J. Kittler, "Efficient and robust retrieval by shape content through curvature scale space," in Image DataBases and Multi-Media Search, pp. 51-58. World Scientific Publishing,