Rochester Institute of Technology
RIT Scholar Works
Theses
Thesis/Dissertation Collections
1993
Retrieval from an image knowledge base
James Janicki
Follow this and additional works at:
http://scholarworks.rit.edu/theses
This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion
in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact
Recommended Citation
Rochester Institute
of
Technology
Computer Science Department
Retrieval from
an
Image Knowledge base
by
James H. Janicki
16
Hadley
Court
Approved by:
A thesis, submitted to
The Faculty of the Computer Science Department
in partial fulfillment of the requirements for the degree of
Master of Science in Computer Science.
February 23, 1993
Reproduction of this thesis
is strictly prohibited
without the prior written approval of the author.
Professor Walter
1.
Wolf
Professor Peter G. Anderson
Title of Thesis:
Retrieval from an Image Knowledge base
I,
James
H.
Janicki, hereby deny permission to the Wallace Memorial Library of
Rochester Institute of Technology (RIT), to reproduce my thesis in whole or in part. Any
reproduction will not be for commercial use or profit.
TABLE OF CONTENTS
ABSTRACT
1
Acknowledgements
2
Classification Codes
2
Introduction
andBackground
2
Problem
Statement
4
Previous
Work
6
Image Databases
,6
Case-Based
Reasoning
9
Theoretical
andConceptual
Development
10
Frames
11
Architecture
12
Ordering
13
Search
14
System
Description
15
Functional Specification
15
Limitations
andRestrictions
18
User Inputs
18
Outputs
19
System Files
19
System Specification
20
System Organizational
Chart
21
Data
Structures
22
Equipment Configuration
22
Implementation
Tools
22
Conclusions
23
Problems
Encountered
andSolved
23
Discrepancies
andShortcomings
ofthe
System
24
Future Development
24
Bibliography
25
Books
andArticles
Read for Thesis
26
Glossary
27
Appendix A
-User's Manual
30
1.
ABSTRACT
With
advancesin
computertechnology, images
andimage databases
arebecoming increasingly
important.
Retrievals
ofimages
in
currentimage database
systemshave
been designed
using
keyword
searches.These carefully designed
andhandcrafted
systemsarevery
efficient giventhe
applicationdomain
they
arebuilt for.
Unfortunately,
they
arenot adaptableto
otherdomains,
not expandablefor
other uses ofthe
existing information
andarenotvery
forgiving
to their
users.The
appearance offull-text
search providesfor
amore general search giventextual
documents.
However,
pictorialimages
containavast amount ofinformation
that
is difficult
to
cataloguein
ageneral way.Further
this
classificationneedsto
be dynamic
providing
for
flexible searching
capability.The
searching
shouldallowfor
morethan
a preprogrammed set of searchparameters,
as exactsearchesmakethe
image database
quite uselessfor
a searchthat
wasnotdesigned into
the
originaldatabase. Further
the
incorporation
ofknowledge along
withthe
images
is
difficult.
Development
ofanimage knowledge base along
withcontent-basedretrievaltechniques
is
the
focus
ofthis thesis.
Using
an artificialintelligence
technique
called case-basedreasoning,
images
canbe
retrieved withadegree
offlexibility.
Each
image
wouldbe
classifiedby
userentered attributes aboutthe
image
calleddescriptors.
These
descriptors
wouldalsohave
a"degree-of-importance"parameter.This
parameter wouldindicate
the
relativeimportance
orcertainty
ofthat
descriptor. These descriptors
are collectedasthe
"case"for
the
image
and storedin
"frames"Each
image
canvary
asto
theamountof attributeinformation
they
contain.Retrieval
of animage from
the
knowledge base begins
withthe
entry
of newdescriptors for
the
desired
image.
Along
withthe
descriptors
arethe
degree-of-importance
parameter.The
degree-of-importance
wouldindicate
the
requirementfor
the
desired image
to
matchthat
descriptor.
Again,
a variablenumber ofdescriptors
canbe
entered.After
all criteria areentered,
the
system will searchfor
casesthat
have any
level
of matching.The
system will usethe
degree-of-importance both in
the
knowledge base
aboutthe
candidateimage(s)
andthe
degree-of-importance
onthe
searchcriteriato
orderthe
images. The ordering
process will use weighted summationsto
present arelatively
smalllist
ofcandidateimages.
picturesof
any
subject matter.Retrieval from
the
knowledge base
provides relative matchesto the
givensearch criteria.
Although
this
applicationis relatively
simple,
the
basis
ofthe
systemcanbe easily
extended
to
include
amore sophisticatedknowledge base
andreasoning
processas,
for
example,
wouldbe
used
for
a medicaldiagnostic
applicationin
the
field
ofdermatology.
1.1.
Acknowledgements
I
wouldlike
to
acknowledgethe
support receivedfrom
the
RIT
thesis
committee,
in
particularthe
energy
andenthusiasmreceived
from
Walter Wolf. I
wouldalsolike
to thank
my
family
for
being
understanding
and supportiveofthis
effort.1.2.
Classification Codes
Using
the
computing
classificationcodesby
the
ACM,
the
primary
classification codeis
H.3.3
-Information Search & Retrieval.
Secondary
codesare:1.2.4.
1.4.0.
andH.3.1.
2.
Introduction
and
Background
The
economicclimatein
today's
worldforces businesses
to
rely
on computersfor
their
competitiveadvantage.
With the
everincreasing
capabilities and continueddecreasing
cost oftoday's
PC, they
arecommonplace
in
the
business
community.PCs
containhigh-resolution
displays,
large
storagecapacities andfast
processors.They
providesophisticatedapplicationsthat
non-programmers requireto
runtheir
business. Current
PCs
allowfor
displaying
of graphicsinformation in
additionto text.
Graphics
andimages
play
anincreasing
rolein
the
communication ofbusiness
relatedinformation. The
expression"a
pictureis
worthathousand
words"indicates
how
images
represent a vastamountofinformation
that
is
extremely
difficult
to
describe
in
textual
form.
The
trend
in
computertechnology
is continually focused
on graphicsandimages
by
uses such as* graphical user
interfaces
(i.e., X-Windows,
Microsoft
Windows);
*
the
incorporation
of graphicsinto
standarddocuments
viasophisticated wordprocessors and
desktop
publishers;
*
draw,
paint, image
andpixel manipulationsoftware;
* presentationandbusiness
graphicssoftware;
and*
the
increasing
popularity
of multimediaandfull
motion video.PCs
display
graphicalinformation
andimages
onhigh-resolution
color monitors.A
minimum of512
X
480
pixelresolutionis
requiredto
closely
examineastillimage1Current
rasterdevices
(VGA,
SuperVGA, XGA)
exceedthis
minimumandhave
resolutionsthat
approachthe
1
million pixel range(1280
X
1024
pixels).Each
colorpixelcanbe
representedby
8, 9,
16
or24
bits;
the
most common
being
proceed
to
aslarge
as21
inches.
This
equipmentprovidesfor
acompletely
newinterface
anddifferent
applications
to
its
users.Further
advancesin
mass storagetechnology
have developed disk drives
withhundreds
ofMBytes
of storage(approaching
GByte
range) in
arelatively
smallpackage at avery
affordable price.This
increase
in
storagehas
madenon-textualforms
ofinformation
available morereadily
to
users(i.e.,
1 MByte
ofinformation
canprovideasingleVGA-resolution image
orapproximately
6
seconds of audio).A
common usefor
opticalstoragehas
been
the
recording
ofaudioinformation
onCD-ROM. This
opticalformat
providesfor 30 to 60
minutesofdigitized
musicthat
is
very
portable andhighly
reliable.In
aPC,
aCD-ROM
can containacombination oftext,
binary,
graphical,
image,
and audioinformation.
Use
of compressiontechniques
canfurther
extendthe
usefulstorage capacity.Use
of mechanicaljukeboxes
providefor
anarray
ofCD-ROM's
withautomatic access.Thus,
optical storagetechnology
providesthe
convenient mechanism and vast storage requiredto
create animage knowledge
base.
An
emerging
technology/product
using CD-ROM for DCI
is
PhotoCD^This
technology
providesfor
the
accuraterecording
ofover onehundred
35mm
photographic pictures on a standardCD-ROM (typical
capacity
of650 MBytes).
The CD-ROM
containsseveral resolutionsfor
eachimage starting from
alow
resolution of64
x96
pixelsfor icons
to
its
highest
resolution of2048
x3072
pixels.Images
are compressedto further
increase
theamount ofinformation
stored onthe
CD-ROM.
The
PhotoCD
is
readableby
any CD-ROM XA
drive.
It
allowsany
userto
create a customCD-ROM
withtheir unique picturesby
using
a35mm
camera.Thus,
PhotoCD
can provide alarge
areaofimaging
applicationto the
PC desktop.
However,
givenits
recent marketintroduction,
the
currentsoftwarefor
the
productis limited
to
a simplePhotoCD
Access
Developer
Toolkit. This
toolkit
providesfile
systemtype
accessto
images
onthe
PhotoCD.
The
softwarecapability
to
"index"the
images
is
notcurrently
available.For
consumerpurposes,
using
a"contact
sheet"(
containing
picons andimage
numbersthat
is
printedandincluded
in
the
jewel
case whenthe
images
arerecordedto the
CD-ROM
)
allowsfor
retrieval ofthe
desired image.
The
usersimply
providesthe
desired image
numberprinted onthe
contactsheetfor
ahigher
resolutionimage
to
be displayed.
Given
currentprocessing
speedsandmonitorcapability
onthe
PC,
displaying
severalpiconsis feasible.
With
these
picons,
the
user can"browse"
to
determine
whichimages
areofinterest
and selector markthose
desiring
a moredetailed
study.Those
markedimages
(hopefully
a smaller set)
canbe displayed in
ahigher
resolutiononeimage
atatime.
The
display
time
for
these
high
resolutionimages
willbe
dependent
onthe
hardware
and softwareinvolved
andthe
resolutiondesired
onthe
display
device (which
could
be different
than the
stored resolution).After
further
examination, the
user candetermine
whichimage
mostclosely
matchestheir
needs.Given
alarge
setofimages,
this
process canbe very
time
andaccurate meansof
electronically
indexing
ofthese
images
is
required.The
most sophisticatedform
ofretrieval would
be from
animage knowledge base.
2.1.
Problem Statement
Although
browsing
through
images
using
aPC
andaPhotoCD
canbe
fun
andinteresting
for
aconsumer,
browsing
is
notatechnique
that
is
usefulin the
business community
given alarge
database
ofimages.
Further,
piconscannotprovideenoughinformation
for
a reasonable selection process sincethey
areimage
dependent
(i.e.,
large
panoramicview versus aclose-up
of a specificobject, bright
outdoorlighting
versusnight
time).
Further,
there
is
no supportinformation
availableto
aidethe
selectionprocess.Having
someform
ofindex information
wouldallowfor
the
images
to
be
organized.This
organizationwould providefor
aquick searchthrough the
image database
to
provide a promptresponseof a single or small set ofimages
given somespecific setof criteria.This
prompt retrievalis
criticalin
abusiness
application.This
is
evidencedin
the
RightPages Image-Based Electronic
Library-^where
displaying
offront
covers of recentjournals is inadequate
to
determine
the
journal
articlesofinterest.
The
system usesthe table
ofcontents
from
eachjournal along
witha user's profileto
highlight
journal
articles ofinterest
to the
user.Thus,
it is
essentialthat
a set ofindex information
orknowledge
needsto
be incorporated
with eachimage
in
the
database.
To incorporate
information in
the
image
database,
adesigner
woulddetermine
theappropriatesetof attributesto
be
recorded.This information
wouldbe
applicationdependent. It
wouldbe
organizedby
thetype
of retrievalsexpecting
to
be
preformed.Traditional
methods would suggestthat this textual
information
wouldbe
storedin
a relationaldatabase
withan applicationdependent
set ofcolumnsortuples
representing
the
attributes.The
end-userofthe
system wouldgothrough
adata entry
processfor
each
image
in
the
database. The
data entry
activity
would require a set offorms
orscreensthat
would allowthe userto
enterthe specifictype
ofinformation
desired
giventhe application.The
forms
would also perform somelevel
ofvalidationonthe entereddata
(i.e.,
weredigits
enteredfor
anumericfield,
was a validdate
enteredfor
adate
field)
andpotentially
aconversionbefore
storing
the
information in
thedatabase
(i.e.,
converting
anASCII
string
representing
the
entereddate
suchas"November
11,
1992"into
an
integer
value).The
net effectis
thatwehave
two
databases in
the
system,
onefor
theimages
andonefor
the textual
information
that
describes
the
images.
Difficulties
in
building
the textual
database include
the
designer's ability
to
incorporate
orevenappreciate
allof
the
attributesnecessary
for
the
applicationdomain
ofinterest. It
wouldinclude
the
programmer's
ability in
developing
sufficient audit andvalidationroutinesto
insure data is in
the
properformat. The
data
entry
person would requiretraining
onhow
the
system wasbuilt
and operates.They
would also needfield. Potential
areasof concerninclude:
Can
the
designer
anticipate allthe
necessary
fields
requiredin
the textual
database
to
satisfy
alldesired
retrievals?What is
theprogrammer'sability
to
eliminate alldata
entry
errorsthrough
comprehensive validation?What is
the
data entry
person'sability
to
properly
key
in
the
neededinformation
withoutmistake?How
to
determine how
closely
a picture's attributes matchthe
desired
selectioncriteria?Even
after wehave built
thistextual
database
and attemptedto
minimizepotential
problems,
stillother questions are raisedsuchas:Can
thesystembe
extendedto
accommodate newretrievals?Can
the
retrievals providethe
images desired
given partialinformation
for
retrieval?Can
theappropriateimages be
retrievedgiven somerelatively
small amount ofdata
entry
error?What
is
theprobability
ofhaving
thedesired image
notbe
retrievedgiventhe
search parameters specified?To
minimizeall ofthe
concerns raisedin
the
traditionalapproach,
a number of system requirements aredesired.
First,
the
incorporation
ofknowledge into
theimage database
shouldbe
as generalaspossible.It
should
only be
asdomain
orapplication specific asnecessary
to
providefor
efficient retrieval.It
should minimize theneedfor
arigid
structure and shouldmaximizethe
user'sability
to
specify
thatknowledge
(avoid
limiting
the
attributes ofthe
image
to
a small predefined set).Further,
the
knowledge
shouldbe
understandable
to
all usersand notjust
the
designer,
the
data entry
person orthe
domain
expert.The
structure ofthe
knowledge
is
important
in
being
ableto
providefor
an efficient retrieval.The
knowledge
shouldbe
easily
extendedto
include
additional attributes of eachimage.
The
systemshouldperform relative searches ratherthan
exact matches.The
system should act as a whole wherethe
images
andknowledge
areclosely integrated.
The
goal ofthis
thesis
wasto
develop
animage knowledge base
thatwouldgrow asthe
numberofimages
increased.
Each
image
wouldhave
a setofdescriptors
describing
its
contentand/or useasit is
added.These
descriptors
wouldbe relatively
simplesentencesmaking it easy
to
addimages. These descriptors
provide
for
factual information
aboutthe
contents ofthe
image
aswellas subjectiveinformation. The
descriptors
would contain relationships.In
additionto the
descriptors,
the
usercanspecify
adegree-of-importance
witheachonedifferentiating
majorfrom
minorfocus.
The
retrievalof animage from
the
knowledge
base
wouldrequireinput
ofa setofdescriptors
withtheir
corresponding degree-of-importance in
this
search.The
set ofdescriptors
wouldbe
compared againstthe
knowledge base
to determine the
relative matches.The
setofimages matching any
ofthe
searchdescriptors
wouldbe
further
analyzedby
the
number ofdescriptors
that match,
the
degree-of-importance
associated with each search
description
andthe
degree-of-importance
with eachimage in
thematching
set.The
retrieval would provide an orderedlist
ofimages
based
upon eachimage's ability
to
matchthe
specifiedsearch criteria.The
results ofthe
search wouldbe
presentedin
a mannerthat
would allowfor
reasoning
processthat
determined
the
order.It
couldbe
possibleto
extendthe
resultsto
include
somesmall amountofaudio
information.
Inherent
in any
large knowledge base is
the requirementfor
an extensive amount ofdata
entry.In
knowledge bases developed
by
anexpert,
it is
usually
the
expertordeveloper
that
performsthe
data entry
activity
ratherthan the
end-user.Since
the
Smart Photo Album is
so general anapplication,
it
wouldbe
impossible
to
find
anexpertthat
wouldsatisfy
allconsumers.Also
the
picturesincorporated into
theimage knowledge base
areuserdependent. Thus
theend-user ofthe
Smart
Photo Album
willbe both
the
data
entry
personandthe
retrievalpersonoftheknowledge base.
2.2.
Previous
Work
2.2.1.
Image Databases
Previous
workin
image databases has focused
onboth
graphical andimage information. Graphical
information
is
typically
representedasvectorinformation
(i.e.,
CAD
drawings). Images
are representedin
rasterformat
whereeach pixelis
representedby
avalue;
all pixelsthat
makeup
theimage
arefully
represented.A
number of approaches willbe
presentedthat
indicate
themannerin
whichimage
databases
arecurrently
being
developed. Some
ofthem
deal
with vectordata
whichis
muchfarther
removed
from
the
currentscope.However,
abrief look
attheir
architecture willbe
examinedfor
any
parallels.
In
graphicaldatabases,
retrievalis
typically
based
onimage
content.Given
applicationdomain
knowledge,
objectswithinthe
image
arerecognizedusing
different
objectinterpretations,
degree
ofrecognition,
andtheir
positionin
the
image
space.Images
canbe
composed of complexobjects'*These
complex objectsare
decomposed
into
simpler graphicalshapesthat
canbe
moreeasily
assimilated.Classification
is
limited
to
a welldefined
set ofbasic
objects.The
maindisadvantage
ofthese type
ofgraphically
orienteddatabases
is
the
needto translate
therasterinformation
ofthe
image into
vectorinformation. If
weareunableto translate the
rasterinformation into
arecognizable object(defined
by
a set ofvectors), then
theclassificationprocessfails.
Another
retrieval approachfor
graphicaldatabases
has focused
onshapesimilarity
retrieval5.An
objectis
representedin
terms
ofits
local
structuralfeatures.
Each
structuralfeature is
represented as a pointin
amultidimensionalspace.
Indexing
is
performedusing
amultidimensionalpoint access method.The
technique
canhandle
occludedandtouching
objects.Other
researchin image
databases
deals
with computer visionand machineintelligence
wherethe
focus is
navigate
through
aroom witha numberofobstacleswithinthe
room.Another
common applicationis
the
analysis of aerial
photographs
desiring
to
identify
specificobjects suchasroads,
houses,
etc. SIGMA6is
such asystem.
The
architecture consistsoffour
mainmodules:the
"Geometric
Reasoning
Expert"usesscene
domain knowledge for
spatialreasoning among
objects,
the
"Model Selection
Expert"whichreasons
about appearancesofobjects
in the
image based
onknowledge
aboutthe mapping, the
"Low-Level Vision
Expert"
whichperforms
image
segmentationusing image domain
knowledge
andthe
"Question
andAnswer
Module"The
first
three
modules are analysis modulesto
constructthe
description
ofthe
scene.The last
module realizesthe
interactive information
retrievalfacility
to
examinethe content ofthe
constructed
description.
Unfortunately,
thedecomposition
andrecognitionprocessfor
any
ofthese
systemsis
limited
to
specificobjectswithin
its domain.
These
objectscannotbe
generalized.Given
the
Smart
Photo
Album,
it
wouldbe impossible
to
createsuchadomain
knowledge.
Further,
the
query
process entailsthe
userdescribing
the
shapeofthe object and/orlocation information
withintheimage.
This
process wouldonly
makesensein
our applicationif
the
userhad
rememberedthe
approximatecontentofthe
desired
image.
Also,
it
would
be
very
difficult
to
provide auser-friendly query language
(it
would requirethe
userto
draw
the
shape
desired
in
orderto
begin
searching).At best
this
approach might provide afiner level
ofsearchafter agrosssearch was performed with some other mechanism.
Another
approachthat
has been primarily
usedin
the
art-history field is
Iconclass
'Each
image is
givenan alphanumericcode
that
representsits
positioninto
alarge,
well-defined,
hierarchical
catalogue.This
code represents
the
subject of animage
and/orthe
elementsin
the
image
(depending
ondepth). As
anexample,
animage
of a mountainlandscape
wouldhave
the
following
classification(only
showing
a1.
Supernatural,
God,
Religion
2. Nature
>21 The four
elements22 Natural
phenomena23
Time
24 The
heavens
25
Earth
>A.
Maps,
Atlases
H.
Landscapes
>25H1.
(Landscapes)
2SH11. Mountains
3. Human Beings
4.
Society, Civilization,
Culture
5.
Abstract Ideas
andConcepts
6.
History
7.
Bible
8.
Non-Classical
Myths,
Tales
andLegends
9.
Classical
Mythology
andReligion
Although
this
schemeis
quite efficientin
theart-history
domain,
it
requires all usersto
be
intimately
familiar
withthe
large
set of classesto
perform a successful retrieval.Further,
providing
a singleclassificationcan
be
limiting
and error prone.This
wouldbe
the
same situation astrying
to
usethe
Library
ofCongress
orDewey
Decimal
Systems found
in libraries. The librarian
understandsthenotationbut
usersrequireacatalogueto
be
ableto
determine
the
number andlocation
of adesired book.
Retrieval
ofimages
from
animage
database
using
keyword
searchis
quite common.A
number of chosenwords are used
to
describe
the
image,
typically
using
terminology
in
the
application'sdomain. These
wordsare
hierarchically
populatedinto
a relationaldatabase
that
providefor
quick retrieval(refer
to
previous section).
However,
images
containavastamount ofinformation
that
is difficult
to
capturein
arelatively
small number ofdatabase index
fields.
These
indexes
tend to
be
static andrigid making
anything but
exact searches possible.Exact
searches makethe
image database
quiteuselessfor
a searchthat
wasnotdesigned
into
the
originaldatabase.
The
shortcomings ofkeyword
searcheshave led
to
full-text
searchesorcontentbased
retrievalsystems8Here images
containtext
that
is
recognizedby
anOCR device. The
text
is
usedfor
indexing
purposeswithout
the
needfor
auserto
performany data
entry (other
thaninitial OCR
regions).Retrieval is
is
searchedusing
exactwordmatchesfor
each search wordandproximity
rangesbetween
each word.No
predefined organizationofthe
text
is
required.2.2.2.
Case-Based
Reasoning
Rule-based
expert systemshave been
a majorfocus
in
artificialintelligence
researchanddevelopment.
The
basic
unitofknowledge is
the
rulewhichconstitutes aconditionaltest
andresulting
action pair(typically
modeledby
"if
<condition>then
<action>).
Shortcomings
of rule-based expert systemsfocus
onknowledge
acquisition,
lack
ofmemory,
andlack
ofrobustness.To
build
an expertsystem,
an expertneeds
to
be identified
who canarticulatehis knowledge
and experiencesin
suchafashion
asto
impart it
withinthe
system(either
by
himself
or withtheaide of aprogrammer).It
soonbecomes
an arduoustask
to
implement hundreds
of rulesthat
appearto the
expertas commonknowledge
(which
mightbe
difficult
to
explainor socommonplacethat the
expertforgets
that
he
performsthem).Lack
ofmemory implies
that the
expert system would performthe
exactsame set of operations given the same problemin
succession;
it did
notlearn
or rememberfrom
thefirst
problemit
solved.Expert
systemslack
robustnessin
notbeing
ableto
respondto
a problemif
none ofthe
rules apply.Worse
wouldbe
aninappropriate
responsegiven
lack
of rules ordomain knowledge.
Case-based
systems are an alternativeto
rule-basedsystems.The knowledge in CBR
is found in
the
"case"
orexamplewhichrepresents
the
completeknowledge
and experience ofareal-world problem.The
system performsreasoning
by
comparing
a problembeing
input
to the
set of casesto
find
the
closestmatch.
Applications
that
have
found
successusing CBR
arethe
help
desk
andlegal
reasoning9in
domains
suchastax
law. CBR
systemstend to
be
easierto
build
and maintain than rule-based systems.To demonstrate
the
importance
being
givento
CBR
asignificantundertaking
wasthe
large
scaleimplementation
by
the
NEC
Corporation10A
corporate-widecase-based systemcalled
SQUAD
(for
the
Software
Quality
Control
Advisor)
wasdeveloped
to
supporttheir
softwarequality
control activities.The
data
acquisitionprocess started without acompletely
defined
caseformat
giventhe
lack
ofdomain
expertswho
have
awell-organizedidea
of allthefactors
involved. Goals for
the
systemincluded:
aneasy way
to
share experiences
among
150,000
employees;
aflexible
and robust systemthat
could accommodatethe
changing
nature ofacorporatestructure;
and asystemthat
candeal
withmissing
andinaccurate
information
whicharisein
real-worldinstallation. The
net effectofSQUAD
arequality
andproductivity
improvements
over ahundred
milliondollars
ayear.A
case-basedteaching
systemcalledCreANTMate
usesstory
telling
to
help
students
learn
about animals
(physical
features
andsurvival)1*.
Multimedia
is incorporated into
the
systemby
useofvideo clipsthe
student
aboutwhy
animalshave
certainfeatures
andoffersavideoclip for
evidence.Reminding
heuristics
areemployed
to
retrieve storiesto
achieve specificpedagogical objectives.Specific
strategiesinclude:
example remindings-to
provide
examples,
similarity-basedremindings-to
assistin
generalization,
and expectationviolationremindings-to
challenge
the
student'sexpectations.Retrieval
from
CBR
systemsfall
into
three categories:inductive,
nearest neighbor andtemplate
retrievals12Inductive
retrievalinvolves
traversing
ahierarchical knowledge
structurein
the
form
ofadecision
tree
looking
for
aleaf
nodeproviding
a similar caseto the
user.The inductive
methodrequiresa predefinedset ofindexes
of relevantfeatures
to
build
the
decision
tree.Without
the
index,
the
hierarchical
structure cannotbe
created.The
nearestneighbor retrievalperformsafeature-by-feature
comparison oftheinput
caseagainst all casesin
the
library.
Using
aweighted vector afeature
scoreis
generated withthehighest
scorebeing
the
closest match.
Typically
a scoreofonehundred
(100)
indicates
a perfectmatch whereascore of zero(0)
indicates
nomatch at all.Each
feature
willeffectthe
overall score ofthe
case,
based
onits
weight.Template
retrievals are usedfor
directed
searchesofa caselibrary
when youknow exactly
what you want.A
template
is defined
for
matching
on specific characteristics orkeywords
asin
anSQL
query
ofa relationaldatabase.
2.3.
Theoretical
and
Conceptual
Development
The knowledge base
being
built has
specifictradeoffs.
Ideally
the
system should providethe
mostexpressivemeans
to the
developer
to
providedescriptors
oftheimages.
This
implies
nolimitations
onthe
vocabulary
and sentence structuresthat
canbe
enteredby
the
user.However,
this createspotentially
anintractable inference
problem oratbest
apoorperforming
system.Thus,
analternateimplementation
would
be
to
allowthe
developer
to
enter statementsin
ageneral,
structured representationlanguage,
whichthe
system wouldtranslate
into
arestricted,
internal
language
or representationthat
allowsfor
efficientinference. If
anexacttranslation
is
impossible;
abest
approximation willbe
usedto
speedup
inference
withoutgiving up
correctness or completeness.The image
knowledge base
wasbuilt
from
aset ofdescriptors
andimportance factors. The developer
providesaset of simple sentences asdescriptors for
eachimage.
Each
sentenceis
ofvarying length.
Sample
sentences are:"Girl wearing
redhat",
"Sailing
raceonsunny
day",
and"Vacation in Colorado".
The
setof sentences willbe
ofvarying
number.The
maximumallowablenumber ofsentenceswillbe
arelatively
large
number with an absolute upper-bound.Along
with eachdescriptor is
aThe
systemparseseachsentenceordescriptor
providedby
the
developer.
Using
a context-freegrammar,
the
parserdetermines
if
the sentenceis
in
properform.
A
dictionary
orlexicon
is
usedto
performalookup
for
each wordin the
sentence.The
dictionary
validates properspelling
of words.It
also providesthe
typeof word(i.e.,
verb, noun,
adjective).By
passing
the
type
of wordback
to the
parser,
the
parsercancontinuewith
the
restofthe
sentenceto
validatethe
sentencestructure.If
the
sentenceis in
properform,
it is
accepted and storedin
the
frame.
If
awordis
unrecognizable, the
systeminforms
thedata
entry
person ofthe
wordthat
couldbe
misspelledornotfound
in
the
dictionary. If
the sentencehas
aninvalid
structure,
then the
parser wouldsoinform
the
data
entry
person.2.3.1.
Frames
All
ofthe
information
withinthe
image
knowledge base is
storedin
frames.
Each
image
has
its
ownframe.
Within the
frame
arefour
majorslots(see Figure 1). Each
slot contains acategory
ofinformation
aboutthe
image. The first
slotcontainsimage
location
information
whichincludes fields
such asCD-ROM
volumename,
image
filename,
maker ofCD-ROM,
versionof product usedby
makerandmostimportantly
a system-wide unique object-id.Image
Frame
Slot
#1
Slot
#2
Slot #3
Slot
#4
image
location
data
descriptor
descriptor
reasoning
information
statistical
data
In
the
secondslotis information
providedby
the
developer
abouttheimage.
There
are a set ofdescriptors
along
withthe
corresponding degree-of-importance. These
descriptors
aretranslated
into
aninternal
format
to
minimize storageandmaximizeretrieval performance.For
the sake ofthe prototype, the
original
descriptor
aswellasthe
internal format
are maintained.However,
the real product couldeliminate
the
originaltext
if
storagebecomes
anissue.
Also,
thereexist some staticfields
suchasname ofpicture,
date
ofpicture,
author ofpicture,
and picturelocation.
The reasoning
slot contains additionalinformation supporting
the
domain knowledge
providedby
the
field
expertwithinthe
system.This
slotincludes:
questionsto
askthe
userto
further clarify
asearch,
provide references
to
othercases,
provideadditionalinformation
aboutthiscase such as conclusionsto
draw
orrecommendationsto
follow. This
slothas
noapplicability
in
the prototypeapplication ofthe
Smart Photo Album.
However,
for example,
it
wouldbe
vitalto
amedicaldiagnostic
knowledge base
providing
the
name ofthediagnosis,
pertinentinformation
aboutthe
case and potentialtreatments
for
cure.
In
thestatisticaldata
slot willbe information
recordedby
thesystem as operations are performed.Fields
will
include: date
andtime
theimage
wasenteredinto
system;
date
andtime
oflast
retrieval;
anddate
andtime
oflast
modification.The
statisticalinformation
willbe
transparent to the
end-user ofthe
system.It
will
have
limited
value withintheprototype;
however,
it
couldbe
usedfor
suchthings
ascaching
to
improve
system performance.2.3.2.
Architecture
Figure
2
outlinesthe
major architecturalcomponents ofthe
prototype.Architectural
componentsinclude:
the
knowledge
base,
the
controlmechanism,
andthe
inference
mechanism.The
knowledge base
containsthe
image
frames,
theirindexes,
the
dictionary
and thethesaurus.The
control mechanismcontainsthe
allowablesentence
structures,
weightsfor
each value ofthe
degree-of-importance
factor,
confidencefactors,
and searchheuristics. The inference
mechanismcontainsthemodulesto
supportsearching
ofControl Mechanism
weights
sentence structure
heuristics
\.
N
Ni
Inference Mechanism
search
ordering
additions
Data Stores
dictionary
thesaurus
Knowledge Base
frames
Figure 2
-Organization
of
the
knowledge base
For
theprototype,
it is
assumedthat there
willbe only
oneknowledge base
or casebase.
2.3.3.
Ordering
Given
thesparsedomain
ofknowledge,
it
wouldbe difficult
to
organizeahead-of-time theinformation
that
we willbe
receiving from
the
user13This
wouldonly
be
possibleif
the
system wasbuilt
by
adomain
expert.
Even
if
adequatedata
existed,
knowledge
aboutthe
mix of queriesto
be
made are requiredto
create astructure.
Thus
to
eliminatebias in trying
to
identify
astructure,
a generalanddynamic
approachis
required.The
image frames
arehierarchically
orderedusing
a setofindexes. This hierarchical ordering
willbe
used
to
reduceretrievaltime.
Each
index
containsthe
list
ofimages
in
the
knowledge base
that
contain agiven
descriptor
word.Each descriptor
wordthat
has
atleast
oneimage in
the
knowledge base
willhave
an
index.
Initially
asmallimage
knowledge
base
willhave
avery
sparse setofindexes.
However,
asthe
knowledge
base
grows,
the
indexes
willbegin
to
expand.This
expansion willtheoretically
be along
allpictures.
Also
the
subject matterfor
a givenknowledge
base
shouldhave
similaritiesevenif
this
is
at avery
generallevel.
The ordering does
not provide atrue
hierarchical ordering
giventhe
multipledescriptors
andwordsin
each
descriptor for
animage. The
ordering
only
accountsfor
the
first
phaseofthe
search whichis
to
drastically
reducethe
numberofimages in
the
knowledge base
to
asmall setthatcanbe
morecarefully
analyzed.
2.3.4.
Search
The
search processis
performed as a phased processusing
severalstagesofprocessing.Before
the
searchprocess
begins,
the
descriptors
entered assearchcriteriaare validatedusing
the
sameparser usedduring
data
entry.Given
a set of validdescriptors,
these
willbe decomposed
into
a completelist
of wordsto
searchon.
Each index
relating
to
a wordin
the
list
willbe
searchedproviding
alist
ofimages
that
matcha word
in
oneofthe
descriptors.
The individual
image lists
willbe
combinedinto
a singlelist
eliminating
all redundant
images.
However,
the
information
aboutthe
images
in
thelist
willbe
expandedto
include
the
number ofoccurrences thatimage has
in
the
set ofimage lists. This
singlelist
ofimages
completesthe
first
stage ofprocessing
whichprovidesall potentialcandidatesin
the
image knowledge base
that
have
some match
to the
search criteria.The
second stageofprocessing begins
the
analysis ofthecandidates.The
analysis process attemptsto
rankthecandidate
images
by
similarity
to
the search criteria.The
singlelist
ofimages is
sortedbased
uponnumber of occurrences.
Thus,
the
images
withthe
mostnumberofmatching descriptor
wordsarepotentially
the
mosthighly
rankedimage
(excluding
any
degree-of-importance factors
taken
into
consideration at
this time). Each
candidateis
then
compared and a weighted sumis
calculatedfor
eachmatching
attribute.When
calculating
thesimilarity between
thequery information
andacandidateimage,
the
methodignores
attributes consideredto
be
irrelevant10(i.e.
additionaldescriptors
that
arepartof
the
image
but
do
not match the search criteria will notbe
used).Weights
areassignedto
the
degree-of-importance
factors
for
each candidateimage.
A
higher
degree-of-importance
wouldbe
astrongerindication
ofthatattribute withintheimage.
Additionally
weights areassigned
to
the
degree-of-importance
factor
onthe
searchcriteria.The
higher
thefactor,
thestronger arequirement
for
aparticularattribute given other searchattributes.Each
candidateis
viewedseparately
for
atotal
matchagainst all search criteriainformation.
The
resultsofthe
searchis
an orderedlist
starting
withthe
highest
weighted sum.The
useris
ableto
viewthe
information
(actual
image
andAnother
stageofthe
search wouldbe for
searchesthat producezeroresults.Here
chaining
willbe
usedto
expand
the
list
of search candidates.13 A
thesauruswillbe
usedto
find
alternatewordsto
those givenassearch criteria.
These
alternate words will thenbe
usedto
redothe
search.Hopefully
using
the
alternatewordsat
least
one match canbe found
withinthe
image knowledge base.
If
a search still produces a zero result even afterchaining,
then the
system willhave
anopportunity
to
perform
failure-driven
learning11
Since
the
userhad
an expectation
that
failed
to
be
visualized, the
usershould
be
ableto
adjust the caseswithinthe
knowledge
base. This
couldbe
in
the
form
ofmodifying
aparticularorsetofcases.
It
could alsobe
by
expanding
the thesaurus.
It
wouldbe
mostbeneficial if
the
systemcould
determine
whatmodificationsmightbe necessary
to
achieve theuser'sexpectations.Although
this type
oflearning
wouldbe desirable
it
is beyond
the
scope ofthe
prototype.3.
System
Description
The
focus
ofthis
thesis
is
the
implementation
ofa prototype called"Smart Photo
Album"This
prototypeembodies
the design
ofageneralimage knowledge base
using
the
PhotoCD
product onthe
IBM PC
platform.
The
knowledge
aboutthe
images is
enteredby
adeveloper using
a context-free grammar.A
dictionary
willbe
usedto
validatethe
words entered.The
end-user willenter queriesin
the samemanner.The
system uses case-basedreasoning
asits searching
technique to
retrieveimages
from
its knowledge
base. The
search provides the closest approximationto the
userenteredsearch criteria orquery
information in
an orderedfashion.
Although
the
prototypeis
not afully
functional product,
enoughdesign
efforthas
been incorporated
into it
to
understand whatthe
entire product wouldlook
like.
3.1.
Functional Specification
The
interaction between
thesystem and all outside sources canbe
viewedfrom
the
Context
Diagram
asshown
in Figure 3. The
outsidesourcesto
the system aretheDeveloper
andthe
User. All
othercomponents,
like
the
PhotoCD,
areincorporated
withinthe
system.Note
thatalthoughthe
Context
Diagram
showsaDeveloper
and aUser,
this
caneasily be
the
same personandmostly
willbe in
the
caseof
the
Smart
Photo
Album.
In
situations wherethe
system wouldbe
usedin
the
development
ofanexpertsystem
(i.e.,
allrequiredknowledge
wasalready
incorporated
into
thePhotoCD
and softwareby
adomain
expert), the Developer
wouldbe
the
expertperforming
the
development
andthe
User
wouldbe
the
Developer
\
new\
image infodisplay
image search
criteria
unage
Figure
3
-Context
Diagram
The breakdown
ofthe
contextdiagram
canbe
found
in Figure 4.
The
systemhas
two
mainfunctions
orcomponents.
It
alsohas
three
data
stores.The
first function is
the
ability
to
addimages
to the
knowledge
base. This
willrequiredescribing
the
image
to the
systemby
providing descriptors
aboutthe
image. The
additionof
the
images
to the
CD-ROM
willbe distant
in
time
from
thedata
entry time.
Certainly
a good photographer willkeep
notesas pictures are taken aboutthespecifics of each picture.These
notesareinvaluable
during
data
entry.However
for
thecasualphotographer,
the number ofpicturestaken
is
notlarge,
allowing for
the userto
reconstructthe
circumstances ofthe
picture asthe
useris
looking
atthe
Image Frames
Thesaurus
Figure 4
-Data Flow Diagram 1.0
The
mostcommonsequenceofeventsis
for
the
userto take
aroll ofpictures.Then
the
user willhave
the
pictures
developed
and addedto the
latest
PhotoCD. If
this
is
the
first
roll offilm
everenteredonaPhotoCD,
orif
the
last development
processfilled
the
existing
PhotoCD,
thena newPhotoCD
willbe
created
by
thefilm developer. Next
the
user willload
the
PhotoCD
into
the
drive
withthe
Smart Photo
Album
applicationactive.By
activating
the
"add"function,
the
system will validatefor
the
last known
image
andlook for
additions onthe
PhotoCD.
The
system canthen
prompttheuserfor
information
abouteach new additional
image.
The
data
entry activity
addsto the
Image Frames
data
store.The
Dictionary
data
storeis
usedin the
validationof entered wordsas part ofthe
descriptors. The
Dictionary
is
usedin
aread-only
fashion
during
data
entry.The
secondfunction
wouldbe
the
retrieval of animage
by
specifying
searchcriteria.The
search criteriais
enteredas a setofdescriptors. The
Dictionary
data
storeis
usedin
the
validationasbefore. The
retrieval can
be
aniterative
processproviding
more orless
criteriadepending
uponthe
number ofimages
used
for
wordlookup
in
case theinitial
retrievalfailed
to
produceany
results.The
Thesaurus
data
storeis
used
in
aread-only
fashion
during
retrieval.3.1.1.
Limitations
and
Restrictions
The
prototypedeveloped for
this
thesis
is limited in
scope anddoes
not performthe
following
functions:
*
natural
language processing
-The
systemis limited
to
simple sentence structuresthat
arenotrigorously
validatedagainstthe
English
language.
*
Only
a subsetofthe English
language
willbe incorporated
and noforeign languages.
*
The
dictionary
is limited
in size;
however
it
canbe
extendedby
theuser.*
Although
the
data
structuressupportit,
the
thesaurushas
notbeen incorporated into
the
system.These is
nomeansto
enterthisinformation.
Also,
it
will notbe
availableduring
a search evenwhenasearch
fails
to
produceany
results.Other
creativewaysfor
its
usehave
notbeen
explored.
*
Multiple
searchcriteria entered
is ANDed
together.
Other boolean logic
operators suchasOR
and
NOT
are not supported.*
The
systemsupportsa single
CD-ROM Disc.
Support
for
aset ofCD-ROM's
which wouldinvolve
asking
the
userto
load
a specifiedCD-ROM
in
the
drive
(or
by having
ajukebox
performthat
action)
is
notincorporated.
*
No
supportfor
aCD-ROM jukebox hardware. The
systemis
built
witha singleCD-ROM.
*
The
system willnotallowthe
userto
modify
the
"control
structure"information (i.e.
weightsassigned
to
the
degree-of-importance,
cutoff on maximum number ofimages
assearch results).*
There
system utilizesonly
oneknowledge
base
or casebase
at atime. Multiple knowledge
bases
activated and relationships
between
them
are not supported.*
Failure-driven
learning
willnotbe
performedas part oftheprototype.The Smart
Photo Album
prototype requirestheuserperform alldata
entry
into
theknowledge base.
However,
the
system couldbe
easily
adaptedto
aknowledge
base built
by
an expertby incorporating
the
additionandmodificationof
images in
the
knowledge
base into
the
development
tools
for
the
expert.The
data
entry capability
wouldthen
be
removedfrom
the
applicationpresentedto the
end-user.Although
this
variationofthe
systemcouldbe
developed,
it is
outside ofthe
scope ofthis
project.3.1.2.
User Inputs
As
outlinedin Figure
4,
there
aretwo
input functions
to
perform.The
first
willbe for
the
developer
to
will
be
for
the
end-userto
enter searchcriteria.At
any
time,
the
user canalsoaskfor
apictureto
be
displayed
from
any
type
oflist
of pictures.In
addition,
all users arerequiredto
respondto
errorpromptssuch as anincorrectly
enteredimage
descriptors.
Along
the
samelines,
alluserswillneedto
acknowledge
informational
messages.User
input is developed
using
a graphicaluserinterface
(GUI)
for
reducedtraining
by
the
user,
optimaluser
productivity
andfriendliness.
The
input follows
standard conventionsfor
the
Microsoft
Windows
GUI
as publishedby
Microsoft. Input is
providedby
the
userfrom
the
pointing device
(i.e. mouse)
andkeyboard. Accelerator
orfunction keys
are utilizedin
commonly
accessedfunctions along
withappropriate
defaults.
All
detailed input
such asdescriptors
use modaldialog
boxes.
3.1.3.
Outputs
The
systemhas
twoforms
of output.The first is
rasterinformation in
the
form
ofvarying
resolutions ofimages
and picons.The
rasterinformation is displayed in
separate childwindows.The
secondis
in
theform
oftext.
All
text
is incorporated into
the
graphical userinterface in
theclient area of childwindows,
dialog
boxes
andnotify boxes.
Key
areas of notificationto
theuserdeal
witherrors or progress made.When
descriptors
arebeing
entered,
the
system validatesthe
wordstyped.
If
any
wordsfail
to
be found
within
the
dictionary
orany
sentence structuresfails
to
be properly
parsed,
anotify
box is
presentedrequesting
the
problembe
correctedbefore
proceeding.3.1.4.
System Files
A
number ofdata files
existin
the
prototype.They
include:
1
.dictionary
orlexicon
- containsall valid wordsthatcan
be
used asimage descriptors for
allimages
withinthe
case2.
thesaurus
-lookup
for
alternate wordsto those
found in image descriptors
to
expand agivensearch
(part
ofdictionary)
3.
image knowledge base
-image frames
collectively
stored within afile
representing
agiven photoalbum
4.
knowledge base index
-a
file matching
descriptor
wordsandimages
withinthe
database for
increased
performanceduring
searchesThe
dictionary
andimage descriptors
aretightly
integrated.
Each
wordin
the
dictionary
is
givena uniqueinteger
value.This
integer
valueis
the
internal
representationofeachdescriptor
word.Thus,
adescriptor
is
internally
stored as a sequence ofintegers
wherethe
sequenceshas
specificmeaning
to the
user.A
thesaurus
of words canbe
usedto
eitherbroaden
ornarrowthe
searchasdesired.
The
thesaurus
wouldprovide
similarwordsin
thedictionary
to those
thatthe
usercurrently
enteredas search criteriato
expandthe
search.Although
theprototypesupportsonly
asingle active photoalbum(or
image knowledge
base),
a user cancreate multiplephotoalbums.
This
does
notimply
that
eachPhotoCD
willbe its
own photo album.A
photo album
is
expectedto
have
numerousPhotoCD's
withinthe
image knowledge base.
Having
multiple photo albums would allowthe
userto
segregatethe
entire collectionofPhotoCD's into
groups.An
example might
be
a photojournalistwho wantsto
keep
business
related pictures separatefrom
picturesdealing
withtheir
personallife.
3.2.
System Specification
The
systemis developed
using
Microsoft Windows
and acombination ofC
andC++
modules.The
prototype
follows
allstandard conventions providedby
Microsoft
for
the
Windows
environment.Thus
any
useralready familiar
withotherMicrosoft
Windows
applicationswillhave
nodifficulty
in
executing
the
Smart Photo
Album
prototype.The
userinterface
portion ofthe
systemis
event-driven.The
application
has
a main menu(after
a photo albumhas
been
activated).As
part of each main menuitem
is
a pulldown menu
that
lists
allthe
actions orchoices possible.Some
menushave
asecondary
pulldownfor
a more
detailed
selection.Others
requirefurther
input
asindicated
withan ellipsis.When
a menuitem
is
not
applicable,
its
appearanceis
grayedin
color andis disabled
(not
selectableby
the
user).The graying
of certain actions prevents
the
userfrom
making
mistakes.The
systemhas
several menus.The
currentmenu
is dependent
onthe
active window.For further
details
referto
theUser's Manual
sectionin
the
Appendix.
Some
ofthe
menuactionsareperformed withoutfurther
input. Other
actions provide adialog
box for
gaining
specificinput
from
the
user(these
arerecognizedby
an ellipsis).On
the
dialog
boxes
willbe
pushbuttons
for
completionorcancellation ofthe
desired activity
(typically
anOK
andCANCEL). This
input
is
validatedand storedaway
before
the
dialog
box
is
removedfrom
thescreen whenthe
user pressesthe
"OK"button. Also
there
are combo-boxesfor
data
entry
items
suchasthe
degree-of-importance
whereapredefined
list
ofitems
are availableto
selectfrom.
Given the
vast amount ofinformation that
the
user might wantto
viewsimultaneously, the
GUI
wasdeveloped using MS-Windows Multiple Document Interface (MDI). MDI
supports avariablenumberofchild windows attached
to
the initial
parent window.The
child windows aredisplayed in
the
client areaCD-ROM,
alist
of candidate picturesfor
eachsearch,
andfor
eachimage
selectedfor
display.
Using
MDI
allows
greaterflexibility
to
the userincluding
comparing
the
resultsoftwo
searches.3.2.1.
System Organizational
Chart
Given
theeventdriven
nature ofthe
GUI,
theentire systemis
eventdriven (see Figure 5).
The
two
mainfunctions
outlinedin Figure 4 (Add Image
andRetrieve
Image)
areincorporated
underthe
specific menuchoices.
When
the
"Add
Picture"menu
item
is
activated,
theappropriate main-level routineis
calledto
obtain
the
descriptors
anddegree-of-importance,
performthe
parsing,
dictionary
lookup,
sentencevalidation,
and storage ofinformation
into
the
caseframe.
When
the
"Find
Picture"menu
item
is
activated, the
main-level routineis
calledto
obtainthe
descriptors
anddegree-of-importance,
parse thesearch
criteria,
activatesearchalgorithm,
and present results.user
input
Get Event
<iJN
Validate Input
Perform Action
Update Screen
3.2.2.
Data Structures
A
number
ofdata
structuresareimplemented
as partofthe
prototype.They
support thecollectionofobjects
thatare requiredto
implement
the system.They
include:
1
.varying
length
character strings(allocation
ofmemory is limited
to
currentstring
sizebut
the
string
canbe
edited atany time
creating
alonger length
string),
2.
varying
arraysofIDs,
3
.varying array
ofpointers where the pointers provideanaddress
to
additionaldata
structuresin
memory,
4.
double-linked
lists
ofobjects,
and5.
map
ofIDs
to
objects.These
structures areimplemented
using
C++
classesfor
maximum reuse.3.2.3.
Equipment Configuration
The
systemis developed
for
the
IBM PC
andPC
compatible marketusing
anIntel 386
orbetter
processor.The
systemrequires aminimum ofa14
inch,
color, VGA
resolution monitor.The
systemneeds aminimum of
4 MBytes
of mainmemory
to
supportdisplay
ofimages. Note
that
additionalmemory is
recommended
to
increase
display
and search performance.A
magneticdisk
of a minimum of80
MBytes
is
requiredto
support software anddatabase information. Most
importantly
is
aCD-ROM
XA disk
drive
that
is
compatible/certifiedfor PhotoCD. Although
the
prototypeis developed
using
a specificsetofhardware,
a real product producedfrom
the
prototype shouldbe easily
extendableto
different hardware
configurations
(i.e.,
higher
resolutionmonitor, larger
screenmonitor,
faster
display
adapter,
different
CD-ROM
drives,
varying
memory
configuration).However,
the
prototypeis
nottested
againstany varying
setof
hardware.
3.2.4.
Implementation Tools
The executing
system requiresMS-DOS 3.3
orhigher
andMicrosoft Windows 3.X
aspartofits
environment.
The
systemalsorequiresthe
necessary
Windows
drivers loaded for
the
specifichardware.
In
particular,
aWindows driver for
the CD-ROM
compatible withthe
PhotoCD
toolkit
is
mandatory.In
addition would
be
the
applicationsoftwaredeveloped
(i.e.
Smart Photo
Album)
which willinclude
the
PhotoCD
toolkit.
For
demonstration
purposes,
asamplePhotoCD
is
needed withvisually
identifiable