Rochester Institute of Technology
RIT Scholar Works
Theses
Thesis/Dissertation Collections
11-1-1995
Cary collection web presentation & digital image database
Cynthia McCombe
Cary Collection Web Presentation & Digital Image Database
http://wally.rit.edu/cary
by
Cynthia Marie McCombe
Thesis Advisor: Professor David Pankow
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in the School of Printing Management and Sciences in the College of Imaging Arts and Sciences of the Rochester Institute of Technology
School of Printing Management and Sciences
Rochester Institute of Technology
Rochester, New York
Certificate of Approval
Master's Thesis
This is to certify that the Master's Thesis of
Cynthia Marie McCombe
With a major in Graphic Arts Publishing
has been approved by the Thesis Committee as satisfactory
for the thesis requirement for the Master of Science degree
at the convocation of November 1995
Thesis Committee:
David Pankow
Thesis Advisor
Marie Freckleton
Graduate Program Coordinator
C. Harold Goffin
Permission to Reproduce
Cary Collection Web Presentation & Digital Image Database
I, Cynthia Marie McCombe, hereby grant permission to the Wallace Memorial Library of the Rochester Institute of Technology to reproduce my thesis in whole or in part. Any reproduction will not be for commercial use or profit.
Dedication
In loving memory of my father, Thomas McCombe III.
Acknowledgments
My sincere appreciation is extended to everyone who contributed time and effort to this project. Thanks are especially due to Professor and Curator David Pankow for his guidance, encouragement and enthusiasm. My highest appreciation also goes to Shirley Bower of the Wallace Library's System Department for taking me under her wing and teaching me the technical aspects of web publishing.
I would also like to thank my mother, Judith Marie McCombe, who provided me with the finances and encouragement to pursue graduate studies at RIT.
Table of Contents
List of Figures
Abstract
Chapter 1/Introduction
  Sources
Chapter 2/Theoretical Basis of Study
  Sources
Chapter 3/Selected Review of Literature in the Field of Study
  Sources
Chapter 4/Statement of Problem
Chapter 5/Methodology
  Sources
Chapter 6/Summary and Conclusion
List of Figures
Figure 1: Graphic File Formats Supported by World Wide Web Browsers
Figure 2: Advantages and Disadvantages for GIF and JPEG File Formats
Figure 3: Cary Collection Web Presentation Schematic
Figure 4: Database Setup
Figure 5: Image Map Schematic
Figure 6: Form Schematic
Abstract
The Melbert Cary, Jr. Graphic Arts Collection in the Wallace Memorial Library is one of the treasures the RIT community has had available for years. For this thesis project, selected materials from the collection were scanned and made available on the Internet so people at any location can experience the rare and invaluable items the facility houses.
Not only has this project created an educational tool for others to use, but it also challenged the author to master web publishing while developments rapidly occur on the most powerful mass communications medium to arise in decades.
While the primary purpose of this thesis project was to create an aesthetically pleasing and information-rich web presentation for the Melbert Cary, Jr. Graphic Arts Collection, many secondary goals had to be achieved first. Those secondary goals are outlined in this list:
1. To acquire high-quality color electronic images for others to access remotely.
2. To design a searchable database of 300 records.
3. To learn the ins and outs of web publishing by:
   Creating cohesive and consistent documents in the HyperText Markup Language (HTML).
   Developing an aesthetically pleasing interface for users to explore documents. This included keeping up to date with developments in HTML and using techniques created by web publishing experts to make the text as typographically pleasing as possible.
   Placing the necessary documents and images on a web server.
   Advertising the address of the presentation, or URL, to the appropriate audience.
4. To develop clear and concise instructions on how to maintain the presentation, including procedures for adding categories and images.
After a substantial amount of work on this project was completed, it was linked to the Wallace Memorial Library's home page. A home page is the first site a person reaches upon typing in an Internet address on the World Wide Web. Home pages can be created by individuals or organizations and serve as points of departure for exploring the textual and graphical information available at these sites.
The Cary presentation has a section explaining the history and growth of the collection. A user can continue by taking a virtual tour of the facilities or by reading about recent acquisitions. The main feature of the presentation is a subject library and digital image database which contains an initial collection of approximately 300 searchable records ranging from medieval manuscripts to portraits of printers. Instructions on how to perform various types of searches, as well as what types of searches are feasible, are also integrated into this project.
Finally, information about image acquisition, graphics presentation, and database installation and setup is integrated into this project. In addition, a secondary home page for the American Printing History Association has been created, and sample articles from its journal Printing History will be available in a digital format on an ongoing basis.
All images were prepared to be as faithful to the originals as possible, keeping in mind the drawbacks inherent in viewing images and text on today's monitors. Retaining accurate colors and details while paying heed to the practical speed requirements of transmission was of great importance.
A feedback form has also been made available for individuals who wish to communicate any comments, problems, requests or suggestions via e-mail.
The documentation that follows describes the methodology used in creating the Cary Collection's web presentation.
Chapter 1
INTRODUCTION
Remote computing has become commonplace in today's society. From using automatic teller machines to sending e-mail messages, advanced computer networks have been established to send information back and forth using long-distance lines. The oldest of these computer networks in the United States (and possibly in the world) is the Internet, commonly referred to as "the mother of all networks."
When and why did the Internet come into existence? In the early 1960s, scientists started experimenting by connecting to remote computers in different geographical locations. Later in the decade, the Advanced Research Projects Agency of the United States government realized the potential impact the computer industry might have on both education and military development and research, and decided to fund an experimental network. Named ARPANET, the project aimed at developing a networked computer system capable of withstanding a nuclear explosion. Through this research, a network protocol called TCP/IP (Transmission Control Protocol/Internet Protocol) was invented. In the 1970s, this protocol became a standard for ARPANET. The educational community was encouraged to take part in network communications, and as more people logged on, more services such as electronic mail were developed.
In the early 1980s, ARPANET became the backbone of what soon became known as the Internet, as interconnected research networks rapidly converted to the TCP/IP protocol. "When the Internet first came into existence in the early 1980s, there were only 213 registered hosts (computers that provided services) connected to the network. By February of 1986, there were 2,308 hosts. Today the Internet is undergoing tremendous growth, with several million hosts connected worldwide."
In the beginning of Internet development, the tools created to search and view the vast amounts of information were hard to use. This is because most of the activity on the Internet was oriented toward computer-literate scientists. But times have changed. Today, the fastest growing segment of information placed on the Internet is in the commercial sector. People without a high level of computer-based knowledge are finding the available resources on the Internet useful. This change in emphasis and growth in audience has led to the inception of more user-friendly and powerful services, namely the World Wide Web (Pike 1995).
The World Wide Web was created in 1989 by Tim Berners-Lee at CERN, an advanced physics laboratory located in Geneva, Switzerland. It is a "wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents" (Microsoft Systems Journal 1994). Originally, the web's purpose was to host hypertextual information about physics for interested people around the world. But as the look and feel of the interface became more user-friendly, more and more members of the academic community began communicating through the network. When the World Wide Web was made available to the public in 1993, the Internet community soon included this service among its everyday computing tools.
The fundamental technology underlying the web is called the Hypertext Transfer Protocol, that is, a set of rules governing communication between a browser, or client, and a web server (Hayes 1994). Today, the World Wide Web is experiencing astronomical growth due to the development of graphical interface browsers such as NCSA Mosaic, Cello, and the most popular, Netscape. These interfaces allow for more aesthetically pleasing presentations, as text, graphics, audio and video are all available for crafting web presentations on the Internet.
The aesthetics of a presentation, however, are somewhat hindered by the nature of the language in which documents have to be drafted. The HyperText Markup Language is a versatile language that aims at allowing authors to format text by creating relationships, or hypertext links, between documents. HTML is a platform-independent language, and although recent develop[...] has configured their computer to display the incoming information.
For people who have never explored the World Wide Web with a graphical user interface, the process is easy to learn. Although the underlying structures and protocols are complex in nature, these structures are buried beneath the user level. When a user is on a computer connected to the Internet, he or she starts up a browser, or client, and goes out over the network to retrieve a document by typing in an address called a Uniform Resource Locator. In the document, highlighted phrases are displayed. When the user clicks on one of these phrases with a mouse, a new document appears that contains more highlighted links. If the user clicks on one of these links, still another document appears. The true beauty of the hypertext language is that links allow the user to travel around the world in an expedient manner. Throughout the document, thumbnail photographs may be positioned that show a larger image if double-clicked on with a mouse.
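As an illustration only (it is not part of the original project), the following Python sketch shows how the address a user types, the Uniform Resource Locator, breaks down into the pieces a browser needs: the protocol to speak, the host to contact, and the document path to request. The function name `describe_url` is hypothetical; the address is the one given on the title page of this thesis.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts a browser uses to retrieve a document."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # which protocol to speak, e.g. http
        "host": parts.hostname,    # which server to contact
        "path": parts.path or "/", # which document to ask that server for
    }


info = describe_url("http://wally.rit.edu/cary")
print(info)
```

Every hypertext link in a document carries an address of this same form, which is why following a link and typing an address lead to the same kind of request.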
STATEMENT OF GOALS
A plethora of web presentations are placed on the Internet on a daily basis. The sheer quantity of information does not mean that quality information is being generated. The objective of this project was to craft a web presentation that would help local and remote users unlock the tremendous resources in the Cary Collection. The goals for generating the Cary Collection's web presentation were realized by addressing the following questions:
1. How should color images be prepared for online viewing so that they retain fidelity to the originals while still facilitating speedy transmissions?
2. How can an online digital image database be established that can perform searches?
3. Keeping in mind HTML design limitations, how can a site be developed that [...]
BACKGROUND AND SIGNIFICANCE FOR IMAGE PREPARATION
For years color scientists and engineers have said that judging color on computer monitors is not advisable, as color monitors use additive color while printed materials use subtractive color. The gamut range on these different "substrates," for lack of a more appropriate word, varies; both require different viewing conditions and are affected by different variables. However, a method has now been developed to objectively measure color on monitors for purposes of accurate print reproduction.
All this worked out well until multimedia applications, including Internet browsers, came along. With the computer monitor as the medium, the challenge is to prepare color images for viewing under unpredictable end-use conditions. To accomplish this, one must first understand the principles of additive color theory, the inherent conditions associated with viewing color on monitors, and calibration techniques for monitors. For situations in which numerous images are available for viewing, a standard methodology for preparing images must also be established.
THE DIGITAL IMAGE DATABASE
Searchable databases on the Internet are becoming increasingly common. In the Yahoo database, searches can be performed to find specific home pages. For instance, Kodak's home page can be found simply by entering "Kodak" at the search prompt. Even though this and other kinds of databases do exist on the World Wide Web, the author has found few databases devoted to images alone that work by having the user enter search terms. Those that exist are usually slow: the user must wait a long time for the images to display on the monitor.
Designers will also be able to use the home page and its related jump-off points for reference and/or inspiration.
For example, if someone wanted to design a book that incorporated the spirit of 15th century typography, a search for 15th century books would produce a set of images from that period. Anyone who has an interest in publishing, printing or graphic design would find interesting information by exploring the Cary Collection site.
BACKGROUND AND SIGNIFICANCE FOR GOOD DESIGN
HTML was designed from the beginning to allow documents to be viewed on various platforms across the Internet.
As Internet growth shifted from the educational to the commercial arena, design became a bigger issue. At the present time, a proposed set of coding tags is being created to enable more design possibilities on Internet browsers. Leading the field, Netscape Communications has created a set of tags called Netscape Enhancements which work with its browser. Because Netscape currently controls 70 percent of the market, other browser providers have been forced to adopt Netscape's enhancements even before any formal standardization agreements have been reached.
Exploring the design possibilities of standard HTML and Netscape's Enhancements has been challenging, since new developments are occur[...]
Sources
Brian Hayes. "The World Wide Web," American Scientist 82 (Sept.-Oct. 1994), 418.
Microsoft Systems Journal 9 (Sept. 1994), 11-13.
Mary Ann Pike, et al. Using Mosaic. (Indianapolis: Que Corporation, 1994), 1, 8-9.
Chapter 2
BACKGROUND THEORY
In the last few months, a number of magazine articles have provided simple step-by-step instructions for making home pages. Surely, one day designers will merely have to save their pages in HTML format for their information to be coded correctly for web accessibility. Even though HTML is an essentially simple language, and crafting a home page is not too difficult, planning an extensive web presentation involves a great deal of tedious text and format coding. A good understanding of the theories which underlie image preparation, database setup and installation, along with an ability to construct consistent and attractive HTML documents, is also a necessity. This chapter will therefore address these areas.
IMAGE PREPARATION
"The splitting of white light into the visible spectrum and the recombining of the spectrum to form a white light was first demonstrated and reported by the English scientist, Sir Isaac Newton in 1704" (Field 1992). Computer monitors and televisions both display color using the additive color theory. That is, when red, green and blue light are combined in equal proportions, the result is white light. In nature, a rainbow demonstrates this theory as white light is split into its component colors through the refractive properties of raindrops. Most people accept light from different lamps as white, and most people tend to call a whitish color on a monitor white when in fact most monitors display a bluish cast.
Attitudes about Monitors
[...] phrase years ago. Screen fonts had finally allowed word processing and layout programs to display a reasonable presentation of a page on a monitor. But the phrase was never meant to include the color accuracy all monitors should have either. However, the dream of seeing a true prediction of printed color on a desktop computer is strong and potentially lucrative enough to encourage color scientists and color monitor manufacturers to shoot for WYSIWYG" (Smith 1993).
This quote represents only the optimistic side of color monitor viewing. More color researchers and scientists are quick to say that monitors should never be viewed for subjective color; only the objective information the density numbers yield should be used. However, when the monitor is the final destination for the images, a different attitude must be employed. All the variables for viewing color on a monitor need to be addressed and controlled. True, not many people calibrate their monitors, because many do not know a monitor can be calibrated. But when images are seen on all different types of monitors, a consistent method for setting up images should be followed regardless of how the end user will view the final images.
Monitors take advantage of vacuum-tube technology. Inside each one resides a cathode ray tube, essentially an electron gun. This gun fires beams through either a shadow mask or an aperture grille, limiting how each beam strikes the red, green and blue phosphors on the inside of the tube (Myslewski and Pittelkau 1994). Images appear on a monitor as the electron gun draws lines across the inner surface of the screen, producing a sequence of horizontal raster scans, with each line made up of pixels.
The de facto standard monitor used in the design of multimedia presentations is a 13" Macintosh monitor, measuring 640 pixels by 480 pixels. Additionally, the majority of monitors only have a resolution of 72 pixels per inch; images should be prepared to this resolution, and not much higher, in order to meet the demands of fast transmissions. In general, most monitors display information in eight bits capable of representing [...] occur when someone accesses the image with an eight-bit monitor. The majority of CD-ROMs on the market were therefore optimized for 256 colors (Johnson and Scott-Taggart 1993).
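The figures above lend themselves to a short worked example. The Python sketch below is illustrative only: it computes how many colors eight bits can represent, how many bytes an uncompressed full-screen 640 by 480 image occupies at that depth, and an idealized transfer time. The line speed of 28,800 bits per second is an assumption chosen for the example and does not come from the sources cited in this chapter.

```python
def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Raw (uncompressed) size of a bitmap in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def transfer_seconds(size_bytes: int, line_bits_per_second: int) -> float:
    """Idealized transfer time, ignoring protocol overhead and compression."""
    return size_bytes * 8 / line_bits_per_second


ASSUMED_MODEM_BPS = 28_800  # hypothetical dial-up speed, not from the thesis

colors_8bit = 2 ** 8                               # 256 colors
full_screen = uncompressed_bytes(640, 480, 8)      # 307,200 bytes
seconds = transfer_seconds(full_screen, ASSUMED_MODEM_BPS)
print(colors_8bit, full_screen, round(seconds, 1))
```

Even under these simplified assumptions a single uncompressed screen-sized image takes well over a minute to arrive, which is why the resolution and bit depth of each image matter so much for online viewing.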
Viewing Monitors
Currently, a standard for viewing monitors does not exist. Both a CIE work group and an American standards group have shown some interest in setting standards, but at the present time no proposal has been set forth.
The correct viewing conditions for a color monitor are logical, as long as one understands the additive color theory. Since monitors are self-luminous and rely on no extra illumination to produce a color, the optimal viewing environment is a dark room. When viewed in the dark, color monitors suffer no degradation caused by external light, thus producing good white points and black points. The only real problem is that not many people enjoy working in the dark. As a compromise, dim working conditions have been suggested instead. In this situation, no direct illumination will fall on the screen or be perceivable to the operator scanning the images (Johnson and Scott-Taggart 1993).
The primary viewing condition which must remain constant and be used during all color adjustments is a neutral desktop background. People today tend to have fancy desktop patterns on their monitors, but when a neutral gray background is used, the scanning operator can make color corrections that are not directly affected by a distracting surrounding environment. Additionally, if various users are viewing the same image on different monitors, cross-calibration between them (including the same neutral gray surround and the similarity of viewing conditions) must be established (Johnson and Scott-Taggart 1993).
Monitor Calibration
"How often you calibrate your monitor is a matter of choice. Factors such as constant energy fluctuations and progressive deterioration of the picture tube [...] not practical to re-calibrate every 15 minutes. Once per day, though, should be considered a bare minimum with another check just before any critical work. It's obvious that monitor calibration is not a commitment to be taken lightly" (Smith 1993).
Monitor calibration is clearly an issue of significant importance. Two ways of calibrating a monitor exist. The first, designed for print reproduction, does not concern this project. The second establishes the white point of the approximate chromaticity by using "software"; as long as one channel is allowed to maintain the maximum data value, the luminance of the monitor will not be unnecessarily compromised (Smith 1993). Chromaticity refers to the quality of a color, that is, its hue and saturation, independent of its luminance.
Graphic File Formats
After images are scanned and enhanced for viewing, they must be saved in the proper graphics format. The following chart provides an example of the various types of graphics formats that are available for putting images on the Internet. It is important to note, however, that not all of these formats are universally supported on the web.
Graphic File Formats Supported by World Wide Web Browsers
FORMAT                              EXTENSION                    SUPPORT PROVIDED
Graphics Interchange Format         GIF                          Native
Joint Photographic Experts Group    JPEG, JPG, BMP, XBMP, XBM    Native or External
Device Independent Bitmaps          DIB                          Native or External
Tagged Image File Format            TIFF, TIF                    External
PC Paintbrush                       PCX                          External
(Savola 1995)
Figure 1
For image purposes, the two file formats GIF and JPEG are the most widely used and accepted.
GIF is an 8-bit, 256-color graphics file format created by CompuServe, which facilitates the transmittal of images through various networks including the Internet and bulletin board services. This standard employs a "lossless" compression scheme, meaning that the original is preserved without loss of data, as only redundant information is actually discarded. When an image is received in GIF format by a browser, it must be decompressed by an application configured to handle this kind of file. In general, GIF files are very large (Pfaffenberger 1995). In fact, there is a 4:1 ratio in file size between a GIF image and a JPEG image (jpeg-faq/part1 1995).
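The idea of lossless compression described above can be demonstrated with a small Python sketch. It is illustrative only and swaps in a different technique: Python's standard `zlib` module implements DEFLATE, not the LZW scheme used by GIF, but it shows the same two properties, namely that the original data are recovered exactly and that redundant data (here a run of identical pixel values) shrink far more than data with no redundancy. The sample byte strings are invented for the example.

```python
import random
import zlib


def round_trip(data: bytes) -> tuple[int, bool]:
    """Compress losslessly; return the packed size and whether the data survived intact."""
    packed = zlib.compress(data)
    return len(packed), zlib.decompress(packed) == data


# 10,000 "pixels" of one flat color: highly redundant.
flat = bytes([200]) * 10_000

# 10,000 pseudo-random "pixels": almost no redundancy to remove.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(10_000))

flat_size, flat_ok = round_trip(flat)
noisy_size, noisy_ok = round_trip(noisy)
print(flat_size, flat_ok, noisy_size, noisy_ok)
```

Both inputs come back byte for byte, but only the flat one becomes dramatically smaller, which is the sense in which a lossless scheme discards nothing except redundancy.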
One of the niceties available to web developers is the ability to interlace GIF images. Interlacing an image gives the user an illusion of a faster download because the image's physical dimensions are drawn out on the monitor before all the details are filled in. In other words, alternating fields of the image are drawn until the image is fully loaded. This either looks like "horizontal blinds opening, as you can see each visible line in the normal resolution, or the image will look like a rack focus, where it changes from a blocky low resolution image to the normal resolution image" (Savola 1995). The author has no control over which of the two ways the interlaced image will be presented.
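To make the "alternating fields" concrete, the following Python sketch lists the order in which the rows of an interlaced GIF are stored, following the four-pass scheme of the GIF89a format (every eighth row from row 0, every eighth row from row 4, every fourth row from row 2, then every second row from row 1). The function name is hypothetical and the sketch is offered only as an illustration; how a given browser paints the partially received rows is up to the browser.

```python
def gif_interlace_order(height: int) -> list[int]:
    """Row indices in the order an interlaced GIF stores them (four passes)."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (first row, step) for each pass
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(gif_interlace_order(8))  # [0, 4, 2, 6, 1, 3, 5, 7]
```

Because the first pass already spans the full height of the image, the viewer sees its overall shape after roughly one eighth of the data has arrived.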
JPEG, created by the Joint Photographic Experts Group, is a file format based on the JPEG lossy compression scheme for images. Because it employs lossy compression, a loss of data occurs, but in such a way as to not be obvious. With JPEG, a 2MB file can be compressed to 100K or less. The JPEG graphics format can store 24-bit images which render 16 million colors (Pfaffenberger 1995). This compression system works on the principle that small color changes are less noticeable to the human eye than changes in brightness (jpeg-faq/part1 1995).
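The principle described above, that brightness matters more to the eye than small color changes, can be sketched in Python. The example is illustrative only and rests on assumptions not drawn from the cited sources: it uses the luminance and chrominance conversion defined for JFIF files, and it assumes the common encoder setting in which each color-difference channel is stored once per 2 by 2 block of pixels. The function names are hypothetical.

```python
from math import ceil


def rgb_to_ycbcr(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Separate brightness (Y) from color difference (Cb, Cr), JFIF convention."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def samples_stored(width: int, height: int, chroma_block: int = 2) -> int:
    """Samples kept when Y is stored per pixel and Cb, Cr once per block (assumed 2x2)."""
    luma = width * height
    chroma_per_channel = ceil(width / chroma_block) * ceil(height / chroma_block)
    return luma + 2 * chroma_per_channel


y, cb, cr = rgb_to_ycbcr(255, 255, 255)   # white: full brightness, neutral color
kept = samples_stored(640, 480)           # 460,800 samples
full = 3 * 640 * 480                      # 921,600 samples for plain RGB
print(round(y), round(cb), round(cr), kept, full)
```

Under these assumptions half of the samples are discarded before any further compression takes place, and all of the discarded samples carry color rather than brightness.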
When is it best to use GIF and when is it best to use JPEG? The answer is dictated by the nature of the graphic as well as by the audience, since they are the ones who ultimately decide if the image quality is satisfactory or not. Some of the advantages and disadvantages of both formats are listed in Figure 2.
Advantages and Disadvantages of GIF and JPEG File Formats
GIF's Advantages
   Widespread acceptance with native support in nearly all applications
   Includes multimedia extensions for multi-image GIF files and sound file extensions
   Fast decompression
   Better with images using smaller color palettes
GIF's Disadvantages
   Dithering (approximating a color by adding a pattern of dots over another color) is necessary to create colors beyond the 256-color palette
   Fewer colors create palette conflicts when more than one image has to be displayed at the same time
   GIF users are open to compression scheme copyright infringement claims
JPEG's Advantages
   Smaller file size, which provides a better image transmission rate
   Supports "true color" 24-bit image representation
   Better with photographic quality (high color palette) images
   Compression scheme is in the public domain and free to use
JPEG's Disadvantages
   Lossy compression scheme reduces image quality at higher compression levels
   Slower decompression (and therefore display time)
   Fewer graphical applications support JPEG file formats
(Savola 1995)
Figure 2
In addition to its other advantages, GIF normally works better than JPEG on images which contain only a few colors. This is because large areas of the same color (redundancies) are compressed in an efficient manner, whereas JPEG compression on a similar image often produces visible artifacts. JPEG also has a hard time compressing an image with sharp, distinct edges.
In general, GIF is viewed as a graphics format which will eventually be outdated. "GIF is reasonably well matched to inexpensive computer displays; most run-of-the-mill PCs can't display more than 256 distinct colors at once. But full-color hardware is getting cheaper all the time, and JPEG images look much better than GIFs on such hardware. Within a couple of years, GIF will probably seem as obsolete as black-and-white MacPaint format does today" (jpeg-faq/part1 1995).
Image Quality
Even though the quality of images that can be placed online does not compare with the quality of their printed counterparts, museums and galleries have started making digital libraries for others to access. Many of these images are of such low quality that it is hard to gain any information by viewing them. This poor quality doesn't have to be the norm. If web developers take more time to understand the scanning process and optimize monitor and viewing conditions, aesthetically pleasing images can be put online that do not require excessive download times. Good images and a search engine entice users to explore a web site.
DATABASES AND THE INTERNET
Storing and accessing information in an efficient manner are vital functions in today's information age. Databases, which are defined as organized collections of related data, enable the retrieval of specific information (Capron and Perron 1993). As the amount of data increases, sorting, creating, changing, and retrieving information becomes a big task. For example, imagine creating a filing system for business cards. If one organized the cards in alphabetical order according to last names, he or she would be able to find any information on a card without many problems as long as there were no more than approximately 30 cards. But what happens when 3,000 business cards need to be organized? Finding information other than a last name, say location, would become a nightmare. Entering the information into a computer database capable of sorting multiple fields would be much more efficient. The data could be rearranged or sorted at will on any field the user wished.
Although fielded databases work very well in theory for information collection and retrieval, most "databases" on the Internet do not work in this fashion. Although the word database implies fielded and organized information, its use in connection with this project must be defined differently. The very nature of HTML documents lends itself more toward full-text indexed information, which is what the word database refers to in this thesis.
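For contrast with the full-text approach adopted in this thesis, the business-card example above can be sketched as a tiny fielded database in Python. The sketch is illustrative only; the records, field names and function names are invented for the example.

```python
from operator import itemgetter

# Hypothetical business cards; each record has the same named fields.
cards = [
    {"last_name": "Baker", "location": "Rochester"},
    {"last_name": "Adams", "location": "Syracuse"},
    {"last_name": "Clark", "location": "Buffalo"},
]


def sort_cards(records: list[dict], field: str) -> list[dict]:
    """Return the records rearranged on any field the user chooses."""
    return sorted(records, key=itemgetter(field))


def find_cards(records: list[dict], field: str, value: str) -> list[dict]:
    """Return every record whose chosen field equals the given value."""
    return [record for record in records if record[field] == value]


by_name = [card["last_name"] for card in sort_cards(cards, "last_name")]
by_location = [card["last_name"] for card in sort_cards(cards, "location")]
print(by_name, by_location, find_cards(cards, "location", "Buffalo"))
```

Because every record carries the same fields, the same few lines answer a question about location as easily as one about last name, which is exactly what a paper filing system ordered by name cannot do.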
In order to index the specified documents, a special program must be run that generates a single index out of the desired files. Another program then acts as a gateway, or intermediary, between the index and a form written in HTML. The form serves as the search-engine interface for the user to input a query. The gateway program also produces "HTML on the fly," that is, it examines the index, then creates a results page with links to the appropriate documents.
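The two-program arrangement described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the indexing and gateway software actually used for the project: `build_index` stands in for the indexing program, `results_page` stands in for the gateway, and the file names and document texts are invented.

```python
import re
from html import escape


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Indexing step: map every word to the set of documents that contain it."""
    index: dict[str, set[str]] = {}
    for url, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(url)
    return index


def results_page(index: dict[str, set[str]], query: str) -> str:
    """Gateway step: look the query up in the index and build HTML on the fly."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    # A document is a hit only if it contains every word of the query.
    hits = set.intersection(*(index.get(w, set()) for w in words)) if words else set()
    items = "\n".join(
        f'<li><a href="{escape(url)}">{escape(url)}</a></li>' for url in sorted(hits)
    )
    return (
        f"<html><body><h1>Results for {escape(query)}</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>"
    )


# Hypothetical documents standing in for the indexed HTML files.
docs = {
    "manuscripts.html": "Medieval manuscripts on vellum",
    "printers.html": "Portraits of printers and medieval woodcuts",
}
index = build_index(docs)
page = results_page(index, "medieval manuscripts")
print(page)
```

The index is built once, ahead of time, while the results page is generated afresh for each query; keeping the two steps separate is what lets the search itself stay fast.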
Which programs are suitable for generating this type of database depends on what kind of server is used. If the server is Unix-based, then the programs have to be Unix-compatible. Some web developers write their own Common Gateway Interface (CGI) programs in languages such as the Practical Extraction and Report Language (Perl), C, or C++. A few programs, including the two used for this thesis project, Swish 1.1.1 and wwwwais.c version 2.5, are freely distributed over the Internet. An in-depth section explaining these programs follows in the methodology portion of this paper.
Most search engines produce full-text searches. Some of the more sophisticated features include the ability to search for specific HTML tags. For instance, some search engines allow a user to search for information which resides within a title or a comments tag.
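A tag-restricted search of the kind just described can be illustrated with Python's standard HTML parser. The sketch below is an illustration only and does not reproduce how any particular search engine works; the class and function names and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect only the text that appears inside the <title> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_matches(html_text: str, term: str) -> bool:
    """True if the search term occurs in the document's title, ignoring case."""
    parser = TitleExtractor()
    parser.feed(html_text)
    return term.lower() in parser.title.lower()


sample = (
    "<html><head><title>Medieval Manuscripts</title></head>"
    "<body>Portraits of printers</body></html>"
)
print(title_matches(sample, "medieval"), title_matches(sample, "printers"))
```

The word "printers" occurs in the page but not in its title, so a title-only search passes over it; this is how a tag-restricted search narrows the results compared with a plain full-text search.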
The Helper Application Option
Another option for creating databases on the Internet is to use a helper application which is designed as a fielded database. A helper application is a program used to show information which the browser cannot show natively (Savola 1995). One such potential application is called ImageAXS, created by Digital Collections, Inc. ImageAXS has produced digital CD-ROM databases of images for museums such as the Frick Collection in New York City. Although this application was not created for Internet use, the author talked to a representative of the company and received permission to determine if their sample software could work as a helper application.
Some of the niceties of this application include customized file names, automatic generation of thumbnails, and ease of adding records. Drawbacks do exist, however. The free sample database program only allows a user to view 25 images at a time. In order for a user to access 25 images at a given site, the individual would have to go to DCI's home page, download the freely distributed sample software, configure their browser to accept the software as a helper application, go to the Cary Collection home page and select from a group of 25 image libraries, download one of the libraries, and then view the images. Another drawback aside from this cumbersome process is that while the thumbnails are displayed in the 25-image libraries, the large images which should be displayed upon clicking on the thumbnail do not show up. This is because the link is broken between the thumbnails and the originals. Future versions of this software may allow this linking to take place.
Helper applications should only be used when they are widely circulated. Introducing an audience to an unknown application, then trying to provide an incentive for them to follow a set of instructions to download the application, can be avoided if the information is provided in native HTML in the first place.
Databases and Document Structures
Planning how a database retrieves image-based documents is of utmost importance. Some sites retrieve thumbnails of images and text right after the inquiry is entered, sometimes leading to long download times. It is possible that more than 50 thumbnails could be loading at one time. The author has visited such sites. A more appropriate way to structure the documents for ease of retrieval is to create separate documents for each image. In the documents, thumbnail images can serve as a visual reference for the large image. These larger images, which consume considerably more memory, are only retrieved if the user wants to view them.
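The one-document-per-image structure recommended above can be sketched in Python. The example is illustrative only: the function name, titles and file names are hypothetical, and the markup is a minimal assumption rather than the pages actually built for the Cary Collection.

```python
from html import escape


def image_page(title: str, thumb_file: str, full_file: str) -> str:
    """Build one small HTML document whose thumbnail links to the large image."""
    return (
        "<html><head><title>{t}</title></head><body>\n"
        "<h1>{t}</h1>\n"
        '<a href="{f}"><img src="{s}" alt="{t}"></a>\n'
        "</body></html>"
    ).format(t=escape(title), f=escape(full_file), s=escape(thumb_file))


# Hypothetical records: (title, thumbnail file, full-size file).
records = [
    ("Medieval manuscript leaf", "ms01_thumb.gif", "ms01.jpg"),
    ("Portrait of a printer", "pr01_thumb.gif", "pr01.jpg"),
]
pages = {full: image_page(title, thumb, full) for title, thumb, full in records}
print(pages["ms01.jpg"])
```

Each page loads only its own small thumbnail; the large file is requested only when the user follows the link, so a search that returns many records never forces many large images to download at once.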
DESIGNING
WEBSITES
Since typographic controls are limited in web design and construction, designers are faced with considerable challenges in making information at a given site attractive and readable. Because the end-user has ultimate control over the way their document will look, HTML documents need to be crafted in a way that allows them to look good under any given circumstance. This poses new challenges, but although many browsers exist, the majority of World Wide Web users rely on Netscape.
Netscape was formed in the early months of 1994. Jim Clark, former chairman of Silicon Graphics, and Marc Andreessen, former member of the NCSA Mosaic team, started up a commercial business aimed at creating a graphical user interface for the World Wide Web. Within a year, their efforts became a success and have made Netscape a common name in the information community. In fact, recent estimates show that over 70 percent of the audience cruising the World Wide Web uses Netscape Navigator (Savola 1995).
Why has this browser succeeded in drawing so much attention? One reason is that Netscape Navigator allows interlaced inline images. In other words, images are drawn in simultaneously as a page is loaded. If the user wants to jump to another page while the downloading takes place, he or she can do so without waiting until the previous page is fully downloaded.
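The text does not say which image format is interlaced. Assuming GIF, the format recommended elsewhere in this thesis for inline images, the interlaced file stores its rows in four passes rather than top to bottom, which is what lets a browser draw a coarse version of the whole picture early. A minimal sketch of that row order (the function name is illustrative):

```python
def interlaced_row_order(height):
    """Return the order in which an interlaced GIF stores its rows.

    GIF interlacing uses four passes, given here as (first row, step):
    every 8th row from row 0, every 8th row from row 4,
    every 4th row from row 2, then every 2nd row from row 1.
    """
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(interlaced_row_order(8))  # [0, 4, 2, 6, 1, 3, 5, 7]
```

After the first pass only one row in eight has arrived, yet it spans the full height of the image, so the viewer sees the overall composition before the detail fills in.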
Netscape also developed a set of proprietary tags, enabling web designers to take advantage of features such as font size control, blinking text, aligning images, and control of spacing of text and graphic elements. These extensions have become so popular within the Internet community that the W3 Consortium, a group that works on HTML standardization and implementation, is looking at adopting some of Netscape's features in the forthcoming HTML 3.0 standards (Savola 1995).
Prior to setting up pages for viewing on the Internet, one must first consider what the intent of a particular site is to be, and then design a communications strategy. This involves clearly defining what it is that the home page will communicate to its audience. After the communication message is conceptualized, the web developer must think about who among the World Wide Web audience the message should be crafted to entice. Different audiences have different requirements. Some home pages require simple, straightforward design. These are pages dedicated to providing textual information. Other pages, such as the one crafted for this thesis project, aim at attracting finicky viewers who demand good design in addition to solid content.
The next step in designing a web presentation is to acquire good information written in a direct and active language. Oftentimes, better writing requires sitting down and crafting words and sentences with a pen and paper. When using this method for preparing text, the thought process exceeds the written word, enabling more effective and thoughtful messages than are oftentimes created when sitting down at a keyboard and typing whatever comes to mind.

Information must also be organized. This requires dividing the information into various pages and developing a storyboard for the physical structure of the presentation. At this point, general style guidelines can be prepared which specify link and background colors as well as first-, second- and third-level headings.
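One way to keep such style guidelines consistent across every page of a storyboard is to record them once and fill them into a single page skeleton. The sketch below uses Python's standard `string.Template`; the color values, field names, and the `render_page` helper are assumptions made for illustration and are not taken from the Cary site.

```python
from string import Template

# Hypothetical style guide: one place for background and link colors.
STYLE = {"bgcolor": "#FFFFFF", "link": "#0000CC", "vlink": "#660099"}

# One skeleton with first-, second- and third-level headings.
PAGE = Template("""<HTML>
<HEAD><TITLE>$title</TITLE></HEAD>
<BODY BGCOLOR="$bgcolor" LINK="$link" VLINK="$vlink">
<H1>$title</H1>
<H2>$section</H2>
<H3>$subsection</H3>
<P>$body</P>
</BODY>
</HTML>
""")


def render_page(title, section, subsection, body, style=STYLE):
    """Fill the shared skeleton with one page's content and the style guide."""
    return PAGE.substitute(style, title=title, section=section,
                           subsection=subsection, body=body)


# Hypothetical storyboard: each entry becomes a separately named file.
storyboard = {
    "index.html": ("Home", "Welcome", "About this site", "Introductory text."),
    "images.html": ("Images", "Image index", "How to browse", "List of images."),
}
site = {name: render_page(*fields) for name, fields in storyboard.items()}
```

Changing a color in `STYLE` and regenerating the pages updates the whole presentation at once, which is the practical benefit of writing the guidelines down before coding begins.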
Finally, the information must be arranged in HTML code. One HTML document can be constructed to serve as a template which can be used over and over again, as long as the HTML coder saves each newly edited file under a different name.

Sources

H.L. Capron and John D. Perron. Computers and Information Systems: Tools for an Information Age, 3rd ed. (New York: The Benjamin/Cummings Publishing Company, Inc., 1993) 56, 378.

"jpeg-faq/part1." Electronic document available at: http://www.cis.ohio-state.edu/hypertext/faq/usenet/jpeg-faq/part1/ (faq-doc-1.html), (faq-doc-2.html) and (faq-doc-2.html). Document viable as of Oct. 15.

Gary G. Field. Color and Its Reproduction. (Pittsburgh: Graphic Arts Technical Foundation, 1992) 9-10.

Tony Johnson and Marcus Scott-Taggart. Guidelines for Choosing the Correct Viewing Conditions for Colour Publishing. (Leatherhead, Eng.: Pira International, 1993) 42-50.

Rik Myslewski and Jeff Pittelkau. "Choosing the right monitor," MacUser 10, (Nov. 1994): 100.

Bryan Pfaffenberger. World Wide Web Bible. (New York: MIS:Press, 1995) 44.

Tom Savola. Using HTML. (Indianapolis: Que Corporation, 1995) 1, 96, 98-99.

L. Smith. "WYSIWYG Monitor Color," Publishing and Production Executive 7, (Aug. 1993): 41-42.
Chapter 3

SELECTED REVIEW OF LITERATURE IN THE FIELD OF STUDY

The literature used in designing and implementing this presentation was extensive (see the bibliography), but the following sources held the most weight in the outcome of the Cary web site. An extensive summary has been prepared of each of these sources.

Image Formats and Considerations
Early on in the construction of this project, the author conducted research in order to understand graphic-image formats used in web publishing. The book Teach Yourself Web Publishing with HTML in a Week, by Laura Lemay, is a very helpful publication for all aspects of creating web presentations. Organized in easily digestible day-by-day tutorials, it covers everything from the importance of storyboarding to using the HyperText Markup Language to design effective documents which combine text, images, sound and video.

Teach Yourself Web Publishing with HTML in a Week first discusses image formats. Two different types of image formats exist that Web browsers can handle: inline images and external images. Inline images appear automatically when a home page is loaded on the web. Inline images should be prepared in GIF format regardless of which browser a user is running, even though some browsers can handle other formats. External images can only be downloaded at the request of the user. As browsers are typically capable of handling different file types, more flexibility is attained if the appropriate helper applications are already configured on a system to read JPEG, PCX and XBM graphic file formats.
Inline images should be kept small because small images occupy less space and are transmitted more rapidly over the Internet. The term "smaller images" has two definitions in this book: 1) an image's actual physical dimensions, and 2) its memory consumption. Many imaging programs enable users to reduce the number of colors in an image while still having good results. The rule of thumb for inline image memory is to keep images under 20K. While this number may seem too small, Lemay suggests the following scenario: A single 20K file takes approximately 20 seconds to download over a typical Internet connection. If you take this time and multiply it by the number of images on the home page, a substantial amount of time can accumulate. People will become frustrated if it takes too long to access information, and that web site will be visited less frequently.
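Lemay's scenario amounts to a simple multiplication, sketched below. The rate of one second per kilobyte is derived from the "20K in about 20 seconds" figure quoted above; the function name and the sample page are assumptions for illustration only.

```python
# From the scenario above: a 20K file takes roughly 20 seconds,
# i.e. about one second per kilobyte on a typical connection.
SECONDS_PER_KB = 20 / 20


def page_image_seconds(image_sizes_kb, seconds_per_kb=SECONDS_PER_KB):
    """Estimate the time spent downloading all inline images on a page."""
    return sum(image_sizes_kb) * seconds_per_kb


# Hypothetical home page carrying five inline images of 20K each.
print(page_image_seconds([20, 20, 20, 20, 20]))  # 100.0
```

Five images that each respect the 20K guideline already cost well over a minute under this estimate, which is why the number of inline images matters as much as their individual size.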
For every inline image which is placed on a home page, the following questions should be addressed: "What does that image add to the design? Does it provide information that could be presented in the text instead? Is it there just because you like how it looks?"
External images, as mentioned before, can be prepared in a more flexible file format. This is because external images are viewed separately from web files. That is, external images can be retrieved and loaded on demand in windows that are separate from the windows occupied by the actual browsers.
When a user goes to a particular home page, the browser accesses information about what kind of files are available there by looking at a special code or extension (.jpeg, for example) attached to each image name. If the browser cannot read that specific code, it transfers the unreadable image to a helper application residing in the user's computer. If the file type is recognizable by the helper application, then the helper application is automatically launched and the image can be viewed. This same principle is used to activate audio and video clips within home pages.

When establishing a link to an external image, Lemay recommends providing information such as the file format and file size. The most common file format for external images is the same as that for inline images, namely GIF. JPEG file formats are almost as popular as GIFs, however. That is because JPEG file sizes are smaller and allow for faster downloading times, but sometimes the images suffer a quality loss.
For the Macintosh, the program called DeBabelizer can be used to convert to the necessary file formats.
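The extension-based handling and the labeling advice described above can be illustrated together. In this sketch the split between inline and helper formats simply mirrors the text (GIF inline; JPEG, PCX and XBM through a helper application); real browsers differ, and the table, function names, and file names are hypothetical.

```python
import os

# Assumed handling, following the description in the text.
INLINE_FORMATS = {".gif": "GIF"}
HELPER_FORMATS = {".jpeg": "JPEG", ".jpg": "JPEG", ".pcx": "PCX", ".xbm": "XBM"}


def handling_for(filename):
    """Decide from the file extension how an image would be displayed."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in INLINE_FORMATS:
        return "inline"
    if ext in HELPER_FORMATS:
        return "helper"
    return "unknown"


def external_link(filename, size_kb, text):
    """Build a link to an external image, labeled with format and size."""
    ext = os.path.splitext(filename)[1].lower()
    fmt = INLINE_FORMATS.get(ext) or HELPER_FORMATS.get(ext) or ext.lstrip(".").upper()
    return f'<A HREF="{filename}">{text}</A> ({fmt}, {size_kb}K)'


print(handling_for("specimen.gif"))
print(external_link("specimen.jpeg", 120, "Type specimen"))
```

Stating the format and size next to the link lets readers on slow connections decide whether the download is worth the wait before they click.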
A discussion on the appropriate use of images is also provided in this publication. As Lemay states, "The use of images in Web pages is one of the bigger arguments among users and providers of Web pages today. For everyone who wants to design a Web page with more, bigger, brighter images to take full advantage of the graphical capabilities of the Web, there is someone on a slow network connection who is begging for fewer images so that his or her browser does not take three hours to load a page."
Keeping in mind both situations, web pages should be created which take advantage of colorful images, but which take those users into account who contact the home page with a text-based browser like Lynx. Also, not everyone using graphical user interfaces to cruise the web has a color monitor. Therefore, this book suggests testing images on a variety of different monitors and interfaces. Black-and-white monitors sometimes drop out color or dither images. Browsers also have different screen dimensions, so it is best to avoid setting up images that stretch the full length of the screen.

The Image Preparation Steps Employed by Kodak for its World Wide Web Page
Kodak's web presentation contains a collection of images that have been optimized for monitor viewing. Kodak devotes a portion of its site to explaining how all of its images were handled to achieve these results. This seven-step procedure initially appeared to be an ideal model for preparing the Cary images. Before determining the procedures for image acquisition for the Cary Collection, the steps outlined at this site were used for preliminary experimentation. However, it was determined that the quality of scans made directly from 35mm color transparencies far exceeded the quality obtained from a Photo CD.
Although Kodak's procedures were not used for the project in the end, they are valid and should be considered by others who are creating image databases and repositories.

Kodak began with 35mm transparencies. During the first step, all images were transferred to a Photo CD using a Kodak PCD Imaging Workstation 2400 (referred to as PIW). PIW is a CD authoring system consisting of a high resolution scanner with a 2,000-pixel linear array, a data manager workstation, a CD writer, a thermal printer, and software. The PIW is designed to make scanning "easy, fast, and reliable." The exposure and focus are done automatically, and the color balance was achieved through the selection of "filter terms."
Filter terms are designed to analyze and address the characteristics particular to film types, and they play a vital role in optimizing scan quality. All slides were checked for dust prior to scanning to prevent unnecessary time loss in image clean-up at a later stage.

In the second step, the images were digitally compressed on the Photo CD. Five image resolution sizes are available in the Photo CD format, of which the base resolution size (512 x 768 pixels) was chosen for the World Wide Web images. This resolution size was chosen because "it delivers fine image quality for on-screen viewing, and still has a reasonably small file size (about 1.2 MB) even smaller when compressed with JPEG (80-200 K). We want a balance between image quality and transmission performance." Kodak's Photo CD Acquire Module and Adobe Photoshop were both used to prepare the base resolution images.
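The file sizes Kodak quotes can be cross-checked with a short calculation. The sketch below assumes three bytes per pixel (24-bit RGB), which is consistent with the "about 1.2 MB" figure, and treats 1K as 1,024 bytes; both are assumptions, since the quoted passage does not define its units.

```python
BYTES_PER_KB = 1024  # assumption: 1K = 1,024 bytes


def uncompressed_bytes(width, height, bytes_per_pixel=3):
    """Size of an uncompressed image; 3 bytes per pixel assumes 24-bit RGB."""
    return width * height * bytes_per_pixel


def compression_ratio(raw_bytes, compressed_kb):
    """How many times smaller the compressed file is than the raw image."""
    return raw_bytes / (compressed_kb * BYTES_PER_KB)


raw = uncompressed_bytes(512, 768)            # base resolution
print(raw)                                    # 1179648 bytes, about 1.2 MB
print(round(compression_ratio(raw, 200), 1))  # ratio for the largest JPEG quoted
print(round(compression_ratio(raw, 80), 1))   # ratio for the smallest JPEG quoted
```

Under these assumptions the quoted JPEG sizes of 80 to 200K correspond to compression of roughly 6:1 to 14:1 relative to the base resolution image.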
The third step involved optimizing and reconstructing all images. All images were cropped and cleaned up with Photoshop tools. The dust, for example, that was scanned with the images was erased with the rubber-stamping tool. At this point, minimal adjustments were made, since Kodak realized that many different monitors would be used to view the actual images. Some of the adjustments involved optimizing contrast, brightness, color, and sharpness using Photoshop's "Unsharp Masking" filter. Next, images were saved in RGB mode in a 24-bit TIFF format and then compressed using the JPEG method.

The fourth step concerned JPEG compression. Various degrees of compression are available using this method, with the least compression yielding the best image quality. Kodak chose the compression method which they believed provided a balance between delivery speed and image quality. Keeping these goals in mind, each image was compressed differently.
On average, when "medium" compression was selected for an image which was 1.2 MB, a file size of 80 to 200K resulted.

During the fifth step, Kodak created thumbnail versions for its home page in what is known as index print. The advantage of using index print, normally an 8-bit image, is that the images occupy little space and are simply meant to provide enough of a guide to select a higher quality image available through a link.
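The space saved by an indexed thumbnail is easy to estimate. The sketch below uses the thumbnail dimensions (82 x 123 pixels) and the 50-color table that Kodak reports for this step, and compares uncompressed sizes only; the function names are illustrative, and the actual files on disk would be smaller still once the image format's own compression is applied.

```python
def rgb_bytes(width, height):
    """Uncompressed size of a 24-bit RGB image (3 bytes per pixel)."""
    return width * height * 3


def indexed_bytes(width, height, colors):
    """Uncompressed size of an 8-bit indexed image plus its color table.

    Each pixel stores a one-byte index; the table stores 3 bytes per color.
    """
    return width * height + colors * 3


def bits_needed(colors):
    """Smallest number of bits per pixel that can address the color table."""
    return max(1, (colors - 1).bit_length())


print(rgb_bytes(82, 123))          # 30258
print(indexed_bytes(82, 123, 50))  # 10236
print(bits_needed(50))             # 6
```

At about 10K uncompressed, such a thumbnail is roughly a third the size of its RGB equivalent, and a 50-color table needs only 6 of the 8 available bits per pixel.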
Each thumbnail was 82 x 123 pixels and was resized from the JPEG images prepared in earlier steps. The RGB files were converted into indexed colors using a color table limited to only 50 colors. The effects of this conversion were such that the file size was reduced to a practical inline size, though the 50-color index resulted in a posterization of some of the images. Additionally, some images had shifts in color, and banding appeared.

In the sixth step, Kodak took the images and put them on a server where they were available for downloading. The FTP (File Transfer Protocol) site for obtaining these images is "/Photo-CD/IMAGES/."
The last step in Kodak's image preparation process consisted of informing users about methods for optimizing their viewing conditions. For example, "monitor settings and ambient viewing condition are the biggest factors in display quality. Control your viewing conditions by viewing your monitor in a dimmed room" (a control panel available in the operating system of desktop components). It is also important not to allow direct light to reach the monitor. Monitors can be optimized by setting the gamma at 1.8, placing the white point at cool white (somewhere between 6500 and 9300 Kelvin), and setting the color depth at millions of colors. For monitors which