
Graphically Speaking

Editor: Miguel Encarnação

Carnival: Combining Speech Technology and Computer Animation

Michael A. Berger, Speech Graphics Ltd.

Gregor Hofer, Speech Graphics Ltd.

Hiroshi Shimodaira, University of Edinburgh

Speech is powerful information technology and the basis of human interaction. By emitting streams of buzzing, popping, and hissing noises from our mouths, we transmit thoughts, intentions, and knowledge of the world from one mind to another. We’re accustomed to thinking of speech as an acoustic, auditory phenomenon. However, speech is also visible. Although the primary function of speech is to manipulate air in the vocal tract to produce sound, this action has an ancillary effect of changing the face’s appearance. In particular, the action of the lips and jaw during speech causes constant deformation of the facial surface, generating a robust visual signal highly correlated with the acoustic one [1]: that is, visual speech.

In computer animation, this means that speech is something that must be studied, understood, and simulated. Speech animation, or lip synchronization, is a major challenge to animators, owing to its intrinsic complexity and viewers’ innate sensitivity to the face. (Actually, the term lip synchronization, or lip sync, incorrectly implies that only the lips move, whereas almost all the facial surface below the eyes gets deformed during speech.) At the same time, demand for lip sync is sharply increasing, in terms of both realism and quantity. Automated solutions are now absolutely necessary. (For more on why this is the case, see the sidebar.)

The past two decades have seen the emergence of techniques for animating speech automatically using speech technology, an interdisciplinary concept called visual speech synthesis. Since the late 1980s, two applications have been in development. Audio-driven animation automatically synthesizes facial animation from audio. Text-driven animation (or audiovisual text-to-speech synthesis) synthesizes both auditory and visual speech from text. The former is used for automatic lip sync with recorded audio, the latter for entirely text-based avatars.

But speech technology and computer graphics remain worlds apart, and the development of visual speech synthesis suffers from the lack of a unified conceptual and technological framework. To meet this need, researchers at Speech Graphics (www.speech-graphics.com) and the University of Edinburgh’s Centre for Speech Technology Research (CSTR; www.cstr.ed.ac.uk) are developing Carnival, an object-oriented environment for integrating speech processing with real-time graphics. Carnival comprises any number of modules that can be dynamically loaded and assembled into a mutable animation production system.

Visual Speech Synthesis

Both audio- and text-driven animation involve a series of operations converting a representation of speech from an input form to an output form.

Audio-Driven Animation

In audio-driven animation (see Figure 1), the first step is acoustic analysis to extract useful information from the audio signal. This information might be of two kinds:

■ continuous acoustic parameters, such as pitch, intensity, or mel-frequency cepstral coefficients; or

■ discrete speech categories, such as phonemes or visemes.

Both can be the basis for the next step, synthesizing audio-synchronous motion. Given some regression model, we can map continuous acoustic parameters directly to motion parameters. On the other hand, a categorical analysis (see Figure 2) provides a semantic description of speech events. This description abstracts the speech from the audio domain, allowing its reconstruction in the motion domain. After synthesizing facial motion in some form, we must still map it to a facial model’s animation parameters, determined by its deformers. In a 3D facial rig with blendshapes and bones (also called joints), the parameters are blendshape weights and bone transformation parameters. Using these parameters, we can render the animation. This is the fundamental process flow in any audio-driven lip-sync method. For an example of audio-driven animation produced using Carnival, visit http://doi.ieeecomputersociety.org/10.1109/MCG.2011.71.

Figure 1. A typical processing pipeline for audio-driven facial animation. Acoustic analysis extracts continuously and categorically valued representations of the audio. Both can be used as input to motion synthesis, which produces audio-synchronous motion in some parameter space, which must be mapped to a facial model’s animation parameters (adaptation). From these parameters, we can render the animation using standard methods.

Figure 2. The waveform and categorical analysis of the utterance “Quick shots rang out” (phoneme labels: k w ih k sh aa t s r ae ng aw t; time axis in seconds, from 13.65 to 14.77). Such analyses provide a semantic description of speech events that we can use to reconstruct speech in the motion domain.

Text-Driven Animation

In the text-driven pipeline (see Figure 3), the first step is to apply pronunciation and duration rules to produce a categorical time series, like that derived from audio in Figure 1. From this semantic representation, we synthesize both audio and articulatory motion. After motion synthesis, the left side of Figure 3 is identical to Figure 1. This is the typical process flow for text-driven methods, but variations exist that synthesize audio and motion in a more unified manner from a single speech model.

Acoustic vs. Visual Synthesis

Many parallels exist between acoustic-speech and visual-speech synthesis because acoustic and visual speech signals have similar underlying dynamics. This is because they’re generated by the same physiological system: the vocal tract, of which the mouth and surrounding facial features are the visible terminus. For instance, both channels exhibit coarticulation, the overlapping production of sounds. So, visual speech synthesis can and does borrow techniques from acoustic-speech synthesis.

At the same time, the two types of synthesis differ considerably. The mediums are entirely different: synthesizing facial shapes or images as opposed to acoustic energy. Also, in the visual domain, it’s important to synthesize nonverbal motion, such as emotional expressions, eye gaze, and autonomic activities such as blinking and breathing. There’s no such thing as “silence” in visual speech because the face is never really still.

Facial Performance Capture vs. Visual Speech Synthesis

Besides visual speech synthesis, another technique for automating lip sync is facial performance capture, which captures the motions of a live actor’s face and maps them onto a facial model. But performance-driven animation is expensive, requiring a studio, trained actors, careful application of facial markers or makeup (or both), a director to oversee the performance, and a motion capture system run by a trained operator. Audio recording conditions are often poor, so audio has to be dubbed in later, with no automatic synchronization. Moreover, capturing motion and transferring it to the animated character is error-prone and requires manual supervision and cleanup. Tracking the lips in particular is notoriously difficult. Owing to the labor and expense, this method can’t scale to large volumes of speech. However, we believe performance-driven and audio-driven methods can complement one another: the former for nonverbal body and facial motion, and the latter for high-volume speech.

Sidebar: Why Automate Speech?

Realistic facial synthesis is one of the most fundamental problems in computer graphics, and one of the most difficult. [1]

Traditionally, lip synchronization (lip sync) has been done manually, by keyframing or rotoscoping. However, as 3D animation reaches increasing heights of realism, all aspects of the animation industry must keep up, including lip sync. And realistic lip sync is extremely labor intensive and difficult to achieve manually. This difficulty is due to four characteristics of visual speech: dynamic complexity, audio synchronicity, high sensitivity, and high volume.

Dynamic Complexity

Speech is arguably one of the most complex human motor activities. Our alphabetic writing system can deceive us into thinking that speech is just a succession of discrete, sound-producing events, corresponding to letters. But as people discovered in the 19th century with the advent of instrumental acoustics, this isn’t the physical reality. Speech is a continuous activity with no real “units” of any kind.

Like other task-oriented motor behaviors, speech is highly efficient: energy expenditure is minimized. So instead of producing one sound after another sequentially, we begin producing each sound well before concluding the previous one. The movements of the tongue, lips, and jaw in speech are like an athlete’s coordinated movements: different body parts acting in concert, future movements efficiently overlapping with current ones, all efficiently compressed in time. This simultaneous production of sounds, called coarticulation, means that what we think to be a given consonant or vowel is actually realized quite differently depending on the sounds preceding and following it. Consequently, it’s difficult or impossible to define units of speech in a context-invariant way. Such dynamic complexity is understandably difficult to reproduce by hand.

Audio Synchronicity

Unlike other animated behaviors such as walking, visual speech must be tightly and continuously synchronized with an audio channel. This synchronization makes speech animation a uniquely double-edged problem. The visual speech must be not only dynamically realistic in itself but also sufficiently synchronous and commensurate with the auditory speech to create the illusion that the two signals are physically tied; that is, that the face we’re watching is the source of the sound we hear.

High Sensitivity

Beyond the intrinsic difficulties of synthesizing visual speech (complexity and audio synchronicity), there’s an extrinsic perceptual problem. Humans are innately well attuned to faces, which makes us sensitive to unrealistic facial animation or bad lip sync.

This sensitivity might serve a communicative function. With our highly expressive faces, we seem designed for face-to-face communication. This obviously includes nonverbal communication: facial expressions modify the spoken word’s meaning and transmit emotional states and signals in the absence of speech. But faces are also integral to speech communication. Humans have an innate ability to lip-read, or recognize words by sight; this is true for both hearing-impaired and normally hearing people. The next time you’re in a noisy place such as a bar, notice how much you rely on seeing someone’s face in order to “hear” them. Even in noise-free environments, speech perception is still a function of both auditory and visual channels. Visual speech so strongly influences speech perception that it can even override the auditory percept, causing us to hear a sound different from the one the ear received: the famous McGurk effect [2].

High Volume

As the bar for realism in 3D animation continues to rise, and with it the demand for higher-quality lip sync, the quantity of speech and dialogue in animation is also rising exponentially. These are antagonistic sources of pressure: animators can’t satisfy quantity demands without sacrificing quality, and vice versa.

A case in point is the video game industry. Video game characters are becoming increasingly realistic, in both static appearance and behavior. Poor lip sync will be reflected in game reviews, which often include rants about lip sync. At the same time, games are becoming increasingly story-driven and cinematic, more like interactive movies than games. This means much more speech and dialogue, all of which must be animated.

As big-title games move online, the amount of assets in a game, including recorded audio, can increase by an order of magnitude. Rockstar Games’ Grand Theft Auto IV, released in 2008, had 660 speaking parts with 80,000 lines of dialogue. This was considered a staggering amount of speech at the time [3]. However, the new Star Wars massively multiplayer online game, The Old Republic, to be released in 2012 by Bioware, will feature some “hundreds of thousands of lines of dialogue,” or the equivalent of about 40 novels [4]. At just one-third of the way through the voice work for the game, the amount of audio recorded reportedly had already exceeded the entire six-season run of The Sopranos [5].

As if that weren’t enough, most games are released in multiple languages; to avoid unsynchronized or “dubbed” speech, all the speech animation must be redone for every language. Video game developers simply can’t do this by hand; they need automated solutions.

Sidebar References

1. F. Pighin et al., “Synthesizing Realistic Facial Expressions from Photographs,” Proc. Siggraph, ACM Press, 1998, pp. 75–84.

2. H. McGurk and J. MacDonald, “Hearing Lips and Seeing Voices,” Nature, vol. 264, 1976, pp. 746–748.

3. “Rockstar Games’ Dan Houser on Grand Theft Auto IV and Digitally Degentrifying New York,” 2 May 2008; http://nymag.com/daily/entertainment/2008/05/rockstar_games_dan_houser.html.

4. L. Smith, “E3 2009: Star Wars: The Old Republic Is World’s First ‘Fully Voiced’ MMO,” blog, 1 June 2009; http://massively.joystiq.com/2009/06/01/e3-2009-star-wars-the-old-republic-is-worlds-first-fully-voi.

5. B. Crecente, “The Old Republic Wordier Than Entire Run of the Sopranos,” Kotaku, 3 June 2009; http://kotaku.com/#!5278008/the-old-republic-wordier-than-entire-run-of-the-sopranos.

Bridging the Divides

Unlike performance-driven animation, which simply maps motion to motion, visual speech synthesis involves more distal mappings from audio or text to motion. As we’ve seen, this requires modeling complex dynamic phenomena specific to speech. So, our field is interdisciplinary, combining speech technology and computer graphics. But although it spans two disciplines, it’s a stepchild to both. Speech technologists might have a good grasp of speech but tend to show little interest in high-quality facial modeling or computer graphics. In the graphics community, the problem is reversed. Practitioners specialize in realistic 3D models but tend to underestimate the speech issues, resulting in poor lip sync. Few individuals sufficiently grasp both speech and computer facial animation to make the entire process work. Thus, we need a more collaborative approach.

Besides the cultural divide, a technological divide also exists. No standard software infrastructure incorporates speech technology and computer graphics. Visual-speech-synthesis platforms tend to be patchwork, with speech processing handled by research code, and rendering done offline in external programs such as Maya or Blender. A few systems have built-in rendering, for example, the Baldi project at the University of California Santa Cruz’s Perceptual Science Lab (http://mambo.ucsc.edu), the Semaine platform at Télécom ParisTech (www.semaine-project.eu), and a few commercial applications over the years. The tendency, however, is to build monolithic systems for a single application. Building on or extending such a software base is difficult, such that if you want to do one part differently, it’s often easier to start from scratch. The current state of affairs clearly isn’t conducive to accelerated development and collaboration.

Carnival’s Genesis

We needed something better for our own work. The Carnival system arose out of the need for a research tool with which to try out various audio-driven facial-animation methods that CSTR colleagues and Speech Graphics colleagues were developing. The tool had to be

■ a flexible system in which we could easily interchange and compare different methods,

■ a well-structured system to which other developers could contribute easily without it breaking, and

■ an interactive, real-time system letting us analyze animation against time-varying data.

Nothing like this existed at the time.

Figure 3. A typical processing pipeline for text-driven facial animation. Pronunciation and duration rules generate a categorical representation of the speech over time, providing common input to audio synthesis and motion synthesis. Thereafter, the process is similar to audio-driven animation.

We didn’t just want to engineer software for a particular application embodying a particular set of synthesis methods. Instead, we asked, how do we step back and provide a more general foundation for this work? The answer was to first perform a conceptual and structural analysis of the field, which would let us view speech technology and computer graphics components in a common ontological universe. Then, we implemented that analysis in an object-oriented system that emphasizes modularity and flexibility. The name Carnival is a play on Festival (a widely used speech synthesis platform previously developed at CSTR), with the added connotation of faces.

The Carnival API

At its core, Carnival is a C++ API. The API provides a flexible animation-system architecture, which consists of any number of dynamically loadable, combinable modules, all belonging to the superclass Component. Figure 4 shows the Component class hierarchy, which contains five subtypes: Sequence, Visualizer, Event, Processor, and Pipeline.

Sequences

A Sequence is any sequence of values, which might be a signal (for example, audio or video) or some encoding thereof. For example, each input and output object in Figures 1 and 3 is a Sequence. Sequence has two subtypes: String and TimeSeries. A String is an atemporal sequence (usually of characters, that is, text), whereas a TimeSeries is a sequence of values at specific time points.

TimeSeries has two subtypes. A NumericalTimeSeries has floating-point values on multiple channels, which we can sample continuously by interpolation. A CategoricalTimeSeries has categorical values, which extend over intervals and change at discrete boundaries (see Figure 2). NumericalTimeSeries divides into two further subtypes: Regular, with values spaced at a uniform time interval, and Irregular, with values spaced nonuniformly. The former includes Signals, which are Sequences that can be output in real time, including the subclasses Audio and Video.
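The difference between the two kinds of TimeSeries can be sketched in a few lines. The following is an illustrative Python sketch, not Carnival’s C++ API: the class layout, the method names sample and label_at, and the phoneme boundary times are all assumptions made for the example. It shows a regular numerical series sampled by linear interpolation and a categorical series whose labels hold over intervals.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RegularTimeSeries:
    """Numerical values on one or more channels, spaced at a uniform interval."""
    start: float               # time of the first frame, in seconds
    step: float                # uniform spacing between frames, in seconds
    frames: List[List[float]]  # frames[i][c] is channel c at start + i * step

    def sample(self, t: float) -> List[float]:
        """Sample every channel at time t by linear interpolation."""
        last = len(self.frames) - 1
        # Fractional frame position, clamped to the extent of the series.
        pos = min(max((t - self.start) / self.step, 0.0), float(last))
        i = min(int(pos), max(last - 1, 0))
        j = min(i + 1, last)
        w = pos - i
        return [(1.0 - w) * a + w * b
                for a, b in zip(self.frames[i], self.frames[j])]


@dataclass
class CategoricalTimeSeries:
    """Labels that extend over intervals and change at discrete boundaries."""
    boundaries: List[float]  # ascending times, one more than there are labels
    labels: List[str]

    def label_at(self, t: float) -> Optional[str]:
        """Return the label whose interval contains t, or None outside the series."""
        if not self.boundaries[0] <= t < self.boundaries[-1]:
            return None
        return self.labels[bisect_right(self.boundaries, t) - 1]


# One hypothetical motion channel, three frames half a second apart.
jaw = RegularTimeSeries(start=0.0, step=0.5, frames=[[0.0], [0.4], [0.2]])
print(jaw.sample(0.25))  # halfway between the first two frames

# The first four labels of Figure 2; the boundary times here are invented.
phones = CategoricalTimeSeries([0.00, 0.06, 0.11, 0.19, 0.25],
                               ["k", "w", "ih", "k"])
print(phones.label_at(0.12))
```

An irregular series would store an explicit time for each frame instead of a start and a step, and would locate the neighboring frames with the same kind of binary search used for the categorical labels.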

Visualizers

We considered it essential for Carnival to have real-time rendering capabilities. Users can always export animation to packages such as Maya, 3ds Max, or Softimage for high-quality in-scene rendering. However, they should also be able to preview animation in real time in the same system that produced it, so that feedback is immediate and they can view the animation in synchrony with relevant time-varying data.

Figure 4. The Carnival API’s Component class hierarchy. Components are self-contained modules (processors, data objects, and output systems) that can be dynamically loaded and assembled to form a mutable animation production system.

So, we designed a real-time rendering engine. But in keeping with Carnival’s design philosophy, we made it a Component, modular and self-contained. It’s called a Visualizer. A Visualizer has a control interface consisting of a set of deformation parameters (DPs). Each DP has the range [0, 1] for unidirectional or [-1, 1] for bidirectional deformations of the face, with a rest state of 0. A Visualizer is essentially an image decoder, converting a vector of DPs into an image on the screen. Thanks to encapsulation, how it does this (2D or 3D rendering) is of no concern to external callers.
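To make the control interface concrete, here is a minimal Python sketch under stated assumptions: the names DeformationParameter, TextVisualizer, set_vector, and render are hypothetical, the DP names are invented, and the “image” is just a line of text standing in for a rendered frame. It only illustrates the DP ranges, the rest state of 0, and the idea of decoding a DP vector behind an opaque interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DeformationParameter:
    """One control of the face: [0, 1] if unidirectional, [-1, 1] if bidirectional."""
    name: str
    bidirectional: bool = False

    def clamp(self, value: float) -> float:
        low = -1.0 if self.bidirectional else 0.0
        return min(max(value, low), 1.0)


class TextVisualizer:
    """Stand-in decoder that turns a DP vector into a text 'image'."""

    def __init__(self, params: List[DeformationParameter]):
        self.params = params
        # Every DP starts at its rest state of 0.
        self.state: Dict[str, float] = {p.name: 0.0 for p in params}

    def set_vector(self, values: List[float]) -> None:
        """Accept one value per DP, clamped to that DP's range."""
        if len(values) != len(self.params):
            raise ValueError("expected one value per deformation parameter")
        for param, value in zip(self.params, values):
            self.state[param.name] = param.clamp(value)

    def render(self) -> str:
        # Callers never see how the image is produced, only the result.
        return " ".join(f"{name}={value:+.2f}"
                        for name, value in self.state.items())


viz = TextVisualizer([
    DeformationParameter("jaw_open"),                         # hypothetical DP
    DeformationParameter("lip_corner", bidirectional=True),   # hypothetical DP
])
viz.set_vector([1.3, -0.5])   # 1.3 is out of range and gets clamped to 1.0
print(viz.render())           # jaw_open=+1.00 lip_corner=-0.50
```

Because callers only touch set_vector and render, a 2D or 3D implementation could replace the text decoder without any change on the calling side.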

For real-time animation, a Visualizer can be bound to a NumericalTimeSeries whose channels are the Visualizer’s DPs. Whatever the current time point is in the NumericalTimeSeries, any bound Visualizer will display an image visualizing that time point’s vector of values. Figure 5 illustrates binding to a NumericalTimeSeries.

Figure 5 also displays our 3D implementation of the Visualizer, which is based on OGRE (Object-Oriented Graphics Rendering Engine; www.ogre3d.org) and can accommodate any facial model created in standard 3D modeling packages. As Figure 5 shows, the 3D Visualizer consists of a control interface and an OGRE scene. The DPs can be bound not only to a NumericalTimeSeries but also to other DPs by linking functions. Ultimately, DPs link to low-level animation parameters of the facial model in the OGRE scene.
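The linking idea can be sketched as follows. This Python example is an assumption-laden illustration, not the OGRE-based implementation: the function resolve, the tuple layout of a link, and every DP, blendshape, and bone name are hypothetical. It shows one DP driving another DP, and DPs driving low-level rig parameters through simple functions.

```python
from typing import Callable, Dict, List, Tuple

# A link is (source DP, target name, linking function).
Link = Tuple[str, str, Callable[[float], float]]


def resolve(dps: Dict[str, float],
            dp_links: List[Link],
            rig_links: List[Link]) -> Dict[str, float]:
    """Propagate DP-to-DP links in order, then map DPs to rig parameters."""
    values = dict(dps)
    for source, target, fn in dp_links:
        values[target] = fn(values[source])
    return {target: fn(values[source]) for source, target, fn in rig_links}


# Current DP values, for instance the vector at the current time point.
dps = {"jaw_open": 0.8, "smile": -0.25}

# One DP driving another DP through a linking function.
dp_links: List[Link] = [
    ("jaw_open", "lower_lip_drop", lambda v: 0.5 * v),
]

# DPs driving low-level animation parameters of a facial rig.
rig_links: List[Link] = [
    ("jaw_open", "bone.jaw.rotate_x_deg", lambda v: 20.0 * v),
    ("lower_lip_drop", "blendshape.lower_lip_down", lambda v: v),
    # A bidirectional DP split across two one-sided blendshapes.
    ("smile", "blendshape.smile", lambda v: max(v, 0.0)),
    ("smile", "blendshape.frown", lambda v: max(-v, 0.0)),
]

rig = resolve(dps, dp_links, rig_links)
for name, value in rig.items():
    print(f"{name}: {value:.2f}")
```

Keeping the links as data rather than code in the renderer means a different facial model only needs a different link table, while the DP interface seen by the rest of the system stays the same.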

Events

A Sequence is essentially a representation of some temporal event, such as an utterance or an action. For example, when we record someone speaking, the audio signal is a representation of the event of their speaking. If video was recorded at the same time, it too represents the same event. Extracting some features from the audio, such as mel-frequency cepstral coefficients or pitch, will result in yet another representation. We can also have a text transcript of what the person said, which represents the event in still another way. A group of Sequences like this that all represent the same event in different ways form a natural grouping, which Carnival supports with the class Event. An Event is a container Component that contains an ordered list of Sequences. All the TimeSeries members of the Event share a common time domain. The String members have no time dimension, but they textually refer to the same interval. Figure 6 illustrates an Event.

Besides forming a grouping of related Sequences, Event also functions as a playback system, performing synchronous output of its real-time members. As it outputs each Signal, it also updates the current time in all TimeSeries, which is automatically reflected by an image change in any bound Visualizers. So, by simply changing the current time, the Event automatically produces animation.
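The way a time update turns into animation can be illustrated with a small Python sketch. It is not Carnival’s implementation: the class NumericalSeries, the methods bind, set_time, seek, and play, and the sample data are assumptions, and playback here simply steps through time instead of being paced by real audio output.

```python
class NumericalSeries:
    """Stand-in for a regular NumericalTimeSeries: one vector per frame."""

    def __init__(self, step, frames):
        self.step = step            # seconds between frames
        self.frames = frames
        self.current_time = 0.0
        self.visualizers = []       # callables that draw one vector

    def bind(self, visualizer):
        self.visualizers.append(visualizer)

    def set_time(self, t):
        self.current_time = t
        index = min(max(int(t / self.step), 0), len(self.frames) - 1)
        for draw in self.visualizers:
            draw(self.frames[index])   # every bound visualizer refreshes


class Event:
    """Ordered container of Sequences that represent the same temporal event."""

    def __init__(self, sequences):
        self.sequences = list(sequences)

    def seek(self, t):
        for seq in self.sequences:
            if hasattr(seq, "set_time"):   # plain strings have no time dimension
                seq.set_time(t)

    def play(self, duration, step):
        """Step through time; a real player would follow the audio clock."""
        for i in range(int(round(duration / step)) + 1):
            self.seek(i * step)


drawn = []                                   # records what a visualizer was asked to show
motion = NumericalSeries(step=0.5, frames=[[0.0], [0.6], [0.1]])
motion.bind(drawn.append)

event = Event(["One night her grandmother woke up ...", motion])
event.play(duration=1.0, step=0.5)
print(drawn)   # [[0.0], [0.6], [0.1]]
```

The visualizer never asks for frames itself. It is redrawn as a side effect of the Event moving the current time, which is what lets scrubbing and playback share one mechanism.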

Figure 5. Our 3D implementation of the Visualizer, consisting of a control interface and an OGRE (Object-Oriented Graphics Rendering Engine) scene containing a facial model. The interface comprises a set of deformation parameters (DPs, the blue squares), which can be bound to the current time point in a NumericalTimeSeries or to other DPs by linking functions. Ultimately, DPs link to low-level animation parameters of the facial model in the OGRE scene (the blue squares), such as blendshape weights or bone transformation parameters.

Processors and Pipelines

As Figures 1 and 3 illustrate, automated animation production basically entails converting Sequences from one form to another. The Component to handle Sequence conversion is the Processor class. Simply, a Processor takes one or more Sequences as input and gives one or more Sequences as output.

Processors are modeled after Unix commands in that they do relatively simple jobs, and they can be concatenated or piped. The analog of the Unix pipeline is the Pipeline class, a container Component holding an ordered list of Processors. In a Unix pipeline, each process’s output is redirected as input to the next one. But a Carnival Pipeline works differently: instead of connecting Processors by direct feeds, it passes an Event to each of them in order. Each Processor searches the Event for its required input Sequences, by traversing the list from end to beginning and performing type checking. If it finds the required input, it runs the process and adds its output Sequences to the end of the Event. The next Processor will then have access to this output and the output of earlier Processors. Figure 7 shows an example Pipeline.

In contrast to Unix pipelines, the searchable-list method of concatenation lets Processors have multiple inputs and outputs and receive input produced by nonimmediate predecessors. It’s similar in nature to concatenative programming languages, such as Joy or PostScript, but by using a list instead of a stack, it preserves all output instead of “popping” items off.
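The searchable-list method can be sketched in Python as follows. This is an illustration under assumptions, not the Carnival C++ classes: the method names find_inputs, process, and run are hypothetical, the Event is reduced to a plain list, and the three processors (named loosely after those in Figure 7) are placeholders that do no real signal processing.

```python
from typing import List, Optional, Tuple, Type


class Sequence:
    """Base class for anything stored in an Event."""
    def __init__(self, data):
        self.data = data


class Audio(Sequence): pass
class Text(Sequence): pass        # plays the role of a String
class Features(Sequence): pass    # for example, MFCC frames
class Alignment(Sequence): pass   # a categorical time series
class Motion(Sequence): pass      # a numerical time series


class Processor:
    inputs: Tuple[Type[Sequence], ...] = ()   # required input types

    def find_inputs(self, event: List[Sequence]) -> Optional[List[Sequence]]:
        """Search the Event from end to beginning, matching by type."""
        found = []
        for wanted in self.inputs:
            match = next((s for s in reversed(event)
                          if isinstance(s, wanted)), None)
            if match is None:
                return None               # a required input is missing
            found.append(match)
        return found

    def process(self, *inputs: Sequence) -> List[Sequence]:
        raise NotImplementedError


class MFCCExtractor(Processor):
    inputs = (Audio,)
    def process(self, audio):
        return [Features(len(audio.data))]            # placeholder output


class Aligner(Processor):
    inputs = (Features, Text)                         # two inputs, one of them old
    def process(self, features, text):
        return [Alignment(text.data.split())]         # placeholder output


class MotionGenerator(Processor):
    inputs = (Alignment,)
    def process(self, alignment):
        return [Motion([0.0] * len(alignment.data))]  # placeholder output


class Pipeline:
    def __init__(self, processors: List[Processor]):
        self.processors = list(processors)

    def run(self, event: List[Sequence]) -> List[Sequence]:
        for processor in self.processors:
            found = processor.find_inputs(event)
            if found is not None:
                event.extend(processor.process(*found))   # append, never pop
        return event


event = [Audio([0.0] * 160), Text("one night")]
Pipeline([MFCCExtractor(), Aligner(), MotionGenerator()]).run(event)
print([type(s).__name__ for s in event])
# ['Audio', 'Text', 'Features', 'Alignment', 'Motion']
```

Note that the Aligner reads the Text that was in the Event from the start, not something its immediate predecessor produced, and that every intermediate Sequence is still in the list when the Pipeline finishes.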

Figure 6. An Event, containing an ordered list of Sequences representing the same temporal event, such as an utterance. The TimeSeries members share a common time domain with current elapsed time t. This Event includes a String, an Audio, two NumericalTimeSeries (to one of which a Visualizer is bound), a CategoricalTimeSeries, and a Video. An Event has playback functions such as play, pause, and seek, which control synchronous output of the set’s real-time members.

Like other Components, Processors are dynamically loaded at runtime. So, while an application runs, various different Pipelines can be assembled and used, for a runtime-programmable system. The modular and programmable nature of Pipelines fulfills the design objective that Carnival should be mutable and should allow for easy interchange of methods.
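A rough Python analogy for runtime assembly is a registry of named processors from which pipelines are built on demand. Everything here is assumed for illustration: the registry stands in for dynamic loading (which in a C++ system would typically involve shared libraries), processors are plain functions on a list of (kind, payload) items, and the two interchangeable motion methods are toys.

```python
from typing import Callable, Dict, List, Tuple

Item = Tuple[str, object]                    # (sequence kind, payload)
ProcessorFn = Callable[[List[Item]], None]   # appends its output to the event

REGISTRY: Dict[str, ProcessorFn] = {}


def register(name: str):
    """Make a processor available under a name (stands in for dynamic loading)."""
    def decorator(fn: ProcessorFn) -> ProcessorFn:
        REGISTRY[name] = fn
        return fn
    return decorator


def latest(event: List[Item], kind: str):
    """Search the event from end to beginning for the newest item of a kind."""
    return next((data for k, data in reversed(event) if k == kind), None)


@register("count_words")
def count_words(event):
    text = latest(event, "text")
    if text is not None:
        event.append(("word_count", len(text.split())))


@register("open_jaw_per_word")
def open_jaw_per_word(event):
    n = latest(event, "word_count")
    if n is not None:
        event.append(("motion", [1.0, 0.0] * n))   # toy open/close per word


@register("hold_jaw_closed")
def hold_jaw_closed(event):
    n = latest(event, "word_count")
    if n is not None:
        event.append(("motion", [0.0] * (2 * n)))  # toy alternative method


def assemble(names: List[str]) -> List[ProcessorFn]:
    """Build a pipeline at runtime from processor names."""
    return [REGISTRY[name] for name in names]


def run(pipeline: List[ProcessorFn], event: List[Item]) -> List[Item]:
    for processor in pipeline:
        processor(event)
    return event


text = "One night her grandmother woke up"
event_a = run(assemble(["count_words", "open_jaw_per_word"]), [("text", text)])
event_b = run(assemble(["count_words", "hold_jaw_closed"]), [("text", text)])
print(latest(event_a, "motion"))
print(latest(event_b, "motion"))
```

Swapping one synthesis method for another amounts to changing a name in the list passed to assemble, which is the kind of interchange and comparison the design objective calls for.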

Accelerated Development

Carnival-based applications extend the modular paradigm by further subclassing Components. Typically, they also add a GUI and other layers that tailor the user experience by providing higher-level commands and restricting what users can do or see. For example, you can make an application that’s exclusively text-driven or audio-driven. But underneath, each application inherits the parent system’s dynamically modular design.

Application development is accelerated because the API supplies so much structure and function in advance, but without confining the application to a particular form. Once an application is established, it can grow quickly by adding more and more Components. The modularity of Components allows for concurrent development. Researchers can focus on coding their own algorithms by writing Processors; the Processor interface lets them immediately join their Processors to any Pipeline. Processors are black boxes that snap together, a desirable outcome in object-oriented design.

The developer determines the user interface. In designing our in-house application, we’ve studied GUIs from a variety of exemplars: audio analysis tools, video-editing systems, speech synthesis programs, and 3D modeling and animation packages. We also obtained feedback from animation professionals about the functionality they would like. Our application includes graphical editors for each type of Component. For example, the Visualizer interface (see Figure 8) gives manual access to deformation parameters. The Timeline interface provides playback controls and an interactive display of time-series data, with a cursor that can be scrubbed (dragged back and forth in the timeline for framewise output).

Figure 7. An example Pipeline in action, demonstrating audio-driven animation in which a text transcript accompanies the audio. At the start, the Event contains two Sequences: Audio and String. Each Processor searches the Event backwards for its required input Sequences and adds its output to the end of the Event. So, Processors have access to multiple Sequences, including those produced by nonimmediate predecessors. (Processors labeled in the figure: MFCC_Extractor, Aligner, Syllabifier, and MotionGenerator.)

Figure 8. The graphical user interface for a Visualizer, with manual access to deformation parameters. Application developers can extend Carnival by giving Components graphical interfaces for user editing and viewing.

Animators are increasingly looking for automated solutions to their problems. Visual speech synthesis is an attractive option from the viewpoints of cost and quality.

Proper integration of speech technology requires stepping back from specific programming tasks and looking at the big picture. Animation is just another form of synthesis, and visual output is just another form of output. Speech technologists are used to separating the problem of synthesis into generation of underlying dynamics followed by reconstruction of signal output. That the output is visual in this case shouldn’t hinder us from seeing the problem as a unified synthesis problem. Breaking that problem down in a modular, abstract, object-oriented manner is correct from both a conceptual and a technological viewpoint.

We plan to continue our two-pronged approach, using Carnival in both the collaborative research environment at the University of Edinburgh and the commercial setting at Speech Graphics. Applications in this framework will become available for licensing by animators and video game producers, as well as for academic research.

Reference

1. H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, “Quantitative Association of Vocal-Tract and Facial Behavior,” Speech Communication, vol. 26, nos. 1–2, 1998, pp. 23–43.

Michael A. Berger is a cofounder and the chief technology officer of Speech Graphics, and a PhD candidate at the University of Edinburgh School of Informatics, in the Centre for Speech Technology Research. Contact him at berger@speech-graphics.com.

Gregor Hofer is a cofounder and the chief executive officer of Speech Graphics. Contact him at [email protected].

Hiroshi Shimodaira is a lecturer in the University of Edinburgh School of Informatics and a member of the Centre for Speech Technology Research. Contact him at h.shimodaira@ed.ac.uk.

Contact Department Editor Miguel Encarnação at lme@computer.org.