Max-Planck-Institut für Informatik Computer Graphics Group
Saarbrücken, Germany
Character Animation from a Motion Capture Database
Master Thesis in Computer Science Computer Science Department
University of Saarland
Edilson de Aguiar
Supervisors: Dipl. Inf. Christian Theobalt, Prof. Dr. Hans-Peter Seidel
Max-Planck-Institut für Informatik, Computer Graphics Group
Saarbrücken, Germany
Begin: June 1,
Statutory Declaration
I hereby declare in lieu of an oath that I have written this master's thesis independently and without outside help. I have used no aids other than those indicated, and I have marked all passages taken from other sources as such.
Saarbrücken, November 26, 2003
Abstract
This thesis discusses methods that use information contained in a motion capture database to assist in the creation of a realistic character animation. Starting with an animation sketch, where only a small number of keyframes for some degrees of freedom are set, the motion capture data is used to improve the initial motion quality. First, the multiresolution filtering technique is presented and it is shown how this method can be used as a building block for character animation. Then, the hierarchical fragment method is introduced, which uses signal processing techniques, the skeleton hierarchy information and a simple matching algorithm applied to data fragments to synthesize missing degrees of freedom in a character animation from a motion capture database. In a third technique, a principal component model is fitted to the motion capture database and it is demonstrated that, using the motion principal components, a character animation can be edited and enhanced after it has been created. After comparing these methods, a hybrid approach combining the individual techniques' advantages is proposed, which uses a pipeline to create the character animation in a simple and intuitive way. Finally, the methods and results are reviewed and approaches for future improvements are mentioned.
Acknowledgements
First, I want to thank my supervisors, Dipl. Inf. Christian Theobalt and Prof. Dr. Hans-Peter Seidel, for their help and advice during the development of this thesis. In addition, I thank all my friends in the IMPRS, and Kerstin Meyer-Ross for her help here in Germany. I also thank all my colleagues in the Computer Graphics group at MPI, especially Volker Blanz for his help with the PCA theory, and Thomas Annen and Grzegorz Krawczyk for helping me with LaTeX.
I also wish to thank my family, who always supported and encouraged me and never let me give up, despite the distance. Thank you Mom, Dad, Raquel, Rose and Enoc.
Contents
1 Introduction
1.1 Motivation
1.2 Goals
1.3 Thesis Outline
2 Fundamentals of Character Animation
2.1 Introduction
2.2 Keyframing
2.3 Physical Simulation
2.4 Motion Capture
2.5 Project Implementation Aspects
2.5.1 Motion Capture Database
2.5.2 Skeleton Model
3 Multiresolution Filtering Method
3.1 Introduction
3.2 Related Work
3.2.1 Signal Processing Methods
3.2.2 Multiresolution Methods
3.3 Multiresolution Filtering Method
3.4 Multiresolution filtering on motion data
3.5 Application to Character Animation
3.6 Experiments
3.7 Discussion
4 Fragment Based Methods
4.1 Introduction
4.2 Related Work
4.3 Overview
4.4 Motion Analysis
4.4.1 Motion Phases
4.4.2 Frequency Analysis
4.4.3 Correlation
4.5 Motion Synthesis and Texture
4.5.1 Fragmentation
4.5.2 Matching
4.5.3 Joining
4.5.4 Smoothing
4.6 Experiments
4.7 Hierarchical Fragment Method
4.7.1 Skeleton Hierarchy and Correlation
4.7.2 Method
4.7.3 Experiments
4.8 Discussion
5 Principal Component Analysis
5.1 Introduction
5.2 Related Work
5.3 Principal Component Analysis
5.3.1 Overview
5.3.2 PCA Theory
5.3.3 Data Compression
5.4 PCA for motion synthesis
5.4.1 Overview
5.4.2 Motion Synthesis
5.5 Experiments
5.6 Discussion
6 Hybrid Approach
6.1 Introduction
6.2 Hybrid Approach
6.3 Experiments
6.4 Discussion
List of Figures
2.1 Example of the motion capture session and the equipment used to capture the motions in the database: (a) the camera setup; (b) the subject performing the motion. Images from http://www.e-motek.com.
2.2 Example of the motion data in the database. The figure shows the z-angle values for the pelvis, hip, clavicle, forearm and knee joints, respectively.
2.3 Joints and bones forming the skeleton model used in the project.
2.4 The skeleton joint hierarchy. The right side shows the lower kinematic sub-chain and the left side the upper kinematic sub-chain.
3.1 Generation of the Gaussian pyramid. The value of each node in the next row is computed as a weighted average of a sub-array of nodes. In this example a sub-array of length five is used. Adapted from [BA83].
3.2 Visualization of different frequency bands of a Gaussian pyramid (shown only for the first 40 frames). The band g0 corresponds to the original signal. The low-pass bands corresponding to the high frequencies are g1 and g2, to the middle frequencies g3 and g4, and to the low frequencies g5 and g6.
3.3 Visualization of different frequency bands of a Laplacian pyramid (shown only for the first 40 frames). The band-pass bands corresponding to the high frequencies are l0 and l1, to the middle frequencies l2 and l3, and to the low frequencies l4 and l5.
3.4 Using multiresolution to increase the gain in the middle frequencies.
3.5 Using multiresolution to decrease the gain in the middle frequencies.
4.1 The input for the general fragment based method: (a) keyframed and motion capture data are decomposed into frequency bands; (b) animators set the general method parameters. Driven and master joints are chosen to guide the method, and a particular frequency band of the master joint is chosen to guide the fragmentation step. Joints in black will be textured, and joints in blue and red will be synthesized.
4.2 A fragment based method is composed of four steps: fragmentation (a), matching (b), joining (c) and smoothing. At the end, an original keyframed character animation is enhanced by synthesis and texturing (d).
4.3 Example of a walking animation: (a) set of four phases during a human walking cycle; (b) plot of the right hip z-angle values, in which the respective phases can be seen.
4.4 Plot of the pelvis joint angle against the hip joint angle for all examples in the database. The shape shown in red demonstrates a good correlation between these joints.
4.5 Example of the fragmentation step: (a) original degree of freedom; (b) fragments created at locations where the first derivative changes its sign.
4.6 Considering one driven fragment (a), in the matching step all data fragments are compared with the driven fragments (b), being stretched or compressed properly (c). At the end, a number of good fragments are found (d).
4.7 In the joining step the good fragments found in the matching step are concatenated or blended (a). Three different criteria were tested and compared: (b) best fragment; (c) cost matrix and (d) best animation.
4.8 In the smoothing step, the discontinuity magnitude (top left) is multiplied with a smoothing function (top right) and the result is added back to the original motion signal. In this way, the continuous version shown on the bottom left is generated. Adapted from [AF02].
4.9 Example of the joining approaches developed. In (a) all good fragments generated in the matching step for the z-angle of the pelvis joint are shown. The final sequences generated by the three approaches are then shown in (b), (c) and (d).
4.10 Example of possible joint correlations: (a) represents a bad correlation between joints that do not belong to the same kinematic sub-chain; (b) represents a good correlation between joints that belong to the same kinematic sub-chain.
4.11 Example of the root generation stage: (a) driven joints (red) and root joint (green) are shown; (b) the parents of the driven joints are generated; (c) the next parent joint is generated and (d) the root joint is generated.
4.12 Example of the motion generation stage: (a) root joint; (b) the root joint together with the original driven joints is used to texture or synthesize its children; (c) the next children are generated; (d) all joints are generated.
4.13 Example of the Hierarchical Fragment method being applied to a human character: (a) shows some frames from the initial character animation where only lower body joints are keyframed; (b) shows the respective frames from the resulting character animation where upper body joints are synthesized and lower body joints are textured.
4.14 Another example of the final method being applied to a human character: (a) shows some frames from the initial character animation where only upper body joints are keyframed; (b) shows the respective frames from the resulting character animation where lower body joints are synthesized and upper body joints are textured.
4.15 Another example of the final method being applied to a human character: (a) shows some frames from the initial character animation where all left body side joints are keyframed; (b) shows the respective frames from the resulting character animation where all joints are synthesized.
5.1 Three approaches are investigated in order to fit a PCA model to keyframed and motion capture data (a). In (b) one PCA model is created for each missing DOF. In (c) one PCA model is created for each frequency band of a missing DOF. In (d) one PCA model is created for each sequence position of a particular frequency band of a missing DOF.
5.2 Applying the PCA method to a human character: (a) shows some frames from the initial animation where the left hip, knee, elbow and upper-arm joints are keyframed; (b) shows the respective frames of the resulting animation where new motion for the joints is generated from the database in order to match the keyframed DOFs.
5.3 Use of the motion principal components to edit a motion: (a) arm positions are modified by altering the principal components for the right and left clavicle joints; (b) leg positions are also modified by altering the principal components for the right and left knee joints.
6.1 Pipeline showing the Hybrid approach.
6.2 Example of the first stage: (a) shows some frames from the initial animation where left body side joints are keyframed; (b) shows the respective frames after applying the PCA method; (c) and (d) show possible editing capabilities of the PCA, where the influence of the principal components for some DOFs is modified.
6.3 Example of the second stage: (a) shows some frames from the initial animation where left body side joints are keyframed; (b) shows the respective frames after applying the first stage (PCA); (c) shows the same frames after applying the Multiresolution Filtering method to decrease the low frequency bands, generating a smooth acceleration of the movement.
6.4 Example of the last stage: (a) shows some frames from the initial animation where left body side joints are keyframed; (b) shows the respective frames after applying the first stage; (c) shows the frames after applying the second stage; (d) shows the same frames after applying the Hierarchical Fragment Method in order to improve its realistic appearance.
6.5 Example of the Hybrid approach: (a) shows some frames from the initial animation where left body side joints are keyframed; (b) shows the respective frames from the animation generated automatically by the hybrid method; (c) and (d) show the same frames from the resulting animations generated when motion principal components are used to change arm and leg positions by adding or subtracting constant values.
Chapter 1
Introduction
1.1 Motivation
Generating realistic character animation remains one of the great challenges in computer graphics. Currently, there are three main methods by which this animation can be generated. Most commonly, keyframing is used, in which the animator specifies important key poses for the character at specific frames, and the computer calculates the frames in between by an interpolation technique. A second approach uses physical simulation to drive the character's motion. Although results seem promising, this method has not been used with much success for characters because of its lack of control, difficulty of use, instabilities and high computation cost. The last approach, motion capture, has been widely used to animate characters. The idea is to place sensors on subjects and collect the data that describes their motion while they perform the desired action.
As the technology for motion capture has improved and its cost has decreased, interest in using this approach for character animation has increased. The main challenge confronting an animator is to create sufficient detail to generate a character animation with a realistic appearance. Achieving detail in a keyframed animation is extremely labor intensive. With motion capture, however, the details are immediately present. In other words, the data contains the motion signature.
The main problem with motion capture is the lack of flexibility. For instance, after collecting the data it is difficult to change it. Due to this problem, many animators have little interest in using motion capture data. Although keyframing is labor intensive, the animator can make a character do exactly what he wants it to.
Since it is often difficult to know exactly what motions are needed before entering a motion capture session, many techniques have been developed to edit motion capture data after it has been collected (see Sec. 2.4). In motion editing, two different aspects should be considered. First, it is important not to alter the motion in such a way that the detail is lost. Second, the system should not just provide editing capabilities but should also capture the essence of the motion, since the animator may want a completely different action.
Our intent in this project is to consider the case in which an animator wants to use a number of existing motion sequences, for instance stored in a motion capture database, to generate new motions. The idea is to use the style and life-like qualities of the motions in the database to add details and a particular style to an initial keyframed animation. To this end, a different approach to creating a character animation is proposed: the animator starts the animation with keyframing, a method that he is familiar with, to create some degrees of freedom. After that, the motion capture database is used interactively to enhance the initial keyframed animation.
In the end, by using the strengths of both keyframing and motion capture, the character performs realistic motion while preserving the keyframed style and incorporating the details of the motions in the database. As a result, the animator does not need to spend hours defining key poses, and expensive motion capture sessions can be kept to a minimum.
1.2 Goals
As mentioned in the previous section, the main goal of this work is to combine the strengths of keyframing and motion capture in order to simplify the process of creating character animations. In recent years, different approaches have been used to achieve this goal. Liu and Popovic [LP02] presented a method for prototyping realistic character motion using a constraint detection method that automatically generates the constraints by analyzing the input motion. Tanco and Hilton [TH00] presented a system that synthesizes motion sequences from a database of motion capture examples using a statistical model created from the captured data. Pullen and Bregler [PB02] described a method to enhance an initial keyframed animation using motion capture data, by decomposing the data into frequency bands and fragments.
In this project, we want to analyze important characteristics of these methods by implementing and comparing two different approaches (chapters 4 and 5) and verifying the importance of some aspects of human motion: skeleton hierarchy, joint correlation, motion frequency range and motion phases. In addition, we try to analyze and synthesize motion by fitting a principal component model.
Another goal is to understand the details of the characteristics of human motion: variations in repetitive motions and differences in the same movement executed by different individuals. A detailed understanding of these data is important in fields like biomechanics and medicine, where many quantitative studies of human motion have been conducted for the purposes of treatment and prevention of injuries [AA00]. This understanding is also interesting for character animation, since the details of a motion usually reveal mood or personality.
Using this understanding, in the long term we intend to develop a system able to describe and characterize character motion at a high level. For instance, using simple parameters and descriptions, the system would be able to create character animations reflecting aspects like gender, mood and personality.
1.3 Thesis Outline
Chapter 2 gives a brief review of the three main animation methods. Their advantages and disadvantages are described in more detail with references to the related work. In addition, some implementation aspects of our project are mentioned: the skeleton model and the motion database that was used.
Chapter 3 presents the multiresolution filtering technique. Possible motion editing capabilities of this technique are investigated and it is shown how this method can be used as a building block for character animation.
In chapters 4 and 5, methods implemented to enhance an initial keyframed animation are described. Chapter 4 introduces the fragment based methods, showing that they can be successfully applied to character animation. After decomposing keyframed and captured data into frequency bands, motion phases are used to divide the motion capture data into small pieces, which are used to improve the original keyframed animation.
In chapter 5 a principal component model is fitted to the motion capture database and it is shown that, using motion principal components, it is possible to create, edit and enhance a character animation.
Using the techniques described in the previous chapters as components, chapter 6 introduces a pipeline solution to interactively control the creation of a character animation, which results in a better performance and more expressive results.
Finally, in chapter 7 all techniques and their respective results are briefly reviewed and approaches for future improvements are mentioned.
Chapter 2
Fundamentals of Character Animation
2.1 Introduction
In this chapter the three main methods by which character animations are created will be briefly described: keyframe interpolation, physical simulation and motion capture. Each of these methods has its advantages and disadvantages and they are appropriate in different situations. In the following sections, characteristics of these animation methods will be reviewed in more detail on the basis of their relevant related work.
In the last section, some important implementation aspects related to the project are mentioned: the skeleton model and the motion capture database that was used.
2.2 Keyframing
In the traditional technique, an animator first draws the motion extremes, and then the intermediate frames using keyframes as a guide. Nowadays, using computers, an animator can specify keyframes by posing the model in a specific position. A computer then calculates the remaining frames by interpolating between keyframes in order to create the motion curves that drive the model action.
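The in-betweening step described above can be illustrated with a minimal sketch. It assumes a single degree of freedom and plain linear interpolation, chosen here only for brevity; production systems typically use smooth splines. The function name `interpolate_keyframes` and the knee-angle values are hypothetical and not taken from the thesis implementation.

```python
from bisect import bisect_right

def interpolate_keyframes(keyframes, frame):
    """Linearly interpolate one degree of freedom at `frame`.

    `keyframes` is a list of (frame_number, value) pairs sorted by frame
    number. Frames outside the keyed range are clamped to the nearest key.
    """
    frames = [f for f, _ in keyframes]
    if frame <= frames[0]:
        return keyframes[0][1]
    if frame >= frames[-1]:
        return keyframes[-1][1]
    # Index of the first keyframe strictly after `frame`.
    hi = bisect_right(frames, frame)
    (f0, v0), (f1, v1) = keyframes[hi - 1], keyframes[hi]
    t = (frame - f0) / (f1 - f0)  # normalized position between the two keys
    return (1.0 - t) * v0 + t * v1

# Hypothetical knee-angle keys (frame, degrees), chosen only for illustration.
knee_keys = [(0, 0.0), (10, 40.0), (20, 10.0)]
# The motion curve: one interpolated value per frame.
curve = [interpolate_keyframes(knee_keys, f) for f in range(21)]
```

Evaluating the function once per frame yields the motion curve for that degree of freedom; a full character would need one such curve per DOF.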
The main problem with keyframing is that this technique is time-consuming and labor intensive. A typical articulated kinematic model, such as a humanoid character, usually has at least 50 degrees of freedom (DOFs). In keyframing, an animator must animate all these DOFs, one at a time. To construct a more realistic model, its complexity is increased and the animator must keyframe more degrees of freedom. Constraints, such as the positions of legs and arms at specific times, are a recurring problem because all DOFs of the character must be specified at these positions. One possible way to treat this problem is to use inverse kinematics to simplify keyframe definition. In this case, after posing an arm or a leg, it is possible to calculate the positions of the parent joints (forearm, hip and knee) that allow the character to reach this point.
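To make the inverse kinematics idea concrete, the following sketch solves the simplest case analytically: a planar chain of two bones rooted at the origin, for example an upper arm and a forearm. This is an illustrative assumption, not the solver used in the project; the names `two_link_ik` and `forward` and the bone lengths are hypothetical.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Return (shoulder, elbow) angles in radians so that a planar two-bone
    chain rooted at the origin reaches (x, y), or None if out of reach."""
    d2 = x * x + y * y
    d = math.sqrt(d2)
    if d > l1 + l2 or d < abs(l1 - l2):
        return None
    # Law of cosines gives the bend at the middle joint.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))  # guard against rounding
    elbow = math.acos(cos_elbow)
    # Direction to the target minus the offset introduced by the second bone.
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    return shoulder, elbow

def forward(shoulder, elbow, l1, l2):
    """Forward kinematics: end position of the chain for the given angles."""
    ex = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    ey = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return ex, ey

# The animator poses only the end of the chain; the joint angles follow.
angles = two_link_ik(1.0, 1.0, 1.0, 1.0)
reached = forward(angles[0], angles[1], 1.0, 1.0)
```

The animator specifies one target position instead of two joint angles; chains with more joints generally require numerical solvers.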
Another problem with this technique is the interpolation process. Usually interpolation is done with smooth splines, which fail to model the high-frequency variations present in real human motion. The number of keyframes set by the animator is also an important aspect. If too few keyframes are set, the motion may lack the details usually seen in live motion. To overcome this problem, trained animators achieve a high level of detail by setting more and more keyframes, but at the expense of more time and effort.
Although keyframing is extensively used by animators, nowadays it is not a topic of much research. Recent works related to keyframing are trying to improve its quality using noise functions to describe variations in the motion signature. Bodenheimer et al. [BSH99] described how to introduce natural-looking variability into cyclic human motion animation using a noise function. Perlin and Goldberg [PG96] presented Improv, a system for scripting interactive actors in a virtual world using Perlin noise functions [Per85] to characterize personality and mood in human motion.
2.3 Physical Simulation
This technique tries to reduce the animator's work by using physics to determine the motion in situations where it can be clearly specified. Although these methods have been successfully applied to animating cloth deformation (DeRose [DKT98]), rigid bodies (Baraff [Bar94] and Moore [MW88]) and fluids (Foster and Fedkiw [FF01]), the application of physics based methods to the generation of character motion remains challenging.
Most of the work in physical simulation was done by researchers seeking accurate models for use in biomechanical studies, as stated in Pandy [Pan01]. Such models were found to be very complex: more than one muscle can control one joint, muscles exert nonlinear forces on tendons and joints usually have complex kinematics, involving sliding and rotation about multiple axes, as mentioned in Delp and Loan [DL00].
Clearly, this type of modeling is not practical for computer graphics, where an animator wants to quickly compose a wide variety of motions. In fact, an animator most certainly will not know the proper configuration of muscles and bones or the amount of energy required in a specific movement.
An alternative way to create animations of articulated figures using physics was proposed by Witkin and Kass [WK88] in the method called spacetime constraints. By considering the entire animated motion as one numerical problem, in contrast to most previous methods that consider each frame independently, spacetime constraint methods allow motion editing that preserves the characteristics of the original motion, as well as animating motion based on incomplete observations. The general approach is to specify some physical parameters of the character, like the masses of each limb and the spring constants of the joints. Then, using constraints such as leg and arm positions at a specific time, an animator can control the character's key positions. At the end, the motion is determined by solving a constrained optimization problem, in which the character's energy is minimized while the constraints are satisfied.
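As a rough sketch, the optimization described above can be written as follows. The notation is assumed here for illustration and is not taken from [WK88]: $q(t)$ denotes the vector of joint parameters over time, $E$ an energy measure, and $c_k$ the animator's constraints at times $t_k$.

```latex
\min_{q(t)} \; \int_{t_0}^{t_1} E\bigl(q(t), \dot{q}(t), \ddot{q}(t)\bigr)\, dt
\quad \text{subject to} \quad c_k\bigl(q(t_k)\bigr) = 0, \qquad k = 1, \dots, K .
```

Because the unknown is the whole trajectory $q(t)$ rather than a single pose, every constraint can influence the motion at every frame, which is what distinguishes this formulation from frame-by-frame methods.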
However, it is normally difficult to model complex human motion and to correctly specify masses and joint spring constants. Another problem is the high computation cost of solving the constrained optimization problem. The most successful attempts at using physical simulation of human motion come from robotics research. For instance, Raibert [Rai02] introduced some form of feedback control in the robot movement instead of just solving the equations of motion and predicting the proper initial conditions.
Applying the same principle to character animation, Hodgins et al. [HWBO95] developed a method to apply control systems to virtual humans. In other work, Raibert et al. [RH91] applied these controls to create animations of humans performing athletic events such as running, biking and gymnastic vaulting. Although the characters vaguely performed the activity being simulated, the motions did not look realistic. Another problem is that each activity was treated separately, and it turned out that their method cannot be used in the same way for different activities. Due to lack of control, difficulty of use, instabilities and high computation cost, physical simulation is not yet widely used for animating a complete character. One possible way to include physics in character animation is to use spacetime constraints together with other techniques, as described in the next section.
2.4 Motion Capture
In this technique joint angles of a performing actor are recorded via sensors. These values are then used to create a character motion (Menache [Men99]). In the past, such data was extremely difficult to obtain, as the sensor technology was costly. However, recently the technology has improved, the costs decreased and, therefore, this technique is becoming more available for general use.
Currently the most common techniques for obtaining motion capture data are mechanical, optical, magnetic and video-based systems. In a mechanical system, tracking is performed by having the subject wear a mechanical device, or exoskeleton, with angle-measuring devices located at its joints. By measuring the joint angles of the exoskeleton, the orientation of the subject's limbs is obtained. The main advantages of this system are that it is accurate and cheap, and that it allows the use of haptic devices to generate feedback reactions. However, it is not easy to perform fast and expressive motions due to the weight of the exoskeleton and the limited range of the angle measurement devices. In addition, shifts in the position of the exoskeleton cause errors in the motion capture process.
In an optical system, retro-reflective markers are attached to the body of the subject. A set of cameras surrounds the space where the subject moves, and each camera sends out a beam of infrared light, which is reflected back from the markers. After the marker locations are recorded as 2D positions in the camera image planes, post-processing finds the 3D location of each marker at each time instant and then solves for the joint configurations. The main advantages of optical systems are the very high rates of data collection and the possibility of capturing a great range of motion in a relatively large space. However, they require intensive post-processing computations and present problems related to occluded markers and to capturing more than one subject, where the markers can overlap each other.
In a magnetic system, a known magnetic field is set up and a subject wears sensors that detect location and orientation of each limb based on the magnetic field. This technique allows real-time data collection and there are no problems with occlusion. On the other hand, this method is very sensitive to the area where the motion is performed. In addition, due to the wires that must be attached to sensors, many motions are awkward for the subject.
Systems of the last category, video-based systems, try to get the motion data by merely using a couple of video cameras. Although it is an interesting technique, it is complicated. Standard computer vision techniques do not work for an articulated figure, in which many of the motions cannot be defined by a simple affine transformation, but involve rotations about all the joints in the kinematic chain. Currently, much research is trying to make video-based motion acquisition more accurate and, therefore, useful for motion capturing.
The interest in using motion capture for creating character animation is increasing. The main reason is that this technique can provide motion data for all degrees of freedom at a high level of detail. With motion capture, all the details of a motion are inherent in the data and hence come for free. In addition, the data can be transferred to a different character with different skeleton dimensions.
On the other hand, this technique also has disadvantages. The main problem is the lack of flexibility. Since it is difficult to modify a captured motion after it has been collected, an animator should know exactly what he wants. In general this is not the case, since the process of creating a character animation is normally evolutionary. Usually, an animator has only a coarse impression of the desired motion before he captures it, and minor corrections might be needed at any time thereafter. In addition, motion capture sessions are still costly and labor intensive, which makes repeating them prohibitive.
As a result, a great deal of research in recent years aimed at providing better ways of editing motion capture data after it is collected. A general approach is to adapt the motion to different constraints while preserving the style of the original motion. Wiley and Hahn [WH97] developed a method providing inverse kinematics capability by mixing motions from a database to create a new animation that matches a certain specification. Witkin and Popovic [WP95] developed a method in which the motion data is warped between keyframe-like constraints set by the animator. Warping is done by overlapping and blending motion clips. Rose et al. [RCB98] developed a method which uses radial basis functions and low order polynomials to interpolate between example motions while maintaining inverse kinematic constraints.
As mentioned in the previous section, spacetime constraints can be used in order to include physics in character animation. Gleicher [Gle97] presented a method that allows an animator to start with an initial animation and to interactively repose the character. A spacetime constraint solver is then used to minimize the difference between the new and old motion, subject to constraints specified by the animator. A similar approach was also used by Gleicher [Gle98] to retarget motions to characters of different dimensions. Lee and Shin [LS99] combined a hierarchical curve fitting technique with an inverse kinematics solver to adapt the motion. Popovic and Witkin [PW99] developed a method that uses a reduced dimensionality space and dynamics to perform the editing process. Rose et al. [RGBC96] described the generation of motion transitions using a combination of spacetime and inverse kinematics constraints in order to create seamless and dynamically plausible transitions between motion segments.
A more general problem with motion capture is that it is not an intuitive way to start a character animation. Since many factors (such as the environment) influence the motion, the final motion sequence will not be known with all details right from the beginning. Animators are usually trained to use keyframes. They will often build an animation by first making an initial motion sketch with a few keyframes and add complexity and detail on top of that later.
Therefore, combining the strengths of keyframing and motion capture is expected to simplify the process of animating a character. In our approach, an animator starts a character animation with keyframing, a method that he is familiar with, to animate some degrees of freedom. After that, he uses the motion capture database interactively to enhance the initial animation.
2.5 Project Implementation Aspects
The methods described in the next chapters are implemented in our prototype system using the C++ programming language on the Linux platform. In order to facilitate the skeleton and animation manipulation, the free open-source character animation library CAL3D is used in the project.
The character animation library, CAL3D, is coded in C++ and uses the STL containers to store the data. It provides basic data structures for skeleton-based character animation: sequencing and blending of animations, and handling of bones, skeletons, materials and meshes. The character motion is composed of two kinds of transformations: translation and rotation. Translations are represented as vectors and rotations as quaternions. In comparison with other rotation representations (e.g. explicit matrices or Euler angles), keyframe interpolation using quaternions is accurate and intuitive.
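As an illustration of keyframe interpolation with quaternions, the following minimal Python sketch implements spherical linear interpolation (slerp), a common way of blending two unit quaternions. It is not taken from CAL3D; the function name `slerp` and the (w, x, y, z) component order are assumptions made for this example.

```python
import math


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    # q and -q encode the same rotation; flip one to follow the shorter arc.
    if dot < 0.0:
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(1.0, dot)
    theta = math.acos(dot)
    if theta < 1e-6:
        # Nearly identical rotations: fall back to linear interpolation.
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        s = math.sin(theta)
        w0 = math.sin((1.0 - t) * theta) / s
        w1 = math.sin(t * theta) / s
        out = tuple(w0 * a + w1 * b for a, b in zip(q0, q1))
    # Renormalize to stay on the unit sphere despite rounding errors.
    norm = math.sqrt(sum(c * c for c in out))
    return tuple(c / norm for c in out)


identity = (1.0, 0.0, 0.0, 0.0)
# Rotation of 90 degrees about the z axis.
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
# The in-between keyframe is a rotation of 45 degrees about the same axis.
halfway = slerp(identity, quarter_turn_z, 0.5)
```

Interpolating along the arc of the unit sphere keeps the angular velocity constant between two keyframes, which is one reason quaternions are convenient for this task.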
2.5.1 Motion Capture Database
The motion capture data used in this work was obtained from MOTEK (http://www.e-motek.com), a motion capture company that provides a set of motion sequences to the research community. The company uses a VICON 8 optical motion capture system with 8 to 24 cameras for data acquisition. Cameras placed around the capture space track the positions of markers attached to the body of the performers, and triangulation is used to compute the 3D position of a marker at any given sample from the array of 2D information from every camera. For data processing the Diva software is used, which applies template matching algorithms in order to solve occlusions and marker disappearance problems.
Cameras surrounding the subject send out a beam of infrared light, which is reflected back from the markers attached to the subject's body. After the marker positions are recorded in 2D from each camera position, post-processing finds the 3D location of each marker at each point in time, and then solves for the joint configuration. At the end, the data is delivered as a set of translations and rotations of the joints of an articulated body which corresponds to the description of a human character. Figure 2.1 shows the motion capture session and equipment used to create the motions in the database.
To evaluate the methods presented in this work, all the sequences of the database that represent variations of one particular action were considered. In our project we chose human walking, since it is the most common action that needs to be animated in a realistic way. Starting with an initial database of 129 different motions, each of which is 40 frames long, we used 3D Studio MAX™ version 5.1 to increase the length of all animations. Since walking can be considered a cyclic motion, we created the final animation by repeating the original four times.
At the end, each animation in the database has the length of 160 frames. Using the exporting software provided by the CAL3D library, these animations were exported to be used in our prototype as CAL3D animation files. Figure 2.2 shows the motion data plotted over time for different degrees of freedom (i.e. joint angles) for a particular animation sequence in the database.
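The cyclic extension described above can be sketched in a few lines of Python. In this work the extension was done in 3D Studio MAX; the helper `repeat_cycle` and the synthetic sample values below are hypothetical and only illustrate how a 40-frame cycle becomes a 160-frame sequence.

```python
def repeat_cycle(frames, repetitions):
    """Concatenate `repetitions` copies of one motion cycle."""
    return [frame for _ in range(repetitions) for frame in frames]


# Synthetic stand-in for one DOF of a 40-frame walking cycle (values in rad).
cycle = [0.01 * i for i in range(40)]
extended = repeat_cycle(cycle, 4)  # 4 x 40 = 160 frames
```

Plain repetition only gives a seamless result if the last frame of the cycle leads naturally into the first one, which is the reason the cyclic nature of walking matters here.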
Figure 2.1: Example of the motion capture session and equipment used to capture the motions in the database. (a) Camera setup surrounding the space where a subject performs a motion. (b) Markers attached to the body of a subject performing the desired motion. Images used from http://www.e-motek.com.
2.5.2 Skeleton Model
In the skeleton-based approach, a skeleton model is a kinematic chain consisting of bones and connecting joints. Figure 2.3 shows the graph structure that specifies an articulated figure defining a character. Each bone in the skeleton, drawn in white, is a link in the graph and represents a limb. Joints, drawn in blue, connect the bones and represent rigid body transformations between them. Translations are represented by vectors and have three degrees of freedom. Rotations are represented by unit quaternions and also have three degrees of freedom.
As shown in figure 2.4, a skeleton can be treated as a hierarchical model, where each joint has a parent and children. Joint rotations or translations of a parent affect all of its child joints. The paths from the skeleton root to the terminating joints in the skeleton, shown in the figure, can be treated as kinematic sub-chains, exhibiting different behaviors depending on the motion being performed. For instance, in a walking motion, the lower kinematic sub-chain, composed of pelvis, hip, knee, foot and toe, is more active, being responsible for the translational movement of the body. On the other hand, the upper kinematic sub-chain, composed of pelvis, spine, elbow, upper-arm, forearm, hand and finger, helps to control the body equilibrium.
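The hierarchical view can be made concrete with a small Python sketch. It encodes only the two sub-chains named in the text, for one side of the body; the dictionary `PARENT` and the helper names are hypothetical and do not reflect the CAL3D data structures.

```python
# Child -> parent map for the lower and upper sub-chains (one body side only).
PARENT = {
    "pelvis": None,
    "hip": "pelvis", "knee": "hip", "foot": "knee", "toe": "foot",
    "spine": "pelvis", "elbow": "spine", "upperarm": "elbow",
    "forearm": "upperarm", "hand": "forearm", "finger": "hand",
}


def chain_to_root(joint):
    """Return the kinematic sub-chain from the root down to `joint`."""
    chain = []
    while joint is not None:
        chain.append(joint)
        joint = PARENT[joint]
    return chain[::-1]


def descendants(joint):
    """Return all joints that move when `joint` is rotated or translated."""
    result = []
    for child, parent in PARENT.items():
        if parent == joint:
            result.append(child)
            result.extend(descendants(child))
    return result


lower_chain = chain_to_root("toe")
upper_chain = chain_to_root("finger")
moved_by_knee = descendants("knee")
```

Walking the parent links upwards yields a sub-chain, while walking them downwards yields the set of joints affected by a transformation of a parent.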
Figure 2.2: Example of the motion data in the database. The figure shows, respectively, the z-angle values for the pelvis, hip, clavicle, forearm and knee joints. Axes: angle values for some DOFs (rad) against keyframes (time).
The skeleton used during this research has 21 bones and 22 joints connecting them. Since the skeleton structure is fixed, only rotations are considered in our project. Each joint is therefore represented by a unit quaternion, described by three angular degrees of freedom. In total, the skeleton model representing the human character has 66 degrees of freedom (22 joints with three DOFs each).
This skeleton is created in 3D Studio MAX using the biped tool from Character Studio™ version 4.0, in order to match the skeleton format of the motion capture files in the database.
Figure 2.3: Joints and bones forming the skeleton model used in the project. Joint labels: PELVIS, SPINE, NECK, HEAD, ELBOW, UPPERARM, FOREARM, HAND, FINGER, HIP, KNEE, FOOT, TOE.
Figure 2.4: The skeleton joint hierarchy. On the right side it shows the lower kinematic sub-chain and on the left the upper kinematic sub-chain. Joints shown: PELVIS (root); SPINE, NECK, HEAD; left (L) and right (R) ELBOW, UPPERARM, FOREARM, HAND, FINGER; left (L) and right (R) HIP, KNEE, FOOT, TOE.
Chapter 3
Multiresolution Filtering Method
3.1 Introduction
This chapter demonstrates that techniques from image and signal processing can be a useful tool to generate and edit character animation. Using a skeleton-based approach to represent a character, the body configuration can be defined by joint parameters (e.g. rotations and translations). Over time, these parameters can be treated as signals and, therefore, analyzed with techniques from signal processing. In the motion capture data (see Sec. 2.5.1), each joint is represented by three degrees of freedom, where each DOF can be treated as a discrete sampled signal whose samples are its values at each time instant, also known as a frame.
This chapter describes how a technique called multiresolution filtering can be applied to motion capture data. The idea is to pass the motion signal through a cascade of low-pass filters (a filter bank) to produce a set of band-pass and low-pass signal components. After filtering, the motion signal is represented as a collection of frequency bands, where the low frequency bands describe the global pattern of the motion signal, the middle frequency bands provide the motion details, and the high frequency bands usually contain noise like jitter and wiggling.
Therefore, once the motion data is decomposed into frequency bands, existing motion data can be edited interactively by amplifying or attenuating a particular frequency band, and new motions can be generated by band-wise blending of existing motions. In addition, this method can be used to concatenate and to compare motion sequences. It will be shown in the subsequent chapters that this functionality can be used as a building block for character animation.
3.2 Related Work
A number of different approaches have been presented in the literature using signal processing analysis for editing motion data. Here they are conceptually categorized into signal processing methods and, more specifically, multiresolution methods.
3.2.1 Signal Processing Methods
Unuma et al. [UAT95] used Fourier analysis techniques to interpolate and extrapolate motion data in the frequency domain. In this work, they could manipulate motion data and alter the style through interpolation, extrapolation and transitional tasks. Liu et al. [LGC94] reported that adaptive refinement with hierarchical wavelets provides a significant speed-up for spacetime optimization. Amaya et al. [ABC96] used Fourier analysis to generate emotional animation from neutral human motion.
3.2.2 Multiresolution Methods
The notion of multiresolution analysis was introduced by Burt and Adelson [BA83], who presented a multiresolution image representation, the Gauss-Laplacian pyramid, to facilitate such operations as seamless merging of image mosaics and temporal dissolving between images. Lee and Shin [LS00] developed a multiresolution analysis method that guarantees coordinate-invariance without singularity, using hierarchical displacement mapping and motion filtering. In another work, Lee and Shin [LS01] presented multiresolution motion analysis as a unified framework to facilitate a variety of motion editing tasks. Bruderlin and Williams [BW95] adopted a digital filter-bank technique to address multiresolution analysis of discrete motion data. Their hierarchical representation of a motion with frequency bands allows level-by-level editing of motion characteristics.
3.3 Multiresolution Filtering Method
Originally the method was developed for image mosaics [BA83]. Applying the multiresolution method to a two-dimensional image, the first step is to obtain the low-pass or Gaussian pyramid $G_0, G_1, \ldots, G_N$. The Gaussian pyramid can be obtained by repeatedly convolving a small weighting function ($w(l, m)$) with the image, while it is sub-sampled by a factor of 2 at each iteration (figure 3.1). This process is repeated until the image size is reduced to one, which is the average intensity, or constant value (DC).
Figure 3.1: Generation of the Gaussian pyramid. The value of each node in the next row, $G_{l+1}$, is computed as a weighted average of a sub-array of $G_l$ nodes. In this example a sub-array of length five is used. Adapted from [BA83].
The pattern of weights w(l, m) used to generate each pyramid level from its predecessor is called the generating kernel. These weights are chosen subject to four constraints:
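• For computational convenience, the generating kernel is separable: $w(l, m) = \hat{w}(l)\,\hat{w}(m)$;
• The function w(l, m) should be symmetric: $\hat{w}(0) = a$, $\hat{w}(-1) = \hat{w}(1) = b$, and $\hat{w}(-2) = \hat{w}(2) = c$;
• The function w(l, m) should be normalized: $a + 2b + 2c = 1$;
• Each node of level $l$ must contribute the same total weight to the nodes of level $l + 1$: $a + 2c = 2b$.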
In order to obtain the band-pass bands (also called Laplacian pyramid), each level of the pyramid is subtracted from the next lowest level. In this case, it is necessary to create new samples by interpolation, because these pixel grids differ in sample density.
Finally, the low-pass and band-pass bands can be amplified or attenuated separately and the image can be reconstructed by adding up all the band-pass bands plus the constant value (DC).
3.4 Multiresolution filtering on motion data
Following the algorithm described in Bruderlin and Williams [BW95], the principles of image multiresolution filtering will be applied to a particular degree of freedom of human motion, represented as a one-dimensional signal over time. Instead of constructing a pyramid of low-pass and band-pass sequences by reducing their lengths, the lengths are kept the same, but the filter kernel ($w(m)$) is expanded at each level by inserting zeros between its values. For example, using an initial kernel of length five (weights a, b and c), the following kernel sequence will be generated: $w_1 = [c\ b\ a\ b\ c]$, $w_2 = [c\ 0\ b\ 0\ a\ 0\ b\ 0\ c]$, $w_3 = [c\ 0\ 0\ 0\ b\ 0\ 0\ 0\ a\ 0\ 0\ 0\ b\ 0\ 0\ 0\ c]$, etc.
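The kernel expansion can be sketched as follows in Python. The function name `expand_kernel` is hypothetical, and the weight a = 0.375 is only an illustrative choice (with b = 0.25 and c = 0.0625 it gives a kernel that sums to one); the values actually used in the prototype are not implied here.

```python
def expand_kernel(base, level):
    """Return the kernel of the given level (1-based).

    Level 1 is the base kernel; level k has 2**(k-1) - 1 zeros between
    neighboring weights, matching the kernel sequence in the text.
    """
    gap = 2 ** (level - 1) - 1
    expanded = []
    for i, weight in enumerate(base):
        expanded.append(weight)
        if i < len(base) - 1:
            expanded.extend([0.0] * gap)
    return expanded


# Illustrative weights (assumption): a = 0.375, b = 0.25, c = 0.0625.
a, b, c = 0.375, 0.25, 0.0625
base = [c, b, a, b, c]
w1 = expand_kernel(base, 1)
w2 = expand_kernel(base, 2)
w3 = expand_kernel(base, 3)
```

Because only zeros are inserted, every expanded kernel keeps the same five non-zero weights and therefore the same sum.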
The weights (a, b and c) can be found using the constraints described in section 3.3. Considering the kernel of length five, the following values are found: $b = \frac{1}{4}$ and $c = \frac{1}{4} - \frac{a}{2}$.
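As a short worked derivation, and assuming that the normalization and equal-contribution constraints for a five-tap kernel take the form given by Burt and Adelson [BA83], the weights b and c follow from the two equations below, while a remains a free parameter:

```latex
\begin{align}
a + 2b + 2c &= 1   && \text{(normalization)} \\
a + 2c      &= 2b  && \text{(equal contribution)} \\
\Rightarrow\quad 2b + 2b &= 1
  \quad\Rightarrow\quad b = \tfrac{1}{4}, \\
c &= \frac{2b - a}{2} = \frac{1}{4} - \frac{a}{2}.
\end{align}
```

For example, the choice a = 3/8 would give c = 1/16; this particular value is used here only as an illustration.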
The length of the signal determines the number of frequency bands, which should be generated. In general for a signal of length m (
), the number of computed bands (nb) is N.
Then, the following steps are performed in order to create the Gaussian (G) and the Laplacian (L) pyramids:
• The Gaussian sequences $G_1, \ldots, G_N$ are calculated by successively convolving the signal with the expanded kernels, where $G_0$ is the original signal:
$$G_{k+1} = w_{k+1} * G_k \qquad (3.1)$$
This can be calculated efficiently by keeping the kernel constant and skipping signal data points:
$$G_{k+1}(i) = \sum_{m=-2}^{2} w_1(m)\, G_k(i + 2^k m) \qquad (3.2)$$
To deal with the case when the value $i + 2^k m$ lies outside the signal's
• The Laplacian sequences $L_0, \ldots, L_{N-1}$ are calculated by subtracting two successive low-pass bands:
$$L_k = G_k - G_{k+1} \qquad (3.3)$$
• Finally the signal is reconstructed by summing up all band-pass bands and the constant value (DC):
$$G_0 = G_N + \sum_{k=0}^{N-1} L_k \qquad (3.4)$$
During the reconstruction of the signal, each frequency band can be summed up with a different gain or weight. Modifying these weights makes it possible to increase or decrease the influence of each frequency band in the final reconstructed signal, generating slightly different signals while maintaining the basic information.
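The decomposition and the weighted reconstruction can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the prototype's implementation: each low-pass band is computed with a fixed five-tap kernel whose samples are taken 2^k frames apart (the form of Eq. 3.2), indices outside the signal are clamped to the first or last sample (an assumed boundary rule), a = 0.375 is an illustrative weight, and the number of bands is passed in explicitly. All function names are hypothetical.

```python
import math


def gaussian_step(signal, kernel, k):
    """Low-pass one band: fixed 5-tap kernel, samples 2**k frames apart."""
    n = len(signal)
    step = 2 ** k
    out = []
    for i in range(n):
        acc = 0.0
        for m in range(-2, 3):
            # Assumption: clamp indices that fall outside the signal.
            j = min(max(i + step * m, 0), n - 1)
            acc += kernel[m + 2] * signal[j]
        out.append(acc)
    return out


def decompose(signal, num_bands, a=0.375):
    """Return (gauss, laplace): num_bands + 1 low-pass and num_bands band-pass bands."""
    b, c = 0.25, 0.25 - a / 2.0
    kernel = [c, b, a, b, c]
    gauss = [list(signal)]
    for k in range(num_bands):
        gauss.append(gaussian_step(gauss[k], kernel, k))
    laplace = [[lo - hi for lo, hi in zip(gauss[k], gauss[k + 1])]
               for k in range(num_bands)]
    return gauss, laplace


def reconstruct(gauss, laplace, gains=None):
    """Sum the coarsest low-pass band and the gain-weighted band-pass bands."""
    gains = gains or [1.0] * len(laplace)
    out = list(gauss[-1])
    for gain, band in zip(gains, laplace):
        out = [o + gain * v for o, v in zip(out, band)]
    return out


# Synthetic stand-in for one DOF of a 160-frame walking sequence (rad).
signal = [0.1 + 0.05 * math.sin(2 * math.pi * i / 40)
          + 0.01 * math.sin(2 * math.pi * i / 5) for i in range(160)]
gauss, laplace = decompose(signal, 6)
restored = reconstruct(gauss, laplace)                      # unit gains
smoothed = reconstruct(gauss, laplace, [0, 0, 1, 1, 1, 1])  # drop two highest bands
```

With unit gains the band-pass bands telescope and the original signal is recovered up to rounding error; any other set of gains produces an edited variant of the motion.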
3.5 Application to Character Animation
Looking at the motion data for a particular DOF as a one-dimensional signal over time, the algorithm described in the previous section is applied to the motion capture data motivated by the following intuition: low frequencies contain general and gross motion patterns, middle frequencies contain the signature of the motion and the high frequencies describe some details and possible noise. Then, treating each degree of freedom as a one-dimensional signal, it is possible to calculate its respective low-pass (G) and band-pass (L) pyramids.
Using the Gaussian and Laplacian pyramids, the signal representing the degree of freedom can be reconstructed by summing up all band-pass bands plus the constant value (DC), as mentioned in section 3.3. By changing the reconstruction weights, it is possible to increase or decrease the effect of each frequency band on the final motion signal reconstruction. Different sets of weights will result in different motions, since each frequency band generally contains particular information or details of the overall motion. In this way, it is possible to perform some animation editing tasks. In addition, using the extracted frequency bands it is possible to create new motions by blending and concatenating frequency bands from two different existing motions.
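Band-wise blending of two motions can be illustrated with the following Python sketch, which operates on band-pass bands that have already been extracted. The function names and the toy band values are hypothetical; a per-band weight of 0 keeps the band of the first motion and a weight of 1 takes the band of the second.

```python
def blend_bands(bands_a, bands_b, weights):
    """Blend two lists of band-pass bands with one weight per band."""
    if not (len(bands_a) == len(bands_b) == len(weights)):
        raise ValueError("both motions need one band and one weight per level")
    blended = []
    for band_a, band_b, w in zip(bands_a, bands_b, weights):
        blended.append([(1.0 - w) * x + w * y for x, y in zip(band_a, band_b)])
    return blended


def sum_bands(bands, dc):
    """Rebuild a signal from band-pass bands plus the constant (DC) band."""
    out = list(dc)
    for band in bands:
        out = [o + v for o, v in zip(out, band)]
    return out


# Toy data: two bands of two frames each for motions A and B.
bands_a = [[1.0, 1.0], [2.0, 2.0]]
bands_b = [[3.0, 3.0], [4.0, 4.0]]
# Keep the first band of A, copy the second band from B.
blended = blend_bands(bands_a, bands_b, [0.0, 1.0])
new_signal = sum_bands(blended, dc=[10.0, 10.0])
```

Setting a weight to exactly 0 or 1 copies the information of one band from one signal to the other, whereas intermediate weights mix the two motions in that band only.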
Examples of the Gaussian and Laplacian pyramids generated by the algorithm are shown in figures 3.2 and 3.3. Considering the pelvis z-angle as a sampled signal, figure 3.2 shows the generated Gaussian pyramid. The generated Laplacian pyramid is shown in figure 3.3.
Figure 3.2: Visualization of different frequency bands of a Gaussian pyramid (shown only for the first 40 frames). The band g0 corresponds to the original signal. The low-pass bands corresponding to the high frequency are g1 and g2, to the middle are g3 and g4, and to the low frequency are g5 and g6. Axes: values for the z-angle of the pelvis (rad) against keyframes (time); curves: original, g0 to g6.
3.6 Experiments
To demonstrate the frequency-based decomposition and motion creation power of the multiresolution method, the following experiments are performed. Using a human walking sequence with 160 frames as an example, six frequency bands are generated. Then, the multiresolution approach is simultaneously applied to all degrees of freedom of the skeleton model (see section 2.5.2). The same frequency band gains are used for all degrees of freedom and the resulting motion is generated at interactive rates.
Figure 3.4 illustrates that it is possible to create a smoothed acceleration of the movement by increasing the middle frequencies (bands 3 and 4). In this figure it is also possible to see the comparison between the original signal and the reconstructed signal. Figure 3.5 shows an attenuated, constrained human walking with reduced joint movement when the middle frequencies are decreased. Similar results can also be achieved by changing the low frequencies (bands 5 and 6). Increasing the low frequencies attenuates the motion, and decreasing them accelerates it. Since the high frequency bands generally represent noise, changing them resulted in the addition of a nervous twitch to the animation.
Figure 3.3: Visualization of different frequency bands of a Laplacian pyramid (shown only for the first 40 frames). The band-pass bands corresponding to the high frequency are l0 and l1, to the middle are l2 and l3, and to the low frequency are l4 and l5. Axes: values for the z-angle of the pelvis (rad) against keyframes (time); curves: original, l0 to l5.
3.7 Discussion
The multiresolution filtering technique provides a rapid interactive loop and facilitates the reuse and adaptation of existing motions of an articulated character by amplifying or attenuating important frequencies of the motion data. It thereby offers a form of high-level motion control. Ultimately, it can serve as a building block for high-level character motion processing.
Multiresolution filtering is not usable as a stand-alone technique for character animation. A drawback of this method is that some constraints, such as joint limits or non-intersection with the floor, can be violated in the filtering process. However, this can be solved by employing constraints or optimization after the definition of the general motion. Nevertheless, the frequency decomposition can be used to extract certain features of a motion which can be used during character animation.
Combining multiresolution filtering with the techniques described in chapters 4, 5 and 6 yields a powerful tool, which can be used for the following tasks:
• Concatenating motions: If two motion sequences should be concatenated, the multiresolution method gives control over the transition zone and the transition weight coefficient allowing concatenation band-by-band.
• Comparing motions: If two motion sequences should be compared, the generated low-pass and band-pass bands can be used in order to perform a comparison of only a particular frequency band, containing specific information.
• Blending motions: The multiresolution method allows blending of an animation band-by-band, where only important bands are added in order to generate a more realistic motion. In this case, it is possible to copy particular information contained in a frequency band from one signal to another.
Figure 3.4: (a) original motion; (b) reconstructed motion with increased gain; (c) signal comparison between the original and the reconstructed signal (only the first 40 frames). Axes in (c): values for the z-angle of the hips (rad) against keyframes (time).
Figure 3.5: (a) original motion; (b) reconstructed motion with decreased gain; (c) signal comparison between the original and the reconstructed signal (only the first 40 frames). Axes in (c): values for the z-angle of the hips (rad) against keyframes (time).
Chapter 4
Fragment Based Methods
4.1 Introduction
In chapter 3 the editing and analysis power of the multiresolution filtering technique was described. In addition, it was mentioned how the multiresolution method can be used as a building block for character animation, performing tasks like comparing, blending and concatenating motions. In this chapter, fragment based methods are described, which decompose motion into frequency bands and divide it into small pieces. A character animation is generated satisfying some guidelines given by an animator and using the details contained in fragments extracted from a motion capture database.
The main goal is to create a character animation semi-automatically, following the same procedure that an animator would do manually. Animators are trained to use keyframing, and will often build a character animation by first making a rough animation with few keyframes to sketch out the motion. Complexity and details are added on top of that later.
Using the same idea, a fragment based method combines the strengths of keyframing and motion capture. Starting from an initial character animation where only a few degrees of freedom were keyframed, our method will use the information from the motion capture database to add details to the keyframed DOFs and to completely synthesize the remaining DOFs. At the end, an improved character animation will be generated, based on the animator's keyframed sketch and performing what he wanted.
By combining the strengths of keyframing and motion capture, the creation time of an animation and hence the overall animation costs are reduced. As stated in chapter 1, an animator does not need to spend hours defining key poses, and expensive motion capture sessions can be kept at a minimum. In addition, our method can be used by someone who does not know the details about the motion he wants to create. Using a set of captured motions, for example downloaded from a database resource on the Internet, he can sketch the desired motion by keyframing some DOFs and then use this method to develop an enhanced version of it.
4.2 Related Work
Considering animation in general, some works have addressed the problem of synthesizing motions or altering pre-existing motions to have a particular style. Chenney and Forsyth [CF00] used a Markov chain Monte Carlo algorithm to sample multiple animations that satisfy constraints for the case of multi-body collisions of inanimate objects. Schödl et al. [SSSE00] used Monte Carlo techniques and developed the concept of a video texture, where an infinite amount of similar looking video sequences are generated from a short starting video clip.
For character animation in particular, probabilistic methods were also used to represent the data. Bregler [Bre97] has used hidden Markov models to recognize full body motions in video sequences. Brand and Hertzmann [BH00] have also used hidden Markov models with an entropy minimization procedure to learn and synthesize motions with particular styles. A different approach was taken by Chi et al. [DCB00], who used the principles of Laban Movement Analysis to create a method that allows the style enhancement of a pre-existing motion in an intuitive manner.
Recently, some projects have tried to allow the creation of new character animations based on motion capture data. Li et al. [LWS02] divided the data into motion textons, each of which could be modeled by a linear dynamic system. In this case the motions were synthesized considering the texton likelihood. Arikan and Forsyth [AF02] developed a method for automatic motion generation at interactive rates using a random search algorithm to find appropriate pieces of motion data. Kovar et al. [KGP02] defined the concept of a motion graph to enable the control of a character's locomotion. This method allows a user to have high level control over the character motions. Lee et al. [LCR+02] presented a technique for controlling a character in real time using several possible interfaces. In this case the animations are created by searching through a motion database using a clustering algorithm.
The most closely related work is that of Pullen and Bregler [PB02] [Pul02]. In these works they used signal processing techniques and a simple matching algorithm applied to data fragments in order to fetch missing degrees of freedom in a character animation using a motion capture database.
Our approach described here tries to improve critical aspects of their work. We will demonstrate that the performance of the algorithm can be improved by using the skeleton hierarchy information to guide the search for good correlations between the joints.
4.3 Overview
In order to use a fragment method, an animator should first create a sketch of the character motion by keyframing some degrees of freedom. Then, he should provide the following information:
• Which degrees of freedom should be used to drive the method. An animator can keyframe as many joints as he wants, but the method uses only some DOFs, which should contain the motion essence. For instance, considering a human walking, knee and hip joint motions are more important than toe and finger motions. These DOFs, considered more important for a specific action, should be set and are called driven joints. In addition, a particular driven joint should be chosen to drive the fragmentation step (see Sec. 4.5.1), being called the master joint.
• Which DOFs should be partially synthesized. Keyframed DOFs do not need to be completely synthesized since they already contain the motion essence. Small details in the motion, on the other hand, are not present. In this case, the details are extracted from the database and added to these DOFs in a process called texturing.
• Which DOFs should be completely synthesized. DOFs that are not keyframed are nevertheless important to correctly describe the desired character motion. The motion for these missing DOFs is completely generated in a process called synthesis.
• Which frequency bands are used for texturing. As mentioned in chapter 3, motion data can be treated as a signal over time, and therefore, it is possible to use the multiresolution method to decompose it into frequency bands. Then, in the texturing process, only some bands are modified in order to generate the enhanced motion.
• Which frequency band is used to fragment the data. In order to compare driven joints and existing motions in the database, they need to be divided into small pieces, called fragments. Then, a particular frequency band from the master joint is used in the fragmentation step (see Sec. 4.5.1) and fragments are created taking into account its respective motion characteristics.
• How many matches should be kept during the matching step. Using the fragments generated, a matching step is performed where they are compared (see Sec. 4.5.2). Then, a certain number of candidates are stored and used for the synthesis and texturing.
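The information listed above can be collected in a single parameter record, as in the following Python sketch. The class `FragmentMethodParams`, its field names and the example values are hypothetical. The check that the master joint is one of the driven joints follows the description above; the check that a DOF is not both textured and synthesized is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class FragmentMethodParams:
    driven_joints: list      # joints that drive the method
    master_joint: str        # driven joint that guides the fragmentation
    textured_dofs: list      # keyframed DOFs that only receive details
    synthesized_dofs: list   # DOFs that are generated completely
    texturing_bands: list    # indices of frequency bands modified by texturing
    fragmentation_band: int  # band of the master joint used for fragmentation
    matches_to_keep: int     # number of candidates kept per sequence position

    def validate(self):
        """Raise ValueError on inconsistent settings, otherwise return True."""
        if self.master_joint not in self.driven_joints:
            raise ValueError("the master joint must be one of the driven joints")
        if set(self.textured_dofs) & set(self.synthesized_dofs):
            raise ValueError("a DOF is either textured or synthesized, not both")
        if self.matches_to_keep < 1:
            raise ValueError("at least one match must be kept")
        return True


# Example settings for a walking sketch (all values are illustrative).
params = FragmentMethodParams(
    driven_joints=["hip", "knee"],
    master_joint="hip",
    textured_dofs=["hip.z", "knee.z"],
    synthesized_dofs=["spine.z", "forearm.z"],
    texturing_bands=[2, 3],
    fragmentation_band=4,
    matches_to_keep=5,
)
is_valid = params.validate()
```

Grouping the settings in one record makes it easy to check them for consistency before the four steps of the method are run.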
The main idea in a fragment based method is to consider the correlation between joints during the character motion, and therefore, to generate an estimate of a joint motion based on the information of other joints. For instance, during a walking sequence the motions for arm, leg and hip joints tend to be correlated. Then, given the arm and leg joint motions, it is possible to estimate the hip motion. The method described here uses this fact. Character motion following the keyframe outline given by the animator is synthesized by concatenating small pieces of motion capture data from the database.
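One common way to quantify how strongly two joint signals move together is the Pearson correlation coefficient, sketched below in Python. This is only an illustration of the notion of correlated joints; it is an assumption of this example and not necessarily the measure used by the method. The synthetic signals stand in for joint angles over one 40-frame cycle.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic joint angles: the arm swings in opposition to the leg.
leg = [0.3 * math.sin(2 * math.pi * t / 40) for t in range(40)]
arm = [-0.15 * math.sin(2 * math.pi * t / 40) for t in range(40)]
leg_arm_correlation = pearson(leg, arm)  # close to -1: perfectly anti-correlated
```

A coefficient near +1 or -1 indicates that one joint motion is a good predictor of the other, which is the property the fragment based method relies on.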
The general algorithm is shown in figures 4.1 and 4.2. First, as shown in figure 4.1(a), keyframed and motion capture data are decomposed into frequency bands using the multiresolution method described in chapter 3. Then parameters for the method are set, as shown in figure 4.1(b), and the following four steps are performed:
• Fragmentation: One of the frequency bands of the master joint is chosen and the motion sequence is divided into fragments at specific points according to the motion behavior in that specific band (see Sec. 4.4.1). All existing examples in the database are fragmented at the same points in time. Thereby, driven fragments and data fragments are created, depending on whether the corresponding joint is driven or not, as shown in figure 4.2(a).
• Matching: After generating data and driven fragments, each data fragment, at a particular sequence position, is compared with all driven fragments, in all sequence positions, as shown in figure 4.2(b). The most similar data fragments for each sequence position are then stored, being called good fragments.
• Joining: After finding all good fragments for each sequence position, they are used to enhance the original keyframed animation. Using only the best fragments at each position, which are found based on heuristic evaluation criteria (see Sec. 4.5.3), they are concatenated and blended in order to compose the enhanced character animation, as shown in figure 4.2(c) and 4.2(d).
• Smoothing: The enhanced character animation can exhibit some jumping artifacts due to discontinuities where the best fragments were joined. In this case these discontinuities are reduced through a technique detailed in section 4.5.4.
Figure 4.1: The input for the general fragment based method: (a) keyframed and motion capture data are decomposed into frequency bands; (b) the animator sets the general method parameters. Driven and master joints are chosen to guide the method, and a particular frequency band of the master joint is chosen to guide the fragmentation step. Joints in black will be textured and joints in blue and red will be synthesized.
Figure 4.2: A fragment based method is composed of four steps: fragmentation (a), matching (b), joining (c) and smoothing. In (a), driven and data fragments are generated; in (b), data fragments are compared with driven fragments; in (c), the best fragments are found for each sequence position. At the end, an original keyframed character animation is enhanced by synthesis and texturing (d).
The result of the whole process is a character animation in which all DOFs are generated by texturing or synthesis. The final character animation presents the desired realistic appearance (examples can be found in Sec. 4.6).
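To make the fragmentation and matching steps more tangible, the following Python sketch shows one possible realization. It is a sketch under loudly stated assumptions, since the actual criteria are detailed in Sec. 4.5: fragments are cut where the first difference of the chosen band changes sign, fragments of different lengths are compared after linear resampling, and similarity is measured by the sum of squared differences. All function names are hypothetical.

```python
def fragment_bounds(band):
    """Cut a band where its first difference changes sign (assumed rule)."""
    cuts = [0]
    for i in range(1, len(band) - 1):
        before = band[i] - band[i - 1]
        after = band[i + 1] - band[i]
        if before * after < 0:
            cuts.append(i)
    cuts.append(len(band))
    return list(zip(cuts[:-1], cuts[1:]))


def resample(fragment, length):
    """Linearly resample a fragment to `length` samples."""
    if length == 1 or len(fragment) == 1:
        return [fragment[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(fragment) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(fragment) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * fragment[lo] + frac * fragment[hi])
    return out


def best_matches(query, candidates, keep):
    """Return the indices of the `keep` candidates most similar to `query`."""
    scored = []
    for index, candidate in enumerate(candidates):
        resampled = resample(candidate, len(query))
        # Assumed similarity measure: sum of squared differences.
        distance = sum((q - r) ** 2 for q, r in zip(query, resampled))
        scored.append((distance, index))
    scored.sort()
    return [index for _, index in scored[:keep]]


bounds = fragment_bounds([0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.0])
query = [0.0, 1.0, 2.0]
candidates = [[5.0, 5.0, 5.0], [0.0, 0.5, 1.0, 1.5, 2.0], [0.0, 1.0, 2.5]]
good = best_matches(query, candidates, keep=2)
```

Keeping several candidates per position, rather than only the closest one, leaves the joining step free to choose fragments that also fit well with their neighbors.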
The rest of this chapter is structured as follows: Sec. 4.4 explains the fundamental basis used in the algorithm and Sec. 4.5 describes the separate steps of the algorithm in detail. Then, in Sec. 4.6 results obtained with the general fragment method are described and its performance is demonstrated. After that, the general method is modified in order to better generate expressive motions and to capture more details from the database (Sec. 4.7). New experiments are made with this approach and it is shown that the performance improves (Sec. 4.7.3). The chapter concludes in Sec. 4.8 with a brief review and discussion.
4.4 Motion Analysis
In order to develop a method where a character motion is created using examples from a database, guided by some keyframed degrees of freedom, the characteristics of the motion must first be understood. In addition, it is important to find out which features of the original keyframed animation can be used to describe the details contained in its motion.
Three different features are chosen to represent the motion characteristics: motion phases, frequency bands and correlation between joints. These features are described in the next sections.
4.4.1 Motion Phases
As described by Pullen [Pul02], a phase is a segment of time in which a particular set of hard constraints is satisfied in a motion. Usually, the initial points of these phases correspond to what traditional animators often use as keyframes in their animations. As seen in figure 4.3, a walking cycle has a phase where the right foot is flat on the floor, another phase where the right heel lifts while the right toe stays on the floor, then a phase where the left heel touches the floor, and finally a phase where the left heel lifts while the left toe is touching the floor.
(a) phases during a walking cycle
(b) different phases in the motion signal [plot: values for the z-angle of the hips (rad) over keyframes (time)]
Figure 4.3: Example of a walking animation: (a) set of four phases during a human walking cycle; (b) the right hip z-angle values are plotted, where it is possible to see the respective phases.
The phases can be treated as a natural subdivision of the motion being enhanced. They are therefore the ideal feature to guide the generation of data and driven fragments in the fragmentation step (see Sec. 4.5.1).
4.4.2 Frequency Analysis
All degrees of freedom of the human character are analyzed in the frequency domain, which simplifies the data format and allows the separation of different aspects of the motion. As mentioned in Section 3.6, variations in the low frequency bands are generally associated with the overall motion, variations in the middle frequency bands with the motion signature, and variations in the high frequency bands are perceived as jitter. These forms of fluctuation are present in real motion, and they should be preserved in order to capture all information from the original motion.
4.4.3 Correlation
In human motion there are many correlations between joints during some actions. These correlations are especially clear for a repetitive action like walking. An example is shown in figure 4.3(a): as the right knee goes forward, the left arm swings forward, or when the right elbow angle has a certain value, the upper-arm angle is most likely to fall within a certain range.
These correlations can be seen graphically in a plot such as the one shown in figure 4.4. The fact that the plot has a specific shape (shown in red) indicates that there is a strong relationship between the joints.
[Plot; axis labels: values for the z-angle of the knee (rad), values for the z-angle of the hips (rad)]
Figure 4.4: Plot of the pelvis joint angle against the hip joint angle for all examples in the database. The shape shown in red demonstrates a good correlation between these joints.
Taking advantage of these joint relationships, it is possible to generate DOFs that have not been animated by using information from the keyframed joints. In addition, details can be added to a DOF that has been animated, by concatenating and blending the best fragments found from the analysis of the driven joints on selected frequency bands.
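As an illustration of how such a joint relationship could be quantified, the following minimal sketch computes the Pearson correlation coefficient between two joint-angle sequences. It is not part of the method described in this chapter: the function name `pearson_correlation` and the toy hip and knee signals are hypothetical, and the coefficient only measures linear dependence, whereas the shape visible in figure 4.4 may reflect a non-linear relationship.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long angle sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (not taken from the database): two periodic
# joint angles in radians, where the knee is a scaled and shifted hip.
hip = [math.sin(2 * math.pi * t / 30) for t in range(60)]
knee = [0.5 * v + 0.1 for v in hip]

r_same = pearson_correlation(hip, knee)
r_opposite = pearson_correlation(hip, [-v for v in hip])
```

A value close to 1 or -1 suggests that one joint is a good candidate to drive the other, while a value near 0 suggests that a linear relationship is absent.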
4.5 Motion Synthesis and Texture
The general method starts with a sketch done by an animator. Suppose that an animator sketches out a human walk by animating only the legs and then wants to generate the motion for the whole body. A good choice for the degrees of freedom to drive the method would be the hip z-angle and the knee z-angle, because the motion of both joints contains essential information about the walking action.
Using the frequency analysis, keyframed and motion capture data are decomposed into frequency bands. For each DOF that has already been animated, only the middle frequency bands are created, leaving the overall motion intact. For each DOF that has not been animated, all frequency bands are synthesized.
The process to reach this goal consists of four main steps: fragmentation, matching, joining and smoothing.
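The decomposition into frequency bands mentioned above can be sketched as follows. This is a simplified stand-in, not the multiresolution filter of Section 3.6: it assumes a plain moving average with a window that doubles at each level as the low-pass filter, and the names `moving_average`, `decompose_into_bands` and `reconstruct` are hypothetical. Each band is the difference between two successive low-pass versions, so the sum of all bands restores the input signal.

```python
import math


def moving_average(signal, radius):
    """Low-pass filter: mean over up to 2*radius+1 samples, clamped at the ends."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def decompose_into_bands(signal, levels):
    """Split a signal into `levels` band-pass parts plus a low-pass residual.

    bands[0] holds the highest frequencies; the last entry is the residual
    (the overall motion). Summing all entries restores the input.
    """
    bands = []
    current = list(signal)
    for level in range(levels):
        smoothed = moving_average(current, radius=2 ** level)
        bands.append([c - s for c, s in zip(current, smoothed)])
        current = smoothed
    bands.append(current)
    return bands


def reconstruct(bands):
    """Sum the bands sample by sample."""
    return [sum(values) for values in zip(*bands)]


# Hypothetical toy DOF: a slow swing plus a small fast oscillation.
signal = [math.sin(2 * math.pi * t / 40) + 0.05 * math.sin(2 * math.pi * t / 4)
          for t in range(80)]
bands = decompose_into_bands(signal, levels=3)
restored = reconstruct(bands)
```

Under this sketch, texturing an animated DOF would amount to replacing only some middle entries of `bands` before reconstruction, whereas synthesizing a missing DOF would fill every entry.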
4.5.1 Fragmentation
In this step the driven and data fragments are generated. The first derivative of the master joint angle in the chosen frequency band is used in order to identify the motion phases. The time instants where the first derivative changes its sign are used to fragment the data [PB02], as shown for a particular DOF in figure 4.5.
Using these same time instants, keyframed and motion capture DOFs in all frequency bands are broken into fragments, as shown in figure 4.2(a). Driven fragments are generated from the subdivision of the driven joints and data fragments are generated from the subdivision of all examples in the database.
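The fragmentation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the first derivative is approximated by finite differences between consecutive frames, flat steps keep the previous direction, and the frame at which the direction reverses starts the next fragment. The names `find_breakpoints` and `split_at` are hypothetical.

```python
def find_breakpoints(master):
    """Frame indices where the first difference of `master` changes sign."""
    diffs = [b - a for a, b in zip(master, master[1:])]
    breakpoints = []
    prev_sign = 0
    for i, d in enumerate(diffs):
        sign = (d > 0) - (d < 0)
        if sign == 0:
            continue  # flat step: keep the previous direction
        if prev_sign != 0 and sign != prev_sign:
            breakpoints.append(i)  # direction reverses at frame i
        prev_sign = sign
    return breakpoints


def split_at(signal, breakpoints):
    """Cut `signal` into consecutive fragments at the given frame indices."""
    edges = [0] + list(breakpoints) + [len(signal)]
    return [signal[a:b] for a, b in zip(edges, edges[1:])]


# Hypothetical toy data: a master DOF and a second DOF of the same length.
master = [0, 1, 2, 1, 0, 1, 2]
other = [10, 11, 12, 13, 14, 15, 16]

cuts = find_breakpoints(master)
master_fragments = split_at(master, cuts)
other_fragments = split_at(other, cuts)
```

The same list `cuts` is applied to every DOF and every frequency band, so fragments with the same index cover the same time span across all signals.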
(a) original motion signal [plot: angle (rad) over keyframe (time)]
(b) fragmented motion signal [plot: angle (rad) over keyframe (time)]
Figure 4.5: Example of the fragmentation step: (a) original degree of freedom; (b) fragments created at locations where the first derivative changes its sign.