CHAPTER 3: BACKGROUND
3.2. Learning by Observation
3.2.1. Prior Work in the Field of Learning by Observation
Some researchers have stated that certain knowledge could be very difficult to extract with traditional methods [28], [70]. Knowledge that is hard to model and highly intuitive is classified as implicit knowledge. Implicit knowledge is easier to extract by using learning by observation. In fact, it might not even be possible to formalize this knowledge with traditional methods.
Henninger et al. [32] show that by using learning by observation, the models generated are more accurate and comply more fully with human performance. The use of learning by observation will also open the possibility of easily developing many simulated entities with similar, but not identical, behaviors. Each entity could be tuned slightly differently to personalize its behavior.
Schaal [68] states that when the search space is enormously large, one approach is to learn by observing and imitating behavior, which reduces the search space and makes the problem tractable. Restricting the learning algorithm to minimize the deviation between the observed entity and the learning agent dramatically reduces the search space. In some applications, learning by observation could reduce or even eliminate the need for time-consuming and complex programming. If the agent at hand can learn tasks automatically by observing others perform them, programming the new behavior by hand is no longer necessary.
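As a minimal sketch of the deviation-minimizing idea described above, the following fits a one-parameter linear policy to observed state-action pairs by least squares. The scalar policy form, the toy data, and the names fit_policy and deviation are illustrative assumptions and are not taken from the cited work.

```python
def fit_policy(states, actions):
    """Closed-form least-squares fit of a scalar policy a = w * s.

    Minimizes the summed squared deviation between the agent's
    actions (w * s) and the observed entity's actions.
    """
    numerator = sum(s * a for s, a in zip(states, actions))
    denominator = sum(s * s for s in states)
    return numerator / denominator


def deviation(w, states, actions):
    """Mean squared deviation between agent and observed actions."""
    return sum((w * s - a) ** 2 for s, a in zip(states, actions)) / len(states)


# Hypothetical observations: the observed entity always doubles the state.
observed_states = [1.0, 2.0, 3.0]
observed_actions = [2.0, 4.0, 6.0]

w = fit_policy(observed_states, observed_actions)
print(w, deviation(w, observed_states, observed_actions))
```

Because the learner only searches for the parameter that best reproduces the observed actions, it never has to explore the full space of possible behaviors, which is the reduction in search space that the text refers to.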
The time and cost of correcting, updating and customizing could be reduced if learning by observation were to be used instead of restructuring, adding or removing part of the model by hand. Milzner and Leifhelm [51] propose learning by observation to relax the knowledge update problem in rapidly changing knowledge domains.
In artificial neural networks, the term learning by observation is often used to refer to the fact that the training data is a set of observations. This is not what this dissertation refers to as learning by observation. Much of the data used in the machine learning community is based on real observations but does not include any behavioral knowledge or demonstration of how to perform a task. Even if many observations are used to learn (for example, to recognize handwritten characters), the observed entity might not teach us any behavioral skills. The intent of this research is not only to use the observations to learn, but also to learn the behavior of the observed entity. Thus, the interest herein is in learning by observation as a means of gathering knowledge by observing a human in action in order to model that human's behavior.
In the area of robotics, learning by observation has previously been used to implement human behavior in humanoid robot movements [4], [67], [73]. Often, the core issue of learning by observation within robotics is image processing, as humanoid robots often use cameras to implement their vision. The learning system often tries to mimic the specific movement of the human, as interpreted by the robotic vision system, rather than to adopt a general behavior. This relates to the sensing and perception module in the modified stage model (see Figure 1) and is not the objective of the research presented in this dissertation. Schaal [68] makes a distinction between learning by observation and imitation learning. In most humanoid robotics, the objective of the movement pattern is to imitate the human as closely as possible.
Bentivegna and Atkeson [8] used learning by observation to implement behavioral skills in a humanoid robot. By observing a human, the robot learned to play air hockey using a set of action primitives, each describing a certain behavior (e.g., left hit). Prior work in modeling human behavior through learning by observation has also been conducted in the areas of maneuvering a car [59] and flying an aircraft [44], [66]. However, this work has aimed either to create the best-performing agent or to model low-level motor skills. No results have been found where the focus is on personalized behavior patterns and/or tactical decision making.
Moukas and Hayes [52] used learning by observation to model social behavior in autonomous robots. The social behavior they modeled was that of honeybees, which communicate with each other using dances that tell where food has been found. Moukas and Hayes’ reinforcement scheme showed the potential of learning by observation. Learning a social behavior, including what is needed to teach it to others, must be classified as a very hard problem, yet it was solved through observation alone.
The learning algorithm must be able to collect data from the environment and monitor the actions of the expert. A feasible way of collecting data and probing the actions of the user is to use a simulator to implement learning by observation, as in the work of Gonzalez et al. [28]. By using a simulator instead of the real world, data collection will likely be much easier, and situations that would be difficult or dangerous to stage in the real world can be included. In the simulator environment, there is no need for complex sensors or image recognition systems to understand the environment. Gonzalez et al. [29] argue that learning through observation is especially well suited to acquiring tactical knowledge, that is, the knowledge used to select the best action for a given situation. Tactical knowledge is often implicit knowledge. Hence, it can be very hard to express and to extract from an expert by traditional knowledge acquisition methods. Several different learning strategies have been used to implement learning by observation.
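The simulator-based data collection described above can be sketched as follows. This is only an illustration under stated assumptions: ToySimulator, expert, and record_trace are hypothetical names, the one-dimensional world is invented for the example, and the expert function stands in for a human operator. It does not reproduce the system of Gonzalez et al.

```python
from dataclasses import dataclass


@dataclass
class ToySimulator:
    """Hypothetical 1-D simulator whose state (a position) is directly readable."""
    position: int = 0

    def observe(self):
        # No sensors or image recognition needed: the state is exposed as-is.
        return self.position

    def step(self, action):
        self.position += action


def expert(position, target=3):
    """Stand-in for the observed human: move one unit toward the target."""
    if position < target:
        return 1
    if position > target:
        return -1
    return 0


def record_trace(sim, policy, steps):
    """Log (state, action) pairs while the observed entity acts in the simulator."""
    trace = []
    for _ in range(steps):
        state = sim.observe()
        action = policy(state)
        trace.append((state, action))
        sim.step(action)
    return trace


trace = record_trace(ToySimulator(), expert, steps=5)
print(trace)
```

The recorded pairs are the raw material that a learning-by-observation algorithm would later generalize from; the simulator makes each state available directly, which is the advantage over real-world sensing noted in the text.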