Object Individuation, Identification, and Categorization

CHAPTER 2. BACKGROUND AND RELATED WORK

2.1 Related Work in Philosophy, Psychology, and Cognitive Science

2.1.2 Object Individuation, Identification, and Categorization

Wilcox et al. (2006) define the problem of object individuation as that of determining whether two perceptual stimuli (e.g., visual images, sounds, or tactile signals) belong to the same object or not. Such an ability is a prerequisite for representing the world in terms of distinct objects and the relations between them. The wider problem of object identification is defined by Kemp et al. (2009) as that of inferring how many distinct objects the environment contains, recognizing when the same object is encountered twice, and identifying whether a stimulus comes from a novel object. Studies in developmental psychology have shown that these skills are fundamental to establishing an internal object representation that can handle the large number of objects that humans encounter in their daily lives (Tremoulet et al., 2000; Krojgaard, 2004).

For this reason, a question of significant interest to developmental psychologists is how infants establish an object representation and subsequently use it to recognize the identities of objects. For example, a study by Tremoulet et al. (2000) showed that even at the age of 12 months, human infants are able to individuate objects using both shape and color information. The same study also found that while both object features were used for the task of determining how many objects there are, only the shape feature was used when recognizing the identity of an object that was previously individuated. Other studies have shown that when identifying objects, infants often make different judgments from adults based on the differences in the objects’ features (see Wilcox and Baillargeon (1998)), indicating that at such an early age the biological circuits that allow the problem to be solved are still developing.

The ability to individuate objects has also been studied in human adults. As described by Kemp et al. (2009), in a typical scenario the human participant observes (or interacts with) objects one at a time, where the next object may or may not be a previously encountered one. Subsequently, participants may be asked to enumerate the objects that they have observed, or match an object stimulus to one of the estimated object identities. For example, in a study with human adults, Kemp et al. (2009) showed that as the number of observed objects increases, the likelihood that a novel object will be classified as a previously observed object goes down.

The same study also found that humans rely on prior information when solving identification problems. More specifically, to determine whether two perceptual stimuli originate from the same object, humans need prior experience in the form of pairs of perceptual stimuli for which the relationship is known (Kemp et al., 2009). In other words, prior experience with objects with known object identities is necessary in order to solve the object individuation task on a novel set of objects.
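As an illustration of how known stimulus pairs can inform a same-object judgment on novel stimuli, the following minimal Python sketch learns a single distance threshold from labeled pairs. It is not the model of Kemp et al. (2009), nor the method used in this work; the feature vectors, the Euclidean distance, the midpoint rule, and the names `fit_threshold` and `same_object` are all assumptions made for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def fit_threshold(labeled_pairs):
    """Estimate a distance threshold from prior experience.

    labeled_pairs: list of (stimulus_a, stimulus_b, is_same_object) tuples,
    where each stimulus is a hypothetical feature vector (tuple of floats).
    """
    same = [dist(a, b) for a, b, is_same in labeled_pairs if is_same]
    diff = [dist(a, b) for a, b, is_same in labeled_pairs if not is_same]
    if not same or not diff:
        raise ValueError("need at least one 'same' and one 'different' pair")
    # Assumed rule: midpoint between the mean distance of each pair type.
    return (sum(same) / len(same) + sum(diff) / len(diff)) / 2.0


def same_object(a, b, threshold):
    """Judge two novel stimuli as one object if they are closer than the threshold."""
    return dist(a, b) < threshold


# Prior experience: pairs of stimuli whose relationship is known.
prior = [
    ((0.0, 0.0), (0.1, 0.0), True),
    ((1.0, 1.0), (1.0, 1.2), True),
    ((0.0, 0.0), (3.0, 0.0), False),
    ((1.0, 1.0), (5.0, 1.0), False),
]
threshold = fit_threshold(prior)

# Novel stimuli from objects not seen during the prior experience.
print(same_object((2.0, 2.0), (2.1, 2.1), threshold))
print(same_object((2.0, 2.0), (6.0, 6.0), threshold))
```

Without the labeled pairs there would be no principled way to set the threshold, which is the point the paragraph above makes about the role of prior experience.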

Therefore, it is not surprising that humans use a variety of cues, other than object features, when individuating objects (Kemp et al., 2009; Krojgaard, 2004). For instance, spatial cues can be used to individuate objects since observing two objects next to each other indicates that the two objects are not the same (Xu and Chun, 2009). Humans also use temporal cues, e.g., they assume that an object would remain the same object over the course of contiguous manipulation or observation (Becchio and Bertone, 2003). Most importantly, such spatial and temporal cues can inform the observer that the featural differences between the objects are not due to noisy observations, but due to the two objects being different (Kemp et al., 2009; Xu and Chun, 2009).

Developmental psychology also studies how infants and adults form object categories and relational concepts. An important finding is that certain experimental settings can elicit spontaneous sorting and grouping behaviors by infants (see Nelson (1973) and Starkey (1981) for examples). This suggests that even without any specific guidance, from an early age, humans are biased towards spontaneous categorization and grouping of objects. Starkey (1981) reports that both 9- and 12-month-old infants exhibit sorting behaviors when presented with a set of eight objects, where the set contains two groups of four objects that are similar along some dimension (e.g., size, color, etc.).

Sorting and grouping behaviors have also been observed with non-human primates (Potì, 1997; Spinozzi et al., 1999). For example, Spinozzi et al. (1999) found that human-encultured bonobos and chimpanzees are capable of spontaneously partitioning a set of objects into two categories. The authors also report that when chimpanzees partition a set of objects, they predominantly manipulate objects from only one of the two object classes. This procedure is consistent with the behavior of 3-year-old infants observed in a study by Spinozzi et al. (1999). Overall, these findings suggest that the ability to sort objects is fundamental to primate intelligence.

For humans in particular, object grouping skills are thought to be closely related to our language acquisition abilities. For example, Nelson (1973) argued that children form primitive conceptual categories that are later used when binding the meaning of a word. Similarly, based on a large volume of experimental research, Bloom (2000) argues that a large part of early language learning is about establishing a relation that maps language symbols (e.g., individual nouns) to already existing concepts that are formed independently of the language in question.

An example of what this may look like is provided by Kemp et al. (2010) who write:

“Before learning her first few words, a child may already have formed a category that includes creatures like the furry pet kept by her parents; and learning the word ‘cat’ may be a matter of attaching a new label to this pre-existing category.” (Kemp et al., 2010, p. 216)

Not surprisingly, a large volume of research has focused on revealing how humans learn the names of categories (see Ashby and Maddox (2005) for a review). In this framework, the participants are typically presented with several examples from each object category and are subsequently asked to categorize a novel item. Researchers have postulated that humans use two different strategies (sometimes in combination) to learn categories from examples. The first strategy involves finding the common features of members of an individual category, while the second strategy consists of identifying the distinctive features among the non-members of that category (Hammer et al., 2009, 2010). Several experiments described by Hammer et al. (2009) have shown that adults can learn categories even when presented only with pairs of objects from different categories. Children between the ages of 6 and 9 years, however, could only learn the same categories when provided with object pairs in which the two objects come from the same category, indicating that the two strategies for solving this task have different developmental trajectories (Hammer et al., 2009).
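The two kinds of paired examples play different computational roles, which the following minimal Python sketch illustrates. Same-category pairs are transitive and can be merged directly into groups, whereas different-category pairs can only rule out a grouping. The sketch is not the experimental procedure or model of Hammer et al. (2009); the function names `group_by_same_pairs` and `violated_pairs` and the toy items are hypothetical.

```python
def group_by_same_pairs(items, same_pairs):
    """Merge items into groups using only same-category pairs (union-find)."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


def violated_pairs(groups, different_pairs):
    """Return the different-category pairs that a grouping puts in one group."""
    index = {x: i for i, group in enumerate(groups) for x in group}
    return [(a, b) for a, b in different_pairs if index[a] == index[b]]


items = ["a", "b", "c", "d"]
groups = group_by_same_pairs(items, same_pairs=[("a", "b"), ("c", "d")])
print(sorted(sorted(g) for g in groups))
print(violated_pairs(groups, different_pairs=[("a", "c")]))
```

In this toy setting the same-category pairs alone determine the grouping, while the different-category pairs serve only as a consistency check on it.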

In addition to learning discrete categories, researchers have also examined how human adults and infants learn comparative relations such as “A is bigger than B” (Smith et al., 1986; Gentner and Namy, 2006). As with category learning, humans can learn such relations when presented with paired examples for which the relation is provided by the instructor or inferred by some other means. Thus, the robot in this work will be tested in a similar fashion: after initially interacting with the objects, the learned computational models will be evaluated using both discrete categorization and continuous ordering tasks.
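To make the notion of an ordering task concrete, the following minimal Python sketch recovers an ordering of objects from paired “A is bigger than B” examples by topological sorting. It only illustrates the task format and is not the evaluation procedure or the learned model used in this work; the function name `order_from_comparisons` and the object names are hypothetical, and the comparisons are assumed to be noise-free and acyclic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_from_comparisons(bigger_than):
    """Order objects from smallest to largest.

    bigger_than: list of (a, b) pairs, each meaning "a is bigger than b".
    Assumes the pairs are consistent (no cycles); graphlib raises
    CycleError otherwise.
    """
    graph = {}
    for bigger, smaller in bigger_than:
        # The smaller object must come before the bigger one in the ordering.
        graph.setdefault(bigger, set()).add(smaller)
    return list(TopologicalSorter(graph).static_order())


comparisons = [("cup", "ball"), ("box", "cup")]
print(order_from_comparisons(comparisons))
```

When the paired examples do not form a complete chain, several orderings are consistent with them and the sketch returns one of these.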

While most related studies in psychology have focused on the visual sensory domain, Lederman (1982) argues that human perception of objects is an inherently multi-modal process, one in which humans perceive objects and form object concepts using a variety of sensory modalities (e.g., vision, touch, audio, etc.). In addition, perception of objects is not a passive process; instead, humans actively interact with objects through the use of what psychologists call exploratory behaviors and procedures (Lederman and Klatzky, 1990; Power, 2000). The next two subsections examine in detail how multiple sensory modalities and a rich behavioral repertoire enable humans to solve a wide array of problems, including object recognition and categorization.
