
The Economics and Psychology of Cognitive and Non-Cognitive Traits*

Lex Borghans
Department of Economics and ROA, Maastricht University

Angela Lee Duckworth
Department of Psychology, University of Pennsylvania

James J. Heckman
University of Chicago, American Bar Foundation, and University College Dublin

Bas ter Weel
Department of Economics and UNU-MERIT, Maastricht University

This draft last edited by JJH October 14, 2007

* Angela L. Duckworth is an assistant professor of psychology at the University of Pennsylvania. Lex Borghans is professor of labor economics and social policy at Maastricht University and the Research Centre for Education and the Labour Market (ROA). James J. Heckman is the Henry Schultz Distinguished Service Professor in Economics and the College, Director of the Economics Research Center, Department of Economics, University of Chicago; Director of the Center for Social Program Evaluation, Harris Graduate School of Public Policy Studies; senior fellow of the American Bar Foundation; and Professor of Science and Technology at University College Dublin. Bas ter Weel is assistant professor of economics at Maastricht University and a researcher with the Maastricht Economic Research Institute on Innovation and Technology (MERIT). Duckworth's work is supported by a grant from the John Templeton Foundation. Heckman's work is supported by NIH R01-HD043411 and grants from the American Bar Foundation, The Pew Charitable Trusts and the Partnership for America's Economic Success, and the J.B. Pritzker Consortium on Early Childhood Development. Ter Weel's work was supported by a research grant of the Netherlands Organisation for Scientific Research (grant 014-43-711). Corresponding author: Lex Borghans, lex.borghans@algec.unimaas.nl. Chris Hsee gave us very useful advice at an early stage. We are grateful to Arianna Zanolini for helpful comments and research assistance at an early stage. We have received very helpful comments on various versions of this draft from Dan Benjamin, Dan Black, Ken Bollen, Sam Bowles, Frances Campbell, Flavio Cunha, John Dagsvik, Michael Daly, Kevin Denny, Thomas Dohmen, Armin Falk, James Flynn, Linda Gottfredson, Lars Hansen, Joop Hartog, Bob Hogan, Nathan Kuncel, Kenneth McKenzie, Frank Norman, David Olds, Friedhelm Pfeiffer, Bernard Von Praag, Elizabeth Pungello, Howard Rachlin, C. Cybele Raver, Bill Revelle, Brent Roberts, Carol Ryff, Larry Schweinhart, Burt Singer, Sergio Urzua and Gert Wagner. The views expressed in this paper are those of the authors and not necessarily of the funders listed here. A website, http://jenni.uchicago.edu/econ-psych-diff/, presents supplemental tables.


Abstract

This paper examines whether economics would benefit from incorporating the findings of personality psychology to better predict and understand economic outcomes. We ask if it is possible to distinguish cognitive from noncognitive traits to determine if noncognitive factors add anything to our understanding of economic phenomena beyond measures of cognition widely used in economics. We review the psychology literature to examine how personality constructs are created and validated. We discuss potential problems with using these measurements in economics. We examine the literature on the stability of personality traits over the life cycle and in different contexts at the same age. We consider whether personality traits can be changed by intervention. We present extensive evidence on the predictive power of personality.

JEL code: 31

Keywords: Big Five; personality and its effects

Lex Borghans
Department of Economics and ROA
Maastricht University
P.O. Box 616
6200 MD Maastricht
The Netherlands
lex.borghans@algec.unimaas.nl
Phone: +31 43 388 36 20
Fax: +31 43 388 41 50

Angela Lee Duckworth
Department of Psychology
University of Pennsylvania
3720 Walnut Street, Solomon Lab Bldg.
Philadelphia, PA 19104-6241
duckworth@psych.upenn.edu
Phone: 215-898-1339
Fax: 215-573-2188

James J. Heckman
Department of Economics
University of Chicago
1126 E. 59th Street
Chicago, IL 60637
jjh@uchicago.edu
Phone: 773-702-0634

Bas ter Weel
Department of Economics and UNU-MERIT
Maastricht University
P.O. Box 616
6200 MD Maastricht
The Netherlands
b.terweel@merit.unimaas.nl


I. Introduction

A major contribution of Twentieth Century social science was to empirically document that people are different in many ways and that these differences are persistent. Characterizing differences among persons is both a challenging and rewarding endeavor. It is challenging because accumulating evidence points to a very large number of important differences among persons. It is rewarding because understanding the sources of human ability and how achievement can be enhanced is fundamental to understanding and devising policies to alleviate inequality.

As new indicators of human performance are developed, new aspects of individual heterogeneity are revealed. Some of the earliest measurements of human performance were on intelligence. More recently, psychologists have measured personality. The goal of this paper is to take stock of the recent literature on personality in psychology and to examine its relevance for economics.

Psychologists have used intelligence tests for more than a century and there is general agreement on their validity. Measures of intelligence and the distinct concept, academic achievement, are widely available and much used by economists. A focus on cognitive ability is also compatible with the neoclassical view of rational decision making. More intelligent persons are thought to be better equipped to address the decision problems that economic agents are presumed to solve on a daily basis. Many leading schools of philosophy focus on cognition as the origin of morality and justice (see, e.g., the survey in Gray, 2007).


Empirical economists have long worried about the omission of intelligence from estimated relationships.1 Signaling models suggest that the economic return to schooling can arise solely from a return to cognitive ability: smarter people find it less costly to attain schooling, and schools are filters rather than producers of skill.2

Despite the traditional focus on cognitive ability in the academic economics literature, it is intuitively obvious that non-cognitive traits such as motivation, resilience, persistence, conscientiousness, and the like, are also important for success. David Wechsler (1940, 1943), the creator of the most widely-used intelligence test, was himself a vocal proponent of measuring what he termed "nonintellective" factors that give rise to productive behavior in the real world. Edison's remark, "genius is 1% inspiration and 99% perspiration," captures the importance of persistence and motivation in creative pursuits. The increasing importance of social skills is suggested by the trend towards teams (as opposed to solo authors) in the production of published research in the sciences, engineering, social sciences, arts, and humanities (Wuchty, Jones, and Uzzi, 2007). The secular growth of the service sector suggests that "people skills" have become more important in other domains as well (Borghans, ter Weel, and Weinberg, 2007).

Marxist economists (Bowles and Gintis, 1976) pioneered the introduction of noncognitive skills into models of earnings determination. Bowles, Gintis and Osborne (2001) review the literature in economics on the impact of noncognitive skills on earnings. Mueser (1979) is an early reference in sociology. The previous literature focused on earnings outcomes. This paper considers a much broader array of socioeconomic outcomes in analyzing the roles of cognitive and noncognitive factors in affecting the capacities of individuals in a number of aspects of choice and decision.

1. See the models of omitted-ability bias in the coefficient on schooling in an earnings equation. Griliches (1977) and Card (2001) straddle a large empirical literature.


Recent research on the economics of human skill formation has demonstrated the importance of factors besides pure intelligence in creating productive persons. Heckman and Masterov (2007) and Cunha, Heckman, Lochner and Masterov (2006) review the evidence that the Perry preschool program, an early childhood intervention program targeted towards disadvantaged, low-IQ children, had substantial long-term impacts on its participants.3 As Figure 1 reveals, despite a short boost, it did not raise the IQ of its participants by age 10. Yet the effects of the treatment were long lasting on a variety of dimensions of social performance such as earnings, crime, educational achievement and employment. Something besides IQ was affected by the intervention. This essay draws on evidence from psychology to determine what that something else might be, and the channels through which it works.

As is typical of any new measurement system, there is some controversy over how best to measure noncognitive traits, and which traits should be measured. That may give economists pause in drawing on the psychology literature. We survey the main contours of the debate within psychology and offer guidance on pitfalls in drawing uncritically on this literature. We focus on personality.4

Psychologists are continually engaged in defining, parsing, and classifying personality traits in new ways. Whereas many psychologists, particularly in the United States, now accept the "Big Five" taxonomy of personality traits, a variety of competing taxonomies coexist. Many psychologists do not find the "Big Five" sufficiently inclusive of many dimensions of personality. Economists who use measures of personality traits to predict outcomes may be unaware of where they fit into the Big Five taxonomy and, more importantly, of their interpretation and known relationships with other traits.

3. See also the Perry study by Schweinhart et al. (2005).

4. Our specific interest in this article is personality, perhaps the largest and most relevant category of non-cognitive, psychological traits. We do not discuss other non-cognitive, psychological traits such as interests and values. Nor do we discuss the growing literature on non-cognitive, physical traits such as beauty, height, and athleticism. See, e.g., Hamermesh and Biddle (1994), Biddle and Hamermesh (1998), Pfann et al. (2000), and Persico, Postlewaite and Silverman (2004). Bowles, Gintis and Osborne survey the literature on physical traits through 2001. Mobius and Rosenblat (2006) suggest that the transmission channels of the effect of beauty on outcomes are mainly related to noncognitive skills such as self-confidence and communication skills.

Even the definition of personality is a matter of debate. Some psychologists take a narrow view and believe that expectation, emotion, motivation, resilience and the ability to persist in the face of adversity, values, and interests fall outside the construct of personality. Others take the position that insofar as these variables are persistent over time, they can be considered aspects of personality (cf. Costa and McCrae, 1988). Some (Mischel, 1968) emphasize that manifest personality traits are context dependent and do not persist across situations.

The lack of general consensus among psychologists about how to organize, define, and measure non-cognitive traits is one reason for their omission from most economic studies. Another reason is that economists have yet to be convinced of their predictive validity or their causal status. Most data on personality are observational and not experimental. Personality traits may reflect, rather than cause, the outcomes that they are alleged to predict. Large-scale studies of the impact of personality are necessarily limited in the number and length of measures that they can include. Without evidence that there is value in knowing which non-cognitive traits are most important in predicting outcomes, there is little incentive to include sufficiently broad and nuanced non-cognitive measures in studies.

As the Perry graph reveals, social policy can be effective by operating outside of purely cognitive channels. In addition, most economists are currently unaware of evidence that non-cognitive traits are more mutable than cognitive ability over the life cycle and are, therefore, more sensitive than cognitive traits to investment by parents and to other sources of external stimuli.


Another issue considered in this paper is whether the standard preference parameters used in economics already capture the dimensions of personality emphasized by the psychologists.

This paper addresses the following questions:

1. Is it conceptually possible to distinguish cognitive from non-cognitive traits?

The distinction between cognitive and non-cognitive skills has intuitive appeal and has been used by economists and psychologists alike. Precise definitions for these terms, however, are lacking. We argue that these terms are most usefully understood as short-hand for cognitive ability (or intelligence) – problem-solving ability for abstract logical problems – and, by contrast, traits other than cognitive ability, including personality traits and some economic preference parameters.

2. Is it possible to empirically distinguish cognitive from non-cognitive traits?

Even if it is conceptually possible to disentangle the two classes of traits, empirically it is an extremely difficult task. Many measures of economic preferences are influenced by numeracy and intelligence. IQ test scores are determined not only by intelligence, but also by factors such as motivation and anxiety. Moreover, over time, aspects of cognitive ability are influenced, indirectly, by non-cognitive traits such as curiosity, ambition, and perseverance.

3. What are the main measurement systems in psychology and how are they verified?

Psychologists aiming to create valid personality questionnaires balance multiple concerns. One objective is to create questionnaires with construct-related validity (i.e., whose internal factor structure is consistent across time, gender, ethnicity, and culture). A separate concern is to create survey instruments with predictive validity. With notable exceptions, contemporary personality psychologists privilege construct validity over predictive validity in their choice of measures. We summarize an econometric approach that simultaneously addresses both types of validity objectives.

4. What is the evidence on the predictive power of cognitive and non-cognitive traits?

We summarize evidence that both cognitive ability and non-cognitive traits predict important outcomes, including schooling, wages, and longevity. We show that for many outcomes, certain non-cognitive traits (i.e., traits associated with Conscientiousness and Emotional Stability) are more predictive than others (i.e., traits associated with Agreeableness, Openness to Experience, and Extraversion). The principle of comparative advantage operates in both cognitive and non-cognitive traits. The relative importance of any given trait varies by the task studied. Cognitive traits are predictive of performance in a greater variety of tasks. Noncognitive traits are very important in explaining performance in specific tasks, although different noncognitive traits are predictive in different tasks.

5. How stable are non-cognitive traits? Are they more sensitive than cognitive ability to investment and intervention?5

We present evidence that both cognitive ability and non-cognitive traits change over the life cycle—but to different degrees and at different stages of the life cycle. There is emerging evidence on critical and sensitive periods in the production of both types of traits. Cognitive processing speed, for example, tends to rise sharply during childhood, peak in late adolescence, and then slowly decline. In contrast, some personality traits, such as conscientiousness, increase monotonically from childhood to late adulthood. Moreover, rank-order stability for personality peaks between the ages of 50 and 70, whereas IQ reaches these same levels of stability by middle childhood. We relate these changes to recent models of investment developed in economics.

5. Investment refers to the allocation of resources, broadly defined, for the purpose of increasing skills. Parents invest in their children directly and through choice of schools, but individuals can also invest in themselves.

6. Do the findings from psychology suggest that conventional economic theory should be enriched? Should we change the way we formulate economic models in light of the evidence from psychology? Can conventional models of preferences explain the emerging body of evidence from personality psychology? Does personality psychology merely recast well-known preference parameters into psychological jargon, or is there something new to learn?

This discussion entails comparing the predictive power of personality measures with the predictive power of preferences and constraints conventionally used by economists. Conventional economic theory is sufficiently elastic to accommodate many findings of psychology. However, some traditional concepts should be modified and certain emphases redirected. Some findings from psychology cannot be rationalized by standard economic models and could fruitfully be incorporated into economic analysis.

The paper proceeds in the following way. Section II first defines cognitive and non-cognitive traits. We then consider how these concepts are measured. Each operationalization of these concepts gives an implicit definition of these traits. In Section III, we consider additional measurement and methodological issues. In Section IV, we present a framework for interpreting personality and economic parameters. Recent work in behavioral economics that seeks to integrate economics and psychology focuses almost exclusively on preference parameters. In contrast, we present a broader framework that includes a full array of human capacities as they affect constraints, skill acquisition, learning and preferences. In Section V we relate personality parameters to economic preference parameters and constraints. We consider one reformulation of choice theory that minimizes the role of preferences and emphasizes the role of perceptions and constraints in affecting choices. In Section VI, we examine the correspondence between personality and preference parameters. In Section VII, we examine the predictive power of non-cognitive traits. In Section VIII, we study the evolution of preference parameters and personality variables over the life cycle. Personality and preference parameters are not perfectly stable as conjectured by James (1890; reprinted in 1981), but they are also not completely determined by context. They can be affected by investment as well as life experiences. Section IX concludes by summarizing the paper and suggesting an agenda for future research. We start by defining cognitive and noncognitive traits.

II. Defining cognitive and non-cognitive traits

Neither cognitive nor non-cognitive traits have been clearly and consistently defined in either the psychology or economics literatures. The willingness of scholars to use these terms without precise definitions reflects their widespread vernacular use. Despite strong intuitions that cognitive and non-cognitive traits are distinct, the boundaries between these categories blur when closely scrutinized. Some may wonder how any aspect of the brain or mind can rightly be called “non-cognitive.”

Here we state plainly our intent. We distinguish between cognitive or "intellectual" ability on the one hand, and non-cognitive traits on the other. That is, we use the term non-cognitive to refer to traits other than those that characterize abstract problem solving. We do not mean to imply that such non-cognitive traits are devoid of any elements of cognitive processing, however. Schulkin (2007) reviews evidence that information-processing systems not in the cortex (which is associated with cognition) in turn affect cognitive functioning in a number of dimensions. Phelps (2006) shows that emotions are involved in learning, attention, and other aspects of cognition. Were the terms "non-intelligence" or, as Wechsler (1940) suggested, "nonintellective" traits used instead, the same difficulty would remain: defining non-cognitive traits as those that are separate from cognitive ability, of course, begs for a specific definition of the latter. Indeed, in both measurement and in definition, the two concepts cannot be easily unbundled.

A. Cognitive Ability

Intelligence (or cognitive ability) has been defined by an official taskforce of the American Psychological Association as the "ability to understand complex ideas, to adapt effectively to the environment, to learn from experience, to engage in various forms of reasoning, to overcome obstacles by taking thought" (Neisser et al., 1996, p. 77). Scores on different tests of cognitive ability tend to be highly intercorrelated, often with half or more of the variance of diverse tests accounted for by a single general factor labeled g and more specific mental abilities loading on other factors (Jensen, 1998; Lubinski, 2004; Spearman, 1904, 1927).6 The term "IQ" is often used synonymously with intelligence but in fact refers specifically to scores on IQ tests.7
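A back-of-the-envelope illustration of the claim about g may help fix ideas. The following sketch (Python with numpy; the battery, loadings, and sample size are hypothetical, not taken from any study cited here) simulates several test scores that share one latent factor and reports the share of total variance picked up by the first principal component of the resulting test battery.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000                            # hypothetical sample of test takers
g = rng.standard_normal(n)          # latent general factor

# Invented loadings for a small battery of diverse tests (e.g., vocabulary,
# arithmetic, matrices, memory, coding speed); residuals are test-specific.
loadings = np.array([0.8, 0.75, 0.7, 0.6, 0.55])
noise_sd = np.sqrt(1 - loadings**2)   # keep each test at unit variance
tests = np.outer(g, loadings) + rng.standard_normal((n, len(loadings))) * noise_sd

corr = np.corrcoef(tests, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]     # eigenvalues, largest first
share_first = eigvals[0] / eigvals.sum()     # variance share of the first component
print(f"Share of variance on the first component: {share_first:.2f}")  # roughly 0.5-0.6
```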

Most psychologists agree that cognitive abilities are organized hierarchically with g as the highest-order factor (Spearman, 1904). There is less agreement about the number and identity of lower-order factors.8 Cattell (1971; 1987) contrasts two second-order factors: fluid intelligence (the ability to solve novel problems) and crystallized intelligence (knowledge and developed skills).9 The relative weighting of fluid versus crystallized intelligence varies among tests according to the degree to which prior experience is crucial to performance. Achievement tests, for example, are heavily weighted towards crystallized intelligence, whereas tests like the Raven (1962) Progressive Matrices, described later, are heavily weighted towards fluid intelligence.10 An alternative organization of second-order cognitive factors is offered by Lubinski (2004), who distinguishes between verbal, quantitative, and spatial ability. Carroll's (1993) proposed hierarchy has eight second-order factors comprising fluid intelligence, crystallized intelligence, general memory and learning, visual perception, auditory perception, retrieval ability, cognitive speediness, and decision-making speed.11 Horn and McArdle (2007) summarize the large body of evidence against a single g for intelligence.

6. Heckman (1995) presents some recent evidence on the power of g in explaining cognitive test scores.

7. Several psychologists have attempted to broaden the term intelligence to include other capacities. Most notably, Sternberg (2000, 2001) suggests that the notion of intelligence should also include creativity and the ability to solve practical, real-world problems. Gardner (2004) includes in his theory of multiple intelligences musical intelligence, kinaesthetic intelligence, and interpersonal and intrapersonal intelligence, among others.

8. Carroll (1993) reviewed some 477 data sets and proposed a structure with g as the highest-order factor, eight second-order ability clusters, and over 70 more narrowly defined third-order abilities. Alternative hierarchical models, also with g as the highest-order factor, have been proposed (e.g., Cattell, 1971; Lubinski, 2004).

9. Cattell's student Horn (1970) elaborates: fluid intelligence is the ability to "perceive complex relations, educe complex correlates, form concepts, develop aids, reason, abstract, and maintain span of immediate apprehension in solving novel problems in which advanced elements of the collective intelligence of the culture were not required for solution" (p. 462). In contrast, crystallized intelligence is the same class of skills, "but in materials in which past appropriation of the collective intelligence of the culture would give one a distinct advantage in solving the problems involved" (p. 462).

B. Non-Cognitive Traits

Defining a concept by what it is not is far from ideal. In particular, such dichotomies imply clear separation when, in actuality, the distinction between the two categories is not easy to make. Consider, for example, so-called "quasi-cognitive" traits (Kyllonen, Walters, and Kaufman, 2005). These include creativity (Csikszentmihalyi, 1996), emotional intelligence (Mayer and Salovey, 1997), cognitive style (Stanovich, 1999; Perkins and Tishman, 2001), typical intellectual engagement (Ackerman and Heggestad, 1997), and practical intelligence (Sternberg, 2000).

The conundrum of separating cognitive ability from non-cognitive traits is exemplified by executive function.12 Executive function includes self-control mechanisms. Components of executive function include working memory, attention, and other so-called "top-down" mental processes whose function is to orchestrate lower-level processes. Ardila, Pineda, and Rosselli (2000), Welsh, Pennington, and Grossier (1991), and Schuck and Crinella (2005) find that many measures of executive function do not correlate reliably with IQ. Supporting this are case studies of lesion patients who suffer marked deficits in executive function, especially self-regulation and the ability to socialize and plan, but who retain cognitive skills such as the ability to reason (Damasio, 1994). However, measures of working memory capacity in particular correlate very highly with measures of fluid intelligence (Heitz, Unsworth, and Engle, 2005). In fact, there is currently a lively debate among cognitive psychologists as to the precise relationship among working memory, other aspects of executive function, and intelligence (cf., Blair, 2006 and ensuing commentary).

10. Rindermann (2007) uses data on intelligence and achievement tests across nations to show that a single factor accounts for 94-95% of the variance across both kinds of tests. The high correlation between intelligence and achievement tests is in part due to the fact that both require cognitive ability and knowledge, even if to different degrees, and that common developmental factors may affect both of these traits.

11. Recent work by Conti and Pudney (2007) shows that more than one factor is required to summarize the predictive power of tests in economic data. This could be due to the existence of multiple intellective factors or because non-cognitive factors affect the measurement of cognitive factors, as we discuss later on in this section.

12. Executive function has been characterized as the capacity to act on information as opposed to understanding the information. In the language of psychology, executive function refers to "the multi-operational system mediated by prefrontal areas of the brain and their reciprocal cortical and subcortical connecting pathways" (Ardila, Pineda, and Rosselli, 2000; see Miller and Cohen, 2001, for a review).

Many varieties of personality psychology are very cognitively oriented. For example, Mischel (1968) and Bandura (1968) formulate personality in terms of cognitive operations (how people encode information, their expectations and beliefs). Except for pure motor functions, all aspects of personality entail aspects of cognition, broadly defined.

In this paper, we focus on personality traits, categories of non-cognitive traits that are more easily distinguished from intelligence. Personality traits are patterns of emotion, thought, behavior, and motivation that are relatively stable across time. Most measures of personality are only weakly correlated with IQ (Webb, 1915; McCrae and Costa, 1994; Stankov, 2005; Ackerman and Heggestad, 1997). There are, however, a small number of exceptions. Most notably, IQ is moderately associated with the Big Five factor called openness to experience, with the trait of sensation seeking, and with measures of time preference. The correlations are of the order r = .3 or lower. As we note below, performance on IQ tests is affected by personality variables. Even if there is such a thing as pure cognition or pure personality, measurements are affected by a variety of factors.

C. Operationalizing the Concepts

Intelligence tests are routinely used in a variety of settings including business, education, civil service and the military.13 We first discuss the origins of the measurement systems for intelligence and we then discuss their validity.14

IQ tests

Modern intelligence tests have been employed for screening purposes for just over a century, beginning with the decision of a Parisian minister of public instruction to identify retarded pupils in need of specialized education programs. The psychologist Alfred Binet, with his assistant Theophile Simon, developed the first IQ test.15 Other pioneers in intelligence testing include James McKeen Cattell (1890) and Francis Galton (1883), both of whom developed tests of basic cognitive functions (e.g., discriminating between objects of different weights). These early tests were eventually rejected in favor of tests that attempt to tap higher mental processes. Terman (1916) adapted Binet’s IQ test for use with American populations. Known as the Stanford-Binet IQ test, Terman’s adaptation was, like the original French test, used primarily to predict academic performance. Stanford-Binet test scores were presented as ratios of mental age to chronological age multiplied by 100 to eliminate decimal points. IQ scores centered at 100 as average are now conventional for most intelligence tests.

Wechsler (1939) noted two major limitations of the Stanford-Binet test: (1) it was overly reliant on verbal skills and, therefore, dependent upon formal education, and (2) the ratio of mental to chronological age was inappropriate for adults (Boake, 2002). Wechsler created a new intelligence test battery divided into verbal (e.g., similarities) and performance subtests (e.g., block design, matrix reasoning). He also replaced the ratio IQ score with deviation scores that had the same normal distribution at each age. This test, the Wechsler Adult Intelligence Scale (WAIS) – and, later, the Wechsler Intelligence Scale for Children (WISC) – produces two different IQ subscores, verbal IQ and performance IQ, which sum to a full-scale IQ score. The WAIS and the WISC have for the past several decades been by far the most commonly used IQ tests.

13. Kaplan and Saccuzzo (1997) provide a detailed overview of the different types of applications of psychological testing.

14. See Roberts, Markham, Matthews, and Zeidner (2005) for a more complete history of intelligence testing.

15. In 1904, La Société Libre pour l'Etude Psychologique de l'Enfant appointed a commission that was asked to create a mechanism for identifying these pupils in need of alternative education. The commission, led by Alfred Binet, developed the first intelligence test. See Siegler (1992) for an overview of Binet's life and work.
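To make the two scoring conventions concrete, here is a small sketch (Python; the function names and all numbers are hypothetical) contrasting Terman's ratio IQ with a Wechsler-style deviation score on the conventional mean-100, SD-15 scale.

```python
import statistics

def ratio_iq(mental_age, chronological_age):
    """Stanford-Binet style ratio IQ: mental age over chronological age, times 100."""
    return 100.0 * mental_age / chronological_age

def deviation_iq(raw_score, age_group_scores, mean=100.0, sd=15.0):
    """Wechsler-style deviation IQ: standardize a raw score within its age group,
    then place it on a scale with the chosen mean and standard deviation."""
    mu = statistics.mean(age_group_scores)
    sigma = statistics.stdev(age_group_scores)
    return mean + sd * (raw_score - mu) / sigma

# Hypothetical 10-year-old performing at the level of a typical 12-year-old.
print(ratio_iq(mental_age=12, chronological_age=10))   # 120.0

# Hypothetical norming sample of raw scores for one age group.
norms = [38, 42, 45, 47, 50, 52, 55, 58, 61, 64]
print(round(deviation_iq(58, norms), 1))               # above-average score, > 100
```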

Similar to Wechsler’s Matrix Reasoning subtest, the Raven Progressive Matrices test is a so-called “culture-free” IQ test because it does not depend heavily on verbal skills or other knowledge explicitly taught during formal education. Each matrix test item presents a pattern of abstract figures. The test taker must choose the missing part. See Figure 2 for an example item. If subjects have not had exposure to such visual puzzles, the Raven test is an almost pure measure of fluid intelligence. However, the assumption that subjects are naïve to such puzzles is not typically tested. It seems likely, for example, that children from more educated families or from more developed countries have more exposure to such abstract puzzles (Blair, 2005). To varying degrees, IQ tests reflect intelligence, knowledge and motivation.

D. Personality tests

There is a parallel tradition in psychology of measuring personality, but with different origins. Dominant theories of personality assume a hierarchical structure analogous to that found for intelligence. However, despite early efforts to identify a g for personality (e.g., Webb, 1915), even the most parsimonious personality models incorporate more than one factor. The most widely accepted taxonomy of personality traits is the Big Five or five-factor model.16 This model originated in Allport and Odbert's (1936) lexical hypothesis, which posits that the most important individual differences are encoded in language. Allport and Odbert combed English dictionaries and found 17,953 describing words, which were later reduced to 4,504 personality-describing adjectives. Subsequently, several different psychologists working independently and on different samples concluded that personality traits can be organized into five superordinate dimensions. These five factors have been known as the Big Five since Goldberg (1971).

The Big Five factors are Openness to Experience (also called Intellect or Culture), Conscientiousness, Extraversion, Agreeableness and Neuroticism (also called Emotional Stability). A convenient acronym for these factors is "OCEAN". These factors represent personality at the broadest level of abstraction; each factor summarizes a large number of distinct, more specific, personality characteristics. John (1990) and McCrae and Johnson (1992) present evidence that most of the variables used to assess personality in academic research in the field of personality psychology can be mapped into one or more of the dimensions of the Big Five. They argue that the Big Five may be thought of as the longitude and latitude of personality, by which all more narrowly defined traits (often called "facets") may be categorized (Costa and McCrae, 1992a). Table 1 summarizes 30 lower-level facets (6 facets for each of the five factors) identified in the Revised NEO Personality Inventory (NEO-PI-R; Costa and McCrae, 1992), shorthand for Neuroticism, Extraversion, Openness to Experience—Personality Inventory—Revised. It is the most widely used Big Five questionnaire. Since 1996, free public-domain measures of Big Five factors and facets derived from the International Personality Item Pool have been made available.17 Table 2 presents a commonsense interpretation of these traits. We discuss the relationship of these facets with familiar economic constructs in Section IV.

16. See John and Srivastava (1999) for an historical overview of the development of the Big Five. See Costa and McCrae (1992a) and Digman (1990) for a review of the emergence of this concept.

The five-factor model is not without its critics. For example, Eysenck (1991) offers a model with just three factors (i.e., Neuroticism, Extraversion, and Psychoticism). Cloninger (1987) and Tellegen (1985) offer different three-factor models. Figure 3 shows the commonalities across these competing taxonomies and also areas of divergence. Despite solid evidence that five factors can be extracted from most if not all personality inventories in English and other languages, our view is that there is nothing magic about the five-factor representation. For example, Mershon and Gorsuch (1988) show that solutions with more factors substantially increase the prediction of such outcomes as job performance, income, and change in psychiatric status. On the other hand, more parsimonious models in which the five factors are reduced further to two "metatraits" have also been suggested (Digman, 1997). Block (1995) questions not only the five-factor model itself but, more generally, the utility of factor analysis as a tool for understanding the true structure of personality. Anyone familiar with factor analysis knows that determining a factor representation often requires some amount of subjective judgment. We discuss this issue further in Section III.

The most damning criticism of the five-factor model is that it is atheoretical. The finding that descriptions of behavior cluster reliably into five groups has not so far been explained by a basic theory. The five-factor model is relatively silent on an important class of individual differences: motivations. The omission of motivations (i.e., what people value or desire) from measures of Big Five traits is not complete, however. The NEO-PI-R, for example, includes the facet of achievement striving. Individual differences in motivation are more prominent in older (now rarely used) measures of personality. The starting point for Jackson's Personality Research Form (PRF; Jackson, 1974), for example, was Murray's (1938) theory of basic human drives. Included in the PRF are scales for (need for) play, order, autonomy, achievement, affiliation, social recognition, and safety. Thus, while there is more support at the moment for the five-factor model than for any other model of personality, it is clear that there are important differences between people that are not adequately captured by Big Five measures.

Another aspect of personality research not covered by the Big Five is the regulation of emotion. The essays in Gross (2007) discuss the regulation of emotion from the perspectives of biology, cognition, development, personality processes and individual differences, and social psychological approaches. Over the life cycle, the way people deal with challenges, frustration and setbacks will affect persistence and motivation. The Big Five traits do not directly capture the important phenomenon of resilience.

A practical problem facing the economist who wishes to measure personality is the multiplicity of personality questionnaires, which greatly outnumber intelligence tests. The proliferation of personality measures reflects, in part, the more heterogeneous nature of personality in comparison to cognitive ability although, as we have seen, there are various types of cognitive ability. The panoply of measures and constructs also points to the relatively recent and incomplete convergence of personality psychologists on the Big Five model, as well as the lack of consensus among researchers about identifying and organizing lower-order facets of the Big Five factors (cf., DeYoung, 2007; Hofstee, de Raad, and Goldberg, 1992). For example, some theorists argue that impulsivity is a facet of Neuroticism (Costa and McCrae, 1992), others claim that it is a facet of Conscientiousness (Roberts et al., 2005), and still others suggest that it is a blend of Conscientiousness, Extraversion, and perhaps Neuroticism (Revelle, 1997). Figure 3 shows in italics facets whose classification is in debate. A final reason for the proliferation of measures is the methodology of verifying tests. We expand on this point in Section III.

E. Measures of Temperament

Childhood temperament is the term used by developmental psychologists to describe the personalities of infants and children. Because individual differences in temperament emerge so early in life, these traits have traditionally been assumed to be biological (as opposed to environmental) in origin.18 However, findings in behavioral genetics suggest that, like adult personality, temperament is only partly heritable, and as discussed in Section VIII, both adult and child measured traits (phenotypes) are affected by the environment.

Temperament is studied primarily by child and developmental psychologists, while personality is studied by adult personality psychologists. The past decade has seen some convergence of these two research traditions, however, and there is evidence that temperamental differences observed in the first years of life anticipate adult personality and interpersonal functioning decades later (e.g., Caspi, 2000; Newman et al., 1997; Shiner and Caspi, 2003).

Most of the research on temperament has examined specific lower-order traits rather than the broader, higher-level factors that characterize studies of adult intelligence and personality.19 A typical temperament researcher is interested in only one or two of these lower-order traits, and makes no effort to connect her work to that of others studying different traits. Shiner (1998) observed that "there is therefore a great need to bring order to this vast array of studies of single lower level traits." Taxonomies of temperament that resemble the Big Five have been proposed (e.g., Putnam et al., 2001; Rothbart, Ahadi and Evans, 2000; Shiner and Caspi, 2003). It is meaningful to characterize both children and adults as either extraverted or introverted, or agreeable or disagreeable, for example. However, compared to adults, there seem to be fewer ways that young children can differ from one another. Child psychologists often refer to the "elaboration" of childhood temperament into the full flower of complex, adult personality. Needless to say, the lack of direct correspondence between measures of temperament and measures of adult personality presents a challenge to researchers interested in documenting changes in personality from the beginning of life to its end.

18. Indeed, some psychologists use the term "temperament" to indicate all aspects of personality that are biological in origin. They study temperament in both children and adults.

19. Measuring temperament presents unique methodological challenges. Self-report measures, by far the most widely used measure for adult personality, are not appropriate for young children for obvious reasons. One strategy is to ask parents and teachers to rate the child's overt behavior (e.g., California Child Q-sort), but informants can only guess what a child might be thinking and feeling. Infants present a special challenge because their behavioral repertoire is so limited. One strategy is to place infants in a standard situation and code reactions under a standardized scenario (e.g., the Strange Situation, which is used to distinguish infants who are securely attached to their caregiver versus insecurely attached). Young children can be interviewed using puppets or stories. For obvious reasons, all measures of temperament are more difficult and more expensive to collect than adult self-report measures. This may explain their absence in large-sample studies.

F. IQ scores reflect both cognitive and non-cognitive traits

Performance on intelligence and achievement tests depends to some extent on certain non-cognitive traits of the test taker, especially their motivation to perform. Motivation (reward) has an important place in economics and is also a central aspect of some versions of personality psychology.20 More intelligent people are likely to perform strategically on the tests. A smart child unable to sit still during an exam or uninterested in exerting much effort can produce spuriously low scores on an IQ test. Moreover, many IQ tests also require factual knowledge acquired through schooling and life experience, which are in part determined by the motivation, curiosity, and persistence of the test taker.21 Thus, personality can have both direct and indirect effects on IQ test scores.

20. It is likely that performance on personality tests also depends on cognitive factors, but that is less well documented.

21. See Hansen, Heckman and Mullen (2004) for an analysis of the causal effects of schooling on achievement tests. Heckman, Stixrud and Urzua (2006) consider the causal effects of schooling on measures of noncognitive skills.


Almost 40 years ago, several studies called into question the assumption that IQ tests measure maximal performance (i.e., performance reflecting maximal effort). These studies show that performance on IQ tests could be increased by up to a full standard deviation by offering incentives such as money or candy, particularly on group-administered tests and particularly for individuals at the low end of the IQ spectrum. Engaging in complex thinking is effortful, not automatic (Schmeichel, Vohs, & Baumeister, 2003), and therefore motivation to exert effort affects performance. Zigler and Butterfield (1968) found that early intervention (e.g., nursery school) for low-SES kids may have a beneficial effect on motivation, not on g per se. Raver and Zigler (1997) and Heckman, Stixrud, and Urzua (2006) present further evidence on this point. It is the message of Figure 1 joined with the large literature that documents substantial effects of the Perry preschool program. For additional evidence on this point, see the survey by Cunha, Heckman, Lochner and Masterov (2006) and the analysis of Cunha and Heckman (2007b), as well as Revelle (1993). Table 3 summarizes the evidence on extrinsic incentives and performance on tests of cognitive ability.

Segal (2006) shows that introducing performance-based cash incentives in a low-stakes administration of the coding speed test of the Armed Services Vocational Aptitude Battery (ASVAB) increases performance substantially among roughly one-third of participants. Less conscientious men are particularly affected by incentives. Her work and a large body of related work emphasize heterogeneity of motivations that affect human performance. Borghans, Meijers and ter Weel (2006) show that adults spend substantially more time answering IQ questions when rewards are higher, but subjects high in emotional stability and conscientiousness are much less affected by these incentives. Similarly, Pailing and Segalowitz (2004) find that an event-related potential (ERP) indexing the emotional response to making an error increases in amplitude when incentives are offered for superior test performance.22 This effect is smaller for individuals high in conscientiousness and emotional stability. Thus, IQ scores may not accurately reflect maximal intellectual performance for individuals low in the non-cognitive skills of conscientiousness and emotional stability. Performance on an IQ test may encode, in part, how effective persons may be in applying their intelligence, i.e., how people are likely to perform in a real-world setting. However, it is far from obvious that motivation on an exam and motivation in a real-world situation are the same. We discuss this issue further in Section IV.

Like low motivation, test anxiety can significantly impair performance (Hembree, 1988). That is, subjects do worse when they worry excessively about how they are performing and when their autonomic nervous system over-reacts by increasing perspiration, heart rate, etc. Because individuals who are higher in Big Five neuroticism are more likely to experience test anxiety, there is another reason, beyond incentives, why emotional stability can impact IQ scores (Moutafi, Furnham, Tsaousis, 2006).

Personality traits can also affect IQ scores indirectly, through the knowledge acquired by individuals who are more open to experience, more curious, and more perseverant. Cunha and Heckman (2007a) show that there is a correlation between cognitive and non-cognitive factors. Hansen, Heckman and Mullen (2004), Heckman, Stixrud and Urzua (2006), and Urzua (2007) show how schooling and other acquired traits substantially and causally affect measured cognitive and non-cognitive test scores. Cattell's investment theory (1971) anticipates recent findings that knowledge and specific complex skills depend not only on fluid intelligence but also on the cumulative investment of effort and exposure to learning opportunities.

How, then, should one interpret a low IQ score? Collectively, the evidence surveyed here and in Table 3 suggests that IQ test performance reflects not only pure intelligence, but also intrinsic motivation, anxiety, knowledge, and reactions to extrinsic incentives to perform well.23,24 The relative impurity of IQ tests likely varies from test to test and individual to individual. The difficulty of creating a pure measure of intelligence suggests that Herrnstein and Murray (1994) likely overestimate the effect of intelligence on outcomes such as crime and wages.25 To capture pure intelligence it is necessary to adjust for the incentives and motivations of persons. We discuss this issue in the next section.

III. Measurement and Methodological Issues

In studies gauging the impact of non-cognitive skills on outcomes, economists routinely use measures developed by psychologists.26 We have discussed these measures in general terms in the preceding section and discuss specific measurements in Section IV. Before discussing the details of specific measurement schemes, it is useful to understand limitations of the currently used measures at an abstract level.

Psychologists marshal three types of evidence to establish the validity of their tests: content-related, construct-related, and criterion-related evidence (AERA, APA, 1999). Content-related evidence demonstrates that a given measure adequately represents the construct being measured. Qualitative judgments about content-related validity are made by experts in the subject. In recent years, psychologists have devoted more energy to mustering construct-related evidence for a measure. Test items that are highly correlated form a cluster. If items are highly correlated within a cluster but weakly correlated with items across other clusters, the set of tests is said to have both "convergent and discriminant validity," with the "convergent" referring to the intercorrelations within a cluster and the "discriminant" referring to the lack of correlation across clusters. This criterion relies on factor analysis.27 A third approach is based on criterion-related validity. The association of tests with concurrent outcomes is termed "concurrent validity" and the prediction of future outcomes is termed "predictive validity." Evidence for predictive validity is inherently more attractive to economists than construct validity but has its own problems. Because of problems with measurement error, this approach almost certainly leads to a proliferation of measures that are proxies for a lower-dimensional set of latent variables or "constructs" in the psychology literature.

23. http://jenni.uchicago.edu/econ-psych-diff/.

24. In their paper in this issue, Cunha and Heckman (2007b) show that cognitive and noncognitive factors are correlated.

25. Jensen (1980) offers a possible counterargument to this view.

26. See, e.g., the studies summarized in Bowles, Gintis and Osborne (2001) and the primary analyses presented in that paper.

To understand approaches to the validation of intelligence and personality measures in psychology and their recent applications and extensions in economics, it is helpful to review the basic factor model. We develop an extension of this model in the last part of this section. Suppose that for a construct k, k = 1,…,K, where there are K distinct constructs, there are $J_k$ item or test scores, all of which proxy a latent factor or construct $f_k$. The standard factor model as augmented by Hansen, Heckman and Mullen (2004) writes the score for item or test j for construct k for a person with measured characteristics X as

(1)  $T_j^{(k)}(X) = \mu_j^{(k)}(X) + \lambda_j^{(k)}(X)\, f_k + \varepsilon_j^{(k)}, \qquad j = 1, \ldots, J_k,$

with

$\{\varepsilon_j^{(k)}\}_{j=1}^{J_k} \perp\!\!\!\perp f_k, \quad k = 1, \ldots, K, \qquad E(f_k) = 0 \ \text{and} \ E(\varepsilon_j^{(k)}) = 0, \ j = 1, \ldots, J_k,$

where "$\perp\!\!\!\perp$" denotes independence and "E" denotes mathematical expectation. All of the dependence across the test scores for a given construct with fixed X is generated by the $f_k$. The $\varepsilon_j^{(k)}$ are called uniquenesses in factor analysis. They are assumed to be mutually statistically independent of all other $\varepsilon_{j'}^{(k')}$, $k \neq k'$ or $j \neq j'$, or both, as well as of the X and the $f_k$, k = 1,…,K. They represent idiosyncratic aspects (specific variance) of item or test j as well as measurement error (error variance). The means of the item or test scores $\mu_j^{(k)}(X)$ and the "factor loadings" $\lambda_j^{(k)}(X)$ can depend on the X, although traditionally the factor loadings are not parameterized to depend on X. The X can include incentive schemes and situation-specific aspects of tests, which we discuss in the next subsection.

27. More rarely, the multitrait-multimethod matrix approach developed by Campbell and Fiske (1959) is used for this purpose.

Conventional psychometric validity of a collection of item or test scores for different constructs has three aspects: (a) a factor $f_k$ is assumed to account for the intercorrelations among the items or tests within a construct, (b) item-specific and random error variance are low (intercorrelations among items are high),28 and (c) factor $f_k$ for construct k is independent of factor $f_j$ for construct j. Criteria (a) and (b) are required for "convergent validity." Criterion (c) is "discriminant validity."
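As a concrete illustration of criteria (a)-(c), the following sketch (Python with numpy; loadings, noise levels, and sample size are invented for this example) generates item scores from equation (1) for two independent constructs and checks that correlations are high within clusters and near zero across clusters.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000                                   # hypothetical respondents

# Two independent latent constructs f1 and f2, as in equation (1) with K = 2.
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)

def items(f, loadings, noise_sd):
    """Generate item scores T_j = lambda_j * f + eps_j (means set to zero)."""
    lam = np.array(loadings)
    eps = rng.standard_normal((len(f), len(lam))) * noise_sd
    return f[:, None] * lam + eps

cluster1 = items(f1, [0.9, 0.8, 0.7], noise_sd=0.5)   # proxies for f1
cluster2 = items(f2, [0.8, 0.8, 0.6], noise_sd=0.5)   # proxies for f2

corr = np.corrcoef(np.hstack([cluster1, cluster2]), rowvar=False)
within1 = corr[:3, :3][np.triu_indices(3, k=1)].mean()   # convergent validity, cluster 1
within2 = corr[3:, 3:][np.triu_indices(3, k=1)].mean()   # convergent validity, cluster 2
across = np.abs(corr[:3, 3:]).mean()                     # discriminant validity

print(f"mean within-cluster r: {within1:.2f}, {within2:.2f}")  # high
print(f"mean across-cluster |r|: {across:.2f}")                # near zero
```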

The standard approach to defining constructs in personality psychology is based on factor analysis. It takes a set of measurements designed to capture a construct arrived at through intuitive considerations and conventions, and measures within-cluster and across-cluster correlations of the measurements to isolate latent factors $f_k$, k = 1,…,K. The measurements and clusters of tests are selected on intuitive or a priori grounds, and not on the basis of any predictive validity in terms of real outcomes (e.g., success in college, performance on the job, earnings). Recall that this process gave rise to the taxonomy of traits that became the Big Five. Because of the a priori basis of these taxonomies, there is some controversy in psychology about competing construct systems. In practice, as we document below, the requirement of independence of the latent factors across constructs (lack of correlation of tests across clusters) is not easily satisfied.29 This fuels controversy among competing taxonomies.

28. Cronbach's α is a widely used measure of intercorrelation among test scores, i.e., a measure of the importance of the variance of the $\varepsilon_j^{(k)}$ relative to the variance of $f_k$.
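As a companion to footnote 28, here is a minimal sketch (Python with numpy; the five-item scale is hypothetical) of the standard computation of Cronbach's α from a matrix of item scores, α = (J/(J−1))(1 − Σ_j var(item_j)/var(total)).

```python
import numpy as np

def cronbach_alpha(items):
    """items: an (n_respondents, n_items) array of scores for one construct."""
    j = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the summed scale
    return (j / (j - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item scale: items share one factor, so alpha is high.
rng = np.random.default_rng(1)
f = rng.standard_normal(2000)
scale = f[:, None] * 0.8 + rng.standard_normal((2000, 5)) * 0.6
print(round(cronbach_alpha(scale), 2))             # roughly 0.9 for these settings
```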

An alternative approach to constructing measurement systems is based on the predictive power of the tests for real-world outcomes. The Hogan Personality Inventory,30 the California Personality Inventory, and the Minnesota Multiphasic Personality Inventory were all developed with the specific purpose of predicting real-world outcomes. Decisions to retain or drop items during the development of these inventories were based, at least in part, upon the ability of items to predict outcomes. This approach has a concreteness about it that appeals to economists. Instead of appealing to abstract a priori notions about domains of personality and latent factors, it anchors measurements in tangible outcomes and constructs explicit tests with predictive power. Yet this approach has its own problems. All measurements of factor $f_k$ can claim incremental predictive validity as long as each measurement is subject to error ($\varepsilon_j^{(k)} \neq 0$). Proxies for $f_k$ are sometimes interpreted as separate determinants (or "causes") instead of as surrogates for an underlying one-dimensional construct or factor.

To state this point more precisely, suppose that model (1) is correct and that a set of test scores display both convergent and discriminant validity. As long as there are measurement errors for construct k, there is no limit to the number of proxies for $f_k$ that will show up as statistically significant predictors of an outcome, say $Y_n$. This is a standard result in the econometrics of measurement error.31 If some outcome $Y_n$ is predicted by $f_k$, and there are multiple mismeasured proxies for $f_k$, those proxies can be predictive for $Y_n$ even though only one factor predicts the outcome.32 Thus, many "significant" predictors of an outcome will be found that are all proxies for a single latent construct. Using a predictive criterion to determine new facets of personality leads to a proliferation of competing measures, as is found in the psychological studies that we survey below.

29. Indeed, as documented in Cunha and Heckman (2007a), the factors associated with personality are also correlated with the cognitive factors.

31. See, e.g., Bound, Brown and Mathiowetz (2001). See also the notes on ability bias posted at the website for this paper in Web Appendix C.

32 Thus, if $Y_n = \alpha + \beta f_k + U_n$, $U_n \perp\!\!\!\perp f_k$, and we use $L$ X-adjusted proxies for $f_k$, $\tilde{T}_j^{(k)} = T_j^{(k)} - \mu_j^{(k)}(X)$, $j = 1, \dots, L$, we obtain an equation $Y_n = \alpha + \sum_{j=1}^{L} \gamma_j \tilde{T}_j^{(k)} + \tilde{U}_n$, where $\tilde{U}_n = \beta f_k + U_n$ and where the true values of the $\gamma_j$ are all zero, because the test scores do not determine $Y_n$ but $f_k$ does determine it (unless $\beta = 0$). Straightforward calculations show that, under general conditions, if one arrays the $\gamma_j$ in a vector of length $L$, $\gamma^k$, and the X-adjusted test scores in a vector of length $L$, denoted $T$, the OLS estimator $\hat{\gamma}^k = \left[\operatorname{Cov}(T, T)\right]^{-1} \operatorname{Cov}(T, Y_n)$ converges in large samples to
$$
\operatorname{plim} \hat{\gamma}^k =
\begin{pmatrix}
\left(\lambda_1^{(k)}\right)^2 \sigma_{f_k}^2 + \sigma_{\varepsilon_1^{(k)}}^2 & \lambda_1^{(k)} \lambda_2^{(k)} \sigma_{f_k}^2 & \cdots & \lambda_1^{(k)} \lambda_L^{(k)} \sigma_{f_k}^2 \\
\lambda_2^{(k)} \lambda_1^{(k)} \sigma_{f_k}^2 & \left(\lambda_2^{(k)}\right)^2 \sigma_{f_k}^2 + \sigma_{\varepsilon_2^{(k)}}^2 & \cdots & \lambda_2^{(k)} \lambda_L^{(k)} \sigma_{f_k}^2 \\
\vdots & \vdots & \ddots & \vdots \\
\lambda_L^{(k)} \lambda_1^{(k)} \sigma_{f_k}^2 & \lambda_L^{(k)} \lambda_2^{(k)} \sigma_{f_k}^2 & \cdots & \left(\lambda_L^{(k)}\right)^2 \sigma_{f_k}^2 + \sigma_{\varepsilon_L^{(k)}}^2
\end{pmatrix}^{-1}
\beta \sigma_{f_k}^2
\begin{pmatrix}
\lambda_1^{(k)} \\ \lambda_2^{(k)} \\ \vdots \\ \lambda_L^{(k)}
\end{pmatrix}.
$$
Assuming $\sigma_{f_k}^2 > 0$, $\beta > 0$, and $\sigma_{\varepsilon_j^{(k)}}^2 > 0$ for all $j = 1, \dots, L$, any erroneous predictor of $f_k$ will be statistically significant in a large enough sample. Using a purely predictive criterion produces a proliferation of "significant" predictors for outcome $Y_n$. If one test is measured without error, then the erroneously measured tests will not be statistically significant in large samples if the perfect measurement is included among the regressors in the prediction equation. Cunha and Heckman (2007a) show that estimated measurement errors in both cognitive and noncognitive tests are important, so the problem of proxy proliferation is serious.
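The algebra in footnote 32 is easy to check numerically. The sketch below simulates one latent factor, several error-ridden proxies, and an outcome determined only by the factor; the sample size, loadings, and error variances are invented values of ours, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, L = 100_000, 4                        # large sample, four mismeasured proxies
beta, lam = 0.5, np.array([1.0, 0.9, 0.8, 0.7])

f_k = rng.normal(size=n)                                       # single latent factor
T = lam * f_k[:, None] + rng.normal(scale=0.8, size=(n, L))    # proxies with error
Y = 0.2 + beta * f_k + rng.normal(size=n)                      # only f_k determines Y

X = np.column_stack([np.ones(n), T])     # OLS of Y on an intercept and all proxies
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ coef
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
print(np.round(coef[1:] / se[1:], 1))    # t-statistics of the proxies
```

In line with footnote 32, every proxy attracts a t-statistic far above conventional critical values even though none of them enters the outcome equation directly.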


If, in addition to these considerations, the measures fail the test of discriminant validity, the predictors for one outcome may proxy both fk and fj (k ≠ j) if fk and fj are correlated. We have already presented evidence that IQ tests proxy both cognitive and noncognitive factors. A purely predictive criterion fails to distinguish predictors for k clusters (the fk) from predictors for j clusters (the fj). Thus, items in one cluster can be predictive of outcomes more properly allocated to another cluster.
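The cross-cluster point can be illustrated with a toy simulation (ours, with invented parameter values): when the latent factors of two clusters are correlated, a noisy proxy taken from cluster j "predicts" an outcome generated solely by fk.

```python
import numpy as np

rng = np.random.default_rng(2)
n, rho = 50_000, 0.5                              # correlation between the two latent factors
cov = np.array([[1.0, rho], [rho, 1.0]])
f = rng.multivariate_normal([0.0, 0.0], cov, size=n)
f_k, f_j = f[:, 0], f[:, 1]

Y = 0.6 * f_k + rng.normal(size=n)                # outcome driven only by f_k
proxy_j = f_j + rng.normal(scale=0.5, size=n)     # a noisy measurement from cluster j

# Simple regression of Y on the cluster-j proxy: the slope is positive
# even though f_j plays no causal role in Y.
slope = np.cov(proxy_j, Y)[0, 1] / proxy_j.var(ddof=1)
print(round(slope, 3))
```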

In addition to the arbitrariness of test constructs and the proliferation of proxies for the same construct, there is the further problem of reverse causality. A test score may predict an outcome because the outcome affects the test score. Hansen, Heckman and Mullen (2004) develop a model where, in factor representation (1), the outcome related to cluster (factor) k, say Yn, is an element of X and determines $\mu_j^{(k)}(X)$ and $\lambda_j^{(k)}(X)$. In addition, they allow (factor) fk to be a determinant of Yn. They establish conditions under which it is possible to identify causal effects of fk on Yn when the proxies for fk suffer from the problem of reverse causality because $\lambda_j^{(k)}(X)$ and $\mu_j^{(k)}(X)$ include Yn among the X. They establish that cognitive ability tests (e.g., the AFQT) are substantially affected by schooling levels at the date the test is taken.

To understand how their method works at an intuitive level, consider the effect of schooling on measured test scores. Schooling attainment likely depends on true or "latent" ability, fk, in equation (1). At the same time, the measured test score depends on schooling attained, so schooling is a component of X in equation (1) (i.e., it affects $\mu_j^{(k)}(X)$ or $\lambda_j^{(k)}(X)$, or both). Hansen, Heckman, and Mullen (2004) assume access to longitudinal data that randomly samples


the population of adolescents in school where the adolescents are given an exam at a fixed point in time. At the time people are given the test, they are at different schooling levels, because the longitudinal sample includes people of different ages and schooling levels.33 Following people over time, we can determine final schooling levels, which likely depend, in part, on latent

ability fk. People who attain the same final schooling level are at different schooling levels at the date of the test. These schooling levels at the date of the test are random with respect to fk

because the sampling rule is random across ages at a point in time. Thus, conditioning on final schooling attained, which depends on fk, schooling at the date of the test is randomly assigned. One can therefore identify the causal effect of schooling on test scores even though final schooling depends on an unobserved latent ability. See Hansen, Heckman, and Mullen (2004) for

additional details on this method and alternative identification strategies. The basic idea of their procedure is to model the dependence between X and fk in equation (2) and to solve the problem of reverse causality using a model.
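A deliberately simplified simulation may help fix the intuition. It is not the estimator of Hansen, Heckman, and Mullen, which is a structural factor model; it only mimics their sampling argument with invented numbers: final schooling depends on latent ability, schooling at the test date is random given final schooling, and the test score depends on both.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
ability = rng.normal(size=n)                              # latent ability f_k
final_school = np.clip(np.round(12 + 2 * ability + rng.normal(size=n)), 9, 16)
# Random sampling across ages: schooling at the test date is uniform on {9, ..., final}.
school_at_test = np.floor(rng.uniform(9, final_school + 1))
score = 1.0 * school_at_test + 2.0 * ability + rng.normal(size=n)   # true effect = 1.0

def ols_slope(x, y):
    return np.cov(x, y)[0, 1] / x.var(ddof=1)

print(round(ols_slope(school_at_test, score), 2))         # biased upward by ability

# Within-group (fixed-effect) slope, conditioning on final schooling attained:
num = den = 0.0
for s in np.unique(final_school):
    m = final_school == s
    x, y = school_at_test[m], score[m]
    num += ((x - x.mean()) * (y - y.mean())).sum()
    den += ((x - x.mean()) ** 2).sum()
print(round(num / den, 2))                                 # close to the true effect of 1.0
```

Conditioning on final schooling removes the ability bias in the schooling coefficient, which is the essence of the identification argument sketched above.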

Reverse causality is an especially serious concern when interpreting correlations between personality measurements and outcomes. For example, self-esteem might increase income, and income might increase self-esteem.34 Measuring personality prior to measuring predicted outcomes does not necessarily obviate this problem. For example, the anticipation of a future pay raise may increase present self-esteem. Heckman, Stixrud and Urzua (2006) and Urzua (2007) demonstrate the importance of correcting for reverse causality in interpreting the effects of personality tests on a variety of socioeconomic outcomes. Econometric techniques for determining the causal effects of factors on outcomes could make a distinctive contribution to psychology. Most psychologists (and many economists) focus on prediction, not causality. Establishing predictive validity may be enough to achieve the goal of making good hiring decisions. For policy analysis, causal models are needed (see Heckman and Vytlacil, 2007).

33 These conditions characterize the NLSY79 data they use in their empirical work.

34 In addition to the problem of reverse causality, there is the standard problem that a third variable, such as education, might increase both self-esteem and income (spurious correlation).

The analyses of Heckman, Stixrud and Urzua (2006), Urzua (2007), and Cunha and Heckman (2007a) provide frameworks for circumventing the problems that arise from using predictive validity alone to define and measure personality constructs. These frameworks recognize the problem of measurement error in the proxies for constructs. Constructs are created on the basis of how well latent factors predict outcomes. They develop a framework for testing discriminant validity because they allow the factors across different clusters of constructs to be correlated and can test for correlations across the factors.

They use an extension of factor analysis to represent proxies in terms of low-dimensional factors. Using standard methods, they can test for the number of latent factors required to fit the data and rationalize the proxies.35 Generalizing the analysis of Hansen, Heckman, and Mullen (2004), they allow lifetime experiences and investments to determine, in part, the coefficients of the factor model. They correct estimates of the effects of the factors on outcomes for spurious feedback, and they separate proxies from factors. The factors are permitted to be dynamic over the life cycle, i.e., they can change as a consequence of experience and investment.36
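As one concrete illustration of what such a test can look like in practice, the sketch below cross-validates the held-out log-likelihood of Gaussian factor models with different numbers of factors on simulated proxies. This is our own illustrative choice of method, not the authors' procedure (for the classical tests and Bayesian alternatives they have in mind, see footnote 35).

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n, n_tests, true_k = 1_000, 10, 2
loadings = rng.normal(size=(n_tests, true_k))
factors = rng.normal(size=(n, true_k))
X = factors @ loadings.T + rng.normal(scale=0.7, size=(n, n_tests))   # proxies of 2 factors

for k in range(1, 5):
    fa = FactorAnalysis(n_components=k)
    # cross_val_score uses FactorAnalysis.score(), the average held-out log-likelihood.
    ll = cross_val_score(fa, X, cv=5).mean()
    print(k, round(ll, 2))   # the fit typically stops improving near the true k = 2
```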

35 For example, Cragg and Donald (1997) present classical statistical methods for determining the number of factors. In addition to their techniques, there are methods based on Bayesian posterior odds ratios.

36 Hogan and Hogan (2007) use a version of this procedure. They appear to be unique in this regard among personality psychologists. However, in psychometrics, there is a long tradition of doing predictive analysis based on factor analysis (see, e.g., the essays in Cudeck and MacCallum, 2007), but there is no treatment of the problem of reverse causality as analyzed by Hansen, Heckman and Mullen (2004).

Measurements of latent factors may also be corrupted by "faking." There are two types of socially desirable responses: impression management and self-deception (Paulhus, 1984). For example, individuals who know that their responses on a personality questionnaire will be used to make hiring decisions may deliberately exaggerate their strengths and downplay their weaknesses (Ones & Viswesvaran, 1998; Viswesvaran & Ones, 1999). Subconscious motives to see themselves as virtuous may produce the same faking behavior, even when responses are anonymous. It is, of course, possible to fake conscientiousness on a self-report questionnaire, whereas it is impossible to fake superior reasoning ability on an IQ test. Still, to a lesser degree, a similar bias may also operate in cognitive tests: persons who know that their test scores will affect personnel or admissions decisions may try harder. The effects of faking on predictive validity have been well studied by psychologists, who conclude that distortions have surprisingly minimal effects on the prediction of job performance (Hough et al., 1990; Hough & Ones, 2002; Ones & Viswesvaran, 1998). Correcting for faking using scales designed to measure deliberate lying does not seem to improve predictive validity (Morgeson et al., 2007). When measuring cognitive and noncognitive traits, therefore, one should ideally standardize, or at least control for, incentives.

A. A Benchmark Definition of Traits

The evidence summarized to this point suggests an important distinction. Measurements or manifestations of a trait ($M_j^k$) need to be distinguished from the "true" trait, fk. Measurements may be more general than test scores, although they may include test scores. The manifestation (or phenotype) of a trait will depend on context, incentives, the motivation of agents, and possibly other traits. We previously discussed the evidence that motivation (either intrinsic or extrinsic) affects performance on tests: more motivated people do better. This is likely also true for measurements of noncognitive traits. Part of the context-dependent manifestation of personality traits (Mischel, 1968) is likely due to such effects.


We present formal models of this phenomenon in the next section, but the intuitive idea is clear. Let $P_k$ (a "price") represent the extrinsic incentives facing the agent for manifestations of trait k. If a personality manifestation is rewarded in a given situation, agents are more likely to exhibit it. (Thus, an intrinsically shy person may become aggressive when her life or livelihood depends on it.) Let $f_{\sim k}$ be the other traits (which could include traits characterizing pure intellect if k is not itself intellect). Let $W$ capture other aspects of motivation and values and the like, accumulated experience, and other aspects of the individual, including aspects of context not captured in $P_k$.

Measured or manifest traits are imperfect proxies for the true traits:

(2)   $M_j^k = \varphi_{j,k}\!\left(f_k,\, f_{\sim k},\, P_k,\, W\right), \qquad j = 1, \dots, J_k, \quad k = 1, \dots, K.$

There may be threshold effects in all variables, so that the $\varphi_{j,k}$ function allows for jumps in manifest traits as the arguments in (2) are chosen. It is only meaningful to define measurements on $f_k$ at a benchmark level of $P_k$, $f_{\sim k}$, and $W$. Define the benchmark values as $\bar{P}_k$, $\bar{f}_{\sim k}$, $\bar{W}$. Then, for one j for each k, we can define $f_k$ at benchmark values as

$M_j^k = f_k \quad \text{for } P_k = \bar{P}_k,\; f_{\sim k} = \bar{f}_{\sim k},\; W = \bar{W}, \qquad j = 1, \dots, J_k, \quad k = 1, \dots, K.$

This produces a normalization of the function $\varphi_{j,k}$ for each k and an operational definition of the latent trait in terms of a standardized environment and incentives. At issue is the stability of the manifest traits across situations (values of the arguments of (2)). Some would say they are totally unstable. Others would claim a great deal of stability. We present evidence on these questions for some traits.
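A toy numerical sketch of equation (2) may make the benchmarking idea concrete. The functional form, coefficients, and threshold below are entirely hypothetical choices of ours, not a specification from the paper.

```python
def manifest_trait(f_k, f_other, p_k, w, threshold=1.0):
    """A hypothetical phi_{j,k}: the measured trait responds to incentives p_k,
    other traits f_other, and context w, with a jump once incentives pass a threshold."""
    jump = 0.5 if p_k > threshold else 0.0     # threshold effect in the incentive
    return f_k + 0.3 * p_k + 0.2 * f_other + 0.1 * w + jump

# At the benchmark values (here taken to be zero), the measurement equals the latent trait.
f_k = 0.8
print(manifest_trait(f_k, f_other=0.0, p_k=0.0, w=0.0))   # 0.8: recovers f_k
# Away from the benchmark, the same latent trait manifests differently.
print(manifest_trait(f_k, f_other=0.5, p_k=1.5, w=1.0))   # a larger manifest value
```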

Equation (2) accounts for a diversity of measurements on the same latent trait, and

recognizes the evidence that context, incentives, other traits and motivation affect manifest traits. It is sufficiently flexible to capture the notion that at a high enough level of certain traits,

References
