A Multitrait–Multisource Confirmatory Factor Analytic Approach to the Construct Validity of ADHD Rating Scales


Rapson Gomez

University of Ballarat

G. Leonard Burns

Washington State University

James A. Walsh

University of Montana

Marcela Alves de Moura

Washington State University

Confirmatory factor analysis was used to model a multitrait–multisource design to evaluate the construct validity of attention-deficit/hyperactivity disorder (ADHD) rating scales. The 2 trait factors were the ADHD inattention and hyperactivity/impulsivity dimensions. The 2 source factors were parents and teachers. In Study 1, parents and teachers rated 1,475 Australian elementary school children on the ADHD symptoms. In Study 2, parents and teachers rated 285 Brazilian elementary school children on the ADHD symptoms. Similar results occurred in both studies with most of the ADHD symptoms containing more source than trait variance, thus providing weak evidence for the convergent and discriminant validity of the symptoms as measured by rating scales. The study outlines the implications of such strong source effects for understanding ADHD.

Recent studies have used confirmatory factor analysis (CFA) to evaluate the structural organization of parent and teacher ratings of the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM–IV; American Psychiatric Association, 1994) attention-deficit/hyperactivity disorder (ADHD) symptoms (e.g., Burns, Boe, Walsh, Sommers-Flanagan, & Teegarden, 2001; DuPaul et al., 1998; Gomez, Harvey, Quick, Scharer, & Harris, 1999). These studies have consistently found that a two-factor model consisting of inattention (IN) and hyperactivity/impulsivity (H/I) provides a better model for the organization of parent and teacher ratings of these symptoms than a one-factor model, thus supporting the current view of ADHD as consisting of two distinct symptom clusters (i.e., IN and H/I symptoms; American Psychiatric Association, 2000).

These studies have also found high correlations between the ADHD-IN and ADHD-H/I factors. In studies by Burns and colleagues (Burns, Walsh, Owen, & Snell, 1997; Burns, Walsh, Patterson, et al., 1997; Burns et al., 2001), the correlation between the ADHD-IN and ADHD-H/I factors ranged from .68 to .80. DuPaul et al. (1997, 1998), as another example, found a correlation of .94 for teacher ratings and .92 for parent ratings. In a similar study in Australia, Gomez et al. (1999) found correlations of .68 for teachers and .75 for parents, and a study with American Indian children reported a correlation of .68 for teachers and .87 for parents (Beiser, Dion, & Gotowiec, 2000).

There are several possible reasons for the high correlation between the ADHD-IN and ADHD-H/I symptom clusters (Burns & Walsh, 2002). The strong relation could be due to the two symptom clusters sharing common risk factors. A second possibility is that some of the ADHD symptoms could have weak discriminant validity (i.e., a symptom correlating with its own cluster at the same level as it correlates with the other cluster). Such weak discriminant validity would inflate the correlation between the two symptom clusters (Burns, Walsh, Patterson, et al., 1997; Crystal, Ostrander, Chen, & August, 2001, p. 202). A third possibility is that the high correlation is due to the use of a single informant. In all the above CFA studies, the results are based on a single source (i.e., parent or teacher ratings of the children’s behavior). If parent and teacher ADHD rating scales contain strong source effects, then the use of a single source could be the reason for the high correlation between the IN and H/I factors.

Convergent and Discriminant Validity of the ADHD Symptoms

To understand the IN and H/I symptom clusters (e.g., to determine the causes, associated features, and response to treatment of the two clusters), it is first important to know if the two symptom clusters have strong convergent and discriminant validity. The convergent and discriminant validity of the symptoms, however, cannot be determined with a single source because of the potential problem of source effects. Given the important role of parent and teacher ADHD rating scales in research on ADHD, it is critical to know the amount of trait, source, and error variance in these scales to understand the current research findings and to improve research questions. The use of CFA to model a multitrait (IN and H/I)–multisource (parents and teachers) design provides a way to determine the amount of trait, source, and error variance in the ADHD symptoms. Prior to describing this procedure in more detail, we first define trait and source effects.

Rapson Gomez, School of Behavioural and Social Sciences and Humanities, University of Ballarat, Ballarat, Victoria, Australia; G. Leonard Burns and Marcela Alves de Moura, Department of Psychology, Washington State University; James A. Walsh, Department of Psychology, University of Montana.

We thank the school principal, Iuri Carvalho dos Santos, as well as all the teachers, students, and students’ parents from the Instituto de Educação Assis Brasil for their help with the Brazilian study. We also thank Tina Foschini-Miller for assistance with the translation of the measure to Portuguese. Finally, Marcela Alves de Moura thanks the National Council for Scientific and Technologic Development-CNPq for a graduate fellowship from the Brazilian government.

Correspondence concerning this article should be addressed to G. Leonard Burns, Department of Psychology, Washington State University, Pullman, Washington 99164-4820. E-mail: [email protected]

Trait and Source Effects in ADHD Rating Scales

Trait effects refer to systematic variance that generalizes across sources within the same situation (e.g., mothers and fathers, teachers and teachers’ aides) or across different situations (e.g., parents and teachers). Trait effects are independent of the specific source, or, stated differently, trait effects represent behavior viewed in a similar manner by different sources. Strong trait effects suggest that the child’s behavior generalizes across sources (Rowe & Kandel, 1997). And, of course, traits are anchored to a specific set of behaviors (e.g., the nine IN symptoms and the nine H/I symptoms) chosen and assessed via items to represent some larger domain of clinical or more general relevance. It is also usually assumed that the behaviors that index the trait are the same for different sources (e.g., the same IN and H/I symptoms can be used with parent and teacher sources).

Source effects refer to systematic variance that is specific to a certain source (e.g., parents, teachers, peers, self). Source effects have traditionally been considered a form of bias associated with characteristics of the rater (e.g., halo effects, projection bias, response bias; see Fiske, 1987, for a more complete discussion of method effects, with source effects being an aspect of method effects). Within this view, source effects are considered problematic because such effects may distort or bias the relations among constructs (Kenny & Kashy, 1992). In other words, if researchers are to understand the true relations among constructs within the field of child psychopathology (e.g., the true relation between the IN and H/I symptom clusters), source effects must be separated from trait effects because it is the relations among the trait effects that will advance understanding of child psychopathology.

An alternative view considers source effects to reflect real differences in the children’s behavior across sources (e.g., the child displays oppositional behavior in the presence of the mother but not in the presence of the teacher). Within this view, source effects represent “unique [systematic and reliable] variance associated with cross-situational differences in children’s behavior as observed by different informants” (Greenbaum, Dedrick, Prange, & Friedman, 1994, p. 145). This view is associated with the idea that source effects represent true differences in children’s behavior between sources (i.e., source effects as theoretically meaningful variance important for theory development; Dishion, Burraston, & Li, in press).1 We return to these different views of source effects later in this article.

CFA to Model a Multitrait–Multisource Design

Researchers have noted for some time the construct validation problems associated with the use of a single source to investigate the relations among constructs (Campbell & Fiske, 1959). In a classic paper, Campbell and Fiske (1959) proposed the multitrait–multimethod (MT-MM) design to address this problem. The MT-MM design has become a critical aspect of the construct validation process because it allows for the simultaneous analysis of convergent and discriminant validity as well as method effects (Lance, Noble, & Scullen, 2002). Although Campbell and Fiske (1959) proposed several qualitative decision rules for the evaluation of MT-MM data, it was the application of CFA to MT-MM matrices that provided quantitative procedures to test the convergent and discriminant validity of measures as well as to determine the amount of trait, source, and error variance in each measure (Lance et al., 2002).

Currently there are two main CFA approaches to model MT-MM data: the correlated trait–correlated method (CT-CM) approach and the correlated uniqueness (CU) approach. The CU approach has been considered the more appropriate choice because of the higher likelihood of convergence problems and inadmissible solutions associated with the CT-CM approach (e.g., Kenny & Kashy, 1992). Recently, however, Lance et al. (2002) described a series of theoretical and substantive shortcomings associated with the CU approach. One weakness of the CU approach occurs when there are correlations among the methods or sources. This is a problem because the CU approach assumes that method or source effects are orthogonal (e.g., parent, teacher, and child self-ratings are not correlated; see Kenny & Kashy, 1992, pp. 169–170). When the data do not support this assumption, the correlations among the sources are absorbed into the trait factors under the CU approach. This artificially inflates the convergent validity (i.e., increases the trait variance) and the correlations among the traits, and thus artificially reduces the discriminant validity. Lance et al. (2002, pp. 232–233) provided a demonstration of this problem (see Crystal et al., 2001, Figure 1, for correlations among childhood behavior problems that may be artificially inflated because of the use of the CU approach). Lance et al.’s (2002) recommendation was “that the CT-CM model be regarded as the generally preferred model and that the CU model be invoked only when the CT-CM model fails” (p. 228). Our analyses thus used the CT-CM approach (see Lance et al., 2002, Table 6, for a more detailed discussion of the merits of the CT-CM approach).

1 Although our focus in this study is on trait and source effects in CFA procedures, the issue of source effects is also relevant to simple correlations among ratings of child behavior problems across various sources (e.g., Achenbach, McConaughy, & Howell, 1987). In this meta-analysis of 269 samples in 119 studies, Achenbach et al. reported a mean correlation of .50 between similar sources such as parents, a mean correlation of .28 between different sources such as parents and teachers, and a mean correlation of .22 between children’s self-ratings and other sources such as parents. As these results indicate, the correlation between sources’ ratings of child behavior problems is low when the sources rate the child in different situations such as home and school (i.e., each source has a somewhat unique view of the child, and the child’s behavior may vary from the school to the home situation). Although these findings suggest the presence of strong source and context effects, it is necessary to use CFA procedures to model multitrait–multisource designs to determine the amount of trait, source, and error variance in ratings of child behavior problems. For example, given a correlation of .60 between mothers’ and fathers’ ratings of disruptive behavior problems, it is impossible to know how much of this correlation is due to source versus trait effects.


Although earlier studies have used the CFA procedures to analyze MT-MM designs to better understand child behavior disorders (e.g., Fergusson & Horwood, 1993; Greenbaum et al., 1994; Rowe & Kandel, 1997), no study to our knowledge has used these procedures to determine the amount of trait, source, and error variance in the DSM–IV ADHD-IN and H/I symptoms. Such an analysis is important because it provides information on the construct validity of the individual ADHD symptoms (i.e., the amount of trait, source, and error variance in each symptom), the relation of the IN and H/I traits (i.e., the discriminant validity of the traits), and the relation of the two or more methods (i.e., the discriminant validity of methods). Given that our studies used multiple sources (parents and teachers) and a single method (same rating scale), we refer to this as a multitrait–multisource (MT-MS) design to indicate there are multiple sources and a single method, although the more common name is multitrait–multimethod (Campbell & Fiske, 1959).

The first goal of our study was to determine the amount of variance in the ADHD symptoms due to trait, source, and error effects.2 To accomplish this goal, we used correlated trait–correlated source CFA procedures to model multiple traits (i.e., ADHD-IN and ADHD-H/I) assessed by multiple sources (i.e., parents and teachers). Figure 1 shows our postulated model for the ADHD symptoms. This model involves two latent trait factors (ADHD-IN and ADHD-H/I) and two latent source factors (parents and teachers). There are 36 manifest variables: 18 ADHD symptoms for parents and 18 ADHD symptoms for teachers.3
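As a hedged sketch of the kind of model described in this paragraph, and not a specification taken from the original analyses, a correlated trait–correlated source model is conventionally written so that each standardized symptom rating loads on exactly one trait factor and one source factor. The notation below (x, λ, T, S, e, θ, and the indices i, j, k) is introduced here for illustration only. It assumes unit factor variances, traits that may correlate with each other, sources that may correlate with each other, and no correlation between trait and source factors.

```latex
% Symptom k of trait i (IN or H/I) as rated by source j (parent or teacher)
x_{ijk} = \lambda^{T}_{ijk}\, T_i + \lambda^{S}_{ijk}\, S_j + e_{ijk}

% With standardized variables and trait factors uncorrelated with source factors,
% the variance of each symptom splits into trait, source, and error components:
\operatorname{Var}(x_{ijk})
  = \underbrace{\bigl(\lambda^{T}_{ijk}\bigr)^{2}}_{\text{trait}}
  + \underbrace{\bigl(\lambda^{S}_{ijk}\bigr)^{2}}_{\text{source}}
  + \underbrace{\theta_{ijk}}_{\text{error}}
  = 1
```

Under these assumptions, the squared standardized loadings can be read directly as the proportions of trait and source variance in a symptom.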

The ideal outcomes are for the two trait factors to be minimally correlated with each other (discriminant validity of traits), for the two source factors to be moderately (e.g., .50) to minimally correlated with each other (discriminant validity of sources),4 for each symptom to have a significant and substantial loading on the appropriate trait factor (convergent validity of symptoms), and for each symptom to have a lower loading on the source factor than on the trait factor. Such ideal outcomes would provide strong evidence for the construct validity of the ADHD symptoms as measured by parent and teacher rating scales (i.e., strong convergent and discriminant validity with more trait than source variance in each symptom).
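The criterion of "more trait than source variance in each symptom" can be illustrated with a minimal sketch. It assumes standardized loadings and trait factors that are uncorrelated with source factors, so that squared loadings are variance proportions. The function names and the loading values are hypothetical and are not results from either sample.

```python
def variance_components(trait_loading: float, source_loading: float) -> dict:
    """Split a standardized symptom's variance into trait, source, and error parts.

    Assumes trait and source factors are uncorrelated, so the squared
    standardized loadings are variance proportions.
    """
    trait = trait_loading ** 2
    source = source_loading ** 2
    error = 1.0 - trait - source
    if error < 0:
        raise ValueError("loadings imply more than 100% explained variance")
    return {"trait": trait, "source": source, "error": error}


def has_more_trait_than_source(trait_loading: float, source_loading: float) -> bool:
    """Return True when the symptom carries more trait than source variance."""
    parts = variance_components(trait_loading, source_loading)
    return parts["trait"] > parts["source"]


# Hypothetical loadings, chosen only to illustrate the arithmetic.
example = variance_components(trait_loading=0.5, source_loading=0.7)
print({name: round(value, 2) for name, value in example.items()})
print(has_more_trait_than_source(0.5, 0.7))
```

In this hypothetical case the symptom would fall short of the ideal outcome, because its source component exceeds its trait component.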

In contrast to these ideal outcomes for the parent and teacher rating scales, the absence of trait variance in conjunction with high levels of source variance for each symptom would require a reconsideration of what is measured by the scales. In this less than ideal outcome, there would be no shared (trait) view, only the unique view of each source. Because source effects can involve two different aspects, such results would suggest three possibilities. One possibility would be that each source contained a great deal of bias (measurement artifact; see Fiske, 1987). The second possibility would be that the child’s behavior is specific to each source (i.e., each source has an accurate view of the children’s behavior because their behavior is source-specific). A third possibility, and probably the most correct, would be that strong source effects represent a mixture of bias and accuracy.5

2 Error variance consists of two different aspects: residual systematic variance specific to a given measure and nonsystematic effects specific to a given measure (e.g., measurement error; see Lance et al., 2002, p. 228). The error variance associated with each ADHD symptom consists of these two aspects.

3 To use CFA procedures to model an MT-MS matrix, the recommendation has been for a minimum of three traits and three sources in order to have an overidentified model because a model with only two traits and two sources represents an underidentified model (i.e., a model with negative degrees of freedom; see Lance et al., 2002, Table 4). Although our model involves two traits and two sources, it is an overidentified model because there are 34 manifest variables for the Australian sample (491 degrees of freedom) and 32 manifest variables for the Brazilian sample (430 degrees of freedom). The typical two trait–two source design involves only 4 manifest variables, which is why the usual two trait–two source design represents an underidentified model (−4 degrees of freedom). The Results section explains why there are 34 manifest variables for the Australian sample and 32 manifest variables for the Brazilian sample rather than 36 for each sample.
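The degrees of freedom quoted in Footnote 3 can be checked with a short calculation. The sketch below assumes a parameterization that the text does not spell out: factor variances fixed to 1, one trait loading, one source loading, and one error variance per manifest variable, a free correlation for each pair of traits and each pair of sources, and no trait–source correlations. The function name is hypothetical. Under these assumptions the count reproduces the reported values.

```python
def ctcm_degrees_of_freedom(n_manifest: int, n_traits: int = 2, n_sources: int = 2) -> int:
    """Degrees of freedom for a correlated trait-correlated source CFA model.

    Assumed parameterization (not stated explicitly in the article):
    factor variances fixed to 1; each manifest variable has one trait
    loading, one source loading, and one error variance; traits correlate
    freely with traits and sources with sources, but not across the two.
    """
    # Unique variances and covariances among the manifest variables.
    observed_moments = n_manifest * (n_manifest + 1) // 2
    loadings = 2 * n_manifest
    error_variances = n_manifest
    trait_correlations = n_traits * (n_traits - 1) // 2
    source_correlations = n_sources * (n_sources - 1) // 2
    free_parameters = (
        loadings + error_variances + trait_correlations + source_correlations
    )
    return observed_moments - free_parameters


for n in (34, 32, 4):
    print(n, ctcm_degrees_of_freedom(n))
```

With 34 manifest variables there are 595 observed moments and 104 free parameters, giving 491 degrees of freedom. With 32 there are 528 moments and 98 parameters, giving 430. With only 4 there are 10 moments and 14 parameters, giving −4.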

4 A moderate correlation (.40 to .50; see Byrne, 1994, p. 136) might be expected between the two source factors because each source uses the same rating scale to evaluate the child’s behavior (i.e., the items on the scale as well as the rating dimension are identical for each source). The identical nature of the scale would be expected to contribute to a relation between the two sources to a certain extent.

5 At this time, it does not appear to be empirically possible to distinguish between these two types of source effects (Greenbaum et al., 1994, pp. 145–146). Future studies, however, might be able to build a case for the source effects being mostly accuracy with an MT (IN and H/I)-MS (parents and teachers) design in conjunction with direct observations of IN and H/I behaviors in the home and school. If the parent and teacher source effects differentially predicted the home and school direct observations and did so more strongly than the trait effects, then such a result would suggest the source effects in the scales represent accuracy specific to the source more than bias.

Figure 1. Heuristic representation of the multitrait–multisource model for the attention-deficit/hyperactivity disorder (ADHD)-inattention (IN) and ADHD-hyperactivity/impulsivity (H/I) trait factors and the parent- and teacher-report source factors. The model contained 36 manifest variables (i.e., 9 ADHD-IN and 9 ADHD-H/I symptoms for parents; 9 ADHD-IN and 9 ADHD-H/I symptoms for teachers). Although not shown in the figure, each manifest variable also has an error component.


The second goal of the study is represented by Figure 2. Here the analyses were repeated at the parcel level (i.e., groups of items are added together to create summary scores). There were three reasons for these analyses. The first reason was that the individual items, especially the H/I items, probably have high levels of skewness and kurtosis, thereby violating the multivariate normality assumption of CFA. The creation of parcels should reduce the skewness and kurtosis and is recommended for this situation (West, Finch, & Curran, 1995). In addition, the use of parcels should decrease the amount of error variance. The decrease in error variance should occur because each item presumably taps some independent component of trait or source variance, and when the items are added to produce the summary (parcel) score, there should be a decrease in error variance for the parcels relative to the individual items. The important question, and the second reason for this analysis, is whether the expected decrease in error variance becomes trait or source variance. Finally, the MT-MS analyses at the parcel level more closely approximate the summary (total) scores on the IN and H/I dimensions used to understand ADHD (i.e., the risk factors, associated features, and treatment effects associated with total scores on the IN and H/I ratings for parents and teachers). Whereas the analyses at the item level provide information on the amount of trait, source, and error variance in each item, the parcel-level analyses provide such information for the summary scores and, because of this, have more direct implications for the research that has used parent and teacher ADHD rating scales to study ADHD.6
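The parcel construction can be sketched as follows. The Figure 2 caption states that Parcel 1 consisted of the five odd-numbered symptoms and Parcel 2 of the four even-numbered symptoms of each nine-item scale, and the text states that items are added together. The function name and the example ratings are hypothetical.

```python
def make_parcels(item_ratings: list) -> tuple:
    """Sum a nine-item symptom scale into two parcels.

    Parcel 1 adds the odd-numbered items (1, 3, 5, 7, 9) and Parcel 2 adds
    the even-numbered items (2, 4, 6, 8), following the Figure 2 caption.
    """
    if len(item_ratings) != 9:
        raise ValueError("expected ratings for exactly nine symptoms")
    parcel_1 = sum(item_ratings[0::2])  # list positions 0, 2, 4, 6, 8
    parcel_2 = sum(item_ratings[1::2])  # list positions 1, 3, 5, 7
    return parcel_1, parcel_2


# Hypothetical ratings for one child on the nine IN symptoms.
in_ratings = [1, 2, 1, 3, 1, 2, 4, 1, 2]
print(make_parcels(in_ratings))  # (9, 8)
```

Applying the same function to the H/I ratings, separately for parents and teachers, would give the eight manifest variables of the parcel-level model.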

The third goal of the study is represented by Figure 3. This analysis added an academic problems construct to the MT-MS design. Although the findings are not always consistent, studies suggest that the ADHD-IN dimension has a stronger relation to academic difficulties than the ADHD-H/I dimension (Milich, Balentine, & Lynam, 2001, pp. 473–474). These studies, however, have not separated trait from source variance in the IN, H/I, and academic problems measures. The use of CFA to model an MT (IN, H/I, and academic problems)-MS (parents and teachers) design allowed us to determine if the relation between the IN and academic problems traits was significantly stronger than the relation between the H/I and academic problems traits without the contamination of the source variance. A more general purpose of this analysis was to exemplify how MT-MS procedures can clarify the relation of the IN and H/I measures with other constructs (e.g., social competence, oppositional defiant disorder, conduct disorder).

Our first sample involved parent and teacher ratings on the ADHD symptoms for 1,475 children from Australia, whereas our second sample involved parent and teacher ratings for 285 children from Brazil. Given the cultural and language differences between the two samples, similar results would strongly argue for the generalizability of the findings, at least for population-based samples (the generalization of the findings to clinical samples with a diagnosis of ADHD would remain an empirical issue).

Method

Participants and Procedure

Australian sample. The participants were the parents and teachers of 1,475 children from 16 elementary schools. This sample included 200 more participants than reported on previously (Gomez et al., 1999).7 The sample consisted of 742 girls and 733 boys with an average age of 8.28 years (SD = 1.80). All schools were selected randomly, and all selected schools agreed to participate in the study. The final sample represented approximately 76% of the children invited to participate. In terms of ethnicity, 89% of the children were of northern European background, 4% southern and eastern European, 2% Asian, and 4% others (e.g., children from South America and Africa). The ethnicity and gender distribution of the children reasonably reflected the overall Australian population (Gomez et al., 1999, p. 267).

With the approval of the schools, the teachers were provided with a sealed envelope for each child in their class. The children were instructed by the teachers to take the envelopes home to their parents. Each envelope contained a letter describing the study (focus on children’s home and school behavior), the consent form, the rating scale, and a return envelope. Teachers completed the measure on the children whose parents returned the scales. Teachers had been interacting with the children for a minimum of 3 months prior to the ratings. Teacher ratings were obtained on approximately 90% of the children with parent ratings, resulting in a total of 1,475 children with complete parent and teacher ratings. Approximately 95% of the parent ratings were completed by the children’s mothers. The parent and teacher ratings were voluntary and anonymous.

6 The analysis in Figure 2 involves eight parcels (eight manifest variables). This results in an overidentified model with 10 degrees of freedom. The analysis used eight parcels because the use of four parcels (two manifest variables for each factor) would have resulted in an underidentified model with negative degrees of freedom.

7 In the original Gomez et al. (1999) study, CFAs were performed separately on the parent and teacher ratings. The original study did not use CFA procedures to model an MT-MS design.

Figure 2. Heuristic representation of the multitrait–multisource model for the attention-deficit/hyperactivity disorder (ADHD)-inattention (IN) and ADHD-hyperactivity/impulsivity (H/I) trait factors and the parent- and teacher-report source factors. IN Parcel 1 consisted of the five odd-numbered IN symptoms, IN Parcel 2 the four even-numbered IN symptoms, H/I Parcel 1 the five odd-numbered H/I symptoms, and H/I Parcel 2 the four even-numbered H/I symptoms. Although not shown in the figure, each manifest variable also has an error component.

Brazilian sample. The participants were the parents and teachers of children in the 1st through 4th grade at the Instituto de Educação Assis Brasil, a state school in the city of Pelotas, Brazil. Pelotas is a southern city in Rio Grande do Sul state with a population of approximately 300,000. At the time of the study, there were 586 students enrolled in Grades 1 through 4. With the approval of the school, teachers were asked to participate in the study. A total of 20 of 21 teachers volunteered to participate in the study (one 2nd-grade teacher declined to participate), resulting in a total of 558 children. These 20 teachers were then given a sealed envelope for each child in their class, and the children were asked to take the envelopes home to their parents. Each envelope contained a description of the study, the consent form, the rating scale, and an envelope to return the scale to the school. The teachers had been interacting with the children for most of the school year at the time of the ratings. Each teacher was paid $25 for their participation in the study. Each questionnaire returned by parents resulted in a $1 donation to the school. The parent and teacher ratings were voluntary and anonymous.

The teachers returned questionnaires on 558 students (100%), and parents returned questionnaires on 353 students (63%). A total of 28 scales were returned incomplete by teachers and 60 by parents. This resulted in 530 complete ratings for teachers and 293 for parents. A total of 285 children had complete ratings by teachers and parents (51% of the original sample). For the 285 children, 149 were girls and 136 were boys, with an average age of 8.91 years (SD = 1.26). A total of 77% of the ratings were completed by the children’s mothers, 15% by fathers, and 8% by others. In terms of the education of the rater, 21% had completed middle school, 52% high school, 21% college, and 6% graduate study. The income level of our sample was similar to that of families in the southern region of Brazil (Instituto Brasileiro de Geografia e Estatística, 2000).

Measures

Australian sample: DSM–IV AD/HD Rating Scale. The DSM–IV AD/HD Rating Scale was used for the Australian sample (Gomez et al., 1999). The scale lists the nine ADHD-IN symptoms first, followed by the nine ADHD-H/I symptoms. Parents and teachers rated the occurrence of each symptom on a 4-point scale (i.e., not at all, just a little, pretty much, or very much). Gomez et al. (1999) reported internal consistency values (Cronbach’s αs) for the parent IN and H/I scales of .92 and .90, respectively, and the values for teacher IN and H/I scales were .95 and .94, respectively. The 3-month test–retest reliability values for the parent IN and H/I scales were .55 and .55, respectively, and the values for the teacher scales were .70 and .73, respectively. The parent and teacher IN and H/I scales also demonstrated strong concurrent validity with the Abbreviated Conners Rating Scales (rs > .75).

This scale is similar to the new DSM–IV ADHD scales used in studies in the United States (e.g., DuPaul et al., 1998; Gaub & Carlson, 1997; Power et al., 1998). All of these new scales list the DSM–IV symptoms almost word-for-word and use similar 4-point rating anchors. Given the almost identical nature of these various ADHD scales, the results from the Australian sample should, to a considerable extent, apply to the other scales.

Brazilian sample: Child and Adolescent Disruptive Behavior Inventory 2.3. The Child and Adolescent Disruptive Behavior Inventory 2.3 (CADBI 2.3) has a parent and teacher version (Burns, Taylor, & Rusby, 2001a, 2001b). The IN and H/I symptoms were rated on an 8-point frequency of occurrence scale (never in past month, 1 to 2 times in past month, 3 to 4 times in past month, 2 to 6 times per week [2 to 4 times per week on teacher form], 1 time per day, 2 to 5 times per day, 6 to 9 times per day, and 10 or more times per day). Parents and teachers were also asked to rate the child’s academic competence. For parents, there were four items (quality of homework, reading, arithmetic, and writing skills), and for teachers, there were five items (quality of homework, classroom work, reading, arithmetic, and writing skills). These items were rated on a 7-point scale (i.e., severe difficulty, moderate difficulty, slight difficulty, average performance for grade level, slightly above average, moderately above average, and excellent performance). These ratings were reverse keyed for the analyses (i.e., higher scores indicate more academic problems). A Portuguese version of the CADBI was developed through the standard procedure of forward and backward translation.
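Reverse keying of the academic competence items can be sketched in a few lines. The sketch assumes the seven anchors are coded 1 (severe difficulty) through 7 (excellent performance). The text lists the anchors in that order but does not state the numeric coding, and the function name is hypothetical.

```python
def reverse_key(rating: int, scale_min: int = 1, scale_max: int = 7) -> int:
    """Reverse a rating so that higher scores indicate more academic problems.

    Assumes the original coding runs from scale_min (severe difficulty)
    to scale_max (excellent performance).
    """
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating is outside the scale range")
    return scale_min + scale_max - rating


# Severe difficulty (1) becomes 7, the scale midpoint stays at 4,
# and excellent performance (7) becomes 1.
print([reverse_key(r) for r in (1, 4, 7)])  # [7, 4, 1]
```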

In studies in the United States, the CADBI IN and H/I scales have internal consistency and 3-month test–retest reliability values ranging from .91 to .97 and from .86 to .94, respectively (Fitzgerald, 2002; Iredale, 2000; Skansgaard & Burns, 1998). Interrater reliability values for teachers for the IN and H/I scales have ranged from .64 to .69 (Fitzgerald, 2002). In terms of validity, teacher and parent ratings on these scales have predicted direct observations of classroom behavior and treatment status of ADHD children, respectively, in a scale-specific manner (Burns, Walsh et al., 2001; Skansgaard & Burns, 1998). In addition, a study in Brazil with an earlier Portuguese version of the CADBI showed good psychometric properties for the IN and H/I scales (Moura, 1999).

Results

Preliminary Item Analyses

For the Brazilian sample, the 8-point rating scale was reduced to a 6-point scale to decrease the amount of skewness and kurtosis in the ratings. Ratings of 6 (2 to 5 times per day), 7 (6 to 9 times per day), and 8 (10 or more times per day) were combined into a single category, a rating of 6. All of the analyses were performed on the 6-point scale for the Brazilian sample.

Figure 3. Heuristic representation of the multitrait–multisource model for the attention-deficit/hyperactivity disorder (ADHD)-inattention (IN), ADHD-hyperactivity/impulsivity (H/I), and academic problems trait factors and the parent- and teacher-report source factors. IN Parcel 1 consisted of the five odd-numbered IN symptoms, IN Parcel 2 the four even-numbered IN symptoms, H/I Parcel 1 the five odd-numbered H/I symptoms, and H/I Parcel 2 the four even-numbered H/I symptoms. The teacher academic problems measure consisted of five items (quality of homework, classroom work, reading, arithmetic, and writing), and the parent academic problems variable consisted of four items (quality of homework, reading, arithmetic, and writing). Although not shown in the figure, each manifest variable also has an error component.
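The reduction of the Brazilian 8-point scale to 6 points amounts to capping each rating at 6. The sketch below assumes the frequency categories are coded 1 through 8 in the order in which the Measures section lists them, which matches the codes 6, 7, and 8 given in the text. The function name and example ratings are hypothetical.

```python
def collapse_rating(rating: int) -> int:
    """Collapse an 8-point frequency rating to the 6-point scale.

    Ratings of 6, 7, and 8 are merged into a single top category (6).
    Ratings of 1 through 5 are left unchanged.
    """
    if not 1 <= rating <= 8:
        raise ValueError("rating must be between 1 and 8")
    return min(rating, 6)


# Hypothetical teacher ratings for one child.
raw = [1, 3, 6, 7, 8, 5]
print([collapse_rating(r) for r in raw])  # [1, 3, 6, 6, 6, 5]
```

Merging the three highest-frequency categories shortens the upper tail of the rating distribution, which is the stated purpose of the recoding.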

As a first step, the correlations among the 18 ADHD symptoms were inspected in the Brazilian and Australian samples. In the Brazilian sample for teachers, the three impulsivity symptoms were highly correlated with each other (M = .91, range = .90 to .93). A similar result occurred for the three impulsivity items in the Australian sample for teachers (M = .84, range = .81 to .87). Given these high correlations, the three impulsivity symptoms were combined into a single item for the Brazilian and Australian teachers. In addition, in the Brazilian sample for teachers, the careless and attention symptoms correlated .91, and the disorganized and unmotivated symptoms correlated .91 as well. The first two symptoms (careless and attention) were combined into a single item, with the second two symptoms (disorganized and unmotivated) also being combined. These symptoms were combined for the analyses because the high correlations indicated little unique information.

Table 1

Descriptive Information on Inattention and Hyperactivity/Impulsivity Symptoms

Symptom                   Australian sample (N = 1,475)   Brazilian sample (N = 285)
                          M       SD      S       K       M       SD      S       K

Teacher rating: Inattention symptoms
Careless                  1.64    0.73    0.96    0.48    2.25    1.71    1.08    −0.37
Attention                 1.51    0.75    1.37    1.23    —a      —a      —a      —a
Listen                    1.36    0.62    1.68    2.34    1.88    1.57    1.68    1.40
Instructions              1.50    0.75    1.50    1.69    1.85    1.54    1.70    1.51
Disorganized              1.50    0.75    1.49    1.66    1.95    1.57    1.56    1.03
Unmotivated               1.44    0.72    1.64    2.12    —b      —b      —b      —b
Loses                     1.32    0.62    2.17    4.70    1.68    1.50    2.13    3.01
Distracted                1.63    0.83    1.22    0.69    2.15    1.76    1.26    −0.01
Forgetful                 1.34    0.62    1.85    3.08    1.98    1.56    1.50    0.93

Teacher rating: Hyperactivity/impulsivity symptoms
Fidgets                   1.31    0.63    2.24    4.96    1.87    1.61    1.66    1.16
Seat                      1.28    0.62    2.53    6.42    1.99    1.65    1.47    0.62
Runs/climbs               1.12    0.41    3.85    16.35   1.34    1.06    3.32    10.12
Quiet                     1.22    0.55    2.82    8.26    1.47    1.24    2.66    5.85
Motor                     1.20    0.52    2.97    9.46    2.29    1.79    1.04    −0.51
Talks                     1.39    0.69    1.88    3.35    1.63    1.42    2.26    3.73
Blurts/waits/interrupts   1.28    0.59    2.50    6.27    1.57    1.29    2.40    4.64

Parent rating: Inattention symptoms
Careless                  1.76    0.72    0.73    0.39    2.54    1.71    0.93    −0.45
Attention                 1.48    0.72    1.51    1.89    2.40    1.73    0.96    −0.48
Listen                    1.66    0.80    1.11    0.67    2.17    1.62    1.29    0.39
Instructions              1.72    0.82    1.06    0.66    1.97    1.58    1.49    0.86
Disorganized              1.57    0.78    1.32    1.18    1.84    1.44    1.69    1.69
Unmotivated               1.61    0.84    1.34    1.09    2.26    1.63    1.15    0.01
Loses                     1.58    0.75    1.31    1.44    2.05    1.50    1.46    1.06
Distracted                1.87    0.85    0.79    0.35    2.01    1.56    1.43    0.75
Forgetful                 1.56    0.73    1.31    1.51    2.67    1.82    0.71    −0.99

Parent rating: Hyperactivity/impulsivity symptoms
Fidgets                   1.62    0.85    1.32    1.00    2.51    1.92    0.91    −0.82
Seat                      1.37    0.66    1.94    3.67    2.21    1.73    1.27    0.14
Runs/climbs               1.33    0.67    2.30    5.17    1.77    1.47    1.97    2.65
Quiet                     1.33    0.66    2.27    5.03    1.78    1.45    1.82    2.02
Motor                     1.52    0.86    1.66    1.82    3.06    2.01    0.43    −1.49
Talks                     1.73    0.91    1.11    0.30    2.71    2.08    0.69    −1.29
Blurts                    1.52    0.75    1.52    2.01    2.35    1.85    1.02    −0.56
Waits                     1.62    0.82    1.29    1.08    1.55    1.23    2.57    5.92
Interrupts                1.77    0.87    1.05    0.47    2.32    1.71    1.15    −0.10

Note. Rating anchors for the Australian sample: 1 = not at all, 2 = just a little, 3 = pretty much, and 4 = very much. Rating anchors for the Brazilian sample: 1 = never in past month, 2 = 1 to 2 times in past month, 3 = 3 to 4 times in past month, 4 = 2 to 6 times per week (2 to 4 times per week for teacher form), 5 = 1 time per day, and 6 = 2 or more times per day. S = skewness; K = kurtosis.

Table 1 shows the descriptive information for the ADHD symptoms for the Australian and Brazilian samples. Many of the individual symptoms have high levels of skewness and kurtosis, especially the teacher ratings. The normalized estimate for Mardia’s coefficient, an estimate of multivariate kurtosis, was 258.35 for the Australian sample and 108.03 for the Brazilian sample. The assumption of multivariate normality was violated in both samples.
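For readers unfamiliar with the multivariate kurtosis statistic, the following is a minimal sketch of the textbook computation: Mardia's coefficient is the mean of the squared Mahalanobis distances raised to the second power, and the normalized estimate subtracts its expectation under normality, p(p + 2), and divides by sqrt(8p(p + 2)/n). The function names are hypothetical, the sample covariance matrix uses the divisor n, and the exact conventions used by EQS may differ in detail; none of this is the authors' code.

```python
from math import sqrt


def _invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mardia_kurtosis(data):
    """Return (Mardia's coefficient, normalized estimate) for rows of observations."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Covariance matrix with divisor n (maximum likelihood form).
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(p)] for i in range(p)]
    inv = _invert(cov)
    b2p = 0.0
    for r in centered:
        d2 = sum(r[i] * inv[i][j] * r[j] for i in range(p) for j in range(p))
        b2p += d2 ** 2
    b2p /= n
    normalized = (b2p - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)
    return b2p, normalized


# Toy data (hypothetical): five observations on a single variable.
coefficient, z = mardia_kurtosis([[1.0], [2.0], [3.0], [4.0], [5.0]])
```

Large positive normalized estimates, such as those reported for both samples, indicate heavier tails than a multivariate normal distribution would produce.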

Violation of the Multivariate Normality Assumption

With maximum likelihood estimation, the lack of multivariate normality can cause several problems for model testing with CFA (Byrne, 2001; West et al., 1995). These include (a) an increase in the chi-square value, thus leading to potentially unnecessary model modifications; (b) a failure of analyses to converge with small samples; (c) an underestimation of fit indexes (e.g., the comparative fit index [CFI] being inappropriately low); and (d) inappropriately low standard errors, thus resulting in loadings and trait correlations being viewed as significant when such is not the case in the population (Byrne, 2001, p. 279). Although there is no perfect solution to the lack of multivariate normality, the best choice appears to be the use of the maximum likelihood procedure with robust estimation (Byrne, 2001, p. 71; West et al., 1995). This procedure results in a robust chi-square statistic referred to as the Satorra–Bentler scaled chi-square statistic, robust fit indexes (i.e., robust CFI and robust root-mean-square error of approximation [RMSEA]), and robust standard errors (see West et al., 1995). These robust values are corrected for the lack of normality. West et al. (1995, p. 74) also recommended the reduction of the alpha level for the significance tests of factor loadings and correlations to decrease the likelihood of concluding incorrectly that a given parameter is significant.

An additional procedure to deal with the lack of multivariate normality is the creation of item parcels (West et al., 1995). The creation of item parcels results in sets of items being grouped together (e.g., rather than the IN factor being represented by the nine IN symptoms, two sets of item parcels are created with five items in one parcel and four items in another parcel). Although the creation of item parcels does not allow for the evaluation of the individual items, the skewness and kurtosis of the parcels may more closely approximate a normal distribution. As described in the next section, these various procedures were used to deal with the lack of normality.
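The odd/even parceling just described can be sketched as follows. The sketch assumes that a parcel score is the sum of its item ratings (an assumption that is consistent with the size of the parcel means reported later in Table 5) and that the nine items are supplied in their scale order; the function name is hypothetical.

```python
def make_parcels(item_scores):
    """Split one child's item ratings into two parcel scores.

    Items are numbered from 1 in scale order. Parcel 1 sums the
    odd-numbered items and Parcel 2 sums the even-numbered items,
    so nine items yield a five-item and a four-item parcel.
    """
    parcel_1 = sum(item_scores[0::2])  # items 1, 3, 5, 7, 9
    parcel_2 = sum(item_scores[1::2])  # items 2, 4, 6, 8
    return parcel_1, parcel_2


# Hypothetical ratings for the nine IN symptoms of one child.
in_parcel_1, in_parcel_2 = make_parcels([1, 2, 3, 4, 1, 2, 3, 4, 1])
```

Because each parcel aggregates several items, its distribution has more possible values than a single 4- or 6-point item, which is why parcels tend to be less skewed and kurtotic.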

Analytic Strategy

Bentler’s structural equations program (EQS Version 6.0; Bentler & Wu, 2002) was used to perform the CFA. For these analyses, the maximum likelihood procedure with robust estimation was used because of lack of multivariate normality. Model fit was evaluated with the Satorra–Bentler scaled chi-square statistic, the robust CFI, and the robust RMSEA. The robust CFI provides a measure of the fit of the hypothesized model relative to the independence model (values range from .00 to 1.00). CFI values greater than .90 suggest an adequate fitting model, although some have argued for values closer to .95 (Byrne, 2001, p. 83). Such fit values may be a little stringent for item-level analyses unless one is willing to allow for correlated errors among items or unless one eliminates items with weak discriminant validity prior to the use of CFA procedures (Burns & Walsh, 2002). The robust RMSEA provides a measure of model fit relative to the population covariance matrix when the complexity of the model is also taken into account. Values less than .06 indicate a good fit, values from .06 to .08 a reasonable fit, values from .08 to .10 a mediocre fit, and values greater than .10 a poor fit (Byrne, 2001, pp. 84–85). The 90% confidence interval (CI) is also reported for the robust RMSEA. Finally, the robust standard errors were used to evaluate the significance of the factor loadings and factor correlations. The alpha level for these significance tests was also reduced to p < .001.
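To make the RMSEA cutoffs above concrete, the sketch below computes the usual point estimate, sqrt(max(chi-square − df, 0) / (df × (N − 1))), and labels it with the bands just listed. The function names are hypothetical, and two details are assumptions: the divisor N − 1 (some programs use N) and the assignment of the shared boundary values .08 and .10 to the lower band. Applied to the scaled chi-square, df, and N reported in Table 2 for the two-factor model with Australian parents (870, 134, and 1,475), the formula gives a value that rounds to the tabled .061.

```python
from math import sqrt


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the root-mean-square error of approximation."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def describe_rmsea(value: float) -> str:
    """Label an RMSEA value with the fit bands cited in the text."""
    if value < 0.06:
        return "good"
    if value <= 0.08:
        return "reasonable"
    if value <= 0.10:
        return "mediocre"
    return "poor"


# Two-factor model, Australian parent ratings (values from Table 2).
estimate = rmsea(870, 134, 1475)
label = describe_rmsea(estimate)
```

The `max(..., 0.0)` guard simply returns zero when the chi-square falls below its degrees of freedom.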

To determine if the hypothesized model provides a better fit than an alternative model, we used a chi-square difference testing procedure with the Satorra–Bentler scaled chi-square. Because the difference between two Satorra–Bentler scaled chi-squares is not distributed as a chi-square, it is necessary to adjust this difference (P. Bentler, personal communication, May 2002). The formula for this adjustment is available in Muthén and Muthén (1998, p. 360).

We first present the results for the single-source CFA. These analyses are designed to show that our findings replicate the earlier single-source analyses. Next, we present the results from the MT-MS analyses at the item level. The third analysis presents the results from the MT-MS analyses at the parcel level. The fourth analysis then evaluates the relation of trait variance in the IN and H/I factors with the trait variance in the academic problems factor.
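The adjustment to the difference between two Satorra–Bentler scaled chi-squares mentioned above can be sketched as follows. The sketch uses the widely documented scaled difference formula; that it matches the cited page exactly is an assumption. Each scaled chi-square is first converted back to its maximum likelihood value with the model's scaling correction factor, and the difference is then divided by a difference-test correction. The scaling correction factors are not reported in this article, so all numbers in the example are hypothetical, as is the function name.

```python
def scaled_chi_square_difference(t0_scaled, c0, df0, t1_scaled, c1, df1):
    """Scaled chi-square difference for two nested models.

    Model 0 is the more restrictive (nested) model and Model 1 the
    comparison model; c0 and c1 are their scaling correction factors.
    Returns the adjusted difference and its degrees of freedom.
    """
    if df0 <= df1:
        raise ValueError("the nested model must have more degrees of freedom")
    ml0 = t0_scaled * c0  # recover the uncorrected ML chi-squares
    ml1 = t1_scaled * c1
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)  # difference-test scaling correction
    if cd <= 0:
        raise ValueError("non-positive scaling correction; the test is undefined")
    return (ml0 - ml1) / cd, df0 - df1


# Hypothetical values, not taken from the article.
statistic, df_difference = scaled_chi_square_difference(
    t0_scaled=100.0, c0=1.5, df0=12, t1_scaled=60.0, c1=1.4, df1=10
)
```

The resulting statistic is referred to a chi-square distribution with `df_difference` degrees of freedom. Note that simply subtracting the two scaled values (100 − 60 = 40 in this example) would give a different and incorrect result.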

Structural Organization of the DSM–IV ADHD Symptoms Based on a Single Source

Table 2 shows the results for the four separate CFAs (i.e., Australian parents, Australian teachers, Brazilian parents, and Brazilian teachers). These four analyses showed that the two-factor model of the ADHD symptoms provided a good fit in an absolute sense as well as a significantly better fit than a one-factor model (ps < .001). In the Australian sample, the correlation between the IN and H/I factors was .76 for parents and .69 for teachers, and the values were .73 and .67, respectively, in the Brazilian sample (ps < .001). These correlations were similar to the earlier studies.8

A Correlated Trait–Correlated Source CFA Approach to an MT-MS Analysis of the ADHD Symptoms

To evaluate the convergent and discriminant validity of the ADHD-IN and H/I symptoms, we compared the postulated MT-MS model (Figure 1) with a nested series of more restrictive models. The postulated model (Model 1) involves freely correlated traits and freely correlated sources (Figure 1), whereas the three more restrictive models involve the following: (Model 2) no traits and freely correlated sources, (Model 3) perfectly correlated traits and freely correlated sources, and (Model 4) freely correlated traits and perfectly correlated sources. Model 1 is compared with Model 2 to evaluate the convergent validity of the traits. To evaluate the discriminant validity of the traits, we compared Model 1 with Model 3, and to determine the discriminant validity of the sources, we compared Model 1 with Model 4. The ideal outcomes are for Model 1 to provide a statistically significant and substantial improvement in fit over Models 2, 3, and 4. The exact meaning of “substantial,” however, is problematic (see Byrne, 1994, chap. 6, footnote 6).

8 The various analyses were also performed for boys and girls in the Australian sample. The conclusions for the total sample were the same for each gender. These results are available on request from G. Leonard Burns. The size of the Brazilian sample was too small to perform the analyses for boys and girls separately.

These model comparisons assess convergent and discriminant validity at the matrix level. Subsequent to these comparisons, it is necessary to evaluate the convergent and discriminant validity of the individual parameters for Model 1 (e.g., the amount of trait, source, and error variance in each symptom; the correlation between the IN and H/I factors; and the correlation between the parent and teacher factors). An ideal outcome for construct validity would be for each symptom to have a substantial amount of trait variance. The amount of trait variance indicates the amount of convergent validity for each symptom (i.e., the greater the trait variance, the stronger the convergent validity for the particular symptom). In addition, the trait variance should be more than the source variance. If the source variance is larger than the trait variance, even if the amount of trait variance is statistically significant, then this outcome reduces the support for the convergent validity of the symptom due to the stronger source effect (Byrne, 1994, chap. 6). More source than trait variance also reduces support for discriminant validity and thereby the construct validity of the individual symptoms.
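The parameter-level logic above rests on a simple decomposition: when a manifest variable loads on one trait factor and one source factor, and traits are uncorrelated with sources, its standardized variance splits into the squared trait loading, the squared source loading, and error. The sketch below illustrates this with hypothetical loadings; the function names are assumptions, and the loadings are not values from this study.

```python
def variance_components(trait_loading: float, source_loading: float) -> dict:
    """Split a standardized variable's variance into trait, source, and error parts."""
    trait = trait_loading ** 2
    source = source_loading ** 2
    error = 1.0 - trait - source
    if error < 0:
        raise ValueError("squared loadings cannot sum to more than 1")
    return {"trait": trait, "source": source, "error": error}


def trait_exceeds_source(components: dict) -> bool:
    """True when a variable carries more trait than source variance."""
    return components["trait"] > components["source"]


# Hypothetical standardized loadings for two symptoms.
favorable = variance_components(trait_loading=0.6, source_loading=0.5)
unfavorable = variance_components(trait_loading=0.3, source_loading=0.8)
```

In the second case the trait loading could still be statistically significant in a large sample, yet the symptom would carry 9% trait variance against 64% source variance, which is the pattern the text treats as weak evidence of convergent validity.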

Our analyses are organized in this manner (i.e., the matrix-level tests were performed first, followed by the evaluation of the individual parameters). Because the evaluation of individual parameters can result in the qualification of positive results at the matrix level, the individual parameter analyses are perhaps more important for the evaluation of the convergent and discriminant validity of the ADHD symptoms.

Testing for convergent and discriminant validity: Comparison of models. Table 3 shows the results for the four models for the Australian and Brazilian samples. Model 1 (freely correlated traits and freely correlated sources; see Figure 1) provided a good fit in an absolute sense for the Australian and Brazilian samples. For the Australian sample, the robust CFI value was .90, and the robust RMSEA value was .046 (90% CI = .044 to .048). For the Brazilian sample, the values were .92 and .052 (90% CI = .046 to .058), respectively.

There was strong evidence for convergent validity at the matrix level for the Australian and Brazilian samples. In the Australian sample, Model 1 resulted in a significantly better fit than Model 2 (p < .001). Model 1 also resulted in a substantial improvement in fit relative to Model 2 (Δ robust CFI = .19). Similar results occurred for the Brazilian sample, with Model 1 showing statistical and substantial improvement over Model 2 (p < .001; Δ robust CFI = .17). These tests provide support at the matrix level for significant correlations between independent measures of the same trait.

For the Australian sample, there was statistical support for the discriminant validity of traits and sources (ps < .001). However, the improvement in model fit was not substantial (Δ robust CFI = .03 for traits and Δ robust CFI = .05 for sources). In the Brazilian sample, there was also statistical support for the discriminant validity of traits and sources at the matrix level (ps < .001), with the amount of improvement in model fit being similar (Δ robust CFI = .05 for traits and Δ robust CFI = .05 for sources).
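The Δ robust CFI values quoted in the two preceding paragraphs follow directly from the robust CFI column of Table 3. The short sketch below reproduces them; the dictionary layout and the names are illustrative choices, while the CFI values themselves are the tabled ones.

```python
# Robust CFI by model number, taken from Table 3.
ROBUST_CFI = {
    "Australian": {1: 0.90, 2: 0.71, 3: 0.87, 4: 0.85},
    "Brazilian": {1: 0.92, 2: 0.75, 3: 0.87, 4: 0.87},
}

# What each comparison of Model 1 with a restricted model evaluates.
COMPARISONS = {
    2: "convergent validity of traits",
    3: "discriminant validity of traits",
    4: "discriminant validity of sources",
}


def cfi_gains(cfi_by_model: dict) -> dict:
    """Improvement in robust CFI of Model 1 over each restricted model."""
    return {
        label: round(cfi_by_model[1] - cfi_by_model[model], 2)
        for model, label in COMPARISONS.items()
    }


australian_gains = cfi_gains(ROBUST_CFI["Australian"])
brazilian_gains = cfi_gains(ROBUST_CFI["Brazilian"])
```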

Table 2

Model Fit Indices for Alternative Models of DSM–IV Attention-Deficit/Hyperactivity Disorder Symptoms

Model                        df      S-B χ²     RCFI    RRMSEA   90% CI

Australian sample (N = 1,475)
Source: Parents
1. One factor                135     1,707*     .77     .089     .085–.093
2. IN and H/I factors        134     870*       .89     .061     .057–.065
Source: Teachers
1. One factor                104     1,925*     .67     .109     .105–.113
2. IN and H/I factors        103     703*       .89     .063     .058–.067

Brazilian sample (N = 285)
Source: Parents
1. One factor                135     436*       .83     .089     .079–.098
2. IN and H/I factors        134     239*       .94     .053     .042–.063
Source: Teachers
1. One factor                77      467*       .71     .134     .122–.145
2. IN and H/I factors        76      181*       .92     .070     .056–.082

Note. DSM–IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.); S-B χ² = Satorra–Bentler scaled statistic; RCFI = robust comparative fit index; RRMSEA = robust root-mean-square error of approximation; CI = confidence interval; IN = inattention; H/I = hyperactivity/impulsivity.

Testing for convergent and discriminant validity: Comparison of individual parameters. Table 4 shows these results for the Australian and Brazilian samples. The values in the table are the standardized loadings squared. These values indicate the amount of variance in the particular symptom due to the trait, source, and error effects. For the Australian sample, each of the trait loadings was significant with the exception of the trait loadings for the nine ADHD-H/I symptoms for parent ratings. The standardized loadings squared for these nine symptoms were approximately zero; that is, the nine items contained essentially no trait variance. In addition, in the Australian sample, only two of the trait loadings were larger than the source loadings, with the magnitude of the trait–source difference for these two symptoms being small (i.e., 34% vs. 31% and 39% vs. 29% for the teacher ratings of the runs/climbs and motor symptoms, respectively). For the Australian sample, there was little evidence for strong convergent validity for the ADHD symptoms because source effects were greater than trait effects for nearly all of the symptoms.

For the Brazilian sample, similar results occurred, although there was slightly more evidence for the ADHD symptoms having more trait than source variance. Here each of the trait loadings was significant with the exception of the nine ADHD-H/I symptoms for parent ratings, for which there was no trait variance, a result identical to the Australian sample. In contrast, all of the ADHD-H/I symptoms for teacher reports showed more trait than source variance in the Brazilian sample (mean trait variance = .41 and mean source variance = .24). In addition, five of the nine ADHD-IN symptoms for parents contained slightly more trait than source variance in the Brazilian sample (i.e., disorganized, unmotivated, loses, distracted, and forgetful). Most of the symptoms (62%), however, contained more source than trait variance.

For the Australian sample, the correlation between the ADHD-IN and ADHD-H/I factors was nonsignificant and almost zero (.03). The correlation between the parent and teacher source factors was .52 (p < .001). For the Brazilian sample, the correlation between the ADHD-IN and ADHD-H/I factors was .35 (p < .001), and the correlation between the teacher and parent source factors was .32 (p < .001). Both studies found evidence of discriminant validity for traits and sources, although the low amount of trait variance in the symptoms qualifies the importance of the low correlation between the IN and H/I traits.

A Correlated Trait–Correlated Source CFA Approach to an MT-MS Analysis of ADHD Symptom Parcels in the Australian Sample

Table 5 shows the descriptive information for the ADHD-IN and H/I parcels for the Australian and Brazilian samples as well as the academic problems measure for the Brazilian sample. IN Parcel 1 consisted of the five odd-numbered IN symptoms, IN Parcel 2 the four even-numbered IN symptoms, H/I Parcel 1 the five odd-numbered H/I symptoms, and H/I Parcel 2 the four even-numbered H/I symptoms. Each of the parcels showed good internal consistency. As expected, the parcels resulted in less kurtosis than the individual items. Here the normalized estimate for Mardia’s coefficient was 88.61 for the Australian sample and 33.38 for the Brazilian sample. Although these values are much smaller than the values for the item-level analyses (258.35 and 108.03, respectively), the values were still large, thereby justifying the use of the robust estimation procedures.

Testing for convergent and discriminant validity: Comparison of models. Table 6 shows the goodness-of-fit indices for the MT-MS models of the symptom parcels for the Australian sample. Model 1 (freely correlated traits and freely correlated sources; see Figure 2) provided an excellent fit in an absolute sense. Here the robust CFI value was .99, and the robust RMSEA value was .047 (90% CI = .033 to .062). There was strong evidence for the convergent validity of traits (Δ robust CFI = .40, p < .001), as well as reasonable evidence for discriminant validity of traits and sources at the matrix level (Δ robust CFI = .12 for traits and Δ robust CFI = .11 for sources, ps < .001).

Testing for convergent and discriminant validity: Comparison of individual parameters. Table 7 shows the amount of trait, source, and error variance in each parcel. The teacher IN parcels contained an average of 18% trait and 71% source variance, and the values for the teacher H/I parcels were 53% and 38%, respectively. For parents, the IN parcels contained an average of 40% trait and 46% source variance, and the values for the H/I parcels were 3% and 83%, respectively. For the Australian sample, the correlations between the IN and H/I traits were .33 (p < .001) and .46 (p < .001) for the parent and teacher sources. Although these results indicate good discriminant validity for traits and sources, the absence of any trait variance in the H/I parcels for parents and the small amount of trait variance (approximately 18%) in the IN parcels for teachers qualify the importance of the low correlation between the IN and H/I traits.

Table 3

Goodness-of-Fit Indices for Multitrait–Multisource Models of Attention-Deficit/Hyperactivity Disorder Symptoms

Model                                                        df      S-B χ²     RCFI    RRMSEA   90% CI

Australian children (N = 1,475)
1. Freely correlated traits; freely correlated sources       491     2,033*     .90     .046     .044–.048
2. No traits; freely correlated sources                      526     5,061*     .71     .076     .075–.078
3. Perfectly correlated traits; freely correlated sources    492     2,523*     .87     .053     .051–.055
4. Freely correlated traits; perfectly correlated sources    492     2,882*     .85     .057     .055–.059

Brazilian children (N = 285)
1. Freely correlated traits; freely correlated sources       430     759*       .92     .052     .046–.058
2. No traits; freely correlated sources                      463     1,472*     .75     .088     .082–.092
3. Perfectly correlated traits; freely correlated sources    431     962*       .87     .067     .060–.071
4. Freely correlated traits; perfectly correlated sources    431     957*       .87     .066     .060–.071

Note. S-B χ² = Satorra–Bentler scaled statistic; RCFI = robust comparative fit index; RRMSEA = robust root-mean-square error of approximation; CI = confidence interval.


A CFA Approach to an MT-MS Analysis of the Relation of IN and H/I to Academic Problems in the Brazilian Sample

The first purpose of this analysis was to attempt to replicate the findings for the parcels with the Brazilian sample. The second purpose was to determine if the relation between the IN and the academic problems traits was stronger than the relation between the H/I and the academic problems traits.

Testing for convergent and discriminant validity: Comparison of models. Table 8 shows the goodness-of-fit indices for the MT-MS models for the symptom parcels and the academic problems measure for the Brazilian sample. Model 1 (freely correlated traits and freely correlated sources; see Figure 3) provided a good fit in an absolute sense (robust CFI = .98, robust RMSEA = .070, 90% CI = .045 to .095). At the matrix level, there was also strong support for convergent validity of traits (Δ robust CFI = .31, p < .001), discriminant validity of traits (Δ robust CFI = .21, p < .001), and discriminant validity of sources (Δ robust CFI = .19, p < .001).

Table 4

Variance in Attention-Deficit/Hyperactivity Disorder Symptoms Accounted for by Trait, Source, and Error Effects in Australian and Brazilian Children

Symptom                   Trait                   Source                  Error
                          Australia   Brazil      Australia   Brazil      Australia   Brazil

Teacher report: Inattention symptoms
Careless                  .12         .13         .53         .67         .35         .21
Attention                 .11         —a          .68         —a          .21         —a
Listen                    .05         .07         .56         .71         .40         .21
Instructions              .14         .07         .61         .72         .24         .20
Disorganized              .15         .11         .58         .73         .27         .16
Unmotivated               .13         —b          .59         —b          .28         —b
Loses                     .05         .06         .48         .64         .48         .30
Distracted                .05         .11         .65         .72         .29         .17
Forgetful                 .07         .10         .50         .69         .43         .21
M (SD)                    .10 (.04)   .09 (.02)   .58 (.06)   .70 (.03)   .33 (.09)   .21 (.04)

Teacher report: Hyperactivity/impulsivity symptoms
Fidgets                   .07         .43         .54         .24         .39         .33
Seat                      .26         .53         .46         .23         .28         .24
Runs/climbs               .34         .38         .31         .16         .35         .46
Quiet                     .31         .26         .40         .23         .29         .51
Motor                     .39         .35         .29         .29         .31         .36
Talks                     .22         .51         .34         .27         .44         .22
Blurts/waits/interrupts   .35         .40         .37         .24         .28         .36
M (SD)                    .28 (.10)   .41 (.09)   .39 (.08)   .24 (.04)   .33 (.06)   .35 (.10)

Parent report: Inattention symptoms
Careless/details          .25         .27         .33         .35         .42         .38
Attention                 .23         .32         .42         .33         .36         .34
Listen                    .10         .16         .45         .37         .45         .46
Instructions              .27         .29         .40         .37         .34         .34
Disorganized              .33         .34         .35         .22         .32         .44
Unmotivated               .31         .40         .34         .30         .35         .31
Loses                     .13         .29         .34         .25         .53         .46
Distracted                .14         .26         .46         .20         .40         .55
Forgetful                 .15         .30         .33         .19         .51         .51
M (SD)                    .21 (.08)   .29 (.06)   .38 (.05)   .29 (.07)   .41 (.07)   .42 (.08)

Parent report: Hyperactivity/impulsivity symptoms
Fidgets                   .01c        .01c        .47         .37         .53         .62
Seat                      .00c        .00c        .50         .57         .50         .43
Runs/climbs               .00c        .01c        .55         .52         .45         .47
Quiet                     .00c        .01c        .46         .37         .54         .62
Motor                     .00c        .01c        .52         .51         .48         .48
Talks                     .01c        .04c        .49         .59         .51         .37
Blurts                    .00c        .01c        .49         .53         .51         .46
Waits                     .00c        .00c        .56         .45         .44         .55
Interrupts                .00c        .00c        .50         .54         .50         .46
M (SD)                    .00 (.00)   .01 (.04)   .50 (.03)   .49 (.08)   .50 (.03)   .50 (.08)

Note. The trait, source, and error components sum to 1.00 for each symptom within rounding error. All values are significant at p < .001 unless indicated as nonsignificant. Values are the standardized loadings squared.
a In parcel with the careless item. b In parcel with the disorganized item. c Nonsignificant.

Testing for convergent and discriminant validity: Comparison of individual parameters. Table 9 shows the amount of trait, source, and error variance in each parcel and the academic problems measure. The amount of trait, source, and error variance for the IN and H/I parcels was similar to the Australian sample. In addition, the correlations between the IN and H/I traits were .36 (p < .001) and .33 (p < .001) for parent and teacher sources, results again almost identical to the Australian analysis at the parcel level. For academic problems, there was slightly more trait than source variance for teachers (.27 vs. .22) and considerably more trait than source variance for parents (.43 vs. .06). The correlation between the IN and the academic problems traits was .60 (p < .001), whereas the correlation between the H/I and the academic problems traits was .02 (ns), thus supporting the prediction that the IN and academic problems traits would be more strongly related than the H/I and academic problems traits.

Discussion

Our goal was to use correlated trait–correlated source CFA procedures to model an MT-MS design to examine the construct validity of parent and teacher ADHD rating scales. Given that understanding ADHD depends to a great extent on the use of ADHD rating scales, it is important to know the amount of trait, source, and error variance in these scales.

Trait, Source, and Error Variance in the Individual ADHD Symptoms

Our results were remarkably consistent across the Australian and Brazilian samples. For the symptom-level analyses, with the exception of the H/I symptoms for the Brazilian teachers, source effects were either equal to or stronger than trait effects (i.e., the symptoms contained more source than trait variance). Such was particularly the case for the Australian results. In this sample, all of the items in the ADHD scale lacked convergent and discriminant validity because of the strong source effects. Given that the scale used in the Australian sample was similar to the DSM–IV ADHD scales used in studies in the United States (e.g., DuPaul et al., 1998; Gaub & Carlson, 1997; Power et al., 1998) and given that the Australian sample was a large and representative sample, there is a high likelihood that similar results would occur with this type of scale in studies in the United States with population-based samples.

Trait, Source, and Error Variance in the ADHD Symptom Parcels

Table 5

Descriptive Information on Inattention (IN), Hyperactivity/Impulsivity (H/I), and Academic Problems Measures

Measure               Australian sample (N = 1,475)             Brazilian sample (N = 285)
                      α      M       SD      S       K          α      M       SD      S       K

Teacher rating
IN Parcel 1           .89    7.15    2.80    1.59    2.23       .93    9.79    7.09    1.57    1.36
IN Parcel 2           .92    6.07    2.74    1.43    1.44       .94    8.13    6.19    1.39    0.59
H/I Parcel 1          .89    6.20    2.36    2.80    9.05       .88    8.60    5.94    1.97    3.31
H/I Parcel 2          .88    5.16    2.12    2.42    6.29       .88    6.71    4.87    2.20    4.07
Academic problems                                               .96    13.32   8.32    0.69    −0.65

Parent rating
IN Parcel 1           .85    8.12    2.98    1.34    1.84       .85    11.27   6.39    1.10    0.33
IN Parcel 2           .86    6.68    2.71    1.29    1.37       .85    8.63    5.42    1.15    0.33
H/I Parcel 1          .83    7.76    3.09    1.61    2.61       .82    12.01   6.91    0.98    0.02
H/I Parcel 2          .78    6.04    2.38    1.59    2.59       .77    8.25    5.10    1.28    0.76
Academic problems                                               .84    12.56   5.83    0.18    −0.86

Note. IN Parcel 1 consisted of the five odd-numbered IN symptoms, IN Parcel 2 the four even-numbered IN symptoms, H/I Parcel 1 the five odd-numbered H/I symptoms, H/I Parcel 2 the four even-numbered H/I symptoms, the teacher academic problems variable five items (quality of homework, classroom work, reading, arithmetic, and writing), and the parent academic problems variable four items (quality of homework, reading, arithmetic, and writing). S = skewness; K = kurtosis.

Table 6

Goodness-of-Fit Indices for Multitrait–Multisource Models of Attention-Deficit/Hyperactivity Disorder Symptom Parcels for the Australian Children

Model                                                        df      S-B χ²     RCFI    RRMSEA   90% CI

1. Freely correlated traits; freely correlated sources       10      42*        .99     .047     .033–.062
2. No traits; freely correlated sources                      19      1,594*     .59     .237     .227–.247
3. Perfectly correlated traits; freely correlated sources    11      522*       .87     .177     .164–.190
4. Freely correlated traits; perfectly correlated sources    11      467*       .88     .168     .155–.181

An additional question concerned the amount of trait, source, and error variance at the summary score level for the IN and H/I measures. Although the individual symptoms had weak convergent and discriminant validity, summary scores for the IN and H/I dimensions might show better results. The convergent and discriminant validity of the summary scores was important because the research on ADHD uses summary scores (e.g., total score for the nine IN symptoms on the parent scale) to determine the relation of the IN and H/I measures to other measures.

The IN parcels for teachers contained more source (70% to 84%) than trait variance (13% to 19%), whereas the H/I parcels for teachers contained slightly more trait (45% to 62%) than source variance (33% to 39%). For parents, the IN parcels contained approximately equal amounts of trait (36% to 51%) and source variance (40% to 46%), whereas the H/I parcels for parents con-tained much more source (77% to 91%) than trait variance (2% to 3%). With the exception of the H/I parcels for teachers, source effects were still equal to or stronger than trait effects.

Implications of Source Effects in ADHD Rating Scales

Our results raise several complex issues. One issue concerns the implications of the strong source effects for ADHD research that involves parent and teacher rating scales. This research uses ADHD rating scales to determine the causes and associated features of the IN and H/I dimensions as well as to measure the impact of treatments. Our findings indicate that the interpretation of the simple correlations in these studies is problematic. For example, our results showed a moderate to strong correlation (.60) between the IN and the academic problems trait factors, whereas the correlation between the H/I and the academic problems trait factors was almost zero (.02). The simple correlation procedure would have concluded that the IN and H/I measures were reliably related to the academic problems measures (all the correlations were significant). One implication of our findings is that MT-MS analyses may be necessary to advance the understanding of ADHD due to the strong source effects in parent and teacher rating scales.

Our findings may also have implications for the conceptualization of ADHD. The criteria for ADHD require that symptoms and impairment occur in two or more situations (e.g., home and school). This conceptualization of ADHD would seem to require strong trait effects in parent and teacher ADHD rating scales. Our results suggest that this conceptualization and the ADHD rating scales are faulty. In fairness, however, the strong trait effects may only occur with children with a diagnosis of ADHD. If a clinical sample also yielded more source than trait variance, then source effects would appear to be the rule rather than the exception.

Another issue concerns whether the source effects are better viewed as bias or accuracy. An additional study with children Table 7

Variance in Attention-Deficit/Hyperactivity Disorder Symptom Parcels Accounted for by Trait, Source, and Error Effects for the Australian Children

Measure Trait Source Error

Teacher report IN Parcel 1 .17 .70 .13 IN Parcel 2 .19 .72 .09 H/I Parcel 1 .45 .39 .16 H/I Parcel 2 .62 .37 .01a Parent report IN Parcel 1 .36 .46 .18 IN Parcel 2 .43 .45 .12 H/I Parcel 1 .03 .82 .15 H/I Parcel 2 .03 .84 .13

Note. The trait, source, and error components sum to 1.00 for each measure within rounding error. All values are significant at p⬍.001 unless indicated as nonsignificant. Values are the standardized loadings squared. IN⫽inattention; H/I⫽hyperactivity/impulsivity.

a_{Nonsignificant.}

Table 8

Goodness-of-Fit Indices for Multitrait–Multisource Models of Attention-Deficit/Hyperactivity Disorder Symptom Parcels and Academic Problems for the Brazilian Children

1. Freely correlated traits, freely correlated sources 21 50* .98 .070 .045–.095 2. No traits; freely correlated sources 34 614* .67 .245 .228–.262 3. Perfectly correlated traits; freely correlated sources 24 429* .77 .244 .223–.264 4. Freely correlated traits; perfectly correlated sources 22 390* .79 .243 .221–.264

* p⬍.001.

Table 9

Variance in Attention-Deficit/Hyperactivity Disorder Symptom Parcels and Academic Problems Accounted for by Trait, Source, and Error Effects for the Brazilian Children

Measure               Trait   Source   Error

Teacher report
  IN Parcel 1          .15     .84      .01a
  IN Parcel 2          .13     .82      .05a
  H/I Parcel 1         .62     .33      .05a
  H/I Parcel 2         .59     .34      .07a
  Academic problems    .27     .22      .51
Parent report
  IN Parcel 1          .45     .40      .14
  IN Parcel 2          .51     .40      .09a
  H/I Parcel 1         .02a    .77      .21
  H/I Parcel 2         .02a    .91      .08a
  Academic problems    .43     .06      .51

Note. The trait, source, and error components sum to 1.00 for each measure within rounding error. All values are significant at p < .001 unless indicated as nonsignificant. Values are the standardized loadings squared. IN = inattention; H/I = hyperactivity/impulsivity.


referred for ADHD and oppositional defiant disorder (ODD) problems might help to clarify this issue. This study would obtain parent and teacher ratings for the IN, H/I, and ODD dimensions. A structured diagnostic interview would also be completed with each parent and teacher to provide IN, H/I, and ODD summary scores. CFA procedures would allow for the determination of the amount of trait, source (parents and teachers), method (rating scale and diagnostic interview), and error variance in each manifest variable. For example, for the parent rating scale IN measure, the procedures would provide information on the amount of variance due to the IN trait, the parent source, the rating scale method, and error. If the source effects were still large in relation to trait and method effects, then this would argue for source-specific accuracy rather than bias (see also footnote 5). Such a design would also provide a better understanding of the impact of treatment on ADHD (e.g., Do medication and psychosocial treatments have differential impacts on the trait, source, and method components?).
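The four-way partition described for this proposed design can be written out as a worked equation. The sketch below assumes standardized manifest variables and mutually uncorrelated trait, source, and method factors; the symbols (lambda for loadings, theta for error variance) are notational assumptions introduced here for illustration only.

```latex
% Manifest variable x measuring trait T, rated by source S, with method M
x = \lambda_{T}\,T + \lambda_{S}\,S + \lambda_{M}\,M + e

% With standardized variables and uncorrelated factors, the variance partitions as
\operatorname{Var}(x) = \lambda_{T}^{2} + \lambda_{S}^{2} + \lambda_{M}^{2} + \theta = 1
```

For the parent rating scale IN measure, the four terms would correspond to the IN trait, the parent source, the rating scale method, and error, respectively.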

Recommendation for Research on ADHD

Consider this example of a common ADHD research procedure. A researcher wishes to determine if the ADHD-IN and ADHD-H/I dimensions have differential correlates. A typical procedure is to obtain parent and teacher ratings of the IN and H/I dimensions and then to correlate these ratings with parent and teacher ratings of other constructs, such as academic skills, social competence, ODD, conduct disorder, and so on. This procedure yields ambiguous results because of the strong source effects in the IN and H/I measures as well as the varying amounts of trait variance across the measures (e.g., approximately 0% trait variance in the H/I measure for parents and approximately 57% trait variance in the H/I measure for teachers). The other constructs also probably contain varying amounts of trait and source variance, further complicating the research process.
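The ambiguity of a simple same-source correlation can be illustrated with a small calculation. The sketch below splits a model-implied correlation between two measures rated by the same source into a trait part and a source part. It assumes standardized measures, positive loadings (taken as square roots of the variance components in Table 9), and trait factors that are uncorrelated with source factors. The function name and these simplifications are illustrative assumptions, not the authors' procedure.

```python
import math


def implied_same_source_correlation(trait_var_x, trait_var_y, trait_corr,
                                    source_var_x, source_var_y):
    """Split a model-implied correlation into (trait_part, source_part).

    Assumes standardized measures, positive loadings, and trait factors
    uncorrelated with source factors. Both measures share one source factor,
    so the source part is simply the product of the two source loadings.
    """
    trait_part = math.sqrt(trait_var_x * trait_var_y) * trait_corr
    source_part = math.sqrt(source_var_x * source_var_y)
    return trait_part, source_part


# Parent H/I Parcel 1 (trait .02, source .77) and parent academic problems
# (trait .43, source .06) from Table 9, with the H/I-academic problems trait
# factor correlation of .02 reported in the text.
trait_part, source_part = implied_same_source_correlation(
    trait_var_x=0.02, trait_var_y=0.43, trait_corr=0.02,
    source_var_x=0.77, source_var_y=0.06,
)

if __name__ == "__main__":
    print(f"trait part:  {trait_part:.3f}")
    print(f"source part: {source_part:.3f}")
    print(f"total:       {trait_part + source_part:.3f}")
```

Under these assumptions, nearly all of the implied correlation between the two parent measures comes from the shared parent source factor rather than from the trait factors, which is why a significant simple correlation cannot be read as evidence of a trait-level relation.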

For these various reasons, the use of CFA to model MT-MS designs appears to be required to advance understanding of ADHD. This approach provides a better understanding of ADHD because the separation of trait and source effects allows for the determination of the ability of each effect to predict the trait and source variance in the other constructs (i.e., the ability of IN and H/I trait factors to predict the trait variance in ADHD risk factors, associated features, and outcomes as distinct from the ability of the IN and H/I parent and teacher source factors to predict the parent and teacher source variance in risk factors, associated features, and outcomes).

Controversies and Complexities in the Use of CFA Procedures to Model MT-MS Designs

Although we have argued that the use of CFA procedures to model MT-MS designs can advance understanding of ADHD, it should be noted that this approach is not without controversies and complexities (Kenny & Kashy, 1992; Lance et al., 2002; Marsh & Grayson, 1995). Convergence problems, inadmissible solutions, empirical underidentification, alternative CFA models (CT-CM vs. CU), and alternative theoretical models for the organization of the manifest variables are some of the issues that complicate the use of these procedures to understand ADHD and other aspects of childhood psychopathology. In addition, the manifest variables should have strong psychometric properties at the item and scale level prior to the use of the manifest variables in CFA procedures. In spite of difficulties, these general procedures have the potential to advance understanding of ADHD.

References

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text revision). Washington, DC: Author.

Beiser, M., Dion, R., & Gotowiec, A. (2000). The structure of attention-deficit and hyperactivity symptoms among native and non-native elementary school children. Journal of Abnormal Child Psychology, 28, 425–438.

Bentler, P. M., & Wu, E. J. C. (2002). EQS 6 for Windows user’s guide. Encino, CA: Multivariate Software.

Burns, G. L., Boe, B., Walsh, J. A., Sommers-Flanagan, R., & Teegarden, L. A. (2001). A confirmatory factor analysis on the DSM–IV ADHD and ODD symptoms: What is the best model for the organization of these symptoms? Journal of Abnormal Child Psychology, 29, 339–349.

Burns, G. L., Taylor, T. K., & Rusby, J. C. (2001a). Child and Adolescent Disruptive Behavior Inventory 2.3: Parent version. Pullman: Washington State University, Department of Psychology.

Burns, G. L., Taylor, T. K., & Rusby, J. C. (2001b). Child and Adolescent Disruptive Behavior Inventory 2.3: Teacher version. Pullman: Washington State University, Department of Psychology.

Burns, G. L., & Walsh, J. A. (2002). The influence of ADHD-hyperactivity/impulsivity symptoms on the development of oppositional defiant disorder symptoms in a two year longitudinal study. Journal of Abnormal Child Psychology, 30, 245–256.

Burns, G. L., Walsh, J. A., Owen, S. M., & Snell, J. (1997). Internal validity of the ADHD, ODD, and overt CD symptoms in young children: Implications from teacher ratings for a dimensional approach to symptom validity. Journal of Clinical Child Psychology, 26, 266–275.

Burns, G. L., Walsh, J. A., Patterson, D. R., Holte, C. S., Sommers-Flanagan, R., & Parker, C. M. (1997). Internal validity of the disruptive behavior disorder symptoms: Implications from parent ratings for a dimensional approach to symptom validity. Journal of Abnormal Child Psychology, 25, 307–319.

Burns, G. L., Walsh, J. A., Patterson, D. R., Holte, C. S., Sommers-Flanagan, R., & Parker, C. M. (2001). Attention deficit and disruptive behavior disorder symptoms: Usefulness of a frequency count rating procedure to measure these symptoms. European Journal of Psychological Assessment, 17, 25–35.

Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows: Basic concepts, applications, and programming. Thousand Oaks, CA: Sage.

Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Erlbaum.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait–multimethod matrix. Psychological Bulletin, 56, 81–105.

Crystal, D. S., Ostrander, R., Chen, R. S., & August, G. J. (2001). Multimethod assessment of psychopathology among DSM–IV subtypes of children with attention-deficit/hyperactivity disorder: Self-, parent, and teacher reports. Journal of Abnormal Child Psychology, 29, 189–206.