On Issues of Validity and Especially on the Misery of Convergent Validity


Editorial


Karl Schweizer

Editor-in-Chief

Summary

Here, we would like to consider the concept of convergent validity from a general and a specific perspective. The general perspective suggests that special attention should be given to the purpose of application in the definition of the concept. It is therefore proposed that the structuring of the relationship between test construction and test application be done according to the standard purposes of application. The specific perspective on convergent validity reveals indeterminacy and vagueness. Aside from the core meaning of convergent validity, four different elaborations can be identified: convergent validity as a transfer procedure, as a trait-controlled transfer procedure, as a trait-method-controlled transfer procedure, and as an equivalence check. The less sophisticated elaborations suffer from the lack of specific limits, so that the ascription of convergent validity appears to be a matter of arbitrariness, as is the selection of the type of elaboration. All this adds up to what may be considered the misery of convergent validity.

Introduction

Because psychological assessment addresses latent human attributes, whether a test actually represents the attribute is a question of utmost importance. What in many other sciences is taken for granted needs to be explicitly established in psychological assessment since the link from the latent to the manifest level is not an especially obvious one. Furthermore, the situation is complicated by the fact that the validity of a test appears to depend on the specific use of the test and on the stage of the cultural framework. This dependency is especially obvious in the meaning of test items for persons from different cultures (van de Vijver, 2011). As a consequence, it is necessary to ensure an appropriate adaptation when the test is transferred from one language into another (Schweizer, 2010; van de Vijver, 2003). In line with the recognition of these relationships, more recent inquiries into validity no longer consider it a property of the test but rather a property of the usage of the test for a particular purpose (Kane, 2006; Messick, 1989a; Sireci, 2007).

This way of emphasizing purpose and situation appears to be very reasonable as a legal conceptualization that is largely guided by legal standards and the demands of proper practice (Sireci, 2006). In a way it reflects the growing importance of psychological assessment in social life. However, irrespective of the necessity of considering the purpose of assessment, this recent notion bears a danger to psychometric standards since the specificity of the assessment situation associated with a particular purpose can become the excuse for low psychometric standards. Emphasizing the interpretation of test scores as recommended by Messick (1989a, 1989b) may increase the tolerance for deviations from a solid psychometric frame of reference. Furthermore, there is even the possibility of abusing psychological assessment by assigning too much weight to the purpose of the application. Because of this danger we recommend investigating and establishing validity with respect to “standard” purposes that enable the separation of the actual purpose of application from the context of the establishment of validity. Several types of validity have been proposed and are reiterated in virtually all textbooks on test construction (e.g., McDonald, 1999; Lewis-Beck, 1994). These types of validity can be established by considering one or several “standard” purposes of application that lie outside the actual purpose of the application. Of course, such separation demands that more weight be given to the definition and consideration of “standard” purposes in future test constructions than is presently the case.

Convergent Validity and Its Shortcomings

In line with these preliminary remarks concerning the recent notion of validity, the following inquiry on convergent validity assumes the framework of a standard purpose of application. Given this framework, convergent validity denotes the observation of a considerable correlation between tests that refer to the same psychological concept, mostly a construct. In other words, “two different measures of the same thing should intercorrelate highly” (Ghiselli, Campbell, & Zedeck, 1981, p. 285). According to these two descriptions of the core meaning of convergent validity, an appropriate theoretical background is an essential prerequisite, and it appears to be closely linked to the concept of operationalization thought to link entities of the latent level to entities of the manifest level. Unsurprisingly, the idea that there are psychological constructs (MacCorquodale & Meehl, 1948) was fundamental to the development of convergent validity, much as it was to construct validity (Cronbach & Meehl, 1955). Constructs are usually considered the theoretical basis providing justification for the expectation of a considerable correlation between tests. It even appears that validity did not play a major role in the theory of mental testing before the invention of the idea of the psychological construct. In the influential textbook by Gulliksen (1950), which dates from the time before the introduction of the construct, validity is defined as “the correlation of the test with some criterion” (p. 88), and no more than half a page is spent reflecting on it. Since that time, however, the sensitivity to various issues of validity has increased considerably, so that nowadays validity is considered “the most important concept in psychometrics” (Sireci, 2007, p. 477). A number of different types of validity are to be taken into consideration.

Unfortunately, aside from its core meaning, convergent validity is characterized by indeterminacy and vagueness, which, from the scientific point of view, may be perceived as the misery of convergent validity. In this and the following sections this assertion is substantiated in some detail. First let me elaborate on the vagueness of convergent validity. Given a standard purpose of application it is necessary to distinguish between two types of outcomes of an investigation of convergent validity. Since convergent validity is closely linked to the concept of correlation, two types of coefficients must be considered: coefficients indicating convergent validity and coefficients indicating the lack of convergent validity. It is interesting to observe that most textbooks avoid stating an explicit limit. Thus it might be sufficient to check whether the correlation reaches the level of significance. However, the level of significance cannot really be the solution to the problem since large sample sizes can cause rather small correlations to reach the level of significance. Furthermore, if there were an explicit limit, it would be necessary to also consider the properties of the tests and of the sample. For example, it is known that the size of such a correlation depends on the reliability of the tests that are correlated with each other (Lord & Novick, 1968). Moreover, there is the influence of the observational method on the size of the correlation between two tests (Campbell & Fiske, 1959). This problem is discussed in more detail in one of the next sections. Additionally, there is the problem of how to deal with inconsistent results. A few tests of a study referring to the same construct may lead to correlations larger than a given limit, whereas other correlations of tests also referring to the same construct may be smaller. Apparently, a simple correlation between two tests is a vague argument in favor of convergent validity. It is only clear that the larger the correlation, the more likely convergent validity becomes.
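Two of the points made above can be illustrated numerically. The following sketch is not part of the original argument. It assumes the Fisher z approximation for testing a correlation against zero and the classical correction for attenuation, and all values (a correlation of .10, sample sizes of 50 and 1000, reliabilities of .70 and .80) are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r: float, n: int) -> float:
    """Two-sided p-value for H0: rho = 0, based on the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correction for attenuation: observed r divided by the root of the reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)


# The same small correlation, evaluated in a small and in a large sample.
p_small_sample = fisher_z_p_value(0.10, 50)
p_large_sample = fisher_z_p_value(0.10, 1000)

# Hypothetical observed correlation of .56 between two tests
# with assumed reliabilities of .70 and .80.
r_corrected = disattenuate(0.56, 0.70, 0.80)

print(f"r = .10, n = 50:   p = {p_small_sample:.3f}")
print(f"r = .10, n = 1000: p = {p_large_sample:.4f}")
print(f"correlation corrected for attenuation: {r_corrected:.3f}")
```

Under these assumptions the identical coefficient fails to reach significance in the small sample but reaches it in the large one, and the corrected coefficient is clearly larger than the observed one. Neither result by itself says where a limit for convergent validity should lie.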

The following sections discuss the indeterminacy of convergent validity. It becomes obvious that there are different elaborations of the core meaning, all of which are in use so that convergent validity can mean any one of them.

Convergent Validity as a Simple Transfer Procedure

First, there is the elaboration of convergent validity as a transfer procedure. Convergent validity is investigated by correlating tests with each other, where one test has an established validity and the other is in need of validation. An important characteristic of the test with the established validity is that it is not a perfect representation of the construct; however, the association of this test and the construct is beyond debate. Employing such a procedure may be perceived as an attractive way of securing the validity of a new test since the necessary expenses in time and effort are quite low. However, this procedure is not really suitable for avoiding the type of vagueness described in the previous section.

Nevertheless, convergent validity appears to be a very popular type of validity. Papers reporting on the construction of a new test or the validation of an already established test usually include a section on convergent validity – but frequently fail to consider additional measures referring to other contents and alternative observational methods. In such papers variations can only be found concerning the number of other tests referring to the same construct and the consideration of specific facets of the construct. A check of the issues of the last 2 years of the European Journal of Psychological Assessment reveals that even nowadays convergent validity is quite common as the sole type of validity (Balducci, Fraccaroli, & Schaufeli, 2010; Balzarotti, John, & Gross, 2010; Campos & Gonçalves, 2011; Carelli, Wiberg, & Wiberg, 2011; Cui, Teng, Li, & Oei, 2010; Höfling, Moosbrugger, Schermelleh-Engel, & Heidenreich, 2011; Kazarian & Taher, 2010; Knutsche, Knibbe, Engels, & Gmel, 2010). All of these papers are in line with the core meaning of convergent validity, although there may be different opinions on what is to be considered as “high.”


Convergent Validity as a Trait-Controlled Transfer Procedure

However, the concentration on the described transfer procedure was already called into question by Campbell and Fiske (1959) when they proposed the multitrait-multimethod methodology. Their analyses of the results of previous research revealed that correlations obtained for tests referring to the same construct often overestimate the relationship between these tests because of common method variance. Scores obtained by the same observational method can be found to correlate with each other simply because of the common observational method, since the observational method can be a source of systematic variation and can thus contribute to the correlation of tests. One consequence of this observation is the demand to associate each investigation of convergent validity with an investigation of discriminant validity since the influence of the observational method may become obvious in such an investigation. The expected value for the correlation thought to establish discriminant validity is zero, since tests referring to different constructs should not be related to each other. Correlations establishing convergent validity should considerably exceed correlations establishing discriminant validity. Interestingly, a check of the issues of the last 2 years of the European Journal of Psychological Assessment reveals quite a number of papers reporting results on convergent as well as discriminant validity without considering construct validity (De Carvalho Leite, Seminotti, Freitas, & de Lourdes Drachler, 2011; Fernandez, Dufey, & Kramp, 2011; Fossati, Borroni, Marchione, & Maffei, 2011; Glaesmer, Grande, Braehler, & Roth, 2011; Gorostiaga, Balluerka, Alonso-Arbiol, & Haranburu, 2011; Petermann, Petermann, & Schreyer, 2010; Rivero, Garcia-Lopez, & Hofmann, 2010; Teubert & Pinquart, 2011; Veirman, Brouwers, & Fontaine, 2011; Zohar & Cloninger, 2011).
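The effect of a shared observational method can be made concrete with a small simulation. It is only a sketch under assumed conditions: each score is modeled as the sum of a trait component, a method component, and random error; the two traits are uncorrelated; and the weights (method_weight, error_sd) are arbitrary choices rather than estimates from any study.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, method_weight=0.7, error_sd=0.7, seed=1):
    """Return (same-trait, different-trait) correlations for tests sharing one method."""
    rng = random.Random(seed)
    new_a, old_a, other_b = [], [], []
    for _ in range(n):
        trait_a = rng.gauss(0, 1)                 # construct targeted by both A tests
        trait_b = rng.gauss(0, 1)                 # unrelated construct
        method = method_weight * rng.gauss(0, 1)  # shared method component, e.g., self-report
        new_a.append(trait_a + method + rng.gauss(0, error_sd))
        old_a.append(trait_a + method + rng.gauss(0, error_sd))
        other_b.append(trait_b + method + rng.gauss(0, error_sd))
    return pearson(new_a, old_a), pearson(new_a, other_b)


convergent_r, discriminant_r = simulate()
print(f"same construct, same method:      r = {convergent_r:.2f}")
print(f"different construct, same method: r = {discriminant_r:.2f}")
```

Under these assumptions the correlation between tests of unrelated constructs does not approach zero. In expectation it equals the share of method variance in the total score variance (about .25 for the chosen weights), and the same method component also enters the correlation between the two tests of the same construct.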

Convergent Validity as a Trait-Method-Controlled Transfer Procedure

There is another consequence of the observation reported by Campbell and Fiske (1959), and it is the one promoted by the authors themselves: the consideration of alternative observational methods. This requires that different observational methods be applied in the operationalization of the construct. For example, the test can be designed as a self-report measure and at the same time as a peer-report measure. The same trait can be assessed by a questionnaire and a rating scale, or as a set of related rating scales. Such combinations enable the estimation of the relationship of interest by excluding the influence of the observational method. Based on these possibilities Campbell and Fiske recommend the systematic combination of several constructs and several observational methods in order to achieve an investigation of validity that is not impaired by method-induced distortion. This elaboration of convergent validity is referred to as the trait-method-controlled transfer procedure.

Unfortunately, the advantage of the original multitrait-multimethod approach was undone by the associated necessity of comparing a large number of correlations with each other in the evaluation of the multitrait-multimethod matrix. This disadvantage has now been overcome by introducing methods for the evaluation of the multitrait-multimethod matrix as a whole. Confirmatory factor analysis proved to be especially useful for this purpose. Several confirmatory factor models showing slightly differing properties have also been developed (e.g., Eid, 2000; Marsh & Grayson, 1995). The availability of this method for data analysis finally turns the multitrait-multimethod methodology into a truly valuable research approach. Within this approach convergent validity no longer plays a major role. This advanced multitrait-multimethod methodology rather concentrates on construct validity. The issues of the last 2 years of the European Journal of Psychological Assessment include a number of papers reporting such an investigation of construct validity – something clearly desirable for an assessment journal (Backenstrass, Joest, Gehrig, Pfeiffer, Mearns, & Catanzaro, 2010; Bäccman & Carlstedt, 2010; Blickle, Momm, Liu, Witzki, & Steinmayr, 2011; Crocetti, Schwartz, Fermani, & Meeus, 2010; Derkman, Scholte, Van der Veld, & Markland, 2010; Di Giunta, Eisenberg, Kupfer, Steca, Tramontano, & Caprara, 2010; Gorska, 2011; Isoard-Gautheur, Oger, Guillet, & Martin-Krumm, 2010; Lehmann-Willenbrock, Grohmann, & Kauffeld, 2011; Maiano, Morin, Monthuy-Blanc, & Garbarino, 2010; Pereda, Arch, Peró, Guàrdia, & Forns, 2011; Vissers, Keijsers, van der Veld, de Jong, & Hutschemaekers, 2010; Wright, Creed, & Zimmer-Gembeck, 2010; Zohar, Denollet, Ari, & Cloninger, 2011).

Convergent Validity as an Equivalence Check

The consideration of confirmatory factor analysis for investigating multitrait-multimethod matrices has not only advanced the multitrait-multimethod methodology, but also introduced a new and promising perspective on convergent validity. Now convergent validity can be considered with respect to two different levels, the manifest and the latent levels. The original notion of convergent validity applies to the manifest level. Measurements occur on the manifest level and are therefore considered to be flawed. Some of what is characterized as the “misery of convergent validity” is due to this assignment of measurements. The manifest level is contrasted with the latent level. This other level is thought to be without error. The implications of this assumption for expectations regarding convergent validity are amazing. Considered on the latent level, convergent validity suggests a perfect relationship, since the error and specificity that characterize individual tests and could impair the relationship are excluded. The latent referents of two tests that are assumed to represent the same construct should show nothing less than equivalence in the sense of a perfect correlation. The demonstration of this trait-specific or ability-specific equivalence requires the step from the manifest to the latent level by means of appropriate models of measurement that take disturbing structural specificity into consideration (Schweizer, Rauch, & Gold, 2011; Schweizer & Schreiner, 2010). The logic of such a demonstration can be borrowed from the methodology developed for investigating the dimensionality of scales (DiStefano & Motl, 2006; Vautier, Raufaste, & Carou, 2003). Assuming that the latent referents of the two tests give rise to two dimensions, one can investigate whether the two dimensions can be replaced by a single dimension without causing impairment to the model fit.
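One common way of checking whether a single dimension impairs model fit is a chi-square difference test between the two nested models. The text does not prescribe this particular test, so the following is only a sketch under that assumption. The fit statistics are hypothetical and would in practice come from a confirmatory factor analysis program. Because fixing a latent correlation at 1 places the parameter on the boundary of its admissible range, the resulting p-value should be read as approximate.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Survival function of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def equivalence_check(chi2_two_dim, df_two_dim, chi2_one_dim, df_one_dim, alpha=0.05):
    """Compare nested models; True means the single dimension does not fit worse."""
    if df_one_dim - df_two_dim != 1:
        raise ValueError("this sketch only covers a difference of one degree of freedom")
    delta = chi2_one_dim - chi2_two_dim      # loss of fit caused by merging the dimensions
    p_value = chi2_sf_df1(max(delta, 0.0))
    return p_value >= alpha, p_value


# Hypothetical fit statistics, not taken from any study.
equivalent, p = equivalence_check(chi2_two_dim=41.2, df_two_dim=34,
                                  chi2_one_dim=43.0, df_one_dim=35)
print(f"single dimension acceptable: {equivalent} (p = {p:.3f})")
```

With these made-up values, merging the two dimensions does not lead to a significant loss of fit, which is the pattern expected if the latent referents of the two tests are equivalent.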

This further elaboration of convergent validity emerged recently and allows convergent validity to be treated as an equivalence check. Since trait-specific or ability-specific equivalence should characterize the relationships on the latent level, there is no more room for the vagueness that characterizes convergent validity on the manifest level. Referring to the same construct can only mean equivalence in the sense of a perfect correlation when considered at the latent level. Although this expectation is a general one in the sense that it applies to all combinations of tests, it appears to be especially relevant for the cases of abridged, brief, and short tests that not only claim to represent the same construct as the original tests and to measure virtually the same thing but also share items. The relationships between abridged, brief, and short tests on the one hand and original tests on the other should be especially close, and the authors of such tests usually assert that the abridged, brief, or short test can in fact replace the original one without any loss of information. Unfortunately, there is presently the practice of developing abridged, brief, and short tests by repeating more or less the steps of the original test construction with a subset of items (Laverdière, Diguer, Gamache, & Evans, 2010; van Baardewijk, Andershed, Stegge, Nilsson, Scholte, & Vermeiren, 2010). This approach can actually be expected to produce tests that may be similar to the original tests. However, it cannot guarantee that the original tests and the abridged, brief, or short tests actually measure exactly the same thing. There is always the danger that the abridged, brief, or short test is somewhat biased by neglecting or emphasizing something: a facet of the construct that is covered by the original test or a process that is stimulated by the original test. The equivalence check can be instrumental in ruling out these possibilities.

Discussion

The investigation of the concept of convergent validity and of the practice of test construction and psychometric evaluation has revealed both vagueness and indeterminacy. This observation does not exclude the possibility that different researchers agree about the core meaning of convergent validity, that is, the relationship of two tests referring to the same construct. However, such agreement may disappear as soon as convergent validity is considered in more detail, since it becomes apparent that the concept is imprecise and even vague because of its close association with the concept of correlation without stating a specific limit. Furthermore, there are the various elaborations of the concept with different implications concerning the investigation of convergent validity.

The basic elaboration is convergent validity as transfer procedure. The establishment of convergent validity according to this elaboration means the simple transfer of validity from one test to another test. It was found in 8 studies. The next elaboration is convergent validity as a trait-controlled transfer procedure. This demands the additional consideration of discriminant validity and was observed in another 10 studies. The third elaboration is the trait-method-controlled transfer procedure of the multitrait-multimethod approach. However, in the combination of this approach with confirmatory factor analysis, convergent validity is virtually dissolved in construct validity. The literature research revealed the investigation of construct validity in 14 studies. Finally, there is the elaboration as equivalence check, which is the result of switching to the latent variable approach for investigating convergent validity. No study designed according to this elaboration has so far appeared in the European Journal of Psychological Assessment.

Finally, the question needs to be addressed whether it is justified to equate vagueness and indeterminacy with misery. From the point of view of the test constructor, vagueness and indeterminacy may be perceived as useful since these characteristics can be instrumental in highlighting the positive results of test construction and in preventing disadvantageous observations from turning a long-term research project into a disaster. Vagueness and indeterminacy can help to present the outcome of virtually every evaluation of a test as a success. However, is this science? More importantly, does it contribute to progress in science? If failure is virtually impossible because of vagueness and indeterminacy, then there is no guidance for the applied researcher who is searching for a test to properly represent a specific construct. Furthermore, there is no obvious reason for improving less than optimal tests. If vagueness and indeterminacy characterize basic concepts like convergent validity, from the scientific point of view the true researcher must experience misery. Relief from this misery would be highly appreciated.


References

Backenstrass, M., Joest, K., Gehrig, N., Pfeiffer, N., Mearns, J., & Catanzaro, S. J. (2010). The German version of the Generalized Expectancies for Negative Mood Regulation Scale: A construct validity study. European Journal of Psychological Assessment, 26, 28–38.

Bäccman, C., & Carlstedt, B. (2010). A construct validation of a Profession-Focused Personality Questionnaire (PQ) versus the FFPI and the SIMP. European Journal of Psychological Assessment, 26, 136–142.

Balducci, C., Fraccaroli, F., & Schaufeli, W. B. (2010). Psychometric properties of the Italian Version of the Utrecht Work Engagement Scale (UWES-9): A cross-cultural analysis. European Journal of Psychological Assessment, 26, 143–149.

Balzarotti, S., John, O. P., & Gross, J. J. (2010). An Italian adaptation of the Emotional Regulation Questionnaire. European Journal of Psychological Assessment, 26, 61–67.

Blickle, G., Momm, T., Liu, Y., Witzki, A., & Steinmayr, R. (2011). Construct validation of the Test of Emotional Intelligence (TEMINT): A two-study investigation. European Journal of Psychological Assessment, 27, 282–298.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

Campos, R., & Gonçalves, B. (2011). The Portuguese version of Beck Depression Inventory-II (BDI-II): Preliminary psychometric data with two non clinical samples. European Journal of Psychological Assessment, 27, 258–264.

Carelli, M. G., Wiberg, B., & Wiberg, M. (2011). Development and construct validation of the Swedish Zimbardo Time Perspective Inventory. European Journal of Psychological Assessment, 27, 220–227.

Crocetti, E., Schwartz, S. J., Fermani, A., & Meeus, W. (2010). The Utrecht-Management of Identity Commitments Scale (U-MICS): Italian validation and cross-national comparisons. European Journal of Psychological Assessment, 26, 172–186.

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.

Cui, L., Teng, X., Li, X., & Oei, T. P. S. (2010). The factor structure and psychometric properties of the Resiliency Scale in Chinese undergraduates. European Journal of Psychological Assessment, 26, 162–171.

De Carvalho Leite, J. C., Seminotti, N., Freitas, P. F., & de Lourdes Drachler, M. (2011). The Psychosocial Treatment Expectations Questionnaire (PTEQ) for alcohol problems: Development and early validation. European Journal of Psychological Assessment, 27, 228–236.

Derkman, M. M. S., Scholte, R. H. J., Van der Veld, W. M., & Markland, R. C. M. E. (2010). Factorial and construct validity of the sibling relationship questionnaire. European Journal of Psychological Assessment, 26, 277–283.

Di Giunta, L., Eisenberg, N., Kupfer, A., Steca, P., Tramontano, C., & Caprara, G. V. (2010). Assessing perceived empathic and social self-efficacy across countries. European Journal of Psychological Assessment, 26, 77–86.

DiStefano, C., & Motl, R. W. (2006). Further investigating method effects associated with negatively worded items on self-report surveys. Structural Equation Modeling, 13, 440–464.

Eid, M. (2000). A multitrait-multimethod model with minimal assumptions. Psychometrika, 65, 241–261.

Fernandez, A. M., Dufey, M., & Kramp, U. (2011). Testing the psychometric properties of the Interpersonal Reactivity Index (IRI) in Chile: Empathy in a different cultural context. European Journal of Psychological Assessment, 27, 179–185.

Fossati, A., Borroni, S., Marchione, D., & Maffei, C. (2011). The Big Five Inventory (BFI): Reliability and validity of its Italian translation in three independent nonclinical samples. European Journal of Psychological Assessment, 27, 50–58.

Ghiselli, E. E., Campbell, J. P., & Zedeck, S. (1981). Measurement theory for behavioral sciences. San Francisco: W. H. Freeman.

Glaesmer, H., Grande, G., Braehler, E., & Roth, M. (2011). The German version of the Satisfaction with Life Scale (SWLS): Psychometric properties, validity, and population-based norms. European Journal of Psychological Assessment, 27, 127–132.

Gorostiaga, A., Balluerka, N., Alonso-Arbiol, I., & Haranburu, M. (2011). Validation of the Basque Revised NEO Personality Inventory (NEO PI-R). European Journal of Psychological Assessment, 27, 193–204.

Gorska, M. (2011). Psychometric properties of the Polish version of the Interpersonal Competence Questionnaire (ICQ-R). European Journal of Psychological Assessment, 27, 186–192.

Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.

Höfling, V., Moosbrugger, H., Schermelleh-Engel, K., & Heidenreich, T. (2011). Mindfulness or mindlessness? A modified version of the Mindful Attention and Awareness Scale (MAAS). European Journal of Psychological Assessment, 27, 59–64.

Isoard-Gautheur, S., Oger, M., Guillet, E., & Martin-Krumm, C. (2010). Validation of a French version of the Athlete Burnout Questionnaire (ABQ) in competitive sport and physical education context. European Journal of Psychological Assessment, 26, 203–211.

Kane, M. T. (2006). Test validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education/Praeger.

Kazarian, S. S., & Taher, D. (2010). Validation of the Arabic Center for Epidemiological Studies Depression (CES-D) Scale in a Lebanese community sample. European Journal of Psychological Assessment, 26, 68–73.

Knutsche, E., Knibbe, R., Engels, R., & Gmel, G. (2010). Being drunk to have fun or to forget problems? Identifying enhancement and coping drinkers among risky drinking adolescents. European Journal of Psychological Assessment, 26, 46–54.

Laverdière, O., Diguer, L., Gamache, D., & Evans, D. E. (2010). The French adaptation of the Short form of the Adult Temperament Questionnaire. European Journal of Psychological Assessment, 26, 212–219.

Lehmann-Willenbrock, N., Grohmann, A., & Kauffeld, S. (2011). Task and relationship conflict at work: Construct validity of a German version of Jehn’s Intragroup Conflict Scale. European Journal of Psychological Assessment, 27, 171–178.

Lewis-Beck, M. S. (1994). Basic measurement. International Handbooks of Quantitative Applications in the Social Sciences (Vol. 4). Singapore: Sage.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Menlo Park, CA: Addison Wesley.

MacCorquodale, K., & Meehl, P. E. (1948). On a distinction between hypothetical constructs and intervening variables. Psychological Review, 55, 95–107.

Maiano, C., Morin, A. J. S., Monthuy-Blanc, J., & Garbarino, J.-M. (2010). Construct validity of the Fear of Negative Appearance Evaluation Scale in a community sample of French adolescents. European Journal of Psychological Assessment, 26, 19–27.

Marsh, H., & Grayson, D. (1995). Latent variable models of multitrait-multimethod data. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 177–187). Thousand Oaks, CA: Sage.

McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.

Messick, S. (1989a). Validation. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Washington, DC: American Council on Education/Macmillan.

Messick, S. (1989b). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18, 5–11.

Pereda, N., Arch, M., Peró, M., Guàrdia, J., & Forns, M. (2011). Assessing guilt after traumatic events: The Spanish Adaptation of the Trauma-Related Guilt Inventory. European Journal of Psychological Assessment, 27, 251–257.

Petermann, U., Petermann, F., & Schreyer, I. (2010). The German Strengths and Difficulties Questionnaire: Validity of the teacher version for preschoolers. European Journal of Psychological Assessment, 26, 256–262.

Rivero, R., Garcia-Lopez, L. J., & Hofmann, S. G. (2010). The Spanish version of the Self-Statements During Public Speaking Scale: Validation in adolescents. European Journal of Psychological Assessment, 26, 129–135.

Schweizer, K. (2010). The adaptation of assessment instruments to the various European languages. European Journal of Psychological Assessment, 26, 75–76.

Schweizer, K., Rauch, W., & Gold, A. (2011). Bipolar items for the measurement of personal optimism instead of by unipolar items. Psychological Test and Assessment Modeling, 53, 399–413.

Schweizer, K., & Schreiner, M. (2010). Avoiding the effect of item wording by means of bipolar instead of unipolar items: An application to social optimism. European Journal of Personality, 24, 137–150.

Sireci, S. G. (2006). Validity on trial: Psychometric and legal conceptualizations of validity. Educational Measurement: Issues and Practice, 25, 27–34.

Sireci, S. G. (2007). On validity theory and test validation. Educational Researcher, 36, 477–481.

Teubert, D., & Pinquart, M. (2011). The Coparenting Inventory for Parents and Adolescents (CI-PA): Reliability and validity. European Journal of Psychological Assessment, 27, 206–214.

Van Baardewijk, Y., Andershed, H., Stegge, H., Nilsson, K. W., Scholte, E., & Vermeiren, R. (2010). Development and test of short versions of the Youth Psychopathic Traits Inventory and the Youth Psychopathic Traits Inventory – Child Version. European Journal of Psychological Assessment, 26, 122–128.

van de Vijver, F. J. R. (2003). Test adaptation/translation methods. In R. Fernández-Ballesteros (Ed.), Encyclopedia of psychological assessment (pp. 960–964). Thousand Oaks, CA: Sage.

van de Vijver, F. J. R. (2011). Bias and real differences in cross-cultural differences: Neither friends nor foes. In F. J. R. van de Vijver, A. Chasiotis, & S. M. Breugelmans (Eds.), Fundamental questions in cross-cultural psychology (pp. 235–257). New York, US: Cambridge University Press.

Vautier, S., Raufaste, E., & Carou, M. (2003). Dimensionality of the revised Life Orientation Test and the status of filler items. International Journal of Psychology, 38, 390–400.

Veirman, E., Brouwers, S. A., & Fontaine, J. (2011). The assessment of emotional awareness in children: Validation of the Levels of Emotional Awareness for Children. European Journal of Psychological Assessment, 27, 265–273.

Vissers, W., Keijsers, G. P. J., van der Veld, W. M., de Jong, C. A. J., & Hutschemaekers, G. J. M. (2010). Development of the Remoralization Scale: An extension of contemporary psychotherapy outcome measurement. European Journal of Psychological Assessment, 26, 293–301.

Wright, M., Creed, P., & Zimmer-Gembeck, M. J. (2010). The development and initial validation of a Brief Daily Hassles Scale suitable for use in adolescents. European Journal of Psychological Assessment, 26, 220–226.

Zohar, A. H., & Cloninger, C. R. (2011). The psychometric properties of the TCI-140 in Hebrew. European Journal of Psychological Assessment, 27, 73–80.

Zohar, A. H., Denollet, J., Ari, L. L., & Cloninger, C. R. (2011). The psychometric properties of the DS14 in Hebrew and the prevalence of type D personality in Israeli adults. European Journal of Psychological Assessment, 27, 274–281.

Karl Schweizer

Department of Psychology Goethe University Frankfurt Mertonstr. 17 60054 Frankfurt a.M. Germany Tel. +49 69 798-22081 Fax +49 69 798-23847 E-mail [email protected]