
Do Article Influence Scores Overestimate the Citation Impact of Social Science Journals in Subfields That Are Related to Higher-Impact Natural Science Disciplines?

William H. Walters

This document is the accepted version of an article in the Journal of Informetrics, vol. 8, no. 2 (April 2014), pp. 421–430.

It is also available from the publisher’s web site at



Bowman Library, Menlo College, 1000 El Camino Real, Atherton, CA 94027, USA Tel.: +1 650 543 3827

E-mail address: [email protected]

Unlike Impact Factors (IF), Article Influence (AI) scores assign greater weight to citations that appear in highly cited journals. The natural sciences tend to have higher citation rates than the social sciences. We might therefore expect that relative to IF, AI overestimates the citation impact of social science journals in subfields that are related to (and presumably cited in) higher-impact natural science disciplines. This study evaluates that assertion through a set of simple and multiple regressions covering seven social science disciplines: anthropology, communication, economics, education, library and information science, psychology, and sociology. Contrary to expectations, AI underestimates 5IF (five-year Impact Factor) for journals in science-related subfields such as scientific communication, science education, scientometrics, biopsychology, and medical sociology. Journals in these subfields have low AI scores relative to their 5IF values. Moreover, the effect of science-related status is considerable: typically 0.60 5IF units or 0.50 SD. This effect is independent of the more general finding that AI scores underestimate 5IF for higher-impact journals. It is also independent of the very modest curvilinearity in the relationship between AI and 5IF.

Keywords: bias, Eigenfactor, interdisciplinary, Journal Citation Reports, multidisciplinary, Web of Science


1. Introduction

From 1964 to 2004, Science Citation Index (SCI) and Social Sciences Citation Index (SSCI) were the only sources of reliable, large-scale citation data (Garfield 2007). The Impact Factor (IF), based on data from SCI and SSCI, was recognized by both scholars and practitioners as a standard indicator of citation impact. In recent years, however, a number of alternative indicators have been introduced. These include the Article Influence (AI) score, which is calculated from SCI and SSCI data (Bergstrom 2007), and the Source Normalized Impact per Paper (SNIP) indicator, which draws on data from Elsevier’s Scopus database (Moed 2010).

Aside from their dates of introduction, there are three major differences between the Impact Factor and the Article Influence score (Bergstrom, West, & Wiseman 2008; Franceschet 2010b; West, Bergstrom, & Bergstrom 2010b). First, IF data are available only to institutions that subscribe to Thomson Reuters’ Journal Citation Reports (JCR). In contrast, AI scores are freely available online at http://www.eigenfactor.org/.

A second difference lies in the weighting of citations. Impact Factors give equal weight to every citation; a citation in PNAS contributes no more to the IF than a citation in a regional specialty journal. In contrast, AI scores give greater weight to citations that appear in highly cited journals. “The [AI] ranking system accounts for difference in prestige among citing journals, such that citations from Nature or Cell are valued highly relative to citations from third-tier journals with narrower readership” (West et al. 2012a).

A third difference is that AI scores, unlike IFs, are normalized to account for differences in impact among academic disciplines. It is well known that articles in the natural sciences and in fields with more authors tend to be cited more often. Differences in citation impact persist even among subdisciplines. (See, for example, Althouse, West, Bergstrom, & Bergstrom 2009; Franceschet 2010a; Leydesdorff 2008; Postma 2007; Smolinsky & Lercher 2012; and So 1998.) Impact Factors do not account for these disciplinary differences, and users of the IF are cautioned not to compare journals in different subject areas. In contrast, AI scores are normalized to minimize disciplinary differences in citation rates.

According to its creators, the AI algorithm “automatically accounts for these differences and allows better comparison across research areas” (West et al. 2012c).
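The contrast between equal and weighted citation counting can be illustrated with a deliberately simplified sketch. It is not the Eigenfactor or AI algorithm, which derives journal weights from the citation network itself; here the weights are hypothetical numbers supplied by hand, and the function names are invented for illustration.

```python
# Each entry: (citations given to the target journal, hypothetical weight of the citing journal).
# The weights are made-up inputs, not values produced by the AI algorithm.
citations = [
    (10, 3.0),  # ten citations from a highly cited journal
    (10, 0.5),  # ten citations from a rarely cited journal
]

def unweighted_count(cites):
    """IF-style counting: every citation contributes equally."""
    return sum(n for n, _ in cites)

def weighted_count(cites):
    """AI-style idea: each citation is scaled by the weight of its source."""
    return sum(n * w for n, w in cites)

print(unweighted_count(citations), weighted_count(citations))
```

Under equal weighting the two sources contribute the same amount; under weighting, the citations from the highly cited journal dominate the total.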

The AI algorithm does not completely eliminate disciplinary differences in citation impact, however. As Table 1 shows, subject areas differ considerably in their average AI scores, only slightly less than they differ in their average IFs. The average AI score of a medical journal, for instance, is far higher than the average AI score of an anthropology or sociology journal. This may pose a problem for the comparison of journals within fields such as anthropology and sociology, since certain subfields (biological anthropology and medical sociology, for instance) may be especially likely to be cited in the journals of biology, medicine and other high-impact disciplines. Arguably, this gives those science-related subfields an unfair advantage in terms of their AI scores, since a citation in a mid-ranked medical journal is likely to increase the AI score more than a citation in a top social science journal. After all, more than 40% of the journals in the SCI medicine category have AI scores higher than that of American Anthropologist, the flagship journal of the American Anthropological Association. There is nothing unfair about the AI score itself, since any subfield-related differences in AI reflect real differences in impact among subdisciplines. However, unfairness can easily result if differences in impact among subfields are interpreted as


Table 1

Average Article Influence scores and five-year Impact Factors of journals in 35 JCR subject categories (2012).

Subject area                                   AI    5IF     N

Developmental biology                        1.83   4.12    37
Evolutionary biology                         1.54   4.00    47
Psychology, biological                       1.58   3.98    14
Genetics and heredity                        1.57   3.97   153
Biochemistry and molecular biology           1.44   3.91   284
Chemistry, physical                          1.17   3.68   134
Medicine, research and experimental          1.15   3.40   106
Medicine, general and internal               1.10   3.02   129
Psychology, experimental                     1.20   2.75    78
Chemistry, organic                           0.67   2.60    56
Management                                   1.11   2.55   115
Public, env. and occupational hlth. (SCI)    0.85   2.46   134
Psychology, clinical                         0.83   2.43    99
Psychology (all subfields combined)          0.97   2.38   497
Biology                                      0.90   2.38    76
Health care sciences and services            0.83   2.24    73
Health policy and services                   0.80   2.07    53
Public, env. and occupational hlth. (SSCI)   0.70   2.01   107
Physics, nuclear                             0.82   1.92    21
Computer science, artificial intelligence    0.71   1.85   107
Geography                                    0.66   1.79    60
Social sciences, biomedical                  0.59   1.70    32
Business, finance                            1.34   1.60    59
Economics                                    1.24   1.51   276
Information science and library science      0.49   1.41    68
Communication                                0.65   1.39    55
Urban studies                                0.58   1.36    34
Sociology                                    0.70   1.34   115
Computer science, software engineering       0.68   1.33    94
Social work                                  0.43   1.30    31
Anthropology                                 0.54   1.26    70
Education, scientific disciplines            0.37   1.26    28
Education and educational research           0.51   1.23   145
Political science                            0.77   1.15   116
Mathematics                                  0.93   0.81   258

Avg. of avg. values for 35 subject areas     0.92   2.23     -
SD of avg. values for 35 subject areas       0.37   0.98     -


This study evaluates whether AI scores overestimate the citation impact of social science journals in subfields that are related to higher-impact natural science disciplines. “Overestimate” is a relative term, of course, and it could be argued that indicators such as IF underestimate the impact of journals in science-related subfields. In this analysis, however, overestimation is defined relative to the Impact Factor. The IF is used as a baseline simply because it predates the AI score by approximately four decades.

Two analyses are presented. The first has three components: (1) within each of seven social science disciplines, identify a subfield (and a corresponding set of journals) that is closely related to one or more of the natural sciences; (2) use AI scores to estimate IF values through a set of simple regression equations; and (3) examine the residuals to determine, for each field, whether AI scores systematically overestimate IF values for journals that are related to the natural sciences. The second analysis is much like the first, but it incorporates a dummy variable coded 1 for journals in science-related subfields. This allows us to estimate the independent effect of science-related status.

2. Previous research

The use of impact indicators weighted by the reputation or impact of the citing journal was proposed as early as 1974 (Kochen 1974: 83) and has been revisited regularly since then (Bergstrom 2007; Liebowitz & Palmer 1984; Pinski & Narin 1976). Recently, several authors have claimed that unweighted indicators such as the IF measure popularity while weighted indicators such as the AI measure prestige (Ding & Cronin 2011; Yan & Ding 2010). In fact, however, both weighted and unweighted indicators are simply measures of impact, which is influenced by a wide range of factors other than prestige (Balaban 2012; Fersht 2009).

Most measures of citation impact focus on either the article or the journal. The IF, AI, and SNIP indicators each represent the impact of a typical article in a particular journal—not the impact of the journal as a whole. That is, they do not vary systematically with differences in journal size (the number of articles published in each journal). In contrast, indicators such as the h-index, the Eigenfactor Score, and the SCImago Journal Rank (SJR) measure the impact of the entire journal (all articles combined) and are therefore sensitive to journal size.

Ideally, each individual—the author deciding where to publish, the librarian assessing the cost-effectiveness of journals, or the committee member evaluating a promotion application—would choose the citation impact indicator most appropriate to the task at hand. No single measure of reputation or impact is appropriate in all circumstances (Bar-Ilan 2012; Engemann & Wall 2009). However, the choice of AI, IF, or another indicator is often limited by the availability of data. While only a small minority of colleges and universities have access to IF data through Journal Citation Reports, AI scores are freely available online. Moreover, for better or worse, some authors and librarians may regard AI scores simply as a cost-effective alternative to the IF “gold standard” that has become familiar over the past several decades (Fersht 2009). They may really want Impact Factors, but they’ll settle for AI scores because those data are readily available. Within this context, any systematic difference between the two measures will make AI less acceptable as a substitute for IF.

As noted in Section 1, AI is weighted by the impact of the citing journal. In contrast, IF is not. Despite this distinction, weighted and unweighted measures of journal impact are closely related. Within the field of medicine, for instance, there is a 0.95 correlation between Eigenfactor scores and total citation counts (Davis 2008). Such strong correlations may understate the differences among the various impact indicators, however, since both Eigenfactors and total citation counts are heavily influenced by a common factor: journal size. Journal size is a central component of all measures, weighted and unweighted, that represent the citation impact of the journal as a whole rather than the impact of a typical article (West, Bergstrom, and Bergstrom 2010a).


However, strong correlations persist even when we compare indicators such as AI and IF, which represent the impact of a typical article. Comparing 77 journals in 15 fields, Rousseau and the STIMULATE 8 Group (2009) found AI-IF correlations of 0.90 (2004 data) and 0.92 (2006 data). Likewise, Waltman and van Eck (2010) reported a 0.93 correlation between AI and IF for a set of 6,708 journals in the natural and social sciences. Their analysis also demonstrates that strong correlations persist even when IF is compared to several different variants of the AI score. Elkins, Maher, Herbert, Moseley, and Sherrington (2010) document a close correspondence between AI and IF (Spearman’s rho = 0.79) for a set of 5,856 multidisciplinary journals. An even closer relationship (r = 0.94) can be seen among the top six journals in each of 20 natural science disciplines (Chang, McAleer, & Oxley 2011b).

One could argue that the multidisciplinary correlations presented by these authors also reflect the influence of a common factor. When AI and IF are compared across several subject areas, both may be influenced by disciplinary differences in citation impact. That is, differences in impact among the various physics journals (for example) may be swamped by the broader relationship between physics journals (higher impact) and anthropology journals (lower impact).

Fortunately, several authors have conducted single-discipline studies that compare weighted and unweighted indicators representing the impact of a typical article. For instance, Jacsó (2010b) presents an analysis of AI scores and five-year Impact Factors for a single discipline: library and information science (LIS). The two indicators are moderately related with regard to the rank ordering of LIS journals; of the top 22 journals, 18 exhibit rank-order differences of five places or fewer. At the same time, four well-known LIS journals rise or fall substantially when ranked by AI rather than five-year IF. Other studies suggest that the correlation between AI and IF is strongest in the natural sciences. Yin (2011) reports a Spearman’s rho value of 0.96 for the set of 104 chemical engineering journals, along with a similarly close relationship (0.94) for 15 of the top journals in that field. Among 40 high-impact economics journals, the relationship is not quite as strong; r = 0.91 (Chang, McAleer, & Oxley 2011a). Likewise, the Eigenfactor web site reports correlations of 0.93 for biology and medicine, with weaker relationships for mathematics (0.73) and economics (0.56) (West et al. 2012b).

Jacsó (2010b) discusses a number of factors that appear to influence the citation rankings of particular LIS journals. However, no published study has investigated the factors (such as disciplinary differences in impact) that might lead to systematic bias in the relationship between AI scores and Impact Factors.

3. Methods

The standard Impact Factor (IF or 2IF) is based on a two-year publication window and a one-year citation window. It accounts for the number of times the articles published in 2010 and 2011 were cited in 2012, for example. Because of its short publication and citation windows, 2IF varies in response to short-term trends. Some such trends signal real changes in the trajectory of a journal, while others represent anomalies that can be attributed to a single issue or even a single article. The developers of the AI score chose a five-year publication window partly to remedy the shortcomings of the two-year window used in the calculation of 2IF (West et al. 2010b, 2012c). Although the advantages of a five-year publication window have not been conclusively demonstrated, there is no reason to believe that a two-year window is preferable (Abramo, D’Angelo, & Cicero 2012; van Leeuwen 2012; Vanclay 2009). For these reasons, the five-year Impact Factor (5IF) was chosen for use in this study. AI and 5IF are both based on a five-year publication window and a one-year citation window.
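The publication and citation windows described above can be made concrete with a small sketch. It assumes the usual ratio form of the indicator (citations received in the census year divided by the number of items published during the window); the function name and all counts are hypothetical and are not taken from JCR.

```python
def impact_factor(citations_in_census_year, items_published_in_window):
    """Citations received in the census year (e.g. 2012) by items published
    during the publication window, divided by the number of those items."""
    if items_published_in_window <= 0:
        raise ValueError("the publication window must contain at least one item")
    return citations_in_census_year / items_published_in_window

# Hypothetical journal: 2IF and 5IF differ only in the publication window.
two_year_if = impact_factor(300, 150)   # items from 2010-2011, cited in 2012
five_year_if = impact_factor(900, 400)  # items from 2007-2011, cited in 2012
print(two_year_if, five_year_if)
```

With a short window, a single heavily cited article moves the ratio more than it would with five years of items in the denominator, which is the sensitivity to short-term trends noted above.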

AI and 5IF data were downloaded in October 2013 from the 2012 edition of Journal Citation Reports. Although the AI scores presented in JCR have not always matched those reported on the Eigenfactor web site (Jacsó 2010a), a comparison of 100 randomly selected values suggests that the two sources provide identical AI data for 2012.


JCR presents both AI and 5IF data for 1,226 journals in the anthropology, communication, economics, education and educational research, information science and library science, psychology, and sociology subject categories. (For purposes of analysis, the 10 JCR psychology categories, such as clinical, experimental, and social, were treated as a single discipline.) Within those broad subject areas, the identification criteria shown in Table 2 were used to identify 56 journals in subfields that are conceptually related to the natural sciences: biological and medical anthropology, scientific and technical communication, the economics of health care, science education, scientometrics, biopsychology, and medical sociology and sociobiology. Physical geography was excluded because its methods, journals, publication patterns, and graduate programs are distinct from those of human geography. (See, for example, Castree, Rogers, & Sherman 2005; Harrison, Massey, Richards, Magilligan, Thrift, & Bender 2004; Turner 2002; and Viles 2004.)

Table 2

Criteria used to identify journals in seven social science subfields.

Biological and medical anthropology (11 journals)
All journals in the JCR anthropology category that also appear in the categories of biology; evolutionary biology; genetics and heredity; health care sciences and services; health policy and services; public, environmental, and occupational health; or social sciences, biomedical. Also includes the journal Evolutionary Anthropology.

Scientific and technical communication (8 journals)
All journals in the JCR communication category that focus on scientific and technical communication, subjectively determined.

Economics of health care (6 journals)
All journals in the JCR economics category that also appear in the categories of health care sciences and services; health policy and services; public, environmental, and occupational health; or social sciences, biomedical.

Science education (12 journals)
All journals in the JCR education and educational research category that also appear in the education, scientific disciplines category or that have “science,” “health,” or “environmental” in their titles, excluding Instructional Science and Journal of the Learning Sciences.

Scientometrics (3 journals)
The three journals in the JCR information science and library science category that most often cited, or were cited by, Scientometrics from 2008 to 2012. The same three journals appear at the top of the “cited” and “cited by” lists.

Biopsychology (14 journals)
All journals in the JCR psychology, biological category.

Medical sociology and sociobiology (2 journals)
All journals in the JCR sociology category that also appear in the categories of biology; evolutionary biology; genetics and heredity; health care sciences and services; health policy and services; public, environmental, and occupational health; or social sciences, biomedical.


Seven OLS regressions were conducted for the first analysis (Section 4.1). Each regression uses AI scores to estimate 5IF values in a particular discipline. The residuals (error terms) were examined to evaluate the hypothesis that AI scores systematically overestimate 5IF values for journals in subfields that are related to the natural sciences. Emphasis was placed on both (a) the direction and magnitude of each error term and (b) the direction and magnitude of each error term relative to those of comparable (similar-impact) journals not in subfields related to the natural sciences.
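A minimal sketch of one such regression follows, assuming NumPy and using made-up AI and 5IF values rather than the JCR data. The residual is defined as observed 5IF minus predicted 5IF, so a positive residual means that AI underestimates 5IF for that journal; the function name is hypothetical.

```python
import numpy as np

def fit_simple(ai, five_if):
    """OLS fit of 5IF = intercept + slope * AI.
    Returns (intercept, slope, residuals), with residual = observed - predicted."""
    ai = np.asarray(ai, dtype=float)
    five_if = np.asarray(five_if, dtype=float)
    slope, intercept = np.polyfit(ai, five_if, 1)  # highest power first
    predicted = intercept + slope * ai
    return intercept, slope, five_if - predicted

# Hypothetical journals in a single discipline.
ai = [0.2, 0.5, 0.8, 1.1, 1.4]
five_if = [0.6, 1.1, 1.9, 2.3, 3.1]
intercept, slope, residuals = fit_simple(ai, five_if)
print(intercept, slope, residuals)
```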

In the second analysis (Section 4.2), a dummy variable (coded 1 for journals in science-related subfields) was added to each regression equation. This allows us to estimate the independent effect of science-related status. Each new set of residuals was evaluated based on the same criteria used in Section 4.1.
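The dummy-variable specification can be sketched in the same way. The example below assumes NumPy and uses synthetic data generated from known coefficients, so the fit simply recovers them; none of the values come from the study, and the function name is hypothetical.

```python
import numpy as np

def fit_with_dummy(ai, sci, five_if):
    """OLS fit of 5IF = b0 + b1 * AI + b2 * Sci, where Sci is 1 for
    science-related journals and 0 otherwise. Returns [b0, b1, b2]."""
    design = np.column_stack([np.ones(len(ai)), ai, sci])
    coef, *_ = np.linalg.lstsq(design, np.asarray(five_if, dtype=float), rcond=None)
    return coef

# Synthetic data built from known coefficients (0.3, 2.0, 0.5).
ai = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.2])
sci = np.array([0, 0, 1, 0, 1, 0])
five_if = 0.3 + 2.0 * ai + 0.5 * sci
coef = fit_with_dummy(ai, sci, five_if)
print(coef)
```

The third coefficient is the estimated shift in 5IF for a science-related journal relative to another journal with the same AI score.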

Significance tests are not appropriate here, since the analysis is based on data for the entire population of interest: all the journals, in seven social science disciplines, for which AI and 5IF data are available. No attempt is made to extend the findings to other subject areas.

4. Results

4.1. Does AI overestimate 5IF for science-related journals?

For the seven subject areas shown in Table 3, the correlations between AI and 5IF are similar to those reported in previous research. The r value for economics (0.88) is comparable to the 0.91 value reported by Chang et al. (2011a) but higher than the 0.56 value reported elsewhere (West et al. 2012b). Overall, AI is a good or excellent predictor of 5IF.

The regression residuals (Table 4) demonstrate that AI does not systematically overestimate 5IF for social science journals in subfields that are related to the natural sciences. As the table shows, each of the seven subfields has an average residual value that is positive rather than negative. That is, AI tends to underestimate 5IF for the journals in science-related subfields. (See the economics of health care category, in particular.) Only 6 of the 56 journals have negative error terms: one in biological and medical anthropology, one in scientific and technical communication, three in biopsychology, and one in medical sociology and sociobiology.

Table 3

Descriptive statistics and simple linear regressions.

Subject area        N  5IF avg.  5IF SD  Equation                  SE est.     r

Anthropology       70      1.26    1.09  5IF = 0.1373 + 2.0685 AI     0.42  0.92
Communication      55      1.39    0.93  5IF = 0.2834 + 1.7199 AI     0.41  0.90
Economics         276      1.51    1.36  5IF = 0.6732 + 0.6743 AI     0.64  0.88
Education         145      1.23    0.93  5IF = 0.3308 + 1.7759 AI     0.43  0.89
Inf. & Lib. Sci.   68      1.41    1.36  5IF = 0.2194 + 2.4458 AI     0.39  0.96
Psychology        497      2.38    2.48  5IF = 0.4230 + 2.0085 AI     0.55  0.98
Sociology         115      1.34    1.11  5IF = 0.3598 + 1.3892 AI     0.42  0.92


Table 4

Simple linear regressions. Predicted values and residuals of the 56 journals in social science subfields that are related to the natural sciences. Resid. is the residual value for each journal. Comp. resid. is the average residual value of the three journals ranked immediately higher and the three journals ranked immediately lower (on the basis of 5IF) in the relevant subject area (anthropology, communication, etc.). Diff. is the difference between Resid. and Comp. resid.

                                                           Pred.           Comp.
Subfield / Journal                                   5IF     5IF  Resid.  resid.   Diff.

Biological and medical anthropology (average)       2.49    2.07    0.42    0.17    0.25
Evolutionary Anthropology                           4.64    4.80   -0.16    0.60   -0.76
Journal of Human Evolution                          4.53    3.35    1.18   -0.08    1.26
American Journal of Physical Anthropology           2.85    1.97    0.88    0.05    0.83
American Journal of Human Biology                   2.39    1.73    0.66    0.18    0.47
Human Nature                                        2.37    2.25    0.12    0.20   -0.09
Yearbook of Physical Anthropology                   2.23    1.77    0.46    0.15    0.32
Medical Anthropology Quarterly                      1.82    1.51    0.31    0.19    0.12
Medical Anthropology                                1.77    1.51    0.26    0.19    0.07
Annals of Human Biology                             1.76    1.25    0.51    0.12    0.38
Culture, Medicine and Psychiatry                    1.66    1.43    0.22    0.29   -0.07
Human Biology                                       1.38    1.22    0.16   -0.04    0.20

Scientific and technical communication (average)    1.49    1.14    0.35    0.02    0.33
Public Understanding of Science                     2.47    1.72    0.75    0.30    0.45
Science Communication                               2.42    1.70    0.72    0.27    0.46
Journal of Health Communication                     2.31    1.69    0.62   -0.01    0.63
Health Communication                                1.74    1.39    0.35    0.17    0.18
Environmental Communication                         0.99    0.68    0.31   -0.11    0.42
IEEE Transactions on Professional Communication     0.84    0.65    0.19   -0.05    0.23
Technical Communication                             0.70    0.61    0.09   -0.15    0.24
Journal of Business and Technical Communication     0.45    0.64   -0.19   -0.23    0.03

Economics of health care (average)                  2.79    1.47    1.32    0.69    0.64
PharmacoEconomics                                   3.54    1.46    2.09    0.98    1.10
Journal of Health Economics                         3.03    1.85    1.18    0.45    0.73
Value in Health                                     2.90    1.35    1.56    1.12    0.44
Health Economics                                    2.79    1.57    1.22    0.96    0.26
Economics and Human Biology                         2.51    1.45    1.06    0.16    0.90
European Journal of Health Economics                1.98    1.13    0.85    0.46    0.39

Science education (average)                         2.08    1.55    0.52    0.17    0.35
Journal of Research in Science Teaching             3.23    2.34    0.89    0.98   -0.09
Science Education                                   2.71    2.32    0.39    0.28    0.11
Advances in Health Sciences Education               2.61    2.06    0.54    0.00    0.55
Health Education Research                           2.44    1.82    0.62   -0.32    0.94
Physical Review: Physics Education Research         2.13    1.24    0.89    0.10    0.79
Journal of School Health                            2.01    1.41    0.61    0.37    0.24
Journal of American College Health                  1.99    1.42    0.57    0.41    0.16
Journal of Engineering Education                    1.92    1.50    0.42    0.34    0.08
International Journal of Science Education          1.80    1.34    0.46    0.17    0.29
Research in Science Education                       1.58    1.29    0.29    0.15    0.14
Health Education Journal                            1.29    1.07    0.22   -0.13    0.35
Chemistry Education Research and Practice           1.20    0.84    0.36   -0.29    0.65

Scientometrics (average)                            2.78    2.33    0.46    0.33    0.13
Journal of Informetrics                             3.99    3.21    0.78   -0.22    1.00
Scientometrics                                      2.21    1.68    0.52    0.52    0.00
JASIST                                              2.16    2.09    0.07    0.68   -0.61

Biopsychology (average)                             3.98    3.60    0.38    0.08    0.30
Behavioral and Brain Sciences                      23.17   22.45    0.72    0.95   -0.23
Biological Psychology                               4.34    3.37    0.98    0.23    0.74
Evolution and Human Behavior                        4.25    3.76    0.49    0.59   -0.10
Psychophysiology                                    4.01    3.17    0.84   -0.13    0.97
Physiology and Behavior                             3.34    2.41    0.93    0.08    0.86
Experimental and Clinical Psychopharmacology        3.20    2.42    0.78    0.14    0.65
International Journal of Psychophysiology           2.66    2.19    0.47   -0.03    0.50
Journal of Exp. Psy.: Animal Behavior Processes     2.46    2.21    0.25    0.30   -0.05
Journal of Psychophysiology                         2.13    1.94    0.19    0.02    0.17
Learning and Behavior                               1.89    1.84    0.05    0.09   -0.04
Behavioural Processes                               1.63    1.49    0.13   -0.27    0.40
Journal of the Experimental Analysis of Behavior    1.12    1.16   -0.04   -0.11    0.07
Learning and Motivation                             0.75    1.02   -0.27   -0.36    0.09
Integrative Psychological and Behavioral Science    0.74    0.92   -0.18   -0.34    0.16

Medical sociology and sociobiology (average)        1.51    1.09    0.42   -0.15    0.57
Sociology of Health and Illness                     2.44    1.53    0.91   -0.15    1.06
Health Sociology Review                             0.58    0.66   -0.07   -0.15    0.08

The underestimation of 5IF for these science-related journals may possibly result from a combination of two factors: (1) the relatively high impact (5IF) of journals in subfields that are related to the natural sciences (Table 5), and (2) the positive correlation between 5IF and the regression residuals within each of the seven disciplines. Of the 56 journals shown in Table 4, 51 are ranked at the 40th percentile or higher within their broad subject areas. All but one of those 51 journals have positive residuals. In contrast, just 5 of the 56 journals are ranked below the 40th percentile in their subject areas; all 5 have negative residuals. For the set of all 1,226 journals, the correlations between 5IF and the residuals vary by subject area, from 0.22 in psychology to 0.47 in economics, but the average correlation (0.38) indicates a


Table 5

Average and minimum percentile ranks of science-related journals, based on 5IF within each broad subject area (anthropology, communication, etc.).

Subfield                                  Avg.   Min.

Biological and medical anthropology       84th   66th
Scientific and technical communication    65     33
Economics of health care                  88     80
Science education                         88     73
Scientometrics                            84     78
Biopsychology                             62     15
Medical sociology and sociobiology        66     40

Can the underestimation of 5IF for natural science-related journals be attributed to their scientific emphasis, or does it simply result from the fact that AI tends to underestimate 5IF for higher-impact journals? The two rightmost columns of Table 4 suggest that scientific emphasis does play a role. Each Comp. resid. (comparable residuals) value shows the average of the residuals for the three higher-impact journals and the three lower-impact journals that are closest in 5IF to each science-related journal, within the relevant subject category (anthropology, communication, etc.). Each Diff. value shows the difference between the journal’s residual and the average residual of the journals in the same subject category that are most similar in impact. (Diff. equals Resid. minus Comp. resid.) If the underestimation of 5IF for these 56 journals can be attributed solely to their high impact, we can expect an average Diff. value near zero within each subject area.
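The Comp. resid. and Diff. calculations can be sketched as follows. The function name and the comparison pool are hypothetical, and two details are assumptions because the text does not specify them: journals tied with the target on 5IF are skipped, and fewer than three neighbours are used when the target sits near the top or bottom of the ranking.

```python
def comparable_residual(target_5if, pool, k=3):
    """Mean residual of the k pool journals ranked just above and the k ranked
    just below the target on 5IF. `pool` holds (five_if, residual) pairs."""
    higher = sorted((j for j in pool if j[0] > target_5if), key=lambda j: j[0])[:k]
    lower = sorted((j for j in pool if j[0] < target_5if),
                   key=lambda j: j[0], reverse=True)[:k]
    neighbours = higher + lower
    if not neighbours:
        raise ValueError("no comparable journals in the pool")
    return sum(resid for _, resid in neighbours) / len(neighbours)

# Hypothetical comparison journals: (5IF, residual).
pool = [(3.0, 0.4), (2.8, 0.1), (2.6, 0.3), (2.5, 0.2),
        (2.2, 0.0), (2.0, -0.1), (1.9, 0.1), (1.5, -0.3)]
comp = comparable_residual(2.4, pool)  # neighbours: 2.5, 2.6, 2.8 and 2.2, 2.0, 1.9
diff = 0.5 - comp                      # Diff. = Resid. - Comp. resid.

# Check against one row of Table 4 (Journal of Human Evolution).
jhe_diff = round(1.18 - (-0.08), 2)
print(comp, diff, jhe_diff)
```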

In fact, however, the average Diff. value is positive for each of the seven subfields. Positive values can be seen for eight of the eleven biological and medical anthropology journals, all eight of the scientific and technical communication journals, all six of the health care economics journals, eleven of the twelve science education journals, two of the three scientometrics journals, ten of the fourteen biopsychology journals, and both of the medical sociology journals. This indicates that the journals’ low AI scores (relative to their 5IF values) can be attributed at least partly to factors other than their high impact.

Scatterplots reveal that the relationship between AI and 5IF is essentially linear for each of the seven disciplines. (See Fig. 1, for example.) Linear relationships can also be seen in several earlier studies (Elkins et al. 2010; Rousseau & STIMULATE 8 Group 2009; Waltman & van Eck 2010; West et al. 2012b; Yin 2011). Nonetheless, several types of curvilinear regression were used to investigate whether a nonlinear relationship might explain the underestimation of 5IF for journals in science-related subfields. For four of the seven disciplines, a third-order polynomial regression resulted in the best fit. For three disciplines—communication, economics, and sociology—a power curve resulted in the best fit.

The use of curvilinear regression improves the predictive power of the equations only slightly, however. It increases the r value by more than 0.02 for only one field, economics (0.04). More importantly, the use of curvilinear regression does not reduce the prevalence of positive residuals among the journals with high IFs. For the linear regressions, the average correlation between 5IF and the residuals is 0.38; for the curvilinear regressions, 0.40. Adjusting the regressions to account for nonlinearity has no real impact on the strength of the overall relationships or on the relatively high residuals of the science-related journals. That is, the very modest curvilinearity identified here cannot account for the fact that science-related journals have low AI scores relative to their Impact Factors.
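The two curvilinear forms mentioned above can be sketched with NumPy. The article does not say how the power curve was estimated, so fitting it by OLS on log-transformed values is an assumption here; the function names are hypothetical and the data are synthetic, constructed to lie exactly on a power curve.

```python
import numpy as np

def fit_cubic(ai, five_if):
    """Third-order polynomial fit; coefficients are returned highest power first."""
    return np.polyfit(ai, five_if, 3)

def fit_power(ai, five_if):
    """Fit 5IF = a * AI**b by OLS on log-transformed values (positive data only)."""
    b, log_a = np.polyfit(np.log(ai), np.log(five_if), 1)
    return np.exp(log_a), b

def r_value(observed, predicted):
    """Correlation between observed and predicted values."""
    return np.corrcoef(observed, predicted)[0, 1]

ai = np.array([0.2, 0.4, 0.7, 1.0, 1.5, 2.0])
five_if = 2.0 * ai ** 0.8              # synthetic: a = 2.0, b = 0.8
a, b = fit_power(ai, five_if)
cubic_r = r_value(five_if, np.polyval(fit_cubic(ai, five_if), ai))
print(a, b, cubic_r)
```

Comparing `r_value` across the linear, cubic, and power fits for the same journals is one way to judge whether the curvilinear forms add much predictive power.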


Fig. 1. Article Influence scores and five-year Impact Factors of 70 anthropology journals.

4.2. The independent effect of science-related status

The inclusion of a dummy variable for science-related status (Sci, coded 1 for journals in subfields related to the natural sciences) improves the predictive power of the regressions only slightly. Likewise, it reduces the standard error of estimate by just 0.02, on average (Table 6). The overall impact is minor simply because the Sci variable introduces relatively little explanatory variance; 1,170 of the 1,226 journals have values of 0 for Sci.

Table 6

Linear regressions with a dummy variable, Sci, coded 1 for journals in subfields that are related to the natural sciences.

Subject area      Equation                               SE est.     r

Anthropology      5IF = 0.1301 + 1.9180 AI + 0.5649 Sci     0.38  0.94
Communication     5IF = 0.1971 + 1.7589 AI + 0.4208 Sci     0.39  0.91
Economics         5IF = 0.6431 + 0.6749 AI + 1.3544 Sci     0.61  0.90
Education         5IF = 0.3032 + 1.7361 AI + 0.5772 Sci     0.40  0.90
Inf. & Lib. Sci.  5IF = 0.2118 + 2.4170 AI + 0.4899 Sci     0.38  0.96
Psychology        5IF = 0.4164 + 2.0038 AI + 0.3959 Sci     0.55  0.98
Sociology         5IF = 0.3507 + 1.3916 AI + 0.4268 Sci     0.42  0.93


Table 7

Estimated impact of science-related status: the extent to which AI underestimates 5IF for journals in subfields that are related to the natural sciences.

Subject area      In 5IF units  In SD

Anthropology              0.56   0.52
Communication             0.42   0.45
Economics                 1.35   1.00
Education                 0.58   0.62
Inf. & Lib. Sci.          0.49   0.36
Psychology                0.40   0.16
Sociology                 0.43   0.38

Nonetheless, the dummy variable has a considerable impact on the 5IF estimates for the science-related journals. For the seven disciplines, the average Sci coefficient is 0.60 5IF units or 0.50 SD (Table 7). In anthropology (N = 70), an effect of that magnitude is enough to move a mid-ranked journal at least 10 places up or down in the rankings. In economics, failure to consider science-related status can lead to the underestimation of 5IF by an entire standard deviation.
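The two averages quoted above can be checked against the values in Tables 6 and 7. The short Python sketch below does only that arithmetic; the variable names are introduced here for illustration.

```python
from statistics import mean

# Sci coefficients from Table 6 (effect of science-related status, in 5IF units).
sci_coef = {
    "Anthropology": 0.5649,
    "Communication": 0.4208,
    "Economics": 1.3544,
    "Education": 0.5772,
    "Inf. & Lib. Sci.": 0.4899,
    "Psychology": 0.3959,
    "Sociology": 0.4268,
}

# The same effects expressed in standard deviations (Table 7).
sci_sd = {
    "Anthropology": 0.52,
    "Communication": 0.45,
    "Economics": 1.00,
    "Education": 0.62,
    "Inf. & Lib. Sci.": 0.36,
    "Psychology": 0.16,
    "Sociology": 0.38,
}

# Unweighted means across the seven subject areas.
avg_units = mean(sci_coef.values())
avg_sd = mean(sci_sd.values())
print(f"{avg_units:.2f} 5IF units, {avg_sd:.2f} SD")
```

These are unweighted means across disciplines, so economics, with the largest coefficient, raises the average noticeably even though it contributes only six science-related journals.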

As Table 8 shows, the inclusion of the Sci variable eliminates the high residuals identified in the earlier regressions (Section 4.1). It also results in an average residual value of 0.00 for each of the seven science-related subfields. Within each subfield, the journals’ residuals are about evenly split between positive and negative values, and the inclusion of Sci reduces the average magnitude of the residuals from 0.55 (Table 4) to 0.29 (Table 8). This effect is consistent across the seven subfields. Likewise, the Diff. values shown in Table 8 are consistent with a better-fitting and more fully specified regression model. Without the Sci variable, the average Diff. value is 0.34. With the Sci variable, the average value is -0.17.

The correlation between 5IF and the residuals persists even when Sci is included in the regression equations. For the earlier regressions (Table 4), the average correlation between 5IF and the residuals is 0.38. For these new regressions (Table 8), it is 0.36. AI tends to underestimate 5IF for higher-impact journals even after we control for Sci. This suggests that the underestimation associated with science-related status is largely independent of the underestimation associated with high impact.

5. Conclusion

Within the seven subject areas evaluated here, Article Influence scores do not systematically overestimate the citation impact (5IF) of social science journals in subfields that are related to higher-impact natural science disciplines. In fact, AI scores tend to underestimate 5IF for journals that are related to the natural sciences. Such journals have high 5IFs relative to their AI scores; that is, low AI scores relative to their 5IFs.

This finding should be interpreted against the backdrop of a more general relationship: AI scores are especially likely to underestimate 5IF values for higher-impact journals, that is, those with high 5IFs, whether related to the natural sciences or not. A similar result has been reported with regard to the SNIP indicator. In comparison with measures such as the Impact Factor, SNIP “makes differences among journals smaller” (Colledge et al. 2010, p. 220). The reason for this is not clear, although it may be related to the fact that until recently, both AI and SNIP counted citations that appeared in popular magazines and lower-ranked journals, citations that are not considered in the calculation of the Impact Factor (Moed 2010; West et al. 2012c). Since October 2012, however, SNIP has been redefined to exclude citations in trade journals and in publications with relatively few references to previous work (Waltman, van Eck, van Leeuwen, and Visser 2013).

Table 8

Linear regressions with a dummy variable, Sci, coded 1 for journals in subfields that are related to the natural sciences. Predicted values and residuals of the 56 journals in social science subfields that are related to the natural sciences. Resid. is the residual value for each journal. Comp. resid. is the average residual value of the three journals ranked immediately higher and the three journals ranked immediately lower (on the basis of 5IF) in the relevant subject area (anthropology, communication, etc.). Diff. is the difference between Resid. and Comp. resid.

Journal                                              5IF  Pred. 5IF  Resid.  Comp. resid.  Diff.

Biological and medical anthropology (average)       2.49       2.49    0.00          0.20  -0.20
Evolutionary Anthropology                           4.64       5.02   -0.38          0.66  -1.04
Journal of Human Evolution                          4.53       3.68    0.85          0.06   0.79
American Journal of Physical Anthropology           2.85       2.39    0.46          0.18   0.28
American Journal of Human Biology                   2.39       2.17    0.22          0.16   0.05
Human Nature                                        2.37       2.65   -0.29          0.18  -0.46
Yearbook of Physical Anthropology                   2.23       2.21    0.02          0.20  -0.18
Medical Anthropology Quarterly                      1.82       1.97   -0.15          0.12  -0.27
Medical Anthropology                                1.77       1.97   -0.20          0.12  -0.32
Annals of Human Biology                             1.76       1.73    0.03          0.05  -0.02
Culture, Medicine and Psychiatry                    1.66       1.90   -0.24          0.38  -0.62
Human Biology                                       1.38       1.70   -0.32          0.05  -0.37

Scientific and technical communication (average)    1.49       1.49    0.00          0.01  -0.01
Public Understanding of Science                     2.47       2.09    0.38          0.27   0.11
Science Communication                               2.42       2.06    0.36          0.17   0.19
Journal of Health Communication                     2.31       2.06    0.25         -0.04   0.29
Health Communication                                1.74       1.75   -0.01          0.23  -0.23
Environmental Communication                         0.99       1.03   -0.04         -0.11   0.07
IEEE Transactions on Professional Communication     0.84       0.99   -0.16         -0.12  -0.04
Technical Communication                             0.70       0.95   -0.25         -0.15  -0.10
Journal of Business and Technical Communication     0.45       0.99   -0.54         -0.15  -0.39

Economics of health care (average)                  2.79       2.79    0.00          0.64  -0.64
PharmacoEconomics                                   3.54       2.78    0.76          1.01  -0.25
Journal of Health Economics                         3.03       3.17   -0.14          0.48  -0.63
Value in Health                                     2.90       2.67    0.23          0.92  -0.69
Health Economics                                    2.79       2.89   -0.11          0.76  -0.87
Economics and Human Biology                         2.51       2.77   -0.26          0.19  -0.45
European Journal of Health Economics                1.98       2.46   -0.48          0.49  -0.97

Science education (average)                         2.08       2.08    0.00          0.15  -0.15
Journal of Research in Science Teaching             3.23       2.84    0.38          1.05  -0.66
Science Education                                   2.71       2.83   -0.11          0.26  -0.37
Advances in Health Sciences Education               2.61       2.57    0.03         -0.12   0.15
Health Education Research                           2.44       2.34    0.11         -0.33   0.44
Physical Review: Physics Education Research         2.13       1.77    0.36          0.17   0.19
Journal of School Health                            2.01       1.93    0.08          0.33  -0.25
Journal of American College Health                  1.99       1.94    0.05          0.27  -0.22
Journal of Engineering Education                    1.92       2.03   -0.10          0.21  -0.31
International Journal of Science Education          1.80       1.86   -0.07          0.13  -0.20
Research in Science Education                       1.58       1.82   -0.24          0.20  -0.44
Health Education Journal                            1.29       1.60   -0.31         -0.07  -0.24
Chemistry Education Research and Practice           1.20       1.37   -0.17         -0.23   0.06

Scientometrics (average)                            2.78       2.78    0.00          0.31  -0.31
Journal of Informetrics                             3.99       3.65    0.33         -0.16   0.49
Scientometrics                                      2.21       2.15    0.06          0.46  -0.41
JASIST                                              2.16       2.55   -0.39          0.62  -1.01

Biopsychology (average)                             3.98       3.98    0.00          0.08  -0.08
Behavioral and Brain Sciences                      23.17      22.79    0.38          1.00  -0.61
Biological Psychology                               4.34       3.75    0.59          0.18   0.41
Evolution and Human Behavior                        4.25       4.14    0.11          0.54  -0.43
Psychophysiology                                    4.01       3.55    0.46         -0.12   0.57
Physiology and Behavior                             3.34       2.79    0.55          0.09   0.46
Experimental and Clinical Psychopharmacology        3.20       2.81    0.40          0.15   0.25
International Journal of Psychophysiology           2.66       2.58    0.09         -0.02   0.11
Journal of Exp. Psy.: Animal Behavior Processes     2.46       2.59   -0.14          0.31  -0.45
Journal of Psychophysiology                         2.13       2.33   -0.20          0.03  -0.23
Learning and Behavior                               1.89       2.22   -0.33          0.10  -0.43
Behavioural Processes                               1.63       1.88   -0.25         -0.26   0.01
Journal of the Experimental Analysis of Behavior    1.12       1.55   -0.43         -0.10  -0.33
Learning and Motivation                             0.75       1.41   -0.66         -0.42  -0.24
Integrative Psychological and Behavioral Science    0.74       1.31   -0.57         -0.39  -0.17

Medical sociology and sociobiology (average)        1.51       1.51    0.00         -0.14   0.14
Sociology of Health and Illness                     2.44       1.95    0.49         -0.14   0.63
Health Sociology Review                             0.58       1.07   -0.49         -0.14  -0.35
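The Resid., Comp. resid., and Diff. columns of Table 8 follow from the definitions given in the table caption. A minimal Python sketch of that calculation is shown here under stated assumptions: the function name comparison_stats is introduced for illustration, the seven journals A to G and their values are hypothetical rather than taken from the article, and the treatment of journals near the top or bottom of a ranking (where fewer than three neighbours exist on one side) is a guess, since the source does not describe it.

```python
from statistics import mean


def comparison_stats(journals, target, window=3):
    """Return (Resid., Comp. resid., Diff.) for the journal named `target`.

    journals: list of (name, five_if, predicted_five_if) tuples, all drawn
    from the same subject area. window=3 follows the Table 8 caption.
    """
    # Rank journals by observed 5IF, highest first.
    ranked = sorted(journals, key=lambda j: j[1], reverse=True)
    residuals = [five_if - predicted for _, five_if, predicted in ranked]
    i = next(k for k, j in enumerate(ranked) if j[0] == target)
    # Residuals of the journals ranked immediately higher and lower.
    # Assumption: near the ends of the ranking, use whichever neighbours exist.
    neighbours = residuals[max(0, i - window):i] + residuals[i + 1:i + 1 + window]
    comp = mean(neighbours)
    return residuals[i], comp, residuals[i] - comp


# Hypothetical (name, 5IF, predicted 5IF) values for one subject area.
journals = [
    ("A", 3.0, 2.8),
    ("B", 2.6, 2.7),
    ("C", 2.4, 2.3),
    ("D", 2.0, 2.5),
    ("E", 1.8, 1.7),
    ("F", 1.5, 1.6),
    ("G", 1.2, 1.0),
]

resid, comp, diff = comparison_stats(journals, "D")
print(f"Resid. {resid:+.2f}  Comp. resid. {comp:+.2f}  Diff. {diff:+.2f}")
```

A Diff. value far from zero indicates that a journal's residual is unlike those of similarly ranked journals in its subject area, which is how the table separates the effect of science-related status from the effect of high impact.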

These results reveal that the underestimation of 5IF associated with science-related status is largely independent of the underestimation associated with high impact. Just as AI tends to underestimate 5IF for high-impact journals regardless of their connection (or lack of connection) to the natural sciences, AI tends to underestimate 5IF for science-related journals regardless of their 5IF values. Moreover, the effect of science-related status cannot be attributed to the modest curvilinearity of the relationship between AI and 5IF.

Why do the social science journals shown in Tables 4 and 8 have low AI scores relative to their Impact Factors? Since AI scores give more (less) credit for citations that appear in more (less) cited journals, three possibilities come to mind.

(1) Although these social science journals are conceptually related to the natural sciences, they are cited only seldom in natural science journals. Despite their subject coverage, they are not integrated into the literature of the natural sciences.

(2) Although these journals are integrated into the literature of the natural sciences, they are most often cited in the lower-ranked journals, that is, natural science journals with AI scores lower than those of the journals’ home (social science) disciplines. Although this possibility has not been evaluated, there is some evidence to support it. Previous research has shown that articles in journals covering more than one of the natural sciences tend to be cited less often than those in journals that cover just a single natural science discipline (Levitt & Thelwall 2008). Lower citation rates may correspond to citations in lower-ranked journals.

(3) Although these journals are cited regularly in higher-impact natural science journals, they are especially unlikely to be cited in the higher-impact journals of their home disciplines.

Further research may help determine whether any of these explanations are valid.

With regard to the needs of faculty and librarians, these findings show that the careful estimation of Impact Factors from Article Influence scores involves more than a simple rescaling of the values. Although AI is closely correlated with 5IF, there are systematic differences between the two. The equations presented in Table 6 (and, more generally, an awareness of the issues involved) may be useful to authors deciding where to submit their papers, to faculty serving on tenure and promotion committees, and to librarians making journal selection and deselection decisions.

Acknowledgements

References

Abramo, G., D’Angelo, C.A., & Cicero, T. (2012). What is the appropriate length of the publication period over which to assess research performance? Scientometrics, 93(3), 1005–1017.

Althouse, B.M., West, J.D., Bergstrom, C.T., & Bergstrom, T.C. (2009). Differences in Impact Factor across fields and over time. Journal of the American Society for Information Science and Technology, 60(1), 27–34.

Balaban, A.T. (2012). Positive and negative aspects of citation indices and journal Impact Factors. Scientometrics, 92(2), 241–247.

Bar-Ilan, J. (2012). Journal report card. Scientometrics, 92(2), 249–261.

Bergstrom, C.T. (2007). Eigenfactor: Measuring the value and prestige of scholarly journals. College & Research Libraries News, 68(5), 314–316.

Bergstrom, C.T., West, J.D., & Wiseman, M.A. (2008). The Eigenfactor metrics. Journal of Neuroscience, 28(45), 11433–11434.

Castree, N., Rogers, A., & Sherman, D., eds. (2005). Questioning geography: Fundamental debates. Oxford: Blackwell.

Chang, C.-L., McAleer, M., & Oxley, L. (2011a). What makes a great journal great in economics? The singer not the song. Journal of Economic Surveys, 25(2), 326–361.

Chang, C.-L., McAleer, M., & Oxley, L. (2011b). What makes a great journal great in the sciences? Which came first, the chicken or the egg? Scientometrics, 87(1), 17–40.

Colledge, L., de Moya-Anegón, F., Guerrero-Bote, V.P., López-Illescas, C., El Aisati, M., & Moed, H.F. (2010). SJR and SNIP: Two new journal metrics in Elsevier’s Scopus. Serials, 23(3), 215–221.

Davis, P.M. (2008). Eigenfactor: Does the principle of repeated improvement result in better estimates than raw citation counts? Journal of the American Society for Information Science and Technology, 59(13), 2186–2188.

Ding, Y., & Cronin, B. (2011). Popular and/or prestigious? Measures of scholarly esteem. Information Processing and Management, 47(1), 80–96.

Elkins, M.R., Maher, C.G., Herbert, R.D., Moseley, A.M., & Sherrington, C. (2010). Correlation between the Journal Impact Factor and three other journal citation indices. Scientometrics, 85(1), 81–93.

Engemann, K.M., & Wall, H.J. (2009). A journal ranking for the ambitious economist. Federal Reserve Bank of St. Louis Review, 91(3), 127–139.

Fersht, A. (2009). The most influential journals: Impact Factor and Eigenfactor. Proceedings of the National Academy of Sciences, 106(17), 6883–6884.

Franceschet, M. (2010a). Journal influence factors. Journal of Informetrics, 4(3), 239–248.

Franceschet, M. (2010b). Ten good reasons to use the Eigenfactor metrics. Information Processing and Management, 46(5), 555–558.

Garfield, E. (2007). The evolution of the Science Citation Index. International Microbiology, 10(1), 65–69.

Harrison, S., Massey, D., Richards, K., Magilligan, F.J., Thrift, N., & Bender, B. (2004). Thinking across the divide: Perspectives on the conversations between physical and human geography. Area, 36(4), 435–442.

Jacsó, P. (2010a). Differences in the rank position of journals by Eigenfactor metrics and the five-year Impact Factor in the Journal Citation Reports and the Eigenfactor Project web site. Online Information Review, 34(3), 496–508.

Jacsó, P. (2010b). Eigenfactor and Article Influence scores in the Journal Citation Reports. Online Information Review, 34(2), 339–348.

Kochen, M. (1974). Principles of information retrieval. Los Angeles: Melville Publishing.

Leydesdorff, L. (2008). Caveats for the use of citation indicators in research and journal evaluations. Journal of the American Society for Information Science and Technology, 59(2), 278–287.

Liebowitz, S.J., & Palmer, J.P. (1984). Assessing the relative impacts of economics journals. Journal of Economic Literature, 22(1), 77–88.

Moed, H.F. (2010). Measuring contextual citation impact of scientific journals. Journal of Informetrics, 4(3), 265–277.

Pinski, G., & Narin, F. (1976). Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing and Management, 12(5), 297–312.

Postma, E. (2007). Inflated Impact Factors? The true impact of evolutionary papers in non-evolutionary journals. PLoS ONE, 2(10), e999.

Rousseau, R., & STIMULATE 8 Group (2009). On the relation between the WoS Impact Factor, the Eigenfactor, the SCImago Journal Rank, the Article Influence score and the journal h-index. Paper presented at Nanjing University. http://eprints.rclis.org/13304/. Accessed 3 February 2014.

Smolinsky, L., & Lercher, A. (2012). Citation rates in mathematics: A study of variation by subdiscipline. Scientometrics, 91(3), 911–924.

So, C.Y.K. (1998). Citation ranking versus expert judgment in evaluating communication scholars: Effects of research specialty size and individual prominence. Scientometrics, 41(3), 325–333.

Turner, B.L. (2002). Contested identities: Human-environment geography and disciplinary implications in a restructuring academy. Annals of the Association of American Geographers, 92(1), 52–74.

van Leeuwen, T.N. (2012). Discussing some basic critique on Journal Impact Factors: Revision of earlier comments. Scientometrics, 92(2), 443–455.

Vanclay, J. (2009). Bias in the Journal Impact Factor. Scientometrics, 78(1), 3–12.

Viles, H. (2004). Does Area keep you awake at night? Area, 36(4), 337.

Waltman, L., & van Eck, N.J. (2010). The relation between Eigenfactor, Audience Factor, and Influence Weight. Journal of the American Society for Information Science and Technology, 61(7), 1476–1486.

Waltman, L., van Eck, N.J., van Leeuwen, T.N., & Visser, M.S. (2013). Some modifications to the SNIP journal impact indicator. Journal of Informetrics, 7(2), 272–285.

West, J.D., Bergstrom, C.T., Althouse, B.M., Rosvall, M., Bergstrom, T.C., & Vilhena, D. (2012a). A model of research. University of Washington. http://www.eigenfactor.org/methods.php. Accessed 3 February 2014.

West, J.D., Bergstrom, C.T., Althouse, B.M., Rosvall, M., Bergstrom, T.C., & Vilhena, D. (2012b). Correlation of Article Influence with Impact Factor. University of Washington. http://www.eigenfactor.org/stats.php. Accessed 3 February 2014.

West, J.D., Bergstrom, C.T., Althouse, B.M., Rosvall, M., Bergstrom, T.C., & Vilhena, D. (2012c). Why Eigenfactor? University of Washington. http://www.eigenfactor.org/whyeigenfactor.php. Accessed 3 February 2014.

West, J.D., Bergstrom, T.C., & Bergstrom, C.T. (2010a). Big Macs and Eigenfactor scores: Don’t let correlation coefficients fool you. Journal of the American Society for Information Science and Technology, 61(9), 1800–1807.

West, J.D., Bergstrom, T.C., & Bergstrom, C.T. (2010b). The Eigenfactor metrics: A network approach to assessing scholarly journals. College & Research Libraries, 71(3), 236–244.

Yan, E., & Ding, Y. (2010). Weighted citation: An indicator of an article’s prestige. Journal of the American Society for Information Science and Technology, 61(8), 1635–1643.

Yin, C.-Y. (2011). Do Impact Factor, h-index and Eigenfactor of chemical engineering journals correlate well with each other and indicate the journals’ influence and prestige? Current Science, 100(5), 648–653.