Received October 11, 2002. Address for correspondence:
A new classification scheme of science fields and subfields
designed for scientometric evaluation purposes
WOLFGANG GLÄNZEL,*,** ANDRÁS SCHUBERT** *Katholieke Universiteit Leuven, Steunpunt O&O Statistieken, Leuven (Belgium) **Hungarian Academy of Sciences, Institute for Research Organisation, Budapest (Hungary)
A two-level hierarchic system of fields and subfields of the sciences, social sciences and arts & humanities is proposed. The system was specifically designed for scientometric (evaluation) purposes with the ultimate goal of classifying every single document into a well-defined category. This goal was achieved using a three-step iterative process. The basic concepts and some preliminary results are presented.
Introduction
Classification of science into a disciplinary structure is at least as old as science itself. After many centuries of constructive yet inconclusive search for a perfect classification scheme, the only sensible approach to the question appears to be the pragmatic one: what is the optimal scheme for a given practical purpose? To this end, a great many systems have been conceived and installed by general and special libraries, publishers, encyclopedias and, in ever growing number, by electronic databases, internet-based information services, web crawlers, etc. Classification systems developed by the producer of the Science Citation Index (SCI; ISI – Thomson Scientific, PA, USA), by institutions working extensively with this database and by the producers of other multidisciplinary science journal databases deserve special attention (see, for instance, Narin, 1976). These classification systems are mostly based on journal assignment and were originally created for retrieval purposes. Most existing systems, however, proved to have shortcomings when used in the context of research evaluation. The classification of scientific literature into appropriate subject fields is, nevertheless, one of the basic preconditions of valid scientometric analyses. Publication activity and citation habits differ considerably among subfields. In comparative studies, inappropriate reference standards obtained from questionable subject assignment might result in misleading conclusions. This paper, therefore, aims at the development of a new classification system that also covers papers published in multidisciplinary journals and is especially designed for research evaluation purposes.
Methods
For the given practical purpose, two different basic schemes are used: hierarchic and fine-structured classification systems used in information retrieval and more “robust” schemes emphasizing science organisation aspects and science policy needs.
In this paper, a two-level hierarchical classification scheme has been constructed, so that the categories cover the whole scope of the sciences by and large evenly, and the subfields behave consistently in scientometric evaluations, i.e., common standards could be set in each of them regarding publication and citation habits.
The objectives of the work have been approached by three successive steps allowing multiple feedback loops throughout the whole process.
1. The “cognitive” approach (setting the categories):
In this iterative process, an initial scheme has been elaborated on the basis of both the experience of scientometricians and external experts.
2. The “pragmatic” approach (journal classification):
On the basis of existing journal classification schemes the majority of the journal set extracted from the SCI has been classified into the preset subfields. The classification scheme has been adjusted according to co-heading frequency to keep multiple assignments within reasonable limits.
3. The “scientometric” approach (article classification):
Articles published in core journals can be unambiguously classified into the subfield of the given journals. Articles of un-assignable or ambiguously assignable journals are classified individually using the analysis of references. The results of this classification exercise had a retroactive effect on the journal classification and also on the basic fields/subfield structure.
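The co-heading adjustment mentioned in step 2 can be pictured with a small counting sketch. It is not the procedure actually used, only an illustration under assumptions: the journal names and their subfield sets are hypothetical, and `co_heading_counts` and `multi_assignment_share` are made-up helper names. The sketch counts how often two subfield codes are assigned to the same journal and what share of journals carries more than one code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical journal -> subfield assignments (illustrative only).
journal_subfields = {
    "Journal A": {"B1", "C3"},
    "Journal B": {"B1", "B2"},
    "Journal C": {"B1", "C3", "R4"},
    "Journal D": {"G2", "G5"},
    "Journal E": {"R4"},
}

def co_heading_counts(assignments):
    """Count how many journals carry each unordered pair of subfield codes."""
    counts = Counter()
    for codes in assignments.values():
        for pair in combinations(sorted(codes), 2):
            counts[pair] += 1
    return counts

def multi_assignment_share(assignments):
    """Fraction of journals assigned to more than one subfield."""
    multi = sum(1 for codes in assignments.values() if len(codes) > 1)
    return multi / len(assignments)

pair_counts = co_heading_counts(journal_subfields)
for pair, n in pair_counts.most_common():
    print(pair, n)
print(f"multi-assigned journals: {multi_assignment_share(journal_subfields):.0%}")
```

Pairs of codes that co-occur on many journals would be natural candidates for merging or redefinition when the scheme is adjusted; where to draw that line is a judgment the sketch does not make.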
Results
Step 1 – The “cognitive” approach (setting the categories)
The application of the above methods resulted in a system with 12 first-level categories (fields) and 60 second-level categories (subfields) of the sciences. For the social sciences and the humanities 3 major fields and 7 subfields were obtained. The results are presented in Table 1.
Table 1. Fields and subfields of sciences, social sciences and arts & humanities
1. AGRICULTURE & ENVIRONMENT
A1 Agricultural Science & Technology
A2 Plant & Soil Science & Technology
A3 Environmental Science & Technology
A4 Food & Animal Science & Technology
2. BIOLOGY (ORGANISMIC & SUPRAORGANISMIC LEVEL)
Z1 Animal Sciences
Z2 Aquatic Sciences
Z3 Microbiology
Z4 Plant Sciences
Z5 Pure & Applied Ecology
Z6 Veterinary Sciences
3. BIOSCIENCES (GENERAL, CELLULAR & SUBCELLULAR BIOLOGY; GENETICS)
B0 Multidisciplinary Biology
B1 Biochemistry/Biophysics/Molecular Biology
B2 Cell Biology
B3 Genetics & Developmental Biology
4. BIOMEDICAL RESEARCH
R1 Anatomy & Pathology
R2 Biomaterials & Bioengineering
R3 Experimental/Laboratory Medicine
R4 Pharmacology & Toxicology
R5 Physiology
5. CLINICAL AND EXPERIMENTAL MEDICINE I (GENERAL & INTERNAL MEDICINE)
I1 Cardiovascular & Respiratory Medicine
I2 Endocrinology & Metabolism
I3 General & Internal Medicine
I4 Hematology & Oncology
I5 Immunology
6. CLINICAL AND EXPERIMENTAL MEDICINE II (NON-INTERNAL MEDICINE SPECIALTIES)
M1 Age & Gender Related Medicine
M2 Dentistry
M3 Dermatology/Urogenital System
M4 Ophthalmology/Otolaryngology
M5 Paramedicine
M6 Psychiatry & Neurology
M7 Radiology & Nuclear Medicine
M8 Rheumatology/Orthopedics
M9 Surgery
7. NEUROSCIENCE & BEHAVIOR
N1 Neurosciences & Psychopharmacology
N2 Psychology & Behavioral Sciences
8. CHEMISTRY
C0 Multidisciplinary Chemistry
C1 Analytical, Inorganic & Nuclear Chemistry
C2 Applied Chemistry & Chemical Engineering
C3 Organic & Medicinal Chemistry
C4 Physical Chemistry
C5 Polymer Science
C6 Materials Science
9. PHYSICS
P0 Multidisciplinary Physics
P1 Applied Physics
P2 Atomic, Molecular & Chemical Physics
P3 Classical Physics
P4 Mathematical & Theoretical Physics
P5 Particle & Nuclear Physics
P6 Physics of Solids, Fluids and Plasmas
10. GEOSCIENCES & SPACE SCIENCES
G1 Astronomy & Astrophysics
G2 Geosciences & Technology
G3 Hydrology/Oceanography
G4 Meteorology/Atmospheric & Aerospace Science & Technology
G5 Mineralogy & Petrology
11. ENGINEERING
E1 Computer Science/Information Technology
E2 Electrical & Electronic Engineering
E3 Energy & Fuels
E4 General & Traditional Engineering
12. MATHEMATICS
H1 Applied Mathematics
H2 Pure Mathematics
13. SOCIAL SCIENCES I (GENERAL, REGIONAL & COMMUNITY ISSUES)
S1 Education & Information
S2 General, Regional & Community Issues
14. SOCIAL SCIENCES II (ECONOMICAL & POLITICAL ISSUES)
O1 Economics, Business & Management
O2 History, Politics & Law
15. ARTS & HUMANITIES
U1 Arts & Literature
U2 Language & Culture
U3 Philosophy & Religion
An interesting side effect of this new category system is that part of the life-science related fields covered by the SSCI such as parts of Psychology & Behavior and Paramedicine are integrated into the corresponding science areas (see subfields N2 and M5, respectively).
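For readers who want to work with the scheme programmatically, the two-level hierarchy of Table 1 can be held in a plain mapping from field to subfield codes. The following is a minimal sketch; the subfield codes come from Table 1, while the variable names, the shortened field labels and the helpers `codes` and `field_of` are assumptions made for illustration. The final lines re-count the categories stated in the text.

```python
def codes(letter, first, last):
    """Return subfield codes such as ['C0', 'C1', ..., 'C6']."""
    return [f"{letter}{i}" for i in range(first, last + 1)]

# Fields 1-12 of Table 1 (sciences); labels shortened for readability.
SCIENCES = {
    "Agriculture & Environment": codes("A", 1, 4),
    "Biology": codes("Z", 1, 6),
    "Biosciences": codes("B", 0, 3),
    "Biomedical Research": codes("R", 1, 5),
    "Clinical and Experimental Medicine I": codes("I", 1, 5),
    "Clinical and Experimental Medicine II": codes("M", 1, 9),
    "Neuroscience & Behavior": codes("N", 1, 2),
    "Chemistry": codes("C", 0, 6),
    "Physics": codes("P", 0, 6),
    "Geosciences & Space Sciences": codes("G", 1, 5),
    "Engineering": codes("E", 1, 4),
    "Mathematics": codes("H", 1, 2),
}

# Fields 13-15 of Table 1 (social sciences and arts & humanities).
SOCIAL_SCIENCES_AND_HUMANITIES = {
    "Social Sciences I": codes("S", 1, 2),
    "Social Sciences II": codes("O", 1, 2),
    "Arts & Humanities": codes("U", 1, 3),
}

def field_of(subfield):
    """Look up the first-level field of a second-level code."""
    for scheme in (SCIENCES, SOCIAL_SCIENCES_AND_HUMANITIES):
        for field, subs in scheme.items():
            if subfield in subs:
                return field
    raise KeyError(subfield)

n_sub_sci = sum(len(s) for s in SCIENCES.values())
n_sub_ssh = sum(len(s) for s in SOCIAL_SCIENCES_AND_HUMANITIES.values())
print(len(SCIENCES), n_sub_sci)                        # fields, subfields
print(len(SOCIAL_SCIENCES_AND_HUMANITIES), n_sub_ssh)  # fields, subfields
print(field_of("M5"))
```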
Step 2 – The “pragmatic” approach (journal classification)
The majority of the journal set extracted from the SCI could be classified on the basis of existing journal classification schemes into the preset subfields presented in Table 1. The scheme had to be adjusted according to co-heading frequency to keep multiple assignments within reasonable limits. Examples for journal assignment obtained this way are given in Table 2.
Table 2. Example for journal classification based on the ‘pragmatic’ approach
Journal title Vol. year F1 F2 F3 F4
Natural Product Reports 2000 B1 C3
Natural Resources Journal 2000 A3 O2
Natural Toxins 1999 R4
Nature* 2001 X0
Nature & Resources 2000 A3
Nature Biotechnology 1999 Z3
Nature Cell Biology 2001 B1 B2
Journal of the American Chemical Society* 2001 C0
Journal of the American Leather Chemists Association 1996 C2 C6
Journal of the American Musicological Society 2000 U1
Schweizer Archiv für Tierheilkunde 1998 Z6
Schweizerische Mineralogische und Petrographische Mitteilungen 2000 G2 G5
Schweizerisches Archiv für Volkskunde 2001 S2
Science* 2001 X0
*These journals are subject to the ‘scientometric’ approach in step 3
The journals assigned to category ‘X0’, i.e., to multidisciplinary sciences, were subjected to further treatment according to the ‘scientometric approach’ as described in step 3. In particular, the papers published in the journals Nature and Science (see Table 2) were individually assigned to both subfields and major fields. Similarly, papers published in JACS will be individually assigned to second-level categories, whereas they were automatically assigned to the first-level field Chemistry through the journal assignment ‘C0’ (see Table 2).
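The routing just described can be summarised in a few lines of code. This is a sketch, not the actual implementation: the journal codes are copied from Table 2, the set `ASTERISKED` mirrors the journals marked with an asterisk there, and the function name `route` and its return format are assumptions.

```python
# Journal-level codes taken from Table 2.
JOURNAL_CODES = {
    "Natural Product Reports": ["B1", "C3"],
    "Natural Toxins": ["R4"],
    "Nature": ["X0"],
    "Journal of the American Chemical Society": ["C0"],
    "Science": ["X0"],
}

# Journals marked '*' in Table 2, i.e. subject to the 'scientometric' approach.
ASTERISKED = {"Nature", "Science", "Journal of the American Chemical Society"}

def route(journal):
    """Say at which level each classification decision is taken.

    'journal': the journal assignment is used as is.
    'item':    every paper is classified individually (step 3).
    """
    codes = JOURNAL_CODES[journal]
    if journal not in ASTERISKED:
        return {"field": "journal", "subfield": "journal"}
    if "X0" in codes:
        # Multidisciplinary sciences: neither field nor subfield is known.
        return {"field": "item", "subfield": "item"}
    # General journal within one field (e.g. C0): field known, subfield not.
    return {"field": "journal", "subfield": "item"}

for name in JOURNAL_CODES:
    print(name, route(name))
```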
Figure 1. Percentage shares of fields in the total (1998)
Figure 1 gives a first impression of the distribution of publications and citations over fields. For this sample, all papers indexed in the 1998 volume of the CD-Edition of the SCI as Articles, Letters, Notes and Reviews have been taken into consideration. Citations have been counted for a three-year citation window beginning with the publication year, that is, for the 1998-2000 period. The distribution by fields is more balanced than it was in the case of the schemes comprising five and eight fields, respectively, previously used at ISSRU, Budapest. Nevertheless, Chemistry is the largest field in terms of publication output, followed by Physics and the two clinical and experimental medicine fields. The smallest ones are Mathematics, Neuroscience & Behavior and Geosciences & Space Sciences. From the viewpoint of citations, the field Biosciences (General, Cellular & Subcellular Biology; Genetics) receives the lion’s share, followed by the two Clinical and Experimental Medicine fields and the natural science fields, Chemistry and Physics. This is in part a consequence of the known field biases in scientific communication.
A breakdown by second-level categories has been made to visualise the distribution of publication output and citation impact over subfields within major fields. Table 3 gives an insight into the weight and influence the individual subfields have on the field total. The disciplinary citation impact ranges from 0.68 for H2 (Pure Mathematics) to 10.14 for B2 (Cell Biology).
Table 3. Percentage shares of subfields in the main fields and their citation impact (Publications: 1998, Citation window: 1998-2000)
                                                   Share of subfield in the field total   Subfield
FIELD                                    Subfield  Publications     Citations              impact
Agriculture & Environment                A1         8.6%             7.4%                   1.55
                                         A2        23.6%            18.6%                   1.42
                                         A3        38.3%            46.5%                   2.20
                                         A4        34.9%            33.2%                   1.72
Biology                                  Z1        15.7%             9.7%                   1.96
                                         Z2        10.3%             6.7%                   2.07
                                         Z3        41.0%            57.1%                   4.42
                                         Z4        18.4%            17.2%                   2.97
                                         Z5        10.2%             9.2%                   2.85
                                         Z6        11.0%             4.9%                   1.40
Biosciences                              B0         7.4%             3.7%                   3.34
                                         B1        66.9%            72.3%                   7.23
                                         B2        21.1%            32.0%                  10.14
                                         B3        24.3%            23.5%                   6.47
Biomedical Research                      R1        15.3%            13.6%                   3.17
                                         R2         7.2%             4.0%                   1.97
                                         R3        16.5%            29.5%                   6.39
                                         R4        47.5%            41.2%                   3.10
                                         R5        17.3%            14.5%                   2.99
Clinical and Experimental Medicine I     I1        20.4%            17.7%                   3.98
                                         I2        11.1%            11.9%                   4.89
                                         I3        27.3%            20.0%                   3.36
                                         I4        25.4%            31.8%                   5.73
                                         I5        20.3%            25.7%                   5.81
Clinical and Experimental Medicine II    M1        14.9%            11.7%                   2.03
                                         M2         4.4%             2.5%                   1.49
                                         M3        11.2%            11.0%                   2.52
                                         M4         7.6%             5.5%                   1.88
                                         M5        24.9%            28.9%                   3.00
                                         M6        15.1%            20.2%                   3.45
                                         M7         9.4%             9.0%                   2.45
                                         M8         4.3%             3.7%                   2.21
                                         M9        19.3%            15.3%                   2.04
Neuroscience & Behavior                  N1        87.3%            93.8%                   5.58
                                         N2        21.5%            11.1%                   2.67
Chemistry                                C0        14.3%            21.8%                   4.06
                                         C1        21.9%            23.8%                   2.88
                                         C2        11.7%             6.8%                   1.54
                                         C3        15.4%            20.8%                   3.59
                                         C4        22.3%            21.7%                   2.58
                                         C5         6.6%             6.6%                   2.64
Physics                                  P0        17.5%            22.8%                   3.70
                                         P1        29.0%            22.5%                   2.20
                                         P2        10.9%            14.5%                   3.79
                                         P3        17.2%            12.1%                   2.00
                                         P4         6.2%             5.7%                   2.58
                                         P5         9.9%            12.1%                   3.50
                                         P6        28.3%            24.3%                   2.44
Geosciences & Space Sciences             G1        33.8%            54.0%                   5.16
                                         G2        50.4%            41.1%                   2.63
                                         G3        16.3%            17.5%                   3.46
                                         G4        25.2%            23.0%                   2.94
                                         G5         9.2%             3.5%                   1.23
Engineering                              E1        26.6%            25.8%                   1.11
                                         E2        42.8%            49.2%                   1.31
                                         E3        22.6%            21.4%                   1.08
                                         E4        22.8%            16.9%                   0.84
Mathematics                              H1        68.1%            74.0%                   1.03
                                         H2        48.8%            35.0%                   0.68
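How the shares and impact values of Table 3 are computed is not spelled out in the text. The sketch below is therefore only one plausible reading, resting on two assumptions: the impact of a subfield is its mean citation rate (citations received in the window divided by the number of papers), and a paper with several subfield codes counts fully in each of them while the field total counts every paper once. The second assumption would explain why the shares within a field can add up to more than 100%. The paper records are hypothetical.

```python
from collections import defaultdict

# Hypothetical papers of one field: (subfield codes, citations in the window).
papers = [
    ({"N1"}, 8),
    ({"N1"}, 4),
    ({"N1", "N2"}, 6),
    ({"N2"}, 2),
]

def subfield_stats(papers):
    """Per subfield: publication share, citation share, mean citation rate.

    Shares are relative to the field total, in which every paper is counted
    once; a paper with several codes counts fully in each of its subfields.
    """
    n_field = len(papers)
    c_field = sum(c for _, c in papers)
    pubs = defaultdict(int)
    cites = defaultdict(int)
    for codes, c in papers:
        for code in codes:
            pubs[code] += 1
            cites[code] += c
    return {
        code: {
            "pub_share": pubs[code] / n_field,
            "cit_share": cites[code] / c_field,
            "impact": cites[code] / pubs[code],
        }
        for code in pubs
    }

stats = subfield_stats(papers)
for code in sorted(stats):
    s = stats[code]
    print(f"{code}: {s['pub_share']:.1%} {s['cit_share']:.1%} {s['impact']:.2f}")
```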
Step 3 – The “scientometric” approach (article classification)
All papers published in journals not assignable to ‘well-defined’ subject categories have to be assigned individually, i.e., paper by paper. Two levels can be distinguished: first, the Multidisciplinary Science journals like Nature, Science and PNAS US and, second, the general journals not specialised in any particular subject within one broader field, for instance, the chemistry journals Journal of the American Chemical Society (JACS) and Angewandte Chemie – International Edition. Among the possible approaches to solving this problem, we mention only the method of delimiting subfields on the basis of the analysis of cognitive words from the address field proposed by de Bruin and Moed (1993) and the method of analysing the reference literature proposed by Glänzel et al. (1999a, b). The ‘scientometric approach’ applied here is based on the methodology of reference analysis according to Glänzel et al. (1999a, b). Tables 4 and 5 present examples of identified papers published in Nature and Science. As already mentioned in the paper by Glänzel et al. (1999a), a considerable number of papers (mainly papers without specific references and without institutional addresses) published in the two multidisciplinary journals Nature and Science could be considered scientific journalism rather than original reports on scientific research. Nevertheless, ISI usually regards these papers as scientific articles. Such papers might practically be excluded from scientometric analyses.
Table 4. Example for identified papers published in Nature (2000, Vol. 408) (Fi (i = 1, 2, 3, 4) – subject codes with rank i by frequency, % – frequency in per cent)
F1  %     F2  %     F3  %     F4  %     SCI Refs.  1st author  1st page  Title
P3  36.4  P4  27.3  P6  27.3  –   –     11         Zhang J     835       Flexible filaments in a flowing soap film as a model for one-dimensional flag in a two-dimensional wind
Z2  57.9  Z4  21.1  Z5  15.8  –   –     19         Salih A     850       Fluorescent pigments in corals are photoprotective
N1  61.9  N2  23.8  M6  19.0  M7  19.0  21         Stuphorn V  857       Performance monitoring by the supplementary eye field
Table 5. Example for identified papers published in Science (2001, Vol. 294) (Fi (i = 1, 2, 3, 4) – subject codes with rank i by frequency, % – frequency in per cent)
F1  %     F2  %     F3  %     F4  %     SCI Refs.  1st author     1st page  Title
I3  15.4  Z3  15.4  B0  15.4  –   –     13         d’Aignaux JN   1729      Predictability of the UK variant Creutzfeldt-Jakob disease epidemic
P0  33.3  P6  25.0  P1  16.7  –   –     12         Matsuda T      2136      Oscillating rows of vortices in superconductors
G1  53.8  G2  53.8  G3  30.8  G4  30.8  26         Smith DE       2141      Seasonal variations of snow depth on Mars
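The F1 to F4 columns of Tables 4 and 5 can be reproduced by counting the subject codes of the journals cited in a paper's SCI references. The sketch below illustrates that ranking only; it is not the published algorithm of Glänzel et al. (1999a, b), whose thresholds and tie-breaking rules are not given here. The function name `rank_subject_codes` and the alphabetical tie-break are assumptions, and the reference list is hypothetical, constructed so that its shares match the first row of Table 4 (11 references, one of which carries no subject code).

```python
from collections import Counter

def rank_subject_codes(reference_codes, top=4):
    """Rank subject codes by the share of references that carry them.

    reference_codes: one set of subfield codes per cited SCI reference
    (the codes of the cited journal; an empty set if it has none).
    Returns up to `top` pairs (code, percentage of all references).
    """
    n_refs = len(reference_codes)
    counts = Counter(code for codes in reference_codes for code in codes)
    # Most frequent first; ties broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(code, round(100 * n / n_refs, 1)) for code, n in ranked[:top]]

# Hypothetical reference list of a paper with 11 SCI references.
refs = [{"P3"}] * 4 + [{"P4"}] * 3 + [{"P6"}] * 3 + [set()]

ranking = rank_subject_codes(refs)
print(ranking)  # [('P3', 36.4), ('P4', 27.3), ('P6', 27.3)]
```

Because a cited journal may itself carry several codes, the percentages of one paper need not add up to 100, which is consistent with the last rows of both tables.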
The following examples are concerned with the individual assignment of papers published in ‘general’ chemistry journals in 1993. In particular, the American journal JACS and the German journal Angewandte Chemie – International Edition have been chosen. Figures 2 and 3 show the results. Although the two journals have had similar
of both journals (30% and 36%, respectively) is devoted to Organic & Medicinal Chemistry (C3). The assignment of a relatively large share of papers to Multidisciplinary Chemistry (C0) is due to journal self-citations. About 14% of the papers published in JACS in the year under study are devoted to the subfield of Biochemistry/Biophysics/Molecular Biology (B1), whereas about the same share of papers published in Angewandte Chemie could be assigned to Analytical, Inorganic & Nuclear Chemistry (C1).
Figure 2. Example for identified papers published in JACS (1993)
The assignment of papers in both journals to Physics shows that publications can well be assigned to other fields although the journal is a typical chemistry journal. This illustrates that research has become increasingly interdisciplinary.
The share of unidentified papers amounts to 6.4% (JACS) and 13.9% (Angewandte Chemie). For these papers, the assignment to the category Multidisciplinary Chemistry (C0) seems to be justified. However, the two examples show that the majority of the papers can be individually assigned to ‘well-defined’ second-level categories.
Conclusions
Beyond standard uses of the classification scheme, such as the determination of publication profiles for institutions or countries or the calculation of reference standards for relative citation indicators, the profiling of authors and research groups is a further important application. Given the results of article classification, the disciplinary affiliation of their authors can be determined, either individually or by group. The authors’ activity is often not limited to a single subfield; it usually covers a range of subfields with varying weights, from which their field/subfield profile can be constructed. Such profiles are of primary importance in scientometric evaluation: since standards of scientometric indicators can be set only within subfields, it is only the activity profile that can be accompanied by matching profiles of indicators such as impact measures, citation rates or reference age.
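A subfield profile of this kind is straightforward to derive once each paper carries its subfield codes. The following sketch assumes full counting (a paper with several codes counts in each of them, so the weights may add up to more than 1); the author's paper list and the function name `subfield_profile` are hypothetical.

```python
from collections import Counter

def subfield_profile(paper_codes):
    """Share of an author's (or group's) papers in each subfield.

    paper_codes: one set of subfield codes per classified paper.
    """
    counts = Counter(code for codes in paper_codes for code in codes)
    n = len(paper_codes)
    return {code: counts[code] / n for code in sorted(counts)}

# Hypothetical author with five classified papers.
author_papers = [{"C3"}, {"C3", "B1"}, {"B1"}, {"C3"}, {"C1"}]

profile = subfield_profile(author_papers)
for code, weight in profile.items():
    print(f"{code}: {weight:.0%}")
```

Each weight in such a profile can then be paired with the indicator standard of the same subfield, for example the subfield impact values of Table 3.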
References
DE BRUIN, R. E., H. F. MOED, Delimitation of scientific subfields using cognitive words from corporate addresses in scientific publications, Scientometrics, 26 (1993) 65–80.
GLÄNZEL, W., A. SCHUBERT, H. J. CZERWON, An item-by-item subject classification of papers published in multidisciplinary and general journals using reference analysis, Scientometrics, 44 (1999a) 427–439.
GLÄNZEL, W., A. SCHUBERT, U. SCHOEPFLIN, H. J. CZERWON, An item-by-item subject classification of papers published in journals covered by the SSCI database using reference analysis, Scientometrics, 46 (1999b) 431–441.
NARIN, F., Evaluative Bibliometrics: The Use of Publication and Citation Analysis in the Evaluation of Scientific Activity, Computer Horizons, Inc., Washington, D.C., 1976.