RESEARCH ARTICLE

Practical method to reclassify Web
of Science articles into unique subject
categories and broad disciplines

Staša Milojević

an open access journal

Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering,
Indiana University, Bloomington

Citation: Milojević, S. (2020). Practical
method to reclassify Web of Science
articles into unique subject categories
and broad disciplines. Quantitative
Science Studies, 1(1), 183–206.
https://doi.org/10.1162/qss_a_00014

DOI:
https://doi.org/10.1162/qss_a_00014

Received: 17 July 2019
Accepted: 03 December 2019

Corresponding Author:
Staša Milojević
smilojev@indiana.edu

Handling Editor:
Ludo Waltman

Keywords: classification

ABSTRACT

Classification of bibliographic items into subjects and disciplines in large databases is essential
for many quantitative science studies. The Web of Science classification of journals into
approximately 250 subject categories, which has served as a basis for many studies, is
known to have some fundamental problems and several practical limitations that may
affect the results from such studies. Here we present an easily reproducible method to
perform reclassification of the Web of Science into existing subject categories and into 14
broad areas. Our reclassification is at the level of articles, so it preserves disciplinary
differences that may exist among individual articles published in the same journal.
Reclassification also eliminates ambiguous (multiple) categories that are found for 50% of
items and assigns a discipline/field category to all articles that come from broad-coverage
journals such as Nature and Science. The correctness of the assigned subject categories
is evaluated manually and is found to be ∼95%.

1. INTRODUCTION

The problem of the classification of science has attracted the attention of philosophers and
scientists alike for centuries (Dolby, 1979). The practice of classification is usually understood
as a process of arranging things “in groups which are distinct from each other, and are separated
by clearly determined lines of demarcation” (Durkheim & Mauss, 1963, p. 4). However, nature, and
therefore science, with all its complexity, does not conform to any particular categorization or
hierarchical structuring (Bryant, 2000) and there is no singular or perfect classification
(Glänzel & Schubert, 2003). Despite inherent limitations, classifications are of
practical use to organize and study knowledge. Many classification schemes of science and
scientific literature have been proposed, with different levels of granularity and/or hierarchy.
Different schemes have different levels of complexity and sophistication, and criteria can be
constructed to compare and evaluate them (Rafols & Leydesdorff, 2009).

The classification of scientific literature has been pursued within quantitative science studies
since at least the 1970s (e.g., Carpenter & Narin, 1973; Narin, Carpenter, & Berlt, 1972;
Small & Griffith, 1974; Small & Koenig, 1977). A number of studies frame this research as
discipline/field delineation or delimitation (Gläser, Glänzel, & Scharnhorst, 2017; Gómez,
Bordones, Fernandez, & Méndez, 1996; López-Illescas, Noyons, Visser, De Moya-Anegón, &
Moed, 2009; Zitt, 2015). The search for adequate solutions to classification has intensified in
recent years, often motivated by finding appropriate reference sets for citation normalization
needed for evaluation studies (Bornmann, 2014; Glänzel & Schubert, 2003; Haunschild, Schier,
Marx, & Bornmann, 2018; Leydesdorff & Bornmann, 2016).

Copyright: © 2020 Staša Milojević.
Published under a Creative Commons
Attribution 4.0 International (CC BY 4.0)
license.

The MIT Press

Recent classification efforts have most commonly been divided into journal-focused and
paper (article)-focused solutions. The most prevalent and widely used classification of literature
into disciplines is via journals, based on a simplistic assumption that a discipline can be
defined through journal subject categories (Carpenter & Narin, 1973; Narin, 1976; Narin,
Pinski, & Gee, 1976). Such an approach is not surprising: journals often serve as anchors for
individual research communities, and new journals may signify the formation of disciplines.
On a more practical note, the Web of Science (WoS) Journal Citation Reports subject categories
are “one of the few classification systems available, spanning all disciplines” (Rinia, van
Leeuwen, Bruins, van Vuren, & van Raan, 2001, p. 296), and are easy to implement since they are
available for items in one of the most widely used bibliographic databases, WoS. WoS classifies
all of the journals it indexes into approximately 250 groups called subject categories.
Each journal is classified into at least one and up to six subject categories. The classification
uses a number of heuristics, and a rather general description of it is provided by Pudovkin and
Garfield (2002). The WoS classification is not explicitly hierarchical, even though some subject
categories can be considered as part of other, broader ones. In addition, WoS contains categories
that are explicitly broad (labeled as multidisciplinary) in order to describe the content of journals
that publish across one broad area or across the entire field of science.

Over the years, a number of other journal-centered classifications have been developed.
Most of them are hierarchical. For example, Scopus, another major bibliographic database,
uses the All Science Journal Classification (ASJC). The National Science Foundation (NSF) uses a
two-level system in which journals are classified into 14 broad fields and 144 lower level
fields, known as CHI after Narin and Carpenter’s company, Computer Horizons, Inc., which
developed it in the 1970s (Archambault, Beauchesne, & Caruso, 2011). Science-Metrix uses a
three-level classification that classifies journals into exclusive categories using both algorithmic
methods and expert judgment (Archambault et al., 2011). Glänzel and Schubert (2003)
developed the KU Leuven ECOOM journal classification. Gómez-Núñez, Vargas-Quesada, de
Moya-Anegón, and Glänzel (2011) used reference analysis to reclassify the SCImago Journal
and Country Ranks (SJR) journals into 27 areas and 308 subject categories. Some classifications
used a hybrid method combining text and citations to cluster journals (Janssens, Zhang,
De Moor, & Glänzel, 2009). Chen (2008) used WoS as a starting point for developing a
classification using an affinity propagation method on a journal-to-journal citation network. The
University of California San Diego (UCSD) classification has been developed in mapping of
science efforts (Börner et al., 2012).

Journal-level classification suffers from a number of problems, many of which have been
pointed out previously. For example, Klavans and Boyack (2017) found journal-based taxonomies
of science to be less accurate than topic-based ones and therefore argued against
their use. Similar findings were reported in a recent study that carried out direct comparison of
journal- and article-level classifications (Shu et al., 2019) reporting that journal-level classifi-
cations have the potential to misclassify almost half of the papers. The issues with accuracy
might be tied to the increase both in the number of journals that publish papers from multiple
research areas and the number of papers published in those journals, making journal-level
classifications problematic (Gómez et al., 1996; Wang & Waltman, 2016). Although journal-level
classifications underperform compared to article-level classification in microlevel analyses,
they might still be useful for (nonevaluative) macrolevel analysis (Leydesdorff & Rafols,
2009; Rafols & Leydesdorff, 2009).

The use of journals as an appropriate level for classification has been problematized even
for journals with unique, nonmultidisciplinary classification in WoS, given that a journal may
publish articles from different disciplines and would not be the right unit to capture
interdisciplinary activities (Abramo, D’Angelo, & Zhang, 2018; Klavans & Boyack, 2010). Boyack and
Klavans (2011, p. 123) suggest that “few journals are truly disciplinary.” In their study of
research specialties, Small and Griffith (1974) found journals to be too broad a unit of analysis
and called for the use of publications instead. The mounting body of research pointing to the
drawbacks of journal-based classifications has prompted the development of article-level
classifications. These efforts are usually accompanied by the development of new classification
schemes, and are often called algorithmic classifications, due to the clustering techniques used
to come up with classes and categories (Ding, Ahlgren, Yang, & Yue, 2018). Klavans and Boyack
(2010) have pioneered these classifications at large scale using cocitation techniques
(bibliographic coupling of references and keywords) at the paper level to develop the SciTech
Strategies (STS) schema consisting of 554 topics, and an alternative method based on cocitation
analysis of highly cited references to identify over 84,000 paradigms. Further advances in these
techniques were made by Waltman and van Eck (2012), who used direct citations with the
minimum number of publications per cluster and a resolution parameter to come up with a
three-level classification. Their work has been further advanced by creating a number of algorithmic
classifications at different levels of granularity (Ruiz-Castillo & Waltman, 2015) and searches for
the optimal resolution parameter for the level of topics (Sjögårde & Ahlgren, 2018). In addition,
because these methods are based on clustering algorithms, and it has long been argued that the
resulting classifications are not algorithm-neutral (Leydesdorff, 1987), some studies addressed
how different algorithms affect resulting classifications (Šubelj, van Eck, & Waltman, 2016).
In general, the article-based classifications have been praised for being able to classify papers
regardless of the type of journals they were published in and placing each publication into a single
class/category. One of the drawbacks of the paper-level classification is the problem of naming
the classes/categories (Perianes-Rodriguez & Ruiz-Castillo, 2017), making these classifications
problematic for macrolevel analysis (Ding et al., 2018).

The usefulness of classification schemes for science studies and research evaluation is not
determined only by their quality, but also by the availability of a classification of scientific
literature at all levels of analysis (from micro to macro), flexibility for different purposes, and the
simplicity of interpretation and reproduction. Although it is clear that journal-level classifica-
tions in general, and WoS journal-level classification in particular, have a number of short-
comings, they are still widely used, primarily because of their wide availability and the
familiarity of audiences with WoS subject categories. An article-level classification that would
still use the familiar WoS subject categories would be a welcome and practical solution to
some of the problems of journal-level classification, but no such classification currently exists.
The purpose of this work is to fill this gap by presenting a flexible, simple and easily reproduc-
ible method to reclassify WoS items using existing WoS categories, but at the article level.
Such a classification is particularly useful for “descriptive bibliometrics” (Borgman & Furner,
2002) or “science of science” (Fortunato et al., 2018) research, especially when comparison
across all fields and over long time periods is needed.

In addition to being journal level, there are two additional practical problems with WoS
classification that will be addressed in the proposed reclassification. One problem is related
to different levels of specialization of journals (Glänzel, Schubert, & Czerwon, 1999). The
scope of journals ranges from highly specialized ones, via those that cover a whole range
of subfields within a field or a discipline (e.g., general journals in physics or chemistry), to
journals covering multiple disciplines or fields (Narin, 1976). In WoS subject categories,
journals that cover entire large disciplines (broader than typical subject categories) are
classified as “multidisciplinary” (e.g., “Physics, multidisciplinary” includes journals whose
individual articles actually belong to specific subject categories, such as “Physics, nuclear”;
“Optics”; and “Thermodynamics”). In addition, there are journals such as Nature, Science,
and PNAS that cover many disciplines and are classified in WoS as “Multidisciplinary
Sciences.” Such journals rarely carry truly multidisciplinary articles but rather articles from a
large number of disciplines (Katz & Hicks, 1995; Waltman & van Eck, 2012). Altogether, 10%
of WoS items belong to nine explicit “multidisciplinary” categories. Without the means to
establish their true subject category, these articles are often excluded from the analyses of
disciplinary practices, thus removing what are often articles with high impact (Fang, 2015). As a
solution to this problem, a number of researchers have suggested reclassification of individual
articles in such journals, especially in the subject category “Multidisciplinary Sciences.” Many of
the proposed solutions are based on the references of the articles (e.g., Glänzel & Schubert,
2003; Glänzel, Schubert, & Czerwon, 1999; Glänzel, Schubert, Schoepflin, & Czerwon,
1999; López-Illescas et al., 2009). A more recent solution to this problem utilized both citing
and cited publications as a basis for reclassification (Ding et al., 2018). Our article-level
reclassification of WoS classifies articles from such multidisciplinary journals into other more specific
WoS subject categories.

The second problem of WoS classification is the lack of exclusivity (Bornmann, 2014;
Herranz & Ruiz-Castillo, 2012a, b). Namely, many journals in WoS (containing, by our
estimate, 40% of all items in WoS) are assigned more than one subject category. This is in agreement
with other studies, such as Herranz and Ruiz-Castillo (2012a), who reported that 42% of 3.6 million
articles published in 1998–2002 were assigned to more than one category, and Wang and
Waltman (2016), who reported that almost 60% of journals in WoS are assigned a single category.
Multiple subject categories lead to ambiguities when it comes to the analysis. Should such articles
be counted in each category, artificially increasing their weight in the overall analysis? Should they
be counted fractionally, thus decreasing their weight within a single category? How should they be
treated when a nonoverlapping delineation is desired, as is often the case? Most journals are assigned
multiple categories because they cover more than one subject, even though articles in them usually
deal predominantly with one subject. Less often the articles, and not just the journal, are indeed
positioned at the intersection of several subjects, and multiple subjects may be appropriate. In
such cases we may still wish to assign a primary single category to arrive at nonoverlapping
delineation of scientific literature. As in the case of “multidisciplinary” categories, references
have been proposed for the classification of journals (and articles) with multiple WoS categories
into unique categories (e.g., Glänzel & Schubert, 2003; Glänzel, Schubert, & Czerwon, 1999;
Narin, 1976; Narin et al., 1976). Our article-level reclassification will assign the most prevalent
subject category as the single category for each article and remove the ambiguity. Information
regarding potential multidisciplinarity at the level of the article will nevertheless be retained if
required for the analysis.

Finally, many of the large-scale studies, especially those that are comparative in nature,
require a smaller number of broader classes. To achieve this goal, we additionally categorize
articles into 14 broad areas, based on NSF WebCASPAR classification (Javitz et al., 2010).

2. PROPOSED APPROACH

In this paper we propose a reference-based (re)classification system that can easily be applied
at various levels of granularity. The approach is relatively straightforward and allows for easy
reproducibility. Also, by using existing WoS subject categories as units of classification, the
approach obviates the need to develop an independent scheme for defining and naming of the
classes/categories.

Following previous efforts, our approach is to use each item’s references to infer the topic of
a bibliographic item. However, given the problems identified above, we initially use only
references that were published in journals that have a single subject category that is not
“multidisciplinary” (i.e., references not published in multidisciplinary or general disciplinary
journals). Such an approach appears appropriate given that previous studies have found WoS subject
categories to be a fairly precise description of the subjects of individual articles published in
journals described with one or two subject categories (Glänzel, Schubert, & Czerwon, 1999; Glänzel
et al., 1999) and that central journals within particular disciplines “exhibit little cross citing”
(Narin, 1976, p. 194). For the purposes of this paper, we refer to such items as classifier
references or classifiers. The tallying of the subject categories of classifier references allows us to
determine the unique WoS subject category of items that originally had multiple categories or
were placed in multidisciplinary categories. However, what is novel in our approach is that
the method is applied to reclassify all items that contain classifier references, whether they had
unique original (journal-based) classification or not, in order to obtain a consistent
comprehensive classification at the level of individual items (i.e., articles). Also, unlike a number of
other approaches, this one does not apply a particular threshold that an item should meet in
order to be classified into a particular category (e.g., Fang, 2015; Glänzel, Schubert, &
Czerwon, 1999; Gómez-Núñez et al., 2011; López-Illescas et al., 2009), giving every item a
definitive category.

The proposed approach allows both for the classification into exclusive classes (where each
article is placed into a single class) and, if needed for particular research questions, the
construction of a detailed vector description of the disciplinary composition of articles (and
consequently, of journals, authors, etc.), which will be described in a future work.

In the remainder of the paper we describe the data, methodology and evaluation of the
proposed approach using WoS. The approach itself is rather general and a similar methodology
can be used either to reclassify articles in WoS using a different starting classification of
core journals or to classify articles in other databases that use journal-level classifications.
We present the results of the classification of individual items both at the level of subject cat-
egories and an aggregated level of broad research areas. New classifications are evaluated
using an automated method and validated using blind manual classification.

3. DATA AND METHODOLOGY

3.1. Initial Reclassification

For (re)classification we use the full WoS Core Collection database, containing items
published from 1900 through the end of 2017. The database contains 69 million items
(bibliographic entries), of which 55 million have at least one reference recorded in the database.
WoS items belong to different document types: articles, proceedings papers, editorials,
letters, reviews, etc. We perform the classification on (and using) all document types but carry
out the evaluation and validation on the document types article and proceedings paper, that is, the
items containing original research and most often used in analyses. There are 45 million items of
these two types in WoS with at least one reference, and we refer to them collectively as just
the “articles.” The edition of WoS used in this work uses 252 subject categories. Classification
was extracted from the SQL table subjects using the subject category collection referred to
by field ascatype as the “traditional” classification. Categories are listed in Table A.1 in the
Appendix.

For higher level classification, we place each of 252 subject categories into 14 broad areas.
Names of broad areas are taken from NSF WebCASPAR Broad Field (Javitz et al., 2010), ex-
cept that we include their “Other life sciences” within “Medical sciences.” Mapping between
WoS subject categories and our broad areas, given in Table A.1, follows Javitz et al. (2010)
mapping between the ipIQ Fine Field category (formerly CHI category) and WebCASPAR
Broad Field whenever there is an ipIQ category that clearly matches a WoS category. In other
instances (half of all WoS categories) the broad category is determined by the author.

WoS attempts to match each item’s references to other items in WoS. Only items that
have matched references can be reclassified using the proposed method. In addition,
to allow initial classification using our method, the references need to be classifiers (i.e., items
whose original classification is unique and not multidisciplinary). Forty-one million items
contain classifier references and can therefore be classified into subject categories, of which
36 million are articles, representing 79% of all articles with references. We will outline later in
this section how this percentage can be further increased using an iterative approach.
Classification into broader areas is possible for a larger number of items (44 million of any
type, and 38 million articles), because classifiers can include items classified as multidisciplinary
as long as they can be placed in some broad area (e.g., category “Physics, multidisciplinary”
can be used, but “Multidisciplinary Sciences” cannot). The fraction of articles (containing
references) that can be classified, as a function of publication year, is shown in Figure 1. The
fractions are above 90% in recent years and have been relatively high since the 1950s. The rising
trend is likely a combination of several factors: more complete efforts on the part of database
administrators to match the references in recent publications, journal articles becoming “the
central medium for the dissemination and exchange of scientific ideas” (Bowker, 2005, p. 126),
and the overall increase in the number of references per paper over time (Milojević, 2012; Price,
1963; van Raan, 2000), all of which increase the chances of an article containing classifier
references. The items that remain without new classification are rarely full-fledged research
papers but most often items such as book reviews or short conference proceedings.
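The eligibility rules described above can be illustrated with a minimal sketch. It assumes that multidisciplinary categories can be recognized by the word "multidisciplinary" in their name, and it uses a tiny hypothetical excerpt of a category-to-broad-area mapping (the actual mapping is the one in Table A.1); all function names and the broad-area labels are illustrative, not part of the original method description.

```python
# Hypothetical excerpt of a subject category -> broad area mapping.
# The real mapping is given in Table A.1; None marks a category that
# cannot be placed in any single broad area.
BROAD_AREA = {
    "Physics, Nuclear": "Physics",
    "Optics": "Physics",
    "Physics, Multidisciplinary": "Physics",
    "Sociology": "Social sciences",
    "Multidisciplinary Sciences": None,
}


def is_multidisciplinary(category):
    """Assumption: explicitly broad categories carry 'multidisciplinary' in their name."""
    return "multidisciplinary" in category.lower()


def is_subject_classifier(categories):
    """Classifier for subject categories: exactly one original category,
    and that category is not multidisciplinary."""
    return len(categories) == 1 and not is_multidisciplinary(categories[0])


def broad_area_of(categories):
    """Return the single broad area shared by all categories, or None."""
    areas = {BROAD_AREA.get(c) for c in categories}
    if len(areas) == 1:
        return next(iter(areas))  # may itself be None
    return None


def is_broad_classifier(categories):
    """Classifier for broad areas: all original categories map to one broad area."""
    return bool(categories) and broad_area_of(categories) is not None
```

Under these assumptions, an item in both "Optics" and "Physics, Nuclear" cannot serve as a subject-category classifier but can serve as a broad-area classifier, which is consistent with the larger number of classifiers reported for broad areas.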

For classification at the subject category level, 20 million items serve as classifiers. An
algorithm for the entire classification procedure is given in Figure 2. Classification at the level of
subject categories proceeds as follows. For each classifiable item we go through all of its
classifier references and produce a ranked list of their subject categories. The most frequent
subject category is adopted as the new (reclassified) subject category. Most often the distribution
of categories is dominated by the most frequent subject category (the article is predominantly
unidisciplinary). Occasionally, the tallying results in a tie between the two most frequent
categories (13% of cases). We attempt to break the ties by adding to the tally the original
subject category (or categories, if there were multiple). This can be done if the original subject
category is nonmultidisciplinary. In this way, 52% of the ties can be broken. Otherwise, we
adopt as the final classification the category with the larger number of articles.

Figure 1. Percentage of all articles containing references that can be reclassified into subject
categories or broad areas as a function of article publication year. Numbers are based on initial
reclassification. An iterative pass will increase the percentage of articles classified into subject
categories by 5%.

Figure 2. An algorithm (pseudocode) describing the reclassification procedure.
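The tallying and tie-breaking steps described in the text can be sketched in a few lines. This is only one reading of the prose, not a transcription of Figure 2: the function name, the `category_size` lookup (assumed to hold the number of articles per category), and the default test for multidisciplinary categories are all hypothetical.

```python
from collections import Counter


def reclassify(ref_categories, original_categories, category_size,
               is_multi=lambda c: "multidisciplinary" in c.lower()):
    """Return one new category for an item, or None if it has no classifier references.

    ref_categories      -- categories of the item's classifier references (one per reference)
    original_categories -- the item's original, journal-based categories
    category_size       -- hypothetical dict: category -> number of articles in it
    is_multi            -- hypothetical predicate marking multidisciplinary categories
    """
    if not ref_categories:
        return None

    tally = Counter(ref_categories)
    top = max(tally.values())
    tied = [c for c, n in tally.items() if n == top]
    if len(tied) == 1:
        return tied[0]

    # Tie: add the original nonmultidisciplinary categories to the tally.
    for c in original_categories:
        if not is_multi(c):
            tally[c] += 1
    top = max(tally.values())
    tied = [c for c, n in tally.items() if n == top]
    if len(tied) == 1:
        return tied[0]

    # Still tied: fall back on the category with the larger number of articles.
    return max(tied, key=lambda c: category_size.get(c, 0))
```

For example, an item citing one "Optics" and one "Sociology" classifier, published in a journal originally classified as "Sociology", would be assigned "Sociology" under this sketch, because the original category breaks the tie.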

The granularity of reclassified subject categories, defined as the number of items divided by
the sum of the squared numbers of items in each category (Waltman, Boyack, Colavizza, & van Eck,
2019), is 1.5 × 10⁻⁶, compared to 2.3 × 10⁻⁶ for the original classification (i.e., it is relatively
similar). The number of categories of different sizes (i.e., total number of reclassified items) is
presented in Figure 3. Categories span a wide range of sizes.
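As a small illustration of the granularity measure defined in the text (number of items divided by the sum of squared category sizes), the following sketch computes it from a list of category sizes. The function name is hypothetical, and the input is assumed to be non-empty with positive sizes.

```python
def granularity(category_sizes):
    """Granularity = N / sum(n_i ** 2), where n_i is the number of items in
    category i and N = sum(n_i) is the total number of items."""
    sizes = list(category_sizes)
    total = sum(sizes)
    return total / sum(n * n for n in sizes)
```

When all k categories have equal size, the expression reduces to k / N, so for a fixed number of items and categories, a more uneven size distribution yields a lower granularity value.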

Classification at the level of broad areas proceeds in the same way, except that the ranked
list is made of classifiers’ broad areas. For classification into broad areas, the number of
classifiers is 50% larger than in the case of subject categories (30 million), because individual
subject categories of items that have multiple subject categories most often belong to the same
broad area, and such items are therefore eligible to serve as classifiers. For the classification of
items into broad areas, ties happen in 4% of all cases, and can be resolved by including the
original broad area in the ranked list in 69% of those cases. Otherwise, we take the more
populous category as the final one.

Figure 3. Size distribution of WoS subject categories after initial reclassification.

In general, the classification is not sensitive to the extent of the classifier set. We perform a
test in which we base the classification on only half of all available classifiers. The resulting
broad categories agree with the ones obtained with the full classifier set in 94% of cases.
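A sensitivity test of this kind can be sketched as follows. The text does not state how the half of the classifiers was selected or over which items agreement was computed, so this sketch assumes a random half and measures agreement over items classified in both runs; both function names are hypothetical.

```python
import random


def half_sample(classifier_ids, seed=0):
    """Pick a reproducible random half of the classifier identifiers (assumed selection rule)."""
    ids = sorted(classifier_ids)
    return set(random.Random(seed).sample(ids, len(ids) // 2))


def agreement(full, half):
    """Fraction of items labelled in both runs that received the same label.

    full, half -- dicts mapping item id -> assigned category
    """
    shared = full.keys() & half.keys()
    if not shared:
        return float("nan")
    same = sum(1 for item in shared if full[item] == half[item])
    return same / len(shared)
```

In use, one would run the classification once with all classifiers and once with only those returned by `half_sample`, and then pass the two resulting label dictionaries to `agreement`.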

The exact counts pertaining to the data set and initial reclassification are provided in
Tables 1 and 2.

3.2. Iterative Reclassification

Once the reclassification has been carried out, it is possible and often recommended to carry
out the process of reclassification iteratively. In iterative reclassification, the tallying of subject
categories of references and the determination of which reference can serve as classifier is
based on the reclassified subject categories (or broad areas, for the high-level classification).
The process can be repeated multiple times, but here we limit ourselves to one iterative pass
and to the quality and extensiveness of this second reclassification compared to the first. The
iterative pass is procedurally similar to the original one, and the needed modifications are laid
out in Figure 2. After the iterative pass 9% of items acquire a different broad-area classification,
and 20% of items acquire a different subject category.
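One iterative pass can be sketched as a re-tally in which the labels from the previous pass replace the original journal-based categories. This is a simplified illustration, not the procedure of Figure 2: it assumes that any item labelled in the previous pass can serve as a classifier, and it resolves ties by preferring the item's previous label and otherwise the alphabetically first category, which is a stand-in for the tie-breaking rules of the initial pass. The function names are hypothetical.

```python
from collections import Counter


def iterative_pass(references, labels):
    """Re-tally each item's references using the labels from the previous pass.

    references -- dict: item id -> list of cited item ids (matched references)
    labels     -- dict: item id -> single category assigned in the previous pass
    Returns a new dict of labels; items with no labelled references are omitted.
    """
    new_labels = {}
    for item, refs in references.items():
        tally = Counter(labels[r] for r in refs if r in labels)
        if not tally:
            continue
        top = max(tally.values())
        tied = sorted(c for c, n in tally.items() if n == top)
        # Simplified tie handling (assumption): keep the previous label if it
        # is among the tied categories, otherwise take the first alphabetically.
        prev = labels.get(item)
        new_labels[item] = prev if prev in tied else tied[0]
    return new_labels


def changed_fraction(old, new):
    """Fraction of items labelled in both passes whose label changed."""
    shared = old.keys() & new.keys()
    if not shared:
        return 0.0
    return sum(old[i] != new[i] for i in shared) / len(shared)
```

An item that had no classifier references in the initial pass can receive a label here as soon as any of its references was labelled earlier, which is how the iterative pass extends coverage.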

There are two principal reasons for carrying out the iterative pass: an increase in the number
of items that can be classified, and, potentially, an increased accuracy of new categories.
In the original pass only items that had classifier references could be classified, which, as we
have shown, represents 79% of all articles, and around 90% of recent articles. Items that
only had references with multiple original categories and/or multidisciplinary categories could
not be classified. However, after the first reclassification, most of these references will receive
a unique, nonmultidisciplinary classification and can now serve as classifiers. The numbers of
items and articles that can be classified in the iterative pass are presented in Table 3.
Comparing these numbers to those in Table 2 we see a relatively significant increase in the
number of items or articles that get classified into subject categories (∼8%) and a more modest
increase of items/articles classified into broad areas (∼2%).

Table 1. Number of items from the Web of Science used in (re)classification

                            All types          Articles + conference proceedings
All items                   69,326,147         49,775,351
with references             54,581,163         45,219,572
multidisciplinary           5,585,211          4,640,854
multidisciplinary science   1,317,033          1,071,437

Table 2. Number of classified items of different types after initial reclassification. Percentage in parentheses is with respect to all such items
with references

                            Subject category classification                        Broad area classification
                            All types          Articles + conference proceedings   All types          Articles + conference proceedings
Classifier items            20,286,801                                             29,853,395
Classified items            41,132,197 (75%)   35,940,588 (79%)                    43,847,374 (80%)   38,118,382 (84%)
multidisciplinary           3,719,208 (67%)    2,599,373 (56%)
multidisciplinary science   896,169 (68%)      740,592 (69%)                       909,543 (69%)      792,875 (74%)

The increase in completeness from the iterative pass is especially significant in cases
where the majority of the journals in some discipline originally had multiple WoS categories
and were therefore precluded from serving as classifier references. Although such cases are not
common in general, one of them happens to include core journals in quantitative studies of
science. Specifically, Journal of Informetrics (JoI), Scientometrics, and Journal of the Association
for Information Science and Technology (JASIST) are all listed with two WoS subject categories:
“Computer Science, Interdisciplinary Applications” and “Information Science & Library
Science,” which means that they cannot serve as classifiers, at least not in the initial pass. For
example, out of 840 items published in JoI, 663 can be classified in the first pass (79%), a lower
fraction than average. Interestingly, of the classified items, 41% received the classification of
“Information Science & Library Science,” whereas essentially none were classified as “Computer
Science, Interdisciplinary Applications.” This shows that the reclassification successfully rejected
this obviously inappropriate categorization. In the iterative pass, however, the number of
classified articles increased substantially, to 796 (95% of total). Furthermore, 52% have now
received the classification of “Information Science & Library Science,” the most of any category.
Other frequent categories included “Economics” (9%), “History and Philosophy of Science”
(8%), and “Sociology” (6%).

Table 3. Number of classified items of different types after the second (iterative) reclassification

| | Subject category classification: All types | Subject category classification: Articles + conference proceedings | Broad area classification: All types | Broad area classification: Articles + conference proceedings |
|---|---|---|---|---|
| Classifier items | 36,104,403 | | 38,504,614 | |
| Classified items | 44,349,678 (81%) | 38,450,585 (85%) | 44,936,331 (82%) | 38,918,386 (86%) |
| Multidisciplinary | 4,317,080 (77%) | 2,931,707 (63%) | | |
| Multidisciplinary science | 1,011,770 (77%) | 804,203 (75%) | 968,783 (74%) | 822,849 (77%) |


To conclude, extending the classification to include the iterative pass increases
the number of classified items (especially at the subject category level), and in certain cases
the increase can be quite significant.

4. VALIDATION AND EVALUATION

The validation and evaluation of the approach and of the final reclassification is performed
using three tests, each serving a separate purpose:

1. Automatic internal test against the original WoS classification, in order to validate the methodology.

2. Manual tests in order to evaluate the accuracy of reclassification in comparison to the original WoS classification.

3. Manual external test in order to evaluate the overall reliability of the resulting classification.

4.1. Validation

To validate the methodology and hone the approach, we have performed an automatic test by calculating the percentage of articles whose original and new classifications agree. This test can only be performed on items whose original classification was unique and nonmultidisciplinary. The test is internal, because we do not evaluate the accuracy of the original WoS classification using any external knowledge. We do not expect the test to produce 100% agreement: first, the reclassification is at the level of articles, whose topics may differ to some extent from those of their journals, and second, the subject categories are rarely entirely mutually exclusive, so a reclassified category may be related to but not exactly the same as the original one. The value of this test is in the relative assessment. When evaluating, for example, two article-level classification schemes, the one that has a higher level of agreement with a reference classification, however imperfect (in this case the original classification), should be considered more accurate internally. For the reclassification at the level of subject categories we find the overall agreement to be 66% after the initial reclassification and 58% after the iterative pass. In comparison, an alternative classification scheme that we devised but ultimately did not adopt, which uses the similarity of titles to perform reclassification, had an agreement of <50%. For this alternative method we calculated TF-IDF (“term frequency-inverse document frequency”) values between each article title to be reclassified and each of the classifier articles (articles that have a unique nonmultidisciplinary WoS category). In this case, IDF actually represents inverse title word frequency, which was first determined from the entire data set, and TF-IDF is the sum of the IDFs of all the words that overlap. For an article to be classified we adopt the category of the classifier article with the greatest TF-IDF value.

The level of agreement varies from one subject category to another. It is highest for astronomy and astrophysics (97%). The number of articles in different categories varies widely, with the largest category being 2,000 times larger than the smallest (see Figure 3). We find that the agreement is correlated with the size of the subject category, with larger categories having a higher level of agreement. This is probably because some of the smaller categories can also be considered subcategories of larger ones, so many of the articles get reclassified into these larger categories. The opposite (an item that was originally in a larger category being reclassified into a smaller one) is less likely simply because there are fewer classifiers that belong to smaller categories. Furthermore, small categories may represent more recent disciplines, which would naturally cite works from the disciplines from which they emerged. As we will see shortly, this lower level of agreement for smaller categories does not imply that the new category is incorrect; it may simply be placing individual items in a related, equally correct, subject category or may reflect a high degree of interdisciplinarity of an article.

We perform a similar automatic validation for broad-area classification and find the overall agreement of 85% after the initial reclassification and 82% after the iterative pass. Agreement in different areas is now more similar, ranging from 60% for agricultural sciences (which tends to be highly interdisciplinary) to 93% for astronomy and astrophysics (which has a low degree of interdisciplinarity), and the level of agreement is not correlated with the size of the area.

4.2. Manual Evaluation of Accuracy

The internal validation in itself does not allow us to evaluate the quality of the reclassification with respect to the original classification. We assess this by manual evaluation, performed by the author, in the following way. For 142 randomly selected articles whose original classification was unique and nonmultidisciplinary, we output the original and new subject categories. The order in which the two categories are written out is randomly reversed in 50% of cases. The evaluator does not know a priori which category is original and which is new; this information is saved separately and is used only after the evaluation has been performed. The evaluator’s task is to select the subject category that better describes the article based on its title (and abstract, if necessary), but ignoring the name of the journal, so as not to bias the assessment, because the journal topic was the basis for the original classification. If both categories are estimated to be equally appropriate, this is also indicated. After the initial reclassification, 91 out of 142 articles had the same new and old category (64%; in agreement with the full sample).
For 25 articles, the old and new categories were equally good (most often because one category can be considered a part of another). Of the remaining 26 articles, the original classification was considered better in 15 cases and the new one in 11 cases. In the 15 cases where the original classification was considered better, the new one was still essentially correct in 13 cases. Altogether, the initial reclassification is nearly as good as the original one (i.e., we have not introduced spurious results in the process of reclassification). The differences between the original and new classifications revealed by automated validation can be attributed to articles’ interdisciplinarity (such that both categories are correct) and to the somewhat stratified, nonexclusive nature of WoS subject categories (again making both categories correct).

Manual evaluation is also carried out for the same 142 articles for their broad-area classifications. The areas agree for 124 articles (87%; in agreement with the full sample) and are considered equally good in four cases. Of the remaining 14 articles, the original classification is considered better in only three cases, and the new area is considered more accurate in the remaining 11 cases (i.e., the new classification is overall somewhat better).

4.3. Manual Evaluation of Reliability

The overall reliability of the new classification is what is ultimately of most interest. We test it based on an external assessment, which looks at all items irrespective of how the items were originally classified (i.e., it includes items that originally had ambiguous classification or where the classification was effectively missing because the item was published in a multidisciplinary journal). The test is performed by the author by evaluating the correctness of subject categories and broad areas of 100 randomly selected items, based on their titles and abstracts.
We find 92% of subject categories and 95% of broad areas to be correct after the initial reclassification. The accuracy increased to 95% for subject categories and 97% for broad areas after the iterative pass. It needs to be pointed out that whereas the error rate is relatively small across the entire data set, it need not be uniform in different disciplines or for different journals, so it is advisable to perform similar manual tests for subsets of a data set that one wishes to study.

5. EXTENSION OF THE METHOD USING CITATION DATA

It is in principle possible to adapt our method to use not only the references as the basis for reclassification but also the citations. Citations, at least in the initial reclassification, would also have to come from sources that have a unique, nonmultidisciplinary WoS category. The use of citations may allow some items to be classified that otherwise did not have classifier references. We carry out such reclassification at the broad-area level and find that the number of classified items increases from 43,847,374 (63% of all possible items, regardless of whether they had references or not) to 47,593,363 (69%). The increase exceeds that from the iterative pass (44,936,331 or 65%). The fraction is still short of 100% because most of the items that lack references also lack citations (most of them are not really citable items). One possible drawback of using citations is the disproportionality of information available for different items. Unlike references, the number of which tends to be normally distributed, citations follow a power-law distribution, with most articles having few citations and few having thousands.
Furthermore, citations constantly change, making the proposed procedure essentially nonreproducible. There are 6% of articles with no linked references or citations. These are mostly items more than half a century old. For these items, one could apply the TF-IDF method that we discussed in Section 4, which has 100% completeness.

6. DISCUSSION AND CONCLUSION

This paper proposes a method of classification that is based on references and applies it to classifying WoS articles, both at the field and broad research area levels. Although some of the proposed clustering-based methods may lead to a better delineation, especially for citation normalization, the proposed method has a number of advantages: It is easily replicated and utilizes widely used WoS subject categories and NSF broad subject areas, does not require extensive computational resources (∼40 million articles can be classified on a personal computer within several hours), and avoids the problem of naming classes/categories (something that article-level classifications have struggled with but are making progress on due to more sophisticated natural language processing approaches and including a wider range of fields of bibliographic records). The major purpose for this classification is devising a flexible and simple way of classifying all of the WoS literature for the purposes of “descriptive bibliometrics” or “science of science” studies. The classification has not been designed for the purposes of research evaluation, and if used in that context, may be outperformed by approaches that identify more focused comparison sets, as in Colliander and Ahlgren (2019), for example.

The major limitations of the proposed method are tied to its usage of WoS subject categories as a starting point and references as a major source of data.
Because it uses WoS subject categories as seeds, the proposed classification will inherit some of the known problems of this classification, primarily having to do with erroneous lumping of unconnected journals into a single category. This limitation can potentially be alleviated by the iterative procedure. Furthermore, because the method is based on references, it can be applied only to the items that have references. This should not be a problem with most contemporary original research but may prove problematic for other types of contributions and for older items. At the same time, relying on references rather than citations, as in some other studies, has some advantages, since more articles have cited other works than are cited themselves. This should lead to a higher recall than citation-based classifications have. An approach that combines references and citations is also possible and was described.

Overall, we find the error rate of the resulting classification to be relatively low (<5%), making it a reasonably reliable basis for a wide range of studies. However, the accuracy may be higher or lower for specific research areas, so, as with any classification, users should exercise caution and validate the classification for the sample of interest. Also, as we have pointed out, especially at the level of 252 subject categories, it is often the case that more than one category is essentially correct, so it is advisable to consider all potentially relevant categories when the recall of a sample is important. This is less of an issue for broad areas.
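
The title-based alternative mentioned in Sections 4.1 and 5 could look roughly like the following Python sketch: each title word gets an inverse-frequency weight computed over all titles, the score between two titles is the sum of the weights of their shared words, and the item takes the category of the best-scoring classifier title. The tokenization, the `log(N / df)` form of the weight, and all function and variable names are assumptions made for illustration; the text above does not specify these details.

```python
import math
from collections import Counter

def tokenize(title):
    """Lowercase and split on whitespace; returns the set of distinct words."""
    return set(title.lower().split())

def build_idf(titles):
    """Inverse title-word frequency over the whole set of titles.
    Assumption: idf(w) = log(N / number of titles containing w)."""
    n = len(titles)
    df = Counter(w for t in titles for w in tokenize(t))
    return {w: math.log(n / c) for w, c in df.items()}

def classify_by_title(title, classifiers, idf):
    """classifiers: list of (title, category) pairs with unique categories.
    Returns the category of the classifier title with the highest summed
    IDF over shared words, or None if no word with nonzero weight overlaps."""
    words = tokenize(title)
    best_score, best_cat = 0.0, None
    for c_title, cat in classifiers:
        score = sum(idf.get(w, 0.0) for w in words & tokenize(c_title))
        if score > best_score:
            best_score, best_cat = score, cat
    return best_cat

# Hypothetical toy data.
classifiers = [
    ("Galaxy formation in the early universe", "Astronomy & Astrophysics"),
    ("Citation analysis of the journals", "Information Science & Library Science"),
]
target = "Dwarf galaxy formation"
idf = build_idf([t for t, _ in classifiers] + [target])
category = classify_by_title(target, classifiers, idf)
```

With a real data set the pairwise loop would be replaced by an inverted index from words to classifier titles, since comparing every title against every classifier title does not scale to tens of millions of items.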
ACKNOWLEDGMENTS

This work uses Web of Science data by Clarivate Analytics provided by the Indiana University Network Science Institute and the Cyberinfrastructure for Network Science Center at Indiana University.

AUTHOR CONTRIBUTIONS

Staša Milojević: conceptualization, data curation, formal analysis, methodology, writing.

COMPETING INTERESTS

No competing interests to declare.

FUNDING INFORMATION

This material is partially based upon work supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0391.

DATA AVAILABILITY

The data used in this paper is proprietary and cannot be posted in a repository.

REFERENCES

Abramo, G., D’Angelo, C. A., & Zhang, L. (2018). A comparison of two approaches for measuring interdisciplinary research output: The disciplinary diversity of authors vs the disciplinary diversity of the reference list. Journal of Informetrics, 12(4), 1182–1193.

Archambault, É., Beauchesne, O. H., & Caruso, J. (2011). Towards a multilingual, comprehensive and open scientific journal ontology. Paper presented at the Proceedings of the 13th International Conference of the International Society for Scientometrics and Informetrics, Durban, South Africa.

Borgman, C. L., & Furner, J. (2002). Scholarly communication and bibliometrics. In B. Cronin (Ed.), Annual Review of Information Science and Technology (Vol. 36, pp. 3–72). Medford, NJ: Information Today.

Börner, K., Klavans, R., Patek, M., Zoss, A. M., Biberstine, J. R., Light, R. P., …, Boyack, K. W. (2012). Design and update of a classification system: The UCSD map of science. PLOS One, 7(7), e39464.

Bornmann, L. (2014). Assigning publications to multiple subject categories for bibliometric analysis: An empirical case study based on percentiles. Journal of Documentation, 70(1), 52–61.

Bowker, G. C. (2005). Memory practices in the sciences. Cambridge, MA: MIT Press.

Boyack, K. W., & Klavans, R. (2011). Multiple dimensions of journal specificity: Why journals can’t be assigned to disciplines. Paper presented at the 13th Conference of the International Society for Scientometrics and Informetrics, Durban, South Africa.

Bryant, R. (2000). Discovery and decision: Exploring the metaphysics and epistemology of scientific classification. London: Associated University Presses.

Carpenter, M. P., & Narin, F. (1973). Clustering of scientific journals. Journal of the American Society for Information Science, 24(6), 425–436.

Chen, C. M. (2008). Classification of scientific networks using aggregated journal-journal citation relations in the Journal Citation Reports. Journal of the American Society for Information Science and Technology, 59(14), 2296–2304.

Colliander, C., & Ahlgren, P. (2019). Comparison of publication level approaches to ex post citation normalization. Scientometrics, 120(1), 283–300.

Ding, J., Ahlgren, P., Yang, L., & Yue, T. (2018). Disciplinary structures in Nature, Science and PNAS: Journal and country levels. Scientometrics, 116(3), 1817–1852.

Dolby, R. G. A. (1979). Classification of the sciences: The nineteenth century tradition. In R. F. Ellen & D. Reason (Eds.), Classifications in Their Social Context (pp. 167–193). London: Academic Press.

Durkheim, E., & Mauss, M. (1963). Primitive classification. Chicago: University of Chicago Press.

Fang, H. (2015). Classifying research articles in multidisciplinary science journals into subject categories. Knowledge Organization, 42(3), 139–153.

Fortunato, S., Bergstrom, C. T., Börner, K., Evans, J. A., Helbing, D., Milojević, S., …, Barabási, A.-L. (2018). Science of science. Science, 359(6379), eaao0185.

Glänzel, W., & Schubert, A. (2003). A new classification scheme of science fields and subfields designed for scientometric evaluation purposes. Scientometrics, 56(3), 357–367.

Glänzel, W., Schubert, A., & Czerwon, H. J. (1999). An item-by-item subject classification of papers published in multidisciplinary and general journals using reference analysis. Scientometrics, 44(3), 427–439.

Glänzel, W., Schubert, A., Schoepflin, U., & Czerwon, H. J. (1999). An item-by-item subject classification of papers published in journals covered by the SSCI database using reference analysis. Scientometrics, 46(3), 431–441.

Gläser, J., Glänzel, W., & Scharnhorst, A. (2017). Same data—different results? Towards a comparative approach to the identification of thematic structures in science. Scientometrics, 111(2), 981–998.

Gómez-Núñez, A. J., Vargas-Quesada, B., de Moya-Anegón, F., & Glänzel, W. (2011). Improving SCImago Journal & Country Rank (SJR) subject classification through reference analysis. Scientometrics, 89(3), 741–758.

Gómez, I., Bordons, M., Fernandez, M., & Méndez, A. (1996). Coping with the problem of subject classification diversity. Scientometrics, 35(2), 223–235.

Haunschild, R., Schier, H., Marx, W., & Bornmann, L. (2018). Algorithmically generated subject categories based on citation relations: An empirical micro study using papers on overall water splitting. Journal of Informetrics, 12(2), 436–447.

Herranz, N., & Ruiz-Castillo, J. (2012a). Multiplicative and fractional strategies when journals are assigned to several subfields. Journal of the American Society for Information Science and Technology, 63(11), 2195–2205.

Herranz, N., & Ruiz-Castillo, J. (2012b). Sub-field normalization in the multiplicative case: High- and low-impact citation indicators. Research Evaluation, 21(2), 113–125.

Janssens, F., Zhang, L., De Moor, B., & Glänzel, W. (2009). Hybrid clustering for validation and improvement of subject-classification schemes. Information Processing & Management, 45(6), 683–702.

Javitz, H., Grimes, T., Hill, D., Rapoport, A., Bell, R., Fecso, R., & Lehming, R. (2010). U.S. Academic Scientific Publishing. Working paper SRS 11-201. Arlington, VA: National Science Foundation, Division of Science Resources Statistics.

Katz, J. S., & Hicks, D. (1995). The classification of interdisciplinary journals: A new approach. Paper presented at the Proceedings of the Fifth International Conference of the International Society for Scientometrics and Informetrics, Rosary College, River Forest, IL.

Klavans, R., & Boyack, K. W. (2010). Toward an objective, reliable and accurate method for measuring research leadership. Scientometrics, 82(3), 539–553.

Klavans, R., & Boyack, K. W. (2017). Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? Journal of the Association for Information Science and Technology, 68(4), 984–998.

Leydesdorff, L. (1987). Various methods for the mapping of science. Scientometrics, 11(5–6), 295–324.

Leydesdorff, L., & Bornmann, L. (2016). The operationalization of “fields” as WoS subject categories (WCs) in evaluative bibliometrics: The cases of “library and information science” and “science & technology studies.” Journal of the Association for Information Science and Technology, 67(3), 707–714.

Leydesdorff, L., & Rafols, I. (2009). A global map of science based on the ISI subject categories. Journal of the American Society for Information Science and Technology, 60(2), 348–362.

López-Illescas, C., Noyons, E. C., Visser, M. S., De Moya-Anegón, F., & Moed, H. F. (2009). Expansion of scientific journal categories using reference analysis: How can it be done and does it make a difference? Scientometrics, 79(3), 473–490.

Milojević, S. (2012). How are academic age, productivity and collaboration related to citing behavior of researchers? PLOS One, 7(11), e49176.

Narin, F. (1976). Evaluative bibliometrics: The use of publication and citation analysis in the evaluation of scientific activity. Cherry Hill, NJ: Computer Horizons.

Narin, F., Carpenter, M., & Berlt, N. C. (1972). Interrelationships of scientific journals. Journal of the American Society for Information Science, 23(5), 323–331.

Narin, F., Pinski, G., & Gee, H. H. (1976). Structure of the biomedical literature. Journal of the American Society for Information Science, 27(1), 25–45.

Perianes-Rodriguez, A., & Ruiz-Castillo, J. (2017). A comparison of the Web of Science and publication-level classification systems of science. Journal of Informetrics, 11(1), 32–45.

Price, D. J. d. S. (1963). Little science, big science. New York: Columbia University Press.

Pudovkin, A. I., & Garfield, E. (2002). Algorithmic procedure for finding semantically related journals. Journal of the American Society for Information Science and Technology, 53(13), 1113–1119.

Rafols, I., & Leydesdorff, L. (2009). Content-based and algorithmic classifications of journals: Perspectives on the dynamics of scientific communication and indexer effects. Journal of the American Society for Information Science and Technology, 60(9), 1823–1835.

Rinia, E. J., van Leeuwen, T. N., Bruins, E. E. W., van Vuren, H. G., & Van Raan, A. F. J. (2001). Citation delay in interdisciplinary knowledge exchange. Scientometrics, 51(1), 293–309.

Ruiz-Castillo, J., & Waltman, L. (2015). Field-normalized citation impact indicators using algorithmically constructed classification systems of science. Journal of Informetrics, 9(1), 102–117.

Shu, F., Julien, C.-A., Zhang, L., Qiu, J., Zhang, J., & Larivière, V. (2019). Comparing journal and paper level classifications of science. Journal of Informetrics, 13(1), 202–225.

Sjögårde, P., & Ahlgren, P. (2018). Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics. Journal of Informetrics, 12(1), 133–152.

Small, H., & Griffith, B. C. (1974). The structure of scientific literatures I: Identifying and graphing specialties. Science Studies, 4(1), 17–40.

Small, H., & Koenig, M. E. D. (1977). Journal clustering using a bibliographic coupling method. Information Processing & Management, 13(5), 277–288.

Šubelj, L., van Eck, N. J., & Waltman, L. (2016). Clustering scientific publications based on citation relations: A systematic comparison of different methods. PLOS One, 11(4), e0154404.

van Raan, A. F. J. (2000). On growth, ageing, and fractal differentiation of science. Scientometrics, 47(2), 347–362.

Waltman, L., Boyack, K. W., Colavizza, G., & Van Eck, N. J. (2019). A principled methodology for comparing relatedness measures for clustering publications. arXiv:1901.06815.

Waltman, L., & van Eck, N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378–2392.

Wang, Q., & Waltman, L. (2016). Large-scale analysis of the accuracy of the journal classification systems of Web of Science and Scopus. Journal of Informetrics, 10(2), 347–364.

Zitt, M. (2015). Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation. Scientometrics, 102(3), 2223–2245.

APPENDIX

Table A1.
The list of WoS subject categories and corresponding broad areas

| WoS subject category | Broad area |
|---|---|
| Agriculture, Dairy & Animal Science | Agricultural sciences |
| Agriculture, Multidisciplinary | Agricultural sciences |
| Agronomy | Agricultural sciences |
| Fisheries | Agricultural sciences |
| Food Science & Technology | Agricultural sciences |
| Forestry | Agricultural sciences |
| Green & Sustainable Science & Technology | Agricultural sciences |
| Horticulture | Agricultural sciences |
| Astronomy & Astrophysics | Astronomy |
| Anatomy & Morphology | Biological sciences |
| Biochemical Research Methods | Biological sciences |
| Biochemistry & Molecular Biology | Biological sciences |
| Biodiversity Conservation | Biological sciences |
| Biology | Biological sciences |
| Biophysics | Biological sciences |
| Biotechnology & Applied Microbiology | Biological sciences |
| Cell & Tissue Engineering | Biological sciences |
| Cell Biology | Biological sciences |
| Developmental Biology | Biological sciences |
| Ecology | Biological sciences |
| Entomology | Biological sciences |
| Evolutionary Biology | Biological sciences |
| Genetics & Heredity | Biological sciences |
| Microbiology | Biological sciences |
| Mycology | Biological sciences |
| Nutrition & Dietetics | Biological sciences |
| Ornithology | Biological sciences |
| Paleontology | Biological sciences |
| Parasitology | Biological sciences |
| Physiology | Biological sciences |
| Plant Sciences | Biological sciences |
| Reproductive Biology | Biological sciences |
| Virology | Biological sciences |
| Zoology | Biological sciences |
| Chemistry, Analytical | Chemistry |
| Chemistry, Applied | Chemistry |
| Chemistry, Inorganic & Nuclear | Chemistry |
| Chemistry, Medicinal | Chemistry |
| Chemistry, Multidisciplinary | Chemistry |
| Chemistry, Organic | Chemistry |
| Chemistry, Physical | Chemistry |
| Crystallography | Chemistry |
| Electrochemistry | Chemistry |
| Polymer Science | Chemistry |
| Spectroscopy | Chemistry |
| Computer Science, Artificial Intelligence | Computer sciences |
| Computer Science, Cybernetics | Computer sciences |
| Computer Science, Hardware & Architecture | Computer sciences |
| Computer Science, Information Systems | Computer sciences |
| Computer Science, Interdisciplinary Applications | Computer sciences |
| Computer Science, Software Engineering | Computer sciences |
| Computer Science, Theory & Methods | Computer sciences |
| Medical Informatics | Computer sciences |
| Agricultural Engineering | Engineering |
| Automation & Control Systems | Engineering |
| Construction & Building Technology | Engineering |
| Energy & Fuels | Engineering |
| Engineering, Aerospace | Engineering |
| Engineering, Biomedical | Engineering |
| Engineering, Chemical | Engineering |
| Engineering, Civil | Engineering |
| Engineering, Electrical & Electronic | Engineering |
| Engineering, Environmental | Engineering |
| Engineering, Geological | Engineering |
| Engineering, Industrial | Engineering |
| Engineering, Manufacturing | Engineering |
| Engineering, Marine | Engineering |
| Engineering, Mechanical | Engineering |
| Engineering, Multidisciplinary | Engineering |
| Engineering, Ocean | Engineering |
| Engineering, Petroleum | Engineering |
| Imaging Science & Photographic Technology | Engineering |
| Instruments & Instrumentation | Engineering |
| Materials Science, Biomaterials | Engineering |
| Materials Science, Ceramics | Engineering |
| Materials Science, Characterization & Testing | Engineering |
| Materials Science, Coatings & Films | Engineering |
| Materials Science, Composites | Engineering |
| Materials Science, Multidisciplinary | Engineering |
| Materials Science, Paper & Wood | Engineering |
| Materials Science, Textiles | Engineering |
| Mathematical & Computational Biology | Engineering |
| Medical Laboratory Technology | Engineering |
| Metallurgy & Metallurgical Engineering | Engineering |
| Mining & Mineral Processing | Engineering |
| Nanoscience & Nanotechnology | Engineering |
| Neuroimaging | Engineering |
| Nuclear Science & Technology | Engineering |
| Operations Research & Management Science | Engineering |
| Remote Sensing | Engineering |
| Robotics | Engineering |
| Telecommunications | Engineering |
| Transportation | Engineering |
| Transportation Science & Technology | Engineering |
| Environmental Sciences | Geosciences |
| Environmental Studies | Geosciences |
| Geochemistry & Geophysics | Geosciences |
| Geography, Physical | Geosciences |
| Geology | Geosciences |
| Geosciences, Multidisciplinary | Geosciences |
| Limnology | Geosciences |
| Marine & Freshwater Biology | Geosciences |
| Meteorology & Atmospheric Sciences | Geosciences |
| Mineralogy | Geosciences |
| Oceanography | Geosciences |
| Soil Science | Geosciences |
| Water Resources | Geosciences |
| Archaeology | Humanities |
| Architecture | Humanities |
| Art | Humanities |
| Asian Studies | Humanities |
| Classics | Humanities |
| Cultural Studies | Humanities |
| Dance | Humanities |
| Ethics | Humanities |
| Ethnic Studies | Humanities |
| Film, Radio, Television | Humanities |
| Folklore | Humanities |
| History | Humanities |
| History & Philosophy Of Science | Humanities |
| History Of Social Sciences | Humanities |
| Humanities, Multidisciplinary | Humanities |
| Language & Linguistics | Humanities |
| Literary Reviews | Humanities |
| Literary Theory & Criticism | Humanities |
| Literature | Humanities |
| Literature, African, Australian, Canadian | Humanities |
| Literature, American | Humanities |
| Literature, British Isles | Humanities |
| Literature, German, Dutch, Scandinavian | Humanities |
| Literature, Romance | Humanities |
| Literature, Slavic | Humanities |
| Logic | Humanities |
| Medical Ethics | Humanities |
| Medieval & Renaissance Studies | Humanities |
| Music | Humanities |
| Philosophy | Humanities |
| Poetry | Humanities |
| Religion | Humanities |
| Theater | Humanities |
| Women’s Studies | Humanities |
| Mathematics | Mathematical sciences |
| Mathematics, Applied | Mathematical sciences |
| Mathematics, Interdisciplinary Applications | Mathematical sciences |
| Statistics & Probability | Mathematical sciences |
| Allergy | Medical sciences |
| Andrology | Medical sciences |
| Anesthesiology | Medical sciences |
| Audiology & Speech-Language Pathology | Medical sciences |
| Cardiac & Cardiovascular Systems | Medical sciences |
| Clinical Neurology | Medical sciences |
| Critical Care Medicine | Medical sciences |
| Dentistry, Oral Surgery & Medicine | Medical sciences |
| Dermatology | Medical sciences |

Table A1.
(continued ) WoS subject category Emergency Medicine Endocrinology & Metabolism Gastroenterology & Hepatology Geriatrics & Gerontology Health Policy & Services Hematology Immunology Infectious Diseases Integrative & Complementary Medicine Medicine, General & Internal Medicine, Research & Experimental Microscopy Neurosciences Nursing Obstetrics & Gynecology Oncology Ophthalmology Orthopedics Otorhinolaryngology Pathology Pediatrics Peripheral Vascular Disease Pharmacology & Pharmacy Psychiatry Public, Environmental & Occupational Health Radiology, Nuclear Medicine & Medical Imaging Rehabilitation Respiratory System Rheumatology Sport Sciences Substance Abuse Broad area Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Quantitative Science Studies 203 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . / e d u q s s / a r t i c e - p d l f / / / / 1 1 1 8 3 1 7 6 0 8 6 7 q s s _ a _ 0 0 0 1 4 p d . / f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 Practical method to reclassify Web of Science articles Table A1. 
(continued ) WoS subject category Surgery Toxicology Transplantation Tropical Medicine Urology & Nephrology Veterinary Sciences Acoustics Mechanics Optics Physics, Applied Physics, Atomic, Molecular & Chemical Physics, Condensed Matter Physics, Fluids & Plasmas Physics, Mathematical Physics, Multidisciplinary Physics, Nuclear Physics, Particles & Fields Thermodynamics Business Business, Finance Communication Education & Educational Research Education, Scientific Disciplines Education, Special Ergonomics Family Studies Health Care Sciences & Services Hospitality, Leisure, Sport & Tourism Industrial Relations & Labor Information Science & Library Science Law Broad area Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Medical sciences Physics Physics Physics Physics Physics Physics Physics Physics Physics Physics Physics Physics Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Professional fields Quantitative Science Studies 204 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . / e d u q s s / a r t i c e - p d l f / / / / 1 1 1 8 3 1 7 6 0 8 6 7 q s s _ a _ 0 0 0 1 4 p d / . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 Practical method to reclassify Web of Science articles Table A1. 
(continued ) WoS subject category Management Medicine, Legal Primary Health Care Social Work Behavioral Sciences Psychology Psychology, Applied Psychology, Biological Psychology, Clinical Psychology, Developmental Psychology, Educational Psychology, Experimental Psychology, Mathematical Psychology, Multidisciplinary Psychology, Psychoanalysis Psychology, Social Agricultural Economics & Policy Anthropology Area Studies Criminology & Penology Demography Economics Geography Gerontology International Relations Linguistics Planning & Development Political Science Public Administration Social Issues Social Sciences, Biomedical Broad area Professional fields Professional fields Professional fields Professional fields Psychology Psychology Psychology Psychology Psychology Psychology Psychology Psychology Psychology Psychology Psychology Psychology Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Social sciences Quantitative Science Studies 205 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . / e d u q s s / a r t i c e - p d l f / / / / 1 1 1 8 3 1 7 6 0 8 6 7 q s s _ a _ 0 0 0 1 4 p d . / f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 Practical method to reclassify Web of Science articles Table A1. (continued ) WoS subject category Social Sciences, Interdisciplinary Social Sciences, Mathematical Methods Sociology Urban Studies Broad area Social sciences Social sciences Social sciences Social sciences l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . / e d u q s s / a r t i c e - p d l f / / / / 1 1 1 8 3 1 7 6 0 8 6 7 q s s _ a _ 0 0 0 1 4 p d / . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 Quantitative Science Studies 206 ARTÍCULO DE INVESTIGACIÓN imagen
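As an illustration of how the mapping in Table A1 might be applied once each article has a single subject category, the following minimal Python sketch stores a handful of rows from the table in a dictionary and tallies categories by broad area. The names `BROAD_AREA`, `broad_area`, and `count_by_area` are hypothetical and are not part of the published method; only the category-to-area pairs are taken from Table A1, and only a small excerpt of the table is included.

```python
from collections import Counter

# Excerpt of Table A1 (one row per broad area); the full table has many more rows.
BROAD_AREA = {
    "Plant Sciences": "Biological sciences",
    "Chemistry, Organic": "Chemistry",
    "Computer Science, Theory & Methods": "Computer sciences",
    "Robotics": "Engineering",
    "Oceanography": "Geosciences",
    "Philosophy": "Humanities",
    "Statistics & Probability": "Mathematical sciences",
    "Oncology": "Medical sciences",
    "Optics": "Physics",
    "Law": "Professional fields",
    "Psychology, Social": "Psychology",
    "Sociology": "Social sciences",
}


def broad_area(category):
    """Return the broad area for a subject category, or None if it is not in the excerpt."""
    return BROAD_AREA.get(category)


def count_by_area(categories):
    """Tally subject categories by broad area, skipping categories missing from the excerpt."""
    areas = (broad_area(c) for c in categories)
    return Counter(a for a in areas if a is not None)
```

For example, `count_by_area(["Optics", "Optics", "Law"])` counts two items under Physics and one under Professional fields. Extending the dictionary to every row of Table A1 would give a lookup covering all subject categories listed there.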
